Python Forum
Parsing "aTimeLogger" Android app data to graphs using pandas
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Parsing "aTimeLogger" Android app data to graphs using pandas
#1
Greetings Pythonistas!

I’ve got a data set in CSV format which spans from July 2019 up to December 2023. It is a collection of productivity intervals that I’ve meticulously recorded while doing social science research like Philosophy as well as learning how to program Python, Django, some DevOps, and a little algorithmic design. When I finish spending 25 minutes on my lunch hour on my tablet watching Udemy course content teaching the basics of Binary Search Trees, at the end of my lunch break, using an app called aTimeLogger on my Android phone I enter the “Activity” type (discrete subjects such as “Python”, “algorithms”, “Django”, “sysadmin”, or even “Magick” / “writing”). I also enter the date, the start time, and the end time. After entering a start time and end time, a time delta is automatically calculated. Then I write an annotation (1-2 sentences) which is sort of like a mental note for my future reference. See attached for the data set. Take note: I purged the “Comment” (annotation) data objects (string fields) from the data set for the sake of this forum thread.

For additional context, over the term of the data set, I’ve spent a total of ~378 hours doing something “Python” related and ~579 hours working on Django course content (or a Django based web project). Those are the two largest categories. The rest of the Activities aren’t as data dense.

I am trying to plot these activities as line graphs over time. The x-axis is the continuous element of time. The y-axis are the summed duration time deltas for every entry. If I take just the ‘Python’ activity and plot every instance as a line graph over the ~5 year span/period, the data points are noisy, busy, and hard to follow. But they still plot successfully. Here is my initilization of the data:

import pandas as pd
import matplotlib.pyplot as plt

bulk_df = pd.read_csv("data/all-comments-removed.csv",parse_dates=["From","To",])
bulk_df['Duration'] = pd.to_timedelta(bulk_df['Duration'])
Here is some general info about my dataset:

bulk_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2583 entries, 0 to 2582
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype          
---  ------    --------------  -----          
 0   Activity  2583 non-null   object         
 1   Duration  2583 non-null   timedelta64[ns]
 2   From      2583 non-null   datetime64[ns] 
 3   To        2583 non-null   datetime64[ns] 
dtypes: datetime64[ns](2), object(1), timedelta64[ns](1)
memory usage: 80.8+ KB
Here is my ‘Python’ activity data parsed:

# Create a DataFrame
python = bulk_df[bulk_df["Activity"] == "Python"] #.rolling(window=90).mean()

# Plotting
python.plot(x='From', y='Duration', kind='line', marker='o',figsize=(14, 8))
Here is the output so far:

   

That kind of works.

But what I really want to do is use pandas’ rolling() method and pass in integers like ‘182’ days for a half year and ‘90’ for 3 months (quarter year). I tried that . You can see my code snippet above where I tried to use .rolling(). My Jupyter Notebook didn’t like it so I commented it out. I am clearly doing something wrong.

Here was a separate attempt at a different stage in my coding session this morning where I tried to plot the “Python” activity with the original noisy data points which I tried to simulatenously contrast with a rolling window:

# Convert timedelta to hours
bulk_df['Duration_hours'] = bulk_df['Duration'].dt.total_seconds() / 3600

# Create a DataFrame
python = bulk_df[bulk_df["Activity"] == "Python"]

# Plotting
# python.plot(x='From', y='Duration_hours', kind='line', marker='o')

# Calculate the rolling mean with a window of 2
rolling_mean = python['Duration_hours'].rolling(window=2).mean()

# Plotting
python.plot(x='From', y='Duration_hours', kind='line', marker='o',figsize=(14, 8))
rolling_mean.plot(x='From', y='Duration_hours', kind='line', marker='o', label='Rolling Mean')
Here was the strange output:

   

How do I tell pandas to show the quarterly and half year averages of the timedelta data point for the “Python” activity. Thanks!

Attached Files

.csv   all-comments-removed.csv (Size: 144.36 KB / Downloads: 4)
Reply


Messages In This Thread
Parsing "aTimeLogger" Android app data to graphs using pandas - by Drone4four - May-12-2024, 09:15 AM

Possibly Related Threads…
Thread Author Replies Views Last Post
  Parsing and summing time deltas (duration) onto bar + pie charts using pandas - - DRY Drone4four 2 819 Feb-10-2024, 06:04 PM
Last Post: Drone4four
  Grouping in pandas/multi-index data frame Aleqsie 3 936 Jan-06-2024, 03:55 PM
Last Post: deanhystad
Smile How to further boost the data read write speed using pandas tjk9501 1 1,373 Nov-14-2022, 01:46 PM
Last Post: jefsummers
  How to plot 2 graphs in one figure? man0s 1 1,531 Apr-25-2022, 09:18 AM
Last Post: Axel_Erfurt
Thumbs Up can't access data from URL in pandas/jupyter notebook aaanoushka 1 1,994 Feb-13-2022, 01:19 PM
Last Post: jefsummers
Question Sorting data with pandas TheZaind 4 2,551 Nov-22-2021, 07:33 PM
Last Post: aserian
  Pandas Data frame column condition check based on length of the value aditi06 1 2,842 Jul-28-2021, 11:08 AM
Last Post: jefsummers
  [Pandas] Write data to Excel with dot decimals manonB 1 6,185 May-05-2021, 05:28 PM
Last Post: ibreeden
  pandas.to_datetime: Combine data from 2 columns ju21878436312 1 2,564 Feb-20-2021, 08:25 PM
Last Post: perfringo
  pandas read_csv can't handle missing data mrdominikku 0 2,667 Jul-09-2020, 12:26 PM
Last Post: mrdominikku

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020