Parsing "aTimeLogger" Android app data to graphs using pandas

Drone4four · Jun-12-2024, 06:56 PM

I have returned to this same project and wish to extend my data analysis. My two latest code snippets and graphs can be found below.

Here are my questions for each pair:

1. In the first snippet and graph, pandas and matplotlib in my Jupyter Notebook now show two categories successfully, thanks to the helpful feedback from other forum members. Thank you to everyone who has contributed to the discussion so far. However, when I lower the alpha (transparency) value, I can see that the bars for the two categories overlap each other. How do I stack the data instead? That’s my first question.
2. In the second snippet and graph, only one category shows up (“Magick”). How do I get the other category, “Research”, to show as well? As far as I can tell, the way I filter, transform, and call methods on the two dataframes should work. I’ve tried swapping variable names, refactoring, and making other changes large and small, all without success. Can anyone identify what I’m missing so that both categories show instead of one? (I also want these bars to stack rather than overlap, as with the first graph.)

First pair:

import pandas as pd
pd.set_option('display.expand_frame_repr', False)
import matplotlib.pyplot as plt
 
bulk_df = pd.read_csv('data/all-comments-removed.csv', parse_dates=["From", "To"])
bulk_df['Duration'] = pd.to_timedelta(bulk_df['Duration'])
bulk_df['Duration_hours'] = bulk_df['Duration'].dt.total_seconds() / 3600

# Copy so changes made to python_df do not affect bulk_df and vice versa
python_df = bulk_df[bulk_df["Activity"] == "Python"].copy()
python_df.set_index('From', inplace=True)

# Calculate rolling means using the index now
python_df['Rolling_Mean_90'] = python_df['Duration_hours'].rolling('90D').mean()
python_df['Rolling_Mean_182'] = python_df['Duration_hours'].rolling('182D').mean()

# Copy so changes made to django_df do not affect bulk_df and vice versa
django_df = bulk_df[bulk_df["Activity"] == "Django"].copy()
django_df.set_index('From', inplace=True)
# Calculate rolling means using the index now
django_df['Rolling_Mean_90'] = django_df['Duration_hours'].rolling('90D').mean()
django_df['Rolling_Mean_182'] = django_df['Duration_hours'].rolling('182D').mean()

python_df_Month = python_df['Rolling_Mean_90'].resample('MS').sum()
django_df_Month = django_df['Rolling_Mean_90'].resample('MS').sum()
# py_dj_Month_combined = python_df_Month.add(django_df_Month, fill_value=0)

plt.figure(figsize=(14, 8))
plt.bar(python_df_Month.index, python_df_Month, label='Python 90-Day Rolling Mean', width=20, alpha=0.5) # color='red')
plt.bar(django_df_Month.index, django_df_Month, label='Django 90-Day Rolling Mean', width=20, alpha=0.5) #, color='blue')
plt.legend()
plt.title('Stacked Bar Chart for Python and Django Activities')
plt.xlabel('Date')
plt.ylabel('Hours Spent')
plt.show()

That renders as:
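On the stacking question: a minimal sketch of one possible approach, assuming the two monthly Series are first put on a shared index and the second set of bars is then lifted with the `bottom` argument of matplotlib's `bar()`. The helper names `align_monthly` and `plot_stacked` and the sample numbers are made up for illustration and are not part of my notebook.

```python
import pandas as pd


def align_monthly(first, second, names=("first", "second")):
    """Put two monthly Series on one shared index; months missing from one get 0."""
    combined = pd.concat([first, second], axis=1, keys=list(names))
    return combined.fillna(0.0).sort_index()


def plot_stacked(combined, width=20):
    """Draw the second column on top of the first via bar()'s `bottom` argument."""
    # Imported here so align_monthly can be used without matplotlib installed
    import matplotlib.pyplot as plt

    lower, upper = combined.columns[:2]
    fig, ax = plt.subplots(figsize=(14, 8))
    ax.bar(combined.index, combined[lower], width=width, label=lower)
    # `bottom` starts each upper bar where the lower bar ends
    ax.bar(combined.index, combined[upper], width=width,
           bottom=combined[lower], label=upper)
    ax.legend()
    ax.set_xlabel("Date")
    ax.set_ylabel("Hours Spent")
    return ax


# Made-up monthly values standing in for python_df_Month and django_df_Month
python_month = pd.Series(
    [3.0, 5.0], index=pd.to_datetime(["2024-01-01", "2024-02-01"]))
django_month = pd.Series(
    [2.0, 4.0], index=pd.to_datetime(["2024-02-01", "2024-03-01"]))
combined = align_monthly(python_month, django_month, names=("Python", "Django"))
```

Calling `plot_stacked(combined)` followed by `plt.show()` would then draw the chart. Aligning first matters because `bottom` is matched to the bars by position, so both Series need the same months in the same order.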

Second pair:

import pandas as pd
pd.set_option('display.expand_frame_repr', False)
import matplotlib.pyplot as plt

# Load the data
bulk_df = pd.read_csv('data/all-comments-removed.csv', parse_dates=["From", "To"])
bulk_df['Duration'] = pd.to_timedelta(bulk_df['Duration'])
bulk_df['Duration_hours'] = bulk_df['Duration'].dt.total_seconds() / 3600

# Copy and filter data for "Magick" activity and calculate rolling means
magick_df = bulk_df[bulk_df["Activity"] == "Magick"].copy()
magick_df.set_index('From', inplace=True)
magick_df['Rolling_Mean_90'] = magick_df['Duration_hours'].rolling('90D').mean()
magick_df['Rolling_Mean_182'] = magick_df['Duration_hours'].rolling('182D').mean()

# Copy and filter data for "Research (general)" activity and calculate rolling means
research_df = bulk_df[bulk_df["Activity"] == "Research (general)"].copy()
research_df.set_index('From', inplace=True)
research_df['Rolling_Mean_90'] = research_df['Duration_hours'].rolling('90D').mean()
research_df['Rolling_Mean_182'] = research_df['Duration_hours'].rolling('182D').mean()

# Resample data
magick_df_Month = magick_df['Rolling_Mean_90'].resample('MS').sum()
research_df_Month = research_df['Rolling_Mean_90'].resample('MS').sum()

# Plot the combined data with wider bars
plt.figure(figsize=(12, 6))
plt.bar(research_df_Month.index, research_df_Month, label='"Research" 90-Day Rolling Mean', width=20, alpha=0.5, color='blue')
plt.bar(magick_df_Month.index, magick_df_Month, label='"Magick" ("Philosophy") 90-Day Rolling Mean', width=20, alpha=0.5, color='red')

plt.legend()
plt.title('Stacked Bar Chart for Magick and Research Activities')
plt.xlabel('Date')
plt.ylabel('Hours Spent')
plt.show()

That shows as:
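On the missing category: the code alone does not show why “Research” is absent, so the following is only a diagnostic sketch, not a fix. It assumes one possible cause, namely that the label in the CSV does not exactly match the string used in the filter (for example, stray whitespace or different capitalisation), and counts matching rows both ways. The helper `activity_report` and the sample rows are hypothetical.

```python
import pandas as pd


def activity_report(df, names, column="Activity"):
    """Count rows per label: exact match vs. whitespace- and case-insensitive match."""
    cleaned = df[column].astype(str).str.strip().str.casefold()
    rows = []
    for name in names:
        rows.append({
            "label": name,
            "exact_rows": int((df[column] == name).sum()),
            "loose_rows": int((cleaned == name.strip().casefold()).sum()),
        })
    return pd.DataFrame(rows).set_index("label")


# Made-up rows; the trailing space in one label is deliberate
sample = pd.DataFrame(
    {"Activity": ["Magick", "Research (general) ", "Magick"]})
report = activity_report(sample, ["Magick", "Research (general)"])
print(report)
```

If `exact_rows` is 0 for a label in the real data, the filter returns an empty dataframe and there is nothing to plot. If both counts look healthy, the next things to compare would be the date ranges and the magnitudes of the two monthly Series, since bars that are much smaller than the others can be hard to see at this scale.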
