Consolidating code - Printable Version

Consolidating code - jhartc - Sep-22-2019

I have the following code:

# Load the pandas and NumPy libraries with their usual aliases
import pandas as pd
import numpy as np

# Read data from '../example.csv' (one directory above the one
# your Python process is running in), parsing 'DateTime' as dates
df = pd.read_csv("../example.csv", parse_dates=['DateTime'])

# Copy 'Quantity' into a new 'Burned' column, then zero it out for
# every transaction that was not sent to the zero address
df = df.assign(Burned=df['Quantity'])
df.loc[df['To'] != '0x0000000000000000000000000000000000000000', 'Burned'] = 0.0


df['cum_sum'] = df['Burned'].cumsum()
df['percent_burned'] = df['cum_sum']/df['Quantity'].max()*100.0

per_day = df.groupby(df['DateTime'].dt.date)['Burned'].count().reset_index(name='Trx')
per_day_burned = df.groupby(df['DateTime'].dt.date)['Burned'].sum().reset_index(name='Burned')

per_day['Burned'] = per_day_burned['Burned']
per_day['Burned_per_trx'] = per_day['Burned'] / per_day['Trx']

Clearly my code needs some work. You can see that I want to create a DataFrame called per_day that groups the transactions in df by date, so that instead of listing thousands of transactions a day, it just shows the total number of transactions for each day.

You can see that I create per_day_burned as a separate DataFrame before combining it into the per_day DataFrame. Is there a way to consolidate per_day and per_day_burned into a single line of code? Or how can I directly have per_day_burned be part of per_day? I don't even want the per_day_burned DataFrame. I just want it to be a column in per_day (like the 3rd column). How do I do this?

Doing something like per_day['Burned']=df.groupby(df['DateTime'].dt.date)['Burned'].sum().reset_index(name='Burned') does not work, because reset_index returns a DataFrame with two columns (the date and 'Burned') and it is being assigned to a single column.
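
One possible way to get both columns from a single groupby is named aggregation, sketched below. This assumes pandas 0.25 or newer, and the small inline DataFrame is made-up sample data standing in for example.csv; only the column names 'DateTime' and 'Burned' come from the code above.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2019-09-20 10:00", "2019-09-20 12:00", "2019-09-21 09:00",
    ]),
    "Burned": [5.0, 0.0, 3.0],
})

# One groupby, two aggregations: each keyword becomes an output column
per_day = (
    df.groupby(df["DateTime"].dt.date)["Burned"]
      .agg(Trx="count", Burned="sum")
      .reset_index()
)

# Derived column, computed the same way as in the original code
per_day["Burned_per_trx"] = per_day["Burned"] / per_day["Trx"]

print(per_day)
```

With this approach the intermediate per_day_burned DataFrame is not needed, since 'Trx' and 'Burned' are produced together and stay aligned on the same dates.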

The second question I have is this:

The code:

per_day = df.groupby(df['DateTime'].dt.date)['Burned'].count().reset_index(name='Trx')

gives the total number of transactions on a given day. How do I exclude a transaction from that count if 'Burned' is 0 for that transaction?
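
A minimal sketch of one way to count only the non-zero burns: build a boolean Series marking those rows and sum it per day, since True counts as 1. The inline DataFrame is again hypothetical sample data, and the name is_burn is introduced here only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the last day has no burn at all
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2019-09-20 10:00", "2019-09-20 12:00",
        "2019-09-21 09:00", "2019-09-22 08:00",
    ]),
    "Burned": [5.0, 0.0, 3.0, 0.0],
})

# True where the transaction actually burned something
is_burn = df["Burned"].ne(0)

# Summing booleans per day counts only the True rows
per_day = (
    is_burn.groupby(df["DateTime"].dt.date)
           .sum()
           .astype(int)
           .reset_index(name="Trx")
)

print(per_day)
```

Filtering first with df[df['Burned'] != 0] and then calling count() would also work, but days without any burn would then disappear from per_day instead of showing a count of 0.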