Consolidating code
+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Data Science (https://python-forum.io/forum-44.html)
+--- Thread: Consolidating code (/thread-21280.html)
Consolidating code - jhartc - Sep-22-2019

I have the following code:

```python
# Load the pandas library with the alias 'pd'
import pandas as pd
import numpy as np

# Read the data from '../example.csv' (path relative to the directory
# the Python process runs in) and parse 'DateTime' as datetimes
df = pd.read_csv("../example.csv", parse_dates=['DateTime'])

# Copy 'Quantity' into a new 'Burned' column, then zero it for every
# transaction that was not sent to the zero address
df = df.assign(Burned=df['Quantity'])
df.loc[df['To'] != '0x0000000000000000000000000000000000000000', 'Burned'] = 0.0

# Running total burned, and that total as a percentage of the largest 'Quantity'
df['cum_sum'] = df['Burned'].cumsum()
df['percent_burned'] = df['cum_sum'] / df['Quantity'].max() * 100.0

# Number of transactions per day and amount burned per day
per_day = df.groupby(df['DateTime'].dt.date)['Burned'].count().reset_index(name='Trx')
per_day_burned = df.groupby(df['DateTime'].dt.date)['Burned'].sum().reset_index(name='Burned')
per_day['Burned'] = per_day_burned['Burned']
per_day['Burned_per_trx'] = per_day['Burned'] / per_day['Trx']
```

Clearly my code needs some work. You can see that I want to create a DataFrame called per_day that groups the transactions in df by date, so that instead of thousands of individual transactions per day it just shows the total number of transactions for each day. You can see that I create per_day_burned as a second DataFrame before combining it into the per_day DataFrame.

Is there a way to consolidate per_day and per_day_burned into a single line of code? Or how can I directly have per_day_burned be part of per_day? I don't even want the per_day_burned DataFrame. I just want it to be a column in per_day (like the 3rd column). How do I do this? Doing something like

```python
per_day['Burned'] = df.groupby(df['DateTime'].dt.date)['Burned'].sum().reset_index(name='Burned')
```

does not work, because it sends two arguments to a DataFrame expecting one argument.

The second question I have is this. The code:

```python
per_day = df.groupby(df['DateTime'].dt.date)['Burned'].count().reset_index(name='Trx')
```

gives the total number of transactions on a given day.
How do I exclude a transaction from that count if 'Burned' is 0 for that transaction?
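
Both questions can probably be handled in a single `groupby` call. The following is a minimal sketch, assuming pandas 0.25 or newer (needed for named aggregation). The small inline DataFrame and the names `BURN_ADDRESS` and `Burn_trx` are made up for illustration; they stand in for `example.csv` and are not part of the original code.

```python
import pandas as pd

# Zero address used in the original post to identify burn transfers
BURN_ADDRESS = "0x0000000000000000000000000000000000000000"

# Made-up sample data standing in for example.csv (illustrative values only)
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2019-09-01 10:00", "2019-09-01 12:00", "2019-09-01 15:00",
        "2019-09-02 09:00", "2019-09-02 11:00",
    ]),
    "To": [BURN_ADDRESS, "0xabc", BURN_ADDRESS, "0xdef", BURN_ADDRESS],
    "Quantity": [5.0, 7.0, 3.0, 2.0, 4.0],
})

# Burned equals Quantity for transfers to the zero address, 0.0 otherwise
df["Burned"] = df["Quantity"].where(df["To"] == BURN_ADDRESS, 0.0)

# One groupby, several named output columns:
#   Trx      - all transactions that day
#   Burned   - total amount burned that day
#   Burn_trx - hypothetical extra column: transactions with Burned != 0
per_day = (
    df.groupby(df["DateTime"].dt.date)
      .agg(
          Trx=("Burned", "count"),
          Burned=("Burned", "sum"),
          Burn_trx=("Burned", lambda s: int((s != 0).sum())),
      )
      .reset_index()
)
per_day["Burned_per_trx"] = per_day["Burned"] / per_day["Trx"]

print(per_day)
```

Here `Trx` and `Burned` replace the separate per_day and per_day_burned frames, and `Burn_trx` counts only the rows where `Burned` is non-zero. An alternative for the second question would be to filter first, e.g. `df[df["Burned"] != 0].groupby(...)`, but days without any burn would then drop out of the result entirely, which may or may not be what is wanted.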