Feb-01-2024, 04:24 AM
Hello,
I am trying to figure out why I can't get this to work. I have a SQL Server database that I can connect to and query via pyodbc (SQLAlchemy won't work for me), and I save the results as a CSV file. That file combines over 100 tables with about 84,000 rows each. Each row is a TIMESTAMP at 15-minute intervals (energy meters), and I am trying to sum the consumption values (already computed in the CSV as the difference from the previous VALUE for each TIMESTAMP). In this Python script, I want to generate 54 columns: the first column is the TableName from the CSV, the second is the most recent TIMESTAMP of the associated data pull, then Week_1, Week_2, Week_3 ... Week_52. The week columns should sum the Consumption column from the CSV for each week, starting from today's date and going back a full 52 weeks (a full year). Each row in the output should hold one unique table name.
Python generates a CSV with a unique table name per row, but the calculations come out close to 3,270,631, whereas the total should be 15,769 kWh for the entire week. Any help would be greatly appreciated. I did make sure TIMESTAMP is parsed as a pandas datetime, so I'm not sure what it is basing the calculations on. Here is my code:
import pandas as pd
# Load the combined_data.csv file
combined_df = pd.read_csv('combined_data.csv')
# Ensure 'TIMESTAMP' is in datetime format
combined_df['TIMESTAMP'] = pd.to_datetime(combined_df['TIMESTAMP'])
# Calculate the week number for each row
combined_df['Week_Number'] = combined_df['TIMESTAMP'].dt.strftime('%Y-%U')
# Calculate Consumption based on the previous row
combined_df['Consumption'] = combined_df.groupby('TableName')['VALUE'].diff()
# Exclude rows where Consumption is zero or negative
# (where() keeps the column numeric; apply() with pd.NA would coerce it to object dtype)
combined_df['Consumption'] = combined_df['Consumption'].where(combined_df['Consumption'] > 0)
# Keep the current week plus the three previous weeks, comparing actual
# timestamps instead of '%Y-%U' strings (pd.to_datetime('2024-04') parses
# as April 2024, not week 4, so the string comparisons filter wrongly)
today = pd.Timestamp('today').normalize()
filtered_df = combined_df[combined_df['TIMESTAMP'] >= today - pd.DateOffset(weeks=4)]
# Group by TableName and Week_Number and sum the Consumption values
weekly_sum_df = filtered_df.groupby(['TableName', 'Week_Number'])['Consumption'].sum().unstack().reset_index()
# Rename the columns to represent each week
weekly_sum_df.columns = ['TableName'] + [f'Week_{i}' for i in range(1, weekly_sum_df.shape[1])]
# Save the result to a CSV file
weekly_sum_df.to_csv('weekly_sum_by_table.csv', index=False)
print("Weekly sums by table saved to 'weekly_sum_by_table.csv'.")