Python Forum
Date Time Series Help...Please
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Date Time Series Help...Please
#1
Hello,

I am trying to figure out why I can't get this to work. I have a SQL server database and I am able to connect to it and query it by pyodbc (SQLAlchemy won't work) and I save that as a csv file. In that file, I combined over 100 tables with about 84,000 rows each. Each of the rows are TIMESTAMP at 15 min intervals (energy meters) and I am attempting to sum up the consumption values (already computed in the csv file (difference from the previous VALUE associated TIMESTAMP)). In this python script, I am trying to generate 54 columns where the first column is Tablename from the csv file, Most Recent TIMESTAMP of the associated data pull, Week_1, Week_2, Week_3.....Week_52. The week columns are supposed to sum up the Consumption column from the csv for each week starting from today's date going back a full 52 weeks (for a full year). Each row in this table should be only one unique table name per row.

Python generates a csv that has a unique tablename per row but the calculations are in close to 3270631 vice the total should be 15769 KWH for the entire week. Any help would be greatly appreciated. I did make sure the TIMESTAMP is pandas formatted for date time when I parsed it, so not sure what it is basing the calculations off of. Here is my code:

import pandas as pd

# Load the combined_data.csv file
combined_df = pd.read_csv('combined_data.csv')

# Ensure 'TIMESTAMP' is in datetime format
combined_df['TIMESTAMP'] = pd.to_datetime(combined_df['TIMESTAMP'])

# Calculate the week number for each row
combined_df['Week_Number'] = combined_df['TIMESTAMP'].dt.strftime('%Y-%U')

# Calculate Consumption based on the previous row
combined_df['Consumption'] = combined_df.groupby('TableName')['VALUE'].diff()

# Exclude rows where Consumption is zero or negative
combined_df['Consumption'] = combined_df['Consumption'].apply(lambda x: x if x > 0 else pd.NA)

# Calculate the current week for filtering
current_week = pd.to_datetime('today').strftime('%Y-%U')

# Filter rows for the current week and the three previous weeks
filtered_df = combined_df[combined_df['Week_Number'].between(current_week, current_week) |
combined_df['Week_Number'].between(current_week, current_week) |
combined_df['Week_Number'].between(str(pd.to_datetime(current_week) - pd.DateOffset(weeks=1)), current_week) |
combined_df['Week_Number'].between(str(pd.to_datetime(current_week) - pd.DateOffset(weeks=2)), current_week) |
combined_df['Week_Number'].between(str(pd.to_datetime(current_week) - pd.DateOffset(weeks=3)), current_week)]

# Group by TableName and Week_Number and sum the Consumption values
weekly_sum_df = filtered_df.groupby(['TableName', 'Week_Number'])['Consumption'].sum().unstack().reset_index()

# Rename the columns to represent each week
weekly_sum_df.columns = ['TableName'] + [f'Week_{i}' for i in range(1, weekly_sum_df.shape[1])]

# Save the result to a CSV file
weekly_sum_df.to_csv('weekly_sum_by_table.csv', index=False)

print("Weekly sums by table saved to 'weekly_sum_by_table.csv'.")
Reply
#2
I think you are doing it wrong. I think Consumption should be computed when you are making the combined_data.csv.

for table in tables
    get table
    compute totals for table
    add table as row to combined dataframe
This could be way off because I'm having trouble following your description. A short example would be helpful.
Reply
#3
(Feb-01-2024, 01:09 PM)deanhystad Wrote: I think you are doing it wrong. I think Consumption should be computed when you are making the combined_data.csv.

for table in tables
    get table
    compute totals for table
    add table as row to combined dataframe
This could be way off because I'm having trouble following your description. A short example would be helpful.

Thanks. I was able to get it to work by splitting the entire csv file into arrays and loading them into numpy. For some reason, even though I parsed it in with a date_time format, it didn't like it in pandas
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Compare current date on calendar with date format file name Fioravanti 1 249 Mar-26-2024, 08:23 AM
Last Post: Pedroski55
  Python date format changes to date & time 1418 4 622 Jan-20-2024, 04:45 AM
Last Post: 1418
  Downloading time zone aware files, getting wrong files(by date))s tester_V 9 1,057 Jul-23-2023, 08:32 AM
Last Post: deanhystad
  Formatting a date time string read from a csv file DosAtPython 5 1,300 Jun-19-2023, 02:12 PM
Last Post: DosAtPython
  Print names in x-axis of a time-series values hobbyist 4 1,244 Apr-22-2023, 09:29 PM
Last Post: deanhystad
  Time series JasminQuinn 0 1,025 Apr-22-2022, 10:33 PM
Last Post: JasminQuinn
  Wait til a date and time KatManDEW 2 1,439 Mar-11-2022, 08:05 PM
Last Post: KatManDEW
  How to read rainfall time series and insert missing data points MadsM 4 2,192 Jan-06-2022, 10:39 AM
Last Post: amdi40
  Date format and past date check function Turtle 5 4,281 Oct-22-2021, 09:45 PM
Last Post: deanhystad
  How to add previous date infront of every unique customer id's invoice date ur_enegmatic 1 2,244 Feb-06-2021, 10:48 PM
Last Post: eddywinch82

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020