Binning data to files
Python Forum (https://python-forum.io), Forum: Data Science, Thread: Binning data to files (/thread-27780.html)
Binning data to files - Kappel - Jun-21-2020

Hi,

I am trying to develop a power curve / algorithm that can give me a "Possible Max Power" signal when the solar panels on my roof are scaled down in production. To make this power curve I fetch data every second, which gives me a dataset similar to the one below (made with random.randint).

What would be a way to separate this data into bins (based on Solar_Radiation) so I can calculate a correlation between production and PV_Cell_Temp in each bin? I have been looking around on the internet, but there doesn't seem to be anything in Pandas I can use to do this.

Timestamp,PV_Production,Solar_Radiation,PV_Cell_Temp,Ambient_Temp
2020-06-21 13:37:02.934901,0,206,164.8,0
2020-06-21 13:37:02.935898,0,312,124.8,0
2020-06-21 13:37:02.942879,0,234,23.4,0
2020-06-21 13:37:02.943877,0,230,230.0,0
2020-06-21 13:37:02.944874,0,273,218.4,0
2020-06-21 13:37:02.948862,0,317,95.1,0
2020-06-21 13:37:02.951855,0,328,328.0,0
2020-06-21 13:37:02.954847,0,311,0.0,0

RE: Binning data to files - Larz60+ - Jun-21-2020

Need further specification:

RE: Binning data to files - Kappel - Jun-22-2020

Sorry about the missing info.

What I want my end result to be is a power curve where the production is determined based on these variables:

PV_Production: Total solar panel production [W]
Solar_Radiation: Solar radiation [W/m2]
PV_Cell_Temp: Temperature of the solar panel cells [°C]
Ambient_Temp: Ambient temperature [°C] (not used; PV_Cell_Temp should suffice)

For the bins I imagined that a bin consists of all the data points, with 'Solar_Radiation' as the binning factor in an interval of 100. E.g.:

Bin 1: 0 < Solar_Radiation <= 100 (all data from each string where solar radiation is 0..100)
Bin 2: 100 < Solar_Radiation <= 200
Bin 3: ...

What are the keys: What do you mean by key?

I hope the above made sense. I apologize for being very new to data analytics.

Maybe to be even more exact: what I am looking for is some way to take a dataframe, for example:

String_2 = [100,24,59,19,588,209,345,288,193,294,298]

and then have some sort of function that could bin that string into a defined interval, let's say each 100:

df_1 = [24, 59, 19] # [0..100[
df_2 = [100, 193] # [100..200[
df_3 = [209, 288, 294, 298] # [200..300[
df_4 = [345] # [300..400[
df_5 = [] # [400..500[
df_6 = [588] # [500..600[

RE: Binning data to files - Larz60+ - Jun-22-2020

By key I mean the field that you organize all data sets by (date, for example).
Also, how are you obtaining the data? Can you interface directly to the panels from Python?

RE: Binning data to files - Kappel - Jun-22-2020

Oh, so the key I use is probably going to be a "Count" variable that I will add as the first column of the strings. It will just be incremented by 1 for each row. I also have the Timestamp, but I reckon a count integer would be more ideal as a key.

I am obtaining data using Pyads (from a Beckhoff PC that gets the data from my PLC, which is connected to the panels). I have set the variables to update every 1 sec; however, I think I will change it to a 1 min average instead to keep the dataflow down a bit.

Thank you so much for your time!
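
For reference, a minimal sketch of the binning described above, assuming pandas is installed. `pd.cut` assigns each value to an interval; `right=False` gives the half-open [0..100[ convention from the list example (the default, `right=True`, would give the 0 < x <= 100 convention instead). The names `make_edges`, `bin_values` and `correlation_per_bin` are made up for illustration and are not from the thread, and the sketch assumes Solar_Radiation is never negative.

```python
import pandas as pd

BIN_WIDTH = 100  # interval size suggested in the thread


def make_edges(max_value, width=BIN_WIDTH):
    """Return bin edges 0, width, 2*width, ... that cover max_value."""
    upper = (int(max_value) // width + 1) * width
    return list(range(0, upper + width, width))


def bin_values(values, width=BIN_WIDTH):
    """Split a list of non-negative numbers into half-open bins [lo, lo + width)."""
    series = pd.Series(values)
    edges = make_edges(series.max(), width)
    # labels=False returns the integer index of the bin each value falls in.
    idx = pd.cut(series, bins=edges, right=False, labels=False)
    return {
        f"[{lo}, {lo + width})": series[idx == i].tolist()
        for i, lo in enumerate(edges[:-1])
    }


def correlation_per_bin(df, width=BIN_WIDTH):
    """Pearson correlation of PV_Production vs PV_Cell_Temp per Solar_Radiation bin."""
    edges = make_edges(df["Solar_Radiation"].max(), width)
    idx = pd.cut(df["Solar_Radiation"], bins=edges, right=False, labels=False)
    result = {}
    # Only bins that actually contain rows show up in the groupby.
    for i, group in df.groupby(idx):
        lo = edges[int(i)]
        result[f"[{lo}, {lo + width})"] = group["PV_Production"].corr(
            group["PV_Cell_Temp"]
        )
    return pd.Series(result, dtype=float)


if __name__ == "__main__":
    # The list from the post above.
    for label, members in bin_values(
        [100, 24, 59, 19, 588, 209, 345, 288, 193, 294, 298]
    ).items():
        print(label, members)
```

Note that the correlation comes out as NaN for a bin with fewer than two rows or with a constant column (as in the random sample data, where PV_Production is always 0). Inside the `groupby` loop, each `group` is an ordinary DataFrame, so it could also be written to its own file with `group.to_csv(...)` if one file per bin is wanted.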