Python Forum
Binning data to files
#1
Hi,
I am trying to develop a power curve (an algorithm) that gives me a "possible max power" signal when production from the solar panels on my roof is scaled down.

To build this power curve I fetch data every second and get a dataset similar to the one below (generated with random.randint).
How could I separate this data into bins based on Solar_Radiation, so that I can calculate the correlation between PV_Production and PV_Cell_Temp in each bin?
I have been looking around on the internet, but I haven't found anything in pandas that I can use to do this.

Timestamp,PV_Production,Solar_Radiation,PV_Cell_Temp,Ambient_Temp
2020-06-21 13:37:02.934901,0,206,164.8,0
2020-06-21 13:37:02.935898,0,312,124.8,0
2020-06-21 13:37:02.942879,0,234,23.4,0
2020-06-21 13:37:02.943877,0,230,230.0,0
2020-06-21 13:37:02.944874,0,273,218.4,0
2020-06-21 13:37:02.948862,0,317,95.1,0
2020-06-21 13:37:02.951855,0,328,328.0,0
2020-06-21 13:37:02.954847,0,311,0.0,0
#2
This needs further specification:
  • What comprises a bin?
  • What are the keys?
  • What are the data field names?
  • What does each field describe?
#3
Sorry about the missing info. What I want as my end result is a power curve where the production is determined from these variables:
PV_Production: Total solar panel production [W]
Solar_Radiation: Solar radiation [W/m²]
PV_Cell_Temp: Temperature of the solar panel cells [°C]
Ambient_Temp: Ambient temperature [°C] (not used, PV_Cell_Temp should suffice)

For the bins I imagined that:
- A bin consists of all the data points whose 'Solar_Radiation' value falls within the same interval of width 100
E.g.
Bin 1: 0 < Solar_Radiation <= 100
(all rows where the solar radiation is in that range)
Bin 2: 100 < Solar_Radiation <= 200
...
Bin 3...
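
A minimal sketch of how the bins described above could be built with pandas, assuming the data sits in a DataFrame whose columns match the CSV header from post #1. pd.cut uses right-closed intervals by default, which matches the "0 < Solar_Radiation <= 100" definition. The sample data, the made-up production formula and the names edges, Rad_Bin and corr_per_bin are purely illustrative, not real measurements.

```python
import numpy as np
import pandas as pd

# Made-up sample data (NOT real measurements); column names follow post #1.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Solar_Radiation": rng.integers(1, 1001, size=n),  # 1..1000 W/m2
    "PV_Cell_Temp": rng.uniform(10, 70, size=n),
})
# Invented relation, only so that the correlation has something to measure.
df["PV_Production"] = (
    5 * df["Solar_Radiation"] - 2 * df["PV_Cell_Temp"] + rng.normal(0, 20, size=n)
)

# Bin edges 0, 100, ..., 1000 give the intervals (0, 100], (100, 200], ...
edges = np.arange(0, 1100, 100)
df["Rad_Bin"] = pd.cut(df["Solar_Radiation"], bins=edges)

# Pearson correlation between production and cell temperature inside each bin.
corr_per_bin = {
    interval: group["PV_Production"].corr(group["PV_Cell_Temp"])
    for interval, group in df.groupby("Rad_Bin", observed=True)
}

for interval, corr in corr_per_bin.items():
    print(interval, round(corr, 3))
```

Values of exactly 0 would fall outside the first bin with these edges; passing include_lowest=True to pd.cut is one way to keep them, if that is wanted.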

What are the keys: What do you mean by key?

I hope the above makes sense. I apologize, I am very new to data analytics.

Maybe to be even more exact: what I am looking for is some way to take a list of values (or a DataFrame column);
Example:
String_2 = [100, 24, 59, 19, 588, 209, 345, 288, 193, 294, 298]

And then have some sort of function that bins those values into a defined interval, let's say every 100:
df_1 = [24, 59, 19] #[0..100[
df_2 = [100, 193] #[100..200[
df_3 = [209, 288, 294, 298] #[200..300[
df_4 = [345] #[300..400[
df_5 = [] #[400..500[
df_6 = [588] #[500..600[
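
A small sketch of such a function in plain Python, assuming non-negative integers and left-closed intervals like [0..100[ in the example above (note this differs from the "0 < x <= 100" bins described earlier in this post). The function name bin_values and its parameters are hypothetical. In pandas, pd.cut(..., right=False) gives the same left-closed behaviour.

```python
def bin_values(values, width, upper):
    """Group values into left-closed bins [lo, lo + width) covering [0, upper)."""
    # Pre-create every bin so that empty intervals still show up in the result.
    bins = {(lo, lo + width): [] for lo in range(0, upper, width)}
    for v in values:
        lo = (v // width) * width  # lower edge of the bin that v belongs to
        # Raises KeyError for values outside [0, upper); left unhandled here.
        bins[(lo, lo + width)].append(v)
    return bins


String_2 = [100, 24, 59, 19, 588, 209, 345, 288, 193, 294, 298]
binned = bin_values(String_2, width=100, upper=600)

for (lo, hi), members in binned.items():
    print(f"[{lo}..{hi}[ -> {members}")
```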
#4
By key I mean the field that you organize all the data sets by (a date, for example).
Also, how are you obtaining the data? Can you interface directly with the panels from Python?
#5
Oh, so the key I use will probably be a "Count" variable that I add as the first column of each row and simply increment by 1. I also have the Timestamp, but I reckon an integer count would be a better key.

I am obtaining the data using Pyads (from a Beckhoff PC that gets the data from my PLC, which is connected to the panels). I have set the variables to update every second, but I think I will change that to a 1-minute average instead, to keep the data flow down a bit.
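
A rough sketch of the 1-minute averaging and the "Count" key, assuming the per-second readings already sit in a pandas DataFrame indexed by Timestamp. The Pyads read itself is not shown, and the values and the names raw and per_minute are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up per-second readings for three minutes (NOT real measurements).
start = pd.Timestamp("2020-06-21 13:37:00")
index = start + pd.to_timedelta(np.arange(180), unit="s")
raw = pd.DataFrame({"Solar_Radiation": np.arange(180, dtype=float)}, index=index)

# Average the 1-second samples into 1-minute rows.
per_minute = raw.resample("1min").mean()

# Add an incrementing "Count" key as the first column.
per_minute.insert(0, "Count", np.arange(1, len(per_minute) + 1))

print(per_minute)
```

resample needs a datetime-like index, so with this approach the Timestamp column stays useful even if Count is used as the key.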
Thank you so much for your time!