Binning data to files

Kappel · Jun-21-2020, 06:04 PM

Hi,
I am trying to develop a Power Curve / Algorithm that can give me a "Possible Max Power" signal when production from the solar panels on my roof is scaled down.

To make this power curve I fetch data every second and get a dataset similar to the one below (made with random.randint).
How could I separate this data into bins (based on Solar_Radiation) so I can calculate the correlation between production and PV_Cell_Temp in each bin?
I have been looking around on the internet, but I haven't found anything in pandas that I can use to do this.

Timestamp,PV_Production,Solar_Radiation,PV_Cell_Temp,Ambient_Temp
2020-06-21 13:37:02.934901,0,206,164.8,0
2020-06-21 13:37:02.935898,0,312,124.8,0
2020-06-21 13:37:02.942879,0,234,23.4,0
2020-06-21 13:37:02.943877,0,230,230.0,0
2020-06-21 13:37:02.944874,0,273,218.4,0
2020-06-21 13:37:02.948862,0,317,95.1,0
2020-06-21 13:37:02.951855,0,328,328.0,0
2020-06-21 13:37:02.954847,0,311,0.0,0

**Larz60+** · Jun-21-2020, 07:31 PM

Need further specification:

- What comprises a bin?
- What are the keys?
- What are the data field names?
- A description of each field

Kappel · (This post was last modified: Jun-22-2020, 03:15 PM by Kappel.)

Sorry about the missing info. What I want my end result to be is a power curve where the production is determined based on the variables:
PV_Production: Total solar panel production [W]
Solar_Radiation: Solar radiation [W/m²]
PV_Cell_Temp: Temperature of the solar panel cells [°C]
Ambient_Temp: Ambient temperature [°C] (not used; PV_Cell_Temp should suffice)

For the bins I imagined that:
- A bin consists of all the data points whose 'Solar_Radiation' falls within the same interval of width 100
E.g.
Bin 1: 0 < Solar_Radiation <= 100
(All data from each string where solar radiation is 0..100)
Bin 2: 100 < Solar_Radiation <= 200
...
Bin 3...
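
The binning described above could be sketched with `pd.cut` and `groupby`. This is only a minimal sketch: the readings are made up for illustration, and the `Rad_Bin` column name and the 0..400 range of bin edges are assumptions, not part of the original data.

```python
import pandas as pd

# Made-up readings for illustration only (not real measurements).
df = pd.DataFrame({
    "PV_Production":   [10, 20, 30, 110, 130, 150, 260, 240, 220],
    "Solar_Radiation": [50, 80, 100, 120, 150, 200, 210, 250, 300],
    "PV_Cell_Temp":    [20.0, 25.0, 30.0, 30.0, 35.0, 40.0, 40.0, 45.0, 50.0],
})

# Bin edges every 100 W/m2 (assumed range 0..400).
# right=True gives intervals (0, 100], (100, 200], ... as in the post.
edges = list(range(0, 401, 100))
df["Rad_Bin"] = pd.cut(df["Solar_Radiation"], bins=edges, right=True)

# Pearson correlation between production and cell temperature per bin,
# keyed by the lower edge of each bin. observed=True skips empty bins.
corr_per_bin = {
    int(interval.left): group["PV_Production"].corr(group["PV_Cell_Temp"])
    for interval, group in df.groupby("Rad_Bin", observed=True)
}

for lower_edge, corr in corr_per_bin.items():
    print(f"({lower_edge}, {lower_edge + 100}]: correlation = {corr:.2f}")
```

With real data, a bin holding fewer than two rows (or a constant column) would give NaN for the correlation, so it may be worth checking the row count per bin first.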

What are the keys: What do you mean by key?

I hope the above made sense? I apologize for being very new to data analytics.

Maybe to be even more exact: what I am looking for is some way to take a dataframe;
Example:
String_2 = ([100,24,59,19,588,209,345,288,193,294,298])

And then have some sort of function that could bin that string into defined intervals, let's say of width 100:
df_1 = [24, 59, 19] # [0..100[
df_2 = [100, 193] # [100..200[
df_3 = [209, 288, 294, 298] # [200..300[
df_4 = [345] # [300..400[
df_5 = [] # [400..500[
df_6 = [588] # [500..600[
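
A minimal sketch of such a function, using `pd.cut` on the example list above. The names `values`, `edges`, `codes` and `binned` are made up for illustration, and the upper edge of 600 is an assumption chosen to cover the sample values.

```python
import pandas as pd

values = pd.Series([100, 24, 59, 19, 588, 209, 345, 288, 193, 294, 298])

# Bin edges every 100, up to an assumed maximum of 600.
edges = list(range(0, 601, 100))

# right=False makes each interval closed on the left: [0, 100), [100, 200), ...
# labels=False returns the integer position of the bin instead of an interval.
codes = pd.cut(values, bins=edges, right=False, labels=False)

# One list per bin, keyed by the bin's lower edge; empty bins stay as [].
binned = {
    edges[i]: values[codes == i].tolist()
    for i in range(len(edges) - 1)
}

for lower_edge, members in binned.items():
    print(f"[{lower_edge}..{lower_edge + 100}[: {members}")
```

Each list could then be written to its own file, for example by wrapping it in a `pd.Series` and calling `to_csv`, if one file per bin is wanted.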

**Larz60+** · Jun-22-2020, 06:15 PM

By key I mean the field that you organize all data sets by (date, for example).
Also, how are you obtaining the data? Can you interface directly to the panels from Python?

Kappel · Jun-22-2020, 06:25 PM

Oh, so the key I use is probably going to be a "Count" variable that I will add as the first column of the strings; it will simply increment by 1 each time. I also have the Timestamp, but I reckon an integer count would be a better key.

I am obtaining data using pyads (from a Beckhoff PC that gets the data from my PLC, which is connected to the panels). I have set the variables to update every second; however, I think I will change it to 1-minute averages instead to keep the data flow down a bit.
Thank you so much for your time!
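
If the one-second readings are already in a pandas DataFrame, the 1-minute averaging mentioned above could also be done after the fact with `resample`. This is a rough sketch under that assumption: the data below is made up, and it assumes the Timestamp column has been set as the index.

```python
import pandas as pd

# Made-up one-second samples for illustration; column names follow the post.
index = pd.date_range(
    "2020-06-21 13:37:00", periods=120, freq=pd.Timedelta(seconds=1)
)
df = pd.DataFrame(
    {
        "PV_Production": list(range(120)),
        "Solar_Radiation": [200] * 60 + [300] * 60,
    },
    index=index,
)

# Average every column over one-minute windows.
per_minute = df.resample("1min").mean()
print(per_minute)
```

Averaging on the PLC side, as suggested in the post, would reduce the data flow itself; resampling in pandas only reduces what is analysed afterwards.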
