Python Forum
How to handle time series data files with Python?
#1
I am new to deep learning and am trying to build a feature matrix with Python. My sample data looks like this:

dataset.csv

State   Earnings   Hispanic   Indian   Asian   Black   White   people_in_poverty

Alabama   0.2        0.4        0.6     0.6      0.2    0.8          a.csv
Florida   0.5        0.6        0.4     0.1      0.6    0.7          b.csv
Kentucky  0.7        0.7        0.9     0.8      0.3    0.6          c.csv
Minnesota 0.3        0.1        0.2     0.5      0.2    0.7          d.csv
....
The columns [Earnings, Hispanic, Indian, Asian, Black, White] are the attributes, and people_in_poverty is the class (target) of the feature matrix. When the people_in_poverty values are numeric, the Python code is simple.

people_in_poverty
0.7
0.3
0.2
0.6
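import pandas as pd
# header=0 tells pandas the file's first row is a header and replaces it with these names
df = pd.read_csv('dataset.csv', header=0, names=['state', 'Earnings', 'Hispanic', 'Indian', 'Asian', 'Black', 'White', 'people_in_poverty'])
dataset = df.to_numpy()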
However, in my case, each value in the class column is the name of a CSV file that contains time series data.


people_in_poverty
a.csv
b.csv
c.csv
d.csv
a.csv
2010-08-27        0.2
2010-09-27        0.7
2010-10-27        0.6
2010-11-27        0.9
2010-12-27        0.4
2011-01-27        0.8
2011-02-27        0.5
2011-03-27        0.3
So I want to know how to modify my pd.read_csv() code. The class of the feature matrix is not a numeric value but a CSV file containing time series values. Any advice is welcome. Thanks in advance.
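For reference, a minimal sketch of reading one of the per-state files on its own. It assumes the file is whitespace-separated with no header row, as the a.csv listing suggests; the column names "date" and "value" are made up for illustration, and an in-memory string stands in for the real file.

```python
import io
import pandas as pd

# Stand-in for a.csv; assumed whitespace-separated with no header row.
a_csv = io.StringIO(
    "2010-08-27        0.2\n"
    "2010-09-27        0.7\n"
    "2010-10-27        0.6\n"
    "2010-11-27        0.9\n"
)

# Read the two unnamed columns, parse the first as dates, and use it as the index.
ts = pd.read_csv(
    a_csv,
    sep=r"\s+",
    header=None,
    names=["date", "value"],   # hypothetical column names
    parse_dates=["date"],
    index_col="date",
)["value"]

print(ts)
```

With a real file, the path 'a.csv' would be passed in place of the StringIO object.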
#2
What shape do you envision for your DataFrame?
#3
I am trying to build a model that predicts 'people_in_poverty' for each 'State' from the 'Earnings' and race inputs. So my feature matrix would look like this:

State   Earnings   Hispanic   Indian   Asian   Black   White                             people_in_poverty
 
Alabama   0.2        0.4        0.6     0.6      0.2    0.8     [[2010-08-27,0.2], [2010-09-27,0.7], [2010-10-27,0.6], [2010-11-27,0.9]]
Florida   0.5        0.6        0.4     0.1      0.6    0.7     [[2010-08-27,0.5], [2010-09-27,0.6], [2010-10-27,0.2], [2010-11-27,0.8]]
Kentucky  0.7        0.7        0.9     0.8      0.3    0.6     [[2010-08-27,0.1], [2010-09-27,0.4], [2010-10-27,0.5], [2010-11-27,0.5]]
Minnesota 0.3        0.1        0.2     0.5      0.2    0.7     [[2010-08-27,0.6], [2010-09-27,0.3], [2010-10-27,0.7], [2010-11-27,0.3]]
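One way to get a shape like the one above, sketched under assumptions: dataset.csv is taken to be comma-separated with a header row, the per-state files are taken to be whitespace-separated date/value pairs, and in-memory strings stand in for the files on disk. The helper load_series is hypothetical. Instead of nesting lists inside one cell, the sketch keeps the inputs in X and puts the time series in a separate table Y with one column per date.

```python
import io
import pandas as pd

# In-memory stand-ins for dataset.csv and two per-state files
# (values copied from the example rows; real code would use file paths).
dataset_csv = io.StringIO(
    "State,Earnings,Hispanic,Indian,Asian,Black,White,people_in_poverty\n"
    "Alabama,0.2,0.4,0.6,0.6,0.2,0.8,a.csv\n"
    "Florida,0.5,0.6,0.4,0.1,0.6,0.7,b.csv\n"
)
series_files = {
    "a.csv": "2010-08-27 0.2\n2010-09-27 0.7\n2010-10-27 0.6\n2010-11-27 0.9\n",
    "b.csv": "2010-08-27 0.5\n2010-09-27 0.6\n2010-10-27 0.2\n2010-11-27 0.8\n",
}

def load_series(name):
    """Read one date/value file into a Series indexed by date (hypothetical helper)."""
    frame = pd.read_csv(io.StringIO(series_files[name]), sep=r"\s+",
                        header=None, names=["date", "value"])
    return frame.set_index("date")["value"]

df = pd.read_csv(dataset_csv)

# Inputs: the six numeric attribute columns, one row per state.
X = df.set_index("State").drop(columns="people_in_poverty")

# Targets: one row per state, one column per date.
Y = pd.DataFrame({state: load_series(fname)
                  for state, fname in zip(df["State"], df["people_in_poverty"])}).T

print(X.shape, Y.shape)
```

Keeping the targets as a plain numeric table tends to be easier to pass to a model than a column of nested lists, since X.to_numpy() and Y.to_numpy() are both regular 2-D arrays.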
#4
Hello. How about this Python code, which uses the pandas median function? Because the 'people_in_poverty' column holds lists of values, I use the median to reduce each list to a single value for the feature matrix:

import pandas as pd

data_col1 = [['2010-08-27',0.2], ['2010-09-27',0.7], ['2010-10-27',0.6], ['2010-11-27',0.9]]
data_col2 = [['2010-08-27',0.5], ['2010-09-27',0.6], ['2010-10-27',0.2], ['2010-11-27',0.8]]
data_col3 = [['2010-08-27',0.1], ['2010-09-27',0.4], ['2010-10-27',0.5], ['2010-11-27',0.5]]
data_col4 = [['2010-08-27',0.6], ['2010-09-27',0.3], ['2010-10-27',0.7], ['2010-11-27',0.3]]

df1 = pd.DataFrame(data_col1, columns=['Date', 'Value'])
df2 = pd.DataFrame(data_col2, columns=['Date', 'Value'])
df3 = pd.DataFrame(data_col3, columns=['Date', 'Value'])
df4 = pd.DataFrame(data_col4, columns=['Date', 'Value'])

print(df1['Value'].median())
print(df2['Value'].median())
print(df3['Value'].median())
print(df4['Value'].median())
Any advice will be deeply appreciated. Thanks.
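The same median idea can be sketched without one DataFrame per state, assuming every state has a value for the same dates. The dictionary name poverty and the variable wide are made up for illustration; the numbers are the ones from the code above.

```python
import pandas as pd

# Values from the four example series, keyed by state.
poverty = {
    "Alabama":   [0.2, 0.7, 0.6, 0.9],
    "Florida":   [0.5, 0.6, 0.2, 0.8],
    "Kentucky":  [0.1, 0.4, 0.5, 0.5],
    "Minnesota": [0.6, 0.3, 0.7, 0.3],
}
dates = pd.to_datetime(["2010-08-27", "2010-09-27", "2010-10-27", "2010-11-27"])

# One column per state, one row per date.
wide = pd.DataFrame(poverty, index=dates)

# Column-wise median: one number per state.
medians = wide.median()
print(medians)
```

Note that a median throws away the ordering in time, so it only makes sense if a single summary value per state is an acceptable target.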
#5
If you want to isolate a certain element of your DataFrame, a simple way is to select the column, convert it to a list, and then use lst[x] to pick out the element.
column = df['ColumnTitle']  # replace ColumnTitle with your own column title
lst = column.tolist()  # one list entry per row; no string parsing needed
element = lst[x]  # where x is a zero-based row position
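As a small sketch of the same lookup without building a list, pandas can select a single element by position or by label. The two-row DataFrame here is made up for illustration.

```python
import pandas as pd

# Tiny made-up frame using two of the example rows.
df = pd.DataFrame({"State": ["Alabama", "Florida"], "Earnings": [0.2, 0.5]})

# By position: second row of the Earnings column.
second = df["Earnings"].iloc[1]

# By label: index the frame by State, then look up one cell.
florida = df.set_index("State").at["Florida", "Earnings"]

print(second, florida)
```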

Hopefully this helps!