Python Forum
trouble importing and converting CSV
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
trouble importing and converting CSV
#1
I am banging my head against the wall!!! I'm very new to Python and thought I'd actually run through some tutorials using the kind of data I'll be working with... I've tried several ways of importing my CSV files using both Numpy and Pandas, with errors every time!!!
column 1 is date, column 2 is Time WITH milliseconds, and the rest of the columns are all various numbers. Column 1 and 2 are only used for plotting the data its all the numbers columns that actually need to be analized... here is basically the format of the CSV files... I know column 1 and 2 need to be strings and Ideally I would merge them into 1 column... but I just want to be able to start doing something with my data... Biggest issue is getting Pandas to recognize and convert dates and times.

2017/6/13, 09:39:12.0, 5813.00, 5813.75, 5812.50, 5813.75, 189, 146
2017/6/13, 09:39:17.0, 5814.00, 5814.75, 5810.50, 5814.25, 940, 700
2017/6/13, 09:39:49.0, 5814.75, 5815.50, 5814.00, 5815.50, 289, 203
2017/6/13, 09:40:01.0, 5815.50, 5816.25, 5814.75, 5816.25, 647, 446

ANY ADVICE APPRECIATED!!!
Reply
#2
Pandas is a good choice; It is easy to operate with dates using it.

import pandas as pd

import io # you don't need to import this, if you load data from a file, e.g. your_file.csv


string_data = b'''
date,time,val1,val2,val3,val4,val5,val6
2017/6/13, 09:39:12.0,5813.00,5813.75,5812.50,5813.75,189,146
2017/6/13,09:39:17.0,5814.00,5814.75,5810.50,5814.25,940,700
2017/6/13,09:39:49.0  ,5814.75,5815.50,5814.00,5815.50,289,203
2017/6/13,09:40:01.0,5815.50,5816.25,5814.75,5816.25,647,446'''

data = pd.read_csv(io.BytesIO(string_data), encoding='utf-8', sep=',')
# if you are loading data from a file, pass the file name instead:
# data = pd.read_csv('your_file.csv', sep=',')

# clean whitespaces if they exist 
data = data.applymap(lambda x: x.strip() if isinstance(x, str) else x)

# combine date and time into single column
data['datetime'] = pd.to_datetime(data.date + ' ' + data.time, format="%Y/%m/%d %H:%M:%S.0")


data = data.drop(columns=['date', 'time']) # drop combined columns

print(data.datetime)
Output:
0 2017-06-13 09:39:12 1 2017-06-13 09:39:17 2 2017-06-13 09:39:49 3 2017-06-13 09:40:01 Name: datetime, dtype: datetime64[ns] <class 'pandas.core.frame.DataFrame'> RangeIndex: 4 entries, 0 to 3 Data columns (total 7 columns): val1 4 non-null float64 val2 4 non-null float64 val3 4 non-null float64 val4 4 non-null float64 val5 4 non-null int64 val6 4 non-null int64 datetime 4 non-null datetime64[ns] dtypes: datetime64[ns](1), float64(4), int64(2) memory usage: 304.0 bytes
Reply


Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020