Python Forum
Importing a .csv file to dataframe - Printable Version




Importing a .csv file to dataframe - crispyduck - Jan-09-2019

Hello all,

I'm wondering if you may be able to help...
I'm trying to import this .csv file so I can view the data in columns under each heading; however, when imported, the data does not come out as a clean table.
Instead of each column being populated with its own data, the columns are filled with NaN values (not a number). The data seems to be bunched into the second column, under 'Reading values'.

shape is: 6949 rows, 12 columns
type is: pandas.core.frame.DataFrame

I am using:
import pandas as pd
rawdata = pd.read_csv("Desktop/rawdata.csv")
print(rawdata)




Has anyone seen this before? Any suggestions would be appreciated.
Thanks

Sorry, you can't see my image at this time as I've only just registered.


RE: Importing a .csv file to dataframe - ichabod801 - Jan-09-2019

Please use python tags for multi-line code. icode tags are for inline code.

Also, please post a few lines of the csv file and the resulting data frame, using output tags.


RE: Importing a .csv file to dataframe - crispyduck - Jan-09-2019

Hello Bunny Rabbit



here are the first 6 lines of the .csv
Output:
Reading values Reading values timestamp Insert time Reporting type Errorcode OBIS code Unit Factor CT factor Quality Quality Status Word Reading reason
Reading values,"Reading values timestamp","Insert time","Reporting type","Errorcode","OBIS code","Unit","Factor","CT factor","Quality","Quality Status Word","Reading reason"
29041.601,"21.11.2018 14:15:00","21.11.2018 14:28:12","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"
29041.601,"21.11.2018 14:30:00","21.11.2018 14:45:06","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"
29041.601,"21.11.2018 14:45:00","21.11.2018 14:59:26","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"
29041.601,"21.11.2018 15:00:00","21.11.2018 15:15:57","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"
here is the output
Output:
                                       Reading values  \
0   Reading values,"Reading values timestamp","Ins...
1   29041.601,"21.11.2018 14:15:00","21.11.2018 14...
2   29041.601,"21.11.2018 14:30:00","21.11.2018 14...
3   29041.601,"21.11.2018 14:45:00","21.11.2018 14...
4   29041.601,"21.11.2018 15:00:00","21.11.2018 15...
5   29041.601,"21.11.2018 15:30:00","21.11.2018 15...
6   29041.601,"21.11.2018 16:15:00","21.11.2018 16...
7   29041.601,"21.11.2018 16:30:00","21.11.2018 16...
8   29041.601,"21.11.2018 17:15:00","21.11.2018 17...
9   29041.601,"21.11.2018 18:15:00","21.11.2018 18...
10  29041.601,"21.11.2018 19:30:00","21.11.2018 19...
11  29041.601,"21.11.2018 20:15:00","21.11.2018 20...
12  29041.601,"21.11.2018 21:00:00","21.11.2018 21...
13  29041.601,"21.11.2018 22:15:00","21.11.2018 22...
14  29041.601,"21.11.2018 22:30:00","21.11.2018 22...
15  29041.601,"21.11.2018 23:00:00","21.11.2018 23...
16  29041.601,"21.11.2018 23:15:00","21.11.2018 23...
17  29041.601,"21.11.2018 23:45:00","22.11.2018 00...
18  29041.601,"22.11.2018 00:00:00","22.11.2018 00...
19  29041.601,"22.11.2018 00:30:00","22.11.2018 00...
20  29041.601,"22.11.2018 01:30:00","22.11.2018 01...
hope this helps


RE: Importing a .csv file to dataframe - ichabod801 - Jan-09-2019

As I said, please use output tags, not icode tags (the weird icon between the red X and the brackets([]) in the editor). I put them in for you this time.

It looks to be reading in as a data frame as you want. Most of it is coming in as strings, probably because of the quotes in the csv file. You can specify data types ahead of time with the dtype parameter of read_csv, or afterwards with the astype method of the column. Here is a Stack Overflow post with examples on how to use them.
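To illustrate the two options just mentioned, here is a minimal sketch. It assumes a cut-down, two-column sample (the column names come from this thread, the sample text itself is made up for the example) and reads it through io.StringIO instead of a real file. Whether these are the right types for the real data is an assumption too.

```python
import io

import pandas as pd

# Hypothetical two-column sample; only the column names come from the thread.
sample = 'Reading values,"Errorcode"\n29041.601,"0"\n29041.601,"0"\n'

# Option 1: set the types while reading, via the dtype parameter of read_csv.
typed = pd.read_csv(
    io.StringIO(sample),
    dtype={"Reading values": float, "Errorcode": str},
)

# Option 2: read with the defaults first, then convert one column with astype.
converted = pd.read_csv(io.StringIO(sample))
converted["Errorcode"] = converted["Errorcode"].astype(str)

print(typed.dtypes)
print(converted.dtypes)
```

Either way the Errorcode column ends up holding text rather than numbers; the first form avoids a second pass over the column.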

Edit: No, I see what you are saying about the columns all being one. I think the first line of the csv file is confusing it. It is reading it as the column names, but there are no commas, so it only sees one column. Try deleting the first row of the csv file and see if that improves things.
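The first row does not have to be deleted by hand. As a rough sketch, assuming the file really looks like the lines posted above (one comma-free line, then the real header), the skiprows parameter of read_csv can skip that line. The sample text is copied from the thread and read through io.StringIO in place of the real file; for the real data you would pass the file path instead.

```python
import io

import pandas as pd

# First lines of the file as posted in this thread, cut down to two data rows.
sample = '''Reading values Reading values timestamp Insert time Reporting type Errorcode OBIS code Unit Factor CT factor Quality Quality Status Word Reading reason
Reading values,"Reading values timestamp","Insert time","Reporting type","Errorcode","OBIS code","Unit","Factor","CT factor","Quality","Quality Status Word","Reading reason"
29041.601,"21.11.2018 14:15:00","21.11.2018 14:28:12","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"
29041.601,"21.11.2018 14:30:00","21.11.2018 14:45:06","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"
'''

# skiprows=1 drops the comma-free first line, so the second line is used as the header.
rawdata = pd.read_csv(io.StringIO(sample), skiprows=1)

print(rawdata.shape)
print(rawdata.columns.tolist())
```

This leaves the original file untouched, which may be handy if it gets exported again later.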


RE: Importing a .csv file to dataframe - crispyduck - Jan-11-2019

Thanks for the pointer to the Stack Overflow post. I searched around in this area and found some more useful posts, but not enough to solve my problem.

I know that my .csv file needs more processing before I read it in as all the data is in a single column and the headings are being ignored.

The following code forces the headings to be seen. I've done this manually for the first 3 lines:
import csv

# Header and two data rows typed out by hand (the third keeps the original quotes).
my_string1 = 'Reading values,Reading values timestamp,Insert time,Reporting type,Errorcode,OBIS code,Unit,Factor,CT factor,Quality,Quality Status Word,Reading reason'
my_string2 = '29041.601,21.11.2018 14:30:00,21.11.2018 14:45:06,absolute,0,8-0:1.0.0*255,m³,1.000,1.000,Real value,No errors,Periodic'
my_string3 = '29041.601,"21.11.2018 14:30:00","21.11.2018 14:45:06","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"'
# Split each line into a list of fields at the commas.
my_list1 = my_string1.split(",")
my_list2 = my_string2.split(",")
my_list3 = my_string3.split(",")

path = 'desktop/rawdata.csv'
rows = [my_list1,my_list2,my_list3]

# Write the rows out as a comma-separated file (this overwrites the file at path).
# newline="" stops the csv module from adding blank lines between rows on Windows.
with open(path, "w", newline="") as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    for row in rows:
        writer.writerow(row)
This nicely spreads my data across 12 columns with the data underneath them.
However, my file has just under 7000 lines, so this isn't practical!


Is there a way that I can get this process (or a similar one that does more or less the same thing) to iterate through my whole file?
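One possible way to loop over the whole file with the csv module is sketched below. The function name clean_csv, the output file name rawdata_clean.csv, the UTF-8 encoding and the single line to skip are all assumptions made for the example, not something confirmed in this thread. The demo writes a small throwaway file in a temporary folder so the real data is not touched.

```python
import csv
import os
import tempfile


def clean_csv(src_path, dst_path, skip_lines=1):
    """Copy src_path to dst_path row by row, dropping the first skip_lines lines."""
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        for _ in range(skip_lines):
            next(src, None)  # discard the comma-free line(s) at the top
        writer = csv.writer(dst, delimiter=",")
        row_count = 0
        for row in csv.reader(src):  # csv.reader splits on commas and strips the quotes
            writer.writerow(row)
            row_count += 1
    return row_count


# Demo on a throwaway copy of the first three lines posted in this thread.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "rawdata.csv")
dst_path = os.path.join(workdir, "rawdata_clean.csv")
with open(src_path, "w", newline="", encoding="utf-8") as f:
    f.write('Reading values Reading values timestamp Insert time Reporting type Errorcode OBIS code Unit Factor CT factor Quality Quality Status Word Reading reason\n')
    f.write('Reading values,"Reading values timestamp","Insert time","Reporting type","Errorcode","OBIS code","Unit","Factor","CT factor","Quality","Quality Status Word","Reading reason"\n')
    f.write('29041.601,"21.11.2018 14:15:00","21.11.2018 14:28:12","absolute","0","8-0:1.0.0*255","m³","1.000","1.000","Real value","No errors","Periodic"\n')

rows_written = clean_csv(src_path, dst_path)
print(rows_written)
```

Writing to a second file rather than back over the original means the raw export is still there if something goes wrong.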


RE: Importing a .csv file to dataframe - ichabod801 - Jan-11-2019

I just did a check. The file reads fine for me if I delete the first line. Is that not working for you?