Hello :)
I have stored numbers in a file from another program, and I'd like to extract the values into lists to use with another Python program I already wrote. Here is what the data file looks like:

    ��1.12005000,1.11800000,14574
    1.11947000,1.11811000,8285
    1.12035000,1.11749000,7979
    1.11812000,1.11597000,18181
    1.13148000,1.11499000,30360
    1.13176000,1.12344000,57786
    1.12441000,1.11997000,24175
    1.12261000,1.12067000,14455
    1.12466000,1.12198000,10255
    1.12643000,1.12209000,29588

For some unknown reason I have these strange characters at the beginning. I came up with the following code to get the values and get rid of the commas and EOL characters:

    f = open('EU-1H.txt', 'r')
    a = f.readline()
    a = a[2:-1]  #removing the two odd characters from the first line as well as the EOL character
    fullarray, t = [], []
    for numb in a[:-1].split(','):
        #print numb
        t.append(numb)
    #print t
    fullarray.append(t)
    for i, line in enumerate(f):
        t = []
        for numb in line[:-1].split(','):  #just making life easier by removing the EOL
            t.append(numb)
        fullarray.append(t)
        if i == 10:
            break  #I don't need to process the whole thousands of lines to know if the code works
    #print fullarray

To describe my issue, we'll just run this with the first two "print" statements enabled, namely "print numb" and "print t". This outputs:

    1.12005000
    1.11800000
    14574
    ['1\x00.\x001\x002\x000\x000\x005\x000\x000\x000\x00', '\x001\x00.\x001\x001\x008\x000\x000\x000\x000\x000\x00', '\x001\x004\x005\x007\x004\x00\r']

And if I try to convert to float by replacing t.append(numb) with t.append(float(numb)), I get the following output:

    1.12005000
    Traceback (most recent call last):
      File "reader.py", line 9, in <module>
        t.append(float(numb))
    ValueError: invalid literal for float(): 1

To make my life easier, I wondered if I could convert it back to a string inside the list and do the conversion to float/int later. So I tried changing t.append(float(numb)) to t.append(str(numb)), which yielded as output:

    1.12005000
    1.11800000
    14574
    ['1\x00.\x001\x002\x000\x000\x005\x000\x000\x000\x00', '\x001\x00.\x001\x001\x008\x000\x000\x000\x000\x000\x00', '\x001\x004\x005\x007\x004\x00\r']

For some reason, some of this code actually works when run directly in an interpreter in a terminal. Example:

    >>> a = '1.12005000,1.11800000,14574' #copypasted from the source file
    >>> a
    '1.12005000,1.11800000,14574'
    >>> a.split(',')
    ['1.12005000', '1.11800000', '14574']

But if I read the data from the data file:

    f = open('EU-1H.txt', 'r')
    a = f.readline()
    a = a[2:-1]
    print a
    print a.split(',')

gives when run:

    1.12005000,1.11800000,14574
    ['1\x00.\x001\x002\x000\x000\x005\x000\x000\x000\x00', '\x001\x00.\x001\x001\x008\x000\x000\x000\x000\x000\x00', '\x001\x004\x005\x007\x004\x00\r\x00']

In the interpreter:

    >>> f = open('EU-1H.txt', 'r')
    >>> a = f.readline()
    >>> a[2:-1]
    '1\x00.\x001\x002\x000\x000\x005\x000\x000\x000\x00,\x001\x00.\x001\x001\x008\x000\x000\x000\x000\x000\x00,\x001\x004\x005\x007\x004\x00\r\x00'
    >>> print a[2:-1]
    1.12005000,1.11800000,14574
    >>> a[2:-1].split(',')
    ['1\x00.\x001\x002\x000\x000\x005\x000\x000\x000\x00', '\x001\x00.\x001\x001\x008\x000\x000\x000\x000\x000\x00', '\x001\x004\x005\x007\x004\x00\r\x00']
    >>> print a[2:-1].split(',')
    ['1\x00.\x001\x002\x000\x000\x005\x000\x000\x000\x00', '\x001\x00.\x001\x001\x008\x000\x000\x000\x000\x000\x00', '\x001\x004\x005\x007\x004\x00\r\x00']

I have a lot of trouble understanding what's happening there.
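One possible lead, offered only as a guess: two odd characters at the start, a \x00 next to every character, and a trailing \r\x00 are all consistent with the file being UTF-16 text with a byte order mark and Windows line endings, rather than plain ASCII. I have not confirmed that this is how the other program writes EU-1H.txt. The sketch below assumes it is; read_rows is a hypothetical helper name, and the sample file is generated on the fly so that the code runs without the real data. It is written to run on both Python 2.7 and Python 3.

```python
import io
import os
import tempfile


def read_rows(path, encoding="utf-16", limit=None):
    """Hypothetical helper: parse 'float,float,int' lines into lists.

    Assumes the file is UTF-16 with a byte order mark. That is a guess
    based on the NUL bytes in the repr output, not a confirmed fact.
    """
    rows = []
    # io.open decodes while reading: the "utf-16" codec consumes the BOM,
    # and text mode translates Windows line endings to a plain newline.
    with io.open(path, "r", encoding=encoding) as f:
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            fields = line.strip().split(",")
            if len(fields) != 3:
                continue  # skip blank or malformed lines
            rows.append([float(fields[0]), float(fields[1]), int(fields[2])])
    return rows


# Build a small UTF-16 sample file (assumed layout, two lines of the data above).
sample = u"1.12005000,1.11800000,14574\r\n1.11947000,1.11811000,8285\r\n"
fd, sample_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(sample_path, "wb") as f:
    f.write(sample.encode("utf-16"))  # the encoder prepends a 2-byte BOM

# Reading the raw bytes shows the BOM and the NUL bytes between characters.
with io.open(sample_path, "rb") as f:
    raw = f.read()
print(repr(raw[:8]))

# Decoding while reading gives clean fields that float() and int() accept.
rows = read_rows(sample_path)
print(rows)

os.remove(sample_path)
```

If the assumption holds, no manual slicing such as a[2:-1] is needed, since the codec removes the two leading bytes and strip() removes the line ending. If the file turns out to use a different encoding, the encoding argument is the only thing that should need changing.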
My main goal is just to have something working (I expected it to take no more than 10 minutes to write the storing algorithm and the data-retrieving one; apparently I couldn't have been more wrong, nothing went right), but obviously I'd also like to understand what is going on, so that I don't get stuck on this kind of issue again.
Any hints?
Thanks in advance.
Python 2.7.9 / Debian Jessie / i686