Posts: 37
Threads: 13
Joined: Apr 2018
Having csv file with the value as below
2.22971E+15,2.20408E+15
while reading from csv able to read the data as
'2229708157109627', '2204081406354342'
code:
1 2 3 |
import pandas as pd
with open ( 'mypath\mycsv.csv' , encoding = "utf8" ) as csv_file:
df = pd.read_csv(csv_file,index_col = None , header = 0 )
|
but actually
2229708157109620, 2204081406354340
Why the values are changing here
from 22297081571096 20 to 22297081571096 27
from 22040814063543 40 to 22040814063543 42
?
Posts: 8,165
Threads: 160
Joined: Sep 2016
May-29-2019, 10:20 AM
(This post was last modified: May-29-2019, 10:20 AM by buran.)
how do you get significant digits after 71 or 08?
1 2 3 4 5 6 7 8 9 |
>>> 2.22971E + 15
2229710000000000.0
>>> 2.20408E + 15
2204080000000000.0
>>> d = Decimal( '2.20408E+15' )
>>> d
Decimal( '2.20408E+15' )
>>> int (d)
2204080000000000
|
Posts: 37
Threads: 13
Joined: Apr 2018
Save the below records as csv file
2229708157109627
2204081406354342
and try reading them from pandas or with csv.
you could see output as
2229708157109620
2204081406354340
Posts: 8,165
Threads: 160
Joined: Sep 2016
(May-29-2019, 10:25 AM)parthi1705 Wrote: Save the below records as csv file
2229708157109627
2204081406354342 (May-29-2019, 10:07 AM)parthi1705 Wrote: Having csv file with the value as below
2.22971E+15,2.20408E+15
So it's
Output: 2229708157109627,2204081406354342
not as stated in original post
Output: 2.22971E+15,2.20408E+15
Posts: 8,165
Threads: 160
Joined: Sep 2016
May-29-2019, 11:02 AM
(This post was last modified: May-29-2019, 11:02 AM by buran.)
(May-29-2019, 10:25 AM)parthi1705 Wrote: try reading them from pandas or with csv.
Here it is with csv module:
1 2 3 4 5 6 7 8 9 10 |
import csv
with open ( 'spam.csv' ) as f:
rdr = csv.reader(f)
for line in rdr:
foo, bar = line
print (foo)
print ( int (foo))
print (bar)
print ( int (bar))
|
Output: 2229708157109627
2229708157109627
2204081406354342
2204081406354342
>>>
what is in the file, that's what you get
Posts: 2,128
Threads: 11
Joined: May 2017
May-29-2019, 01:45 PM
(This post was last modified: May-29-2019, 01:46 PM by DeaD_EyE.)
The values must divided by 1e15?
You can use decimal.Decimal and change the precision, but Decimal is slow.
1 2 3 4 5 6 7 8 9 10 11 |
import decimal
def convert(integers, divisor = '1e15' ):
with decimal.localcontext() as ctx:
ctx.prec = 20
for value in integers:
yield decimal.Decimal(value) / decimal.Decimal(divisor)
list (convert([ '2229708157109627' , '2204081406354342' ]))
|
Output: [Decimal('2.229708157109627'), Decimal('2.204081406354342')]
Numpy has a float128 datatype, but 64 seems to be enough to represent the value.
Are you using a 32 bit version of Python?
With numpy I got following (Python 3.7.3 x64):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
In [ 34 ]: import numpy as np
In [ 35 ]: np.float128(num)
Out[ 35 ]: 2229708157109627.0
In [ 36 ]: np.float64(num)
Out[ 36 ]: 2229708157109627.0
In [ 37 ]: np.float32(num)
Out[ 37 ]: 2229708100000000.0
In [ 38 ]: np.float128(num) / 1e15
Out[ 38 ]: 2.229708157109627
In [ 39 ]: np.float64(num) / 1e15
Out[ 39 ]: 2.229708157109627
In [ 40 ]: np.float32(num) / 1e15
Out[ 40 ]: 2.22970811252736
In [ 41 ]: num / 1e15
Out[ 41 ]: 2.229708157109627
In [ 42 ]: num
Out[ 42 ]: 2229708157109627
|
Posts: 37
Threads: 13
Joined: Apr 2018
May-30-2019, 04:27 AM
(This post was last modified: May-30-2019, 04:56 AM by parthi1705.)
(May-29-2019, 10:40 AM)buran Wrote: (May-29-2019, 10:25 AM)parthi1705 Wrote: Save the below records as csv file
2229708157109627
2204081406354342 (May-29-2019, 10:07 AM)parthi1705 Wrote: Having csv file with the value as below
2.22971E+15,2.20408E+15
So it's
Output: 2229708157109627,2204081406354342
not as stated in original post
Output: 2.22971E+15,2.20408E+15
Buran,
If you open csv file you would probably see the data as 2.22971E+15,2.20408E+15 unless you have the mouse pointer over the cell.
Actual csv file data goes like below.
2229708157109620
2204081406354340
892984837048418000
2204081406354340
843482762412573
1667516366684810
1 2 3 4 5 6 7 8 9 10 11 12 13 |
>>>
>>> import pandas as pd
>>> with open ( 'D:\mycsv.csv' , encoding = "utf8" ) as csv_file:
df = pd.read_csv(csv_file,index_col = None , header = 0 )
>>> df[ 'ID' ].head( 6 )
0 2229708157109627
1 2204081406354342
2 892984837048418304
3 2204081406354342
4 843482762412573
5 1667516366684818
Name: ID , dtype: int64
|
Save those records and see what it looks like...
Original records are from database to my csv file. one more wonder is when i save back them again to csv file able to see original records
1 2 |
>>> newdf = df[ 'ID' ].head( 10 )
>>> newdf.to_csv( 'D:\spam.csv' , index = None , header = True , encoding = 'utf-8' )
|
but why while displaying it shows differently ??
Have attached screenshot for reference.
Installed version of Python 3.7.0
Attached Files
Thumbnail(s)
Posts: 8,165
Threads: 160
Joined: Sep 2016
May-30-2019, 06:48 AM
(This post was last modified: May-30-2019, 06:48 AM by buran.)
That is not python/pandas problem.
The problem is that you view your csv file in some spreadsheet software (Excel, OpenOffice, etc., most probably Excel)
That is why you see scientific (exponential) notation. In addition, it's the excel that truncates the significant digits up to 15, i.e. the length stays the same, but significant digits are the first 15.
See for yourself - on the left is the actual csv file and on the right side is the same file open in excel
To see what you really have in the file, open your csv file as it comes from the database in a notepad or notepadd++ or some other simple text editor. You will see it's what pandas show as dataframe values
As an advice and to avoid such confusions in the future, when refer to data/content in csv/txt files, always tell what you see in notepad/notepad++ and not in what you see being displayed in a spreadsheet.
Posts: 37
Threads: 13
Joined: Apr 2018
May-30-2019, 09:06 AM
(This post was last modified: May-30-2019, 09:06 AM by parthi1705.)
It is not the issue in the way we view the data with csv file.. it the way we read the actual data from csv file.
Can you try reading the Csv file has below data
2229708157109620
2204081406354340
892984837048418000
and can show what you get in python or pandas or csv or with any other module.
Posts: 8,165
Threads: 160
Joined: Sep 2016
May-30-2019, 09:16 AM
(This post was last modified: May-30-2019, 09:17 AM by buran.)
Do you open the csv file in notepad and see the data in it as I requested in my previous post? Yes or No? Note - open the file as exported from DB, not opened and saved in excel.
I demonstrated that the issue is with the excel.
|