Python Forum
Dealing with Exponential data
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Dealing with Exponential data
#1
Having csv file with the value as below
2.22971E+15,2.20408E+15
while reading from csv able to read the data as
'2229708157109627', '2204081406354342'
code:
import pandas as pd
with open('mypath\mycsv.csv', encoding="utf8") as csv_file:
    df = pd.read_csv(csv_file,index_col=None, header=0) 
but actually
2229708157109620, 2204081406354340

Why the values are changing here
from 2229708157109620 to 2229708157109627
from 2204081406354340 to 2204081406354342
?
Reply
#2
how do you get significant digits after 71 or 08?

>>> 2.22971E+15
2229710000000000.0
>>> 2.20408E+15
2204080000000000.0
>>> d = Decimal('2.20408E+15')
>>> d
Decimal('2.20408E+15')
>>> int(d)
2204080000000000
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#3
Save the below records as csv file
2229708157109627
2204081406354342
and try reading them from pandas or with csv.
you could see output as
2229708157109620
2204081406354340
Reply
#4
(May-29-2019, 10:25 AM)parthi1705 Wrote: Save the below records as csv file
2229708157109627
2204081406354342
(May-29-2019, 10:07 AM)parthi1705 Wrote: Having csv file with the value as below
2.22971E+15,2.20408E+15


So it's
Output:
2229708157109627,2204081406354342
not as stated in original post
Output:
2.22971E+15,2.20408E+15
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#5
(May-29-2019, 10:25 AM)parthi1705 Wrote: try reading them from pandas or with csv.

Here it is with csv module:

import csv

with open('spam.csv') as f:
    rdr = csv.reader(f)
    for line in rdr:
        foo, bar = line
        print(foo)
        print(int(foo))
        print(bar)
        print(int(bar))
Output:
2229708157109627 2229708157109627 2204081406354342 2204081406354342 >>>
what is in the file, that's what you get
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#6
The values must divided by 1e15?
You can use decimal.Decimal and change the precision, but Decimal is slow.

import decimal


def convert(integers, divisor='1e15'):
    with decimal.localcontext() as ctx:
        ctx.prec = 20
        for value in integers:
            yield decimal.Decimal(value) / decimal.Decimal(divisor)


list(convert(['2229708157109627', '2204081406354342']))
Output:
[Decimal('2.229708157109627'), Decimal('2.204081406354342')]
Numpy has a float128 datatype, but 64 seems to be enough to represent the value.
Are you using a 32 bit version of Python?
With numpy I got following (Python 3.7.3 x64):

In [34]: import numpy as np                                                                                                                   

In [35]: np.float128(num)                                                                                                                     
Out[35]: 2229708157109627.0

In [36]: np.float64(num)                                                                                                                      
Out[36]: 2229708157109627.0

In [37]: np.float32(num)                                                                                                                      
Out[37]: 2229708100000000.0

In [38]: np.float128(num) / 1e15                                                                                                              
Out[38]: 2.229708157109627

In [39]: np.float64(num) / 1e15                                                                                                               
Out[39]: 2.229708157109627

In [40]: np.float32(num) / 1e15                                                                                                               
Out[40]: 2.22970811252736

In [41]: num / 1e15                                                                                                                           
Out[41]: 2.229708157109627

In [42]: num                                                                                                                                  
Out[42]: 2229708157109627
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
Reply
#7
(May-29-2019, 10:40 AM)buran Wrote:
(May-29-2019, 10:25 AM)parthi1705 Wrote: Save the below records as csv file
2229708157109627
2204081406354342
(May-29-2019, 10:07 AM)parthi1705 Wrote: Having csv file with the value as below
2.22971E+15,2.20408E+15


So it's
Output:
2229708157109627,2204081406354342
not as stated in original post
Output:
2.22971E+15,2.20408E+15

Buran,
If you open csv file you would probably see the data as 2.22971E+15,2.20408E+15 unless you have the mouse pointer over the cell.

Actual csv file data goes like below.

2229708157109620
2204081406354340
892984837048418000
2204081406354340
843482762412573
1667516366684810

>>>
>>> import pandas as pd
>>> with open('D:\mycsv.csv', encoding="utf8") as csv_file:
        df = pd.read_csv(csv_file,index_col=None, header=0) 

>>> df['ID'].head(6)
0      2229708157109627
1      2204081406354342
2    892984837048418304
3      2204081406354342
4       843482762412573
5      1667516366684818
Name: ID, dtype: int64
Save those records and see what it looks like...
Original records are from database to my csv file. one more wonder is when i save back them again to csv file able to see original records
>>> newdf= df['ID'].head(10)
>>> newdf.to_csv('D:\spam.csv', index = None, header=True, encoding='utf-8')
but why while displaying it shows differently ?? Huh Huh
Have attached screenshot for reference.
Installed version of Python 3.7.0

Attached Files

Thumbnail(s)
   
Reply
#8
That is not python/pandas problem.
The problem is that you view your csv file in some spreadsheet software (Excel, OpenOffice, etc., most probably Excel)
That is why you see scientific (exponential) notation. In addition, it's the excel that truncates the significant digits up to 15, i.e. the length stays the same, but significant digits are the first 15.
See for yourself - on the left is the actual csv file and on the right side is the same file open in excel
   

To see what you really have in the file, open your csv file as it comes from the database in a notepad or notepadd++ or some other simple text editor. You will see it's what pandas show as dataframe values

As an advice and to avoid such confusions in the future, when refer to data/content in csv/txt files, always tell what you see in notepad/notepad++ and not in what you see being displayed in a spreadsheet.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#9
Video 
It is not the issue in the way we view the data with csv file.. it the way we read the actual data from csv file.

Can you try reading the Csv file has below data
2229708157109620
2204081406354340
892984837048418000
and can show what you get in python or pandas or csv or with any other module.
Reply
#10
Do you open the csv file in notepad and see the data in it as I requested in my previous post? Yes or No? Note - open the file as exported from DB, not opened and saved in excel.
I demonstrated that the issue is with the excel.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Excel from SAP - dealing with formats and VBA MasterOfDestr 7 447 Feb-25-2024, 12:23 PM
Last Post: Pedroski55
  UnicodeEncodeError - Dealing with Japanese Characters fioranosnake 2 2,352 Jul-07-2022, 08:43 PM
Last Post: fioranosnake
  how to get non-exponential format Skaperen 1 1,415 Nov-21-2021, 08:51 PM
Last Post: bowlofred
  Dealing with duplicated data in a CSV file bts001 10 11,187 Sep-06-2021, 12:11 AM
Last Post: SamHobbs
  Dealing with a .json nightmare... ideas? t4keheart 10 4,249 Jan-28-2020, 10:12 PM
Last Post: t4keheart
  understanding exponential and bitwise operators srm 1 2,000 Jun-15-2019, 11:14 AM
Last Post: ThomasL
  Dealing with multiple context managers heras 5 4,600 Nov-16-2018, 09:01 AM
Last Post: DeaD_EyE
  dealing with big data of timestamp LMQ 0 2,142 Jul-27-2018, 01:23 PM
Last Post: LMQ
  dealing with spaces in file names AceScottie 5 74,633 Jun-02-2018, 01:06 PM
Last Post: AceScottie
  Dealing with strings thru mmap in Python doublezero 4 7,935 Mar-01-2017, 06:33 PM
Last Post: zivoni

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020