Oct-25-2019, 05:56 PM
I have a parquet file with 4 columns. It looks something like this:
TYPE | ID | SRNO | AMT
D | 123456 | 1 | 100.00
D | 123457 | 2 | 200.00
D | 123459 | 3 | 500.00
D | 123458 | 4 | 1000.00
The schema for this file is:
dataframe.printSchema()
Output:
|-- TYPE: string (nullable = true)
|-- ID: integer (nullable = true)
|-- SRNO: integer (nullable = true)
|-- AMT: decimal(15,2) (nullable = true)
NOTE: When I read this file in pandas, the dtype of the decimal column changes and is shown as object.
pandas_dataframe.dtypes
Output:
TYPE object
ID int32
SRNO int32
AMT object
I get the error below when I try to run SQL on the DataFrame.
ps.sqldf("select * from pandas_dataframe")
Error:
Traceback (most recent call last):
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1229, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 577, in do_executemany
cursor.executemany(statement, parameters)
sqlite3.InterfaceError: Error binding parameter 3 - probably unsupported type.
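The traceback suggests the failure comes from the sqlite3 driver rather than from pandasql itself: parameter 3, counting from zero, would be the AMT column, and sqlite3 has no built-in binding for decimal.Decimal. A minimal sketch that reproduces this with the standard library only, assuming the AMT values reach the driver as decimal.Decimal objects (the table name t and the helper try_insert are made up for illustration):

```python
import sqlite3
from decimal import Decimal


def try_insert(row):
    """Insert one row into a throwaway in-memory table.

    Returns None on success, or the error text if the driver rejects the row.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (TYPE TEXT, ID INTEGER, SRNO INTEGER, AMT REAL)")
        conn.execute("INSERT INTO t VALUES (?, ?, ?, ?)", row)
        return None
    except sqlite3.Error as exc:
        # The exact exception class and message differ between Python versions.
        return f"{type(exc).__name__}: {exc}"
    finally:
        conn.close()


# Same row, once with a Decimal amount and once with the amount cast to float.
decimal_error = try_insert(("D", 123456, 1, Decimal("100.00")))
float_error = try_insert(("D", 123456, 1, float(Decimal("100.00"))))

print("Decimal:", decimal_error)
print("float:  ", float_error)
```

The Decimal row should be rejected with a binding error, while the float row goes through. The exact wording of the message depends on the Python version.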
FYI, I tried casting the Decimal field to String and to Double, and both work fine. Does this mean pandasql cannot handle data with Decimal datatypes? Is there a better programmatic alternative for handling this than explicit typecasting?
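One possible alternative to casting each column by hand is sqlite3.register_adapter, which tells the driver how to store decimal.Decimal values. This is a sketch under the assumption that pandasql ends up writing the frame through Python's sqlite3 module, as the traceback suggests. It is untested against pandasql itself; the example uses only the standard library, and the table name t is made up for illustration:

```python
import sqlite3
from decimal import Decimal

# Tell sqlite3 how to bind Decimal values. The registration is global to the
# sqlite3 module, so it applies to every connection opened in this process.
sqlite3.register_adapter(Decimal, str)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, AMT TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(123456, Decimal("100.00")), (123457, Decimal("200.00"))],
)
rows = conn.execute("SELECT ID, AMT FROM t ORDER BY ID").fetchall()
conn.close()

# The amounts come back as text, so convert them back to Decimal if needed.
amounts = [Decimal(amt) for _, amt in rows]
print(rows)
print(amounts)
```

Adapting to str keeps the exact digits but stores the amounts as text. Registering float instead would allow numeric comparisons in SQL, at the cost of binary floating-point rounding.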
Thanks in Advance