Python Forum

Full Version: import csv adding a header with pandas
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Hi,

I'm importing a csv without a header. And I'm trying to tweak my script so that a header is included during load.

In addition to including the header, can I define the field types?

import pandas, csv

from io import StringIO
from sqlalchemy import create_engine

def psql_insert_copy(table, conn, keys, header, data_iter):
   dbapi_conn = conn.connection
   with dbapi_conn.cursor() as cur:
       s_buf = StringIO()
       writer = csv.writer(s_buf)
       writer.writerows(data_iter)
       s_buf.seek(0)
       s_buf.write(header + "\n" + old) #--> test include header
       columns = ', '.join('"{}"'.format(k) for k in keys)
       if table.schema:
           table_name = '{}.{}'.format(table.schema, table.name, columns)
       else:
           table_name = table.name
       sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(table_name, columns)
       cur.copy_expert(sql=sql, file=s_buf)

engine = create_engine('postgresql://xxxxx:xxxx@xxxx:xxxxx/xxxxxxx')

df = pandas.read_csv("xxxxxx.csv")
df.to_sql('xxx', engine, schema='xxx', method=psql_insert_copy)
the layout of the file I'm importing

Quote:9999999,012021,I,0099999999,000000000099999999+,000000000000000000-,000000000000000000-
9999999,012021,I,0099999999,000000000099999999+,000000000000000000-,000000000000000000-
9999999,012021,I,0099999999,000000000099999999+,000000000000000000-,000000000000000000-
9999999,012021,I,0099999999,000000000099999999+,000000000000000000-,000000000000000000-

Header desejável

cnpj varchar,
data date,
tipo char,
conta varchar,
value integer,
value2 interger,
value3 integer



So, any ideas for implementation?