Python Forum
Step through a really large CSV file incrementally in Python
#4
(May-06-2019, 08:14 PM)Yoriz Wrote: Maybe something like this will work

Note: all code is untested.

import csv
from datetime import datetime

with open(source) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    next(csv_reader)  # skip the header row
    insert_sql = """INSERT INTO billing_info_test (InvoiceId, PayerAccountId, LinkedAccountId) VALUES (%s, %s, %s)"""
    rows = []
    row_count = 0
    for row in csv_reader:
        row_count += 1
        rows.append(row)
        if row_count == 1000:
            # insert the buffered batch, then reset the buffer
            cursor.executemany(insert_sql, rows)
            print(cursor.rowcount, 'inserted with LinkedAccountId', row[2], 'at', datetime.now().isoformat())
            rows = []
            row_count = 0
    if rows:
        # flush whatever is left in the final partial batch
        cursor.executemany(insert_sql, rows)
        print(cursor.rowcount, 'inserted with LinkedAccountId', row[2], 'at', datetime.now().isoformat())
    print("Committing the DB")
    mydb.commit()
cursor.close()
mydb.close()
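
For context, the snippet assumes source, mydb, and cursor already exist. A minimal setup sketch, assuming MySQL Connector/Python (the %s placeholders suggest it or a compatible driver); the connection parameters and file name below are hypothetical placeholders:

import mysql.connector

# hypothetical connection details -- replace with your own
mydb = mysql.connector.connect(
    host='localhost',
    user='billing_user',
    password='secret',
    database='billing',
)
cursor = mydb.cursor()
source = 'billing.csv'  # path to the large CSV file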


Using itertools
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


with open(source) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    next(csv_reader)  # skip the header row
    insert_sql = """INSERT INTO billing_info_test (InvoiceId, PayerAccountId, LinkedAccountId) VALUES (%s, %s, %s)"""
    for batch in grouper(csv_reader, 1000):
        # zip_longest pads the final batch with the fillvalue (None here),
        # so drop the padding before handing the batch to executemany
        rows = [row for row in batch if row is not None]
        cursor.executemany(insert_sql, rows)
        print(cursor.rowcount, 'inserted with LinkedAccountId', rows[-1][2], 'at', datetime.now().isoformat())
    print("Committing the DB")
    mydb.commit()
cursor.close()
mydb.close()
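
The filtering step matters because zip_longest pads the last group up to length n. A quick illustration, using the example from the grouper docstring comment:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

print(list(grouper('ABCDEFG', 3, 'x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
# with the default fillvalue=None, the final chunk would be ('G', None, None),
# and passing those None "rows" to executemany would fail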

Your first example worked really nicely! Thank you for that! It supercharged this process. I also want to learn the itertools method; I will try that tomorrow. Best wishes!