Jun-12-2019, 09:58 PM
I'm using openpyxl to read 500,000 to 900,000 record excel files with 100 columns, give or take.
I have a function that I'm using to read a row:
    def read_row(worksheet, row, cols):
        row_data = []
        for index in range(1, cols + 1):
            row_data.append(worksheet.cell(row, column=index).value)
        return row_data

but it takes between 0.2 and 0.8 seconds to read and return each row.
That average of 0.5 seconds per row times 900,000 rows works out to roughly 5 days of processing time for a single pull.
Is there any way to speed up this function or to use a faster module for excel?
I'm not married to any particular approach beyond using Python to read Excel, so I'm open to any constructive advice.
Thank you
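For reference, one common speedup with openpyxl itself is to open the workbook with read_only=True and iterate with iter_rows(values_only=True), which streams the sheet instead of building a Cell object per lookup. A minimal sketch, assuming an .xlsx file (the path and column count here are placeholders, not from the original post):

```python
from openpyxl import load_workbook

def read_rows(path, max_col):
    """Yield each row as a tuple of values, streaming the sheet."""
    # read_only=True streams the worksheet instead of loading it fully;
    # values_only=True returns plain values rather than Cell objects.
    wb = load_workbook(path, read_only=True)
    ws = wb.active
    for row in ws.iter_rows(min_col=1, max_col=max_col, values_only=True):
        yield row
    wb.close()  # read-only workbooks keep the file handle open until closed
```

Because iter_rows walks the sheet sequentially, each row costs a small constant amount of work instead of the per-cell random access in read_row above. If per-row Python overhead is still too slow, pandas.read_excel (which can use openpyxl as its engine) loads the whole sheet into a DataFrame in one call and is often faster for bulk reads.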