Apr-27-2018, 11:18 AM
I've to export the database table to Excel (xlsx) format.
I came to know that Python is very fast in file processing.
I've installed python 3.6.1, XlsxWriter-1.0.2 and cx_Oracle-5.3-12c
But I saw the date field from the database was getting converted to the number. To resolve this, I made a check if a list value \ (cell) value is date time, if yes then formatted the value to date type which resolves my problem
But, I am now running into performance issue with this extra check any suggestion how to gain performance
Below python function is used to generate excel xlsx from the database cursor
PS: Quick background of how I am proessing:
I am opening a loop which is processing only 50,000 rows and then calls this python excel writer function:
I've close to 900 such table and data counts in these ranges from 1,00,000 to 3,20,00,00,000
I am newbie to python and only think of executing the same python code for separate code parallel 10 times.
Guidance would be much appreciated
I came to know that Python is very fast in file processing.
I've installed python 3.6.1, XlsxWriter-1.0.2 and cx_Oracle-5.3-12c
But I saw the date field from the database was getting converted to the number. To resolve this, I made a check if a list value \ (cell) value is date time, if yes then formatted the value to date type which resolves my problem
But, I am now running into performance issue with this extra check any suggestion how to gain performance
Below python function is used to generate excel xlsx from the database cursor
PS: Quick background of how I am proessing:
I am opening a loop which is processing only 50,000 rows and then calls this python excel writer function:
I've close to 900 such table and data counts in these ranges from 1,00,000 to 3,20,00,00,000
I am newbie to python and only think of executing the same python code for separate code parallel 10 times.
Guidance would be much appreciated
#Function to write excel from Oracle Cursor def writeToExcel(cur_sor, targetDir, export_file_name): Actual_Path = os.path.join(targetDir, export_file_name) #Array to capture Date type columns DateTimeColumns = [] print('\t\t\t writing: '+export_file_name+'\t\t\t\t'+str(datetime.datetime.now())) workbook = xlsxwriter.Workbook(Actual_Path) # Create Excel Object for new workbook worksheet = workbook.add_worksheet(sourceSYS) # Add a New Worksheet Name - scott_schema row = 0 col = 0 for i in range(len(cur_sor.description)): desc = cur_sor.description[i] #Only Data Type column will be capture if format(desc[1])== "<class 'cx_Oracle.TIMESTAMP'>": DateTimeColumns.append(i) bold = workbook.add_format({'bold': True}) date_format = workbook.add_format({'num_format': 'dd/mm/yy'}) worksheet.write(row, (col + i), format(desc[0]), bold) # Iterate for column headers date_format = workbook.add_format({'num_format': 'dd/mm/yy'}) color_format = workbook.add_format() color_format.set_font_color('red') row = row + 1 #Loop for each row, return by database cursor for tupple_row in cur_sor: col = 0 #Loop for each column, for particular row open in level-up cursor for list_item in tupple_row: #If column position matches with datetype column position if col in DateTimeColumns: #Check if the cell value is date type, additional check to handle "None" (blank date cell) if isinstance(list_item, datetime.date) or isinstance(list_item, datetime.datetime) \ or isinstance(list_item, datetime.time) or isinstance(list_item, datetime.timedelta): #Format the date, inorder to save as date rather number worksheet.write(row, col, list_item.strftime("%Y-%b-%d %H:%M:%S.%f")) else: worksheet.write(row, col, list_item) col = col + 1 row = row + 1 workbook.close()