Sep-16-2022, 10:43 AM
In the code below, page is the webpage response from Playwright, for what it's worth:
The CSV file gets created OK and looks good as far as I can tell. But when I try to import this file into a Microsoft Access database, nothing is imported. Is there some sort of conversion that needs to be done before writing the data to the CSV file?
Thank you
for row in page.query_selector_all("table tbody tr"):
    i += 1
    link = row.query_selector("xpath=td/a[text()='View']")
    url = link.get_attribute('href')
    charges = process_details(url)
    if charges not in [' ', ' err']:
        charges = fix_charge(charges)
        print(charges)
        for charge in charges.split(';'):
            print('------', charge)
            record = [
                row.query_selector_all("td")[1].inner_text(),
                row.query_selector_all("td")[2].inner_text(),
                row.query_selector_all("td")[3].inner_text(),
                row.query_selector_all("td")[4].inner_text(),
                row.query_selector_all("td")[6].inner_text(),
                charge.strip().replace(',', ';'),
                new_date
            ]
            if charge.strip() not in [' ', ' ']:
                output_file.writerow(record)
    page.go_back()

file.close()


def fix_charge(charge):
    charge = charge[charge.index(' '):].lstrip()
    for num in re.findall('\d+', charge):
        charge = charge.replace(num, '1')
    charge = charge.replace('1.', ';')
    charge = charge.replace('1-', ';')
    charge = charge.replace('1)', ';')
    charge = charge.replace('(1)', ';')
    charge = charge.replace(' 1 ', ';')
    charge = charge.strip().strip(',')
    return charge.strip()
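
The snippet above does not show how output_file and file are created, so here is a minimal sketch of one possible setup, assuming Python's standard csv.writer. The write_records helper, the header names, and the sample rows are all hypothetical and not taken from the original code. The sketch opens the file with newline="" so each row ends in a single CRLF, then reads the raw bytes back so the line endings can be checked before attempting the Access import.

```python
import csv
import os
import tempfile

# Hypothetical column names; the original post does not list them.
HEADER = ["col1", "col2", "col3", "col4", "col6", "charge", "date"]


def write_records(path, records):
    """Write a header row plus data rows to a CSV file (assumed setup)."""
    # newline="" stops Python from translating the "\r\n" that csv.writer
    # emits; without it, output on Windows ends up with "\r\r\n" endings.
    with open(path, "w", newline="", encoding="utf-8") as file:
        output_file = csv.writer(file)
        output_file.writerow(HEADER)
        output_file.writerows(records)


# Made-up sample rows standing in for the scraped table cells.
sample = [
    ["A", "B", "C", "D", "E", "Theft; petty", "09/16/2022"],
    ["F", "G", "H", "I", "J", "Assault", "09/16/2022"],
]

csv_path = os.path.join(tempfile.mkdtemp(), "charges.csv")
write_records(csv_path, sample)

# Read the raw bytes back to inspect the line endings actually written.
with open(csv_path, "rb") as fh:
    raw = fh.read()

print(raw.count(b"\r\n"), "CRLF-terminated lines")
print("doubled carriage returns present:", b"\r\r\n" in raw)
```

If the real file is opened without newline="" on Windows, the same byte check would be expected to report doubled carriage returns, which is one thing worth ruling out before looking for other causes.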