Python Forum
Handling multiple errors when using datafiles in Pandas
#1
Hello
I'm relatively new to Python so please forgive me if this isn't a particularly challenging problem.

I have a very large spreadsheet that I need to 'read' in Python and to parse each row of one of the columns in the spreadsheet. This column contains a mixture of locations that require converting to decimal degrees latitude and longitude. The mixture part is a random selection of British National Grid (SK 12345 12345 for example), Irish National Grid (J 12345 12345 for example) and Eastings & Northings (512345 412345 for example).

I can read the spreadsheet into a dataframe, and I can convert each of the location types to the correct format. The problem arises when a value is in the wrong format for the conversion routine it is passed to.

I can use try: to trap each error individually and write dummy values for lat and long so that the error doesn't stop the program. I suppose I could then run the resulting file back through the next conversion routine to gradually convert all the location data, but this is neither elegant nor particularly efficient, as I'm dealing with in excess of 250,000 locations.

Is there a way of doing this all in one pass?

I've tried nested try: blocks but couldn't get them to work reliably...

Any ideas please?

Thanks

--
Nev
#2
It is hard to say anything without further details. For example, does your data frame include missing values? Are all values presented in the same format? Coordinates might be stored as strings, e.g. dd.ddddddd (decimal degrees) or dddd mmm ssss (degrees, minutes and seconds), or as something else.

250k locations is not that many. You will probably need some regular expressions, though I'm not sure.
You can do the whole conversion in one pass, e.g. by using the .apply method of the data frame. You will need to write your own function to be applied. Hope that helps.
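To give a rough idea of what such a function might look like: instead of nesting try: blocks, it can loop over the conversion routines and return the first result that doesn't raise. This is only a sketch. The three parse_* functions are hypothetical stand-ins for your real routines (they just split the text and do no actual coordinate conversion), and I'm assuming your routines raise ValueError on the wrong input format.

```python
def first_successful(value, converters):
    """Try each converter in order and return the first result that does not raise."""
    for convert in converters:
        try:
            return convert(value)
        except ValueError:  # assumption: wrong-format input raises ValueError
            continue  # this routine rejected the value, try the next one
    return None  # nothing matched; a dummy value could be returned here instead


# Hypothetical stand-ins for the real conversion routines. They only check the
# shape of the text and return the parsed pieces, not latitude/longitude.
def parse_british_grid(value):
    letters, easting, northing = str(value).split()
    if len(letters) != 2 or not letters.isalpha():
        raise ValueError("not a British National Grid reference")
    return ("british_grid", letters.upper(), int(easting), int(northing))


def parse_irish_grid(value):
    letter, easting, northing = str(value).split()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("not an Irish National Grid reference")
    return ("irish_grid", letter.upper(), int(easting), int(northing))


def parse_easting_northing(value):
    easting, northing = str(value).split()
    return ("easting_northing", int(easting), int(northing))


CONVERTERS = [parse_british_grid, parse_irish_grid, parse_easting_northing]

for sample in ["SK 12345 12345", "J 12345 12345", "512345 412345", "garbage"]:
    print(sample, "->", first_successful(sample, CONVERTERS))
```

With a data frame you would pass this to .apply on the location column, something like df["location"].apply(first_successful, converters=CONVERTERS), where "location" is a made-up column name. Each row is then handled exactly once, whichever format it is in.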
Could you provide a subsample of your data?
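In the meantime, here is a minimal sketch of the regular expression idea: classify each value first, so that every row can be sent to the right conversion routine without relying on exceptions. The patterns are guesses based only on the three examples in your post (two letters plus two 5-digit groups, one letter plus two 5-digit groups, two 6-digit numbers), and the column name "location" is made up, so they will almost certainly need adjusting against the real data.

```python
import re

import pandas as pd

# Assumed patterns, derived only from the three examples in the question.
PATTERNS = {
    "british_grid": re.compile(r"^[A-Za-z]{2}\s*\d{5}\s+\d{5}$"),
    "irish_grid": re.compile(r"^[A-Za-z]\s*\d{5}\s+\d{5}$"),
    "easting_northing": re.compile(r"^\d{6}\s+\d{6}$"),
}


def classify(value):
    """Return the name of the first pattern that matches, or 'unknown'."""
    if not isinstance(value, str):
        return "unknown"  # covers NaN/None and other missing values
    text = value.strip()
    for name, pattern in PATTERNS.items():
        if pattern.match(text):
            return name
    return "unknown"


# Tiny made-up sample standing in for the real spreadsheet column.
df = pd.DataFrame(
    {"location": ["SK 12345 12345", "J 12345 12345", "512345 412345", None, "somewhere"]}
)
df["kind"] = df["location"].apply(classify)
print(df)
```

Once every row has a "kind", you can either dispatch inside the applied function or filter the data frame by kind and run each conversion routine on its own subset. Rows marked "unknown" are the ones worth looking at by hand.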