Nov-02-2020, 10:49 PM
Hi everyone,
I'm having some slight difficulty getting this function to do exactly what I need it to do.
Essentially it reads in a file.csv, and needs to store various columns into a "master_list" of tuples.
My function works, but I'm sure it could be improved, and it does not return exactly the output I need.
These are the columns of interest:
Quote:
year = int(line[2])
month = int(line[3])
magnitude = float(line[9])
location = line[19]
latitude = float(line[20])
longitude = float(line[21])
deaths = int(line[23])
missing = int(line[25])
injuries = int(line[27])
damages = float(line[29])
and the requirement:
Quote:
If the number of deaths, missing, injured, and damages columns are empty, replace
it with a zero. If any other numerical data cannot be made into an int or float,
skip that entire line of data. Create a tuple with items in this order:
tup = (year,month,magnitude,location,latitude,longitude,\
deaths,missing,injuries,damages)
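One way to read the requirement above, as a minimal sketch rather than a definitive implementation: the helper names `number_or_zero` and `build_tuple` are hypothetical, and the sketch assumes that a non-blank but non-numeric value in the deaths, missing, injuries, or damages columns (or a row that is too short) also causes the row to be skipped, which the requirement does not spell out.

```python
def number_or_zero(text, convert):
    """Return 0 for a blank cell, otherwise convert(text).

    convert is int or float; it raises ValueError on non-numeric text.
    """
    text = text.strip()
    return convert(text) if text else 0


def build_tuple(line):
    """Build the 10-item tuple from one CSV row, or return None to skip it."""
    try:
        year = int(line[2])
        month = int(line[3])
        magnitude = float(line[9])
        location = line[19]
        latitude = float(line[20])
        longitude = float(line[21])
        # Blank cells in these four columns become 0 (assumption: any other
        # unparseable text raises and the whole row is skipped).
        deaths = number_or_zero(line[23], int)
        missing = number_or_zero(line[25], int)
        injuries = number_or_zero(line[27], int)
        damages = number_or_zero(line[29], float)
    except (ValueError, IndexError):
        return None
    return (year, month, magnitude, location, latitude, longitude,
            deaths, missing, injuries, damages)
```

A caller would loop over the rows from `csv.reader`, call `build_tuple` on each, and append every result that is not `None` to the master list.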
Here's my code:
    import csv

    def read_file(fp):
        next(fp, None)  # skip the header row
        masterList = []
        tup = ()
        for col in csv.reader(fp, delimiter=',', skipinitialspace=True):
            year = col[2]
            month = col[3]
            magnitude = col[9]
            location = col[19]
            latitude = col[20]
            longitude = col[21]
            deaths = col[23]
            missing = col[25]
            injured = col[27]
            damages = col[29]
            # skip the row if any required numeric field fails to convert
            try:
                year = int(year)
                month = int(month)
                magnitude = float(magnitude)
                latitude = float(latitude)
                longitude = float(longitude)
            except:
                continue
            if deaths.isdigit() == True:
                if int(deaths) > 0:
                    deaths = int(deaths)
                elif deaths == '':
                    deaths = int('0')
            else:
                deaths = int('0')
            if missing.isdigit() == True:
                if int(missing) > 0:
                    missing = int(missing)
                elif missing == '':
                    missing = int('0')
            else:
                missing = int('0')
            if injured.isdigit() == True:
                if int(injured) > 0:
                    injured = int(injured)
                elif injured == '':
                    injured = int('0')
            else:
                injured = int('0')
            if isinstance(damages, float) == True:
                try:
                    damages = float(damages)
                    if damages:
                        damages = int('0')
                except:
                    damages = int('0')

I think the biggest problem with my function right now is that the last column (damages) does not always return the correct value. If the cell in the CSV is blank, it needs to be 0; if not, it needs to be read as a float. I can't seem to get this right. Can anybody offer some suggestions?
Expected output:
[(2020, 1, 6.0, 'CHINA: XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0),
 (2020, 1, 6.7, 'TURKEY: ELAZIG AND MALATYA PROVINCES', 38.39, 39.081, 41, 0, 1600, 0),
 (2020, 1, 7.7, 'CUBA: GRANMA; CAYMAN IS; JAMAICA', 19.44, -78.755, 0, 0, 0, 0),
 (2020, 2, 6.0, 'TURKEY: VAN; IRAN', 38.482, 44.367, 10, 0, 60, 0),
 (2020, 3, 5.4, 'BALKANS NW: CROATIA: ZAGREB', 45.897, 15.966, 1, 0, 27, 6000.0),
 (2020, 3, 5.7, 'USA: UTAH', 40.751, -112.078, 0, 0, 0, 48.5)]
My output (notice the last float in the tuple is wrong):
[(2020, 1, 6.0, 'CHINA: XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0),
 (2020, 1, 6.7, 'TURKEY: ELAZIG AND MALATYA PROVINCES', 38.39, 39.081, 41, 0, 1600, 0),
 (2020, 1, 7.7, 'CUBA: GRANMA; CAYMAN IS; JAMAICA', 19.44, -78.755, 0, 0, 0, 0),
 (2020, 2, 6.0, 'TURKEY: VAN; IRAN', 38.482, 44.367, 10, 0, 60, 0),
 (2020, 3, 5.4, 'BALKANS NW: CROATIA: ZAGREB', 45.897, 15.966, 1, 0, 27, 6000.0),
 (2020, 3, 5.7, 'USA: UTAH', 40.751, -112.078, 0, 0, 0, 0)]
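On the damages column specifically, two things in the posted code look suspect. `csv.reader` yields every cell as a string, so an `isinstance(damages, float)` check on the raw cell is never true; and `if damages: damages = int('0')` overwrites any non-zero value with 0. The sketch below shows one possible way to handle that single column. The name `parse_damages` is hypothetical, and letting non-blank, non-numeric text raise `ValueError` (so the caller can decide whether to skip the row) is an assumption, since the requirement does not cover that case.

```python
def parse_damages(cell):
    """Return 0 for a blank cell, otherwise the cell converted to a float.

    Raises ValueError for non-blank text that is not a number
    (assumption: the caller decides what to do with such rows).
    """
    cell = cell.strip()
    if not cell:
        return 0
    return float(cell)


# The raw CSV cell is a str, so a float type check on it fails.
print(isinstance('48.5', float))  # False
print(parse_damages(''))          # 0
print(parse_damages('48.5'))      # 48.5
print(parse_damages('6000'))      # 6000.0
```

Testing whether the cell is blank first, and only then calling `float`, keeps the "blank means zero" rule separate from the conversion itself.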