Posts: 9
Threads: 4
Joined: Oct 2024
I have some more problems with this code, but I think if I can overcome this one I can find a way to fix the others as well. I have a file with country, year, and life_expectancy, and I need to calculate the average life_expectancy across all countries for a chosen year. The user types in a year and the program prints the average for that year. Here is the part where my code struggles the most:
if choose_year.lower() == year:
    year_expectancy = sum(expectancy) / len(expectancy)
I cannot use pandas or other libraries; it should be solved with only really basic coding. Thank you all for your help.
# An example of how the list works
# germany, ger, 2021, 20.663,
# germany, ger, 1990, 17.638,
# brasil, bra, 1999, 22.473,
# brasil, bra, 2002, 7.982,
# England, UK, 2021, 9.827,
# england, UK, 1999, 14.672,
# japan, ja, 2005, 20.661,
# japan, ja, 2021, 16.836,
# mexico, mx, 2008, 11.383,
# mexico, mx, 1999, 26.837,
life_expectancy = open('D://life-expectancy.csv')
with open('D://life-expectancy.csv') as data:
    max_expectancy = 0
    min_expectancy = 99999999
    index = 0
    choose_year = 0
    year_expectancy = 0
    for line in data:
        data = line.strip()
        data = line.split(',')
        entity = data[0]
        code = data[1]
        year = int(data[2])
        expectancy = float(data[3])
        if expectancy > max_expectancy:
            max_expectancy = expectancy
            max_country = entity
            max_year = year
        if expectancy < min_expectancy:
            min_expectancy = expectancy
            min_country = entity
            min_year = year
    choose_year = input('Enter the year of interest : ')
    # always prints 0; this calculation does not work
    if choose_year.lower() == year:
        year_expectancy = sum(expectancy) / len(expectancy)
print()
print(f'The overall max life expectancy is: {max_expectancy} from {max_country} in {max_year}')
print(f'The overall min life expectancy is: {min_expectancy} from {min_country} in {min_year}')
print()
print(f'For the year {choose_year}:')
# here I need to put in the result of the calculation
print(f'The average life expectancy across all countries was {year_expectancy}')
# #print(f'The max life expectancy was in {} with {}')
# #print(f'The min life expectancy was in {} with {}')
Posts: 1,145
Threads: 114
Joined: Sep 2019
Maybe something like this. There are better ways.
import csv
file = 'test.csv'
# Create a list
tmplist = []
alist = []
tmp = []
# Get the data from csv file
with open(file, 'r') as data:
    for lines in data.readlines():
        tmplist.append(lines.strip().split(','))
# Do some cleanup of data
for lines in tmplist:
    line = [word.strip() for word in lines]
    alist.append(line)
year = input('Choose Year: ')
for index, item in enumerate(alist):
    if item[2] == year:
        tmp.append(float(item[3]))
if len(tmp) > 0:
    avg = sum(tmp)/len(tmp)
    print(f'Avg. {avg}')
else:
    print(f'There is no data for {year}')
Posts: 6,827
Threads: 20
Joined: Feb 2020
What have you tried? You should start by collecting your data in a list or dictionary so you can use it after it’s been read.
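A minimal sketch of that idea, using only built-ins: the values are grouped into a dictionary keyed by year, so they are still available after the reading loop has finished. The sample lines are copied from the question (trailing comma included). The names `group_by_year` and `average_for_year` are made up for this illustration, and with a real file the list of lines would be replaced by the open file object.

```python
# Sample lines in the format shown in the question.
lines = [
    "germany, ger, 2021, 20.663,",
    "germany, ger, 1990, 17.638,",
    "brasil, bra, 1999, 22.473,",
    "England, UK, 2021, 9.827,",
    "japan, ja, 2021, 16.836,",
    "mexico, mx, 1999, 26.837,",
]


def group_by_year(lines):
    """Return a dict mapping each year (int) to a list of expectancies."""
    by_year = {}
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        year = int(fields[2])
        expectancy = float(fields[3])
        # Create the list for this year on first use, then append to it.
        by_year.setdefault(year, []).append(expectancy)
    return by_year


def average_for_year(by_year, year):
    """Return the average for a year, or None if the year has no data."""
    values = by_year.get(year)
    if not values:
        return None
    return sum(values) / len(values)


by_year = group_by_year(lines)
# input() returns a string, so convert it before comparing with int years:
# chosen = int(input("Enter the year of interest: "))
chosen = 1999
print(average_for_year(by_year, chosen))
```

Note that the original comparison of `choose_year` (a string) with `year` (an int) can never be true, which is one reason the average stayed at 0.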
Posts: 9
Threads: 4
Joined: Oct 2024
I did sort them into countries, abbreviations, years and data, and I can work with these 4 parts.
But now I struggle when I need the whole line from every entry with the same year.
Posts: 1,145
Threads: 114
Joined: Sep 2019
In my example, you can query the list for the whole line using an if statement.
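As a rough sketch of what that could look like: the rows below mimic `alist` after the cleanup step, so every field is still a string and the year is compared as a string. The helper name `rows_for_year` is hypothetical.

```python
# Rows as they look after the cleanup step: every field is a string.
alist = [
    ["germany", "ger", "2021", "20.663"],
    ["brasil", "bra", "1999", "22.473"],
    ["England", "UK", "2021", "9.827"],
    ["mexico", "mx", "1999", "26.837"],
]


def rows_for_year(rows, year):
    """Return every complete row whose year column equals `year` (a string)."""
    matches = []
    for row in rows:
        if row[2] == year:
            matches.append(row)
    return matches


# Each match is the whole line, so country and code are available too.
for row in rows_for_year(alist, "1999"):
    print(row)
```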
Posts: 1,950
Threads: 8
Joined: Jun 2018
This task can be broken into steps:
- read data from file
- find max and min values of column and get data from that row
- filter column data and perform calculations.
# data in life-expectancy.csv looks like this:
germany,ger,2021,20.663
germany,ger,1990,17.638
brasil,bra,1999,22.473
brasil,bra,2002,7.982
England,UK,2021,9.827
england,UK,1999,14.672
japan,ja,2005,20.661
japan,ja,2021,16.836
mexico,mx,2008,11.383
mexico,mx,1999,26.837
# read data from file, convert values and create list of dictionaries:
import csv
with open("life-expectancy.csv", "r", newline="") as csvfile:
    processing = {"Country": str, "Abbreviation": str, "Year": int, "Life expectancy": float}
    reader = csv.DictReader(csvfile, fieldnames=processing)
    data = [{k:processing[k](v) for k, v in line.items()} for line in reader]
# data is:
[{'Country': 'germany', 'Abbreviation': 'ger', 'Year': 2021, 'Life expectancy': 20.663},
{'Country': 'germany', 'Abbreviation': 'ger', 'Year': 1990, 'Life expectancy': 17.638},
{'Country': 'brasil', 'Abbreviation': 'bra', 'Year': 1999, 'Life expectancy': 22.473},
{'Country': 'brasil', 'Abbreviation': 'bra', 'Year': 2002, 'Life expectancy': 7.982},
{'Country': 'England', 'Abbreviation': 'UK', 'Year': 2021, 'Life expectancy': 9.827},
{'Country': 'england', 'Abbreviation': 'UK', 'Year': 1999, 'Life expectancy': 14.672},
{'Country': 'japan', 'Abbreviation': 'ja', 'Year': 2005, 'Life expectancy': 20.661},
{'Country': 'japan', 'Abbreviation': 'ja', 'Year': 2021, 'Life expectancy': 16.836},
{'Country': 'mexico', 'Abbreviation': 'mx', 'Year': 2008, 'Life expectancy': 11.383},
{'Country': 'mexico', 'Abbreviation': 'mx', 'Year': 1999, 'Life expectancy': 26.837}]
There are built-in min and max functions that can be applied to data (the code below contains repetition, which I will leave as a challenge for the OP :-)):
longest = max(data, key=lambda x: x["Life expectancy"])
shortest = min(data, key=lambda x: x["Life expectancy"])
print(f"Happiest life in {longest['Country']} in {longest['Year']} for {longest['Life expectancy']}")
print(f"Saddest life in {shortest['Country']} in {shortest['Year']} for {shortest['Life expectancy']}")
# outputs
Happiest life in mexico in 1999 for 26.837
Saddest life in brasil in 2002 for 7.982
For filtering data based on a target value, we can define a helper function that yields the rows matching the criteria (once again, I left some things to solve):
def filter_rows(data, **kwargs):
    for row in data:
        for key, value in kwargs.items():
            if row.get(key) != value:
                break
        else:
            yield row
print(sum(row["Life expectancy"] for row in filter_rows(data, Year=2021)))
# 47.326
records = filter_rows(data, Country="brasil")
print(*records)
#{'Country': 'brasil', 'Abbreviation': 'bra', 'Year': 1999, 'Life expectancy': 22.473}
# {'Country': 'brasil', 'Abbreviation': 'bra', 'Year': 2002, 'Life expectancy': 7.982}
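One of the open points is turning the sum into an average. A possible sketch follows, with `filter_rows` repeated so the block runs on its own and a shortened copy of the data. The function name `average_expectancy` is an assumption for illustration. The main thing to watch is that a generator has no `len()`, so the matching values are collected in a list first.

```python
def filter_rows(data, **kwargs):
    """Yield rows whose values match every keyword argument."""
    for row in data:
        for key, value in kwargs.items():
            if row.get(key) != value:
                break
        else:
            yield row


data = [
    {"Country": "germany", "Year": 2021, "Life expectancy": 20.663},
    {"Country": "brasil", "Year": 1999, "Life expectancy": 22.473},
    {"Country": "England", "Year": 2021, "Life expectancy": 9.827},
    {"Country": "japan", "Year": 2021, "Life expectancy": 16.836},
]


def average_expectancy(data, **criteria):
    """Return the average of the matching rows, or None if nothing matches."""
    # Collect the values in a list so they can be both summed and counted.
    values = [row["Life expectancy"] for row in filter_rows(data, **criteria)]
    if not values:
        return None
    return sum(values) / len(values)


print(average_expectancy(data, Year=2021))
```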
Posts: 1,095
Threads: 143
Joined: Jul 2017
I would just use DictReader from the module csv. DictReader gives you a dictionary for every row in your csv. The dictionary keys are the column headers.
csv.reader and DictReader are iterators that read one row at a time, so the reader itself takes very little memory. People come here and say, "I have a csv with 10 million rows, how to deal with that?"
import csv
path2csv = '/home/pedro/temp/life_expect.csv'
# get a list of dictionaries with the data from the csv
with open(path2csv) as infile:
    data = list(csv.DictReader(infile))
cc = input('What country do you want to know the average life span for? ')
year = input('Which year are you thinking of? ')
# didn't use year here
for d in data:
    if d['country'] == cc:
        print(d['life'])
Output:
20.663
17.638
da = csv.DictReader(path2csv)
type(da)
<class 'csv.DictReader'>
import sys
sys.getsizeof(da)
Output:
48
You can do this without csv by simulating the DictReader: get the first row as keys, then make a dictionary of every row, using the headers as keys.
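A minimal sketch of such a stand-in, assuming a plain comma-separated file with a header row and no quoted fields (the real csv module handles quoting, this does not). The function name `read_dicts` and the header names `code` and `year` are assumptions; `country` and `life` follow the keys used above.

```python
def read_dicts(lines):
    """Yield one dictionary per data row, keyed by the header row."""
    rows = iter(lines)
    header_line = next(rows, None)
    if header_line is None:
        return  # empty input: nothing to yield
    header = [name.strip() for name in header_line.split(",")]
    for line in rows:
        if not line.strip():
            continue  # skip blank lines
        values = [value.strip() for value in line.split(",")]
        yield dict(zip(header, values))


# Hypothetical file contents; an open file object can be passed the same way.
lines = [
    "country,code,year,life\n",
    "germany,ger,2021,20.663\n",
    "germany,ger,1990,17.638\n",
    "japan,ja,2021,16.836\n",
]

for d in read_dicts(lines):
    if d["country"] == "germany":
        print(d["life"])
```

Like DictReader, this leaves every value as a string, so the year and the life expectancy still need `int()` and `float()` before any arithmetic.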
Posts: 6,827
Threads: 20
Joined: Feb 2020
This is vague:
Quote:i can not use panda or other libarys
Does this mean you can only use built-ins? No imports (csv for example)?
Posts: 9
Threads: 4
Joined: Oct 2024
(Oct-21-2024, 06:20 PM)deanhystad Wrote: This is vague:
Quote:i can not use panda or other libarys
Does this mean you can only use built-ins? No imports (csv for example)?
I can only import the data, but no other module or anything like that to help me work with it.
Posts: 6,827
Threads: 20
Joined: Feb 2020
If you cannot use any libraries, even a standard library like csv, you'll need to do something like menator's example. Just ignore where menator imports csv and then doesn't use it. Using names from your example code:
with open("life_expectancy.csv", "r") as file:
    data = []
    for line in file:
        country, code, year, expectancy = line.split(",")
        data.append((country.strip(), code.strip(), int(year), float(expectancy)))
print(data)
Output:
[('germany', 'ger', 2021, 20.663), ('germany', 'ger', 1990, 17.638), ('brasil', 'bra', 1999, 22.473), ('brasil', 'bra', 2002, 7.982), ('england', 'uk', 2021, 9.827), ('england', 'uk', 1999, 14.672), ('japan', 'ja', 2005, 20.661), ('japan', 'ja', 2021, 16.836), ('mexico', 'mx', 2008, 11.383), ('mexico', 'mx', 1999, 26.837)]
Now you have a list where each element in the list is a line from your expectancy file. You can use this list to extract information. For example, I can get all the data for Japan.
japan_data = [entry for entry in data if entry[0] == 'japan']
print(japan_data)
Output:
[('japan', 'ja', 2005, 20.661), ('japan', 'ja', 2021, 16.836)]
You can also use functions like max, min and sorted on these values. Here is a better way to find the max and min expectancy:
expectancy = lambda x: x[3]
print("Max life expectancy =", max(data, key=expectancy))
print("Min life expectancy =", min(data, key=expectancy))
Output:
Max life expectancy = ('mexico', 'mx', 1999, 26.837)
Min life expectancy = ('brasil', 'bra', 2002, 7.982)
Or print a table of the data in chronological order.
country = lambda x: x[0]
year = lambda x: x[2]
for entry in sorted(data, key=year):
    print(f"{year(entry)} {country(entry):10} {expectancy(entry):6.3f}")
Output:
1990 germany    17.638
1999 brasil     22.473
1999 england    14.672
1999 mexico     26.837
2002 brasil      7.982
2005 japan      20.661
2008 mexico     11.383
2021 germany    20.663
2021 england     9.827
2021 japan      16.836
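Putting these pieces together for the original question, here is one possible sketch of the per-year report (average, max and min for a chosen year), built on the same list of tuples. The function name `year_summary` is hypothetical, the data is a shortened copy of the list above, and the year is fixed in the code instead of read with `input()` so the block runs unattended.

```python
data = [
    ("germany", "ger", 2021, 20.663),
    ("germany", "ger", 1990, 17.638),
    ("brasil", "bra", 1999, 22.473),
    ("england", "uk", 2021, 9.827),
    ("england", "uk", 1999, 14.672),
    ("japan", "ja", 2021, 16.836),
    ("mexico", "mx", 1999, 26.837),
]


def expectancy(entry):
    """Return the life expectancy column of an entry."""
    return entry[3]


def year_summary(data, chosen_year):
    """Return (average, max_entry, min_entry) for a year, or None if no data."""
    entries = [entry for entry in data if entry[2] == chosen_year]
    if not entries:
        return None
    average = sum(expectancy(entry) for entry in entries) / len(entries)
    return average, max(entries, key=expectancy), min(entries, key=expectancy)


# With user input this would be: chosen_year = int(input("Enter the year: "))
chosen_year = 1999
summary = year_summary(data, chosen_year)
if summary is None:
    print(f"There is no data for {chosen_year}")
else:
    average, highest, lowest = summary
    print(f"For the year {chosen_year}:")
    print(f"The average life expectancy across all countries was {average:.3f}")
    print(f"The max life expectancy was in {highest[0]} with {highest[3]}")
    print(f"The min life expectancy was in {lowest[0]} with {lowest[3]}")
```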