Data Dictionaries in Python - Printable Version

Data Dictionaries in Python - mrsenorchuck - Nov-24-2019

Hello,

Hope you are well.

I want to load the all-time Premier League table into a data dictionary in Python: https://en.wikipedia.org/wiki/Premier_League_records_and_statistics

I have the table from that link in a CSV file, PL.csv.

I want to load each column into a data dictionary, using the first entry in each row as the row label.

How can I do this? I know this is basic stuff but I am new to the language.

Regards,
Aidan.

RE: Data Dictionaries in Python - Larz60+ - Nov-24-2019

Please post a small data sample (just a few rows)

RE: Data Dictionaries in Python - mrsenorchuck - Nov-24-2019

(Nov-24-2019, 03:35 AM)Larz60+ Wrote: Please post a small data sample (just a few rows)

Thank you for your reply.

Pos,Club,Seasons,Pld,Win,Draw,Loss,GF,GA,GD,Pts
1,Manchester United,27,1038,648,224,166,1989,929,1060,2168
2,Arsenal,27,1038,565,260,213,1845,1013,832,1955
3,Chelsea,27,1038,558,257,223,1770,1002,768,1931
4,Liverpool,27,1038,529,262,247,1774,1046,728,1849

This is just a historic list of the performance of teams in a league; the first row contains the column labels.

I am looking to load this data into a data dictionary in Python.

RE: Data Dictionaries in Python - DeaD_EyE - Nov-24-2019

The csv module can do this for you.

import csv


result = []
with open('PL.csv', newline='') as fd:
    reader = csv.DictReader(fd)
    for row in reader:
        result.append(row)


print(result)

You get a list of dicts, one per row. The headers are used as the keys in each dict.

https://docs.python.org/3/library/csv.html#csv.DictReader

If you scroll down, you will see some other examples.
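
If you want to look rows up by their first entry, or get at a whole column, one option is to build two small dictionaries from that list. This is only a minimal sketch, assuming the header row from the sample posted above; `SAMPLE`, `by_pos` and `columns` are names made up for this illustration, and `io.StringIO` stands in for the real PL.csv file:

```python
import csv
import io

# Stand-in for PL.csv, using the sample rows posted earlier in the thread.
SAMPLE = """Pos,Club,Seasons,Pld,Win,Draw,Loss,GF,GA,GD,Pts
1,Manchester United,27,1038,648,224,166,1989,929,1060,2168
2,Arsenal,27,1038,565,260,213,1845,1013,832,1955
3,Chelsea,27,1038,558,257,223,1770,1002,768,1931
4,Liverpool,27,1038,529,262,247,1774,1046,728,1849
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Row-oriented: position (the first entry in each row) -> that row.
by_pos = {int(row['Pos']): row for row in rows}

# Column-oriented: header -> list of all values in that column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

print(by_pos[4]['Club'])    # club in 4th position
print(columns['Pts'])       # every value in the Pts column (still strings)
```

With the real file you would pass the open file object to `csv.DictReader` instead of the `io.StringIO` wrapper.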

RE: Data Dictionaries in Python - mrsenorchuck - Nov-24-2019

(Nov-24-2019, 10:32 AM)DeaD_EyE Wrote: The csv module can do this for you.

Thank you for your reply.

I am looking to import the csv file into columns in a dictionary.

Your code above loads the csv data in and labels it correctly etc.

Is the data already in columns with this code, and if so, how do I access individual items like the 5th best team?

RE: Data Dictionaries in Python - snippsat - Nov-24-2019

(Nov-24-2019, 10:50 AM)mrsenorchuck Wrote: Is the data already in columns with this code, and if so, how do I access individual items like the 5th best team?

I can show a way with Pandas, which has a lot of power if you need to manipulate the data to get different results.
It can take a table directly from a website into a DataFrame with pd.read_html.
Here is a Notebook; the table is pretty clean, so you can start working with it right away.

As you can see in the Notebook example, you can ask questions that check more than one condition.

# At least 3 first-place finishes and more than 560 wins
df[(df['1st'] >= 3) & (df['Win'] > 560)]

RE: Data Dictionaries in Python - mrsenorchuck - Nov-24-2019

(Nov-24-2019, 02:07 PM)snippsat Wrote: I can show a way with Pandas, which has a lot of power if you need to manipulate the data to get different results.

Thanks for your reply.

You see, I can't use pandas.

I am looking for a good worked example of importing and interrogating data using dictionaries.

RE: Data Dictionaries in Python - perfringo - Nov-24-2019

Working example:

from csv import DictReader

with open('premier_league.txt', 'r') as f:
    data = list(DictReader(f))

data is a list of dictionaries. We can 'query' it like this:

>>> [row['Club'] for row in data if row['Pos'] == '4']
['Liverpool']
>>> next(row['Club'] for row in data if row['Pos'] == '4')
'Liverpool'
>>> max(data, key=lambda x: int(x['Seasons']))['Club']  # Club with max number of seasons
'Manchester United'

As you may notice, DictReader does not convert numbers into integers automagically. That can be done 'manually' during import or afterwards.
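
One way to do the conversion during import is to convert every field except the club name as each row is read. This is a sketch only, assuming the column layout from the sample data in this thread; `to_numbers` is a hypothetical helper written for this example (it is not part of the csv module), and `io.StringIO` stands in for the real file:

```python
import csv
import io

# Stand-in for the data file, using the sample rows from this thread.
SAMPLE = """Pos,Club,Seasons,Pld,Win,Draw,Loss,GF,GA,GD,Pts
1,Manchester United,27,1038,648,224,166,1989,929,1060,2168
2,Arsenal,27,1038,565,260,213,1845,1013,832,1955
3,Chelsea,27,1038,558,257,223,1770,1002,768,1931
4,Liverpool,27,1038,529,262,247,1774,1046,728,1849
"""


def to_numbers(row):
    """Return a copy of row with every field except 'Club' converted to int."""
    return {key: value if key == 'Club' else int(value)
            for key, value in row.items()}


data = [to_numbers(row) for row in csv.DictReader(io.StringIO(SAMPLE))]

# Once the values are integers, a column can be summed directly.
total_goals_for = sum(row['GF'] for row in data)
print(total_goals_for)
```

Filters then compare against numbers rather than strings, for example `row['Pos'] == 4` instead of `row['Pos'] == '4'`.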

RE: Data Dictionaries in Python - mrsenorchuck - Nov-24-2019

(Nov-24-2019, 02:49 PM)perfringo Wrote: Working example:

Thanks for this detail.

Ok, so I see you are using specific filter criteria to query specific rows.

When the data is loaded into the data dictionary, is it easy to sum columns etc.? I am guessing the values would have to be converted to int first?

RE: Data Dictionaries in Python - mrsenorchuck - Nov-24-2019

So I used the approach below.

How can I get the average points and pick out the teams that have finished first?

Sample data is at the very bottom.

# Purpose: Inputting data from a csv file
# Example of: File input from a csv file, using a dictionary

print("This program loads all historic premier league data")

# start with an empty dictionary
# dictionary keys will be the (Pos, Club)
premier = {}

print()
print("Historic premier league")
# open the file
with open(r"Historic_PL.csv") as data_file:
    # read in the first line containing the headers
    headers = data_file.readline()

    # for each other line in the file
    for line in data_file:
        # split each line into components (remove white space from ends of line)
        Pos,Club,Seasons,Pld,Win,Draw,Loss,GF,GA,GD,Pts,First,Second,Third,Fourth,Relegated,Best = line.strip().split(",")

        # insert the data into the dictionary
        premier[(int(Pos), Club)] = (int(Seasons),int(Pld),int(Win),int(Draw),int(Loss),int(GF),int(GA),int(GD),int(Pts),int(First),int(Second),int(Third),int(Fourth),int(Relegated),int(Best))

print(f"Number of values: {len(premier)}")

################################################################
Pos,Club,Seasons,Pld,Win,Draw,Loss,GF,GA,GD,Pts,First,Second,Third,Fourth,Relegated,Best
1,Manchester United,27,1038,648,224,166,1989,929,1060,2168,13,6,3,1,0,1
2,Arsenal,27,1038,565,260,213,1845,1013,832,1955,3,6,5,7,0,1
3,Chelsea,27,1038,558,257,223,1770,1002,768,1931,5,4,5,2,0,1
4,Liverpool,27,1038,529,262,247,1774,1046,728,1849,0,4,5,7,0,2
5,Tottenham Hotspur,27,1038,446,257,335,1547,1306,241,1595,0,1,2,3,0,2
6,Everton,27,1038,377,296,365,1357,1311,46,1427,0,0,0,1,0,4
7,Manchester City,22,848,391,196,261,1374,975,399,1369,4,2,2,1,2,1
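
For the two questions in this post (average points, and teams that have finished first), here is a minimal sketch that works on a dictionary shaped like `premier` above: keys are `(Pos, Club)` and values are tuples starting at Seasons. The index constants `PTS` and `FIRST` and the `SAMPLE` string are assumptions made for this illustration; with the real file you would keep the loading code above and only use the last few lines.

```python
# Stand-in for Historic_PL.csv: the sample rows from this post.
SAMPLE = """Pos,Club,Seasons,Pld,Win,Draw,Loss,GF,GA,GD,Pts,First,Second,Third,Fourth,Relegated,Best
1,Manchester United,27,1038,648,224,166,1989,929,1060,2168,13,6,3,1,0,1
2,Arsenal,27,1038,565,260,213,1845,1013,832,1955,3,6,5,7,0,1
3,Chelsea,27,1038,558,257,223,1770,1002,768,1931,5,4,5,2,0,1
4,Liverpool,27,1038,529,262,247,1774,1046,728,1849,0,4,5,7,0,2
5,Tottenham Hotspur,27,1038,446,257,335,1547,1306,241,1595,0,1,2,3,0,2
6,Everton,27,1038,377,296,365,1357,1311,46,1427,0,0,0,1,0,4
7,Manchester City,22,848,391,196,261,1374,975,399,1369,4,2,2,1,2,1
"""

# Positions inside each value tuple (the tuple starts at Seasons, so
# Seasons is index 0, Pts is index 8 and First is index 9).
PTS, FIRST = 8, 9

premier = {}
lines = SAMPLE.strip().splitlines()
for line in lines[1:]:                       # skip the header row
    pos, club, *numbers = line.split(",")
    premier[(int(pos), club)] = tuple(int(n) for n in numbers)

# Average of the Pts column across all clubs.
average_pts = sum(values[PTS] for values in premier.values()) / len(premier)

# Clubs with at least one first-place finish.
champions = [club for (pos, club), values in premier.items()
             if values[FIRST] > 0]

print(f"Average points: {average_pts:.1f}")
print("Clubs that have finished first:", champions)
```

Indexing tuples by position is easy to get wrong; storing each row as a dictionary keyed by header name (as csv.DictReader does) would let you write `values['Pts']` instead.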