Python Forum
Read csv file, parse data, and store in a dictionary
#1
I have a file that contains songs recently played by a radio station, the artist, and time played in this format: "November 4, 2019 8:02 PM","Wagon Wheel","Darius Rucker". I am trying to store the content of this file in string variable playlist_csv, use splitlines() to store records in variable lines, and then iterate through the lines to store data in a dictionary. The key should be a datetime object of the timestamp, and the value should be a tuple of song and artist: {datetime_key: (song, artist)}

This is what I have for code so far:
# read the file and store content in string variable playlist_csv
with open('playlist.txt', 'r') as csv_file:
    playlist_csv = csv_file.read().replace('\n', '')
    # use splitlines() method to store records in variable lines (it is list)
    split_playlist = playlist_csv.splitlines()
    # iterate through lines to store data in playlist_dict dictionary
    playlist_dict = {}
    for l in csv.reader(split_playlist, quotechar='"', delimiter=',',
       quoting=csv.QUOTE_ALL, skipinitialspace=True):
       dt=datetime.strptime(l[0], '%B %d, %Y %I:%M %p')
       playlist_dict[l[dt]].append(dt)
print(playlist_dict)
However, I keep running into errors when trying to store this data in a dictionary (specifically "'datetime.datetime' object is not subscriptable" and "list indices must be integers or slices" when modifying the code). Desired output looks like: {datetime.datetime(2019, 11, 4, 20, 2): ('Wagon Wheel', 'Darius Rucker'),...}

I appreciate any help!
#2
Try playlist_dict[dt] = l[1:] perhaps.
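A minimal sketch of that suggestion, using an in-memory sample row instead of playlist.txt (the sample string and the name row are assumptions for illustration). Note that a slice such as l[1:] is a list, so tuple() is used here to match the tuple in the desired output.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample standing in for the contents of playlist.txt
sample = '"November 4, 2019 8:02 PM","Wagon Wheel","Darius Rucker"\n'

playlist_dict = {}
for row in csv.reader(io.StringIO(sample)):
    dt = datetime.strptime(row[0], '%B %d, %Y %I:%M %p')
    # plain assignment creates the key; tuple() turns the sliced list into a tuple
    playlist_dict[dt] = tuple(row[1:])

print(playlist_dict)
```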
#3
If you are sure that every row has exactly 3 columns, you can use tuple unpacking in the for loop.

with open('playlist.txt', 'r') as csv_file:
    playlist_dict = {}
    reader = csv.reader(
        csv_file, quotechar='"', delimiter=',',
        quoting=csv.QUOTE_ALL, skipinitialspace=True
    )
    for timestamp, song, artist in reader:
       dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
       playlist_dict[dt].append((song, artist))


print(playlist_dict)
You can make it shorter.
There is no need for splitlines(), because csv.reader iterates over the lines of the file itself.

I corrected the assignment in line 11 of your code.

If you want to assign a value to a key, it looks like this:
some_dict = {}
a_key = 'my_key'
some_value = 42
some_value = (1,2,3) # could be a tuple
some_value = [1,2,3] # could be a list
some_value = {'foo': 'bar'} # or  a dict
some_value = {1,2,3} # could be a set

# assignment 
some_dict[a_key] = some_value # some_value was rebound several times above, so the set is what gets stored
Keys can only be hashable objects. This means you cannot use mutable mappings or sequences (such as dict or list) as keys.
A datetime object, for example, is immutable and hashable. The values of a dict don't need to be hashable.
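As a rough illustration of the hashability rule (the keys and values below are made up for the example), a datetime or a tuple works as a key, while a list raises a TypeError:

```python
from datetime import datetime

playlist_dict = {}

# datetime objects are immutable and hashable, so they work as keys
key = datetime(2019, 11, 4, 20, 2)
playlist_dict[key] = ('Wagon Wheel', 'Darius Rucker')

# a list is mutable and unhashable, so using it as a key raises TypeError
try:
    playlist_dict[['November', 4, 2019]] = ('Wagon Wheel', 'Darius Rucker')
except TypeError as error:
    list_key_error = str(error)
    print('list as key failed:', list_key_error)

# a tuple of hashable items is fine as a key
playlist_dict[('November', 4, 2019)] = ('Wagon Wheel', 'Darius Rucker')

# values have no such restriction: a list as a value is fine
playlist_dict[key] = ['Wagon Wheel', 'Darius Rucker']
```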

Since Python 3.6, dicts keep insertion order as an implementation detail of CPython.
Since Python 3.7, it's part of the language specification and guaranteed.

If you test your code with an older Python version, you may get the entries back in a different order.
Previously dicts didn't keep the order. In some versions hash randomization (on by default since Python 3.3) could even change the order from one run to the next.

If you want to write code for older Python versions, you have to be aware of this.
In that case you can use collections.OrderedDict.
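A small sketch of the difference, with placeholder keys standing in for the timestamps: OrderedDict keeps insertion order on every Python version, and a plain dict matches it on Python 3.7+ (and on CPython 3.6).

```python
from collections import OrderedDict

# Placeholder records in the order they would be read from the file
records = [('first', 1), ('second', 2), ('third', 3)]

plain = {}
ordered = OrderedDict()
for key, value in records:
    plain[key] = value
    ordered[key] = value

# OrderedDict keeps insertion order on every Python version
print(list(ordered))
# a plain dict does the same on Python 3.7+ (and on CPython 3.6)
print(list(plain))
```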
My code examples are always for Python >=3.6.0
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#4
Thanks for helping here. I am sure I only have 3 columns everywhere, and I am also using Python 3.6. However, when I run this code, I keep getting the error:
playlist_dict[dt].append((song, artist))
KeyError: datetime.datetime(2019, 11, 4, 20, 2)

Any idea what is causing this?

(Nov-26-2019, 07:44 AM)DeaD_EyE Wrote: If you're sure, that you have only 3 columns everywhere, you can use item unpacking.

#5
My mistake.

In line 9:
playlist_dict[dt].append((song, artist))
# the key dt does not exist yet in a plain dict,
# so there is no list to append to and a KeyError is raised
Change it to:

playlist_dict[dt] = (song, artist)
If you expect several songs with the same timestamp, then the value should be a list.
from collections import defaultdict

playlist_dict = defaultdict(list)
# a missing key is created on first access with an empty list as its value,
# which can then be modified in place

playlist_dict['this key does not exist'].append(42)  # <-- creates the key with an empty list, then appends 42
# now the key 'this key does not exist' exists.
playlist_dict['this key does not exist'].append(43)  # <-- appends the next object to the existing list
So you can decide. If each timestamp occurs only once, just assign a tuple of song and artist to the key.
If another song/artist has the same timestamp, the old entry is simply overwritten.
If you expect duplicates, the easiest way to keep all of them is to use a defaultdict.
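To make the trade-off concrete, here is a small sketch with two hypothetical rows that share one timestamp (the second song and artist are placeholders, and the names overwriting and collecting are only for illustration):

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: two entries share the same timestamp
rows = [
    ('November 4, 2019 8:02 PM', 'Wagon Wheel', 'Darius Rucker'),
    ('November 4, 2019 8:02 PM', 'Placeholder Song', 'Placeholder Artist'),
]

overwriting = {}
collecting = defaultdict(list)
for timestamp, song, artist in rows:
    dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
    overwriting[dt] = (song, artist)       # later row replaces the earlier one
    collecting[dt].append((song, artist))  # both rows are kept in a list

key = datetime(2019, 11, 4, 20, 2)
print(overwriting[key])
print(collecting[key])
```

With plain assignment only the last row for a timestamp survives; with defaultdict(list) every value is a list, even when a timestamp occurs only once.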



import csv
from collections import defaultdict
from datetime import datetime  # needed for datetime.strptime below


with open('playlist.txt', 'r') as csv_file:
    playlist_dict = defaultdict(list)
    reader = csv.reader(
        csv_file, quotechar='"', delimiter=',',
        quoting=csv.QUOTE_ALL, skipinitialspace=True
    )
    for timestamp, song, artist in reader:
        dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
        # each timestamp maps to a list of (song, artist) tuples
        playlist_dict[dt].append((song, artist))


print(playlist_dict)