Python Forum
Read csv file, parse data, and store in a dictionary
#1
I have a file that contains songs recently played by a radio station, the artist, and time played in this format: "November 4, 2019 8:02 PM","Wagon Wheel","Darius Rucker". I am trying to store the content of this file in string variable playlist_csv, use splitlines() to store records in variable lines, and then iterate through the lines to store data in a dictionary. The key should be a datetime object of the timestamp, and the value should be a tuple of song and artist: {datetime_key: (song, artist)}

This is what I have for code so far:
# read the file and store content in string variable playlist_csv
with open('playlist.txt', 'r') as csv_file:
    playlist_csv = csv_file.read().replace('\n', '')
    # use splitlines() method to store records in variable lines (it is list)
    split_playlist = playlist_csv.splitlines()
    # iterate through lines to store data in playlist_dict dictionary
    playlist_dict = {}
    for l in csv.reader(split_playlist, quotechar='"', delimiter=',',
       quoting=csv.QUOTE_ALL, skipinitialspace=True):
       dt=datetime.strptime(l[0], '%B %d, %Y %I:%M %p')
       playlist_dict[l[dt]].append(dt)
print(playlist_dict)
However, I keep running into errors when trying to store this data in a dictionary (specifically "'datetime.datetime' object is not subscriptable" and "list indices must be integers or slices" when modifying the code). Desired output looks like: {datetime.datetime(2019, 11, 4, 20, 2): ('Wagon Wheel', 'Darius Rucker'),...}

I appreciate any help!
#2
Try playlist_dict[dt] = l[1:] perhaps.
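A minimal sketch of why the original line fails and what the suggested assignment stores, assuming a single already-parsed row (the variable name row is illustrative; its values come from the example record in the question). Note that the slice row[1:] is a list, so wrapping it in tuple() is one way to get the tuple form shown in the desired output.

```python
from datetime import datetime

# One parsed record, as csv.reader would yield it for the example line.
row = ["November 4, 2019 8:02 PM", "Wagon Wheel", "Darius Rucker"]
dt = datetime.strptime(row[0], "%B %d, %Y %I:%M %p")

# Indexing the list with a datetime object raises TypeError,
# because list indices must be integers or slices.
try:
    row[dt]
    index_error = None
except TypeError as exc:
    index_error = str(exc)

playlist_dict = {}
# Plain assignment creates the key; tuple() converts the list slice.
playlist_dict[dt] = tuple(row[1:])
print(playlist_dict)
```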
#3
If you're sure that every row has exactly 3 columns, you can use tuple unpacking.

with open('playlist.txt', 'r') as csv_file:
    playlist_dict = {}
    reader = csv.reader(
        csv_file, quotechar='"', delimiter=',',
        quoting=csv.QUOTE_ALL, skipinitialspace=True
    )
    for timestamp, song, artist in reader:
       dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
       playlist_dict[dt].append((song, artist))


print(playlist_dict)
You can make it shorter.
There is no need for splitlines, because csv.reader iterates over the file line by line on its own.

I corrected the assignment in line 11 of your code.

If you want to assign a value to a key, it looks like this:
some_dict = {}
a_key = 'my_key'
some_value = 42
some_value = (1,2,3) # could be a tuple
some_value = [1,2,3] # could be a list
some_value = {'foo': 'bar'} # or  a dict
some_value = {1,2,3} # could be a set

# assignment
some_dict[a_key] = some_value # some_value was last rebound to a set, so the set is stored
Keys must be hashable objects. This means you cannot use mutable mappings/sequences (such as a dict or a list) as keys.
A datetime object, for example, is immutable and hashable. The values of a dict don't need to be hashable.
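As a rough illustration of the hashability rule (the name lookup and the sample keys are made up for this sketch), a datetime and a tuple of strings are accepted as keys, while a list is rejected with a TypeError:

```python
from datetime import datetime

lookup = {}

# Immutable, hashable objects work as keys.
lookup[datetime(2019, 11, 4, 20, 2)] = ("Wagon Wheel", "Darius Rucker")
lookup[("a", "tuple")] = [1, 2, 3]  # the value itself may be mutable

# A list is mutable and unhashable, so using it as a key fails.
try:
    lookup[["a", "list"]] = 1
    key_error_message = None
except TypeError as exc:
    key_error_message = str(exc)

print(key_error_message)
```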

Since Python 3.6, dicts keep insertion order as an implementation detail of CPython.
Since Python 3.7, this is part of the language specification and guaranteed.

If you test your code with an older Python version, the order may come out scrambled.
Previously dicts didn't keep the insertion order. In some versions, hash randomization (on by default since Python 3.3) could make the order differ from run to run.

If you want to write code for older Python versions, you have to know this.
In that case you can use collections.OrderedDict.
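A small sketch of the OrderedDict option, assuming you only need keys to come back in the order they were inserted (the name plays and its keys are placeholders):

```python
from collections import OrderedDict

# OrderedDict remembers insertion order on every Python version
# that ships it, independent of the built-in dict's behaviour.
plays = OrderedDict()
plays["first"] = 1
plays["second"] = 2
plays["third"] = 3

print(list(plays))
```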
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#4
Thanks for helping here. I am sure I only have 3 columns everywhere, and I am also using Python 3.6. However, when I run this code, I keep getting the error:
playlist_dict[dt].append((song, artist))
KeyError: datetime.datetime(2019, 11, 4, 20, 2)

Any idea what is causing this?

(Nov-26-2019, 07:44 AM)DeaD_EyE Wrote: If you're sure, that you have only 3 columns everywhere, you can use item unpacking.

#5
My mistake.

Change line 9 from:
playlist_dict[dt].append((song, artist))
# the key dt does not exist yet,
# so there is no list to append to
to:

playlist_dict[dt] = (song, artist)
If you expect several songs with the same timestamp, then the value should be a list.
from collections import defaultdict

playlist_dict = defaultdict(list)
# accessing a missing key creates and returns an empty list,
# which can then be modified

playlist_dict['this key does not exist'].append(42)  # <-- the lookup returns an empty list, which is already assigned to the key
# now the key 'this key does not exist' exists.
playlist_dict['this key does not exist'].append(43) # <-- adds the next object to the existing list
So you can decide. The simple option is to assign a tuple of song/artist to the key.
If another song/artist has the same timestamp, the old entry is simply overwritten.
If you expect that to happen, the easiest way is to use a defaultdict.



import csv
from collections import defaultdict
from datetime import datetime


with open('playlist.txt', 'r') as csv_file:
    # each timestamp maps to a list of (song, artist) tuples
    playlist_dict = defaultdict(list)
    reader = csv.reader(
        csv_file, quotechar='"', delimiter=',',
        quoting=csv.QUOTE_ALL, skipinitialspace=True
    )
    for timestamp, song, artist in reader:
        dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
        playlist_dict[dt].append((song, artist))


print(playlist_dict)
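To try the defaultdict approach without a file on disk, here is a sketch that reads from an in-memory io.StringIO instead of playlist.txt. The helper name load_playlist and the constant TIMESTAMP_FORMAT are hypothetical, and the second and third sample records are invented to show two songs sharing one timestamp; only the first record comes from the question.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

TIMESTAMP_FORMAT = "%B %d, %Y %I:%M %p"


def load_playlist(csv_file):
    """Map each timestamp to the list of (song, artist) tuples played then."""
    playlist = defaultdict(list)
    for timestamp, song, artist in csv.reader(csv_file, skipinitialspace=True):
        key = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
        playlist[key].append((song, artist))
    return playlist


# Sample data: first record from the question, the rest made up.
sample = io.StringIO(
    '"November 4, 2019 8:02 PM","Wagon Wheel","Darius Rucker"\n'
    '"November 4, 2019 8:02 PM","Example Song","Example Artist"\n'
    '"November 4, 2019 8:06 PM","Another Song","Another Artist"\n'
)
playlist_dict = load_playlist(sample)

for played_at, entries in playlist_dict.items():
    print(played_at, entries)
```

With a plain dict and direct assignment, the second record would replace the first, since both share the same key.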