 Read csv file, parse data, and store in a dictionary
#1
I have a file that contains songs recently played by a radio station, the artist, and time played in this format: "November 4, 2019 8:02 PM","Wagon Wheel","Darius Rucker". I am trying to store the content of this file in string variable playlist_csv, use splitlines() to store records in variable lines, and then iterate through the lines to store data in a dictionary. The key should be a datetime object of the timestamp, and the value should be a tuple of song and artist: {datetime_key: (song, artist)}

This is what I have for code so far:
# read the file and store content in string variable playlist_csv
with open('playlist.txt', 'r') as csv_file:
    playlist_csv = csv_file.read().replace('\n', '')
    # use splitlines() method to store records in variable lines (it is list)
    split_playlist = playlist_csv.splitlines()
    # iterate through lines to store data in playlist_dict dictionary
    playlist_dict = {}
    for l in csv.reader(split_playlist, quotechar='"', delimiter=',',
       quoting=csv.QUOTE_ALL, skipinitialspace=True):
       dt=datetime.strptime(l[0], '%B %d, %Y %I:%M %p')
       playlist_dict[l[dt]].append(dt)
print(playlist_dict)
However, I keep running into errors when trying to store this data in a dictionary (specifically "'datetime.datetime' object is not subscriptable" and "list indices must be integers or slices" when modifying the code). Desired output looks like: {datetime.datetime(2019, 11, 4, 20, 2): ('Wagon Wheel', 'Darius Rucker'),...}

I appreciate any help!
#2
Try playlist_dict[dt] = l[1:] perhaps.
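Note that l[1:] is a list slice, so the value would be a list rather than the tuple shown in the desired output. A minimal sketch of the idea, assuming a single sample row in place of playlist.txt (sample_lines is only an illustrative name, not part of the original code):

```python
import csv
from datetime import datetime

# Hypothetical one-line sample standing in for the contents of playlist.txt
sample_lines = ['"November 4, 2019 8:02 PM","Wagon Wheel","Darius Rucker"']

playlist_dict = {}
for row in csv.reader(sample_lines):
    # the first column is the timestamp; it becomes the dictionary key
    dt = datetime.strptime(row[0], '%B %d, %Y %I:%M %p')
    # row[1:] is a list; tuple() turns it into the (song, artist) shape asked for
    playlist_dict[dt] = tuple(row[1:])

print(playlist_dict)
```

The key point is plain assignment with playlist_dict[dt] = ... instead of indexing the row with the datetime.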
#3
If you're sure that every row has exactly 3 columns, you can use tuple unpacking.

with open('playlist.txt', 'r') as csv_file:
    playlist_dict = {}
    reader = csv.reader(
        csv_file, quotechar='"', delimiter=',',
        quoting=csv.QUOTE_ALL, skipinitialspace=True
    )
    for timestamp, song, artist in reader:
       dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
       playlist_dict[dt].append((song, artist))


print(playlist_dict)
You can make it shorter.
There is no need for splitlines, because csv.reader already iterates over the file line by line.

I corrected the assignment in line 11 of your code.

If you want to assign a value to a key, it looks like this:
some_dict = {}
a_key = 'my_key'
some_value = 42
some_value = (1,2,3) # could be a tuple
some_value = [1,2,3] # could be a list
some_value = {'foo': 'bar'} # or  a dict
some_value = {1,2,3} # could be a set

# assignment 
some_dict[a_key] = some_value # in this case the name was last overwritten by a set
Keys can only be hashable objects. This means you cannot use mutable mappings/sequences as keys.
A datetime object, for example, is immutable and hashable. The values of a dict don't need to be hashable.
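A small sketch of what that means in practice, assuming nothing beyond the standard library (the names playlist and list_key_error are only for illustration): a datetime works as a key, while a list raises a TypeError.

```python
from datetime import datetime

playlist = {}

# A datetime is immutable and hashable, so it can be used as a key
key = datetime(2019, 11, 4, 20, 2)
playlist[key] = ('Wagon Wheel', 'Darius Rucker')

# A list is mutable and not hashable, so using it as a key fails
try:
    playlist[[2019, 11, 4]] = ('Wagon Wheel', 'Darius Rucker')
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

print(list_key_error)
```

If you ever need a sequence as a key, a tuple of hashable items works where a list does not.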

Since Python 3.6, dicts keep insertion order as an implementation detail of CPython.
Since Python 3.7 it's part of the language specification and guaranteed.

If you test your code with an older Python version, the entries may come out in a scrambled order.
Previously dicts didn't keep insertion order, and in some versions hash randomization made the order differ between runs.

If you want to write code for older Python versions, you have to keep this in mind.
In that case you can use collections.OrderedDict.
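A rough sketch of the OrderedDict variant, assuming two made-up entries (the second song and artist are placeholders, not data from the thread):

```python
from collections import OrderedDict
from datetime import datetime

# OrderedDict remembers insertion order on every Python version
playlist = OrderedDict()
playlist[datetime(2019, 11, 4, 20, 2)] = ('Wagon Wheel', 'Darius Rucker')
# hypothetical second entry, inserted later but with an earlier timestamp
playlist[datetime(2019, 11, 4, 19, 58)] = ('Placeholder Song', 'Placeholder Artist')

# iteration follows insertion order, not timestamp order
keys_in_order = list(playlist)
print(keys_in_order)
```

If you want the entries by time instead, sorted(playlist) gives the keys in chronological order.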
My code examples are always for Python >=3.6.0
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#4
Thanks for helping here. I am sure I have exactly 3 columns everywhere, and I am using Python 3.6. However, when I run this code, I keep getting this error:
playlist_dict[dt].append((song, artist))
KeyError: datetime.datetime(2019, 11, 4, 20, 2)

Any idea what is causing this?

#5
My mistake.

Line 9 is wrong:
playlist_dict[dt].append((song, artist))
# the key dt does not exist yet,
# so there is no list to append to
Change it to:

playlist_dict[dt] = (song, artist)
If you expect several songs with the same timestamp, then the value should be a list.
from collections import defaultdict

playlist_dict = defaultdict(list)
# looking up a missing key creates an empty list,
# which can then be modified

playlist_dict['this key does not exist'].append(42)  # <-- creates an empty list for the key, then appends to it
# now the key 'this key does not exist' exists.
playlist_dict['this key does not exist'].append(43)  # <-- appends the next object to the existing list
So you can decide. Either just assign a tuple with song/artist to the key;
then, if another song has the same timestamp, the old one is simply overwritten.
Or, if you expect duplicate timestamps, the easiest way is to use a defaultdict.
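To make the difference visible, here is a small sketch with two plays at the same minute (the second song is a made-up placeholder, and the names plays, overwritten and grouped are only for illustration):

```python
from collections import defaultdict
from datetime import datetime

dt = datetime(2019, 11, 4, 20, 2)

# Two hypothetical rows sharing one timestamp
plays = [
    (dt, 'Wagon Wheel', 'Darius Rucker'),
    (dt, 'Placeholder Song', 'Placeholder Artist'),
]

overwritten = {}             # plain dict: the last row wins
grouped = defaultdict(list)  # defaultdict: every row is kept

for ts, song, artist in plays:
    overwritten[ts] = (song, artist)
    grouped[ts].append((song, artist))

print(overwritten[dt])
print(grouped[dt])
```

With plain assignment the value is a single tuple, with defaultdict(list) it is a list of tuples, so the code that reads the dictionary has to match whichever one you pick.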



import csv
from collections import defaultdict
from datetime import datetime


with open('playlist.txt', 'r', newline='') as csv_file:
    playlist_dict = defaultdict(list)
    reader = csv.reader(
        csv_file, quotechar='"', delimiter=',',
        quoting=csv.QUOTE_ALL, skipinitialspace=True
    )
    for timestamp, song, artist in reader:
        # parse e.g. "November 4, 2019 8:02 PM" into a datetime key
        dt = datetime.strptime(timestamp, '%B %d, %Y %I:%M %p')
        playlist_dict[dt].append((song, artist))


print(playlist_dict)