Read data from a CSV file in S3 bucket and store it in a dictionary in python

Rupini · May-14-2020, 10:04 AM

I am trying to read a CSV file from an S3 bucket and store its content in a dictionary. Sample data is shown below. I want to use the first row as the keys and each subsequent row as the values.

sample data:
name,origin,dest
xxx,uk,france
yyyy,norway,finland
zzzz,denmark,canada

I am using the code below, which stores the entire row in a dictionary. But I want to loop through each row and store each field in the row as a key-value pair.

Current output: {'content': 'xxx,uk,france'}
Required output: {'name':'xxx','origin':'uk','dest':'france'}

import boto3

s3 = boto3.client('s3')
obj = s3.get_object(Bucket = 'bucket_name', Key = 'logs/log.csv')
lines = obj['Body'].read().decode("utf-8").replace("'", '"')
lines = lines.splitlines()
if (isinstance(lines, str)):
        lines = (lines)

docData = {}
for line in lines:
        docData['content'] = str(line)

print(docData)

***snippsat*** · May-14-2020, 10:43 AM

If you use the csv module and its DictReader,
it will give you that structure by default.

import csv

with open("log.csv") as f:
    records = csv.DictReader(f)
    for row in records:
        print(row)

Output:
{'name': 'xxx', 'origin': 'uk', 'dest': 'france'}
{'name': 'yyyy', 'origin': 'norway', 'dest': 'finland'}
{'name': 'zzzz', 'origin': 'denmark', 'dest': 'canada'}
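
The same approach can be applied to text that is already in memory, such as the decoded body of an S3 object from the first post. The following is only a sketch: the helper name rows_from_text is made up for illustration, and the sample string stands in for the result of obj['Body'].read().decode("utf-8"). DictReader accepts any iterable of lines, so the string is wrapped in io.StringIO instead of opening a file.

```python
import csv
import io


def rows_from_text(text):
    """Parse CSV text into a list of dicts keyed by the header row.

    rows_from_text is a hypothetical helper name, not part of any library.
    """
    # DictReader reads from any iterable of lines, so a StringIO wrapper
    # lets it consume a plain string instead of an open file.
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for obj['Body'].read().decode("utf-8") from the first post.
sample = "name,origin,dest\nxxx,uk,france\nyyyy,norway,finland\n"

for row in rows_from_text(sample):
    print(row)
```

DictReader looks only at the content it is given, not at a file name or extension.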

Rupini · May-14-2020, 11:52 AM

Hi @snippsat, thank you for your reply. I won't be able to use the csv module in this case, as my file can be either a CSV or a text file. My S3 bucket will include network log files (either .csv or .log, depending on the source) which I am trying to read. Are there other ways for me to achieve the same without the csv module?

***snippsat*** · May-15-2020, 04:57 PM

(May-14-2020, 11:52 AM)Rupini Wrote: is there other ways for me to achieve the same without CSV module?

Yes, you can write your own csv.DictReader implementation.
Also, if you look at the csv module documentation, there is a link at the top to the source code: Lib/csv.py.
There you can look at how it is written; the important part starts at line 119.

Here is a start with some good hints. As this is homework, a small part is missing.

lst = []
with open("log.csv") as f:
    header = next(f)
    header = header.strip().split(',')
    for row in f:
        row = row.strip().split(',')
        print(row)

Look at what we have now.

['xxx', 'uk', 'france']
['yyyy', 'norway', 'finland']
['zzzz', 'denmark', 'canada']

>>> header
['name', 'origin', 'dest']

>>> row
['zzzz', 'denmark', 'canada']

# Now we can test what line 119 does
>>> dict(zip(header, row))
{'name': 'zzzz', 'origin': 'denmark', 'dest': 'canada'}
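
Putting the hints together, one possible way to fill in the missing part is sketched below. It assumes the lines come from lines.splitlines() as in the first post and that no field contains a quoted comma, since a plain split(',') does not handle quoting. The function name parse_records and the delimiter parameter are made up for illustration.

```python
def parse_records(lines, delimiter=","):
    """Build one dict per data line, keyed by the fields of the first line.

    parse_records is a hypothetical helper; it does not handle quoted fields.
    """
    lines = iter(lines)
    header_line = next(lines, None)
    if header_line is None:
        # No header means there is nothing to parse.
        return []
    header = header_line.strip().split(delimiter)

    records = []
    for line in lines:
        if not line.strip():
            # Skip blank lines, e.g. a trailing newline at the end of the file.
            continue
        fields = line.strip().split(delimiter)
        # Pair each header name with the field in the same position.
        records.append(dict(zip(header, fields)))
    return records


sample_lines = [
    "name,origin,dest",
    "xxx,uk,france",
    "yyyy,norway,finland",
    "zzzz,denmark,canada",
]

for record in parse_records(sample_lines):
    print(record)
```

Note that zip stops at the shorter of its inputs, so a row with fewer fields than the header produces a dict with fewer keys, and extra fields are silently dropped.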
