Python Forum
what data structure to use?
#1
Hello,

I have a two-column CSV file that lists the amenities each town has:
ZIP;Amenity
51454;5
51454;6
53130;5
59437;6
63178;1
69029;6
69081;5
69290;5
71540;5
75101;6
75101;6
75101;5
75101;4
etc.


The first column contains ZIP codes, and the second column contains an amenity code for that town (e.g. 1=church, 2=school, 3=restaurant, etc.)


I need to write a loop that will fill an array:
1. If it doesn't yet exist, add the zip code as a key in that array
2. For that ZIP, if it doesn't exist yet, add a key for that kind of amenity (i.e. 1, 2, 3, etc.), then increment its value (e.g. 1=3 means that the town now has three churches).

What kind of data structure do you think is best for that task?

Thank you.
#2
Not a CSV file. A delimited file, but not a COMMA separated values file.

Which Python data types are you familiar with, or which ones are you allowed to use? There are many ways to solve this problem.
#3
If it were me I'd use a class.

class Zip_Code_Entry:
    """Amenity counts for a single ZIP code."""

    def __init__(self, zip_code):
        self.zip_code = zip_code
        self.amenities = {}  # amenity name -> count

    def add_amenity(self, amenity: str):
        # Increment the count, starting at 1 the first time an amenity is seen
        if amenity in self.amenities:
            self.amenities[amenity] += 1
        else:
            self.amenities[amenity] = 1

    def show_amenities(self):
        for key, value in self.amenities.items():
            print(f'There are {value} {key} in {self.zip_code}.')


first_one = Zip_Code_Entry('12345')
first_one.add_amenity('Churches')
first_one.show_amenities()
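To fill such objects from the delimited file, one possible sketch keeps a dict of entries keyed by ZIP code. The sample rows and the AMENITY_NAMES mapping are assumptions for illustration (the three names come from the examples in the question), and the class is repeated in trimmed form only so the snippet runs on its own.

```python
import csv
import io


class Zip_Code_Entry:
    """Trimmed copy of the class above: counts amenities for one ZIP code."""

    def __init__(self, zip_code):
        self.zip_code = zip_code
        self.amenities = {}

    def add_amenity(self, amenity):
        self.amenities[amenity] = self.amenities.get(amenity, 0) + 1


# Assumed code -> name mapping, based on the examples given in the question.
AMENITY_NAMES = {'1': 'church', '2': 'school', '3': 'restaurant'}

# Hypothetical rows standing in for the real file (same ';' layout).
SAMPLE = "ZIP;Amenity\n63178;1\n75101;3\n75101;3\n"

entries = {}  # ZIP code -> Zip_Code_Entry
rdr = csv.reader(io.StringIO(SAMPLE), delimiter=';')
next(rdr)  # skip header row
for zip_code, code in rdr:
    if zip_code not in entries:
        entries[zip_code] = Zip_Code_Entry(zip_code)
    # Fall back to the raw code when no name is known for it
    entries[zip_code].add_amenity(AMENITY_NAMES.get(code, code))

for entry in entries.values():
    print(entry.zip_code, entry.amenities)
```

With a real file, the io.StringIO(SAMPLE) part would be replaced by an open file object.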
#4
Thank you.
#5
You have a number of options.
One way is a dict of dicts. The outer dict has ZIP codes as keys and dicts as values. Each inner dict has the amenity code as key and the count of that amenity as value. You iterate over the data and populate the data structure.
You can do this in a number of different ways.
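
With plain dicts only, the two steps from the question could be sketched roughly like this. The SAMPLE text and the count_amenities name are made up for illustration; with a real file you would pass the open file object instead.

```python
import csv
import io

# Hypothetical sample data standing in for the real file (same ';' layout).
SAMPLE = """ZIP;Amenity
51454;5
51454;6
75101;6
75101;6
75101;5
"""


def count_amenities(lines):
    """Return {zip: {amenity: count}} built from ';'-delimited rows."""
    counts = {}
    rdr = csv.reader(lines, delimiter=';')
    next(rdr)  # skip header row
    for zipcode, amenity in rdr:
        # Step 1: add the ZIP code as a key if it is not there yet
        inner = counts.setdefault(zipcode, {})
        # Step 2: add the amenity key if missing, then increment its count
        inner[amenity] = inner.get(amenity, 0) + 1
    return counts


counts = count_amenities(io.StringIO(SAMPLE))
print(counts)
# {'51454': {'5': 1, '6': 1}, '75101': {'6': 2, '5': 1}}
```

Note that the keys stay strings here, which keeps any leading zeros in ZIP codes intact.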

For example, you can use collections.defaultdict, twice:

from collections import defaultdict
import csv

mydata = defaultdict(lambda: defaultdict(int))
with open('sample.txt') as f:
    rdr = csv.reader(f, delimiter=';')
    next(rdr) # skip header row
    for zipcode, amenity in rdr:
        mydata[zipcode][amenity] += 1
print(mydata)
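or just once, together with collections.Counter:
from collections import Counter, defaultdict

mydata = defaultdict(dict)
with open('sample.txt') as f:
    next(f)  # skip header row, otherwise 'ZIP;Amenity' is counted as data
    # Count identical lines; strip first so a missing final newline
    # or a blank line does not distort the counts
    lines = Counter(line.strip() for line in f if line.strip())
for key, value in lines.items():
    zipcode, amenity = key.split(';')
    mydata[zipcode][amenity] = value
print(mydata)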
Another way is to use pandas

import pandas as pd 
df = pd.read_csv('sample.txt', sep=';')
df = df.groupby(['ZIP', 'Amenity'], as_index=False).size()
print(df)
