Is there a better data structure than classes for a set of employes?
Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html), Thread: /thread-24664.html
Is there a better data structure than classes for a set of employes? - Schlangenversteher - Feb-26-2020

Hello,

I have around 100 uniform data sets of employees that I would like to read and process in my code. Most of the data, like name or nationality, can be used directly; other parameters, like their actual cost for the company, need to be derived from the data.

In the beginning, I used a dictionary of dictionaries to deal with the data set. This turned out to be unreadable, awful code. Therefore I used a .yml file for storing the data and wrote a script that reads the .yml, creates a class object for every employee and passes a list of employee objects to the actual code. While this works for me, it still looks kind of strange and off at times.

Is there a better way to store and process this data? I was thinking about using SQLite or some data structure that pandas offers.

RE: Is there a better data structure than classes for a set of employes? - Larz60+ - Feb-26-2020

It's hard to understand exactly what you are talking about without a sample of the data. A dictionary is a very good way to store structured data. Show an example of your data as a dictionary and also as yaml.

RE: Is there a better data structure than classes for a set of employes? - Schlangenversteher - Feb-26-2020

Example with dictionaries:

    employeList = [
        {
            "name": "susan",
            "nationality": "netherland",
            "overallCost": None
        },
        ...
    ]

    # resolve overall cost for susan
    employeList[0]["overallCost"] = resolveOverallCost(employeList[0])
    workWithEmployeList()

Example with yaml:

    # employe.yml
    - name: susan
      nationality: netherland
    ...

    # python script
    class employeObject:
        overallCost = None

        def __init__(self, name, nationality):
            self.name = name
            self.nationality = nationality
            self.resolveOverallCost()

        def resolveOverallCost(self):
            ...
            self.overallCost = calculatedValue

    employeObjectList = []
    with open(yamlfile) as f:
        employeList = readYamlFile(f)
    for employe in employeList:
        employeObjectList.append(employeObject(employe["name"], employe["nationality"]))
    workWithEmployeObjectList()

RE: Is there a better data structure than classes for a set of employes? - buran - Feb-26-2020

Classes look just fine (given that you also want to have derived properties/attributes). Of course, as Larz said, a built-in data structure like a dict or a named tuple can also be used, but this looks like a nice use case for a custom class. You can write the whole class from scratch or, to make it easier, have a look at @dataclass, which will help create __init__ and some other dunder methods for your class. We have a nice tutorial by @snippsat.

If you show your code as well as some sample data and what the derived data would look like, we can help with further guidance.

EDIT: you posted your code while I was answering.

RE: Is there a better data structure than classes for a set of employes? - buran - Feb-26-2020

Where do you get the data to calculate the cost? Do you have extra fields per employee in the yaml or just the name and nationality? Also, you can make the cost a property (using the @property decorator) instead of having an overallCost attribute and a resolveOverallCost method.

RE: Is there a better data structure than classes for a set of employes? - buran - Feb-26-2020

employees.yaml
    import yaml
    from random import randint
    from dataclasses import dataclass


    # one way to define a basic class
    class Employee:
        def __init__(self, name, nationality):
            self.name = name
            self.nationality = nationality

        @property
        def cost(self):
            # here I just randomly generate a cost between 0 and 20
            some_calculated_cost = randint(0, 20)
            return some_calculated_cost


    # alternative, using @dataclass
    @dataclass
    class Employee2:
        name: str
        nationality: str

        @property
        def cost(self):
            # here I just randomly generate a cost between 0 and 20
            some_calculated_cost = randint(0, 20)
            return some_calculated_cost


    if __name__ == '__main__':
        # load using the Employee class
        with open('employees.yaml') as f:
            employees = [Employee(**empl) for empl in yaml.safe_load(f)]

        # load using the Employee2 class
        with open('employees.yaml') as f:
            employees2 = [Employee2(**empl) for empl in yaml.safe_load(f)]

        print(employees)   # this one has no __str__ or __repr__ method defined
        print(employees2)  # note the difference, this one has a __repr__() method autocreated

        for employee in employees:
            print(f'{employee.name}: {employee.cost}')

        for employee in employees2:
            print(f'{employee.name}: {employee.cost}')
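
The question of where the cost data comes from was left open in the thread. As a rough sketch only, suppose each record also carried a `salary` field and an optional `overhead_rate` field. Both names, the cost formula, and the second employee are assumptions made for illustration and do not appear in the original data. Under those assumptions the @property idea could derive the cost from stored fields instead of a random number. A plain list of dicts stands in here for the result of yaml.safe_load, so the sketch runs without a YAML file.

```python
from dataclasses import dataclass, asdict


@dataclass
class Employee:
    name: str
    nationality: str
    salary: float                # hypothetical field, not in the thread's data
    overhead_rate: float = 0.25  # hypothetical default, chosen for illustration

    @property
    def cost(self) -> float:
        # Derived on access from stored fields, so it never goes stale.
        return self.salary * (1 + self.overhead_rate)


# Stand-in for yaml.safe_load(f); "john" and all numbers are made up.
records = [
    {"name": "susan", "nationality": "netherland", "salary": 4000.0},
    {"name": "john", "nationality": "germany", "salary": 3000.0, "overhead_rate": 0.5},
]

employees = [Employee(**record) for record in records]
total_cost = sum(employee.cost for employee in employees)

# asdict() only returns dataclass fields, so the derived cost is added by hand.
rows = [{**asdict(employee), "cost": employee.cost} for employee in employees]

for employee in employees:
    print(f"{employee.name}: {employee.cost}")
```

If a tabular view is wanted later, a list of plain dicts like `rows` is a shape that pandas.DataFrame accepts directly, so staying with classes for around 100 records does not rule out pandas for analysis.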