Python Forum
Is there a better data structure than classes for a set of employees?
#1
Hello,

I have around 100 uniform data sets of employees that I would like to read and process in my code. Most of the data, like name or nationality, can be used directly; other parameters, like their actual cost for the company, need to be derived from the data.
In the beginning, I used a dictionary of dictionaries to hold the data set. This turned out to be unreadable, awful code. Therefore I moved the data into a .yml file and wrote a script that reads the .yml, creates a class object for every employee and passes a list of employee objects to the actual code.
While this works for me, it still looks kind of strange and off at times.
Is there a better way to store and process this data? I was thinking about using SQLite or some data structure that pandas offers.
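For reference, here is a minimal sketch of what the SQLite option could look like with the standard-library sqlite3 module. The table layout, the base_salary column, the sample rows and the overhead factor are all assumptions made up for illustration; none of them come from the data described above.

```python
import sqlite3

# Hypothetical sample rows; base_salary is an assumed column used only for illustration.
rows = [
    ("susan", "netherland", 4000.0),
    ("john", "usa", 5000.0),
]

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.execute(
    "CREATE TABLE employee (name TEXT, nationality TEXT, base_salary REAL)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)

# Derive the overall cost inside the query; the factor is a placeholder, not a real formula.
OVERHEAD_FACTOR = 1.5
overall_costs = dict(
    conn.execute(
        "SELECT name, base_salary * ? FROM employee ORDER BY name",
        (OVERHEAD_FACTOR,),
    )
)
conn.close()

for name, cost in overall_costs.items():
    print(f"{name}: {cost}")
```

For roughly 100 records this is probably more machinery than needed, but it may pay off once the data has to be filtered, sorted or joined.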
#2
It's hard to understand exactly what you are talking about without a sample of data.
A dictionary is a very good way to store structured data.
Show an example of your data as a dictionary and also as yaml.
#3
Example with dictionaries:
employeList = [
    {
       "name": "susan",
       "nationality": "netherland",
       "overallCost": None
    },
    ...
]

# resolve overall cost for susan
employeList[0]["overallCost"] = resolveOverallCost(employeList[0])

workWithEmployeList()
Example with yaml:
# employe.yml
- name: susan
  nationality: netherland
...

# python script
import yaml


class employeObject:
    overallCost = None

    def __init__(self, name, nationality):
        self.name = name
        self.nationality = nationality
        self.resolveOverallCost()

    def resolveOverallCost(self):
        ...
        self.overallCost = calculatedValue


# read the list of employee dictionaries from the yaml file
with open(yamlfile) as f:
    employeList = yaml.safe_load(f)

# build one object per entry
employeObjectList = []
for employe in employeList:
    employeObjectList.append(employeObject(employe["name"], employe["nationality"]))

workWithEmployeObjectList()
#4
Classes look just fine (given that you also want derived properties/attributes). Of course, as Larz said, a built-in data structure like a dict or a named tuple can also be used, but this looks like a nice use case for a custom class.
You can write the whole class from scratch or, to make it easier, have a look at @dataclass, which will create __init__ and some other dunder methods for your class.
We have a nice tutorial by @snippsat
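As a rough sketch of the @dataclass idea, a derived value can be filled in once, right after the generated __init__ runs, via __post_init__. The base_salary field and the 1.5 factor are hypothetical placeholders, since the real inputs for the cost are not known here.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    name: str
    nationality: str
    base_salary: float  # hypothetical input field, not part of the posted data
    overall_cost: float = field(init=False)  # derived, so not an __init__ argument

    def __post_init__(self):
        # Placeholder formula; replace with the real cost calculation.
        self.overall_cost = self.base_salary * 1.5


susan = Employee("susan", "netherland", 4000.0)
print(susan)
```

The derived value is computed once at creation time, so it will not follow later changes to base_salary.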


If you show your code as well as some sample data and what the derived data would look like, we can help with further guidance.
EDIT: you posted your code while I was answering.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#5
Where do you get the data to calculate cost?
Do you have extra fields per employee in the yaml or just the name and nationality?
Also, you can make the cost a property (using the @property decorator) instead of having an overallCost attribute and a resolveOverallCost method.
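A minimal sketch of that suggestion, assuming the cost can be derived from some extra field. The base_salary attribute and the 1.5 factor are made up for illustration only.

```python
class Employee:
    def __init__(self, name, nationality, base_salary):
        self.name = name
        self.nationality = nationality
        self.base_salary = base_salary  # hypothetical field, not in the posted data

    @property
    def overall_cost(self):
        # Recomputed on every access, so it always reflects the current base_salary.
        return self.base_salary * 1.5  # placeholder formula


susan = Employee("susan", "netherland", 4000.0)
first = susan.overall_cost

susan.base_salary = 5000.0
second = susan.overall_cost  # picks up the changed salary without calling anything

try:
    susan.overall_cost = 1.0  # a property without a setter is read-only
    assignable = True
except AttributeError:
    assignable = False
```

This way there is no separate resolve step to forget, and the value cannot be overwritten by accident.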
#6
employees.yaml
- name: John
  nationality: USA
- name: Jane
  nationality: UK
import yaml
from random import randint
from dataclasses import dataclass

# one way to define basic class
class Employee:
    def __init__(self, name, nationality):
        self.name = name
        self.nationality = nationality

    @property
    def cost(self):
        some_calculated_cost = randint(0, 20)  # here I just randomly generate a cost between 0 and 20
        return some_calculated_cost

# alternative, using @dataclass
@dataclass
class Employee2:
    name: str
    nationality: str


    @property
    def cost(self):
        some_calculated_cost = randint(0, 20)  # here I just randomly generate a cost between 0 and 20
        return some_calculated_cost



if __name__ == '__main__':
    
    # load using Employee class
    with open('employees.yaml') as f:
        employees = [Employee(**empl) for empl in yaml.safe_load(f)]

    # load using Employee2 class
    with open('employees.yaml') as f:
        employees2 = [Employee2(**empl) for empl in yaml.safe_load(f)]

    print(employees) # this one has no __str__ or __repr__ method defined
    print(employees2) # note the difference, this one has __repr__() method autocreated

    for employee in employees:
        print(f'{employee.name}: {employee.cost}')

    for employee in employees2:
        print(f'{employee.name}: {employee.cost}')
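To make the difference between the two print calls more concrete, here is a small stand-alone sketch. The class names Plain and WithDataclass are illustrative only and not part of the code above.

```python
from dataclasses import dataclass


class Plain:
    def __init__(self, name):
        self.name = name


@dataclass
class WithDataclass:
    name: str


plain_repr = repr(Plain("John"))
dataclass_repr = repr(WithDataclass("John"))

print(plain_repr)      # default object repr: class name plus a memory address
print(dataclass_repr)  # generated __repr__ lists the fields and their values

# @dataclass also generates __eq__, so instances compare by field values.
plain_equal = Plain("John") == Plain("John")
dataclass_equal = WithDataclass("John") == WithDataclass("John")
```

The plain class falls back to identity comparison, so two objects with the same name are not considered equal unless you write __eq__ yourself.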