Python Forum
Grouping and sum of a list of objects - Printable Version


Grouping and sum of a list of objects - Otbredbaron - Oct-21-2021

Hi,

I have a list of objects with certain attributes: I would like to group them and then calculate the sum of one of their attributes.
I'll try to explain better with an example:


from itertools import groupby
from dataclasses import dataclass

@dataclass
class Mine():
    number: int
    material: str
    production: float
    um: str     # Unit of measure
    activity: bool

@dataclass 
class Total_Production():
    material: str
    tot_production: float
    um: str     # Unit of measure

def by_material(p):
    return p.material
    
def sum_by_material(mines):
    grouped_prods = []
    for key, group in groupby(sorted(mines, key=by_material), by_material):
        tot = 0
        for g in group:
            tot += g.production 
        grouped_prods.append(Total_Production(key, tot, g.um))
    return grouped_prods

This works fine, but I would like to compute the sum with the built-in sum() function, and I am open to better ideas in general. I tried to write something like this:

def sum_by_material(mines):
    grouped_prods = []
    for key, group in groupby(sorted(mines, key=by_material), by_material):
        tot = sum(g.production for g in group)
        grouped_prods.append(Total_Production(key, tot, ???))
    return grouped_prods

But of course the loop variable g no longer exists outside the generator expression, so I lose the unit of measure and I don't know how to pass it to Total_Production.


Thank you in advance


RE: Grouping and sum of a list of objects - Gribouillis - Oct-23-2021

If you are willing to use the well-known more_itertools library, you could use spy() to peek at the first item of each group without consuming it:

from more_itertools import spy

def sum_by_material(mines):
    grouped_prods = []
    for key, group in groupby(sorted(mines, key=by_material), by_material):
        (mine,), group = spy(group)
        tot = sum(g.production for g in group)
        grouped_prods.append(Total_Production(key, tot, mine.um))
    return grouped_prods
Note that your code assumes that all the mines with a given material use the same unit of measure.
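
If that assumption might not hold, one option is to group on the pair (material, um), so that mines reporting in different units are never added together. The following is only a minimal sketch: it repeats the dataclasses from the first post so that it runs on its own, and by_material_and_um and sum_by_material_and_um are hypothetical names introduced here, not part of the original code.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Mine:
    number: int
    material: str
    production: float
    um: str     # Unit of measure
    activity: bool

@dataclass
class Total_Production:
    material: str
    tot_production: float
    um: str     # Unit of measure

def by_material_and_um(mine):
    # Hypothetical key: two mines share a group only if both fields match.
    return (mine.material, mine.um)

def sum_by_material_and_um(mines):
    grouped_prods = []
    ordered = sorted(mines, key=by_material_and_um)
    for (material, um), group in groupby(ordered, key=by_material_and_um):
        # The unit of measure comes from the key, so the group can be
        # consumed entirely by sum().
        tot = sum(g.production for g in group)
        grouped_prods.append(Total_Production(material, tot, um))
    return grouped_prods
```

With this key, a material reported in two different units would produce two separate totals instead of one silently mixed sum.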

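You could also write sum_by_material without an additional library, by taking the first mine of each group with next() and using its production as the start value of sum():
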
def sum_by_material(mines):
    grouped_prods = []
    for key, group in groupby(sorted(mines, key=by_material), by_material):
        mine = next(group)
        tot = sum((g.production for g in group), mine.production)
        grouped_prods.append(Total_Production(key, tot, mine.um))
    return grouped_prods
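
To see the next()-based version run end to end, here is a self-contained sketch that repeats the definitions from the first post. The sample mines are made up for illustration and are not data from the thread.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Mine:
    number: int
    material: str
    production: float
    um: str     # Unit of measure
    activity: bool

@dataclass
class Total_Production:
    material: str
    tot_production: float
    um: str     # Unit of measure

def by_material(p):
    return p.material

def sum_by_material(mines):
    grouped_prods = []
    for key, group in groupby(sorted(mines, key=by_material), by_material):
        # groupby never yields an empty group, so next() is safe here.
        mine = next(group)
        # The first mine was already consumed, so its production is
        # passed as the start value of sum().
        tot = sum((g.production for g in group), mine.production)
        grouped_prods.append(Total_Production(key, tot, mine.um))
    return grouped_prods

# Made-up sample data, for illustration only.
mines = [
    Mine(1, "iron", 10.0, "t", True),
    Mine(2, "gold", 2.5, "kg", True),
    Mine(3, "iron", 5.0, "t", False),
]

totals = sum_by_material(mines)
for total in totals:
    print(total)
# Total_Production(material='gold', tot_production=2.5, um='kg')
# Total_Production(material='iron', tot_production=15.0, um='t')
```

The results come out ordered by material because the list is sorted before grouping; groupby only merges consecutive items with the same key.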