Python Forum

Full Version: group by property
I have a list like:
result = [
  {
    "name": "John",
    "m": "good",
    "n": 1
  },
  {
    "name": 'Alina',
    "m": "good",
    "n": 1
  },
  {
    "name": "Olivia",
    "m": "bad",
    "n": 2
  },
  {
    "name": "Ruby",
    "m": "bad",
    "n": 2
  },
  ...
]
I want to convert it into something like this:
result = [
  { // n is 1
    "g": [
      {
        "name": "John",
        "n": 1
      },
      {
        "name":"Alina",
        "n": 1
      }
    ],
    "b": [
      {
        "name": "...",
        "n": 1
      },
      {
        "name": "...",
        "n": 1
      }
    ]
  },
  { // n is 2
    "g": [...],
    "b": [...]
  },
  { // n is 3
   // ...
  }
]
What I'm trying to achieve is:
- if m is good, the record goes into the g list of the group whose n matches
- if m is bad, the record goes into the b list of the group whose n matches

So I need to group the records by n, and within each group split them into good and bad. I hope this is clear.

Please note that the value of n is generated from a timestamp. I have used 1, 2, 3... here just for simplicity.
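The grouping described above could be sketched roughly as follows. This is only a sketch under assumptions: the function name group_by_n is made up for illustration, m is assumed to be only "good" or "bad", and the groups come back in the order each n first appears. It matches n by its exact value, so it should not matter whether n is 1, 2, 3 or a timestamp.

```python
from collections import defaultdict


def group_by_n(records):
    """Group records by their exact n value, split into "g" (good) and "b" (bad).

    Assumption: every record's m is either "good" or "bad".
    """
    # Maps each distinct n to its own {"g": [...], "b": [...]} bucket.
    groups = defaultdict(lambda: {"g": [], "b": []})
    for rec in records:
        key = "g" if rec["m"] == "good" else "b"
        # Keep only name and n, as in the desired output above.
        groups[rec["n"]][key].append({"name": rec["name"], "n": rec["n"]})
    # Dicts preserve insertion order, so groups come out in first-seen order.
    return list(groups.values())


result = [
    {"name": "John", "m": "good", "n": 1},
    {"name": "Alina", "m": "good", "n": 1},
    {"name": "Olivia", "m": "bad", "n": 2},
    {"name": "Ruby", "m": "bad", "n": 2},
]
grouped = group_by_n(result)
print(grouped)
```

If the n value of each group is needed later, it can still be read from any record inside that group.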
This should do the trick. List is your input (result in your post) and NewList is the grouped output.
List = [
  {
    "name": "John",
    "m": "good",
    "n": 1
  },
  {
    "name": 'Alina',
    "m": "good",
    "n": 1
  },
  {
    "name": "Olivia",
    "m": "bad",
    "n": 2
  },
  {
    "name": "Ruby",
    "m": "bad",
    "n": 2
  }
]

NewList = [{'g' : [], 'b' : []}]
for Dict in List:
    if Dict['n'] == len(NewList):
        NewList[Dict['n'] - 1][Dict['m'][0]].append(Dict)
    else:
        NewList.append({'g' : [], 'b' : []})
print(NewList)
Thanks. It groups as expected, but the lists come back empty:

[
  {
    "g": [],
    "b": []
  },
  {
    "g": [],
    "b": []
  },
  ...
]
This line is probably the issue:
NewList[Dict['n'] - 1][Dict['m'][0]].append(Dict)
You are probably subtracting 1 from n. My n values are arbitrary numbers rather than an increasing sequence, so I need to match n exactly instead of using n - 1 as an index.
No, the first empty list is for bad with n = 1, but there are no bad records with n = 1, so it stays empty. The next empty one is for good with n = 2, but there are no good records with n = 2.
My real data has matching records for every n, both good and bad. I think the logic is not matching each n against the other records with the same n.
Try this and you get no empty lists, because each number has both good and bad records.
List = [
  {
    "name": "John",
    "m": "good",
    "n": 1
  },
  {
    "name": 'Alina',
    "m": "good",
    "n": 1
  },
  {
    "name": "Peter",
    "m": "bad",
    "n": 1
  },
  {
    "name": 'Mary',
    "m": "bad",
    "n": 1
  },
  {
    "name": "Olivia",
    "m": "bad",
    "n": 2
  },
  {
    "name": "Ruby",
    "m": "bad",
    "n": 2
  },
  {
    "name": "Sukuyomi",
    "m": "good",
    "n": 2
  },
  {
    "name": "Furuhashi",
    "m": "good",
    "n": 2
  }
]

NewList = [{'g' : [], 'b' : []}]
for Dict in List:
    if Dict['n'] == len(NewList):
        NewList[Dict['n'] - 1][Dict['m'][0]].append(Dict)
    else:
        NewList.append({'g' : [], 'b' : []})
print(NewList)
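One thing worth checking in the loop above: when n does not equal len(NewList), the else branch adds a new empty group but never stores the current record, so the first record of each new n appears to be skipped. The sketch below reruns the same logic on a shortened copy of the data and lists what was lost. The names records, new_list, kept_names and missing are mine, for illustration only.

```python
records = [
    {"name": "John", "m": "good", "n": 1},
    {"name": "Peter", "m": "bad", "n": 1},
    {"name": "Olivia", "m": "bad", "n": 2},
    {"name": "Ruby", "m": "bad", "n": 2},
    {"name": "Sukuyomi", "m": "good", "n": 2},
]

# Same logic as the loop above, unchanged apart from the variable names.
new_list = [{"g": [], "b": []}]
for rec in records:
    if rec["n"] == len(new_list):
        new_list[rec["n"] - 1][rec["m"][0]].append(rec)
    else:
        # A new group is created here, but rec itself is never stored.
        new_list.append({"g": [], "b": []})

# Collect every name that ended up in some group, then compare with the input.
kept_names = [rec["name"] for group in new_list for key in ("g", "b") for rec in group[key]]
missing = [rec["name"] for rec in records if rec["name"] not in kept_names]
print(missing)
```

With this sample the only missing name is Olivia, the first record with n = 2.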
Pandas is a very good tool to perform grouping, e.g.

import pandas as pd
df = pd.DataFrame(data)  # data: the list of dicts from the question
df.groupby(['m', 'n']).groups
Output:
{('bad', 2): Int64Index([2, 3], dtype='int64'), ('bad', 3): Int64Index([4], dtype='int64'), ('good', 1): Int64Index([0, 1], dtype='int64')}
If you don't want to use Pandas, you can do this with the standard library only, e.g.

from itertools import groupby
from operator import itemgetter
for a, b in groupby(data, key=itemgetter('m', 'n')):
    print(a, list(b))
Output:
('good', 1) [{'name': 'John', 'm': 'good', 'n': 1}, {'name': 'Alina', 'm': 'good', 'n': 1}]
('bad', 2) [{'name': 'Olivia', 'm': 'bad', 'n': 2}, {'name': 'Ruby', 'm': 'bad', 'n': 2}]
('bad', 3) [{'name': 'Ruby', 'm': 'bad', 'n': 3}]
This might not be exactly what you want. Nevertheless, these approaches may be useful: you can easily access and process the data in each group.
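Note that itertools.groupby only merges adjacent items with equal keys, so unsorted input is usually sorted by the same key first. A minimal sketch, assuming the m/n record layout from the question; the sample data and the names key, unsorted_keys and grouped are mine:

```python
from itertools import groupby
from operator import itemgetter

data = [
    {"name": "John", "m": "good", "n": 1},
    {"name": "Olivia", "m": "bad", "n": 2},
    {"name": "Alina", "m": "good", "n": 1},  # same key as John, but not adjacent
    {"name": "Ruby", "m": "bad", "n": 2},
]

key = itemgetter("n", "m")

# Without sorting, each run of equal keys becomes its own group.
unsorted_keys = [k for k, _ in groupby(data, key=key)]
print(len(unsorted_keys))

# Sort first so that records with equal (n, m) sit next to each other.
grouped = {
    k: [rec["name"] for rec in recs]
    for k, recs in groupby(sorted(data, key=key), key=key)
}
print(grouped)
```

Sorting by n also works when n is a timestamp, since only equality and ordering of the values are used.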
As I said, the value of n is generated from a timestamp, so it doesn't work. Try the following, which results in empty lists:

List = [
    {
      "name": "John",
      "m": "good",
      "n": 1561895746657
    },
    {
      "name": 'Alina',
      "m": "good",
      "n": 1561895746657
    },
    {
      "name": "Peter",
      "m": "bad",
      "n": 1561895746657
    },
    {
      "name": 'Mary',
      "m": "bad",
      "n": 1561895746657
    },
    {
      "name": "Olivia",
      "m": "bad",
      "n": 1561872416143
    },
    {
      "name": "Ruby",
      "m": "bad",
      "n": 1561872416143
    },
    {
      "name": "Sukuyomi",
      "m": "good",
      "n": 1561872416143
    },
    {
      "name": "Furuhashi",
      "m": "good",
      "n": 1561872416143
    }
]

NewList = [{'g' : [], 'b' : []}]
for Dict in List:
    if Dict['n'] == len(NewList):
        NewList[Dict['n'] - 1][Dict['m'][0]].append(Dict)
    else:
        NewList.append({'g' : [], 'b' : []})
print(NewList)

@scidam,

I just tried the standard library version and it works fine, but I'm not sure I understand it.

@SheeppOSU,

I would love to use your method because I don't need to import another library. Can you please update your answer to work with the timestamp values?
I just tested it and it worked for me, so I hope it works for you.
NewList = []
for Dict in List:
    if not NewList:
        NewList.append({'n' : Dict['n'], 'g' : [], 'b' : []})
    new = False
    count = 0
    for SortedDict in NewList:
        if SortedDict['n'] == Dict['n']:
            new = False
            NewList[count][Dict['m'][0]].append(Dict)
            break
        else:
            new = True
        count += 1
    if new:
        NewList.append({'n' : Dict['n'], 'g' : [], 'b' : []})
        NewList[len(NewList) - 1][Dict['m'][0]].append(Dict)
print(NewList)
Not sure where I've gone wrong, but I'm getting KeyError: 'c'. I can't see 'c' used as a key anywhere.
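A KeyError: 'c' here most likely comes from Dict['m'][0]: that expression takes the first character of m and uses it as the key, so any record whose m starts with "c" looks up a key the group dict does not have. Below is a hedged sketch of one way around it, using an explicit mapping instead of the first letter. The value "cancelled" and the name "Zed" are purely hypothetical, since the real data isn't shown in this thread, and the names M_TO_KEY and group_records are made up for illustration.

```python
M_TO_KEY = {"good": "g", "bad": "b"}  # explicit mapping instead of m[0]


def group_records(records):
    """Group by exact n; return (groups, skipped), where skipped holds unknown m values."""
    groups = {}   # n -> {"n": n, "g": [...], "b": [...]}
    skipped = []  # records whose m is neither "good" nor "bad"
    for rec in records:
        key = M_TO_KEY.get(rec["m"])
        if key is None:
            skipped.append(rec)
            continue
        # Create the group for this n on first sight, then reuse it.
        group = groups.setdefault(rec["n"], {"n": rec["n"], "g": [], "b": []})
        group[key].append(rec)
    return list(groups.values()), skipped


sample = [
    {"name": "John", "m": "good", "n": 1561895746657},
    {"name": "Olivia", "m": "bad", "n": 1561872416143},
    {"name": "Zed", "m": "cancelled", "n": 1561872416143},  # hypothetical value
]
groups, skipped = group_records(sample)
print(groups)
print(skipped)
```

Printing skipped on the real data should show which m values are neither "good" nor "bad".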