Python Forum
How to most effectively unpack list of name-value pair dictionaries in a dataframe?
#1
Hello! I'm working with a dataset that has a rather inconvenient format: one of the columns is a list of name-value pair dictionaries. I would like to expand that column so that each of the names becomes its own column. So far, I've found a way to do it by manually extracting each of the values, but I would prefer a more general solution that is also efficient. Here's an example:

import pandas as pd

data = {'name': ['Alice', 'Bob', 'Clark'],
        'preferences': [[{'name': 'fruit', 'value': 'apple'}, 
                         {'name': 'drink', 'value': 'lemonade'},
                         {'name': 'food', 'value': 'pizza'}],
                        [{'name': 'fruit', 'value': 'orange'}, 
                         {'name': 'drink', 'value': 'soda'},
                         {'name': 'food', 'value': 'soup'}],
                        [{'name': 'fruit', 'value': 'pear'}, 
                         {'name': 'drink', 'value': 'water'},
                         {'name': 'food', 'value': 'chicken'}]]}

df = pd.DataFrame(data)

# Extract values from 'preferences' column
df['fruit'] = df['preferences'].apply(lambda x: [item['value'] for item in x if item['name'] == 'fruit'][0])
df['drink'] = df['preferences'].apply(lambda x: [item['value'] for item in x if item['name'] == 'drink'][0])
df['food'] = df['preferences'].apply(lambda x: [item['value'] for item in x if item['name'] == 'food'][0])

# Drop the 'preferences' column
df = df.drop(columns=['preferences'])
An additional complication is that not every row has the same set of name-value pairs. In that case, the method above raises an IndexError unless an additional check is added, which makes it even less efficient.

Maybe the solution is to use pd.json_normalize on the preferences column, pivot that, then append the various dataframes?
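
For reference, here is a minimal sketch of what that json_normalize-and-pivot idea might look like. It assumes every list is non-empty and that no name repeats within a single row (pivot would raise a ValueError otherwise). The helper column `row` and the variable names `long`, `pairs`, `wide` and `result` are just illustrative. Clark's list is missing 'drink' on purpose to show the gap case.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Clark'],
        'preferences': [[{'name': 'fruit', 'value': 'apple'},
                         {'name': 'drink', 'value': 'lemonade'},
                         {'name': 'food', 'value': 'pizza'}],
                        [{'name': 'fruit', 'value': 'orange'},
                         {'name': 'drink', 'value': 'soda'},
                         {'name': 'food', 'value': 'soup'}],
                        [{'name': 'fruit', 'value': 'pear'},
                         {'name': 'food', 'value': 'chicken'}]]}
df = pd.DataFrame(data)

# One row per name-value pair; the original index label is repeated.
long = df.explode('preferences')

# Turn the dictionaries into 'name' and 'value' columns.
pairs = pd.json_normalize(long['preferences'].tolist())

# Remember which original row each pair came from (assumed helper column).
pairs['row'] = long.index

# Reshape to one column per preference name; missing pairs become NaN.
wide = pairs.pivot(index='row', columns='name', values='value')

# Attach the new columns to the original frame by index.
result = df.drop(columns=['preferences']).join(wide)
```

Note that a preference whose name is literally 'name' would collide with the existing column in the join, so that case would need a suffix or a rename first.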
#2
Updated code to account for potential gaps:

import pandas as pd
import numpy as np

def extract_value(items, name):
    # Return the value of the first pair called `name`, or NaN if there is none.
    return next((item['value'] for item in items if item['name'] == name), np.nan)

data = {'name': ['Alice', 'Bob', 'Clark'],
        'preferences': [[{'name': 'fruit', 'value': 'apple'}, 
                         {'name': 'drink', 'value': 'lemonade'},
                         {'name': 'food', 'value': 'pizza'}],
                        [{'name': 'fruit', 'value': 'orange'}, 
                         {'name': 'drink', 'value': 'soda'},
                         {'name': 'food', 'value': 'soup'}],
                        [{'name': 'fruit', 'value': 'pear'}, 
                         {'name': 'food', 'value': 'chicken'}]]}
 
df = pd.DataFrame(data)
 
# Extract values from 'preferences' column
df['fruit'] = df['preferences'].apply(lambda x: extract_value(x, 'fruit'))
df['drink'] = df['preferences'].apply(lambda x: extract_value(x, 'drink'))
df['food'] = df['preferences'].apply(lambda x: extract_value(x, 'food'))
 
# Drop the 'preferences' column
df = df.drop(columns=['preferences'])
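
If the set of names is not known in advance, the per-name calls can probably be avoided altogether. The sketch below collapses each list into one flat dictionary and lets the DataFrame constructor fill the gaps with NaN. It is only one possible approach, not a benchmarked one, and the names `records`, `wide` and `result` are made up for illustration. One behavioral difference to keep in mind: if a name appears twice in the same list, the last value wins here, whereas the lookup by name above returns the first.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Clark'],
        'preferences': [[{'name': 'fruit', 'value': 'apple'},
                         {'name': 'drink', 'value': 'lemonade'},
                         {'name': 'food', 'value': 'pizza'}],
                        [{'name': 'fruit', 'value': 'orange'},
                         {'name': 'drink', 'value': 'soda'},
                         {'name': 'food', 'value': 'soup'}],
                        [{'name': 'fruit', 'value': 'pear'},
                         {'name': 'food', 'value': 'chicken'}]]}
df = pd.DataFrame(data)

# Collapse each list of {'name': ..., 'value': ...} pairs into one dict per row.
records = [{item['name']: item['value'] for item in prefs}
           for prefs in df['preferences']]

# Keys missing from a row's dict are filled with NaN by the constructor.
wide = pd.DataFrame(records, index=df.index)

# Swap the 'preferences' column for the expanded columns.
result = df.drop(columns=['preferences']).join(wide)
```

This loops over the rows once instead of once per name, so it should scale better as the number of distinct names grows.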