Python Forum

Full Version: Better way to create nested dictionary with defaultdict()
I need to loop through some JSON data (company storm data) and create a nested dictionary four keys deep: the first three keys map to dicts, and the last key maps to a list of integers. I want to avoid KeyErrors, so I am using defaultdict(). To avoid AttributeErrors when appending to the list, I have come up with two solutions. The first uses nested defaultdict() calls, passing list to the innermost one:

from collections import defaultdict
nestedDict = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list))))
nestedDict[k1][k2][k3][k4].append(someInt)

The second uses a recursive factory function with a try/except clause:

def dict_factory():
    return defaultdict(dict_factory)

nestedDict = dict_factory()
try:
    nestedDict[k1][k2][k3][k4].append(someInt)
except AttributeError:
    # on first access the leaf is still an empty defaultdict, not a list
    nestedDict[k1][k2][k3][k4] = [someInt]

The first solution is certainly shorter, but will the nested defaultdict() calls cause any issues that I'm not considering? The second solution is wordier, but I'm wondering whether it is the better way. Is one solution recommended over the other?
Any assistance would be appreciated.
Thanks.
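To make the comparison concrete, here is a minimal runnable sketch of both approaches side by side. The `records` tuples are made-up sample data, not the storm data from the question, and `to_plain` is a hypothetical helper added for illustration. The sketch also demonstrates one side effect worth knowing about: simply reading a missing key from a defaultdict inserts that key.

```python
from collections import defaultdict

# Hypothetical sample: (k1, k2, k3, k4, some_int) tuples standing in for the real data.
records = [
    ("V1", "V2", "V3", "V4", 6),
    ("V1", "V2", "V3", "V4", 9),
    ("V1", "V2", "X3", "X4", 1),
]

# Approach 1: fixed depth, innermost default is a list.
fixed = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list))))
for k1, k2, k3, k4, value in records:
    fixed[k1][k2][k3][k4].append(value)

# Approach 2: self-referencing factory, any depth; the leaf has to be set by hand.
def dict_factory():
    return defaultdict(dict_factory)

recursive = dict_factory()
for k1, k2, k3, k4, value in records:
    try:
        recursive[k1][k2][k3][k4].append(value)
    except AttributeError:
        # First visit: the leaf is still an empty defaultdict, so replace it with a list.
        recursive[k1][k2][k3][k4] = [value]

def to_plain(obj):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(obj, dict):
        return {key: to_plain(val) for key, val in obj.items()}
    return obj

plain_fixed = to_plain(fixed)
plain_recursive = to_plain(recursive)
print(plain_fixed == plain_recursive)  # True

# Side effect: a plain lookup of a missing key inserts it.
before = len(fixed)
fixed["missing"]
after = len(fixed)
print(before, after)  # 1 2
```

Both approaches end up with the same data here. Converting to plain dicts before handing the result to other code avoids accidental key creation on lookups, and it also sidesteps the fact that a defaultdict whose factory is a lambda cannot be pickled.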
If you have JSON data, why do you need to create these nested dicts? It looks like you are doing something wrong. Could you elaborate on your goal?
(Nov-29-2017, 09:20 PM) buran wrote: If you have JSON data, why do you need to create these nested dicts? It looks like you are doing something wrong. Could you elaborate on your goal?

You can create the dict from the JSON data directly.
The data below is a JSON example from GitHub.

>>> import json
>>> j = """
... [{
...   "_id": {
...     "$oid": "5968dd23fc13ae04d9000001"
...   },
...   "product_name": "sildenafil citrate",
...   "supplier": "Wisozk Inc",
...   "quantity": 261,
...   "unit_cost": "$10.47"
... }, {
...   "_id": {
...     "$oid": "5968dd23fc13ae04d9000002"
...   },
...   "product_name": "Mountain Juniperus ashei",
...   "supplier": "Keebler-Hilpert",
...   "quantity": 292,
...   "unit_cost": "$8.74"
... }, {
...   "_id": {
...     "$oid": "5968dd23fc13ae04d9000003"
...   },
...   "product_name": "Dextromathorphan HBr",
...   "supplier": "Schmitt-Weissnat",
...   "quantity": 211,
...   "unit_cost": "$20.53"
... }]"""

>>> d = json.loads(j)

>>> d
[{'_id': {'$oid': '5968dd23fc13ae04d9000001'}, 'product_name': 'sildenafil citrate', 'supplier': 'Wisozk Inc', 'quantity': 261, 'unit_cost': '$10.47'}, {'_id': {'$oid': '5968dd23fc13ae04d9000002'}, 'product_name': 'Mountain Juniperus ashei', 'supplier': 'Keebler-Hilpert', 'quantity': 292, 'unit_cost': '$8.74'}, {'_id': {'$oid': '5968dd23fc13ae04d9000003'}, 'product_name': 'Dextromathorphan HBr', 'supplier': 'Schmitt-Weissnat', 'quantity': 211, 'unit_cost': '$20.53'}]

>>> type(d[0])
<class 'dict'>
How did I miss copying the json.loads() call from the terminal? Sorry about that. Corrected.
Sorry, I should not have said it was JSON data; that has confused the issue. The JSON data is not in the format I need, therefore I cannot use json.loads(). The JSON data is in this format (and there can be hundreds of dicts):
JSON = '[{K1:V1, K2:V2, K3:V3, K4:V4, K5:V5}]'

The nested dictionary needs to be in this format (where V5 is an integer):
nestedDict = { V1:{V2:{V3:{V4:[V5]}}}}
one possible approach:
from collections import OrderedDict
import json
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}]'
jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
json_values = list(jsn[0].values())
my_nested_dict = {json_values[-2]:[json_values[-1]]}
for v in json_values[-3::-1]:
    my_nested_dict = {v:my_nested_dict}
print(my_nested_dict)
Output:
{u'V1': {u'V2': {u'V3': {u'V4': [5]}}}}
Thanks. I'll add this to my cookbook.
And here is a different approach using recursion:

from collections import OrderedDict
import json

def dict_factory(lst):
    if len(lst) == 2:
        return {lst[0]:[lst[1]]}
    else:
        return {lst[0]:dict_factory(lst[1:])}

JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}]'
jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
json_values = list(jsn[0].values())
my_nested_dict = dict_factory(json_values)
print(my_nested_dict)
These don't work for a JSON list with multiple dictionaries, e.g.
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6},{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'
But I'll compare them to my solution and see which I like. Thanks for the answers.
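For the multi-record case just mentioned, one option that needs neither defaultdict nor OrderedDict is to walk each record's values and build the path with `dict.setdefault`. This is only a sketch: `add_record` is a hypothetical helper, and it assumes Python 3.7+, where the dicts returned by `json.loads` keep the key order of the JSON text.

```python
import json

def add_record(tree, values):
    """Insert one record into tree.

    All but the last two values are nesting keys, the second-to-last
    is the key holding the list, and the last is the value to append.
    """
    *path, list_key, value = values
    node = tree
    for key in path:
        # Descend one level, creating an empty dict if the key is new.
        node = node.setdefault(key, {})
    node.setdefault(list_key, []).append(value)

JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6},{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'

nested = {}
for item in json.loads(JSON):
    add_record(nested, list(item.values()))

print(nested)  # {'V1': {'V2': {'V3': {'V4': [6, 9]}}}}
```

Because the result is made of plain dicts, a lookup of a missing key raises KeyError instead of silently creating an entry. If the key order of the real data cannot be trusted, the values could be pulled out by explicit key name instead of relying on `item.values()`.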
Just iterate over jsn.

Using a comprehension:

from collections import OrderedDict
import json
 
def dict_factory(lst):
    if len(lst) == 2:
        return {lst[0]:[lst[1]]}
    else:
        return {lst[0]:dict_factory(lst[1:])}
 
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'

jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
json_values = (list(item.values()) for item in jsn)
nested_dicts = [dict_factory(values) for values in json_values]
print(nested_dicts)
Or expand the comprehension into an explicit loop:

from collections import OrderedDict
import json
 
def dict_factory(lst):
    if len(lst) == 2:
        return {lst[0]:[lst[1]]}
    else:
        return {lst[0]:dict_factory(lst[1:])}
 
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'

jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
nested_dicts = [ ]
for item in jsn:
    json_values = list(item.values())
    nested_dict = dict_factory(json_values)
    nested_dicts.append(nested_dict)
print(nested_dicts)
Output:
[{u'V1': {u'V2': {u'V3': {u'V4': [5]}}}}, {u'V1': {u'V2': {u'V3': {u'V4': [6]}}}}, {u'V1': {u'V2': {u'V3': {u'V4': [9]}}}}]
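The list above still holds one nested dict per record. If a single combined dict with all integers collected under the same key is the goal, the per-record results could be folded together with a small recursive merge. This is a sketch under that assumption; `merge_into` is a hypothetical helper, not something from the thread, and the `nested_dicts` literal simply repeats the output above.

```python
def merge_into(target, source):
    """Merge source into target in place.

    Dict values are merged key by key; list values are extended.
    """
    for key, value in source.items():
        if isinstance(value, dict):
            merge_into(target.setdefault(key, {}), value)
        else:
            # Leaf level: copy the list items so source lists are not shared.
            target.setdefault(key, []).extend(value)
    return target

nested_dicts = [
    {"V1": {"V2": {"V3": {"V4": [5]}}}},
    {"V1": {"V2": {"V3": {"V4": [6]}}}},
    {"V1": {"V2": {"V3": {"V4": [9]}}}},
]

combined = {}
for nested_dict in nested_dicts:
    merge_into(combined, nested_dict)

print(combined)  # {'V1': {'V2': {'V3': {'V4': [5, 6, 9]}}}}
```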