I need to loop through some JSON data (company storm data) and build a nested dictionary four keys deep: the first three keys map to dicts, and the last key maps to a list of integers. I want to avoid KeyErrors, so I am using defaultdict(). To avoid AttributeErrors when appending to the list, I have come up with two solutions. The first nests defaultdict() calls and passes list to the innermost one:
nestedDict = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list))))
nestedDict[k1][k2][k3][k4].append(someInt)
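The second uses a recursive factory function with a try/except clause:
def dict_factory():
    # Every missing key yields another defaultdict of the same kind.
    return defaultdict(dict_factory)

nestedDict = dict_factory()
try:
    nestedDict[k1][k2][k3][k4].append(someInt)
except AttributeError:
    # First visit: the leaf is still an empty defaultdict, which has no append().
    nestedDict[k1][k2][k3][k4] = [someInt]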
The first solution is certainly shorter, but will the nested defaultdict() calls cause any issues that I'm not considering? The second solution is wordier, but I'm wondering whether it is the better way. Is one solution recommended over the other?
Any assistance would be appreciated.
Thanks.
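For illustration, here is a minimal sketch of the first approach run end to end. The keys and integers in `samples` are made-up placeholders, not the real storm data, and `to_plain` is a hypothetical helper name. It also shows one side effect of defaultdict worth knowing about: merely reading a missing key inserts it.

```python
from collections import defaultdict

# Four levels deep; the innermost level holds lists.
nested = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list))))

# Placeholder records: (k1, k2, k3, k4, value). Assumed data for illustration only.
samples = [("a", "b", "c", "d", 1), ("a", "b", "c", "d", 2), ("a", "x", "y", "z", 3)]
for k1, k2, k3, k4, value in samples:
    nested[k1][k2][k3][k4].append(value)

# Side effect: looking up a missing key silently creates it.
_ = nested["missing"]
has_missing = "missing" in nested


def to_plain(obj):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(obj, dict):
        return {key: to_plain(val) for key, val in obj.items()}
    return obj


plain = to_plain(nested)
print(plain)
```

Converting to plain dicts once the structure is built restores the usual KeyError on missing keys, which may or may not matter for your use case.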
If you have JSON data, why do you need to create these nested dicts? It looks like you are doing something wrong. Could you elaborate on your goal?
(Nov-29-2017, 09:20 PM) buran wrote: If you have JSON data, why do you need to create these nested dicts? It looks like you are doing something wrong. Could you elaborate on your goal?
You can create the dict from the JSON file directly
The data is a JSON example from GitHub.
>>> import json
>>> j = """
... [{
... "_id": {
... "$oid": "5968dd23fc13ae04d9000001"
... },
... "product_name": "sildenafil citrate",
... "supplier": "Wisozk Inc",
... "quantity": 261,
... "unit_cost": "$10.47"
... }, {
... "_id": {
... "$oid": "5968dd23fc13ae04d9000002"
... },
... "product_name": "Mountain Juniperus ashei",
... "supplier": "Keebler-Hilpert",
... "quantity": 292,
... "unit_cost": "$8.74"
... }, {
... "_id": {
... "$oid": "5968dd23fc13ae04d9000003"
... },
... "product_name": "Dextromathorphan HBr",
... "supplier": "Schmitt-Weissnat",
... "quantity": 211,
... "unit_cost": "$20.53"
... }]"""
>>> d = json.loads(j)
>>> d
[{'_id': {'$oid': '5968dd23fc13ae04d9000001'}, 'product_name': 'sildenafil citrate', 'supplier': 'Wisozk Inc', 'quantity': 261, 'unit_cost': '$10.47'}, {'_id': {'$oid': '5968dd23fc13ae04d9000002'}, 'product_name': 'Mountain Juniperus ashei', 'supplier': 'Keebler-Hilpert', 'quantity': 292, 'unit_cost': '$8.74'}, {'_id': {'$oid': '5968dd23fc13ae04d9000003'}, 'product_name': 'Dextromathorphan HBr', 'supplier': 'Schmitt-Weissnat', 'quantity': 211, 'unit_cost': '$20.53'}]
>>> type(d[0])
<class 'dict'>
How did I miss copying the json.loads() call from the terminal? Sorry about that. Corrected.
Sorry, I should not have simply said it was JSON data; that has confused the issue. The JSON data is not in the format I need, so I cannot simply use json.loads(). The JSON data is in this format (and there can be hundreds of dicts):
JSON = '[{K1:V1, K2:V2, K3:V3, K4:V4, K5:V5}]'
The nested dictionary needs to be in this format (where V5 is an integer):
nestedDict = { V1:{V2:{V3:{V4:[V5]}}}}
One possible approach:
from collections import OrderedDict
import json
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}]'
jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
json_values = list(jsn[0].values())
my_nested_dict = {json_values[-2]:[json_values[-1]]}
for v in json_values[-3::-1]:
    my_nested_dict = {v: my_nested_dict}
print(my_nested_dict)
Output:
{u'V1': {u'V2': {u'V3': {u'V4': [5]}}}}
Thanks. I'll add this to my cookbook.
And here is a different approach using recursion:
from collections import OrderedDict
import json
def dict_factory(lst):
    if len(lst) == 2:
        return {lst[0]: [lst[1]]}
    else:
        return {lst[0]: dict_factory(lst[1:])}
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}]'
jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
json_values = list(jsn[0].values())
my_nested_dict = dict_factory(json_values)
print(my_nested_dict)
These don't work for a JSON list with multiple dictionaries, e.g.
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6},{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'
But I'll compare them to my solution and see which I like. Thanks for the answers.
Just iterate over jsn.
Using a comprehension:
from collections import OrderedDict
import json
def dict_factory(lst):
    if len(lst) == 2:
        return {lst[0]: [lst[1]]}
    else:
        return {lst[0]: dict_factory(lst[1:])}
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'
jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
json_values = (list(item.values()) for item in jsn)
nested_dicts = [dict_factory(values) for values in json_values]
print(nested_dicts)
Or expand the comprehension into an explicit loop:
from collections import OrderedDict
import json
def dict_factory(lst):
    if len(lst) == 2:
        return {lst[0]: [lst[1]]}
    else:
        return {lst[0]: dict_factory(lst[1:])}
JSON = '[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":5}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6}, {"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]'
jsn = json.loads(JSON, object_pairs_hook=OrderedDict)
nested_dicts = []
for item in jsn:
    json_values = list(item.values())
    nested_dict = dict_factory(json_values)
    nested_dicts.append(nested_dict)
print(nested_dicts)
Output:
[{u'V1': {u'V2': {u'V3': {u'V4': [5]}}}}, {u'V1': {u'V2': {u'V3': {u'V4': [6]}}}}, {u'V1': {u'V2': {u'V3': {u'V4': [9]}}}}]
>>>
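If the goal is instead a single nested dictionary with the K5 values collected into one list (the format described earlier in the thread), a minimal sketch could combine json.loads() with dict.setdefault(). It assumes every record carries the keys K1 to K5 as in the example above; `merge_records` and its parameters are names made up for illustration.

```python
import json

JSON = ('[{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":6},'
        '{"K1":"V1", "K2":"V2", "K3":"V3", "K4":"V4", "K5":9}]')


def merge_records(records, keys=("K1", "K2", "K3", "K4"), value_key="K5"):
    """Merge a list of flat records into one nested dict of lists.

    Hypothetical helper: assumes each record contains every key in
    `keys` plus `value_key`.
    """
    merged = {}
    for rec in records:
        level = merged
        # Walk (or create) the intermediate dict levels.
        for key in keys[:-1]:
            level = level.setdefault(rec[key], {})
        # The last key maps to a list that accumulates the values.
        level.setdefault(rec[keys[-1]], []).append(rec[value_key])
    return merged


merged = merge_records(json.loads(JSON))
print(merged)  # on Python 3: {'V1': {'V2': {'V3': {'V4': [6, 9]}}}}
```

Because this uses plain dicts, a later lookup of a missing key still raises KeyError, unlike the defaultdict versions from the first post.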