Basic one: Aggregating from a dictionary - Mustey - Aug-12-2019

I am so sorry, that's a stupid mistake (I forgot to return!). Feel free to delete!

Sorry, I haven't coded for 2 years (and I wasn't any good when I did!) and now I need to write a Python script. Basically, I need to tell people what files they own on the system. I've managed to get the files "crawled", producing a JSON like:

    {'file1.txt':'John','file2.txt':'James','file3.txt':'John'}

My next step is to aggregate them like so (this is the format my next method expects):

    {'John':['file1.txt','file3.txt'],'James':['file2.txt']}

I don't care about the order. Plus I know how to .sort() if I need to :)

I have a feeling I might be able to use NumPy, a Pandas DataFrame or some other instant solution, but I don't know these yet and I don't want to jump the gun. Also, it really irritates me that I have something that should work but I seem to be missing a point about language semantics. Here's what I got:

    def group_by_owners(to_group):
        # get list of owners:
        owners = to_group.values()
        # Remove duplicated values:
        owners = list(set(owners))
        # Prepare a list of tuples from the dict:
        ownership_tuples = to_group.items()
        # Find files per each owner:
        result = {}
        for current_owner in owners:
            current_owner_files = []
            # Brute search through data:
            for (file_name, owner_name) in ownership_tuples:
                if owner_name == current_owner:
                    current_owner_files.append(file_name)
            result[current_owner] = current_owner_files

    files = {
        'file1.txt': 'John',
        'file2.txt': 'James',
        'file3.txt': 'John'
    }
    print(group_by_owners(files))

Running this returns:

    None

1. I thought that with "(file_name, owner_name)" I would be able to traverse the "ownership_tuples" list, but maybe I am wrong?
2. I googled for an hour but I couldn't find help on how to set a dictionary value when the key is a variable. I just guessed here: result[current_owner]
3. When I do print(ownership_tuples) I get:

    dict_items([('file1.txt', 'John'), ('file2.txt', 'James'), ('file3.txt', 'John')])

What is this 'dict_items'? Is it an indicator that I am doing something wrong? I expected to print a list of tuples, not a dictionary.

PS: any ideas to make this more elegant will be happily accepted, as I am just warming up my brain to programming again and could use some inspiration!

PS2: It's a surprise I get so much done with computers. I am a bottom-feeder in the world of programming. Apologies for my beginner questions!

RE: Basic one: Aggregating from a dictionary - buran - Aug-12-2019

It can probably be done with NumPy too, but there is defaultdict in the collections module:

    from collections import defaultdict
    import json

    # Missing keys start out as an empty list
    spam = defaultdict(list)
    files = {'file1.txt': 'John', 'file2.txt': 'James', 'file3.txt': 'John'}
    for fname, owner in files.items():
        spam[owner].append(fname)
    print(spam)

    # dump as json
    eggs = json.dumps(spam)
    print(eggs)
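
For completeness, here is a minimal sketch of the function from the first post with the missing return added, which also touches on the dict_items question. It is not the original poster's final code: it assumes Python 3.7+ (so dictionaries keep insertion order), collapses the two loops into a single pass, and uses the built-in dict.setdefault in place of the defaultdict shown above. The names `grouped` and `pairs` are illustrative only.

```python
def group_by_owners(to_group):
    """Map each owner to the list of files they own."""
    result = {}
    # .items() returns a dict_items view; iterating it yields (key, value) tuples.
    for file_name, owner_name in to_group.items():
        # setdefault creates an empty list the first time an owner is seen,
        # then returns that list so the file name can be appended to it.
        result.setdefault(owner_name, []).append(file_name)
    # The original attempt built this dictionary but never returned it,
    # which is why the call printed None.
    return result


files = {'file1.txt': 'John', 'file2.txt': 'James', 'file3.txt': 'John'}
grouped = group_by_owners(files)
print(grouped)  # {'John': ['file1.txt', 'file3.txt'], 'James': ['file2.txt']}

# A dict_items view can be converted to a plain list of tuples when needed.
pairs = list(files.items())
print(pairs)
```

Seeing dict_items in the printed output is not a sign of a mistake: it is a view over the dictionary's (key, value) pairs, and it can be looped over directly or wrapped in list() as in the last lines of the sketch.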