Searching through a list of dictionaries with a condition.
Searching through a list of dictionaries with a condition. - Mr_Keystrokes - Sep-28-2018

Hey guys, I have a huge spreadsheet that I am attempting to search through for some specific data. On the one hand I have IDs like this:

Y00988-11
G01024-14
Z01933-13

And on the other hand I have a massive spreadsheet (CSV) in the following format:

Run,Sample,Source,Rate,
DFT,G01024-14,A,High
DFT,U04424-15,B,Low
TFF,T64673-18,A,Low
RRT,I01324-14,A,High
RRT,J01624-14,A,High
...

I'm trying to extract both the 'Sample' ID and the 'Run'. I read the CSV spreadsheet into a list of dictionaries using the built-in reader, but I'm having trouble extracting the elements I am interested in.

    import csv
    import sys

    # sequences of interest
    dataset=sys.argv[1]

    # CSV spreadsheet
    database=sys.argv[2]

    sampleIDs=[]

    with open(dataset, 'r') as file:
        for line in file:
            line.strip('\n')
            sampleIDs.append(line)
    file.close()

    seq_Dict=[]
    finalList=['init']

    with open(database, 'rb') as csvfile:
        reader=csv.DictReader(csvfile, delimiter='\t')
        for line in reader:
            seq_Dict.append(line)
    csvfile.close()

    for element in seq_Dict:
        for key, value in element.items():
            if element['Sample'] in sampleIDs:
                finalList.pop()
                finalList.append(element['Sample']+" "+element['Run'])

    for i in finalList:
        print(i)

This script returns the info of only the last ID in my sampleIDs, so it looks like each pass through the loop is overwriting the previous iteration. I did try to use deepcopy, but that didn't seem to work.
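One detail in the script above that may matter: `str.strip` returns a new string and does not change `line` in place, so `sampleIDs` still holds each ID with its trailing newline. If the ID file has no newline after its last entry (an assumption, the thread does not say), only the last ID could ever match a 'Sample' value. A minimal sketch of the effect, with `io.StringIO` standing in for the real ID file and `raw_ids` / `clean_ids` as illustrative names:

```python
import io

# Hypothetical stand-in for the ID file; the real script opens sys.argv[1].
# Note: no trailing newline after the last ID (assumed for illustration).
id_file = io.StringIO("Y00988-11\nG01024-14\nZ01933-13")

raw_ids = []    # what the posted loop builds
clean_ids = []  # what was probably intended

for line in id_file:
    line.strip('\n')                # result is discarded; line is unchanged
    raw_ids.append(line)
    clean_ids.append(line.strip())  # keep the stripped copy instead

print(raw_ids)                    # ['Y00988-11\n', 'G01024-14\n', 'Z01933-13']
print(clean_ids)                  # ['Y00988-11', 'G01024-14', 'Z01933-13']
print('G01024-14' in raw_ids)     # False
print('G01024-14' in clean_ids)   # True
```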
RE: Searching through a list of dictionaries with a condition. - ichabod801 - Sep-28-2018

It's not overwriting, you are removing the last iteration before you add a new one. The pop method removes the last item of the list. You keep removing an item and adding an item (lines 31 + 32), so you end up with one item. Remove line 31 and it should work.

You can also remove line 25. The with statement on line 21 takes care of that.

RE: Searching through a list of dictionaries with a condition. - Mr_Keystrokes - Sep-28-2018

(Sep-28-2018, 12:07 PM)ichabod801 Wrote: It's not overwriting, you are removing the last iteration before you add a new one. The pop method removes the last item of the list. You keep removing an item and adding an item (lines 31 + 32), so you end up with one item. Remove line 31 and it should work.

On the contrary, by removing the pop method it still returns the last sample ID, the only difference is that it's repeated by the number of key-values there are in the dictionary, i.e.

S01933-11 r480
S01933-11 r480
S01933-11 r480
S01933-11 r480
S01933-11 r480
S01933-11 r480
S01933-11 r480
S01933-11 r480
S01933-11 r480

RE: Searching through a list of dictionaries with a condition. - ichabod801 - Sep-28-2018

Okay, this bit:

    for element in seq_Dict:
        for key, value in element.items():
            if element['Sample'] in sampleIDs:
                finalList.append(element['Sample']+" "+element['Run'])

The second for loop is not necessary. What the above code does is for every key in element, it checks element and appends the sample and run. If you get rid of the second for loop, it will just check each element once.

I think there may only be one matching element in the data that matches your filter, and the above issue is why it is repeated. But I can't check that without a (small) sample of the data and what you are passing to sampleIDs.

RE: Searching through a list of dictionaries with a condition. - Mr_Keystrokes - Oct-01-2018

Hey apologies for late reply, I'm not getting email alerts.
I'm going to try what you said, although I'd like to point out why I created the second loop. It's because I don't know what the syntax is to access a particular key in a list of dictionaries. For example, I know that there is the syntax:

    arrayofDict[0]['key']

But this will home in on only the first element of the list and won't grant access to all the dictionaries in the list. I'm trying to cycle through the list of dictionaries and print out the value of a particular key.

RE: Searching through a list of dictionaries with a condition. - ichabod801 - Oct-01-2018

If seq_Dict is your list of dictionaries, that's what your first for loop does. Each time through the loop, element is the next dict in seq_Dict.

RE: Searching through a list of dictionaries with a condition. - Mr_Keystrokes - Oct-02-2018

Yeah, but the question is: if every dictionary in the list has the same keys (structure), can one access and retrieve the value of the key you're interested in, and only that key?

RE: Searching through a list of dictionaries with a condition. - ichabod801 - Oct-02-2018

Sure:

    for each_dict in a_list:
        print(each_dict[key])

Most people would do this as a list comprehension:

    [each_dict[key] for each_dict in a_list]

RE: Searching through a list of dictionaries with a condition. - Mr_Keystrokes - Oct-04-2018

Hmm, let me see..

RE: Searching through a list of dictionaries with a condition. - Mr_Keystrokes - Oct-04-2018

    a_list=[{"Sample" : "A-15", "Run" : "n47", "quality" : "good" },
            {"Sample" : "B-04", "Run" : "n45", "quality" : "good"},
            {"Sample" : "C-10", "Run" : "n48", "quality" : "bad"},
            {"Sample" : "Z-95", "Run" : "n47", "quality" : "good" },]

    sampleIDs=['A-15', 'B-04', 'C-10']

    for each_dict in a_list:
        if each_dict['Sample'] in sampleIDs:
            print(each_dict['Sample']+" "+each_dict['Run'])

So if I run this^ I expect to get:

A-15 n47
B-04 n45
C-10 n48

but instead I get:

C-10 n48

Is this because I'm overwriting the operation with each iteration?
If so, how can I avoid doing that?
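
Run exactly as posted, with the `print` inside the `if`, the `a_list` loop prints all three expected lines, so nothing in it overwrites earlier iterations. The single line of output probably comes from a difference between the posted snippet and the code actually run. A minimal sketch of the whole lookup follows, assuming Python 3 and comma-separated data. The sample rows are copied from the first post (minus the header's trailing comma), `io.StringIO` stands in for the two real files, and `find_runs` is a hypothetical helper name, not something from the thread:

```python
import csv
import io

# Hypothetical in-memory stand-ins for the ID file and the CSV file.
ids_text = "Y00988-11\nG01024-14\nZ01933-13\n"
csv_text = (
    "Run,Sample,Source,Rate\n"
    "DFT,G01024-14,A,High\n"
    "DFT,U04424-15,B,Low\n"
    "TFF,T64673-18,A,Low\n"
    "RRT,I01324-14,A,High\n"
    "RRT,J01624-14,A,High\n"
)


def find_runs(id_lines, csv_lines, delimiter=","):
    """Return 'Sample Run' strings for rows whose Sample is a wanted ID."""
    # Strip whitespace and skip blank lines; a set gives fast membership tests.
    wanted = {line.strip() for line in id_lines if line.strip()}
    # Pass delimiter="\t" instead if the real file is tab-separated.
    reader = csv.DictReader(csv_lines, delimiter=delimiter)
    # One pass over the rows: no inner loop over keys and no pop().
    return [row["Sample"] + " " + row["Run"]
            for row in reader if row["Sample"] in wanted]


matches = find_runs(io.StringIO(ids_text), io.StringIO(csv_text))
for match in matches:
    print(match)  # prints: G01024-14 DFT
```

With real files, the same function could be called on objects returned by `open(path, newline='')`, since `csv.DictReader` accepts any iterable of text lines.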