Looping through dictionary and comparing values with elements of a separate list.
Python Forum (https://python-forum.io), General Coding Help, thread-11096

Looping through dictionary and comparing values with elements of a separate list. - Mr_Keystrokes - Jun-22-2018

Apologies for the long question. Basically I am trying to loop through a dictionary I've constructed and check whether a specific element of the hash is in a given list.

Test script:

    Hash_Isolates={
        "1" : ['L02476-16_P_R1', 'AE006468', '873'],
        "2" : ['AE006468', 'AE006468', '40'],
        "3" : ['AE006468', 'L02476-16_P_R1', '756'],
        "4" : ['L00409-17_R1', 'L02476-16_P_R1', '987'],
        "5" : ['L00817-17_R1', 'AE006468', '65']
    }

    new_isolateList=['AE006468', 'L00817-17_R1']

    my_Isolates=[]
    for i in Hash_Isolates:
        if Hash_Isolates[i][0] in new_isolateList and Hash_Isolates[i][1] in new_isolateList:
            my_Isolates.append(Hash_Isolates[i])
    print(len(my_Isolates))

Strangely, when I run this test script it works, but when I run the proper script it doesn't. For the test script you get 2 printed out.

    #!/usr/bin/env python2.7

    import getpass
    import sys
    import re

    isolateFile=sys.argv[1]
    snapper_data=sys.argv[2]

    ## Get the user ID ##
    def get_User():
        currentUser = getpass.getuser()
        return currentUser

    isolatePath='/home/'+get_User()+'/path/to/file/'+isolateFile
    dataPath='/home/'+get_User()+'/path/to/file/'+snapper_data

    # Retrieve isolates from file
    isolateList=[]
    with open(isolatePath, 'r') as file:
        isolateList=file.readlines()

    new_isolateList=[]
    for i in isolateList:
        try:
            x=re.search('(\w.....-?.?.?\d?)', str(i)).group(1)
        except:
            pass
        new_isolateList.append(x)

    all_results=[]
    with open(dataPath, 'r') as file:
        all_results=file.readlines()

    # w is the position in the list of the samples being compared from the whole file
    # x is first sample in comparison
    # y is the second sample in comparison
    # z is the SNP distance between the first and second samples
    Hash_Isolates={}
    for i in all_results:
        w=re.search('(.?.?.?.?.?.?.?),.+,.+,\d+\n', str(i)).group(1)
        x=re.search('.?.?.?.?.?.?.?,(.+),.+,\d+\n', str(i)).group(1)
        y=re.search('.?.?.?.?.?.?.?,.+,(.+),\d+\n', str(i)).group(1)
        z=re.search('.?.?.?.?.?.?.?,.+,.+,(\d+)\n', str(i)).group(1)
        Hash_Isolates[w]=[x, y, z]

    my_Isolates=[]
    for i in Hash_Isolates:
        if Hash_Isolates[i][0] in new_isolateList and Hash_Isolates[i][1] in new_isolateList:
            my_Isolates.append(Hash_Isolates[i])
    print(len(my_Isolates))

So I expect this to work in the same way, but it prints 0. This snapper_data file has 100k + lines. The data looks like this:

The isolateFile is a text file:

    L01121-17_R1
    AE006468
    L00817-17_R1
    L00665-17_R1

The snapper_data file is csv file:

    1,L02476-16_P_R1,AE006468,873
    2,L02476-16_P_R1,L02888-16_P_R1,2
    3,L02476-16_P_R1,L00541-14_P_R1,914
    4,L02476-16_P_R1,L02471-16_P_R1,842
    5,AE006468,L02888-16_P_R1,832

I'm really desperate to get command of using dictionaries but this is bugging me.

RE: Looping through dictionary and comparing values with elements of a separate list. - buran - Jun-22-2018

well, it looks like you overcomplicate things.

    import csv

    with open('isolate_file.txt') as f:
        new_isolate = {line.strip() for line in f}

    with open('snapper_data.csv') as sd:
        rdr = csv.reader(sd)
        my_isolates = [line[1:] for line in rdr if len(set(line[1:-1]) & new_isolate)==2]
    print(my_isolates)

snapper_data.csv
isolate_file.txt
output:
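
A possible explanation for the 0, offered as a guess since nobody in the thread pins it down: the pattern used to build new_isolateList appears to stop before the `_R1` suffix, so most of the cleaned IDs can never equal the full IDs stored from the CSV columns. Below is a minimal sketch that applies the pattern from the original script to the four sample isolate lines. The names `ISOLATE_PATTERN`, `isolate_lines`, `regex_ids` and `stripped_ids` are made up for illustration, and the real isolate file may contain lines that behave differently.

```python
import re

# Pattern copied from the isolate-list loop in the original script.
ISOLATE_PATTERN = r'(\w.....-?.?.?\d?)'

# The sample isolate file, as file.readlines() would return it (assumed content).
isolate_lines = ['L01121-17_R1\n', 'AE006468\n', 'L00817-17_R1\n', 'L00665-17_R1\n']

# What the original loop would store for each line.
regex_ids = [re.search(ISOLATE_PATTERN, line).group(1) for line in isolate_lines]

# What a plain strip() stores instead (the approach in buran's reply).
stripped_ids = [line.strip() for line in isolate_lines]

print(regex_ids)     # ['L01121-17', 'AE006468', 'L00817-17', 'L00665-17']
print(stripped_ids)  # ['L01121-17_R1', 'AE006468', 'L00817-17_R1', 'L00665-17_R1']

# The CSV columns hold the full ID, so the membership test fails for it.
print('L00817-17_R1' in regex_ids)     # False
print('L00817-17_R1' in stripped_ids)  # True
```

If the real file behaves like the sample, only `AE006468` survives the regex intact, which would explain why almost no pair passes the `in new_isolateList` test. Using `line.strip()` only removes the surrounding whitespace and newline, so the IDs stay comparable to the CSV fields.
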

RE: Looping through dictionary and comparing values with elements of a separate list. - Mr_Keystrokes - Jun-22-2018

You see this is why I like Python. So many simpler ways of doing things you just have to know them. Now I've got to decipher what you've written. Thanks.

    len(set(line[1:-1]) & new_isolate)==2]

Hmm, this doesn't take into account the instances where the 2 elements being compared are the same.

RE: Looping through dictionary and comparing values with elements of a separate list. - buran - Jun-22-2018

(Jun-22-2018, 01:12 PM)Mr_Keystrokes Wrote: Hmm, this doesn't take into account the instances where the 2 elements being compared are the same.

let's take the check in a function

    import csv

    def check_line(line, isolate):
        my_set = set(line)
        return len(my_set & isolate) == len(my_set)

    with open('isolate_file.txt') as f:
        new_isolate = {line.strip() for line in f}

    with open('snapper_data.csv') as sd:
        rdr = csv.reader(sd)
        my_isolates = [line[1:] for line in rdr if check_line(line[1:-1], new_isolate)]
    print(my_isolates)

you can do it also like this

    import csv

    with open('isolate_file.txt') as f:
        new_isolate = {line.strip() for line in f}

    with open('snapper_data.csv') as sd:
        rdr = csv.reader(sd)
        my_isolates = [line[1:] for line in rdr if line[1] in new_isolate and line[2] in new_isolate]
    print(my_isolates)

RE: Looping through dictionary and comparing values with elements of a separate list. - Mr_Keystrokes - Jun-22-2018

Thanks, I like the last solution best. Didn't know about csv reader so that will be useful in the future. And I would never have looked up set(). I have to say it's much simpler than Perl.

RE: Looping through dictionary and comparing values with elements of a separate list. - wavic - Jun-22-2018

(Jun-22-2018, 02:49 PM)Mr_Keystrokes Wrote: I have to say it's much simpler than Perl.

No joking? :D

“Perl – The only language that looks the same before and after RSA encryption.”
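
To see the three checks from this thread side by side, here is a small sketch that feeds the test data from the first post through `csv.reader` using `io.StringIO` instead of real files. The `SNAPPER_CSV` string, the `rows()` helper and the `by_*` names are assumptions made for illustration only. `set(...) <= new_isolate` is used as a shorter subset test that should behave the same as `check_line`.

```python
import csv
import io

# Rows mirror the Hash_Isolates test data from the first post (in-memory stand-in for the CSV file).
SNAPPER_CSV = """1,L02476-16_P_R1,AE006468,873
2,AE006468,AE006468,40
3,AE006468,L02476-16_P_R1,756
4,L00409-17_R1,L02476-16_P_R1,987
5,L00817-17_R1,AE006468,65
"""

new_isolate = {'AE006468', 'L00817-17_R1'}


def rows():
    """Return the parsed CSV rows as lists of strings (hypothetical helper)."""
    return list(csv.reader(io.StringIO(SNAPPER_CSV)))


# First reply: needs two *distinct* IDs in the intersection, so a row that
# compares an isolate with itself (row 2) is dropped.
by_count = [r[1:] for r in rows() if len(set(r[1:-1]) & new_isolate) == 2]

# Second reply, written as a subset test: every ID on the row must be wanted.
by_subset = [r[1:] for r in rows() if set(r[1:-1]) <= new_isolate]

# Last reply: two explicit membership tests.
by_membership = [r[1:] for r in rows() if r[1] in new_isolate and r[2] in new_isolate]

print(len(by_count))       # 1
print(len(by_subset))      # 2
print(len(by_membership))  # 2
```

On this data the last two checks agree with the 2 printed by the original test script, while the `== 2` check misses the `AE006468, AE006468` row, which is the case Mr_Keystrokes pointed out.
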