Python Forum
Looping through dictionary and comparing values with elements of a separate list.
#1
Apologies for the long question. Basically I am trying to loop through a dictionary I've constructed and check whether a specific element of the hash is in a given list.

Test script:
Hash_Isolates={
    "1" : ['L02476-16_P_R1', 'AE006468', '873'],
    "2" : ['AE006468', 'AE006468', '40'],
    "3" : ['AE006468', 'L02476-16_P_R1', '756'],
    "4" : ['L00409-17_R1', 'L02476-16_P_R1', '987'],
    "5" : ['L00817-17_R1', 'AE006468', '65']
}
 
new_isolateList=['AE006468', 'L00817-17_R1']
 
my_Isolates=[]
 
for i in Hash_Isolates:
    if Hash_Isolates[i][0] in new_isolateList and Hash_Isolates[i][1] in new_isolateList:
        my_Isolates.append(Hash_Isolates[i])
 
print(len(my_Isolates))
Strangely, the test script works (it prints 2), but the full script does not:

#!/usr/bin/env python2.7
 
import getpass
import sys
import re
 
 
isolateFile=sys.argv[1]
snapper_data=sys.argv[2]
 
## Get the user ID ##
def get_User():
    currentUser = getpass.getuser()
    return currentUser
 
 
isolatePath='/home/'+get_User()+'/path/to/file/'+isolateFile
dataPath='/home/'+get_User()+'/path/to/file/'+snapper_data
 
 
# Retrieve isolates from file
 
isolateList=[]
with open(isolatePath, 'r') as file:
    isolateList=file.readlines()
 
new_isolateList=[]
for i in isolateList:
    try:
        x=re.search('(\w.....-?.?.?\d?)', str(i)).group(1)
    except:
        pass
    new_isolateList.append(x)
 
all_results=[]
 
with open(dataPath, 'r') as file:
    all_results=file.readlines()
 
 
# w is the position in the list of the samples being compared from the whole file
# x is first sample in comparison
# y is the second sample in comparison
# z is the SNP distance between the first and second samples
Hash_Isolates={}
for i in all_results:
    w=re.search('(.?.?.?.?.?.?.?),.+,.+,\d+\n', str(i)).group(1)
    x=re.search('.?.?.?.?.?.?.?,(.+),.+,\d+\n', str(i)).group(1)
    y=re.search('.?.?.?.?.?.?.?,.+,(.+),\d+\n', str(i)).group(1)
    z=re.search('.?.?.?.?.?.?.?,.+,.+,(\d+)\n', str(i)).group(1)
    Hash_Isolates[w]=[x, y, z]
 
my_Isolates=[]
 
for i in Hash_Isolates:
    if Hash_Isolates[i][0] in new_isolateList and Hash_Isolates[i][1] in new_isolateList:
        my_Isolates.append(Hash_Isolates[i])
 
 
print(len(my_Isolates))
I expected this to work the same way, but it prints 0. The snapper_data file has 100k+ lines.
The data looks like this.
The isolateFile is a text file:

L01121-17_R1
AE006468
L00817-17_R1
L00665-17_R1

The snapper_data file is a CSV file:

1,L02476-16_P_R1,AE006468,873
2,L02476-16_P_R1,L02888-16_P_R1,2
3,L02476-16_P_R1,L00541-14_P_R1,914
4,L02476-16_P_R1,L02471-16_P_R1,842
5,AE006468,L02888-16_P_R1,832

I'm really desperate to get command of using dictionaries but this is bugging me.
#2
Well, it looks like you are overcomplicating things.

import csv
 
with open('isolate_file.txt') as f:
    new_isolate = {line.strip() for line in f}
 
with open('snapper_data.csv') as sd:
    rdr = csv.reader(sd)
    my_isolates = [line[1:] for line in rdr if len(set(line[1:-1]) & new_isolate)==2]
              
print(my_isolates)
snapper_data.csv:
1,L02476-16_P_R1,AE006468,873
2,L02476-16_P_R1,L02888-16_P_R1,2
3,L02476-16_P_R1,L00541-14_P_R1,914
4,L02476-16_P_R1,L02471-16_P_R1,842
5,AE006468,L02888-16_P_R1,832
6,L01121-17_R1,AE006468,100

isolate_file.txt:
L01121-17_R1
AE006468
L00817-17_R1
L00665-17_R1

Output:
[['L01121-17_R1', 'AE006468', '100']]
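
A possible explanation for the 0 printed by the original script, offered as a sketch rather than a confirmed diagnosis: the pattern (\w.....-?.?.?\d?) used to clean the isolate names can match at most ten characters, so an ID such as L01121-17_R1 is cut to L01121-17 and no longer equals the ID read from the CSV. The snippet below reuses that pattern on the sample IDs from the question; the variable names are only illustrative.

```python
import re

# Pattern copied from the original script (written here as a raw string)
pattern = r'(\w.....-?.?.?\d?)'

# Sample lines as readlines() would return them, trailing newline included
lines = ['L01121-17_R1\n', 'AE006468\n', 'L00817-17_R1\n', 'L00665-17_R1\n']

# What the original loop stores for each line
extracted = [re.search(pattern, line).group(1) for line in lines]

# What a plain strip() keeps
stripped = [line.strip() for line in lines]

print(extracted)  # the "_R1" suffix is lost on the longer IDs
print(stripped)   # the full IDs survive
```

If that is what happens in the full script, only rows where both columns are AE006468 could ever match, which would explain the empty result on the real data.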
#3
You see, this is why I like Python. There are so many simpler ways of doing things; you just have to know them. Now I've got to decipher what you've written. Thanks.

len(set(line[1:-1]) & new_isolate)==2]
Hmm, this doesn't take into account the instances where the 2 elements being compared are the same.
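
The point can be checked on a single row. This is a small sketch using row "2" from the test dictionary in the first post, where both IDs are the same: building a set collapses the repeated ID, so the intersection has one element and the == 2 test fails even though both columns are in the isolate set.

```python
new_isolate = {'L01121-17_R1', 'AE006468', 'L00817-17_R1', 'L00665-17_R1'}

# Row with the same ID in both compared columns
line = ['2', 'AE006468', 'AE006468', '40']

ids = line[1:-1]                  # the two ID columns
common = set(ids) & new_isolate   # the duplicate collapses to one element

print(len(common) == 2)                     # the set-size test rejects the row
print(all(i in new_isolate for i in ids))   # per-column membership accepts it
```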
#4
(Jun-22-2018, 01:12 PM)Mr_Keystrokes Wrote: Hmm, this doesn't take into account the instances where the 2 elements being compared are the same.

Let's move the check into a function:
import csv
 
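def check_line(line, isolate):
    # True only if every distinct ID in line is also in the isolate set.
    # Converting to a set first means a repeated ID is counted once.
    my_set = set(line)
    return len(my_set & isolate) == len(my_set)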
 
with open('isolate_file.txt') as f:
    new_isolate = {line.strip() for line in f}
 
with open('snapper_data.csv') as sd:
    rdr = csv.reader(sd)
    my_isolates = [line[1:] for line in rdr if check_line(line[1:-1], new_isolate)]
              
print(my_isolates)
You can also do it like this:
import csv
 
with open('isolate_file.txt') as f:
    new_isolate = {line.strip() for line in f}
 
with open('snapper_data.csv') as sd:
    rdr = csv.reader(sd)
    my_isolates = [line[1:] for line in rdr if line[1] in new_isolate and line[2] in new_isolate]
              
print(my_isolates)   
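
To try these checks without creating files, here is a sketch that feeds sample data through csv.reader from in-memory strings. io.StringIO stands in for the open files, and row 7, with two identical IDs, is made up for illustration. It uses set.issubset, which should give the same result as check_line.

```python
import csv
import io

# Sample data based on the thread; row 7 is an invented row with a repeated ID
isolate_text = "L01121-17_R1\nAE006468\nL00817-17_R1\nL00665-17_R1\n"
snapper_text = (
    "1,L02476-16_P_R1,AE006468,873\n"
    "5,AE006468,L02888-16_P_R1,832\n"
    "6,L01121-17_R1,AE006468,100\n"
    "7,AE006468,AE006468,0\n"
)

# Build the set of wanted IDs, skipping any blank lines
new_isolate = {line.strip() for line in io.StringIO(isolate_text) if line.strip()}

# Keep a row only when both ID columns are in the wanted set
rdr = csv.reader(io.StringIO(snapper_text))
my_isolates = [row[1:] for row in rdr if set(row[1:3]).issubset(new_isolate)]

print(my_isolates)
```

Rows 6 and 7 are kept; rows 1 and 5 are dropped because one of their IDs is not in the isolate set.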
#5
Thanks, I like the last solution best. Didn't know about csv reader so that will be useful in the future. And I would never have looked up set(). I have to say it's much simpler than Perl.
#6
(Jun-22-2018, 02:49 PM)Mr_Keystrokes Wrote: I have to say it's much simpler than Perl.
No joking? :D