Posts: 413
Threads: 110
Joined: Apr 2020
Greetings!
For some reason, I do not get iteration in Python, especially with "else".
Here is the problem: I have two files, File-1 and File-2. I must use the lines of File-1 to find lines in File-2.
I have no problem finding the matching lines, but I cannot print the lines from File-2 that do not match.
File-1 lines:
03/28/2021,P,6,LINE2
03/28/2021,P,9,LINE4
File-2 lines:
03/28/2021,P,16,LINE1
03/28/2021,P,6,LINE2
03/28/2021,P,9,LINE3
03/28/2021,P,9,LINE4
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
I need to print out only lines that do not match:
03/28/2021,P,16,LINE1
03/28/2021,P,9,LINE3
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
But the code I wrote prints this:
ELSE -->> 03/28/2021,P,16,LINE1
ELSE -->> 03/28/2021,P,9,LINE3
ELSE -->> 03/28/2021,P,9,LINE4
ELSE -->> 03/28/2021,P,8,LINE5
ELSE -->> 03/28/2021,S,95,LINE6
ELSE -->> 03/28/2021,S,1,LINE7
ELSE -->> 03/28/2021,P,46,LINE8
ELSE -->> 03/28/2021,P,16,LINE1
ELSE -->> 03/28/2021,P,6,LINE2
ELSE -->> 03/28/2021,P,9,LINE3
ELSE -->> 03/28/2021,P,8,LINE5
ELSE -->> 03/28/2021,S,95,LINE6
ELSE -->> 03/28/2021,S,1,LINE7
ELSE -->> 03/28/2021,P,46,LINE8
I have to split the lines in File-1 and File-2 because some additional processing of the lines is required.
Here is the code:
with open (file_2,'r') as l_few :
    f2=l_few.readlines()
with open (file_1,'r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        for lf2 in f2 :
            lf2=lf2.strip()
            if lf2 :
                if sp1[3] in lf2 :
                    spL2=lf2.split(",")
                    #print (" File 2 Line matched --> "+lf2)
                    #break
                else :
                    print (" ELSE -->> "+lf2)
                    #break
Any help appreciated, I exhausted my resources, trying to solve it for the last 3 days.
I tried to move "else" all over the place and used "break" but still cannot make a clean print of only the 'Not Matched' lines.
Thank you.
Posts: 101
Threads: 0
Joined: Jan 2021
Mar-29-2021, 11:27 AM
(This post was last modified: Mar-29-2021, 11:27 AM by Serafim.)
First of all, the "l_few" (file with fewer lines) is file_1 according to the text, so I swapped the file names to change as little as possible in your original program.
Then I came up with this; I hope it helps. Note that I didn't change much, just enough to make it print what you need:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] in sp23:
            pass # here we have a match so print nothing
            # spL2=lf2.split(",")
            #print (" File 2 Line matched --> "+lf2)
            #break
        else :
            print (lf1) # print the non-matching line
            #break
Output:
03/28/2021,P,16,LINE1
03/28/2021,P,9,LINE3
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
So the error you made was to compare each line first with one term and then with the other, so every line produced at least one mismatch.
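The point about the mismatch can be sketched with the nested loops kept in place. This is only an illustration under some assumptions: the sample lines from the first post are held in lists instead of files, and the function name non_matching is made up for the example. The else here belongs to the inner for loop, not to the if, so it runs only when no key matched at all.

```python
# Sample data from the first post, kept in lists instead of files (assumption).
file_1_lines = [
    "03/28/2021,P,6,LINE2",
    "03/28/2021,P,9,LINE4",
]
file_2_lines = [
    "03/28/2021,P,16,LINE1",
    "03/28/2021,P,6,LINE2",
    "03/28/2021,P,9,LINE3",
    "03/28/2021,P,9,LINE4",
    "03/28/2021,P,8,LINE5",
    "03/28/2021,S,95,LINE6",
    "03/28/2021,S,1,LINE7",
    "03/28/2021,P,46,LINE8",
]


def non_matching(few, many):
    """Return the lines of `many` whose 4th field matches no line of `few`."""
    keys = [line.strip().split(",")[3] for line in few if line.strip()]
    result = []
    for line in many:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        field = line.split(",")[3]
        for key in keys:
            if key == field:
                break  # a match was found, stop checking this line
        else:
            # Runs only when the inner loop ended without hitting break,
            # i.e. after the line has been compared with every key.
            result.append(line)
    return result


for line in non_matching(file_1_lines, file_2_lines):
    print(line)
```

The outer loop here goes over the File-2 lines and the inner loop over the File-1 keys, which is the reverse of the original nesting; that way the decision to print is made once per File-2 line.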
Further simplification:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] not in sp23:
            print (lf1) # print the non-matching line
Posts: 1,940
Threads: 8
Joined: Jun 2018
As this is under 'General Coding Help' and not 'Homework', I don't pay attention to 'I must use' and focus on the task at hand: 'print lines that don't match'.
In Python there is a data structure for membership testing called set, and it has a difference method.
If order is not important and there are no duplicate rows in the second file, the following implementation delivers the desired result:
with open('first_file.csv', 'r') as f, open('second_file.csv', 'r') as s:
    result = set(s.readlines()).difference(f.readlines())
    print(*result)
Which will give output something like this:
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
03/28/2021,P,9,LINE3
03/28/2021,S,95,LINE6
03/28/2021,P,8,LINE5
03/28/2021,P,16,LINE1
No newlines are removed, but that can easily be done if needed; it's also easy to sort if required.
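A minimal sketch of the newline removal and sorting mentioned above. It assumes in-memory io.StringIO objects in place of the two CSV files so that it runs anywhere, and it sorts on the fourth field, which is just one possible choice of sort key.

```python
import io

# In-memory stand-ins for the two files (assumption for illustration).
first_file = io.StringIO(
    "03/28/2021,P,6,LINE2\n"
    "03/28/2021,P,9,LINE4\n"
)
second_file = io.StringIO(
    "03/28/2021,P,16,LINE1\n"
    "03/28/2021,P,6,LINE2\n"
    "03/28/2021,P,9,LINE3\n"
    "03/28/2021,P,9,LINE4\n"
    "03/28/2021,P,8,LINE5\n"
    "03/28/2021,S,95,LINE6\n"
    "03/28/2021,S,1,LINE7\n"
    "03/28/2021,P,46,LINE8"  # last line deliberately has no trailing newline
)

# Strip every line so that a missing final newline cannot cause a false mismatch.
first = {line.strip() for line in first_file}
second = {line.strip() for line in second_file}

# Sets are unordered, so sort the difference to get a stable order back.
result = sorted(second.difference(first), key=lambda line: line.split(",")[3])
print(*result, sep="\n")
```

Sorting on the fourth field as a plain string works for this sample because the labels have a single digit; labels such as LINE10 would need a numeric key instead.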
Posts: 413
Threads: 110
Joined: Apr 2020
(Mar-29-2021, 12:11 PM)perfringo Wrote:
with open('first_file.csv', 'r') as f, open('second_file.csv', 'r') as s:
    result = set(s.readlines()).difference(f.readlines())
    print(*result)
I cannot compare the files directly. I need to process the lines first, and after that the lines in the two files will be different.
Posts: 6,552
Threads: 19
Joined: Feb 2020
You can still use a set. Membership tests against a set are more efficient than scanning a list of strings. First build the set of strings you are looking for, then read in the second file and check each line against the set.
from pathlib import Path

DIR = Path('.')  # assumption: the folder that holds both files

with open(DIR/'lines_to_look_for.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])
with open(DIR/'check_for_lines.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] in lines:
            print(f'{line} MATCH')
        else:
            print(f'{line} NO MATCH')
Posts: 413
Threads: 110
Joined: Apr 2020
(Mar-29-2021, 07:31 PM)deanhystad Wrote:
with open(DIR/'lines_to_look_for.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])
with open(DIR/'check_for_lines.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] in lines:
            print(f'{line} MATCH')
        else:
            print(f'{line} NO MATCH')
I appreciate your help, but I could not make your code work.
It produces the wrong output:
03/28/2021,P,16,LINE1 NO MATCH
03/28/2021,P,6,LINE2 MATCH
03/28/2021,P,9,LINE3 NO MATCH
03/28/2021,P,9,LINE4 MATCH
03/28/2021,P,8,LINE5 NO MATCH
03/28/2021,S,95,LINE6 NO MATCH
03/28/2021,S,1,LINE7 NO MATCH
03/28/2021,P,46,LINE8 NO MATCH
Or, if I switch File-1 and File-2:
03/28/2021,P,6,LINE2 MATCH
03/28/2021,P,9,LINE4 MATCH
I need:
03/28/2021,P,16,LINE1
03/28/2021,P,9,LINE3
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
Posts: 413
Threads: 110
Joined: Apr 2020
(Mar-29-2021, 11:27 AM)Serafim Wrote:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] not in sp23:
            print (lf1) # print the non-matching line
I really appreciate your help but the code does not print out anything...
Posts: 413
Threads: 110
Joined: Apr 2020
(Mar-29-2021, 11:27 AM)Serafim Wrote:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] in sp23:
            pass # here we have a match so print nothing
            # spL2=lf2.split(",")
            #print (" File 2 Line matched --> "+lf2)
            #break
        else :
            print (lf1) # print the non-matching line
            #break
My bad! The code is actually working!
Thank you very much for your help!
I made some insignificant changes to your code, just to make it easier to read (for me).
with open (file_1,'r') as l_few :
    f1=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f1] # fourth field of every File-1 line
with open (file_2,'r') as f2:
    for lf2 in f2:
        lf2=lf2.strip()
        sp2 = lf2.split(",")
        if sp2[3] in sp23:
            pass # here we have a match so print nothing
        else :
            print (lf2) # print the non-matching line
            #break
Posts: 1,058
Threads: 111
Joined: Sep 2019
Using the example above
with open('f1.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])
with open('f2.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] not in lines:
            print(f'{line} NO MATCH')
Output:
03/28/2021,P,16,LINE1 NO MATCH
03/28/2021,P,9,LINE3 NO MATCH
03/28/2021,P,8,LINE5 NO MATCH
03/28/2021,S,95,LINE6 NO MATCH
03/28/2021,S,1,LINE7 NO MATCH
03/28/2021,P,46,LINE8 NO MATCH
Posts: 413
Threads: 110
Joined: Apr 2020
(Mar-30-2021, 12:47 AM)menator01 Wrote:
with open('f1.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])
with open('f2.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] not in lines:
            print(f'{line} NO MATCH')
Thank you! It works. I am just confused about the square brackets in this line:
set([line.strip().split(',')[3] for line in file]).
Thank you again!
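On the square brackets: they form a list comprehension, which builds a list of the fourth fields, and set(...) then converts that list into a set. The sketch below uses a made-up two-row list named rows in place of the open file, and shows that the brackets can be dropped or replaced with curly braces without changing the resulting set.

```python
# Two sample rows standing in for the lines of the file (assumption).
rows = ["03/28/2021,P,6,LINE2\n", "03/28/2021,P,9,LINE4\n"]

# The square brackets alone: a list comprehension that yields a list.
as_list = [row.strip().split(",")[3] for row in rows]

# The form used in the thread: build the list, then convert it to a set.
from_list = set([row.strip().split(",")[3] for row in rows])

# Without the brackets: a generator expression, no intermediate list.
from_generator = set(row.strip().split(",")[3] for row in rows)

# With curly braces: a set comprehension, built directly as a set.
set_comprehension = {row.strip().split(",")[3] for row in rows}

print(as_list)  # ['LINE2', 'LINE4']
print(from_list == from_generator == set_comprehension)  # True
```

All three set forms give the same result here; the set comprehension is simply the shortest way to write it.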