Python Forum
failing to print not matched lines from second file
#1
Greetings!
For some reason, I do not get iteration in Python, especially with "else".
Here is the problem: I have two files, File-1 and File-2. I must use the lines of File-1 to find lines in File-2.
I have no problem finding the matching lines, but I cannot print the lines from File-2 that do not match.

File-1 lines:
03/28/2021,P,6,LINE2
03/28/2021,P,9,LINE4

File-2 lines:
03/28/2021,P,16,LINE1
03/28/2021,P,6,LINE2
03/28/2021,P,9,LINE3
03/28/2021,P,9,LINE4
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8

I need to print out only lines that do not match:
03/28/2021,P,16,LINE1
03/28/2021,P,9,LINE3
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8

But the code I wrote prints this:
ELSE -->> 03/28/2021,P,16,LINE1
ELSE -->> 03/28/2021,P,9,LINE3
ELSE -->> 03/28/2021,P,9,LINE4
ELSE -->> 03/28/2021,P,8,LINE5
ELSE -->> 03/28/2021,S,95,LINE6
ELSE -->> 03/28/2021,S,1,LINE7
ELSE -->> 03/28/2021,P,46,LINE8
ELSE -->> 03/28/2021,P,16,LINE1
ELSE -->> 03/28/2021,P,6,LINE2
ELSE -->> 03/28/2021,P,9,LINE3
ELSE -->> 03/28/2021,P,8,LINE5
ELSE -->> 03/28/2021,S,95,LINE6
ELSE -->> 03/28/2021,S,1,LINE7
ELSE -->> 03/28/2021,P,46,LINE8

I have to split the lines in File-1 and File-2 because some additional processing of the lines is required.
Here is the code:
with open (file_2,'r') as l_few :
    f2=l_few.readlines() 
with open (file_1,'r') as f1:          
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        for lf2 in f2 :
            lf2=lf2.strip()
            if lf2 :
                if sp1[3] in lf2 : 
                    spL2=lf2.split(",")
                    #print (" File 2 Line matched --> "+lf2)
                    #break
                else :
                    print (" ELSE -->> "+lf2)
                    #break
Any help is appreciated. I have exhausted my resources trying to solve this for the last 3 days.
I tried moving the "else" all over the place and using "break", but I still cannot get a clean print of only the 'Not Matched' lines.
Thank you.
#2
First of all, according to the text, "l_few" (the file with fewer lines) is file_1, so I swapped the file names to change as little as possible in your original program.
Then I came up with this. Hope it helps. Note that I didn't change much, just enough to make it print what you need:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:          
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] in sp23:
            pass # here we have a match so print nothing
            # spL2=lf2.split(",")
            #print (" File 2 Line matched --> "+lf2)
            #break
        else :
            print (lf1) # print the non-matching line
            #break
Output:
03/28/2021,P,16,LINE1
03/28/2021,P,9,LINE3
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
So the error you made was to compare each line with one search term at a time and print on every mismatch. Since no line matches both terms, every line produced at least one mismatch.
Further simplification:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:          
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] not in sp23:
            print (lf1) # print the non-matching line
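Since the original question mentions trouble with "else" and "break", here is a minimal sketch of the same idea written with Python's for/else, where the else branch runs only when the inner loop finishes without hitting break. The function name unmatched_lines and the in-memory lists are assumptions for illustration (they stand in for the two files), and the comparison is an exact match on the fourth field rather than the substring test used in the original code.

```python
# In-memory stand-ins for the two files (sample data from post #1).
file_1_lines = ["03/28/2021,P,6,LINE2", "03/28/2021,P,9,LINE4"]
file_2_lines = [
    "03/28/2021,P,16,LINE1",
    "03/28/2021,P,6,LINE2",
    "03/28/2021,P,9,LINE3",
    "03/28/2021,P,9,LINE4",
    "03/28/2021,P,8,LINE5",
    "03/28/2021,S,95,LINE6",
    "03/28/2021,S,1,LINE7",
    "03/28/2021,P,46,LINE8",
]


def unmatched_lines(key_lines, data_lines):
    """Return the data lines whose 4th field matches no 4th field in key_lines."""
    keys = [line.strip().split(",")[3] for line in key_lines if line.strip()]
    result = []
    for line in data_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        field = line.split(",")[3]
        for key in keys:
            if field == key:
                break  # a match was found, so stop checking further keys
        else:
            # Runs only if the inner loop ended WITHOUT break,
            # i.e. the field matched none of the keys.
            result.append(line)
    return result


for line in unmatched_lines(file_1_lines, file_2_lines):
    print(line)
```

The key point is that the decision to print is made once per File-2 line, after all File-1 terms have been checked, not once per comparison.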
#3
As this is under 'General Coding Help' and not 'Homework', I will ignore the 'I must use' and focus on the task at hand: 'print lines that don't match'.

Python has a data structure for membership testing called set, and it has a difference method.

If order is not important and there are no duplicate rows in the second file, the following implementation delivers the desired result:

with open('first_file.csv', 'r') as f, open('second_file.csv', 'r') as s:
    result = set(s.readlines()).difference(f.readlines())
    print(*result)
This gives output something like this:

Output:
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
03/28/2021,P,9,LINE3
03/28/2021,S,95,LINE6
03/28/2021,P,8,LINE5
03/28/2021,P,16,LINE1
The newlines are not removed, but that is easy to do if needed, and the result is also easy to sort if required.
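A minimal sketch of stripping the newlines and sorting, assuming the lines have already been read into lists. The sample data from the first post stands in for the two files, and the helper name sorted_difference is made up for this example. The sort is a plain string sort on the fourth field, which is only a guess at the order you might want.

```python
# Stand-ins for f.readlines() and s.readlines(); note the trailing newlines.
first_lines = ["03/28/2021,P,6,LINE2\n", "03/28/2021,P,9,LINE4\n"]
second_lines = [
    "03/28/2021,P,16,LINE1\n",
    "03/28/2021,P,6,LINE2\n",
    "03/28/2021,P,9,LINE3\n",
    "03/28/2021,P,9,LINE4\n",
    "03/28/2021,P,8,LINE5\n",
    "03/28/2021,S,95,LINE6\n",
    "03/28/2021,S,1,LINE7\n",
    "03/28/2021,P,46,LINE8\n",
]


def sorted_difference(first, second):
    """Lines of `second` that are not in `first`, newline-stripped and sorted."""
    first_set = {line.strip() for line in first}
    second_set = {line.strip() for line in second if line.strip()}
    # Sort on the 4th comma-separated field (string order, so LINE10 < LINE2).
    return sorted(second_set - first_set, key=lambda line: line.split(",")[3])


print(*sorted_difference(first_lines, second_lines), sep="\n")
```

Stripping both sides before the comparison also avoids a false mismatch when the last line of one file has no trailing newline.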
#4
(Mar-29-2021, 12:11 PM)perfringo Wrote:
with open('first_file.csv', 'r') as f, open('second_file.csv', 'r') as s:
    result = set(s.readlines()).difference(f.readlines())
    print(*result)
I cannot compare the files directly. I need to process the lines first, and after that the lines in the two files will be different.
#5
You can still use a set. Checking membership in a set is more efficient than searching through a list of strings. First build the set of strings you are looking for, then read in the second file and check each line against the set.
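from pathlib import Path

DIR = Path('.')  # assumption: the folder that holds both files; adjust as needed

with open(DIR/'lines_to_look_for.txt', 'r') as file: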
    lines = set([line.strip().split(',')[3] for line in file])

with open(DIR/'check_for_lines.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] in lines:
            print(f'{line} MATCH')
        else:
            print(f'{line} NO MATCH')
#6
(Mar-29-2021, 07:31 PM)deanhystad Wrote:
with open(DIR/'lines_to_look_for.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])

with open(DIR/'check_for_lines.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] in lines:
            print(f'{line} MATCH')
        else:
            print(f'{line} NO MATCH')
I appreciate your help, but I could not make your code work.
It produces the wrong output:

03/28/2021,P,16,LINE1 NO MATCH
03/28/2021,P,6,LINE2 MATCH
03/28/2021,P,9,LINE3 NO MATCH
03/28/2021,P,9,LINE4 MATCH
03/28/2021,P,8,LINE5 NO MATCH
03/28/2021,S,95,LINE6 NO MATCH
03/28/2021,S,1,LINE7 NO MATCH
03/28/2021,P,46,LINE8 NO MATCH

Or, if I switch File-1 and File-2:

03/28/2021,P,6,LINE2 MATCH
03/28/2021,P,9,LINE4 MATCH

I need:
03/28/2021,P,16,LINE1
03/28/2021,P,9,LINE3
03/28/2021,P,8,LINE5
03/28/2021,S,95,LINE6
03/28/2021,S,1,LINE7
03/28/2021,P,46,LINE8
#7
(Mar-29-2021, 11:27 AM)Serafim Wrote:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] not in sp23:
            print (lf1) # print the non-matching line

I really appreciate your help but the code does not print out anything...
#8
(Mar-29-2021, 11:27 AM)Serafim Wrote:
with open ('file_2','r') as l_few :
    f2=l_few.readlines()
sp23 = [s.split(',')[3].strip() for s in f2] # collect all the terms you need to compare
with open ('file_1','r') as f1:
    for lf1 in f1:
        lf1=lf1.strip()
        sp1 = lf1.split(",")
        if sp1[3] in sp23:
            pass # here we have a match so print nothing
            # spL2=lf2.split(",")
            #print (" File 2 Line matched --> "+lf2)
            #break
        else :
            print (lf1) # print the non-matching line
            #break

My bad! The code is actually working!
Thank you very much for your help!

I made some minor changes to your code, just to make it easier to read (for me).

with open (file_1,'r') as l_few :
    f1=l_few.readlines()
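# collect the 4th field of every File-1 line (a plain loop would keep only the last line)
sp23 = [s.strip().split(',')[3] for s in f1 if s.strip()]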

with open (file_2,'r') as f2:          
    for lf2 in f2:
        lf2=lf2.strip()
        sp2 = lf2.split(",")
        if sp2[3] in sp23:
            pass # here we have a match so print nothing
        else :
            print (lf2) # print the non-matching line
            #break
#9
Using the example above

with open('f1.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])

with open('f2.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] not in lines:
            print(f'{line} NO MATCH')
Output:
03/28/2021,P,16,LINE1 NO MATCH
03/28/2021,P,9,LINE3 NO MATCH
03/28/2021,P,8,LINE5 NO MATCH
03/28/2021,S,95,LINE6 NO MATCH
03/28/2021,S,1,LINE7 NO MATCH
03/28/2021,P,46,LINE8 NO MATCH
#10
(Mar-30-2021, 12:47 AM)menator01 Wrote:
with open('f1.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])

with open('f2.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] not in lines:
            print(f'{line} NO MATCH')
Thank you! It works. I am just confused about the square brackets in this line:
set([line.strip().split(',')[3] for line in file]).
Thank you again!
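A note on the square brackets in that line: they form a list comprehension, which builds a list first, and set() then turns that list into a set. The short sketch below uses made-up variable names and in-memory sample lines (assumptions, not part of the code above) to show that the brackets are optional and that a set comprehension with curly braces gives the same result.

```python
# Stand-in for the lines read from f1.txt (sample data from post #1).
sample_lines = ["03/28/2021,P,6,LINE2\n", "03/28/2021,P,9,LINE4\n"]

# 1. The square brackets alone build a list.
as_list = [line.strip().split(",")[3] for line in sample_lines]

# 2. set([...]) converts that list into a set (the form used in the thread).
from_list = set([line.strip().split(",")[3] for line in sample_lines])

# 3. Without the brackets, set() consumes a generator expression directly.
from_generator = set(line.strip().split(",")[3] for line in sample_lines)

# 4. Curly braces give a set comprehension, with no set() call at all.
from_set_comprehension = {line.strip().split(",")[3] for line in sample_lines}

print(as_list)
print(from_list == from_generator == from_set_comprehension)
```

All three set versions hold the same elements; the forms without the square brackets simply skip the intermediate list.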