Python Forum
failing to print not matched lines from second file
Why were you confused by square brackets? You use them in your original post.
The outer square brackets are a list comprehension, and the [3] is indexing: it takes the fourth element of the list that split(',') returns.
Someone correct me if I am wrong please.
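A small sketch of what the two kinds of brackets do. The sample lines are made up for illustration, the real files in this thread may look different:

```python
# Hypothetical sample line (assumption, not taken from the original files).
line = "2021-05-01,host1,ok,ABC123\n"

fields = line.strip().split(",")   # split returns a list of four strings
fourth = fields[3]                 # [3] indexes that list, counting from 0

# The outer square brackets build a new list, one item per input line.
lines = [line, "2021-05-02,host2,ok,XYZ789\n"]
fourth_fields = [l.strip().split(",")[3] for l in lines]
print(fourth, fourth_fields)
```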

Hm, I do not understand why you split the line, if you compare whole lines.

Things to pay attention to:
  • What is compared? Whole lines or only some columns in the lines?
  • What should happen with empty lines?
  • What should happen if one line has leading white space and the reference does not?
  • What should happen if one line has trailing white space and the reference does not?
  • Do the references have empty lines and leading white space?
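Why these questions matter can be seen in a few lines. This is only a sketch with made-up strings, assuming the lines differ in nothing but white space:

```python
# Hypothetical lines, as they could come out of two different files.
reference = "alpha,1\n"
candidate = "  alpha,1  \n"

raw_equal = reference == candidate                       # compares white space too
stripped_equal = reference.strip() == candidate.strip()  # compares the content only

# A line with only white space becomes "" after strip(), and "" is falsy.
blank = "   \n"
blank_is_empty = not blank.strip()

print(raw_equal, stripped_equal, blank_is_empty)
```

So whether two lines "match" depends on whether you strip them first, and empty lines need their own rule.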

If you just want to compare whole, non-empty lines with the white space stripped:
from io import StringIO

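# paste the lines of the reference file between the quotes
test1 = """
"""

# paste the lines of the file to check between the quotes
test2 = """
"""

file1 = StringIO(test1)
file2 = StringIO(test2)
# using StringIO to simulate an open file

# file1 and file2 can also come from open()
# TextIOWrapper, StringIO, BytesIO, ... support
# line iteration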

def get_references(text):
    references = set()
    # we want to look up fast
    # preserving the order is not required for
    # the references
    # a set contains only unique elements

    # this removes leading and trailing white spaces
    for line in map(str.strip, text):
        if not line:
            # skip empty lines
            # because of str.strip
            # the line does not contain white spaces
            # bool(empty_string) -> False
            continue
        # set has no append.
        # instead you add objects to the set
        references.add(line)
    return references

def show_not_matching(text, references):
    line_iter = map(str.strip, text)
    # to get line numbers, enumerate is used
    # it just iterates over the iterable and
    # yields (number, element_of_iterable)
    lines = enumerate(line_iter, start=1)
    for line_number, line in lines:
        if not line:
            continue
        if line not in references:
            # string formatting
            print(f"[{line_number:>5}] Not matching -> {line}")

if __name__ == "__main__":
    # with test data in source code
    ref = get_references(file1)
    show_not_matching(file2, ref)

    # later with real files
    # with open("file1.txt") as fd_ref:
    #     refs = get_references(fd_ref)
    # with open("file2.txt") as fd:
    #     show_not_matching(fd, refs)
And if you do not want to compare the date:
from io import StringIO

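# paste the lines of the reference file between the quotes
test1 = """
"""

# paste the lines of the file to check between the quotes
test2 = """
"""

file1 = StringIO(test1)
file2 = StringIO(test2)
# using StringIO to simulate an open file

# file1 and file2 can also come from open()
# TextIOWrapper, StringIO, BytesIO, ... support
# line iteration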

def get_references(text):
    references = set()
    # we want to look up fast
    # preserving the order is not required for
    # the references
    # a set contains only unique elements

    # this removes leading and trailing white spaces
    for line in map(str.strip, text):
        if not line:
            # skip empty lines
            # because of str.strip
            # the line does not contain white spaces
            # bool(empty_string) -> False
            continue

        # just removing the date from line
        # _ is a throw away name
        _, line = line.split(",", maxsplit=1)
        # set has no append.
        # instead you add objects to the set
        references.add(line)
    print("References:", references)
    return references

def show_not_matching(text, references):
    line_iter = map(str.strip, text)
    # to get line numbers, enumerate is used
    # it just iterates over the iterable and
    # yields (number, element_of_iterable)
    lines = enumerate(line_iter, start=1)
    for line_number, line in lines:
        if not line:
            continue
        # here the same
        # we want to remove the date from the
        # line we want to compare with the references
        # where the date was also removed
        # but we keep the original line, for
        # printing it
        _, line_to_compare = line.split(",", maxsplit=1)
        # now use the modified line to look it up in references
        if line_to_compare not in references:
            # string formatting
            print(f"[{line_number:>5}] Not matching -> {line}")

if __name__ == "__main__":
    # with test data in source code
    ref = get_references(file1)
    show_not_matching(file2, ref)

    # later with real files
    # with open("file1.txt") as fd_ref:
    #     refs = get_references(fd_ref)
    # with open("file2.txt") as fd:
    #     show_not_matching(fd, refs)
This time without comments, but with real files:
def get_references(text):
    references = set()
    for line in map(str.strip, text):
        if not line:
            continue
        _, line = line.split(",", maxsplit=1)
        references.add(line)
    return references

def show_not_matching(text, references):
    line_iter = map(str.strip, text)
    lines = enumerate(line_iter, start=1)
    for line_number, line in lines:
        if not line:
            continue
        _, line_to_compare = line.split(",", maxsplit=1)
        if line_to_compare not in references:
            print(f"[{line_number:>5}] Not matching -> {line}")

if __name__ == "__main__":
    with open("file1.txt") as fd_ref:
        refs = get_references(fd_ref)

    with open("file2.txt") as fd:
        show_not_matching(fd, refs)
Read the Python documentation, if you see functions you don't know.
enumerate, map, str.split, set, in operator.
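If you want to try these pieces in isolation first, here is a short sketch that uses all of them together. The input lines are made up, assuming a date in the first column:

```python
# Hypothetical input lines (assumption): date first, then the rest.
raw = ["2021-05-01,a,b\n", "\n", "2021-05-02,c,d\n"]

seen = set()
numbered = []
# map applies str.strip to every line, enumerate adds a counter starting at 1
for number, line in enumerate(map(str.strip, raw), start=1):
    if not line:
        continue  # the empty line is skipped, but it still counts as line 2
    # maxsplit=1 splits only at the first comma
    date, rest = line.split(",", maxsplit=1)
    seen.add(rest)                    # a set uses add, not append
    numbered.append((number, rest))

found = "a,b" in seen                 # the in operator does the lookup
print(numbered, found)
```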

Also important for later use: str.strip(both_sides), str.lstrip(left_side), str.rstrip(right_side).
To remove only trailing white spaces, use str.rstrip.
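A quick check of the three methods on a made-up string, using repr so the remaining white space stays visible:

```python
text = "  \tvalue  \n"

both = text.strip()     # white space removed on both sides
left = text.lstrip()    # only the leading white space removed
right = text.rstrip()   # only the trailing white space removed

print(repr(both), repr(left), repr(right))
```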
By the way, you can use the csv Module.
This is also in the standard library.
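A rough sketch of how that could look, not a drop-in solution. It assumes comma-separated lines with the date in the first column; the data and the helper name rows_without_date are made up, and StringIO stands in for files opened with open(..., newline=""):

```python
import csv
from io import StringIO

# Hypothetical data (assumption), simulating two open files.
ref_file = StringIO("2021-05-01,a,b\n2021-05-02,c,d\n")
check_file = StringIO("2021-06-01,a,b\n\n2021-06-02,x,y\n")

def rows_without_date(fd):
    """Yield each non-empty row as a tuple, with the first column dropped."""
    for row in csv.reader(fd):
        if row:  # csv.reader yields an empty list for a blank line
            yield tuple(row[1:])  # tuples are hashable, so they fit in a set

references = set(rows_without_date(ref_file))
not_matching = [row for row in rows_without_date(check_file)
                if row not in references]
print(not_matching)
```

The advantage over str.split is that csv.reader also handles quoted fields that contain a comma.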
Build a set of strings then read the second file - this will fix the problem that you are facing.

from pathlib import Path

# assumption: DIR is the folder that holds both files
DIR = Path(".")

with open(DIR/'lines_to_look_for.txt', 'r') as file:
    lines = set([line.strip().split(',')[3] for line in file])
with open(DIR/'check_for_lines.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if line.split(',')[3] in lines:
            print(f'{line} MATCH')
        else:
            print(f'{line} NO MATCH')
The square brackets are a list comprehension, as menator01 said, and the [3] is indexing: it picks the fourth element of the list returned by split(',').