Python Forum
Still do not get how Python iterates over a file
#1
Greetings!
I've been coding in Python for about a year now and have had some success, but I still do not understand how it iterates over a file.
For example, I want to print the "Matched" and "Not Matched" elements from two lists.
list_1 =['ZPM911','ZPM1912','ZPM1919','ZPM4555','ZPM4556','ZPM4557']

list_2 = ['ZPM911,UP,DOWN','ZPM1912,UP,UP','ZPM1919,DOWN,DOWN']

for el in list_1 :
    el=el.strip()
    
    for ehl in list_2 :   
        ehl=ehl.strip()
        sp0,*stuff = ehl.split(",")       
        if  el == sp0 :
            print(f" Matched ++++++ {el}")        
        else :
            print(f" Not Matched -- {el}")
I'd like to get output like this:
Matched ++++++ ZPM911
Matched ++++++ ZPM1912
Matched ++++++ ZPM1919

Not Matched -- ZPM4555
Not Matched -- ZPM4556
Not Matched -- ZPM4557

When I remove the "else" branch, I get only the "Matched" elements.
I tried many different things but I could never get the output I want.

Thank you.
#2
The title of your post is misleading, as there are no files in the code you've shown.

What have you done to debug the problem? If you want to understand what's going on, add extra calls to print to see what the values of the variables are on each iteration of the loops.
#3
I replaced the file with a list just to make it simpler, but I actually have a file and a list.
What do you mean by "add extra calls to print"? I already have "print" and it prints gibberish:

Matched ++++++ ZPM911
Not Matched -- ZPM911
Not Matched -- ZPM911
Not Matched -- ZPM1912
Matched ++++++ ZPM1912
Not Matched -- ZPM1912
Not Matched -- ZPM1919
Not Matched -- ZPM1919
Matched ++++++ ZPM1919
Not Matched -- ZPM4555
Not Matched -- ZPM4555
Not Matched -- ZPM4555
Not Matched -- ZPM4556
Not Matched -- ZPM4556
Not Matched -- ZPM4556
Not Matched -- ZPM4557
Not Matched -- ZPM4557

Thank you.
#4
Here you go.
I'm using a file and a list.
File:
ZPM911
ZPM1912
ZPM1919
ZPM4555
ZPM4556
ZPM4557
Snippet:
ff=open("c:/02/TTST.txt",'r')
list_1=ff.readlines()
 
list_2 = ['ZPM911,UP,DOWN','ZPM1912,UP,UP','ZPM1919,DOWN,DOWN']
 
for el in list_1 :
    el=el.strip()
     
    for ehl in list_2 :   
        ehl=ehl.strip()
        sp0,*stuff = ehl.split(",")       
        if  el == sp0 :
            print(f" Matched ++++++ {el}")        
        else :
            print(f" Not Matched -- {el}")
ff.close()  
#5
You have two calls to print, only on lines 13 and 15. You're saying you don't know why your program isn't working, so the suggestion is to work that out - print out the rest of your variables to see what they are. Your assumptions about what they are need to be tested for you to understand how to fix the program.
#6
In this problem, 'Matched' and 'Not Matched' have a different status: you want to print 'Matched' if AT LEAST ONE of the elements of list_2 corresponds to el, and you want to print 'Not Matched' if NONE OF THE elements of list_2 corresponds to el. We don't see this asymmetry in the code: it prints a verdict for every pair (el, ehl) instead of one verdict per el.
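
One way to express that asymmetry, as a minimal sketch rather than the only approach: any() answers the "at least one" question in a single test, so exactly one line is printed per element. The helper name classify is made up for this example; the lists are the ones from the first post.

```python
list_1 = ['ZPM911', 'ZPM1912', 'ZPM1919', 'ZPM4555', 'ZPM4556', 'ZPM4557']
list_2 = ['ZPM911,UP,DOWN', 'ZPM1912,UP,UP', 'ZPM1919,DOWN,DOWN']


def classify(el, records):
    """Return 'Matched' if AT LEAST ONE record's first field equals el.

    classify is a hypothetical helper, not something from the thread.
    """
    # any() stops at the first hit and is False only if NONE of them match.
    if any(rec.split(",")[0].strip() == el for rec in records):
        return "Matched"
    return "Not Matched"


for el in list_1:
    el = el.strip()
    print(f"{classify(el, list_2)} {el}")
```

The decision is made once per el, after all of list_2 has been considered, which is why no stray "Not Matched" lines appear for elements that do have a match.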
#7
Thanks for looking into it!

I'd like to have this output:
Matched ++++++ ZPM911
Matched ++++++ ZPM1912
Matched ++++++ ZPM1919

Not Matched -- ZPM4555
Not Matched -- ZPM4556
Not Matched -- ZPM4557

But as you can see my code produces this one:
Matched ++++++ ZPM911
Not Matched -- ZPM911
Not Matched -- ZPM911
Not Matched -- ZPM1912
Matched ++++++ ZPM1912
Not Matched -- ZPM1912
Not Matched -- ZPM1919
Not Matched -- ZPM1919
Matched ++++++ ZPM1919
Not Matched -- ZPM4555
Not Matched -- ZPM4555
Not Matched -- ZPM4555
Not Matched -- ZPM4556
Not Matched -- ZPM4556
Not Matched -- ZPM4556
Not Matched -- ZPM4557
Not Matched -- ZPM4557

Thank you.
#8
(Aug-21-2021, 06:36 AM)tester_V Wrote: Here you go.
I'm using a file and a list.
File:
ZPM911
ZPM1912
ZPM1919
ZPM4555
ZPM4556
ZPM4557
Snippet:
ff=open("c:/02/TTST.txt",'r')
list_1=ff.readlines()
 
list_2 = ['ZPM911,UP,DOWN','ZPM1912,UP,UP','ZPM1919,DOWN,DOWN']
 
for el in list_1 :
    el=el.strip()
     
    for ehl in list_2 :   
        ehl=ehl.strip()
        sp0,*stuff = ehl.split(",")       
        if  el == sp0 :
            print(f" Matched ++++++ {el}")        
        else :
            print(f" Not Matched -- {el}")
ff.close()  
Readability is greatly improved if you put spaces around operators, that is
ehl=ehl.strip()
is far more readable if you do it as
ehl = ehl.strip()
We have trained our brains, from years of reading, to understand the use of spaces to separate words. Jamming everything together makes it unreadable, justlikethisparticularphraseinthissentence. Have some consideration for your readers, especially those of us over 50 whose vision is not as sharp as it used to be, and make the programs legible. Vertical whitespace is also a great aid to reading.

If you use readlines(), be aware that it reads the entire file into memory at once, so it may not work for very large files.

Your code is extremely inefficient, which may be fine for short files, but the work grows as the product of the number of lines in the file and the number of elements in the list, and every element is split again for every line. The list can be broken into parts just once, in advance, and that will save you considerable time because the split on comma will be done exactly once per element.
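
A rough sketch of that "split once, in advance" idea, assuming only the first comma-separated field matters for the comparison. The name keys is invented for this example, and using a set is one possible choice, not the only one.

```python
list_1 = ['ZPM911', 'ZPM1912', 'ZPM1919', 'ZPM4555', 'ZPM4556', 'ZPM4557']
list_2 = ['ZPM911,UP,DOWN', 'ZPM1912,UP,UP', 'ZPM1919,DOWN,DOWN']

# Split each record exactly once, up front, and keep only the first field.
# (keys is a hypothetical name chosen for this example.)
keys = {rec.split(",")[0].strip() for rec in list_2}

for el in list_1:
    el = el.strip()
    # Set membership is a hash lookup on average, so no inner loop is needed.
    status = "Matched" if el in keys else "Not Matched"
    print(f"{status} {el}")
```

With the set built beforehand, the work is roughly one split per list element plus one lookup per line, instead of one split per (line, element) pair.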

The advice given about adding the print statements is the first cut at a test. You need to know what Python is producing for sp0 and *stuff. For example, why aren't you applying strip() to sp0? But at no point do you actually print out sp0, so we don't know what you are comparing to el. Note that for diagnostic printout, you need to put delimiters around the strings, so you can see if spaces or tabs or something are sneaking in:
print(f"sp0=[{sp0}], el=[{el}]")
#9
It is always a good idea to start with a plan. So how can the problem at hand be described?

- for every item in list_1, check whether any item in list_2 starts with it
- if a match is found, I want to print it once
- if no match is found, I want to print it once

It should be obvious that the no-match message can be printed only after all items have been checked. A simple implementation of the above 'algorithm' can be written using the else clause of the for loop, which runs only when the loop finishes without hitting break:

>>> list_1 =['ZPM911','ZPM1912','ZPM1919','ZPM4555','ZPM4556','ZPM4557']
>>> list_2 = ['ZPM911,UP,DOWN','ZPM1912,UP,UP','ZPM1919,DOWN,DOWN']
>>> for item in list_1:
...     for el in list_2:
...         if el.startswith(item):
...             print(f'Matched {item}')
...             break             # match found and we proceed to the next item
...     else:                     # no-break, no match found for the item
...         print(f'not matched {item}')
...
Matched ZPM911
Matched ZPM1912
Matched ZPM1919
not matched ZPM4555
not matched ZPM4556
not matched ZPM4557
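
One caveat, offered as a hedged aside: startswith() tests a prefix, so a shorter ID would also count as a match for a longer record. If whole IDs are meant to match, comparing against the first comma-separated field is one option. Both the ID 'ZPM19' and the helper first_field below are made up for illustration and do not appear in the original data.

```python
list_2 = ['ZPM911,UP,DOWN', 'ZPM1912,UP,UP', 'ZPM1919,DOWN,DOWN']


def first_field(record):
    """Return the text before the first comma (hypothetical helper)."""
    return record.split(",", 1)[0].strip()


item = 'ZPM19'  # made-up ID, shorter than any real one in list_2

prefix_hit = any(rec.startswith(item) for rec in list_2)
exact_hit = any(first_field(rec) == item for rec in list_2)

print(prefix_hit)  # True: 'ZPM1912,UP,UP' begins with 'ZPM19'
print(exact_hit)   # False: no first field is exactly 'ZPM19'
```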
#10
Also, since your question was about iterating over a file:
the code below uses neither readlines() nor close().
In Python you can iterate directly over a file object, which is memory efficient because only one line at a time is held in memory,
and with open(...) closes the file object automatically.
list_2 = ['ZPM911,UP,DOWN','ZPM1912,UP,UP','ZPM1919,DOWN,DOWN']
with open('TTST.txt') as f:
    for line in f:
        line = line.strip()
        for el in list_2:
            if el.startswith(line):
                print(f'Matched {line}')
                break
        else:
            print(f'Not matched {line}')
Output:
Matched ZPM911
Matched ZPM1912
Matched ZPM1919
Not matched ZPM4555
Not matched ZPM4556
Not matched ZPM4557
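
To see what iterating over a file object actually hands you, here is a small sketch. It writes a temporary stand-in file (an assumption made so the example runs anywhere; it is not the real TTST.txt) and collects each line exactly as the loop receives it.

```python
import os
import tempfile

# Temporary stand-in for the real data file, so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("ZPM911\nZPM1912\n")
    path = tmp.name

raw_lines = []
with open(path) as f:
    for line in f:              # each pass of the loop yields the next line
        raw_lines.append(line)  # the trailing newline is still attached

os.remove(path)

print(raw_lines)                             # ['ZPM911\n', 'ZPM1912\n']
print([line.strip() for line in raw_lines])  # ['ZPM911', 'ZPM1912']
```

The newline at the end of every line is why strip() is needed before comparing a line with anything else.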