Python Forum
List Creation and Position of Continue Statement In Regular Expression Code - Printable Version


List Creation and Position of Continue Statement In Regular Expression Code - new_coder_231013 - Jun-11-2022

Hello,

I'm going through a Python course and there's a section on extracting data using regular expressions in which the presenter shows the following code using re.findall within a for loop:

import re
hand = open ('mbox-short.txt')
numlist = list()
for line in hand:
     line = line.rstrip()
     stuff = re.findall('^X-DSPAM-Confidence: ([0-9.]+)', line)
     if len(stuff) !=1 : continue
     num = float(stuff[0])
     numlist.append(num)
print('Maximum:', max(numlist))

I have two (unrelated, I think) questions about the above:

1 ) The presenter refers to "stuff" (first introduced on line 6) as a list, and the code on line 8 appears to be applying the float function to the first item in the list. My question is how did Python "know" that stuff was a list without a line of code before line 6 stating "stuff=[]"? Is the list somehow defined or created within the square brackets or parentheses used in line 6?

2 ) Doesn't the continue statement on line 7 need to be on its own line rather than at the end of line 7? I'm not sure if I've seen it placed on the same line before and am unclear as to whether this is an acceptable variation, a change from one version of Python to another, or something else.

Thanks so much in advance for your help.


RE: List Creation and Position of Continue Statement In Regular Expression Code - snippsat - Jun-11-2022

(Jun-11-2022, 12:57 PM)new_coder_231013 Wrote: My question is how did Python "know" that stuff was a list without a line of code before line 6 stating "stuff=[]"? Is the list somehow defined or created within the square brackets or parentheses used in line 6?
re.findall returns a list, so he is using that list.
>>> import re
>>> 
>>> s = 'hello world 123'
>>> r = re.findall(r'\d+', s)
>>> r
['123']
>>> r[0]
'123'
>>> int(r[0])
123
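To tie this back to the course code, here is a minimal sketch showing that no earlier "stuff = []" is needed: the assignment simply binds the name to whatever re.findall returns. The two sample lines are made up for illustration (the second only imitates a header from mbox-short.txt), and the results list is a hypothetical helper that is not in the original code.

```python
import re

# Hypothetical sample lines; the second imitates a header from mbox-short.txt.
lines = [
    "From: someone@example.com",
    "X-DSPAM-Confidence: 0.8475",
]

results = []  # illustration only: collects what findall returned for each line
for line in lines:
    # Assignment binds the name "stuff" to the object re.findall returns.
    # That object is a list whether or not anything matched.
    stuff = re.findall(r'^X-DSPAM-Confidence: ([0-9.]+)', line)
    results.append(stuff)
    print(type(stuff).__name__, stuff)
```

For a line that does not match, findall returns an empty list, which is why the len(stuff) != 1 check in the course code skips it.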
(Jun-11-2022, 12:57 PM)new_coder_231013 Wrote: 2 ) Doesn't the continue statement on line 7 need to be on its own line rather than at the end of line 7?
It will work on one line like that, but the style is better like this.
if len(stuff) !=1:
    continue
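As a rough check that the two layouts behave the same, the sketch below runs both forms over the same data. Python's grammar allows a simple statement such as continue to follow the colon on the same line, so this is not a version difference. The function names and the sample data are made up for illustration.

```python
def one_line(values):
    kept = []
    for v in values:
        if len(v) != 1: continue   # statement on the same line as the if
        kept.append(v[0])
    return kept


def two_lines(values):
    kept = []
    for v in values:
        if len(v) != 1:
            continue               # same statement on its own indented line
        kept.append(v[0])
    return kept


# Hypothetical findall-style results: empty, one item, two items, one item.
sample = [[], ['0.8475'], ['1', '2'], ['0.6178']]
print(one_line(sample) == two_lines(sample))
```

Both functions keep only the entries that have exactly one item, so the choice between the two layouts is purely about readability.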



RE: List Creation and Position of Continue Statement In Regular Expression Code - deanhystad - Jun-11-2022

Read The Fine Manual (https://docs.python.org/3/library/re.html)

According to the documentation:
Quote: re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of strings or tuples. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result.

The result depends on the number of capturing groups in the pattern. If there are no groups, return a list of strings matching the whole pattern. If there is exactly one group, return a list of strings matching that group. If multiple groups are present, return a list of tuples of strings matching the groups. Non-capturing groups do not affect the form of the result.
Your program calls re.findall() and the function returns a list. The returned list is assigned to the variable "stuff".
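The rules quoted above can be sketched with a small example. The text string and the patterns are made up for illustration; only re.findall itself comes from the documentation.

```python
import re

text = "a=1, b=22"  # hypothetical input string

# No groups: a list of strings matching the whole pattern.
no_groups = re.findall(r'[a-z]=\d+', text)

# Exactly one group: a list of strings matching that group.
one_group = re.findall(r'[a-z]=(\d+)', text)

# Multiple groups: a list of tuples of strings, one tuple per match.
two_groups = re.findall(r'([a-z])=(\d+)', text)

# No match at all: still a list, just an empty one.
no_match = re.findall(r'z=(\d+)', text)

print(no_groups)
print(one_group)
print(two_groups)
print(no_match)
```

The course code uses a pattern with exactly one group, so stuff is a list of strings and stuff[0] can be passed straight to float().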

If you don't know what a function does, look it up. I spend much of my coding time looking up how different functions work. Reading the documentation not only helps me understand how the function works, but often provides insight into what the function can be used for. In this case RTFM is time well spent.


RE: List Creation and Position of Continue Statement In Regular Expression Code - new_coder_231013 - Jun-15-2022

Thanks snippsat and deanhystad, appreciate it.