Python Forum
wanted: regex or code to find valide def statements in a line



wanted: regex or code to find valide def statements in a line - Skaperen - Mar-14-2020

A text file may contain lines that look like valid Python def statements. There may also be other things, like "def" appearing in a different context. I would like to find some code or a regex that can determine whether an entire line is a valid def statement. If it is valid, it must return the function name; if not, it must return something that tests false in an if statement. Only one line is tested at a time, and the body of the function may not even be present, but the entire line must be a complete and valid def statement.


RE: wanted: regex or code to find valide def statements in a line - Larz60+ - Mar-14-2020

You can pull it from the call stack (but the code must be interpreted for this to work). See: https://python-forum.io/Thread-Walking-the-python-call-stack-without-leaks?highlight=stack


RE: wanted: regex or code to find valide def statements in a line - snippsat - Mar-14-2020

You could have tried something yourself; maybe you have not started looking into regex, as mentioned before. ;)
Here is a quick test of something that may work as-is or with small changes, depending on the input text.
import re

text = '''\
def my_game():
    print('Game running')

def 123

def foo(arg):
    pass

def Bar(args*, kwargs**):
    pass

hello def is nice
def () wrong'''

# Build lists of the matched names and the matched lines
def_name = re.findall(r"def\s(\w+)\(.*", text)
def_line = re.findall(r"def\s\w+\(.*", text)

# Iterate over the matches; group(1) is the captured function name
matches = re.finditer(r"def\s(\w+)\(.*", text)
for match in matches:
    print(match.group(1))
Output:
my_game
foo
Bar
>>> def_name
['my_game', 'foo', 'Bar']
>>> def_line
['def my_game():', 'def foo(arg):', 'def Bar(args*, kwargs**):']
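
One caveat with this pattern: \w+ also matches names such as 123 or the keyword class, and re.findall scans the whole text rather than testing a single line. A minimal sketch of a per-line pre-filter follows. The helper name candidate_def_name and the pattern DEF_RE are hypothetical, not from the thread, and the check only rejects obvious non-candidates; it does not validate the parameter list.

```python
import keyword
import re

# Hypothetical pattern (an assumption, not from the thread): optional indent,
# "def", a name, a parenthesised parameter list, an optional "-> annotation",
# the colon, and an optional trailing comment.
DEF_RE = re.compile(r"\s*def\s+(\w+)\s*\(.*\)\s*(?:->.*)?:\s*(?:#.*)?")


def candidate_def_name(line):
    """Return the function name if the line looks like a def header, else None."""
    match = DEF_RE.fullmatch(line)
    if match is None:
        return None
    name = match.group(1)
    # \w+ accepts '123' and 'class', so the name is checked separately.
    if not name.isidentifier() or keyword.iskeyword(name):
        return None
    return name
```

Note that candidate_def_name('def Bar(args*, kwargs**):') still returns 'Bar', so this is only a cheap filter and real validation needs the Python parser.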



RE: wanted: regex or code to find valide def statements in a line - Skaperen - Mar-14-2020

I didn't have anything to try. And these sample regexes don't look like they are anywhere close to checking the whole line for validity. Perhaps they work OK on valid lines, but can they report all possible invalid lines as invalid?


RE: wanted: regex or code to find valide def statements in a line - snippsat - Mar-14-2020

(Mar-14-2020, 01:20 PM)Skaperen Wrote: i didn't have anything to try. and these sample regexes don't look like they are anywhere close to checking the whole line for validity
You need input to test on, or you can make some input as I have done, then try to make it as difficult as possible to see how the regex handles many cases.

(Mar-14-2020, 01:20 PM)Skaperen Wrote: perhaps works OK on valid lines, but can it report all possible invalid lines as invalid?
Those would be all the lines that the regex doesn't match.
>>> def_line
['def my_game():', 'def foo(arg):', 'def Bar(args*, kwargs**):']
>>> 
>>> for line in text.split('\n'):
...     if line not in def_line:     
...         print(line)
...         
    print('Game running')

def 123

    pass

    pass

hello def is nice
def () wrong



RE: wanted: regex or code to find valide def statements in a line - Gribouillis - Mar-14-2020

Skaperen Wrote:the body of the function may not even be present. but the entire line must be a complete and valid def statement.
This is not possible. The body of the function belongs to the function definition statement. Or perhaps you are suggesting a new definition of what a function definition statement is?


RE: wanted: regex or code to find valide def statements in a line - Skaperen - Mar-15-2020

Can I put the : on a later line? If line continuation is involved, then what follows is considered to be part of the line. If I leave out the : once the line has started with "def" at a valid indent, can it still be valid in Python 3?

Whether the : is there or not is just one aspect of testing validity; valid argument forms in the signature are another. I cannot provide input any more than I can provide every case that every copy of Python 3 will ever have to process (and decide is valid or not). Is this valid?
    def foobar( **abc, *def ):
why not?


RE: wanted: regex or code to find valide def statements in a line - Gribouillis - Mar-15-2020

Skaperen Wrote:is this valid?…why not?
This is not valid because the syntax definition forbids it. Have a look at the railroad diagram of Python 3.8's syntax that I uploaded recently. You will clearly see that this construct is not possible. You will also see that you cannot handle all the cases with a regex because the following, for example, is valid:
def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):
   ...
One way to check that it is valid would be to add a line of function body with a simple 'pass' statement and call the 'compile' function to see whether it can build an AST from this code.
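
The idea of adding a dummy body and letting the parser decide can be sketched as below. This is a minimal sketch, not a definitive implementation: the function name def_name is hypothetical, and stripping the indentation before parsing is an assumption about what "valid" should mean for a single line taken out of context. It returns the function name for a valid header and None otherwise, which tests false in an if statement.

```python
import ast


def def_name(line):
    """Return the function name if line is a valid def header, else None."""
    # Assumption: indentation is ignored, and the dummy body goes on its own
    # indented line so that a trailing comment on the header does not hide it.
    source = line.strip() + "\n    pass\n"
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError covers IndentationError; ValueError covers null bytes
        # on some Python versions.
        return None
    # The line must be exactly one statement: a (possibly async) function def.
    if len(tree.body) != 1:
        return None
    node = tree.body[0]
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
        return node.name
    return None
```

Because the code is only parsed and never executed, undefined names in default values do not matter. A header that already carries a one-line body, such as "def f(): return 1", is rejected by this sketch, since the appended pass line then has unexpected indentation.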


RE: wanted: regex or code to find valide def statements in a line - snippsat - Mar-16-2020

Checking that every parameter in a Python function is valid as well is like the famous email regex problem.
There is a fully RFC 822 compliant regex for that, but I have never needed it to, e.g., extract emails from text.
The regex I posted will also find the last two examples.
import re

text = '''\
def my_game():
    print('Game running')

def 123

def foo(arg):
    pass

def Bar(args*, kwargs**):
    pass

hello def is nice
def () wrong

def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):
    pass

def foobar( **abc, *def ):
    pass'''

# Build lists of the matched names and the matched lines
def_name = re.findall(r"def\s(\w+)\(.*", text)
def_line = re.findall(r"def\s\w+\(.*", text)

# Iterate over the matches; group(1) is the captured function name
matches = re.finditer(r"def\s(\w+)\(.*", text)
for match in matches:
    print(match.group(1))
Output:
my_game
foo
Bar
spam
foobar
>>> def_line
['def my_game():',
 'def foo(arg):',
 'def Bar(args*, kwargs**):',
 'def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):',
 'def foobar( **abc, *def ):']
I would not try to write a regex for all cases. As @Gribouillis mentions, you could step it up by adding pass and trying to run the function.
An invalid one will raise SyntaxError; a valid one will return None or raise some other error, like NameError.
If I test this, it looks like this:
>>> check = def_line[-1]
>>> check
'def foobar( **abc, *def ):'
>>> check = check.replace(':', ':pass')
>>> exec(check)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<string>", line 1
    def foobar( **abc, *def ):pass
                       ^
SyntaxError: invalid syntax
And with the valid ones:
>>> check = def_line[1]
>>> check = check.replace(':', ':pass')
>>> check
'def foo(arg):pass'
>>> repr(exec(check))
'None'
>>> 
>>> check = def_line[-2]
>>> check = check.replace(':', ':pass')
>>> check
'def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):pass'
>>> exec(check)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'dividend' is not defined



RE: wanted: regex or code to find valide def statements in a line - Gribouillis - Mar-16-2020

Using compile() is more robust than exec()
>>> source = "def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):\n"
>>> source = source.rstrip('\n') + 'pass\n'
>>> result = compile(source, '<string>', mode='single')
>>> result
<code object <module> at 0x7f33bbb89420, file "<string>", line 1>
Now with an invalid line
>>> source = "def spam(**ham, *eggs):\n"
>>> source = source.rstrip('\n') + 'pass\n'
>>> result = compile(source, '<string>', mode='single')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    def spam(**ham, *eggs):pass
                  ^
SyntaxError: invalid syntax
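
A short sketch of why compile() behaves better here than exec(): compile() only parses and builds a code object, while exec() also runs the def statement, which evaluates default values and can raise unrelated errors such as NameError. The helper name compiles and the area example are hypothetical, added only for illustration. The sketch also shows that appending the dummy body on its own line is safer than replacing every ':' in the header, which breaks annotated signatures.

```python
def compiles(source):
    """Return True if source parses and compiles; nothing is executed."""
    try:
        compile(source, "<string>", "exec")
    except SyntaxError:
        return False
    return True


header = "def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):"
source = header + "\n    pass\n"

# compile() never looks up 'dividend', so only the syntax is checked.
print(compiles(source))

# exec() runs the def statement, which evaluates the default value.
try:
    exec(source, {})
    exec_error = None
except NameError as exc:
    exec_error = exc
print(type(exec_error).__name__)

# Replacing every ':' damages annotated signatures (hypothetical example),
# while appending the body after the header leaves them intact.
annotated = "def area(width: float, height: float) -> float:"
print(compiles(annotated.replace(":", ":pass")))
print(compiles(annotated + "\n    pass\n"))
```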