Python Forum
wanted: regex or code to find valid def statements in a line
#1
in a text file there may be lines that look like valid python def statements. there may also be various other things, like "def" appearing in a different context. i would like to find some code or a regex that can determine whether the entire line is a valid def statement or not. if valid, it must return the function name. if not valid, it must return something that tests false in an if statement. this is just one line being tested. the body of the function may not even be present. but the entire line must be a complete and valid def statement.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
Reply
#2
you can pull it from the call stack (but code must be interpreted for this to work) see: https://python-forum.io/Thread-Walking-t...ight=stack
Reply
#3
You could have tried something yourself; maybe you have not started looking into regex, as mentioned before.
Here is a quick test of something that may work, or may need small changes depending on the input text.
import re

text = '''\
def my_game():
    print('Game running')

def 123

def foo(arg):
    pass

def Bar(args*, kwargs**):
    pass

hello def is nice
def () wrong'''

# Make list of match
def_name = re.findall(r"def\s(\w+)\(.*", text)
def_line = re.findall(r"def\s\w+\(.*", text)

# Iterate over matches group() or group(1)
matches = re.finditer(r"def\s(\w+)\(.*", text)
for match in matches:
    print(match.group(1))
Output:
my_game
foo
Bar
>>> def_name
['my_game', 'foo', 'Bar']
>>> def_line
['def my_game():', 'def foo(arg):', 'def Bar(args*, kwargs**):']
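One thing the pattern above does not check is the name itself: `\w+` also accepts names that start with a digit, as well as reserved words. A minimal sketch of a stricter name filter follows; `candidate_name` and `DEF_RE` are hypothetical names introduced here, not part of the post. It is still only a prefilter, since the parameter list is not validated at all.

```python
import keyword
import re

# Hypothetical prefilter: "def", a word, then an opening parenthesis.
DEF_RE = re.compile(r"\s*def\s+(\w+)\s*\(")

def candidate_name(line):
    """Return a plausible function name from `line`, or None.

    Only the name is checked; the parameter list is NOT validated.
    """
    m = DEF_RE.match(line)
    if m is None:
        return None
    name = m.group(1)
    # \w+ also matches things like "1abc" or "class", which cannot
    # be function names.
    if not name.isidentifier() or keyword.iskeyword(name):
        return None
    return name

for line in ("def my_game():", "def 1abc():", "def class():",
             "def Bar(args*, kwargs**):"):
    print(repr(line), "->", candidate_name(line))
```

Note that `def Bar(args*, kwargs**):` still passes this filter, which is why a regex alone is not enough for the original question.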
Reply
#4
i didn't have anything to try. and these sample regexes don't look like they are anywhere close to checking the whole line for validity. perhaps works OK on valid lines, but can it report all possible invalid lines as invalid?
Reply
#5
(Mar-14-2020, 01:20 PM)Skaperen Wrote: i didn't have anything to try. and these sample regexes don't look like they are anywhere close to checking the whole line for validity
You need input to test on, or you can make some input as I have done. Then you can also make it as difficult as possible, to see how the regex works in many cases.

(Mar-14-2020, 01:20 PM)Skaperen Wrote: perhaps works OK on valid lines, but can it report all possible invalid lines as invalid?
That would be all the lines that the regex doesn't match.
>>> def_line
['def my_game():', 'def foo(arg):', 'def Bar(args*, kwargs**):']
>>> 
>>> for line in text.split('\n'):
...     if line not in def_line:     
...         print(line)
...         
    print('Game running')

def 123

    pass

    pass

hello def is nice
def () wrong
Reply
#6
Skaperen Wrote:the body of the function may not even be present. but the entire line must be a complete and valid def statement.
This is not possible. The body of the function belongs to the function definition statement. Or perhaps you are suggesting a new definition of what a function definition statement is?
Reply
#7
can i put the : on a later line? if line continuation is involved then what follows is considered to be part of the line. if i leave out the : once the line started with "def" at a valid indent, can it still be valid in python3 ?

whether the : is there or not is just one aspect of testing if valid. others like valid prototype argument forms is another. i cannot provide input any more than i can provide every case that every copy of python3 will ever have to process (and decide is valid or not). is this valid?
    def foobar( **abc, *def ):
why not?
Reply
#8
Skaperen Wrote:is this valid?…why not?
This is not valid because there is a syntax definition that forbids it. Have a look at the railroad diagram of python 3.8's syntax that I uploaded recently. You will clearly see that this construct is not possible. You will also see that you can not handle all the cases with a regex because the following is valid for example
def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):
   ...
One way to check that it is valid would be to add a line of function body with a simple 'pass' statement and call the 'compile' function to see whether Python can parse this code without raising a SyntaxError.
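The idea of adding a `pass` body and letting Python's own parser decide can be sketched with the standard `ast` module, which also hands back the function name. This is a minimal sketch under some assumptions: `def_name` is a hypothetical helper name, leading indentation is stripped before parsing, `async def` is not accepted, and a one-line definition such as `def f(): return 1` is counted as valid.

```python
import ast

def def_name(line):
    """Return the function name if `line` is a valid def statement,
    otherwise None (which tests false in an if statement)."""
    stripped = line.strip()
    # First try the line as a complete one-line def, then as a bare
    # header with a dummy body added on the next line.
    for source in (stripped + "\n", stripped + "\n    pass\n"):
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue
        if len(tree.body) == 1 and isinstance(tree.body[0], ast.FunctionDef):
            return tree.body[0].name
    return None

for line in ("def foo(arg):",
             "def foobar( **abc, *def ):",
             "def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):",
             "hello def is nice"):
    print(repr(line), "->", def_name(line))
```

Because the line is only parsed and never executed, the default value in `spam` is not evaluated, so the undefined name `dividend` does not matter.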
Reply
#9
Checking that every parameter in a Python function is valid as well is like the famous regex problem with email addresses.
There is a fully RFC 822 compliant regex, but I have never needed that regex to, e.g., extract email addresses from text.
The regex I posted will find both of the last two examples.
import re

text = '''\
def my_game():
    print('Game running')

def 123

def foo(arg):
    pass

def Bar(args*, kwargs**):
    pass

hello def is nice
def () wrong

def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):
    pass

def foobar( **abc, *def ):
    pass'''

# Make list of match
def_name = re.findall(r"def\s(\w+)\(.*", text)
def_line = re.findall(r"def\s\w+\(.*", text)

# Iterate over matches group() or group(1)
matches = re.finditer(r"def\s(\w+)\(.*", text)
for match in matches:
    print(match.group(1))
Output:
my_game
foo
Bar
spam
foobar
>>> def_line
['def my_game():',
 'def foo(arg):',
 'def Bar(args*, kwargs**):',
 'def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):',
 'def foobar( **abc, *def ):']
I would not try to write a regex for all cases. As @Gribouillis mentioned, you could step it up by adding pass and trying to run the function.
An invalid one will raise SyntaxError; a valid one will return None or raise some other error such as NameError.
If I test this, it looks like this:
>>> check = def_line[-1]
>>> check
'def foobar( **abc, *def ):'
>>> check = check.replace(':', ':pass')
>>> exec(check)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<string>", line 1
    def foobar( **abc, *def ):pass
                       ^
SyntaxError: invalid syntax
So with the valid ones:
>>> check = def_line[1]
>>> check = check.replace(':', ':pass')
>>> check
'def foo(arg):pass'
>>> repr(exec(check))
'None'
>>> 
>>> check = def_line[-2]
>>> check = check.replace(':', ':pass')
>>> check
'def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):pass'
>>> exec(check)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'dividend' is not defined
Reply
#10
Using compile() is more robust than exec(), because the code is only compiled and never run.
>>> source = "def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):\n"
>>> source = source.rstrip('\n') + 'pass\n'
>>> result = compile(source, '<string>', mode='single')
>>> result
<code object <module> at 0x7f33bbb89420, file "<string>", line 1>
Now with an invalid line
>>> source = "def spam(**ham, *eggs):\n"
>>> source = source.rstrip('\n') + 'pass\n'
>>> result = compile(source, '<string>', mode='single')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    def spam(**ham, *eggs):pass
                  ^
SyntaxError: invalid syntax
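Putting the regex and the compile() check together, a rough sketch of a scanner for a whole text could look like the following. The names `check_def`, `names_in` and `NAME_RE` are hypothetical. It assumes each line is a bare header with no body on the same line, so `def f(): pass` is rejected here, and it strips indentation before compiling.

```python
import re

# Hypothetical helper pattern: only used to pull out the name.
NAME_RE = re.compile(r"def\s+(\w+)")

def check_def(line):
    """Return the function name if `line` is a valid def header, else None."""
    header = line.strip()
    m = NAME_RE.match(header)
    if m is None:
        return None
    try:
        # Add a dummy body; compile() parses the code without running it,
        # so default values are never evaluated.
        compile(header + "\n    pass\n", "<string>", "exec")
    except SyntaxError:
        return None
    return m.group(1)

def names_in(text):
    """Collect the names of all valid def headers found in `text`."""
    found = (check_def(line) for line in text.splitlines())
    return [name for name in found if name]

sample = """\
def my_game():
def 123
def Bar(args*, kwargs**):
hello def is nice
def spam(ham=[n for n in range(1, dividend+1) if dividend % n == 0]):
def foobar( **abc, *def ):
"""
print(names_in(sample))
```

The regex only extracts the name; whether the line is valid is left entirely to compile().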
Reply