Python Forum

Full Version: RE-greedy or non greedy about '?'
Quote:The '*', '+', and '?' qualifiers are all greedy.
from https://docs.python.org/3/library/re.html
I think the '?' qualifier is not greedy and the description is inaccurate. Is there something I'm missing, or does this part of the docs need to be changed?
(Jun-01-2020, 04:17 AM)frank0903 Wrote: I think the '?' qualifier is not greedy,
https://www.regular-expressions.info/refrepeat.html
import re
RE_QUESTION_MARK = 'yd?'
RE_ASTERISK = 'yd*'
test_str1 = 'ydddddd'
print(f'{test_str1} match result : {re.search(RE_ASTERISK, test_str1)}')
print(f'{test_str1} match result : {re.search(RE_QUESTION_MARK, test_str1)}')
Output:
ydddddd match result : <re.Match object; span=(0, 7), match='ydddddd'>
ydddddd match result : <re.Match object; span=(0, 2), match='yd'>
According to the match results, '*' is greedy, which I can accept. But '?' only matches 0 or 1 repetitions of the preceding RE, so it looks non-greedy to me. How can we call it greedy?
greedy:
https://regex101.com/r/QmYBkp/1
from the explanations on the right-hand side:
Quote:"yd?" gm
y matches the character y literally (case sensitive)
d? matches the character d literally (case sensitive)
? Quantifier — Matches between zero and one times, as many times as possible, giving back as needed (greedy)

lazy:
https://regex101.com/r/qvKqqU/1
from the explanations on the right-hand side:
Quote:"yd??" gm
y matches the character y literally (case sensitive)
d? matches the character d literally (case sensitive)
? Quantifier — Matches between zero and one times, as few times as possible, expanding as needed (lazy)
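The "giving back as needed" and "expanding as needed" parts of those explanations can be seen with a small sketch. The patterns and the capture groups below are my own illustration, not taken from the regex101 links; re.fullmatch is used so that the whole string has to be consumed.

```python
import re

# Greedy '?': d? first takes the 'd', but then the final literal 'd' has
# nothing left to match, so the engine backtracks and d? gives the 'd' back.
greedy = re.fullmatch(r'y(d?)d', 'yd')

# Lazy '??': d?? first tries zero repetitions, but fullmatch needs the whole
# string consumed, so the engine expands d?? to take the 'd'.
lazy = re.fullmatch(r'y(d??)', 'yd')

print(repr(greedy.group(1)))  # '' (the greedy quantifier ended up with zero)
print(repr(lazy.group(1)))    # 'd' (the lazy quantifier ended up with one)
```

So greedy and lazy only describe which option is tried first; either one will settle for the other option if that is the only way the overall pattern can match.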
@buran, thanks! Understood completely.
If anyone else has the same question, I hope this post helps them understand what '?' really means.
Summary:
The '?' quantifier matches between zero and one times, as many times as possible. In other words, there are two options, 0 times or 1 time, and '?' prefers 1 time whenever that still lets the overall pattern match. It wants more, and I think that's the reason it's called greedy.
The '??' quantifier matches between zero and one times, as few times as possible. In other words, '??' prefers 0 times and takes 1 only if the rest of the pattern cannot match otherwise. I think that's the reason it's called lazy or non-greedy.
Yep! And also, if ? and * as quantifiers produced the same result (as implied by your example above), then probably one of them would not be necessary.
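For anyone who wants to check this side by side, here is a minimal sketch that reuses the test string from the first post and adds the lazy 'yd??' pattern next to the original two. The variable names are just for illustration.

```python
import re

test_str = 'ydddddd'

star = re.search(r'yd*', test_str)       # greedy, zero or more 'd'
question = re.search(r'yd?', test_str)   # greedy, zero or one 'd'
lazy = re.search(r'yd??', test_str)      # lazy, zero or one 'd'

print(star.group())      # ydddddd
print(question.group())  # yd
print(lazy.group())      # y
```

All three results differ: '*' and '?' are both greedy but have different upper limits, while '??' has the same limits as '?' and simply prefers the smaller option.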