Python Forum
A Question about Expression
#1
If I have the following code:
import re
x = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
y = re.findall(r'\S+?@\S+?', x)
print(y)

Why does it print ['stephen.marquard@u'] and not ['d@u'], even though the non-greedy modifier (?) is applied on both sides of the @?
#2
I think it is because that is the first occurrence of the pattern. 'd@u' would also match, but it begins later in the string, and the regex engine always returns the match with the leftmost starting position. Laziness only affects how far a match extends, not where it starts. After finding that first match, findall resumes the search after its end, which is why 'd@u' is never found. Why not just look for r'\S@\S'?
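To make the "leftmost match wins" point concrete, here is a small sketch. The sample line and its address are assumptions for illustration (any line with a single address after "From " behaves the same way), and the variable names are mine.

```python
import re

# Illustrative sample line; the address is an assumption.
x = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

pattern = re.compile(r'\S+?@\S+?')

# search() tries start positions left to right and keeps the first that works.
first = pattern.search(x)
print(first.span(), first.group())   # (5, 23) stephen.marquard@u

# 'd@u' would begin at index 20, inside the span already matched,
# so findall never gets a chance to start a match there.
print(x.find('d@u'))                 # 20
print(pattern.findall(x))            # ['stephen.marquard@u']

# Requiring exactly one character on each side returns the short form.
short = re.findall(r'\S@\S', x)
print(short)                         # ['d@u']
```

The span shows the match starting at the "s" right after "From ", because that is the first position from which the whole pattern can succeed.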
#3
The lazy "\S+?@" consumes the minimum number of non-whitespace characters needed to reach an "@". In effect, it starts at the beginning of a run of characters that is not interrupted by whitespace and stops at the first "@" in that run. If it reaches whitespace before finding an "@", that attempt fails and the engine retries from the next starting position. Without the laziness, "\S+" would first consume the "@" as well (it is a non-whitespace character too) and then have to backtrack to it.

The lazy "\S+?" at the end consumes only a single character, because one character is all the engine needs to satisfy the pattern.
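A quick way to see what each "?" actually changes is to run the lazy and greedy variants side by side. This is only a sketch; the sample line is an assumed example and the dictionary labels are my own.

```python
import re

# Illustrative sample line; the address is an assumption.
x = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

# Same pattern with laziness toggled on each side of the "@".
variants = {
    'lazy both':   r'\S+?@\S+?',
    'greedy tail': r'\S+?@\S+',
    'greedy both': r'\S+@\S+',
}

results = {name: re.findall(p, x) for name, p in variants.items()}
for name, found in results.items():
    print(f'{name:12} {found}')

# lazy both    ['stephen.marquard@u']
# greedy tail  ['stephen.marquard@uct.ac.za']
# greedy both  ['stephen.marquard@uct.ac.za']
```

With only one "@" in the token, laziness on the left side makes no visible difference: greedy backtracks to the same "@" that lazy walks up to. Only the trailing quantifier changes the result here.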