REGEX Look Arounds - Printable Version

REGEX Look Arounds - CharlesKL - May-23-2020

Hi, I am trying to learn and understand lookarounds. In the code below, why is only the '1' removed rather than the whole '123'?

import re

a = "learn@123Python456"
re.findall(r"\d+", a)          # ['123', '456']
re.findall(r"(?<!\W)\d+", a)   # ['23', '456']

Whereas if I use a positive lookbehind such as:

b = "@@@coding????isfun"
re.findall(r"\w+", b)          # ['coding', 'isfun']
re.findall(r"(?<=\W)\w+", b)   # ['coding', 'isfun']

all the characters are retained.

I was using IDLE to run the code.

Actually, this is a better example of a positive lookbehind:

b = "@@@coding  isfun"
re.findall(r"\w+", b)          # ['coding', 'isfun']
re.findall(r"(?<=\s)\w+", b)   # ['isfun']

Any assistance will be appreciated, thanks.

RE: REGEX Look Arounds - bowlofred - May-23-2020

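In your third example, each match must be immediately preceded by a whitespace character. The lookbehind checks for that character but does not include it in the match. "coding" follows "@", so no match can start there. Only right after the whitespace is a match possible, so "isfun" is returned.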
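A small sketch of that point, reusing the string from the question. The second pattern, \s\w+, is not from the thread; it is only there for comparison, to show what happens when the whitespace is consumed instead of merely checked.

```python
import re

b = "@@@coding  isfun"

# Lookbehind: the whitespace must be there, but it is not part of the match.
lookbehind_matches = re.findall(r"(?<=\s)\w+", b)

# Comparison pattern (an assumption for illustration): the whitespace is
# consumed, so it shows up in the result.
consuming_matches = re.findall(r"\s\w+", b)

print(lookbehind_matches)  # ['isfun']
print(consuming_matches)   # [' isfun']
```

In both cases "coding" is skipped, because nothing before it is whitespace.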

In your first example, a match must not start immediately after a "non-word" character. The first digit where a match could start is "1", but that digit does follow a non-word character ("@"), so that position is skipped. The next possible starting point is "2". That one is eligible (it follows "1", which is not matched by \W), so the match begins there and returns "23".
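To see this in the interpreter, the sketch below prints where each match starts and which character sits just before it. The last pattern, (?<![\W\d])\d+, is not from the thread; it is one possible variation, assuming the goal was to drop the whole "123" run rather than just the "1".

```python
import re

a = "learn@123Python456"

# Original pattern: the lookbehind is only checked at the position
# where a match starts, not for every digit in the run.
partial = re.findall(r"(?<!\W)\d+", a)
for m in re.finditer(r"(?<!\W)\d+", a):
    print(m.group(), m.span(), "preceded by", repr(a[m.start() - 1]))
# 23 (7, 9) preceded by '1'
# 456 (15, 18) preceded by 'n'

# Hypothetical variation: also forbid a digit just before the match,
# so a match cannot begin partway through a run of digits.
whole_runs = re.findall(r"(?<![\W\d])\d+", a)
print(whole_runs)  # ['456']
```

With the extra \d in the lookbehind, "2" and "3" are rejected as starting points too, so nothing from "123" is returned.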

RE: REGEX Look Arounds - CharlesKL - May-26-2020

Thank you, a bit confusing but now I understand