Python Forum

Full Version: Automate the boring stuff: regex not matching
Hello,
I am trying to do the regex exercise: matching 3 telephone numbers in a row, with or without an area code, and with or without a comma at the end:

>>> phoneRegex = re.compile(r'((\d\d\d-)?\d\d\d-\d\d\d\d(,)?) {3}')
>>> mo = phoneRegex.search('my numbers are 415-555-1234, 555-4242, 212-555-0000')
This is not matching. And I cannot figure out why.
If you want to extract phone numbers, your regexp pattern should specify exactly what you want to extract.
You don't need the commas or the <space>{3} part, so why are they included in the pattern?

phoneRegex = re.compile(r'((?:\d\d\d-)?\d\d\d-\d\d\d\d)')
The ?: was added to the first group to make it non-capturing, so that the area code is not grabbed as a separate result.
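To see what the ?: changes, here is a small sketch comparing the two versions with findall(). The names capturing, non_capturing, pairs and numbers are only for illustration and are not part of the exercise.

```python
import re

text = 'my numbers are 415-555-1234, 555-4242, 212-555-0000'

# With a capturing area-code group, findall() returns one tuple per match,
# holding every group: (whole number, area code or '' if absent).
capturing = re.compile(r'((\d\d\d-)?\d\d\d-\d\d\d\d)')
pairs = capturing.findall(text)
print(pairs)
# [('415-555-1234', '415-'), ('555-4242', ''), ('212-555-0000', '212-')]

# With (?:...) only the outer group is captured, so findall() returns
# plain strings.
non_capturing = re.compile(r'((?:\d\d\d-)?\d\d\d-\d\d\d\d)')
numbers = non_capturing.findall(text)
print(numbers)
# ['415-555-1234', '555-4242', '212-555-0000']
```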

import re
x = 'my numbers are 415-555-1234, 555-4242, 212-555-0000'
phoneRegex = re.compile(r'((?:\d\d\d-)?\d\d\d-\d\d\d\d)')
phoneRegex.findall(x)
Output:
['415-555-1234', '555-4242', '212-555-0000']
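As for why the original pattern finds nothing: as far as I can tell, the {3} applies only to the space right before it, not to the whole group, so the pattern asks for one phone number followed by three spaces. A quick check, where the string spaced is made up for the test:

```python
import re

# The original pattern from the question, unchanged.
original = re.compile(r'((\d\d\d-)?\d\d\d-\d\d\d\d(,)?) {3}')

text = 'my numbers are 415-555-1234, 555-4242, 212-555-0000'
print(original.search(text))
# None: no number in the text is followed by three spaces

# Hypothetical input: one number, a comma, then exactly three spaces.
spaced = '415-555-1234,   '
mo = original.search(spaced)
print(mo is not None)
# True: the {3} repeats the space, not the phone-number group
```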
Hi,
That's not what I need.
I need a regex that returns a result only if I have a group of 3 phone numbers one after the other.
import re

phone = r'(?:\d{3}-)?\d{3}-\d{4}'
phone3 = r'{p}(?:[\s,]\s*{p}){{2}}\,?'.format(p=phone)

mo = re.search(phone3, 'my numbers are 415-555-1234, 555-4242, 212-555-0000')
print(mo)
Output:
<_sre.SRE_Match object; span=(15, 51), match='415-555-1234, 555-4242, 212-555-0000'>
Great!
phone3 = r'{p}(?:[\s,]\s*{p}){{2}}\,?'.format(p=phone)
So {p} is phone, your first regex. Can you please briefly explain the parts of the regex that follow?
DJ_Qu Wrote: Can you please briefly explain the parts of the regex that follow?
It's all in the re module's documentation. (?:…){2} is a non-capturing group repeated exactly twice. I had to escape the braces because I'm using the format() method, so I wrote {{2}}. Inside this group, [\s,]\s* means a whitespace character or a comma, followed by zero or more whitespace characters, and {p} then stands for another phone number. The \,? at the end means an optional comma.
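The same pattern can also be written out with re.VERBOSE, which allows whitespace and comments inside the regex. This is only a sketch of an equivalent form: phone3_verbose is an illustrative name, the phone pattern is pasted in by hand instead of using format(), and the unneeded backslash before the final comma is dropped.

```python
import re

phone3_verbose = re.compile(r'''
    (?:\d{3}-)?\d{3}-\d{4}      # first phone number, area code optional
    (?:                         # non-capturing group for each further number
        [\s,]\s*                # one whitespace character or comma, then optional whitespace
        (?:\d{3}-)?\d{3}-\d{4}  # another phone number
    ){2}                        # exactly two more numbers, three in total
    ,?                          # optional trailing comma
''', re.VERBOSE)

mo = phone3_verbose.search('my numbers are 415-555-1234, 555-4242, 212-555-0000')
print(mo.group())
# 415-555-1234, 555-4242, 212-555-0000

# Only two numbers in a row, so there is no match.
print(phone3_verbose.search('call 415-555-1234, 555-4242'))
# None
```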
Is it just me, or is regex a headache?
Thank you for your help!
Jamie Zawinski Wrote: Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.