Hello,
I am trying to do the exercise on regex: matching 3 telephone numbers in a row, each with or without an area code and with or without a comma at the end:
>>> phoneRegex = re.compile(r'((\d\d\d-)?\d\d\d-\d\d\d\d(,)?) {3}')
>>> mo = phoneRegex.search('my numbers are 415-555-1234, 555-4242, 212-555-0000')
This is not matching. And I cannot figure out why.
If you want to extract phone numbers, your regexp pattern should specify exactly what you want to extract.
You don't need the commas or the spaces (<space>{3}), so why are they included in the pattern?
phoneRegex = re.compile(r'((?:\d\d\d-)?\d\d\d-\d\d\d\d)')
?: is added to the first group to make it non-capturing, so that it is not grabbed as a separate result.
import re
x = 'my numbers are 415-555-1234, 555-4242, 212-555-0000'
phoneRegex = re.compile(r'((?:\d\d\d-)?\d\d\d-\d\d\d\d)')
print(phoneRegex.findall(x))
Output:
['415-555-1234', '555-4242', '212-555-0000']
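As for why the original pattern never matched: a quantifier applies only to the item immediately before it. In `) {3}` that item is the space, not the group, so the pattern asks for one phone number followed by three spaces. A small sketch of this (the names `original` and `moved` and the extra test strings are made up for illustration):

```python
import re

text = 'my numbers are 415-555-1234, 555-4242, 212-555-0000'

# Original pattern: {3} binds to the space just before it,
# so this asks for ONE phone number followed by THREE spaces.
original = re.compile(r'((\d\d\d-)?\d\d\d-\d\d\d\d(,)?) {3}')
print(original.search(text))                                # None

# The same pattern does match when three spaces follow a number.
print(original.search('call 555-4242   now') is not None)   # True

# Moving the space inside the group makes {3} repeat the whole group,
# but then every number, including the last, must be followed by a space.
moved = re.compile(r'((\d\d\d-)?\d\d\d-\d\d\d\d(,)? ){3}')
print(moved.search(text))                                   # None
print(moved.search(text + ' ') is not None)                 # True
```

So simply moving the space is not enough here, because the last number in the string has no trailing space.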
Hi,
That's not what I need.
I need a regex that returns a result only if I have a group of 3 phone numbers one after the other.
import re
phone = r'(?:\d{3}-)?\d{3}-\d{4}'
phone3 = r'{p}(?:[\s,]\s*{p}){{2}}\,?'.format(p=phone)
mo = re.search(phone3, 'my numbers are 415-555-1234, 555-4242, 212-555-0000')
print(mo)
Output:
<_sre.SRE_Match object; span=(15, 51), match='415-555-1234, 555-4242, 212-555-0000'>
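A quick way to check that this only fires on three numbers in a row is to try it on strings with fewer, or with something in between. This is just a sketch, and the test strings `two`, `split` and `three` are made up:

```python
import re

phone = r'(?:\d{3}-)?\d{3}-\d{4}'
phone3 = r'{p}(?:[\s,]\s*{p}){{2}}\,?'.format(p=phone)

two = 'call 415-555-1234, 555-4242 today'        # only two numbers
split = '415-555-1234, 555-4242 or 212-555-0000'  # a word breaks the run
three = '415-555-1234 555-4242,212-555-0000,'     # mixed separators

print(re.search(phone3, two))            # None
print(re.search(phone3, split))          # None
print(re.search(phone3, three).group())  # 415-555-1234 555-4242,212-555-0000,
```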
Great!
phone3 = r'{p}(?:[\s,]\s*{p}){{2}}\,?'.format(p=phone)
So {p} is phone, your first regex. Can you please briefly explain the parts of the regex that follow?
DJ_Qu Wrote: Can you please briefly explain the parts of the regex that follow?
It's all in the re module's documentation. (?:…){2} is a non-capturing group repeated twice. I had to escape the braces because I'm using the format() method, so I write {{2}}.
Inside this group, [\s,]\s* means a whitespace character or a comma, followed by zero or more whitespace characters, then {p} means another phone number.
The \,? at the end means an optional comma.
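If it helps readability, the same pattern can probably be written with re.VERBOSE so that each piece carries its own comment. This is only a sketch: it uses an f-string instead of format() (which needs the same doubled braces), and it drops the backslash before the comma since a comma has no special meaning in a regex.

```python
import re

phone = r'(?:\d{3}-)?\d{3}-\d{4}'

# re.VERBOSE ignores whitespace in the pattern and treats # as a comment.
phone3 = re.compile(rf'''
    {phone}          # first phone number
    (?:              # non-capturing group:
        [\s,]\s*     #   one whitespace or comma, then optional whitespace
        {phone}      #   another phone number
    ){{2}}           # ...repeated exactly twice
    ,?               # optional trailing comma
''', re.VERBOSE)

mo = phone3.search('my numbers are 415-555-1234, 555-4242, 212-555-0000')
print(mo.group())   # 415-555-1234, 555-4242, 212-555-0000
print(mo.span())    # (15, 51)
```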
Is it just me, or is regex a headache??
Thank you for your help!