Hello, I am wondering if anyone can help me with my code. I have a text file and need to extract the UK phone numbers from it, making sure that each number has a +44 prefix followed by ten digits.
So far I've got:
import re  # Import the regex module

uk_numbers = []  # The list where we will store the UK phone numbers
pattern = ...  # (?) this is the line I'm unsure about
try:
    with open('phone_log.txt', 'rt') as in_file:
        for linenum, line in enumerate(in_file):
            if pattern.search(line) is not None:
                uk_numbers.append((linenum, line.rstrip('\n')))
    for linenum, line in uk_numbers:
        print("Line", linenum, ":", line, sep='')
except FileNotFoundError:
    print("Input file not found")
I'm currently unsure what should go in my pattern line.
I was thinking maybe
pattern = re.compile('\"tel\:[\(\)\-0-10\ ]{1,}\"')
but I'm not sure if this would ensure that the phone numbers have a prefix of +44 and are followed by ten digits.
Any help would be much appreciated, thanks.
Ronnie
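A quick check of the guessed pattern may help here. Inside a character class, `0-10` is read as the range `0-1` followed by a literal `0`, so the class only accepts the digits 0 and 1 plus parentheses, hyphens and spaces, and it never mentions +44. The sketch below rewrites the guess as a raw string (an equivalent form, chosen only to avoid escape warnings); the two test strings are made up for illustration.

```python
import re

# The guessed pattern, rewritten as a raw string. The regex is the same:
# a quoted "tel:" followed by one or more of ( ) - 0 1 or space.
guess = re.compile(r'"tel:[()\-0-10 ]{1,}"')

# Made-up test strings, not taken from the real log file.
uk_sample = '"tel:+441234567890"'
odd_sample = '"tel:(0)1-10"'

print(guess.search(uk_sample))   # None: '+' and the digits 2-9 are not in the class
print(guess.search(odd_sample))  # a match object, although this is not a +44 number
```

So the guess rejects a genuine +44 number and accepts something that is not one, which is why the pattern needs to spell out `\+44` and a digit count explicitly.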
Maybe this will help:
#!/usr/bin/python3
import re

pattern = re.compile(r'''
    tel:      # tel:
    \s*?      # maybe some spaces
    \+44      # +44
    \s*?      # maybe some spaces
    \d{10}    # 10 digits
    ''', re.X)

uk_numbers = []
with open('tel.txt') as in_file:
    for linenum, line in enumerate(in_file):
        if pattern.search(line) is not None:
            uk_numbers.append((linenum, line.rstrip('\n')))

for linenum, line in uk_numbers:
    print("Line", linenum, ":", line)
tel.txt:
"tel:+44 1234567890"
tel: +44 1234567890
tel: +441234567890
tel:+441234567890
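One caveat about the ten-digit part of the pattern: with `search` it also succeeds when more than ten digits follow +44, because the match simply stops after the tenth digit. If "followed by ten digits" is meant as exactly ten, a negative lookahead `(?!\d)` is one way to express that. A minimal sketch, assuming the same `tel:` prefix as above; the sample strings are invented:

```python
import re

# Same idea as the pattern above, plus (?!\d) so that an eleventh digit
# makes the match fail instead of being silently ignored.
strict = re.compile(r'''
    tel:      # literal tel:
    \s*       # optional spaces
    \+44      # the +44 prefix
    \s*       # optional spaces
    \d{10}    # ten digits
    (?!\d)    # not followed by another digit
''', re.X)

samples = [
    'tel:+44 1234567890',     # ten digits
    'tel:+44 12345678901',    # eleven digits
    'tel:+44 123456789',      # nine digits
]
results = [strict.search(s) is not None for s in samples]
print(results)  # [True, False, False]
```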
(Oct-26-2017, 04:14 PM) heiner55 wrote: Maybe this will help:
I've tried doing this, but I'm having issues when running it.
I've changed "tel.txt" to "phonecalls.txt", which is the name of the file I am extracting the data from. Is this okay to do?
Many thanks
Ronnie
(Oct-27-2017, 01:53 PM) heiner55 wrote: That is ok.
Ok that's great, so my code is the following:
>>> data_file = open("phone_log.txt", "r")
>>> data = data_file.readlines()
>>> import re
>>> pattern = re.compile(r'''tel:\s*?\+44\s*?\d{10,10}''', re.X)
>>> uk_numbers = []
>>> with open('phone_log.txt') as in_file:
...     for linenum, line in enumerate(in_file):
...         if pattern.search(line) != None:
...             uk_numbers.append((linenum, line.rstrip('\n')))
...
>>> for linenum, line in uk_numbers:
...     print("Line", linenum, ":", line)
...
What must I do to print the UK phone numbers? It would be great to know, since I've been trying for the past hour without success.
Many thanks for your help
Maybe this helps:
https://docs.python.org/3.6/library/re.html
#!/usr/bin/python3
import re

pattern = r"""
    tel:        # tel:
    \s*?        # maybe some spaces
    \+44        # +44
    \s*?        # maybe some spaces
    (\d{10})    # 10 digits
    """

with open('phone_log.txt') as in_file:
    for linenr, line in enumerate(in_file):
        match = re.search(pattern, line, re.X)
        if match:
            print("Line %d: %s" % (linenr, match.group(1)))
phone_log.txt:
here is some text "tel:+44 1234567890" hers is some text
tel: +44 1234567890 text text
text tel text tel: +441234567890 text
tel:+441234567890
Output:
Line 0: 1234567890
Line 1: 1234567890
Line 2: 1234567890
Line 3: 1234567890
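If the goal is to print each number with its +44 prefix rather than only the ten digits, one option is to capture the digits and rebuild the number, and to use `finditer` so that a line holding several numbers yields all of them. A rough sketch under those assumptions; the `lines` list stands in for the file and is invented:

```python
import re

pattern = re.compile(r'tel:\s*\+44\s*(\d{10})')

# Invented stand-in for the lines of phone_log.txt.
lines = [
    'call from tel:+44 1234567890 to tel: +440987654321',
    'no number on this line',
]

uk_numbers = []
for linenr, line in enumerate(lines):
    # finditer yields one match object per number found on the line
    for match in pattern.finditer(line):
        uk_numbers.append((linenr, '+44' + match.group(1)))

print(uk_numbers)
# [(0, '+441234567890'), (0, '+440987654321')]
```

Here `match.group(1)` is the text captured by the parentheses (the ten digits), while `match.group(0)` would be the whole matched text including `tel:`.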
(Oct-27-2017, 02:28 PM) heiner55 wrote: Maybe this helps: https://docs.python.org/3.6/library/re.html
Ahh, yes.
As in, how do I get that output? What must I do to get it?
print("Line ", linenr, ": ", match[1], sep='')
(Oct-27-2017, 03:35 PM) heiner55 wrote: print("Line ", linenr, ": ", match[1], sep='')
Do I type this in the Python Shell?
You get the output if you run the sample program (see above) with the input-file phone_log.txt.
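To make that concrete: the program is meant to be saved to a file and run as a whole, rather than typed line by line at the `>>>` prompt. The sketch below is one way to lay it out. The names `extract_uk_numbers` and `main`, and the script name mentioned afterwards, are made up for illustration, and the missing-file handling follows the first post in this thread.

```python
import re

PATTERN = re.compile(r'tel:\s*\+44\s*(\d{10})')


def extract_uk_numbers(path):
    """Return (line number, ten digits) for each line with a +44 number."""
    found = []
    with open(path) as in_file:
        for linenr, line in enumerate(in_file):
            match = PATTERN.search(line)
            if match:
                found.append((linenr, match.group(1)))
    return found


def main(path='phone_log.txt'):
    try:
        numbers = extract_uk_numbers(path)
    except FileNotFoundError:
        # Same message as in the original question
        print("Input file not found")
        return
    for linenr, digits in numbers:
        print("Line %d: %s" % (linenr, digits))


if __name__ == '__main__':
    main()
```

Saved as, say, extract_uk.py in the same folder as phone_log.txt, it can be run from a terminal with `python3 extract_uk.py`, or opened in IDLE and started with Run > Run Module (F5).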