Posts: 10
Threads: 2
Joined: Oct 2017
Oct-26-2017, 02:49 PM
(This post was last modified: Oct-26-2017, 03:10 PM by sparkz_alot.)
Hello, I am wondering if anyone can help me with my computing code. I have a text file and need to extract the UK phone numbers from it, making sure that each number has a prefix of +44 followed by ten digits.
So far I've got:
import re  # Import the regex module

uk_numbers = []  # The list where we will store the UK phone numbers
pattern = (?)
try:
    with open("phone_log.txt", "rt") as in_file:
        for linenum, line in enumerate(in_file):
            if pattern.search(line) is not None:
                uk_numbers.append((linenum, line.rstrip("\n")))
    for linenum, line in uk_numbers:
        print("Line ", linenum, ": ", line, sep="")
except FileNotFoundError:
    print("Input file not found")
I'm currently unsure of what would be in my pattern line?
I was thinking maybe
pattern = re.compile('\"tel\:[\(\)\-0-10\ ]{1,}\"')
but I'm not sure if this would ensure that the phone numbers have a prefix of +44 and are followed by ten digits.
Any help would be much appreciated, thanks.
Ronnie
Posts: 606
Threads: 3
Joined: Nov 2016
Oct-26-2017, 04:14 PM
(This post was last modified: Oct-26-2017, 04:15 PM by heiner55.)
Maybe this will help:
#!/usr/bin/python3
import re

pattern = re.compile(r'''
    tel:        # tel:
    \s*?        # maybe some spaces
    \+44        # +44
    \s*?        # maybe some spaces
    \d{10}      # 10 digits
''', re.X)

uk_numbers = []
with open('tel.txt') as in_file:
    for linenum, line in enumerate(in_file):
        if pattern.search(line) is not None:
            uk_numbers.append((linenum, line.rstrip('\n')))

for linenum, line in uk_numbers:
    print("Line", linenum, ":", line)

tel.txt:
"tel:+44 1234567890"
tel: +44 1234567890
tel: +441234567890
tel:+441234567890
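One thing worth checking with the pattern above: `search` is satisfied by the first ten digits, so a line with eleven digits after +44 would still match. The sketch below tests the pattern against a few strings without needing a file. The `(?!\d)` lookahead and the two extra sample strings are additions for illustration and are not part of the original post.

```python
import re

# Same verbose pattern as in the post above, plus a negative lookahead
# (an addition, not in the original) so an 11th digit rejects the match.
pattern = re.compile(r'''
    tel:        # literal "tel:"
    \s*         # optional spaces
    \+44        # literal "+44"
    \s*         # optional spaces
    \d{10}      # ten digits ...
    (?!\d)      # ... not followed by another digit
''', re.X)

samples = [
    '"tel:+44 1234567890"',
    'tel: +44 1234567890',
    'tel:+4412345678901',   # eleven digits (made-up test case)
    'tel:+33 1234567890',   # wrong prefix (made-up test case)
]

# True where the line contains a +44 number with exactly ten digits.
results = [pattern.search(s) is not None for s in samples]
print(results)
```

Whether the stricter check is wanted depends on what the real phone log contains, so treat the lookahead as optional.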
Posts: 10
Threads: 2
Joined: Oct 2017
(Oct-26-2017, 04:14 PM)heiner55 Wrote: Maybe this will help:
I've tried doing this, but I'm having issues when running it.
I've changed "tel.txt" to "phonecalls.txt", which is the name of the file I am extracting the data from. Is this okay to do?
Many Thanks
Ronnie
Posts: 606
Threads: 3
Joined: Nov 2016
Posts: 10
Threads: 2
Joined: Oct 2017
(Oct-27-2017, 01:53 PM)heiner55 Wrote: That is ok.
Ok that's great, so my code is the following:
>>> data_file = open("phone_log.txt", "r")
>>> data = data_file.readlines()
>>> import re
>>> pattern = re.compile(r'''tel:\s*?\+44\s*?\d{10,10}''', re.X)
>>> uk_numbers = []
>>> with open('phone_log.txt') as in_file:
...     for linenum, line in enumerate(in_file):
...         if pattern.search(line) != None:
...             uk_numbers.append((linenum, line.rstrip('\n')))
...
>>> for linenum, line in uk_numbers:
...     print("Line", linenum, ":", line)
...
What must I do to print the UK phone numbers? It would be great to know, since I've been trying for the past hour without success.
Your help is very much appreciated.
Posts: 606
Threads: 3
Joined: Nov 2016
Oct-27-2017, 02:28 PM
(This post was last modified: Oct-27-2017, 04:27 PM by heiner55.)
Maybe this helps: https://docs.python.org/3.6/library/re.html
#!/usr/bin/python3
import re

pattern = r"""
    tel:        # tel:
    \s*?        # maybe some spaces
    \+44        # +44
    \s*?        # maybe some spaces
    (\d{10})    # 10 digits
"""

with open('phone_log.txt') as in_file:
    for linenr, line in enumerate(in_file):
        match = re.search(pattern, line, re.X)
        if match:
            print("Line %d: %s" % (linenr, match.group(1)))

phone_log.txt:
here is some text "tel:+44 1234567890" hers is some text
tel: +44 1234567890 text text
text tel text tel: +441234567890 text
tel:+441234567890
Output:
Line 0: 1234567890
Line 1: 1234567890
Line 2: 1234567890
Line 3: 1234567890
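The program above prints only the ten digits from the first match on each line. If the goal is the `uk_numbers` list from the first post, with the +44 prefix kept, something along these lines might work. It is only a sketch: the function name `extract_uk_numbers`, the `(?!\d)` lookahead, and the sample lines are assumptions made for illustration.

```python
import re

# Capture the ten digits after "tel:" and "+44"; the lookahead (an
# assumption, not in the original) rejects an eleventh digit.
PATTERN = re.compile(r'tel:\s*\+44\s*(\d{10})(?!\d)')

def extract_uk_numbers(lines):
    """Return (line_number, '+44' + ten digits) for every match in lines."""
    found = []
    for linenr, line in enumerate(lines):
        # finditer picks up more than one number on the same line
        for match in PATTERN.finditer(line):
            found.append((linenr, "+44" + match.group(1)))
    return found

# Made-up sample lines standing in for the real phone log.
sample = [
    'here is some text "tel:+44 1234567890" here is some text',
    'no number on this line',
    'tel: +441234567890 and tel:+44 0987654321',
]

uk_numbers = extract_uk_numbers(sample)
for linenr, number in uk_numbers:
    print("Line %d: %s" % (linenr, number))
```

Since an open file yields its lines one at a time, the same function can take `in_file` from `with open('phone_log.txt') as in_file:` in place of the sample list.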
Posts: 10
Threads: 2
Joined: Oct 2017
(Oct-27-2017, 02:28 PM)heiner55 Wrote: Maybe this helps: https://docs.python.org/3.6/library/re.html
Ah, yes.
What I mean is: how do I get that output? What must I do to produce it?
Posts: 606
Threads: 3
Joined: Nov 2016
Oct-27-2017, 03:35 PM
(This post was last modified: Oct-27-2017, 03:35 PM by heiner55.)
print("Line ", linenr, ": ", match[1], sep='')
Posts: 10
Threads: 2
Joined: Oct 2017
(Oct-27-2017, 03:35 PM)heiner55 Wrote: print("Line ", linenr, ": ", match[1], sep='')
Do I type this in the Python Shell?
Posts: 606
Threads: 3
Joined: Nov 2016
Oct-27-2017, 03:42 PM
(This post was last modified: Oct-27-2017, 03:42 PM by heiner55.)
You get the output if you run the sample program (see above) with the input file phone_log.txt.
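In practice that means saving the program to a file instead of typing it line by line into the shell, and running it from the folder that contains phone_log.txt. A minimal sketch, assuming the script is saved as extract_uk.py (a made-up name) and started with `python3 extract_uk.py`; the function name `print_uk_numbers` is also an assumption, and the `FileNotFoundError` handling is taken from the first post:

```python
import re

# Same idea as the sample program: "tel:", optional spaces, "+44",
# optional spaces, then ten digits captured as group 1.
PATTERN = re.compile(r'tel:\s*\+44\s*(\d{10})')

def print_uk_numbers(path):
    """Print each matching number in the file; return how many were found."""
    count = 0
    try:
        with open(path) as in_file:
            for linenr, line in enumerate(in_file):
                match = PATTERN.search(line)
                if match:
                    print("Line %d: %s" % (linenr, match.group(1)))
                    count += 1
    except FileNotFoundError:
        # The file must be in the folder the script is run from.
        print("Input file not found:", path)
    return count

if __name__ == "__main__":
    print_uk_numbers("phone_log.txt")
```

If "Input file not found" appears, the script is most likely being run from a different folder than the one holding the text file.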