May-12-2019, 02:19 PM
(This post was last modified: May-12-2019, 02:19 PM by SnoopFrogg.)
I'm currently reading "Automating The Boring Stuff With Python" and I had a quick question about an example program the author includes at the end of the chapter. The code is as follows:
#! Python3 # phoneAndEmail.py - Finds phone numbers and email addresses on the clipboard import pyperclip, re # Create phone number regex phoneRegex = re.compile(r'''( (\d{3}|\(\d{3}\))? # Area code (\s|-|\.)? # Separator (\d{3}) # First 3 digits (\s|-|\.) # Separator (\d{4}) # Last 4 digits (\s*(ext|x|ext.)\s*(\d{2,5}))? # Extension )''', re.VERBOSE) # Create email regex emailRegex = re.compile(r'''( [a-zA-Z0-9._%+-]+ # Username @ # @ symbol [a-zA-Z0-9.-]+ # Domain name (\.[a-zA-Z]{2,4}) # Dot-something )''', re.VERBOSE) # Find matches in clipboard text text = str(pyperclip.paste()) matches = [] # Store the matches found for groups in phoneRegex.findall(text): # phoneNum contains a string built from groups 1, 3, 5 and 8 of the matched text # These groups are the area code, first three digits, last four digits, and extension phoneNum = '-'.join([groups[1], groups[3], groups[5]]) if groups[8] != '': phoneNum += ' x' + groups[8] matches.append(phoneNum) # Append group 0 of each match to get the entire regular expression for groups in emailRegex.findall(text): matches.append(groups[0]) # Copy results to the clipboard if len(matches) > 0: pyperclip.copy('\n'.join(matches)) print('Copied to clipboard:') print('\n'.join(matches)) else: print('No phone numbers or email addresses found.')I get that line 25 creates an empty lists to store the matches found but what confuses me is lines 30-33. How do you know which part of the string is part of what group? I'm asking this because I'm working on a similar problem where I need to find website URLs that begin with http:// or https://. I also have the code to that program if you'd like to see what I have so far. Thanks in advance!