Jun-09-2021, 01:34 PM
Hi Experts,
Small sample:
<span id="AccordionHeaderText52741" class="clsAccordionHeaderText"><a id="AccordionHeaderTab52741" tabIndex="101" onkeydown="websys_ToggleAccordion('52741',event);">Billing</a></span>
a regular expression, which works in https://regex101.com/ is
",event\);\">(.*)<\/a.*"
The same works in the Python editor, where it pulls out "Billing":
import re
with open('testregpad.txt') as f:
    for line in f:
        match = re.search('event\);\">(.*)<\/a.*', line)
        print(match.group(1))
output: Billing
But when I provide the entire page source to pull all the matching strings from the complete file, the code fails.
Here is the error message I get in both Sublime and Jupyter.
Error:
Traceback (most recent call last):
File "C:\Data\Academic\Python\RegExtractData", line 5, in <module>
print(match.group(1))
AttributeError: 'NoneType' object has no attribute 'group'
[Finished in 120ms]
Help is appreciated.
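A possible explanation for the traceback above, offered as a sketch rather than a confirmed diagnosis: re.search returns None for any line that does not contain the pattern, and a full page source will have many such lines, so calling match.group(1) unconditionally fails on the first one. The version below skips non-matching lines by using findall. The non-greedy (.*?) and the helper name extract_labels are my own assumptions and are not part of the original code.

```python
import re

# Non-greedy capture so that several anchors on one line are matched separately.
PATTERN = re.compile(r'event\);">(.*?)</a')

def extract_labels(lines):
    """Collect the captured text from every match; lines without a match add nothing."""
    labels = []
    for line in lines:
        # findall returns an empty list when nothing matches, so there is no None to handle
        labels.extend(PATTERN.findall(line))
    return labels

# Small stand-in for the page source: only the middle line contains the pattern.
sample = [
    '<html><body>',
    '<span id="AccordionHeaderText52741" class="clsAccordionHeaderText">'
    '<a id="AccordionHeaderTab52741" tabIndex="101" '
    'onkeydown="websys_ToggleAccordion(\'52741\',event);">Billing</a></span>',
    '</body></html>',
]
print(extract_labels(sample))
```

To run it against the real file, an open file object can be passed in place of the list, for example extract_labels(f) inside the existing with open('testregpad.txt') as f: block. If re.search is preferred, checking "if match:" before calling match.group(1) should avoid the same error.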