Mar-08-2022, 02:33 AM
The tags are in the following format:
<pre>
<a href="../">../</a>
<a href="the text I want to extract">the text I want to extract/</a>
" 29-Nov-2021 02:19 - "
5000 more tags in this format ....
</pre>
There are 5000 tags in this format, and I am trying to extract the A tags based on the timestamps in the text after each tag. For example, I want all the tags whose trailing text contains "29-Nov-2021".
Can I use Beautiful Soup to achieve this?
Something like:
def time_in_text(tag):
    return tag.name == 'a' and '29-Nov-2021' in tag.get_text()
Results = soup.find(time_in_text)
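One thing to watch with the attempt above: `tag.get_text()` only returns the text inside the tag itself, and `soup.find` stops at the first match. Here is a minimal sketch of one possible approach, assuming the timestamp sits in the text node immediately after each link and is written without spaces, as in "29-Nov-2021". The sample HTML, the folder names, and the helper name `links_with_date` are made up for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical sample modeled on the listing in the question; the real page
# may differ in whitespace and layout.
SAMPLE_HTML = """<pre>
<a href="../">../</a>
<a href="folder-a/">folder-a/</a>   29-Nov-2021 02:19    -
<a href="folder-b/">folder-b/</a>   30-Nov-2021 11:05    -
<a href="folder-c/">folder-c/</a>   29-Nov-2021 17:40    -
</pre>"""


def links_with_date(soup, date_text):
    """Return every <a> tag whose immediately following text contains date_text."""
    matches = []
    for tag in soup.find_all("a"):
        following = tag.next_sibling
        # The timestamp is a sibling text node, not part of the <a> tag itself,
        # so it has to be read from next_sibling rather than tag.get_text().
        if isinstance(following, NavigableString) and date_text in following:
            matches.append(tag)
    return matches


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
results = links_with_date(soup, "29-Nov-2021")
hrefs = [tag["href"] for tag in results]
print(hrefs)
```

If the real listing puts something else between the link and the date (another element, for instance), the `next_sibling` check would need adjusting.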