Aug-11-2022, 10:39 PM
Hello everyone, I am having a problem capturing the numbers that follow the h3 tags. The h3 text is highlighted in red, and the numbers I am trying to capture are in bold italics. I updated the following script, which processes the paragraphs, once I noticed that numbers also appear after other h3 tags. Unfortunately, I am having a difficult time using two lists to find the first number after each h3 tag. How can I use regex inside the if statement to find the first digits after each h3 tag, so that I capture the 1 after "He poured rocks" and the 6 after "When Cooking"? The current script only pulls the first number and not the 6 after the next h3 tag. What modifications do I need to make to solve this issue? Thank you for your assistance in this matter.
for dc in soup.findAll('div', {'class':'flex flex-auto flex-col bg-white shadow-md'}):
    txt = re.sub('\[(.*?)\]','',dc.text) #Take out text on and between []
    ##print('div class text',txt) #Print div class text
    for h3 in soup.findAll('h3', {'class':'font-medium block subject-heading text-xl mb-3 mt-5'}):
        hcls = h3.text #H3 Class text
        print('This is h3 class text ',h3.text)
        if hcls in txt:
            fn = re.findall(r'^hcls|[\d+|$]',txt)[0] #find First Numbers
            print('This is first number after the h3 tag',fn)
Quote:data
He poured rocks 1 in the dungeon of his mind.
Joyce enjoyed eating pancakes with ketchup.
2
I think I will buy the red car, or I will lease the blue one.
3
He had decided to accept his fate of accepting his fate.
The tumbleweed refused to tumble at 8 but was more than willing to prance.
The two walked down the slot canyon for 2 miles oblivious to the sound of thunder in the distance
4
He used to get confused between soldiers and shoulders, but as a military man, he now soldiers responsibility.
5
Harrold felt confident that nobody would ever suspect his 2 spy pigeons.
When Cooking 6 It smells very delicious in the kitchen.
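One possible direction, offered as a sketch rather than a definitive fix: in the pattern r'^hcls|[\d+|$]', the letters hcls are matched literally (the variable is never substituted into the pattern), and [\d+|$] is a character class that matches a single digit, +, | or $. So re.findall(...)[0] returns the first digit found anywhere in txt, which is always the 1. The example below builds the pattern from the heading text and captures the first run of digits that follows it. The names headings and first_number_after are hypothetical, the heading strings are assumed to be what h3.text returns (possibly after .strip()), and the sample text stands in for txt.

```python
import re

# Sample data from the post, standing in for the cleaned div text (txt).
txt = """He poured rocks 1 in the dungeon of his mind.
Joyce enjoyed eating pancakes with ketchup.
2
I think I will buy the red car, or I will lease the blue one.
3
He had decided to accept his fate of accepting his fate.
The tumbleweed refused to tumble at 8 but was more than willing to prance.
4
Harrold felt confident that nobody would ever suspect his 2 spy pigeons.
When Cooking 6 It smells very delicious in the kitchen."""

# Assumed h3 headings; in the real script these would come from h3.text.
headings = ["He poured rocks", "When Cooking"]


def first_number_after(heading, text):
    """Return the first run of digits following `heading` in `text`, or None."""
    # re.escape makes the heading match literally. \D* skips any non-digit
    # characters (including newlines), and (\d+) captures the first number.
    match = re.search(re.escape(heading) + r"\D*(\d+)", text)
    return match.group(1) if match else None


for heading in headings:
    print(heading, "->", first_number_after(heading, txt))
```

With the sample data this prints "He poured rocks -> 1" and "When Cooking -> 6". Because re.search returns None when the heading is absent, the separate "if hcls in txt" check becomes optional in this version.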