Hi all,
I have been looking everywhere for an example of this.
I want to delete all of the HTML except for the elements whose classes I have listed.
The idea is below, but the code is not correct:
    from bs4 import BeautifulSoup

    html = '''\
    <h2 class="1">section1</h2>
    <p class="2">article1</p>
    <p>article2</p>
    <p class="3">article3</p>
    <h1> Lorem Ipsum</h1>
    <p> 3 Lorem ipsum dolor </p>
    '''
    soup = BeautifulSoup(html, 'lxml')
    for tag in soup():
        if not class in ["1", "2"]:
            tag.decompose()
    print(soup)

I can't find any code samples that show this idea.
Basically, delete all HTML except for the elements whose classes are in the list.
Result:
Everything deleted except:
<h2 class="1">section1</h2>
<p class="2">article1</p>
<p class="3">article3</p>
Please advise. Thank you.
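
For reference, here is a minimal sketch of one way this might be done. It is not necessarily the only or best approach. It collects the tags to keep instead of decomposing everything else, because with the 'lxml' parser the wrapping <html> and <body> tags have no class, so decomposing every non-matching tag would also remove everything inside them. The names KEEP_CLASSES and keep_only_classes are made up for this example. The sketch uses the built-in "html.parser" rather than 'lxml', and it assumes class "3" is meant to be in the list, since the expected result includes it.

```python
from bs4 import BeautifulSoup

html = """\
<h2 class="1">section1</h2>
<p class="2">article1</p>
<p>article2</p>
<p class="3">article3</p>
<h1> Lorem Ipsum</h1>
<p> 3 Lorem ipsum dolor </p>
"""

# Assumption: "3" is included because the expected result lists it.
KEEP_CLASSES = {"1", "2", "3"}


def keep_only_classes(markup, keep):
    """Return only the elements whose class attribute contains a class in `keep`."""
    soup = BeautifulSoup(markup, "html.parser")

    def has_kept_class(tag):
        # Beautiful Soup returns `class` as a list of strings; default to [] if absent.
        return bool(keep.intersection(tag.get("class", [])))

    # Skip matches nested inside another match so nothing is printed twice.
    kept = [
        tag
        for tag in soup.find_all(has_kept_class)
        if not any(has_kept_class(parent) for parent in tag.parents)
    ]
    return "\n".join(str(tag) for tag in kept)


result = keep_only_classes(html, KEEP_CLASSES)
print(result)
```

With the sample HTML above, this should print only the three elements listed under "Result". Anything without one of the listed classes, including plain text between tags, is dropped.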