Jul-14-2018, 10:52 AM
Hi all,
I am trying to extract a heading and a title, but something is not quite right with my code:
from bs4 import BeautifulSoup

html = '''\
<h2 class="Title">section1</h2>
<p class ="mainparagraph">article1</p>
<p>article2</p>
<p>article3</p>
<h2>section2</h2>
<span class="1"> hello 1 </span>
<p>article4</p>
<p>article5</p>
<h2 class="2"> hello </h2>
<p>article6</p>
<span class="2"> hello 2 </span>
<h1> Lorem Ipsum</h1>
<p> 1 Lorem ipsum dolor </p>
<h2> Lorem Ipsum</h1>
<p> 2 Lorem ipsum dolor </p>
<h1> Lorem Ipsum</h1>
<p> 3 Lorem ipsum dolor </p>",'lxml')
'''

soup = BeautifulSoup(html, 'lxml')
#soup = BeautifulSoup(open("a.html"),'lxml')

links = soup.findAll('h2', {'class': ['Title']},limit=1)
with open('New.txt','w') as Output_File:
    for link in links:
        names1 = link.contents[0]
        links = soup.find('p', {'class': ['mainparagraph']})
        names2 = link.contents[0]
        names2.extract()
        Output_File.write(print,names1.extract()+ '\n', names2.extract())
I am not sure whether I am meant to append the results.
Thank you
:)
Python newbie trying to learn the ropes
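
One possible fix, offered as a minimal sketch rather than the definitive answer. It assumes the goal is to write the text of the first `h2` with class `Title` and the first `p` with class `mainparagraph` to `New.txt`. Three things in the posted code look like the cause: `write()` accepts a single string, so passing `print` and several arguments fails; `names2` is read from `link` (the heading) instead of from the result of the `find('p', ...)` call; and `extract()` removes a node from the tree, which is not needed just to read its text. Appending should not be necessary for a single pair of values, since both are written in one call. The sample markup is trimmed, the helper name `extract_heading_and_paragraph` is made up for illustration, and `html.parser` is used instead of `lxml` only so the sketch has no extra dependency.

```python
from bs4 import BeautifulSoup

# Trimmed sample of the markup from the post (an assumption for brevity).
html = """\
<h2 class="Title">section1</h2>
<p class="mainparagraph">article1</p>
<p>article2</p>
<h2>section2</h2>
<p>article4</p>
"""


def extract_heading_and_paragraph(markup):
    """Return the text of the first h2.Title and the first p.mainparagraph."""
    # 'html.parser' ships with Python; 'lxml' as in the post should also work.
    soup = BeautifulSoup(markup, "html.parser")

    # find() returns the first match, or None when nothing matches.
    heading = soup.find("h2", class_="Title")
    paragraph = soup.find("p", class_="mainparagraph")

    # get_text() reads the text without removing the tag from the tree.
    heading_text = heading.get_text(strip=True) if heading else ""
    paragraph_text = paragraph.get_text(strip=True) if paragraph else ""
    return heading_text, paragraph_text


heading_text, paragraph_text = extract_heading_and_paragraph(html)

# write() takes exactly one string, so join the pieces before writing.
with open("New.txt", "w") as output_file:
    output_file.write(heading_text + "\n" + paragraph_text + "\n")
```

If several heading and paragraph pairs are needed later, collecting them in a list and writing them inside the same `with` block would probably be simpler than reopening the file in append mode.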