I guess you are using BeautifulSoup.
Doing it like this messes up the original structure, because it also splits the sentences.
As you don't show the HTML, it's not easy to help.
Here is a quick example; note that the sentences don't get split up here.
from bs4 import BeautifulSoup

html = '''\
<body>
  <h1>This is a Heading</h1>
  <p>This is a paragraph</p>
  <p>blue car</p>
</body>'''

soup = BeautifulSoup(html, 'lxml')
>>> ptag = soup.find_all('p')
>>> ptag
[<p>This is a paragraph</p>, <p>blue car</p>]
>>>
>>> for t in ptag:
...     print(t.text)
...
This is a paragraph
blue car
>>> lst = [t.text for t in ptag]
>>> lst
['This is a paragraph', 'blue car']
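Since your code isn't shown, here is a rough sketch of what I assume is going wrong: calling .split() on all of the page text breaks every sentence into single words, while taking the text of each tag keeps one sentence per list item. The names words and sentences are just made up for this illustration, and I use the built-in 'html.parser' here so that lxml is not required.

```python
from bs4 import BeautifulSoup

html = '''\
<body>
<h1>This is a Heading</h1>
<p>This is a paragraph</p>
<p>blue car</p>
</body>'''

# 'html.parser' ships with Python, so no extra parser install is needed
soup = BeautifulSoup(html, 'html.parser')

# Assumed problem: splitting all text on whitespace gives single words
words = soup.get_text().split()

# Taking the text per <p> tag keeps each sentence as one string
sentences = [p.get_text(strip=True) for p in soup.find_all('p')]

print(words)
print(sentences)
```

If your real HTML nests the text in other tags, the same idea should still apply; only the tag name passed to find_all() would change.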