Python Forum
Beautiful soup truncates results

Beautiful soup truncates results - jonesjoz - Mar-07-2020

When scraping a long web page, the printed results get truncated. Any advice?
import requests
from bs4 import BeautifulSoup
URL = 'https://www.mobileread.com/forums/showthread.php?t=285771'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.text)



RE: Beautiful soup truncates results - Larz60+ - Mar-08-2020

Quote: printed results get truncated
How so?


RE: Beautiful soup truncates results - jonesjoz - Mar-08-2020

(Mar-08-2020, 03:52 AM)Larz60+ Wrote:
Quote: printed results get truncated
How so?

The code prints only the first 992 lines of text, which is not the whole page. There is still more to follow.

<sample output>
<many lines>
I have updated the plugin in the first post so that it manages angular brackets that contain recognised HTML tags.

I tested it with variations of the following text file:

This is a line of text
This is <HELLO> <i>another</i>
<end sample>


RE: Beautiful soup truncates results - Larz60+ - Mar-08-2020

It's there; you just can't see it because there is no formatting.
Change line 6 to
print(soup.prettify())
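
To illustrate the difference between the two calls, here is a minimal sketch that runs offline. The html string is a made-up stand-in, not the real page from the thread; only soup.text and soup.prettify() come from the posts above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page (not the real thread HTML).
html = (
    "<html><body><div>"
    "<p>This is a line of text</p>"
    "<p>This is &lt;HELLO&gt; <i>another</i></p>"
    "</div></body></html>"
)

soup = BeautifulSoup(html, "html.parser")

# .text keeps only the visible text: tags are dropped, entities are decoded.
text_only = soup.text

# .prettify() returns the whole parse tree as markup, one tag per line.
pretty = soup.prettify()

print(text_only)
print("-" * 20)
print(pretty)
```

The .text output has no tags and, for this sample, no line breaks, while .prettify() puts every tag on its own indented line. That makes it easier to see whether the later part of the page was parsed at all.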


RE: Beautiful soup truncates results - jonesjoz - Mar-09-2020

Strangely enough, the original code works fine on a different computer. .prettify() did make more tags visible than .text alone.
Thank you all. - JJ
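
Since the same code behaved differently on another computer, one possible explanation (an assumption, not something confirmed in this thread) is that the console cut off the output rather than Beautiful Soup. A rough way to check is to write the text to a file and count the lines, instead of relying on what the terminal shows. The helper name dump_and_count, the file name, and the sample text below are hypothetical; in real use you would pass soup.text.

```python
import tempfile
from pathlib import Path


def dump_and_count(text, path):
    """Write text to a UTF-8 file and return (characters, lines) written."""
    Path(path).write_text(text, encoding="utf-8")
    return len(text), len(text.splitlines())


# Hypothetical stand-in for soup.text: 2000 short lines.
sample_text = "\n".join(f"line {i}" for i in range(2000))

# Hypothetical output location in the system temp directory.
out_path = Path(tempfile.gettempdir()) / "soup_text_dump.txt"

chars, lines = dump_and_count(sample_text, out_path)
print(f"wrote {chars} characters in {lines} lines to {out_path}")
```

If the file holds more lines than the console displayed, the data was complete and only the display was cut short.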