Python Forum
[SOLVED] [Beautiful Soup] How to deprettify?
#1
Hello,

I made the mistake of using soup.prettify() to save soups to files, and I now have extra whitespace that shows up as useless spaces when viewing the files in an HTML WYSIWYG editor.

The following code fails to remove that whitespace.

Before I write a Python script to run the files through Tidy instead, does someone know if it can be fixed with BS?

Thank you.

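import glob
from pathlib import Path

from bs4 import BeautifulSoup

for file in glob.glob("*.html"):
	BASE = Path(file).stem
	OUTPUTFILE = f"{BASE}.CONV.html"

	# Read as bytes so the parser can detect the encoding itself
	with open(file, "rb") as inp:
		soup = BeautifulSoup(inp, "lxml")
	for tag in soup.find_all():
		# tag.string is None when a tag has more than one child
		if tag.string:
			tag.string.replace_with(' '.join(tag.string.split()))
			print(tag.string)
		else:
			print(tag.name, " no string")

	with open(OUTPUTFILE, 'w', encoding='utf-8') as outp:
		outp.write(str(soup))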
#2
To show the problem.
from bs4 import BeautifulSoup

html = '''\
<body>
  <h1>This is a Heading</h1>
  <p>This is a paragraph</p>
  <p>blue car</p>
</body>'''

soup = BeautifulSoup(html, 'lxml')
print(soup.prettify())
print('-' * 25)
print(str(soup))
Output:
<body>
 <h1>
  This is a Heading
 </h1>
 <p>
  This is a paragraph
 </p>
 <p>
  blue car
 </p>
</body>
-------------------------
<body>
  <h1>This is a Heading</h1>
  <p>This is a paragraph</p>
  <p>blue car</p>
</body>
So the extra newlines are annoying (I tried to fix this a long time ago); now I just use the workarounds below.
An easy fix is to use an online HTML formatter, e.g. Code Beautify.
Or install Prettier, which has a command-line tool: e.g. prettier --write . formats all HTML files in a folder.
G:\div_code\html_file
λ prettier --write .
h1.html 170ms
h2.html 5ms
Then the output of both BS options above will be correctly formatted HTML.
Output:
<body>
  <h1>This is a Heading</h1>
  <p>This is a paragraph</p>
  <p>blue car</p>
</body>
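If you would rather stay inside BS, here is a minimal sketch of one possible approach, not a tested solution for every file. The loop in the first post misses most of the whitespace because tag.string is None for any tag with more than one child, so the whitespace-only strings between tags are never visited. Walking the text nodes directly avoids that. The function name deprettify and the list of skipped tags are my own choices, and I use html.parser here only so the snippet does not wrap the fragment in html/body tags.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Tags whose whitespace is significant and should be left alone (assumed list).
KEEP_WHITESPACE = ["pre", "textarea", "script", "style"]


def deprettify(html: str) -> str:
    """Undo most of the whitespace that soup.prettify() adds (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all(string=True) returns every text node, not just single-child ones.
    for node in soup.find_all(string=True):
        # Skip comments, doctypes, CDATA and similar subclasses.
        if type(node) is not NavigableString:
            continue
        if node.find_parent(KEEP_WHITESPACE):
            continue
        collapsed = " ".join(node.split())
        if collapsed:
            node.replace_with(NavigableString(collapsed))
        else:
            # Whitespace-only node: pure indentation between tags.
            node.extract()
    return str(soup)


pretty = BeautifulSoup(
    "<body><h1>This is a Heading</h1><p>blue car</p></body>", "html.parser"
).prettify()
print(deprettify(pretty))
```

One caveat: prettify() puts every inline tag on its own line, so a meaningful space such as the one in "Hello <b>world</b>" cannot be told apart from indentation afterwards, and this sketch removes it. For files with a lot of inline markup, Prettier or Tidy is probably the safer route.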
#3
Thank you.