Python Forum

[SOLVED] [Beautiful Soup] Replace tag.string from another file?
Hello,

In an HTML page ("target"), I'd like to replace the placeholder text in the <div id="body"> element with the contents of the <body> from another file ("source").

None of ".string", ".text", or "str()" seems to be the right way to do it :-/ Does anyone know how?

Thank you.

""" target
<body>
	<div id="body">FILL_ME</div>
</body>
"""

#TypeError: decoding to str: need a bytes-like object, NoneType found
target.body.find('div', id='body').string = source.body.string

#Tags are escaped: the output shows &lt; instead of <
target.body.find('div', id='body').string = str(source.body)

#plain text, tags gone
target.body.find(id="body").string.replace_with(source.body.text)

#TypeError: decoding to str: need a bytes-like object, Tag found
target.body.find('div', id='body').string = source.body

#AttributeError: 'NoneType' object has no attribute 'replace_with'
target.body.find(string="FILL_ME").replace_with(source.body)
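
For anyone following along, here is a minimal sketch of why the first two attempts behave that way. The source markup is an assumption (it borrows the `<i>good stuff</i>` example used further down the thread), and the names `body_string` and `escaped` are only for illustration.

```python
from bs4 import BeautifulSoup

source = BeautifulSoup("<body>\n\t<i>good stuff</i>\n</body>", "html.parser")
target = BeautifulSoup('<body>\n\t<div id="body">FILL_ME</div>\n</body>', "html.parser")

# <body> has three children (whitespace, <i>, whitespace),
# and .string is only defined when a tag has exactly one child.
body_string = source.body.string
print(body_string)

# Assigning a str to .string stores it as text, not as markup,
# so the angle brackets are escaped when the tree is printed.
div = target.find("div", id="body")
div.string = str(source.body.i)
escaped = str(div)
print(escaped)
```

So `.string` is the wrong tool on both sides: it reads as None for a tag with several children, and it only ever writes text.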
Still stuck :-/

The goal is to 1) grab "<i>good stuff</i>" from the input, and 2) use it to replace "FILL_ME" in the output:

html_in = """
<body>
	<i>good stuff</i>
</body>
"""

html_out = """
<body>
	<div id="body">FILL_ME</div>
</body>
"""
import copy
from bs4 import BeautifulSoup

soup_in = BeautifulSoup(html_in, "html.parser")
soup_out = BeautifulSoup(html_out, "html.parser")
body_copy = copy.copy(soup_in.body)
#print(body_copy)

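#ValueError: Cannot replace an element with its contents when that element is not part of a tree.
#body_copy.unwrap()
#stuff = body_copy.unwrap()

#Attempts tried one at a time; the result of each is noted on the right.
#stuff = body_copy.string      # None
#stuff = body_copy.text        # plain text, tags gone
#stuff = body_copy.get_text()  # plain text, tags gone
stuff = str(body_copy)         # tags come out escaped (&lt;) once inserted
print(stuff)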

#soup_out.find("div", id="body").string.replace_with(body_copy)
soup_out.find("div", id="body").string.replace_with(stuff)
print(soup_out)
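
One possible way past this, sketched under assumptions rather than offered as the definitive fix: `decode_contents()` returns a tag's inner markup as a string, and parsing that string again yields real nodes that can be appended instead of text. The name `fragment` is hypothetical.

```python
from bs4 import BeautifulSoup

html_in = "<body>\n\t<i>good stuff</i>\n</body>"
html_out = '<body>\n\t<div id="body">FILL_ME</div>\n</body>'

soup_in = BeautifulSoup(html_in, "html.parser")
soup_out = BeautifulSoup(html_out, "html.parser")

# Inner markup of <body> as a plain string, tags included
stuff = soup_in.body.decode_contents()

# Re-parse the string so it becomes nodes rather than text
fragment = BeautifulSoup(stuff, "html.parser")

div = soup_out.find("div", id="body")
div.clear()
# Iterate over a snapshot, because append() moves each node out of fragment
for node in list(fragment.contents):
    div.append(node)

print(soup_out)
```

The round trip through a string is not strictly needed, but it keeps `soup_in` untouched.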
For others' benefit, here is what worked:

html_in = """
<body>
	<i>good stuff</i>
	
<i>more good stuff</i>

</body>
"""

html_out = """
<body>
	<div id="body">FILL_ME</div>
</body>
"""

from bs4 import BeautifulSoup

soup_in = BeautifulSoup(html_in, "html.parser")
soup_out = BeautifulSoup(html_out, "html.parser")

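#Find the placeholder div and remove its "FILL_ME" text
new_body = soup_out.find('div', id='body')
new_body.clear()
#Move the source body's child nodes (tags and text) into the div;
#list() takes a snapshot so the iteration is safe while nodes are moved
new_body.extend(list(soup_in.body.contents))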

print(soup_out)
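
One thing to be aware of: `extend()` moves the nodes out of `soup_in` rather than copying them. If the source document needs to stay intact, something along these lines might do, assuming a bs4 version where `copy.copy()` on a tag also copies its children. The helper name `fill_div_from_body` and its `div_id` parameter are made up for illustration.

```python
import copy
from bs4 import BeautifulSoup

def fill_div_from_body(soup_out, soup_in, div_id="body"):
    """Replace the contents of <div id=div_id> in soup_out with copies
    of the children of soup_in's <body>. Hypothetical helper."""
    div = soup_out.find("div", id=div_id)
    if div is None or soup_in.body is None:
        raise ValueError("target div or source body not found")
    # Work on a detached copy so soup_in keeps its own nodes
    body_copy = copy.copy(soup_in.body)
    div.clear()
    # Snapshot the list, since each node leaves body_copy as it is moved
    div.extend(list(body_copy.contents))
    return soup_out

soup_in = BeautifulSoup(
    "<body><i>good stuff</i><i>more good stuff</i></body>", "html.parser"
)
soup_out = BeautifulSoup('<body><div id="body">FILL_ME</div></body>', "html.parser")

fill_div_from_body(soup_out, soup_in)
print(soup_out)
print(soup_in)
```

Both documents should end up with the two `<i>` tags, which makes the helper reusable if the same source has to fill several targets.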