Python Forum
XML (utf-8) question
#2
BS4 (Beautiful Soup 4) is better in most respects, and its Unicode support is very good.
I never use the parsers in the standard library; same with urllib, where I use Requests instead.
The standard library has strong modules that sit on a stable platform and do not need much changing,
but for parsing and HTTP stuff it is better to use modules that keep up with the rapid changes of the web.
doc Wrote:Beautiful Soup uses a sub-library called Unicode, Dammit to detect a document’s encoding and convert it to Unicode
When you write out a document from Beautiful Soup, you get a UTF-8 document, even if the document wasn’t in UTF-8 to begin with
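To make the quoted behaviour a bit more concrete, here is a rough stdlib-only sketch of the idea: bytes in some other encoding get decoded to Unicode, then written back out as UTF-8. This is not how Unicode, Dammit works internally. It detects the encoding for you, while here the source encoding (Latin-1) is simply assumed to be known.

```python
# Sketch only: the source encoding is ASSUMED to be Latin-1.
# Unicode, Dammit would try to detect it instead of being told.
raw = '<identity>testç</identity>'.encode('latin-1')  # bytes as they might arrive

text = raw.decode('latin-1')        # bytes -> Unicode str
utf8_bytes = text.encode('utf-8')   # Unicode str -> UTF-8 bytes

print(raw)         # b'<identity>test\xe7</identity>'
print(utf8_bytes)  # b'<identity>test\xc3\xa7</identity>'
```

The same character ç is one byte in Latin-1 and two bytes in UTF-8, which is why you have to know (or detect) the input encoding before converting.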
As I just posted an answer here, we can use that code and do some Unicode stuff (the 'xml' parser needs lxml installed).
from bs4 import BeautifulSoup

xml = '''\
<provider>
  <identity>chess king♟♜♞</identity>
  <endpoint>some point.com</endpoint>
</provider>'''

soup = BeautifulSoup(xml, 'xml')
>>> result = soup.find('identity')
>>> result
<identity>chess king♟♜♞</identity>
>>> result.string.replace_with("testç")
'chess king♟♜♞'

>>> soup
<?xml version="1.0" encoding="utf-8"?>
<provider>
<identity>testç</identity>
<endpoint>some point.com</endpoint>
</provider>

>>> result = soup.find('identity')
>>> result
<identity>testç</identity>
>>> result.string = '♟♜♞'

>>> soup
<?xml version="1.0" encoding="utf-8"?>
<provider>
<identity>♟♜♞</identity>
<endpoint>some point.com</endpoint>
</provider>
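If the changed document should end up in a file, it is probably safest to pass the encoding explicitly, since the default encoding of open() depends on the platform. A minimal sketch, assuming the XML text is already a normal string: the hard-coded xml_text below stands in for str(soup), and the file name provider.xml is made up for the example.

```python
import os
import tempfile

# Stand-in for str(soup) from the session above (hard-coded for this sketch).
xml_text = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<provider>\n'
    '<identity>♟♜♞</identity>\n'
    '<endpoint>some point.com</endpoint>\n'
    '</provider>'
)

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'provider.xml')

# Write as UTF-8 explicitly, do not rely on the platform default.
with open(path, 'w', encoding='utf-8') as f:
    f.write(xml_text)

# Read it back the same way to check that the chess symbols survived.
with open(path, encoding='utf-8') as f:
    round_trip = f.read()

print(round_trip == xml_text)  # True
```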


Messages In This Thread
XML (utf-8) question - by DPaul - Mar-25-2020, 10:00 AM
RE: XML (utf-8) question - by snippsat - Mar-25-2020, 05:29 PM
RE: XML (utf-8) question - by DPaul - Mar-25-2020, 06:49 PM
RE: XML (utf-8) question - by snippsat - Mar-25-2020, 07:59 PM
RE: XML (utf-8) question - by DPaul - Mar-26-2020, 07:37 AM
