Python Forum
XML (utf-8) question
#3
Hi,

Thanks, your BeautifulSoup method seems very strong, especially with the chess symbols!
I've used BeautifulSoup for scraping, but this problem is about simple XML database writing and reading, with no web involved.

After some searching and testing I found a solution, and I have drawn some conclusions:
1. Python does not write utf-8 out of the box here: ElementTree's write() defaults to US-ASCII unless you pass encoding='utf-8'.
2. I convert the text string before attaching it to the tree, like so:

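s = '>testç"<@/€#ê'
# str.encode() returns a new bytes object and leaves s unchanged,
# so as written this call has no effect on what is attached below.
s.encode('utf-8')
xxx.text = s  # xxx is the Element being filled; it receives the original str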
3. When reading the database from file with ElementTree, the output looks wrong when printed with ET.dump(...).
But this works perfectly:
for field in child:
    print('Field:', field.tag,':', field.text)
4. It would seem that Python's default for reading is utf-8.

5. No, I can't do chess symbols with that; I tried. :-(
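
For reference, the points above can be sketched as a small write/read round trip with ElementTree. This is only a sketch under assumptions: the element names db, record and field, the helper names write_db and read_db, and the file name db.xml are made up for illustration and are not taken from the real database. It passes encoding="utf-8" to write() and attaches the strings without calling encode().

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_db(path, values):
    """Write each string in `values` as a <field> inside one <record>.

    The tag names are hypothetical placeholders.
    """
    root = ET.Element("db")
    record = ET.SubElement(root, "record")
    for value in values:
        field = ET.SubElement(record, "field")
        field.text = value  # plain str; no .encode() call is needed
    # encoding="utf-8" asks ElementTree to emit real UTF-8 bytes instead of
    # falling back to US-ASCII with numeric character references.
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def read_db(path):
    """Return the text of every field, in document order."""
    root = ET.parse(path).getroot()
    return [field.text for record in root for field in record]


values = ['>testç"<@/€#ê', "♔♕♖♗♘♙"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "db.xml")  # hypothetical file name
    write_db(path, values)
    restored = read_db(path)
    with open(path, "rb") as f:
        raw = f.read()  # raw bytes, to inspect what was actually written

# Compare instead of printing the symbols, so the check does not depend on
# what the console is able to display.
print("Round trip OK:", restored == values)
```

If the chess symbols survive in the file but not on screen, the limit may be the console's encoding rather than the XML file itself.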
Dpaul


Messages In This Thread
XML (utf-8) question - by DPaul - Mar-25-2020, 10:00 AM
RE: XML (utf-8) question - by snippsat - Mar-25-2020, 05:29 PM
RE: XML (utf-8) question - by DPaul - Mar-25-2020, 06:49 PM
RE: XML (utf-8) question - by snippsat - Mar-25-2020, 07:59 PM
RE: XML (utf-8) question - by DPaul - Mar-26-2020, 07:37 AM
