Python Forum
[SOLVED] Special characters in XML
#1
Hello,

I'm building an XML document for a file catalog. One of the files contains the character ö (o with umlaut), and it seems to invalidate the XML. What is the recommended way to handle such characters?

TIA
#2
XML containing ö or any other Unicode character should work fine in Python.
lxml works well for building XML:
from lxml import etree as ET

# Build a small tree; tag names and text may contain non-ASCII characters
root = ET.Element('Doc')
level1 = ET.SubElement(root, 'st1ö')
l1 = ET.SubElement(level1, 'Text')
l1.text = 'Test character ö 日本語'
second = ET.SubElement(level1, 'tokens')
second.text = 'happy 😃'
level2 = ET.SubElement(second, 'token', word="car")
level2.text = 'Test'

# Write the tree as UTF-8 with an XML declaration
tree = ET.ElementTree(root)
tree.write('output.xml', pretty_print=True, xml_declaration=True, encoding="utf-8")
Output:
<?xml version='1.0' encoding='UTF-8'?>
<Doc>
  <st1ö>
    <Text>Test character ö 日本語</Text>
    <tokens>happy 😃<token word="car">Test</token></tokens>
  </st1ö>
</Doc>
#3
Does the text encoding used to write the file match the encoding used to read it? If not, the ö may already be garbage as soon as it is read from the file.
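To illustrate the kind of mismatch described here, the following is a minimal sketch, not taken from the thread. The file name catalog.txt and the sample string are made up for the example, and Latin-1 is only one possible wrong encoding. It writes text as UTF-8 and then reads it back twice, once with the matching encoding and once with a mismatched one.

```python
from pathlib import Path
import tempfile

# Hypothetical sample text containing ö (not from the original post)
text = "Datei_ö.txt"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "catalog.txt"  # hypothetical file name

    # Write the text as UTF-8: ö is stored as the two bytes 0xC3 0xB6
    path.write_text(text, encoding="utf-8")

    # Reading with the same encoding round-trips the text
    good = path.read_text(encoding="utf-8")

    # Reading with a different encoding decodes each byte separately,
    # so the single ö turns into two unrelated characters
    bad = path.read_text(encoding="latin-1")
```

Here good equals the original string, while bad comes back as "Datei_Ã¶.txt". Passing encoding= explicitly on both the write and the read side avoids depending on the platform default.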
#4
Found the error: at some point I wrote

encoding="utf8"

instead of:

encoding="utf-8"

Now it works OK. Thanks all.
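The thread does not say which tool rejected the file, so the following is only a sketch of one place where the two spellings can differ. It uses the standard library's xml.etree.ElementTree rather than lxml (a substitution made for this example; the xml_declaration argument of tostring needs Python 3.8 or later), and the element name is made up. Python's codec machinery treats "utf8" and "utf-8" as the same codec, but the serializer copies the string it was given into the XML declaration, and whether a particular consumer accepts the label "utf8" is an assumption that has to be checked against that consumer.

```python
import codecs
import xml.etree.ElementTree as ET

# Hypothetical one-element document containing ö
root = ET.Element("file")
root.text = "ö"


def declaration(encoding):
    """Return the XML declaration line produced for the given encoding name."""
    data = ET.tostring(root, encoding=encoding, xml_declaration=True)
    # The declaration is the first line of the serialized bytes
    return data.split(b"\n", 1)[0].decode("ascii")


# Both spellings resolve to the same Python codec
same_codec = codecs.lookup("utf8").name == codecs.lookup("utf-8").name

# The declaration carries whichever spelling was passed in
decl_hyphen = declaration("utf-8")
decl_plain = declaration("utf8")
```

With this sketch, same_codec is True, decl_hyphen contains encoding='utf-8', and decl_plain contains encoding='utf8'. Sticking to the hyphenated form keeps the label in the declaration aligned with the registered name UTF-8.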