Python Forum
Learning advanced lxml
#1
I started digging into lxml about two hours ago, for no other reason than
wanting a better understanding of what it's capable of.

I'm getting a UTF-8 encoding error (traceback below).
I tried etree.tostring, but that didn't seem to do the trick.


from lxml import etree
import requests
import socket


class TryLxml2:
   def __init__(self, url=None):
       try:
           if socket.gethostbyname(socket.gethostname()) != '127.0.0.1':
               # Fetch the index, then save a local copy of the raw bytes
               self.response = requests.get(url, stream=True)
               with open('data\\rfc-index.xml', 'wb') as f:
                   f.write(self.response.content)

       except Exception as ex:
           # todo -- Use tkinter.messagebox here
           template = "An exception of type {0} occurred. arguments: \n{1!r}"
           message = template.format(type(ex).__name__, ex.args)
           print(message)
           # raise Exception

       try:
           # etree.tostring(self.xml, encoding='UTF-8', xml_declaration=False)
           doc = etree.XML(self.response.text.strip())
           rfc_entry = doc.findtext('rfc-entry')
       except etree.XMLSyntaxError as ex:
           template = "An exception of type {0} occurred. arguments: \n{1!r}"
           message = template.format(type(ex).__name__, ex.args)
           print(message)

def main(url):
   TryLxml2(url)

if __name__ == '__main__':
   filename = r'data\dev_data.xml'  # raw string so the backslash is kept literally
   main('https://www.rfc-editor.org/rfc-index.xml')
Error:
Traceback (most recent call last):
  File "M:/python/q-t/r/RFC_Library/src/TryLxml2.py", line 35, in <module>
    main('https://www.rfc-editor.org/rfc-index.xml')
  File "M:/python/q-t/r/RFC_Library/src/TryLxml2.py", line 31, in main
    TryLxml2(url)
  File "M:/python/q-t/r/RFC_Library/src/TryLxml2.py", line 23, in __init__
    doc = etree.XML(self.response.text.strip())
  File "src\lxml\lxml.etree.pyx", line 3192, in lxml.etree.XML (src\lxml\lxml.etree.c:78747)
  File "src\lxml\parser.pxi", line 1843, in lxml.etree._parseMemoryDocument (src\lxml\lxml.etree.c:118266)
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
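The last line of the traceback hints at one way out: hand lxml bytes rather than a decoded str, so the parser can act on the encoding declaration itself. Here is a minimal sketch of that idea, not a definitive fix. SAMPLE_BYTES and parse_xml_bytes are hypothetical names made up for illustration, and the tiny document is only a stand-in for the real download. With requests, the equivalent bytes should be available as response.content instead of response.text.

```python
from lxml import etree

# Hypothetical stand-in for the raw bytes of the downloaded file
# (with requests this would be response.content, not response.text).
SAMPLE_BYTES = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<rfc-index>'
    b'<rfc-entry><doc-id>RFC0001</doc-id><title>Host Software</title></rfc-entry>'
    b'</rfc-index>'
)


def parse_xml_bytes(data):
    """Parse an XML document, making sure lxml receives bytes.

    Assumption: if a str is passed, its declaration says UTF-8,
    so encoding it as UTF-8 keeps declaration and content in agreement.
    """
    if isinstance(data, str):
        data = data.encode('utf-8')
    return etree.fromstring(data)


doc = parse_xml_bytes(SAMPLE_BYTES)
# findtext takes a path relative to the root element
first_id = doc.findtext('rfc-entry/doc-id')
print(first_id)
```

One thing to watch on a real document: if the root element declares a default namespace, a plain path such as 'rfc-entry' will probably not match, and the lookup would need namespace-qualified names.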
By the way, I found a good lxml reference by John Shipman, who also wrote an excellent tkinter reference.


Messages In This Thread
Learning advanced lxml - by Larz60+ - Apr-12-2017, 04:34 AM
RE: Learning advanced lxml - by wavic - Apr-12-2017, 06:41 AM
RE: Learning advanced lxml - by snippsat - Apr-12-2017, 12:57 PM
RE: Learning advanced lxml - by Larz60+ - Apr-12-2017, 05:22 PM
