Python Forum
Using xml.parsers.expat I can't get parser.ParseFile(f) to work
#1
This code comes from Dave Beazley and it is a bit old, written for Python 2.something. I am trying to get it to work in Python 3.

This is from Part 3 : Coroutines and Event Dispatching, coexpat.py

Up to now, everything worked.

I have this function and various others, all available on the link above.

def expat_parse(f,target):
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_size = 65536
    parser.buffer_text = True
    #parser.returns_unicode = False
    parser.StartElementHandler = \
       lambda name,attrs: target.send(('start',(name,attrs)))
    parser.EndElementHandler = \
       lambda name: target.send(('end',name))
    parser.CharacterDataHandler = \
       lambda data: target.send(('text',data))
    parser.ParseFile(f)
When I enter as per the given code:

expat_parse(open(xmlfile), buses_to_dicts(filter_on_field("route","22", filter_on_field("direction","North Bound", bus_locations()))))
I get this error:

Quote:
Traceback (most recent call last):
  File "/usr/lib/python3.10/idlelib/run.py", line 578, in runcode
    exec(code, self.locals)
  File "<pyshell#14>", line 1, in <module>
  File "<pyshell#13>", line 12, in expat_parse
TypeError: read() did not return a bytes object (type=str)

But read() should return a string, not bytes, I believe.

The docs say:

Quote:
xmlparser.ParseFile(file)
Parse XML data reading from the object file. file only needs to provide the read(nbytes) method, returning the empty string when there’s no more data.

If I do this there is no problem:

Quote:
data = open(xmlfile)
type(data)
<class '_io.TextIOWrapper'>
mystring = data.read()
type(mystring)
<class 'str'>

So now I don't know what the problem is, or what file in xmlparser.ParseFile(file) should be!

I thought maybe the xml file is corrupt, but I can open it and .read() it to a string (13817 lines) and it displays in my browser ok.

What does xmlparser.ParseFile(f) want for f?

Is the module too old?
#2
Have you tried opening in binary mode?
open(xmlfile, 'rb')
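A minimal sketch of why the mode matters, assuming a small made-up XML string in place of the real file. The helper name collect_start_tags and the sample data are hypothetical and not part of the original code. io.BytesIO stands in for a file opened with 'rb', and io.StringIO for one opened in the default text mode.

```python
import io
import xml.parsers.expat

# Made-up sample document, standing in for the real XML file.
XML_TEXT = "<buses><bus><route>22</route></bus></buses>"

def collect_start_tags(fileobj):
    """Parse fileobj with expat and return the element names in document order."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.ParseFile(fileobj)
    return names

# Binary file-like object: read() returns bytes, so ParseFile accepts it.
tags_from_bytes = collect_start_tags(io.BytesIO(XML_TEXT.encode("utf-8")))
print(tags_from_bytes)

# Text file-like object: read() returns str, which ParseFile rejects in Python 3.
error_message = None
try:
    collect_start_tags(io.StringIO(XML_TEXT))
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

The second call should fail with the same kind of TypeError as in the traceback above, since the parser receives str instead of bytes from read().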
#3
(Apr-24-2024, 05:33 AM)Pedroski55 Wrote: What does xmlparser.ParseFile(f) want for f?

Is the module too old?
It still works with a few changes for Python 3. As mentioned by Gribouillis, the file needs to be opened with 'rb'.
# coexpat.py
#
# An example of pushing XML events generated by the low-level expat
# XML library into coroutines.

import xml.parsers.expat

def expat_parse(f,target):
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_size = 65536
    parser.buffer_text = True
    # parser.returns_unicode = False
    parser.StartElementHandler = \
       lambda name,attrs: target.send(('start',(name,attrs)))
    parser.EndElementHandler = \
       lambda name: target.send(('end',name))
    parser.CharacterDataHandler = \
       lambda data: target.send(('text',data))
    parser.ParseFile(f)

# Example.  This uses the bus processing code from earlier with no changes.

if __name__ == '__main__':
    from buses import *

    with open("allroutes.xml", 'rb') as file:
        expat_parse(file,
            buses_to_dicts(
            filter_on_field("route", "22",
            filter_on_field("direction", "North Bound",
            bus_locations()))))
Output:
22,1485,"North Bound",41.880481123924255,-87.62948191165924
22,1629,"North Bound",42.01851969751819,-87.6730209876751
22,1489,"North Bound",41.962393500588156,-87.66610128229314
.....
The only other changes, in the two files this code calls, are conversions to the print() function.
E.g. in coroutine.py line 21, the Python 2 statement print line, becomes print(line, end='') in Python 3.
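A small sketch of that print conversion, assuming a hypothetical sink coroutine named printer (not the actual contents of coroutine.py). For lines that already end in a newline, print(line, end='') avoids the doubled blank line that a plain print(line) would produce. Output is captured into a buffer here only so the result can be inspected.

```python
import io
from contextlib import redirect_stdout

def printer():
    """Hypothetical sink coroutine: print each line it is sent, unchanged."""
    while True:
        line = yield
        # Python 2 wrote this as:  print line,
        print(line, end='')

buffer = io.StringIO()
with redirect_stdout(buffer):
    sink = printer()
    next(sink)                    # advance the coroutine to its first yield
    sink.send("first line\n")
    sink.send("second line\n")

captured = buffer.getvalue()
print(repr(captured))             # shows that no extra newlines were added
```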
#4
Thanks!

Worked first time with 'rb'!