Hello,
I have only shallow experience with XML parsers, and I'm currently dealing with HTML.
Unless I missed it, neither lxml nor ElementTree provides a function to insert a big block of HTML at a given location in the tree. Before I fall back on a regex instead of those XML parsers… can you confirm?
Thank you.
Here's what it could look like:
import lxml.etree as et
from lxml import html
from io import StringIO

# Read the page that should receive the block
with open("input.html") as reader:
    content = reader.read()

parser = et.HTMLParser(encoding='latin1', remove_blank_text=True, recover=True)
tree = et.parse(StringIO(content), parser)
root = tree.getroot()
body = root.find("body")

# Read the big block of HTML to insert
with open("block.html") as reader:
    big_block = reader.read()

# HERE: the kind of call I'm looking for (insert_after is not an actual lxml method)
body.insert_after(big_block)
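
One possible alternative to the regex route, as a rough sketch rather than a confirmed answer: lxml.html can parse an HTML fragment into elements with `fragments_fromstring()`, and those elements can then be attached with the ordinary element methods (`insert()`, `extend()`). The helper name `insert_block`, the `index` parameter, and the sample strings below are made up for illustration, and the sketch assumes the block starts with an element rather than with bare text.

```python
import lxml.html as lh


def insert_block(parent, block_html, index=None):
    """Parse block_html as an HTML fragment and attach its elements to parent.

    Hypothetical helper, not part of lxml. With index=None the elements are
    appended at the end; otherwise they are inserted starting at that position.
    """
    # no_leading_text=True makes lxml raise an error if the block starts with
    # bare text, so the result is always a list of elements.
    elements = lh.fragments_fromstring(block_html, no_leading_text=True)
    if index is None:
        parent.extend(elements)
    else:
        for offset, element in enumerate(elements):
            parent.insert(index + offset, element)
    return elements


# Made-up sample data standing in for input.html and block.html
page = lh.fromstring("<html><body><h1>Title</h1><p>Last</p></body></html>")
body = page.find("body")

insert_block(body, "<div>block 1</div><div>block 2</div>", index=1)

result = lh.tostring(page, encoding="unicode")
print(result)
```

To place the block right after a specific element instead of at an index, lxml elements also have an `addnext()` method that adds an element as the following sibling.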