Python Forum
Best way to process large/complex XML/schema ?
#2
(May-13-2021, 01:44 AM)MDRI Wrote: We know Python is an interpreted language.

Is Python the right choice for the above in terms of performance?

What options do we have?
Performance is not a problem, as e.g. lxml runs at C speed.
lxml Wrote:The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt.
It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API
(May-13-2021, 01:44 AM)MDRI Wrote: The above input XML messages need to be validated against an XML Schema (XSD) for schema compliance.
Validation with lxml
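
As a rough idea of what XSD validation with lxml can look like, here is a minimal sketch. The schema, the two sample messages and the check() helper are all made up for illustration; a real setup would load the actual XSD and messages from files.

```python
import io
from lxml import etree

# Hypothetical schema and messages, invented for this example only.
XSD = b"""<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:integer"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

VALID = b"<message><id>1</id><body>hello</body></message>"
INVALID = b"<message><id>not-a-number</id><body>hello</body></message>"

# Compile the schema once and reuse it for every incoming message.
schema = etree.XMLSchema(etree.parse(io.BytesIO(XSD)))


def check(xml_bytes):
    """Return (is_valid, error strings) for one XML message (hypothetical helper)."""
    doc = etree.fromstring(xml_bytes)
    ok = schema.validate(doc)  # True or False, does not raise on invalid input
    errors = [f"line {e.line}: {e.message}" for e in schema.error_log]
    return ok, errors


print(check(VALID))
print(check(INVALID))
```

If you would rather get an exception than a boolean, schema.assertValid(doc) raises etree.DocumentInvalid on a non-compliant message.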

Quote:The validated XML messages are to be parsed and the data extracted.
I like to use BeautifulSoup (BS) for parsing; the speed stays the same, as it uses lxml as the parser here.
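import requests
from bs4 import BeautifulSoup

# Fetch a sample XML document
url = 'http://httpbin.org/xml'
response = requests.get(url)
# 'lxml' selects lxml's HTML parser; 'lxml-xml' (or 'xml') selects its XML parser
soup = BeautifulSoup(response.content, 'lxml')
# Pick the first <title> element with a CSS selector
title = soup.select_one('title')
print(title)       # the whole tag
print(title.text)  # only the text inside it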
Output:
<title>Wake up to WonderWidgets!</title>
Wake up to WonderWidgets!
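
For really large messages it may help to skip building the whole tree and stream it instead. A minimal sketch with lxml's iterparse, assuming a small made-up document with slide and title elements (the XML sample and the iter_titles() helper are hypothetical, and a real file path could be passed in place of the BytesIO object):

```python
import io
from lxml import etree

# Hypothetical sample standing in for a large file on disk.
XML = (
    b"<slideshow>"
    b"<slide><title>One</title></slide>"
    b"<slide><title>Two</title></slide>"
    b"</slideshow>"
)


def iter_titles(source):
    """Yield the <title> text of each <slide>, one slide at a time."""
    # The 'end' event fires once a <slide> element has been read completely.
    for _event, slide in etree.iterparse(source, events=("end",), tag="slide"):
        yield slide.findtext("title")
        slide.clear()  # drop the content of the slide that was just handled


titles = list(iter_titles(io.BytesIO(XML)))
print(titles)
```

Whether this is worth it depends on the message size; for small messages parsing the full tree as above is simpler.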

Messages In This Thread
RE: Best way to process large/complex XML/schema ? - by snippsat - May-13-2021, 03:27 PM