Python Forum
Analysing XML file
#1
I am trying to analyse the following XML file, and ones like it:

<xml>
<exam xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="../mockexam.xsd">
<name>Hang Gliding Club Pilot Theory Exam - Air Law</name>
<title>
<name>Hang Gliding Club Pilot Theory Exam - Air Law</name>
<image>G0031281-1-scaled.jpg</image>
</title>
<question type='single'>
<text>When flying on a ridge, with the ridge to your left, and approaching another HG or PG head on, what should you do?</text>
<answer>Don't change your course.</answer>
<answer correct='true'>Change course to the right.</answer>
<answer>Change course to the left.</answer>
</question>
<question type='single'>
<text>When flying on a ridge, with the ridge to your left, and approaching a paramotor head on, what should you do?</text>
<answer>Don't change your course.</answer>
<answer>Change course to the right.</answer>
<answer>Change course to the left.</answer>
<answer correct='true'>Don't change your course unless a collision seems imminent.</answer>
<explanation>Your primary responsibility is to act so as to avoid risk of injury or damage, so if the oncoming pilot does not seem to be responding, you should take avoiding action.</explanation>
</question>
</exam>
</xml>

What I need to do is:

1. Check that the root element is <exam> and that there is only one of them.
2. Check that the <exam> element has one and only one <title> element, with text content (i.e. not empty). Error if not.
3. Get the text child of the <title> element and process it.
4. Check that the <exam> element has zero or one child <image>. Error if not.
5. If there is an <image> element, check that it only has text content. Error otherwise.
6. Get the text, and process it.
7. Check that the <exam> element has at least one <question> element. Error if not.
8. For each <question> element:
9. Check that there is a 'type' attribute. Error if not.
10. Check that the 'type' attribute has value 'single' or 'checkbox'. Error if not.
11. Get the value of 'type' so that it can be used later.
12. Check that there is one and only one <text> child of the <question>, and that it contains a text node and nothing else. Error if not.
13. Get the value of the text and process it.
14. Check that the <question> element has one or more <answer> elements. Error if not.
15. For each <answer> element:
16. Check that the <answer> element has a text child with content (i.e. not an empty string). Error if not.
17. See whether the <answer> element has an attribute 'correct' with the value 'true' or 'false'. If it is 'true', store which <answer> it belongs to. Error if another answer has already been stored (there can be only one correct answer).
18. Get the value of the text child and process it.
19. Error if there are any other children of <answer>.
20. Check whether there is zero or one <explanation> child of <question>. Error if more.
21. Check whether the <explanation> element has a text child and nothing else.
22. Get the text child and process it. Error if of zero length.
23. Check that there are no more children of <question>. Error if there are.

I realise that a lot of this is validating the XML file. Maybe there is a better way to do this, such as to validate it against an XSD file before trying to process it. Can I do this in Python?

Apart from that, I guess the most important remaining question is: how do I iterate through all the child elements of one specific type within a parent element? For example, how do I iterate through all the <answer> elements that are children of the element that starts like this:
<question type='single'>
<text>When flying on a ridge, with the ridge to your left, and approaching a paramotor head on, what should you do?</text>

Many thanks - Rowan
#2
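Python's standard library includes modules for processing XML; xml.etree.ElementTree is the usual starting point. Its findall() method returns the direct children of an element that have a given tag, so question.findall('answer') gives you every <answer> child of one <question>. The standard library has no XSD validator, so to validate against a schema you need a third-party package from PyPI, such as lxml or xmlschema.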
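
Here is a rough sketch of the per-question checks from the first post, using only xml.etree.ElementTree from the standard library. The names SAMPLE, ALLOWED_TYPES, check_question and check_exam are made up for this example, the sample is a cut-down version of the file with <exam> as the root, and I am assuming that 'correct' is stored as the string 'true'. Treat it as a starting point rather than a complete validator.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical sample with <exam> as the root element.
SAMPLE = """\
<exam>
  <question type='single'>
    <text>Approaching another HG or PG head on, what should you do?</text>
    <answer>Don't change your course.</answer>
    <answer correct='true'>Change course to the right.</answer>
    <answer>Change course to the left.</answer>
  </question>
</exam>
"""

ALLOWED_TYPES = {"single", "checkbox"}


def check_question(question):
    """Return (type, question text, answer texts, index of the correct answer)."""
    qtype = question.get("type")  # None if the attribute is missing
    if qtype not in ALLOWED_TYPES:
        raise ValueError(f"bad or missing 'type' attribute: {qtype!r}")

    # findall() with a plain tag name matches direct children only.
    texts = question.findall("text")
    if len(texts) != 1 or len(texts[0]) != 0 or not (texts[0].text or "").strip():
        raise ValueError("question needs exactly one non-empty, text-only <text>")

    answers = question.findall("answer")
    if not answers:
        raise ValueError("question needs at least one <answer>")

    correct = None
    answer_texts = []
    for index, answer in enumerate(answers):
        # len(element) is the number of child elements it contains.
        if len(answer) != 0 or not (answer.text or "").strip():
            raise ValueError("<answer> must contain non-empty text and nothing else")
        if answer.get("correct") == "true":
            if correct is not None:
                raise ValueError("more than one correct answer")
            correct = index
        answer_texts.append(answer.text.strip())

    return qtype, texts[0].text.strip(), answer_texts, correct


def check_exam(xml_text):
    """Parse the document and run the checks on every <question>."""
    root = ET.fromstring(xml_text)
    if root.tag != "exam":
        raise ValueError(f"root element is <{root.tag}>, expected <exam>")
    questions = root.findall("question")
    if not questions:
        raise ValueError("<exam> needs at least one <question>")
    return [check_question(q) for q in questions]


if __name__ == "__main__":
    for qtype, text, answers, correct in check_exam(SAMPLE):
        print(qtype, text, answers, correct)
```

The same findall() pattern covers <title>, <image> and <explanation>: count the matches to enforce "zero or one" or "exactly one", then look at .text. If you validate against the XSD first, most of these hand-written checks become unnecessary.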