Python Forum
XML handling
#1
I am working with the XML below.

<User folder="/Users" name="cake">
<GroupMembership>abc-trq</GroupMembership>
<GroupMembership>kqt-ops</GroupMembership>
</User>



This is the code I am running:
import xml.etree.ElementTree as ET
tree = ET.parse('my.xml')
root = tree.getroot()
#for ten in root:
#    print( ten.attrib)
for User in root.getiterator('User'):
      print (User.attrib)
#for User in root.findall('User'):
#    GroupMembership = User.find('GroupMembership')
#    print(GroupMembership.text)
The code above prints the following, but the commented-out block does not work:
{'folder': '/Users', 'name': 'cake'}
My goal is to list every group membership that belongs to a user, for example:
cake - abc-trq , kqt-ops
#2
You can use BeautifulSoup here:
from bs4 import BeautifulSoup

myxml = '''
<User folder="/Users" name="cake">
    <GroupMembership>abc-trq</GroupMembership>
    <GroupMembership>kqt-ops</GroupMembership>
</User>'''

soup = BeautifulSoup(myxml, 'lxml')
for user in soup.find_all('user'):
    print(user.attrs)

for group in soup.find_all('groupmembership'):
    print(group.text)
Output:
{'folder': '/Users', 'name': 'cake'}
abc-trq
kqt-ops
#3
Thanks for the suggestion, but I am looking for a solution that uses the ElementTree module.

I was able to improve the code and get better output.

import xml.etree.ElementTree as ET
tree = ET.parse('my.xml')
root = tree.getroot()
#for Safex in root:
#    print( Safex.attrib)
#for User in root.getiterator('User'):
# print (User.attrib)
#print root.getchildren()[7].getchildren()
for User in root.findall('Add/User/'):
    name = User.get('name')
    mem = User.find('GroupMembership').text
    print (name,mem)
Output:
('cake', 'abc-trq')
With this new code I get only one GroupMembership value, but I need to print every GroupMembership associated with a user. I also get an error when a user is in the XML file but has no GroupMembership, as shown below.

Error:
Traceback (most recent call last):
  File "my.xml", line 11, in <module>
    mem = User.find('GroupMembership').text
#4
From the ElementTree documentation:

Quote:
find(match, namespaces=None)
Finds the first subelement matching match. match may be a tag name or a path. Returns an element instance or None. namespaces is an optional mapping from namespace prefix to full name.

findall(match, namespaces=None)
Finds all matching subelements, by tag name or path. Returns a list containing all matching elements in document order. namespaces is an optional mapping from namespace prefix to full name.

find returns only the first match, while findall returns all matches. Give findall a try instead.

For the error, is that the full text of the exception? You can add error handling to control for that possibility.
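Here is a minimal sketch of both points. It assumes the file has the Add/User layout implied by the path used earlier in the thread. The sample string, the Root wrapper element, and the user "pie" with no groups are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the <Root>/<Add> wrapper and the user "pie"
# (who has no GroupMembership) are assumptions, not the poster's real file.
SAMPLE = """
<Root>
  <Add>
    <User folder="/Users" name="cake">
      <GroupMembership>abc-trq</GroupMembership>
      <GroupMembership>kqt-ops</GroupMembership>
    </User>
    <User folder="/Users" name="pie"/>
  </Add>
</Root>
"""

root = ET.fromstring(SAMPLE)

first_groups = {}
all_groups = {}
for user in root.findall('Add/User'):
    name = user.get('name')
    # find() returns None when nothing matches, so check before using .text
    first = user.find('GroupMembership')
    first_groups[name] = first.text if first is not None else None
    # findall() returns a (possibly empty) list, so no None check is needed
    all_groups[name] = [g.text for g in user.findall('GroupMembership')]

print(first_groups)  # {'cake': 'abc-trq', 'pie': None}
print(all_groups)    # {'cake': ['abc-trq', 'kqt-ops'], 'pie': []}
```

Calling .text directly on the result of find() is what fails for a user without groups, because None has no .text attribute. Looping over findall() sidesteps that, since an empty list simply produces no iterations.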
#5
Thank you, the new code works as expected.


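import xml.etree.ElementTree as ET

tree = ET.parse('my.xml')
root = tree.getroot()
# No trailing slash in the path: ElementTree reads 'Add/User/' as 'Add/User/*'.
for user in root.findall('Add/User'):
    name = user.get('name')
    print(name)
    # findall() returns every GroupMembership child, not just the first
    for mem in user.findall('GroupMembership'):
        print('  --->', mem.text)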
Output:
cake
  ---> abc-trq
  ---> kqt-ops
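If the one-line "cake - abc-trq , kqt-ops" layout from the first post is preferred, the same loop can build it with a join. This is only a sketch: format_memberships is a hypothetical helper name, and the Root/Add wrapper in the sample string is an assumption based on the 'Add/User' path.

```python
import xml.etree.ElementTree as ET


def format_memberships(root, user_path='Add/User'):
    """Return one 'name - group1 , group2' line per user (hypothetical helper)."""
    lines = []
    for user in root.findall(user_path):
        # An empty <GroupMembership/> has text None, so fall back to ''
        groups = [g.text or '' for g in user.findall('GroupMembership')]
        lines.append('{} - {}'.format(user.get('name'), ' , '.join(groups)))
    return lines


# Assumed sample data mirroring the XML from the first post
root = ET.fromstring(
    '<Root><Add>'
    '<User folder="/Users" name="cake">'
    '<GroupMembership>abc-trq</GroupMembership>'
    '<GroupMembership>kqt-ops</GroupMembership>'
    '</User>'
    '</Add></Root>'
)

for line in format_memberships(root):
    print(line)  # cake - abc-trq , kqt-ops
```

With a real file, root would come from ET.parse('my.xml').getroot() instead of the inline string.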