Python Forum
Overwrite values in XML file with values from another XML file
#1
I have one main XML file (Mainfile_1.xml) in which some items have value='FAIL'. I want to replace those 'FAIL' values with the correct values from another XML file (Fixfile_1.xml). The result should look like the picture below:

[Image: tYIVy.png]

As you can see, values from Fixfile_1.xml should replace the 'FAIL' values in Mainfile_1.xml for the corresponding item name and object id.

So far I have written code that reads both XML files and prints only the data related to 'FAIL' values. My MAIN PROBLEM is how to save the result to a file so that the failed values are overwritten by the values from Fixfile_1.xml. For some reason, tree.write only removes the "<?xml version='1.0' encoding='UTF-8'?>" line.

Here is my code:

    import xml.etree.ElementTree as ET

    Mainfile = 'Mainfile_1.xml'
    tree = ET.parse(Mainfile)
    root = tree.getroot()
    fixfile = 'Fixfile_1.xml'
    tree2 = ET.parse(fixfile)
    root2 = tree2.getroot()
    for objects in root.iter('object'):
        # The sample files identify each object by its 'id' attribute
        objabsno = objects.attrib.get('id')
        # Iterate over the element directly; getchildren() was removed in Python 3.9
        for attributes in objects:
            name = attributes.attrib.get('name')
            value = attributes.attrib.get('value')
            if value == 'FAIL':
                for objects2 in root2.iter('object'):
                    objabsno2 = objects2.attrib.get('id')
                    for attributes2 in objects2:
                        name2 = attributes2.attrib.get('name')
                        value2 = attributes2.attrib.get('value')
                        if objabsno2 == objabsno:
                            if name == name2:
                                print(name, name2, value, value2)
    tree.write('newMainfile_1.xml')
Here is Mainfile_1.xml:

    <?xml version='1.0' encoding='UTF-8'?>
    <Module bs='Mainfile_1'>
    <object name='namex' number='1' id='1000'>
        <item name='item0' value='100'/>
        <item name='item00' value='100'/>
    </object>
    <object name='namey' number='2' id='1001'>
        <item name='item1' value='100'/>
        <item name='item00' value='100'/>
    </object>
    <object name='name1' number='3' id='1234'>
        <item name='item1' value='FAIL'/>
        <item name='item2' value='233'/>
        <item name='item3' value='233'/>
        <item name='item4' value='FAIL'/>
    </object>
    <object name='name2' number='4' id='1238'>
        <item name='item8' value='FAIL'/>
        <item name='item9' value='233'/>
    </object>
    <object name='name32' number='5' id='2345'>
        <item name='item1' value='111'/>
        <item name='item2' value='FAIL'/>
    </object>
    <object name='name4' number='6' id='2347'>
        <item name='item1' value='FAIL'/>
        <item name='item2' value='FAIL'/>
        <item name='item3' value='233'/>
        <item name='item4' value='FAIL'/>
    </object>
    </Module>
And here is Fixfile_1.xml:

    <?xml version='1.0' encoding='UTF-8'?>
    <Module bs='Mainfile_1'>
    <object id='1234'>
        <item name='item1' value='something
    more of something'/>
        <item name='item4' value='something
    more of something'/>
    </object>
    <object id='1238'>
        <item name='item8' value='something12
    more of something'/>
    </object>
    <object id='2345'>
        <item name='item2' value='something
    more of something'/>
    </object>
    <object id='2347'>
        <item name='item1' value='something14
    more of something'/>
        <item name='item2' value='something
    more of something'/>
        <item name='item4' value='something14
    something14
    something12
    more of something'/>
    </object>
    </Module>
One more thing: because I have a lot of corresponding files like that (Mainfile_1.xml - Fixfile_1.xml, Mainfile_2.xml - Fixfile_2.xml, Mainfile_3.xml - Fixfile_3.xml, etc.), is there a way to open and overwrite them all at once?
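
For the "all at once" part, one possible approach is to let the file names drive the pairing. The sketch below is only an illustration: the helper name paired_files is made up here, and it assumes all files sit in one folder and follow the Mainfile_N.xml / Fixfile_N.xml naming from the question exactly (same letter case, numeric suffix).

```python
import re
from pathlib import Path


def paired_files(folder):
    """Yield (main_path, fix_path) for each Mainfile_N.xml that has a Fixfile_N.xml.

    Hypothetical helper: assumes the naming scheme described in the question.
    """
    folder = Path(folder)
    for main_path in sorted(folder.glob("Mainfile_*.xml")):
        # Pull the numeric suffix out of the main file name
        match = re.fullmatch(r"Mainfile_(\d+)\.xml", main_path.name)
        if match is None:
            continue
        fix_path = folder / f"Fixfile_{match.group(1)}.xml"
        # Skip main files that have no matching fix file
        if fix_path.exists():
            yield main_path, fix_path
```

Each yielded pair can then be handed to whatever code does the actual overwrite for a single pair of files. Main files without a matching fix file are skipped rather than treated as errors.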
Reply
#2
I'm working on something that will help.
It'll take a while, but it should be done by tomorrow, my time (EDT).

I'll be back
Reply
#3
(Mar-31-2022, 02:40 AM)Larz60+ Wrote: I'm working on something that will help.
It'll take a while, but it should be done by tomorrow, my time (EDT).

I'll be back

OK, thank you. I really appreciate any help.
Reply
#4
There are two access methods shown below:
  1. process_using_defusedxml: this uses an etree, but not xml.etree.ElementTree, which is not secure against maliciously constructed XML.
    Quote: Note: XML is not safe, see: https://docs.python.org/3/library/xml.ht...rabilities
    Use defusedxml instead. Install with pip: 'pip install defusedxml'. See GitHub: https://github.com/tiran/defusedxml

  2. process_using_bs4: this is my preferred method and, as far as I know, safe. It uses BeautifulSoup4 to parse the input.

The second method can be rearranged into a class with appropriate update methods.

from pathlib import Path
import os

def process_using_defusedxml(filename):
    import defusedxml.ElementTree as ET

    def tree_walk(root, level=0):
        # Indent each level of the tree by four spaces
        indent = " " * (4 * level)
        for child in root:
            print(f"\n{indent}Type(child): {type(child)}")
            print(f"\n{indent}tag: {child.tag}")
            print(f"    {indent}attribute: {child.attrib}")
            print(f"    {indent}text: {child.text}")
            # Recurse one level deeper into this child's own children
            tree_walk(child, level + 1)

    tree = ET.parse(filename)
    root = tree.getroot()

    tree_walk(root)
    
# alternative method using Beautiful Soup
def process_using_bs4(filename):
    from bs4 import BeautifulSoup

    with filename.open('r') as fp:        
        xmldata = fp.read()
        soup = BeautifulSoup(xmldata, 'lxml')
        module = soup.find('module')
        modulename = module.get('bs')
        print(f"Module Name: {modulename}")

        objects = soup.find_all('object')
        print(f"\nobjects:")
        for n, obj in enumerate(objects):
            print(f"\nobject_id: {obj.get('id')} object name: {obj.get('name')}" \
                f" object number: {obj.get('number')}")
            items = obj.find_all('item')
            if items:
                print()
                for n1, item in enumerate(items):
                    if item:
                        print(f"    item number: {n1} name: {item.get('name')} " \
                            f"value: {item.get('value')}")

os.chdir(os.path.abspath(os.path.dirname(__file__)))
filename = Path('.') / 'Mainfile_1.xml'

# process_using_defusedxml(filename)
process_using_bs4(filename)
BeautifulSoup4 (bs4) method results:
Output:
Module Name: Mainfile_1

objects:

object_id: 1000 object name: namex object number: 1

    item number: 0 name: item0 value: 100
    item number: 1 name: item00 value: 100

object_id: 1001 object name: namey object number: 2

    item number: 0 name: item1 value: 100
    item number: 1 name: item00 value: 100

object_id: 1234 object name: name1 object number: 3

    item number: 0 name: item1 value: FAIL
    item number: 1 name: item2 value: 233
    item number: 2 name: item3 value: 233
    item number: 3 name: item4 value: FAIL

object_id: 1238 object name: name2 object number: 4

    item number: 0 name: item8 value: FAIL
    item number: 1 name: item9 value: 233

object_id: 2345 object name: name32 object number: 5

    item number: 0 name: item1 value: 111
    item number: 1 name: item2 value: FAIL

object_id: 2347 object name: name4 object number: 6

    item number: 0 name: item1 value: FAIL
    item number: 1 name: item2 value: FAIL
    item number: 2 name: item3 value: 233
    item number: 3 name: item4 value: FAIL
This should give you something to work with.
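
To go from reading the files to actually overwriting the 'FAIL' values, something along these lines might work. It is a minimal sketch, not a tested solution for the real data: the function names apply_fixes and fix_file are made up for illustration, and it uses the standard-library xml.etree.ElementTree so that it runs without extra installs. For untrusted input, the parse step could presumably be swapped for defusedxml.ElementTree.parse. It also addresses the missing first line: ElementTree.write() only emits the XML declaration when xml_declaration=True is passed (or when the encoding is something other than US-ASCII, UTF-8 or unicode).

```python
import xml.etree.ElementTree as ET


def apply_fixes(main_root, fix_root):
    """Overwrite 'FAIL' item values in main_root with values from fix_root.

    Hypothetical helper. Items are matched on (object id, item name).
    Returns the number of values replaced.
    """
    # Index the fix values once so each lookup is a dictionary access
    fixes = {
        (obj.get("id"), item.get("name")): item.get("value")
        for obj in fix_root.iter("object")
        for item in obj.iter("item")
    }
    replaced = 0
    for obj in main_root.iter("object"):
        for item in obj.iter("item"):
            key = (obj.get("id"), item.get("name"))
            # Only touch items that failed and have a fix available
            if item.get("value") == "FAIL" and key in fixes:
                item.set("value", fixes[key])
                replaced += 1
    return replaced


def fix_file(main_path, fix_path, out_path):
    """Parse both files, apply the fixes, and write the result to out_path."""
    tree = ET.parse(main_path)
    fix_root = ET.parse(fix_path).getroot()
    count = apply_fixes(tree.getroot(), fix_root)
    # xml_declaration=True keeps the <?xml ...?> line in the output
    tree.write(out_path, encoding="UTF-8", xml_declaration=True)
    return count
```

For the files in the first post, the call would be something like fix_file('Mainfile_1.xml', 'Fixfile_1.xml', 'newMainfile_1.xml'). One caveat: an XML parser normalizes literal line breaks inside attribute values to spaces, so the multi-line values in the fix file will arrive as single lines unless the line breaks are written as character references such as &#10;.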
Reply
#5
(Apr-01-2022, 08:27 AM)Larz60+ Wrote: There are two access methods shown below:
  1. process_using_defusedxml: this uses an etree, but not xml.etree.ElementTree, which is not secure against maliciously constructed XML.
    Quote: Note: XML is not safe, see: https://docs.python.org/3/library/xml.ht...rabilities
    Use defusedxml instead. Install with pip: 'pip install defusedxml'. See GitHub: https://github.com/tiran/defusedxml

    This should give you something to work with.


Thank you very much. Luckily, I managed to find a solution on my own, but new problems occurred, which I described in another topic.
Reply
#6
Keep in mind that xml.etree.ElementTree is not secure against maliciously constructed XML and is vulnerable to several attacks.
Reply

