Python Forum
Overwrite values in XML file with values from another XML file
#1
I have one main XML file (Mainfile_1.xml) in which some items have value = 'FAIL'. I want to replace those FAIL values with the correct values from another XML file (Fixfile_1.xml). The result should look like the picture below:

[Image: tYIVy.png]

So as you can see, the values from Fixfile_1.xml should replace the 'FAIL' values in Mainfile_1.xml for the corresponding item name and object id.

So far I have written code that reads both XML files and prints only the data related to the FAIL values. My main problem is how to save the result to a file so that the failed values are overwritten by the values from Fixfile_1.xml. For some reason, tree.write only removes the "<?xml version='1.0' encoding='UTF-8'?>" line.

Here is my code:

    import xml.etree.ElementTree as ET

    Mainfile = 'Mainfile_1.xml'
    tree = ET.parse(Mainfile)
    root = tree.getroot()
    fixfile = 'fixfile_1.xml'
    tree2 = ET.parse(fixfile)
    root2 = tree2.getroot()
    for objects in root.iter('object'):
        objabsno = objects.attrib.get('id')  # the sample files identify objects by 'id'
        for attributes in objects:  # getchildren() was removed in Python 3.9
            name = attributes.attrib.get('name')
            value = attributes.attrib.get('value')
            if value == 'FAIL':
                for objects2 in root2.iter('object'):
                    objabsno2 = objects2.attrib.get('id')
                    for attributes2 in objects2:
                        name2 = attributes2.attrib.get('name')
                        value2 = attributes2.attrib.get('value')
                        if objabsno2 == objabsno:
                            if name == name2:
                                print(name,name2,value,value2)
    # nothing above modifies the tree, so this writes the data unchanged
    tree.write('newMainfile_1.xml')
Here is Mainfile_1.xml:

    <?xml version='1.0' encoding='UTF-8'?>
    <Module bs='Mainfile_1'>
    <object name='namex' number='1' id='1000'>
        <item name='item0' value='100'/>
        <item name='item00' value='100'/>
    </object>
    <object name='namey' number='2' id='1001'>
        <item name='item1' value='100'/>
        <item name='item00' value='100'/>
    </object>
    <object name='name1' number='3' id='1234'>
        <item name='item1' value='FAIL'/>
        <item name='item2' value='233'/>
        <item name='item3' value='233'/>
        <item name='item4' value='FAIL'/>
    </object>
    <object name='name2' number='4' id='1238'>
        <item name='item8' value='FAIL'/>
        <item name='item9' value='233'/>
    </object>
    <object name='name32' number='5' id='2345'>
        <item name='item1' value='111'/>
        <item name='item2' value='FAIL'/>
    </object>
    <object name='name4' number='6' id='2347'>
        <item name='item1' value='FAIL'/>
        <item name='item2' value='FAIL'/>
        <item name='item3' value='233'/>
        <item name='item4' value='FAIL'/>
    </object>
    </Module>
And here is Fixfile_1.xml:

    <?xml version='1.0' encoding='UTF-8'?>
    <Module bs='Mainfile_1'>
    <object id='1234'>
        <item name='item1' value='something
    more of something'/>
        <item name='item4' value='something
    more of something'/>
    </object>
    <object id='1238'>
        <item name='item8' value='something12
    more of something'/>
    </object>
    <object id='2345'>
        <item name='item2' value='something
    more of something'/>
    </object>
    <object id='2347'>
        <item name='item1' value='something14
    more of something'/>
        <item name='item2' value='something
    more of something'/>
        <item name='item4' value='something14
    something14
    something12
    more of something'/>
    </object>
    </Module>
One more thing: because I have many corresponding file pairs like this (Mainfile_1.xml - Fixfile_1.xml, Mainfile_2.xml - Fixfile_2.xml, Mainfile_3.xml - Fixfile_3.xml, etc.), is there a way to open and overwrite them all at once?
#2
I'm working on something that will help.
It'll take a while, but it should be done by tomorrow my time (EDT).

I'll be back
#3
(Mar-31-2022, 02:40 AM)Larz60+ Wrote: I'm working on something that will help.
It'll take a while, but it should be done by tomorrow my time (EDT).

I'll be back

Ok, thank you. I really appreciate any help
#4
There are two access methods shown below:
  1. process_using_defusedxml: this uses an etree, but not xml.etree.ElementTree, which is not secure against maliciously constructed data.
    Quote: Note: XML is not safe (see: https://docs.python.org/3/library/xml.ht...rabilities). Use defusedxml instead; install it with pip: 'pip install defusedxml'. See GitHub: https://github.com/tiran/defusedxml

  2. process_using_bs4: this is my preferred method and, as far as I know, safe. It uses BeautifulSoup4 to parse the input.

Using the second method, you can rearrange the code into a class with appropriate update methods.

from pathlib import Path
import os

def process_using_defusedxml(filename):
    import defusedxml.ElementTree as ET

    def tree_walk(root, level=0):
        indent = " " * (4 * level)
        for child in root:
            print(f"\n{indent}Type(child): {type(child)}")
            print(f"\n{indent}tag: {child.tag}")
            print(f"    {indent}attribute: {child.attrib}")
            print(f"    {indent}text: {child.text}")
            # recurse one level deeper so nested tags are indented further
            tree_walk(child, level + 1)

    tree = ET.parse(filename)
    root = tree.getroot()

    tree_walk(root)
    
# alternative method using Beautiful Soup
def process_using_bs4(filename):
    from bs4 import BeautifulSoup

    with filename.open('r') as fp:        
        xmldata = fp.read()
        soup = BeautifulSoup(xmldata, 'lxml')
        module = soup.find('module')
        modulename = module.get('bs')
        print(f"Module Name: {modulename}")

        objects = soup.find_all('object')
        print(f"\nobjects:")
        for n, obj in enumerate(objects):
            print(f"\nobject_id: {obj.get('id')} object name: {obj.get('name')}" \
                f" object number: {obj.get('number')}")
            items = obj.find_all('item')
            if items:
                print()
                for n1, item in enumerate(items):
                    if item:
                        print(f"    item number: {n1} name: {item.get('name')} " \
                            f"value: {item.get('value')}")

os.chdir(os.path.abspath(os.path.dirname(__file__)))
filename = Path('.') / 'Mainfile_1.xml'

# process_using_defusedxml(filename)
process_using_bs4(filename)
BeautifulSoup4 (bs4) method results:
Output:
Module Name: Mainfile_1

objects:

object_id: 1000 object name: namex object number: 1

    item number: 0 name: item0 value: 100
    item number: 1 name: item00 value: 100

object_id: 1001 object name: namey object number: 2

    item number: 0 name: item1 value: 100
    item number: 1 name: item00 value: 100

object_id: 1234 object name: name1 object number: 3

    item number: 0 name: item1 value: FAIL
    item number: 1 name: item2 value: 233
    item number: 2 name: item3 value: 233
    item number: 3 name: item4 value: FAIL

object_id: 1238 object name: name2 object number: 4

    item number: 0 name: item8 value: FAIL
    item number: 1 name: item9 value: 233

object_id: 2345 object name: name32 object number: 5

    item number: 0 name: item1 value: 111
    item number: 1 name: item2 value: FAIL

object_id: 2347 object name: name4 object number: 6

    item number: 0 name: item1 value: FAIL
    item number: 1 name: item2 value: FAIL
    item number: 2 name: item3 value: 233
    item number: 3 name: item4 value: FAIL
This should give you something to work with.
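
To actually overwrite the FAIL values rather than just print them, here is a minimal sketch. It assumes objects are matched by their id attribute and items by their name attribute, as in the sample files. The function name apply_fixes and its parameters are made up for illustration and are not part of any library. It uses the standard-library parser so that it runs without extra installs; for files you do not fully trust, defusedxml.ElementTree.parse should be swappable for ET.parse.

```python
import xml.etree.ElementTree as ET  # for untrusted files, consider defusedxml.ElementTree.parse


def apply_fixes(main_path, fix_path, out_path):
    """Replace value='FAIL' items in main_path with values from fix_path.

    Hypothetical helper: objects are matched by 'id', items by 'name'.
    Returns the number of values that were replaced.
    """
    main_tree = ET.parse(main_path)
    fix_root = ET.parse(fix_path).getroot()

    # Build the lookup once: (object id, item name) -> corrected value
    fixes = {}
    for obj in fix_root.iter('object'):
        for item in obj.iter('item'):
            fixes[(obj.get('id'), item.get('name'))] = item.get('value')

    replaced = 0
    for obj in main_tree.getroot().iter('object'):
        for item in obj.iter('item'):
            key = (obj.get('id'), item.get('name'))
            if item.get('value') == 'FAIL' and key in fixes:
                item.set('value', fixes[key])  # modify the parsed tree in place
                replaced += 1

    # xml_declaration=True keeps the <?xml ...?> line in the written file
    main_tree.write(out_path, encoding='UTF-8', xml_declaration=True)
    return replaced
```

Calling apply_fixes('Mainfile_1.xml', 'Fixfile_1.xml', 'newMainfile_1.xml') writes the patched copy and returns how many values were replaced. One caveat: an XML parser normalizes line breaks inside attribute values to spaces, so the multi-line values in the fix file will come out on a single line.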
#5
(Apr-01-2022, 08:27 AM)Larz60+ Wrote: There are two access methods shown below:
  1. process_using_defusedxml: this uses an etree, but not xml.etree.ElementTree, which is not secure against maliciously constructed data.
    Quote: Note: XML is not safe (see: https://docs.python.org/3/library/xml.ht...rabilities). Use defusedxml instead; install it with pip: 'pip install defusedxml'. See GitHub: https://github.com/tiran/defusedxml

    This should give you something to work with.


Thank you very much. Luckily I managed to find a solution on my own, but new problems occurred, which I described in another topic.
#6
Keep in mind that etree.ElementTree is unsafe with untrusted input; it is vulnerable to a number of attacks.
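
On the open question from the first post about handling many file pairs at once, a rough sketch of the pairing step follows. It assumes all files sit in one folder and follow the Mainfile_N.xml / Fixfile_N.xml naming exactly, including upper and lower case. The function name find_pairs is invented for illustration.

```python
from pathlib import Path


def find_pairs(folder):
    """Return (main, fix) path pairs such as Mainfile_1.xml / Fixfile_1.xml.

    Hypothetical helper: a main file without a matching fix file is skipped.
    """
    folder = Path(folder)
    pairs = []
    for main_path in sorted(folder.glob('Mainfile_*.xml')):
        # Mainfile_7.xml -> Fixfile_7.xml (same text after the underscore)
        suffix = main_path.stem.split('_', 1)[1]
        fix_path = folder / f'Fixfile_{suffix}.xml'
        if fix_path.exists():
            pairs.append((main_path, fix_path))
    return pairs
```

Looping with "for main_path, fix_path in find_pairs('.')" then lets you run whatever update routine you settled on once per pair, writing each result to its own output file.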