Python Forum

Editing text between two strings on different lines
I want to edit attribute values in an XML file and replace every "\n" with "(newline)\n". This is necessary because I need to keep the structure of the XML file: after parsing, the file is standardized and the whitespace is gone. So I want to first add a string like "(newline)" before each line break. Then, once the XML file has been parsed and standardized, I will go through the whole file and replace "(newline)" back with a single "\n".

Picture below shows what I want to do:
[Image: QisXc.png]

I tried to use a regex to get the text between "value=" and "/>", but I can only get a value if it is on one line, and I don't know how to edit it afterwards.

    import re
    with open("file", "r") as f:
        contents = f.readlines()
        for line in contents:
            result = re.search('value=(.*?) />', line)
            print(result)
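
The attempt above searches one line at a time, so a value that spans several lines never appears whole in a single `line`. The pattern also expects a space before "/>", which most items in the file do not have. A minimal sketch of one alternative is to read the whole file into a single string and let `re.sub` rewrite each value. It assumes values are always double-quoted and never contain a literal `"`. The name `mark_newlines` and the sample string are made up for illustration.

```python
import re

# Matches value="..." including values that span several lines.
# [^"]* also matches newline characters, so no re.DOTALL flag is needed.
# Assumption: values are double-quoted and contain no literal '"'.
VALUE_RE = re.compile(r'value="([^"]*)"')

def mark_newlines(xml_text, marker="(newline)"):
    """Put `marker` in front of every newline inside a value="..." attribute."""
    def add_marker(match):
        # Only the captured attribute value is changed, not the rest of the file.
        return 'value="' + match.group(1).replace("\n", marker + "\n") + '"'
    return VALUE_RE.sub(add_marker, xml_text)

# Hypothetical sample in the same shape as the file in this post.
sample = '<item name="a" value="100"/>\n<item name="b" value="233\n\t234"/>\n'
print(mark_newlines(sample))
```

For a real file, the text could come from `open("file").read()` instead of `sample`. Line breaks between tags are left alone because they fall outside the quoted values.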
Here is my file:

<Module bs="Mainfile_1">
    <object id="1000" name="namex" number="1">
        <item name="item0" value="100"/>
        <item name="item00" value="100
    	
    	100"/>
    </object>
    <object id="1001" name="namey" number="2">
        <item name="item1" value="100"/>
        <item name="item00" value="100"/>
    </object>
    <object id="1234" name="name1" number="3">
        <item name="item1" value="FAIL"/>
        <item name="item2" value="233"/>
        <item name="item3" value="233
    	234
    	246"/>
        <item name="item4" value="FAIL"/>
    </object>
    <object id="1238" name="name2" number="4">
        <item name="item8" value="FAIL"/>
        <item name="item9" value="233
    	234
    	
    	245
    	246
    	267"/>
    </object>
    <object id="2345" name="name32" number="5">
        <item name="item1" value="111"/>
        <item name="item2" value="FAIL" />
    </object>
    <object id="2347" name="name4" number="6">
        <item name="item1" value="FAIL"/>
        <item name="item2" value="FAIL"/>
        <item name="item3" value="233"/>
        <item name="item4" value="FAIL"/>
    </object>
    </Module>
Here is one way of going about it:
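# Append a "(newline)" marker to every line that does not end with '>',
# i.e. every line that breaks off in the middle of an attribute value.
with open('test.xml', 'r') as input_file, open('test2.xml', 'w') as output_file:
    for line in input_file:
        line = line.rstrip('\n')
        # endswith() also works on empty lines, where line[-1] would raise IndexError
        if not line.endswith('>'):
            line += '(newline)'
        output_file.write(line + '\n')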
Output:
<Module bs="Mainfile_1">
    <object id="1000" name="namex" number="1">
        <item name="item0" value="100"/>
        <item name="item00" value="100(newline)
    	(newline)
    	100"/>
    </object>
    <object id="1001" name="namey" number="2">
        <item name="item1" value="100"/>
        <item name="item00" value="100"/>
    </object>
    <object id="1234" name="name1" number="3">
        <item name="item1" value="FAIL"/>
        <item name="item2" value="233"/>
        <item name="item3" value="233(newline)
    	234(newline)
    	246"/>
        <item name="item4" value="FAIL"/>
    </object>
    <object id="1238" name="name2" number="4">
        <item name="item8" value="FAIL"/>
        <item name="item9" value="233(newline)
    	234(newline)
    	(newline)
    	245(newline)
    	246(newline)
    	267"/>
    </object>
    <object id="2345" name="name32" number="5">
        <item name="item1" value="111"/>
        <item name="item2" value="FAIL" />
    </object>
    <object id="2347" name="name4" number="6">
        <item name="item1" value="FAIL"/>
        <item name="item2" value="FAIL"/>
        <item name="item3" value="233"/>
        <item name="item4" value="FAIL"/>
    </object>
    </Module>
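
For the second half of the plan in the question (turning the markers back into line breaks after parsing), here is a minimal sketch. It assumes the standard-library `xml.etree.ElementTree` parser and a small made-up snippet in the same shape as the file above. The names `MARKER`, `marked` and `restored` are only for illustration.

```python
import xml.etree.ElementTree as ET

MARKER = "(newline)"  # same placeholder as used in this thread

# Hypothetical marked-up input: one multi-line value with a marker before the newline.
marked = '<object id="1234"><item name="item3" value="233(newline)\n\t234"/></object>'

root = ET.fromstring(marked)
value = root.find("item").get("value")
# The parser normalizes whitespace in attribute values: the newline and the tab
# come back as spaces, but the marker text survives.
print(repr(value))

# Serialize the parsed tree, then swap the markers for real line breaks
# in the resulting text (not in the tree).
serialized = ET.tostring(root, encoding="unicode")
restored = serialized.replace(MARKER, "\n")
print(restored)
```

The replacement is done on the serialized text because ElementTree writes a "\n" stored in an attribute as a character reference rather than as a literal line break. The original indentation after each break is not recovered exactly, since the parser has already turned each whitespace character into a space.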