Python Forum
Formating generated .data file to XML
#2
(Apr-13-2022, 05:04 PM)malcoverc Wrote: I have some generated data files I want to format to XML:

    1234=>item1:something11:
    
    something11<COMMA>item4:something12:
    
    12something<END_OF_OBJECT_LINE>
    1238=>item8:something12:
    
    something11:<END_OF_OBJECT_LINE>
    2345=>item2:something12:
    
    something11:<END_OF_OBJECT_LINE>
    123=>item1:something1:
    
    something11<COMMA>item2:something:
    
    11something<COMMA>item4:something:
    
    11something<END_OF_OBJECT_LINE>
What I tried to do is replace certain fixed strings so that the file looks like XML:

    with open("OGfile.data", "r") as f:
        with open("tempfile.data", "w") as fo:
        # formating file to XML format
            contents = f.readlines()
            contents.insert(0, "<?xml version='1.0' encoding='UTF-8'?>\n<Module>\n<Object id='")
            contents =[w.replace("<END_OF_OBJECT_LINE>\n", "'/>\n</Object>\n<Object id='") for w in contents]
            contents =[w.replace("=>", "'>\n     <Attribute name='") for w in contents]
            contents =[w.replace('<COMMA>', "'/>\n     <Attribute name='") for w in contents]
            contents =[w.replace(':something', "' value='something") for w in contents]
            # saving formated file to new file
            contents = "".join(contents)
            fo.write(contents)
    
    # fixing invalid last line from formated file
    with open("tempfile.data", "r") as f2:
        with open("finalfile.data", "w") as fo2:
            contents2 = f2.readlines()
            contents2 = [w.replace("<END_OF_OBJECT_LINE>", "'/>\n</Object>\n</Module>") for w in contents2]
            contents2 = "".join(contents2)
            fo2.write(contents2)
It works fine and produces:

    <?xml version='1.0' encoding='UTF-8'?>
    <Module>
    <Object id='1234'>
         <Attribute name='item1' value='something11:
    
    something11'/>
         <Attribute name='item4' value='something12:
    
    12something'/>
    </Object>
    <Object id='1238'>
         <Attribute name='item8' value='something12:
    
    something11:'/>
    </Object>
    <Object id='2345'>
         <Attribute name='item2' value='something12:
    
    something11:'/>
    </Object>
    <Object id='123'>
         <Attribute name='item1' value='something1:
    
    something11'/>
         <Attribute name='item2' value='something:
    
    11something'/>
         <Attribute name='item4' value='something:
    
    11something'/>
    </Object>
    </Module>
BUT there is one problem: I am matching the value with contents =[w.replace(':something', "' value='something") for w in contents], which only works because every value happens to start with "something". If a value started with anything else, I would be doomed. I have been thinking about using a regex to capture the string between "Attribute name:" and "<COMMA>" or "<END_OF_OBJECT_LINE>", but my attempts failed miserably because I am quite new to programming and Python. It could also be done much better if I could somehow convert this .data file into a dictionary and then generate the XML in a proper way, but I have no idea how to split it correctly into a dictionary. Do you have any suggestions?
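See section 3.3.3 (Attribute-Value Normalization) of the XML specification. It says that each newline inside an attribute value is replaced by a space. For attributes declared with a type other than CDATA, runs of spaces are then also collapsed to a single space, and leading and trailing spaces are dropped. So if I have read the spec correctly, the line breaks in your values will not survive parsing, and you may not end up with what you expect. See the example table right before section 3.4.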
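A small sketch of that effect, using Python's standard `xml.etree.ElementTree` and one attribute copied from the output above. The first half parses hand-written XML with literal newlines in the value; the second half builds the same element in code, which lets the serializer write the newlines as character references instead. The variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Hand-written XML: the attribute value contains two literal newlines.
raw = "<Attribute name='item1' value='something11:\n\nsomething11'/>"
parsed_value = ET.fromstring(raw).get("value")
print(repr(parsed_value))  # the newlines have been normalized by the parser

# Same value, but the element is built in code and serialized by the library.
elem = ET.Element(
    "Attribute",
    {"name": "item1", "value": "something11:\n\nsomething11"},
)
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)

# Reading the library-generated XML back returns the original value.
round_tripped = ET.fromstring(serialized).get("value")
print(repr(round_tripped))
```

If the line breaks matter, this suggests generating the XML through a library rather than through string replacement, or storing the value as element text instead of an attribute.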

You have not shown an example of the regular expressions you tried. Regular expression syntax is fairly straightforward; the key is to use parentheses to mark the parts of the pattern you want to capture. What is unclear is which pattern you are after: you refer to :something as the text to match, but you also say the value may start with something else. If you could show the string before the replace and after the replace (print statements are very good for this), as well as the pattern you are using, it would make things a lot clearer.
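As a starting point, here is a minimal sketch of the parenthesized-group idea combined with the dictionary approach mentioned in the question. It rests on assumptions about the format that the sample suggests but does not confirm: every record looks like `id=>name:value<COMMA>name:value<END_OF_OBJECT_LINE>`, ids are digits, attribute names contain no colon, and the markers never occur inside a value. The function names `parse_records` and `build_xml` are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Group 1 captures the object id, group 2 everything up to the end marker.
# re.DOTALL lets "." match the newlines that appear inside values.
RECORD = re.compile(r"(\d+)=>(.*?)<END_OF_OBJECT_LINE>", re.DOTALL)


def parse_records(text):
    """Return {object_id: {attribute_name: value}} for each record in text."""
    objects = {}
    for obj_id, body in RECORD.findall(text):
        attributes = {}
        for part in body.split("<COMMA>"):
            # Split on the first colon only, so colons in the value survive.
            name, _, value = part.partition(":")
            attributes[name.strip()] = value
        objects[obj_id] = attributes  # a repeated id would overwrite the earlier one
    return objects


def build_xml(objects):
    """Build a Module/Object/Attribute tree from the parsed dictionary."""
    module = ET.Element("Module")
    for obj_id, attributes in objects.items():
        obj = ET.SubElement(module, "Object", {"id": obj_id})
        for name, value in attributes.items():
            ET.SubElement(obj, "Attribute", {"name": name, "value": value})
    return module


# Two records in the assumed format; the second value does not start
# with "something", which is the case the replace() approach cannot handle.
sample = (
    "1234=>item1:something11:\n\nsomething11<COMMA>"
    "item4:something12:\n\n12something<END_OF_OBJECT_LINE>\n"
    "1238=>item8:other12:\n\nsomething11:<END_OF_OBJECT_LINE>\n"
)

objects = parse_records(sample)
xml_text = ET.tostring(build_xml(objects), encoding="unicode")
print(xml_text)
```

Because the value is simply everything after the first colon, it no longer matters what text it starts with. To write a file with an XML declaration, `ET.ElementTree(build_xml(objects)).write("finalfile.data", encoding="UTF-8", xml_declaration=True)` should work in place of `ET.tostring`.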


Messages In This Thread
RE: Formating generated .data file to XML - by supuflounder - Apr-13-2022, 08:29 PM
