Hello All,
I am a new Pandas user and am still getting familiar with it. I'm fine with importing CSV files and similar data tables.
However, for a new task, the data is in XML files, whose contents have to be read and then concatenated into a single file.
My problem is reading them.
I can't traverse all the levels of the element tree below the root.
The code I use is as follows:
[python]
# In [1]:
import xml.etree.ElementTree as et

# In [2]:
import pandas as pd

# In [3]:
xml_data = open("C:\\Adatok\\DAC-6\\teszt1.xml", 'r').read()

# In [4]:
root = et.XML(xml_data)
data = []
cols = []

for i, child in enumerate(root):
    data.append([subchild.text for subchild in child])
    cols.append(child.tag)
for i, subchild in enumerate(child):
    data.append([subsubchild.text for subsubchild in subchild])
    cols.append(subchild.tag)
for i, subsubchild in enumerate(subchild):
    data.append([subsubsubchild.text for subsubsubchild in subsubchild])
    cols.append(subsubchild.tag)
for i, subsubsubchild in enumerate(subsubchild):
    data.append([subsubsubsubchild.text for subsubsubsubchild in subsubsubchild])
    cols.append(subsubsubchild.tag)
for i, subsubsubsubchild in enumerate(subsubsubchild):
    data.append([subsubsubsubsubchild.text for subsubsubsubsubchild in subsubsubsubchild])
    cols.append(subsubsubsubchild.tag)
for i, subsubsubsubsubchild in enumerate(subsubsubsubchild):
    data.append([subsubsubsubsubsubchild.text for subsubsubsubsubsubchild in subsubsubsubsubchild])
    cols.append(subsubsubsubsubchild.tag)
for i, subsubsubsubsubsubchild in enumerate(subsubsubsubsubchild):
    data.append([subsubsubsubsubsubsubchild.text for subsubsubsubsubsubsubchild in subsubsubsubsubsubchild])
    cols.append(subsubsubsubsubsubchild.tag)
[/python]
Error:
NameError Traceback (most recent call last)
<ipython-input-35-c374e6ba6497> in <module>
21 data.append([subsubsubsubsubsubchild.text for subsubsubsubsubsubchild in subsubsubsubsubchild])
22 cols.append(subsubsubsubsubchild.tag)
---> 23 for i, subsubsubsubsubsubchild in enumerate(subsubsubsubsubchild):
24 data.append([subsubsubsubsubsubsubchild.text for subsubsubsubsubsubsubchild in subsubsubsubsubsubchild])
25 cols.append(subsubsubsubsubsubchild.tag)
NameError: name 'subsubsubsubsubchild' is not defined
[python]
# In [ ]:
df = pd.DataFrame(data).T

# In [ ]:
df.columns = cols

# In [ ]:
df.head()
[/python]
I have attached a sample of the XML file to load.
I would be very grateful for any help, or for an explanation of why I can't read this multi-level structure beyond that depth.
All the best to everyone
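One likely cause of the NameError, assuming the loops run one after another rather than nested inside each other: a `for` loop over an element with no children never executes its body, so its loop variable is never bound, and the next loop that refers to that variable fails. The sketch below reproduces this with a small made-up document (the `<root>`/`<leaf>` tags are hypothetical, not taken from the attached file).

```python
import xml.etree.ElementTree as ET

# Hypothetical two-level document: <leaf> has no child elements.
root = ET.fromstring("<root><leaf>text</leaf></root>")

for child in root:
    pass  # runs once; `child` is now bound to <leaf>

for subchild in child:
    pass  # <leaf> has no children, so `subchild` is never bound

name_error_raised = False
try:
    for subsubchild in subchild:  # `subchild` does not exist at this point
        pass
except NameError as exc:
    name_error_raised = True
    print("NameError:", exc)
```

So the error does not mean the file cannot be read at that depth; it means the previous loop found nothing to iterate over on the branch it was looking at.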
Larz60+ wrote Jun-08-2021, 09:31 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Fixed for you this time. Please use BBCode tags on future posts.
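For the underlying question, a depth-independent traversal avoids writing one loop per level. The following is only a sketch under assumptions: the sample XML and the helper name `leaf_rows` are invented for illustration, since the real file's structure is not shown in the thread. It walks the tree recursively and records the path and text of every element that has no children.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; replace with the contents of the real file.
SAMPLE = """<root>
  <record>
    <id>1</id>
    <details><name>A</name><address><city>X</city></address></details>
  </record>
</root>"""


def leaf_rows(root):
    """Return one (path, text) pair per leaf element, at any depth."""
    rows = []

    def walk(elem, path):
        children = list(elem)
        if not children:
            # Leaf element: keep its text (empty string if there is none).
            rows.append((path, (elem.text or "").strip()))
        for sub in children:
            walk(sub, f"{path}/{sub.tag}")

    walk(root, root.tag)
    return rows


rows = leaf_rows(ET.fromstring(SAMPLE))
for path, text in rows:
    print(path, "=", text)
```

The resulting list can be passed to `pd.DataFrame(rows, columns=["path", "text"])`. If the real file uses XML namespaces, ElementTree reports tags in the form `{namespace-uri}name`, so the paths will include those prefixes.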