Jun-08-2021, 08:19 PM
Hello All,
I am new to pandas and am still getting familiar with it. I'm comfortable importing CSV files and similar data tables.
However, for a new task the data will come as XML files, whose contents have to be read and then concatenated into a single file.
My problem is reading them.
I can't traverse all the levels of nesting below the root element.
The code I use is as follows:
# In [1]:
import xml.etree.ElementTree as et

# In [2]:
import pandas as pd

# In [3]:
xml_data = open("C:\\Adatok\\DAC-6\\teszt1.xml", 'r').read()

# In [4]:
root = et.XML(xml_data)

data = []
cols = []
for i, child in enumerate(root):
    data.append([subchild.text for subchild in child])
    cols.append(child.tag)
for i, subchild in enumerate(child):
    data.append([subsubchild.text for subsubchild in subchild])
    cols.append(subchild.tag)
for i, subsubchild in enumerate(subchild):
    data.append([subsubsubchild.text for subsubsubchild in subsubchild])
    cols.append(subsubchild.tag)
for i, subsubsubchild in enumerate(subsubchild):
    data.append([subsubsubsubchild.text for subsubsubsubchild in subsubsubchild])
    cols.append(subsubsubchild.tag)
for i, subsubsubsubchild in enumerate(subsubsubchild):
    data.append([subsubsubsubsubchild.text for subsubsubsubsubchild in subsubsubsubchild])
    cols.append(subsubsubsubchild.tag)
for i, subsubsubsubsubchild in enumerate(subsubsubsubchild):
    data.append([subsubsubsubsubsubchild.text for subsubsubsubsubsubchild in subsubsubsubsubchild])
    cols.append(subsubsubsubsubchild.tag)
for i, subsubsubsubsubsubchild in enumerate(subsubsubsubsubchild):
    data.append([subsubsubsubsubsubsubchild.text for subsubsubsubsubsubsubchild in subsubsubsubsubsubchild])
    cols.append(subsubsubsubsubsubchild.tag)
Error:
NameError                                 Traceback (most recent call last)
<ipython-input-35-c374e6ba6497> in <module>
     21     data.append([subsubsubsubsubsubchild.text for subsubsubsubsubsubchild in subsubsubsubsubchild])
     22     cols.append(subsubsubsubsubchild.tag)
---> 23 for i, subsubsubsubsubsubchild in enumerate(subsubsubsubsubchild):
     24     data.append([subsubsubsubsubsubsubchild.text for subsubsubsubsubsubsubchild in subsubsubsubsubsubchild])
     25     cols.append(subsubsubsubsubsubchild.tag)
NameError: name 'subsubsubsubsubchild' is not defined
# In [ ]:
df = pd.DataFrame(data).T

# In [ ]:
df.columns = cols

# In [ ]:
df.head()

I have attached a sample of the XML file to load.
I would be very grateful for any help, or for an explanation of why I can't read this multi-level structure beyond that depth.
All the best to everyone.
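
A note on the error, with a possible alternative. A NameError on a loop variable usually means the loop that should have assigned it never ran its body, for example because the element it iterated over had no children; Python only binds a for-loop variable once the loop body executes at least once. Rather than writing one loop per nesting level, the tree can be walked recursively so the depth no longer matters. The following is only a minimal sketch under assumptions: the sample XML, the `flatten` helper and the path/text layout are hypothetical, since the real teszt1.xml is not shown here.

```python
import xml.etree.ElementTree as et

# Hypothetical sample document; the real teszt1.xml is not shown in the post.
SAMPLE_XML = """
<root>
  <report>
    <id>1</id>
    <party>
      <name>Alpha</name>
      <address><city>Budapest</city></address>
    </party>
  </report>
  <report>
    <id>2</id>
  </report>
</root>
"""


def flatten(element, path=""):
    """Yield (path, text) pairs for every leaf element at or below `element`."""
    current = f"{path}/{element.tag}"
    children = list(element)
    if not children:
        # Leaf element: it has no child elements, so record its text.
        # element.text can be None for empty tags, hence the fallback.
        yield current, (element.text or "").strip()
        return
    for child in children:
        # Recurse instead of hand-writing one loop per nesting level.
        yield from flatten(child, current)


root = et.fromstring(SAMPLE_XML.strip())
rows = [{"path": p, "text": t} for p, t in flatten(root)]

for row in rows:
    print(row["path"], "=", row["text"])
```

The resulting list of dictionaries can be passed to `pd.DataFrame(rows)` to get a two-column table, and tables built from several files can then be combined with `pd.concat`. If the real files use XML namespaces, `element.tag` will include the namespace URI in curly braces, so the paths will look longer than in this sample.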