Python Forum
How to import an xml file to Pandas
Hello All,
I am a new Pandas user and am still getting familiar with it. I'm fine with importing CSV files and similar data tables.
However, for a new task the data comes in XML files, whose contents have to be read and then concatenated into a single file.
My problem is reading them.
I can't traverse all the levels of the element tree below the root.
The code I use is as follows:
# In [1]:
import xml.etree.ElementTree as et
# In [2]:
import pandas as pd
# In [3]:
xml_data = open("C:\\Adatok\\DAC-6\\teszt1.xml", 'r').read()  
# In [4]:
root = et.XML(xml_data) 

data = []
cols = []
for i, child in enumerate(root):
    data.append([subchild.text for subchild in child])
    cols.append(child.tag)
for i, subchild in enumerate(child):
    data.append([subsubchild.text for subsubchild in subchild])
    cols.append(subchild.tag)
for i, subsubchild in enumerate(subchild):
    data.append([subsubsubchild.text for subsubsubchild in subsubchild])
    cols.append(subsubchild.tag)
for i, subsubsubchild in enumerate(subsubchild):
    data.append([subsubsubsubchild.text for subsubsubsubchild in subsubsubchild])
    cols.append(subsubsubchild.tag)
for i, subsubsubsubchild in enumerate(subsubsubchild):
    data.append([subsubsubsubsubchild.text for subsubsubsubsubchild in subsubsubsubchild])
    cols.append(subsubsubsubchild.tag)
for i, subsubsubsubsubchild in enumerate(subsubsubsubchild):
    data.append([subsubsubsubsubsubchild.text for subsubsubsubsubsubchild in subsubsubsubsubchild])
    cols.append(subsubsubsubsubchild.tag)
for i, subsubsubsubsubsubchild in enumerate(subsubsubsubsubchild):
    data.append([subsubsubsubsubsubsubchild.text for subsubsubsubsubsubsubchild in subsubsubsubsubsubchild])
    cols.append(subsubsubsubsubsubchild.tag)
Error:
NameError                                 Traceback (most recent call last)
<ipython-input-35-c374e6ba6497> in <module>
     21     data.append([subsubsubsubsubsubchild.text for subsubsubsubsubsubchild in subsubsubsubsubchild])
     22     cols.append(subsubsubsubsubchild.tag)
---> 23 for i, subsubsubsubsubsubchild in enumerate(subsubsubsubsubchild):
     24     data.append([subsubsubsubsubsubsubchild.text for subsubsubsubsubsubsubchild in subsubsubsubsubsubchild])
     25     cols.append(subsubsubsubsubsubchild.tag)

NameError: name 'subsubsubsubsubchild' is not defined
# In [ ]:
df = pd.DataFrame(data).T  
# In [ ]:
df.columns = cols  
# In [ ]:
df.head()
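The `NameError` above is consistent with the loops not being nested: each `for` runs only after the previous one has finished, using whatever element the earlier loop variable last pointed to. If that element has no children, the loop body never runs and its loop variable is never assigned, so the next loop fails. Here is a minimal sketch of that behaviour, using a made-up three-level document in place of `teszt1.xml` (the tag names are hypothetical):

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for teszt1.xml: <c> is a leaf with no children.
root = ET.fromstring("<a><b><c>text</c></b></a>")

for child in root:            # child -> <b>
    pass
for subchild in child:        # subchild -> <c>
    pass
for subsubchild in subchild:  # <c> has no children, so nothing is assigned
    pass

try:
    for item in subsubchild:  # subsubchild was never bound by the loop above
        pass
    loop_variable_missing = False
except NameError:
    loop_variable_missing = True

print(loop_variable_missing)
```

The same thing would happen at whatever depth the last visited branch of the real file runs out of children.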
I have attached a sample of the XML file I want to load.


I would be very grateful for any help, or for an explanation of why I can't read this multi-level structure beyond that depth.
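One way to reach every depth without writing a loop per level is to walk the tree recursively. The sketch below is only an illustration under assumptions: the sample document, its tag names, and the helper `iter_leaves` are invented here and are not taken from `teszt1.xml`. It also assumes each record contains a given path at most once, because repeated paths would overwrite each other in the dict.

```python
import xml.etree.ElementTree as ET


def iter_leaves(elem, path=""):
    """Yield (path, text) for every childless element, at any depth."""
    current = f"{path}/{elem.tag}"
    children = list(elem)
    if not children:
        yield current, (elem.text or "").strip()
    for child in children:
        yield from iter_leaves(child, current)


# Hypothetical sample with nesting inside each record.
sample = """<root>
  <record><id>1</id><name><first>Ann</first><last>Lee</last></name></record>
  <record><id>2</id><name><first>Bob</first><last>Ray</last></name></record>
</root>"""

root = ET.fromstring(sample)

# One dict per top-level child, keyed by the path to each leaf.
rows = [dict(iter_leaves(record)) for record in root]

for row in rows:
    print(row)
```

A list of dicts like `rows` can be passed straight to `pd.DataFrame(rows)`, which gives one row per record and one column per leaf path. If the real file uses XML namespaces, ElementTree reports tags in the form `{namespace}tag`, so the column names would carry that prefix.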

All the best to everyone
Larz60+ wrote Jun-08-2021, 09:31 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Fixed for you this time. Please use BBCode tags on future posts.

Attached Files

.xml   teszt1.xml (Size: 7.92 KB / Downloads: 0)