Python Forum
extracting XML data into datframe
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
extracting XML data into datframe
#3
This is the returned XML:
Output:
b' <?xml version="1.0" encoding="utf-8"?> <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <soap:Body> <GetInstantaneousFlowDataResponse xmlns="http://www.NationalGrid.com/EDP/UI/"> <GetInstantaneousFlowDataResult> <ReportName xmlns="http://www.NationalGrid.com/EDP/BusinessEntities/Public">Instantaneous Flow Report </ReportName> <PublishedTime xmlns="http://www.NationalGrid.com/EDP/BusinessEntities/Public">2018-07-29T17:11:00 </PublishedTime> <EDPReportPage xmlns="http://www.NationalGrid.com/EDP/BusinessEntities/Public"> <PageName>page1</PageName> <CurrentGasDay>2018-07-29T00:00:00</CurrentGasDay> <EDPEnergyGraphTableCollection> <EDPEnergyGraphTableBE> <EDPEnergyGraphTableName>table1</EDPEnergyGraphTableName> <ItemPosition>1</ItemPosition> <EDPObjectCollection> <EDPObjectBE> <EDPObjectName>ALDBROUGH</EDPObjectName> <EnergyDataList> <EDPEnergyDataBE> <ApplicableAt>2018-07-29T17:00:00</ApplicableAt> <FlowRate>0</FlowRate> <QualityIndicator /> <ScheduleTime>2018-07-29T17:12:00</ScheduleTime> </EDPEnergyDataBE> <EDPEnergyDataBE> <ApplicableAt>2018-07-29T17:02:00</ApplicableAt> <FlowRate>0</FlowRate> <QualityIndicator /> <ScheduleTime>2018-07-29T17:12:00</ScheduleTime> </EDPEnergyDataBE> <EDPEnergyDataBE> <ApplicableAt>2018-07-29T17:04:00</ApplicableAt> <FlowRate>0</FlowRate> <QualityIndicator /> <ScheduleTime>2018-07-29T17:12:00</ScheduleTime> </EDPEnergyDataBE> <EDPEnergyDataBE> <ApplicableAt>2018-07-29T17:06:00</ApplicableAt> <FlowRate>0</FlowRate> <QualityIndicator /> <ScheduleTime>2018-07-29T17:12:00</ScheduleTime> </EDPEnergyDataBE> <EDPEnergyDataBE> <ApplicableAt>2018-07-29T17:08:00</ApplicableAt> <FlowRate>0</FlowRate> <QualityIndicator /> <ScheduleTime>2018-07-29T17:12:00</ScheduleTime> </EDPEnergyDataBE> <EDPEnergyDataBE> <ApplicableAt>2018-07-29T17:10:00</ApplicableAt> <FlowRate>0</FlowRate> <QualityIndicator /> <ScheduleTime>2018-07-29T17:12:00</ScheduleTime> </EDPEnergyDataBE> </EnergyDataList> </EDPObjectBE> </EDPObjectCollection> <Description>Total Supply</Description> </EDPEnergyGraphTableBE> </EDPEnergyGraphTableCollection> </EDPReportPage> </GetInstantaneousFlowDataResult> </GetInstantaneousFlowDataResponse> </soap:Body> </soap:Envelope>'
I would like to extract the following information: EDPObjectName, ApplicableAt, Flow Rate & ScheduleTime. I have tried modifying some code I have found on this forum, but my dataframe does not populate with any data, it just has the column names.

No exceptions are produced when I run the code.

I was hoping that the data would be picked out of the XML and put into a dataframe.

import requests
from lxml import etree
    
fromDate = "2018-07-29"

def getXML():
    url="http://energywatch.natgrid.co.uk/EDP-PublicUI/PublicPI/InstantaneousFlowWebService.asmx"
    headers = {'content-type': 'application/soap+xml; charset=utf-8'}
    body ="""<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <GetInstantaneousFlowData xmlns="http://www.NationalGrid.com/EDP/UI/" />
  </soap12:Body>
</soap12:Envelope>"""

    response = requests.post(url,data=body,headers=headers)
    return response.content

import pandas as pd
df1 = pd.DataFrame(columns=("name","value"))   
for pd_date in pd.date_range(fromDate, periods=1):
    day = pd_date.strftime('%Y-%m-%d')
     
    root = etree.fromstring(getXML())
     
    #map prefix 'd' to the default namespace URI
    ns = {'d': 'http://www.NationalGrid.com/EDP/UI/'}
     
    publication_objects = root.xpath('//d:EDPReportBE', namespaces=ns)
     
    for obj in publication_objects:
        name = obj.find('d:EDPObjectName', ns).text

        for data in obj.findall('d:EDPObjectBE/d:EDPEnergyDataBE', ns):
            value = float(data.find('d:FlowRate', ns).text)
        
            df1.loc[len(df1) +1] = [name, value]
            
Reply


Messages In This Thread
extracting XML data into datframe - by hey_arnold - Jul-29-2018, 02:54 PM
RE: extracting XML data into datframe - by buran - Jul-29-2018, 04:05 PM
RE: extracting XML data into datframe - by hey_arnold - Jul-29-2018, 04:48 PM

Possibly Related Threads…
Thread Author Replies Views Last Post
  Searching a .txt file for a specific number and extracting the corresponding data nrozman 3 3,179 Jul-27-2018, 02:07 PM
Last Post: nrozman

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020