Aug-05-2018, 09:40 AM
I am trying to extract certain data from an XML response using ElementTree, but I am unsure why my code doesn't work. Any guidance would be helpful.
It appears that the paths I have used aren't correct, as no values are returned. I think I am fairly close... hopefully?
import requests
from lxml import etree

fromDate = "2018-07-29"

def getXML():
    url="http://energywatch.natgrid.co.uk/EDP-PublicUI/PublicPI/InstantaneousFlowWebService.asmx"
    headers = {'content-type': 'application/soap+xml; charset=utf-8'}
    body ="""<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <GetInstantaneousFlowData xmlns="http://www.NationalGrid.com/EDP/UI/" />
  </soap12:Body>
</soap12:Envelope>"""
    response = requests.post(url,data=body,headers=headers)
    return response.content

import pandas as pd
df1 = pd.DataFrame(columns=("applicable_at","name","value","created_date"))

for pd_date in pd.date_range(fromDate, periods=1):
    day = pd_date.strftime('%Y-%m-%d')
    root = etree.fromstring(getXML())
    #map prefix 'd' to the default namespace URI
    ns = {'d': 'http://www.nationalgrid.com/EDP/BusinessEntities/Public'}
    publication_objects = root.xpath('//d:EDPObjectCollection', namespaces=ns)
    for obj in publication_objects:
        name = obj.find('d:EDPObjectName', ns).text
        for data in obj.findall('d:EnergyDataList/d:EDPEnergyDataBE', ns):
            applicable_at = pd.to_datetime(data.find('d:ApplicableAt', ns).text)
            value = float(data.find('d:FlowRate', ns).text)
            created_date = pd.to_datetime(data.find('d:ScheduleTime', ns).text)
            df1.loc[len(df1) +1] = [applicable_at,name, value,created_date]
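To narrow down why the paths return nothing, one thing that might help is listing the namespace URIs the parsed document actually uses and comparing them with the `ns` mapping. Below is a minimal sketch using the standard library's `xml.etree.ElementTree`. The sample XML, its element names, its namespace URI, and the `namespaces_in` helper are all made up for illustration; they are not the real service response, so this only shows the mechanism, not a confirmed diagnosis.

```python
import xml.etree.ElementTree as ET

# Made-up sample, NOT the real service response. The element names and
# the namespace URI are hypothetical and only illustrate the mechanism.
SAMPLE = """<root xmlns="http://example.com/ns/A">
  <Item><Name>alpha</Name><Rate>1.5</Rate></Item>
  <Item><Name>beta</Name><Rate>2.5</Rate></Item>
</root>"""

root = ET.fromstring(SAMPLE)


def namespaces_in(tree_root):
    """Return the set of namespace URIs used by element tags (hypothetical helper)."""
    uris = set()
    for el in tree_root.iter():
        # ElementTree stores qualified tags as "{uri}localname"
        if el.tag.startswith("{"):
            uris.add(el.tag[1:].split("}", 1)[0])
    return uris


found = namespaces_in(root)

# Namespace URIs are compared as exact strings, so a mapping that differs
# only in letter case matches nothing.
wrong = {"d": "http://example.com/ns/a"}
right = {"d": "http://example.com/ns/A"}

wrong_hits = root.findall("d:Item", wrong)
right_hits = root.findall("d:Item", right)

# With the exact URI, child lookups work as expected.
names = [item.find("d:Name", right).text for item in right_hits]
rates = [float(item.find("d:Rate", right).text) for item in right_hits]
```

Running the same kind of check on the real response (printing `found` and a few `el.tag` values) would show whether the URI in `ns` and the element names in the paths match what the service returns.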