Python Forum
re.findall help searching for string in xml response
#1
Hi everyone, I am very new to coding, let alone Python. Any help is appreciated.

I have pieced together the code below to test the response times of our API, but I am having trouble extracting data from the XML response.

I am trying to extract the string ResponseTime="337" from the XML response. The number inside the quotes changes, but the rest stays the same. When I run the code, I see the following:

c:\unscan>python dir-uapitest.py
End point : uAPI Direct APAC VIP
Request 0 took 1.305 seconds
uAPI Time [(u'"452"', u'2')]
Request 1 took 0.795 seconds
uAPI Time [(u'"349"', u'9')]
Request 2 took 0.819 seconds
uAPI Time [(u'"366"', u'6')]
End point : uAPI Direct APAC VIP
Sending request WITH connection pooling ...
Request 0 took 0.719 seconds
uAPI Time [(u'"281"', u'1')]
Request 1 took 0.473 seconds
uAPI Time [(u'"352"', u'2')]
Request 2 took 0.395 seconds
uAPI Time [(u'"268"', u'8')]

uapitime = re.findall(r'ResponseTime=("([^"])*")', r.text)
print "uAPI Time %s" % (uapitime)
XML Response
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP:Body>
<air:LowFareSearchRsp TraceId="roventrace" TransactionId="AB53869A0A0759CF79A986CC276D3293" ResponseTime="337" DistanceUnits="MI" CurrencyType="INR" xmlns:air="http://www.travelport.com/schema/air_v45_0" xmlns:common_v45_0="http://www.travelport.com/schema/common_v45_0">
<common_v45_0:ResponseMessage Code="4039" Type="Warning" ProviderCode="1G">"Result size exceeded the maximum allowable and some results were discarded. It may be necessary to narrow your search using search modifiers."</common_v45_0:ResponseMessage>

Full code
import xml.dom.minidom, time, requests, re

uapikey = "xxxxxxxxxxxxxxxxxxxx" #Base 64 encoded Username and Password

url = "https://apac.universal-api.travelport.com/B2BGateway/connect/uAPI/AirService"

data = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
            <soap:Body>
                        <air:LowFareSearchReq xmlns:air="http://www.travelport.com/schema/air_v45_0" AuthorizedBy="user" TargetBranch="xxxxxx" TraceId="paultrace" SolutionResult="true" xmlns:com="http://www.travelport.com/schema/common_v45_0">
                                    <com:BillingPointOfSaleInfo OriginApplication="uAPI"/>
                                    <air:SearchAirLeg>
                                                <air:SearchOrigin>
                                                            <com:CityOrAirport Code="BLR"/>
                                                </air:SearchOrigin>
                                                <air:SearchDestination>
                                                            <com:CityOrAirport Code="DEL"/>
                                                </air:SearchDestination>
                                                <air:SearchDepTime PreferredTime="2018-07-17">
                                  </air:SearchDepTime>
                                    </air:SearchAirLeg>
                                    <air:AirSearchModifiers MaxSolutions="10">
                                                <air:PreferredProviders>
                                                            <com:Provider Code="1G"/>
                                                </air:PreferredProviders>
                                                <air:PermittedCarriers>
                                                            <com:Carrier Code="AI"/>
                                                </air:PermittedCarriers>
                                    </air:AirSearchModifiers>
                                    <com:SearchPassenger Code="ADT"/>
                                    <air:AirPricingModifiers FaresIndicator="AllFares" ETicketability="Yes"/>
                        </air:LowFareSearchReq>
            </soap:Body>
</soap:Envelope>

"""
headers1 = {'User-Agent':'TravelportIndia','Content-type':'text/xml;charset=\"utf-8\"','Content-length': '%d' % len(data),'Authorization':'Basic %s'%(uapikey),'Accept-Encoding':'gzip'}
headers = {'User-Agent':'TravelportIndia','Content-type':'text/xml;charset=\"utf-8\"','Content-length': '%d' % len(data),'Authorization':'Basic %s'%(uapikey), 'Accept-Encoding':'gzip','Connection':'Keep-Alive'}


# Without Connection Pooling
print "End point : uAPI Direct APAC VIP "
#print "Sending request without connection pooling ..."
for x in range(0,250):
    starttime = time.time()
    r = requests.post(url, data=data, headers=headers1)
    endtime = time.time()
    uapitime = re.findall(r'ResponseTime=("([^"])*")', r.text)
    print "Request %s took %.3f seconds" % (str(x), endtime - starttime)
    print "uAPI Time %s" % (uapitime)
    #print r.headers
    #print r.text

# With Connection pooling
session = requests.Session()
print "End point : uAPI Direct APAC VIP "
print "Sending request WITH connection pooling ..."
for x in range(0,250):
    starttime = time.time()
    r = session.post(url, data=data, headers=headers)
    endtime = time.time()
    uapitime = re.findall(r'ResponseTime=("([^"])*")', r.text)
    print "Request %s took %.3f seconds" % (str(x), endtime - starttime)
    print "uAPI Time %s" % (uapitime)
    #print r.headers
    #print r.text
#2
BeautifulSoup is the tool.

from bs4 import BeautifulSoup
import requests

html = requests.get(url, headers=headers).content # or whatever attributes are needed

soup = BeautifulSoup(html, 'lxml')

response_t = soup.find('air:lowfaresearchrsp')['responsetime']
print(response_t)
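
If you would rather stay with re, the pattern in the original post explains the odd output. When a pattern contains capture groups, re.findall returns one tuple per match with one entry per group, and a repeated group such as ([^"])* keeps only its last iteration, which is why the second item is always the final digit. Below is a minimal sketch of a fix, assuming the value is always a run of digits. The sample string and the helper name find_response_time are made up for illustration.

```python
import re

# Trimmed-down sample modeled on the response in the first post (illustrative only).
SAMPLE = '<air:LowFareSearchRsp TraceId="roventrace" ResponseTime="337" DistanceUnits="MI">'

# A single capture group around the digits, so the match yields just the number.
RESPONSE_TIME_RE = re.compile(r'ResponseTime="(\d+)"')


def find_response_time(text):
    """Return the ResponseTime value as a string, or None if it is absent."""
    match = RESPONSE_TIME_RE.search(text)
    return match.group(1) if match else None


# The original pattern: two groups, the inner one repeated once per character.
print(re.findall(r'ResponseTime=("([^"])*")', SAMPLE))  # [('"337"', '7')]
# The corrected pattern returns the bare number.
print(find_response_time(SAMPLE))  # 337
```

In the timing loops this would be called as find_response_time(r.text). A regex is fine for pulling one attribute out of a known response, but a real XML parser copes better if the attribute order or quoting ever changes.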
#3
Sweet. I did look at BeautifulSoup last night but could not figure it out, so more reading and learning for me. Thanks for your help.

Hmmm... OK, I get the following when I run the code:

End point : uAPI Direct APAC VIP
Request 0 took 0.834 seconds
Traceback (most recent call last):
  File "dir-uapitest.py", line 52, in <module>
    response_t = soup.find('air:lowfaresearchrsp')['responsetime']
TypeError: 'NoneType' object has no attribute '__getitem__'

from bs4 import BeautifulSoup
import xml.dom.minidom, time, requests, re

uapikey = "KEYREMOVED" #Base 64 encoded Username and Password

url = "https://apac.universal-api.travelport.com/B2BGateway/connect/uAPI/AirService"

data = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
            <soap:Body>
                        <air:LowFareSearchReq xmlns:air="http://www.travelport.com/schema/air_v45_0" AuthorizedBy="user" TargetBranch="P1618717" TraceId="roventrace" SolutionResult="true" xmlns:com="http://www.travelport.com/schema/common_v45_0">
                                    <com:BillingPointOfSaleInfo OriginApplication="uAPI"/>
                                    <air:SearchAirLeg>
                                                <air:SearchOrigin>
                                                            <com:CityOrAirport Code="BLR"/>
                                                </air:SearchOrigin>
                                                <air:SearchDestination>
                                                            <com:CityOrAirport Code="DEL"/>
                                                </air:SearchDestination>
                                                <air:SearchDepTime PreferredTime="2018-07-17">
                                  </air:SearchDepTime>
                                    </air:SearchAirLeg>
                                    <air:AirSearchModifiers MaxSolutions="10">
                                                <air:PreferredProviders>
                                                            <com:Provider Code="1G"/>
                                                </air:PreferredProviders>
                                                <air:PermittedCarriers>
                                                            <com:Carrier Code="AI"/>
                                                </air:PermittedCarriers>
                                    </air:AirSearchModifiers>
                                    <com:SearchPassenger Code="ADT"/>
                                    <air:AirPricingModifiers FaresIndicator="AllFares" ETicketability="Yes"/>
                        </air:LowFareSearchReq>
            </soap:Body>
</soap:Envelope>

"""
headers1 = {'User-Agent':'TravelportIndia','Content-type':'text/xml;charset=\"utf-8\"','Content-length': '%d' % len(data),'Authorization':'Basic %s'%(uapikey),'Accept-Encoding':'gzip'}
headers = {'User-Agent':'TravelportIndia','Content-type':'text/xml;charset=\"utf-8\"','Content-length': '%d' % len(data),'Authorization':'Basic %s'%(uapikey), 'Accept-Encoding':'gzip','Connection':'Keep-Alive'}

html = requests.get(url, headers=headers1).content
soup = BeautifulSoup(html, 'lxml')

# Without Connection Pooling
print "End point : uAPI Direct APAC VIP"
#print "Sending request without connection pooling ..."
for x in range(0,250):
    starttime = time.time()
    r = requests.post(url, data=data, headers=headers1)
    endtime = time.time()
    print "Request %s took %.3f seconds" % (str(x), endtime - starttime)
    response_t = soup.find('air:lowfaresearchrsp')['responsetime']
    print(response_t)
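
The traceback above means soup.find(...) returned None. A likely cause, judging only from the code as posted: soup is built once, before the loop, from a requests.get call that sends no SOAP body, so the parsed document probably never contains an air:LowFareSearchRsp element, and the r returned by each requests.post inside the loop is never parsed. Parsing each response inside the loop and checking for None avoids the crash. The sketch below shows the lookup with the standard library's xml.etree.ElementTree instead of BeautifulSoup. The sample document is a shortened, closed-off version of the response in the first post, and extract_response_time is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the response shown in the first post.
AIR_NS = "http://www.travelport.com/schema/air_v45_0"

# Shortened sample (illustrative only): the real response has many more children.
SAMPLE = (
    '<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP:Body>"
    '<air:LowFareSearchRsp xmlns:air="%s" TraceId="roventrace" ResponseTime="337"/>'
    "</SOAP:Body>"
    "</SOAP:Envelope>"
) % AIR_NS


def extract_response_time(xml_text):
    """Return the ResponseTime attribute as a string, or None if the element is missing."""
    root = ET.fromstring(xml_text)
    # ElementTree addresses namespaced tags as "{namespace-uri}LocalName".
    tag = "{%s}LowFareSearchRsp" % AIR_NS
    element = next(root.iter(tag), None)
    if element is None:
        # For example a SOAP fault or an error page instead of a search response.
        return None
    return element.get("ResponseTime")


print(extract_response_time(SAMPLE))  # 337
```

Inside the timing loop this would be called as extract_response_time(r.content) right after each requests.post, so every request is checked against its own response. ET.fromstring raises ParseError on a body that is not well-formed XML, so an error page from the server would need a try/except around the call.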