Jan-15-2019, 04:31 PM
Hi,
I'm trying to scrape a remote batch server's jobs from its XML webpage.
Running the code line by line in the interpreter fails with a traceback at
tree = ET.parse(r.content)
It appears the second call (get) has not been able to use the session cookie, as per the traceback (Please log in to THE Batch Server with valid user...)
Can anyone see what I am doing wrong? TIA
This is my code:
import requests
import requests.packages.urllib3
from lxml import html
from lxml import etree
import xml.etree.ElementTree as ET

requests.packages.urllib3.disable_warnings()

# create a session
s = requests.Session()

# make a login POST request, using the session
s.post("https://server01/login.jsp", data=dict(username="UserA", password="PasswordA"), verify=False)

# subsequent requests that use the session will automatically handle cookies
r = s.get("https://serverA/admin?action=xmlStatus", cookies=s.cookies)
# if [print(s.cookies)] a JSESSIONID is returned

# print Batch Jobs
tree = ET.parse(r.content)
batch_jobs = tree.xpath("//div[@id='collapsible2']/div[1]/div[2]/div[1]/span[2]/text()")
print (batch_jobs)

This is the Traceback I get:
Error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/xml/etree/ElementTree.py", line 1196, in parse
    tree.parse(source, parser)
  File "/usr/lib64/python3.6/xml/etree/ElementTree.py", line 586, in parse
    source = open(source, "rb")
FileNotFoundError: [Errno 2] No such file or directory: b'\r\n\r\n<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">\r\n<html>\r\n <head>\r\n\t<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r\n <title>THE Batch Server</title>\r\n <link rel="stylesheet" type="text/css" href="css/page.css"/>\r\n </head>\r\n <body>\r\n <h1> <img height="22" width="22" src="images/batchdefault.png"/> THE Batch Server Login</h1>\r\n \r\n \r\n Please log in to THE Batch Server with valid user which has access to use case \'Scheduled Jobs\'.\r\n \r\n <p/>\r\n <form method="post" action="login">\r\n <table border="0">\r\n <tbody>\r\n <tr>\r\n <td>User ID</td>\r\n <td>\r\n \r\n <input type="text" name="username">\r\n \r\n </td>\r\n </tr>\r\n <tr>\r\n <td>Password</td>\r\n <td><input type="password" name="password"></td>\r\n </tr>\r\n </tbody>\r\n </table>\r\n <input type="submit" value="Submit">\r\n </form>\r\n </body>\r\n</html>'
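
Reading the traceback, two separate things seem to be going on. First, ET.parse() expects a filename or file object, so passing r.content makes it try to open the response bytes as a path, which is where the FileNotFoundError comes from. Second, those bytes are the login page, so the session does not look authenticated. Note that the login form in that page posts to "login" while the code posts to "login.jsp", and the two requests use different hosts (server01 and serverA). Either could be the cause, but that is a guess. Also, .xpath() is an lxml method; xml.etree.ElementTree offers find()/findall() with a limited XPath subset. Below is a minimal sketch of parsing in-memory bytes and refusing the login page. The helper name parse_status, the LOGIN_MARKER check, and the sample XML are all assumptions for illustration, since the real xmlStatus layout is not shown here.

```python
import xml.etree.ElementTree as ET

# Assumed marker: the heading text of the login page seen in the traceback.
LOGIN_MARKER = b"THE Batch Server Login"


def parse_status(content: bytes) -> ET.Element:
    """Parse response bytes as XML, refusing the login page (hypothetical helper)."""
    if LOGIN_MARKER in content:
        raise PermissionError("got the login page; the session is not authenticated")
    # fromstring() parses in-memory data; parse() wants a filename or file object.
    return ET.fromstring(content)


# Made-up sample standing in for r.content; the real document will differ.
sample = b"<status><job name='nightly'>RUNNING</job></status>"
root = parse_status(sample)

# ElementTree supports a limited XPath subset through findall().
job_states = [job.text for job in root.findall(".//job")]
print(job_states)
```

With the real response, parse_status(r.content) would raise straight away while the server is still returning the login page, which makes the authentication problem visible before any parsing is attempted.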