Selenium get data from newly accessed page
Python Forum (https://python-forum.io), Web Scraping & Web Development, Thread: /thread-21681.html

Selenium get data from newly accessed page - hoff1022 - Oct-09-2019

I don't know how to get data from a newly loaded page (one reached by submitting a form) using Selenium. I have two questions:

1) I load an ASP page with Selenium, enter a value, and a new page loads. On this new page I want to access this data: <td class="queryfield" valign="top">05/09/2019 </td>. How do I make Selenium grab data from the newly loaded page rather than the original page, and how do I access this item specifically? I'm looking to return the 05/09/2019 value.

2) How do I make Selenium wait until the page loads? I currently use sleep(.5); see my code below.

Thanks.

[inline]
from time import sleep

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

FMCSA_SNAPSHOT_WEBSITE = "https://safer.fmcsa.dot.gov/CompanySnapshot.aspx"
dot = "348313"  # USDOT number used as the test case

driver = webdriver.Firefox()
driver.get(FMCSA_SNAPSHOT_WEBSITE)
sleep(.5)
input_dot = driver.find_element(By.ID, '4')  # the search box has id="4"
sleep(.5)
input_dot.send_keys(dot)
input_dot.send_keys(Keys.ENTER)
# how do I load data from this new page?
[/inline]

RE: Selenium get data from newly accessed page - metulburr - Oct-09-2019

Use Selenium's proper waiting methods instead of time.sleep. Half a second might not be long enough to load the page; it all depends on the internet connection, its speed, and the computer. That is why time.sleep is not a good option.

The driver follows whatever pages load after driver.get(FMCSA_SNAPSHOT_WEBSITE) executes. For example, if hitting search loads a new page, the driver should contain that page's HTML when you read driver.page_source. You just have to make sure to wait for it properly.

Occasionally reading driver.page_source will not work, because the new content is in a different frame or a new tab, in which case you have to switch to it to get the content.

Another alternative is to use the site's embedded API by putting the search terms directly in the URL. For example, searching for Jim with the name option selected gives the URL https://safer.fmcsa.dot.gov/keywordx.asp?searchstring=%2AJIM%2A&SEARCHTYPE= and it immediately goes to what I am assuming is the page you are referring to.

RE: Selenium get data from newly accessed page - hoff1022 - Oct-09-2019

@metulburr

1) https://safer.fmcsa.dot.gov/keywordx.asp?searchstring=%2AJIM%2A&SEARCHTYPE= This query only searches by name ("JIM"). What if I want to search by "USDOT Number" instead of by name?

2) Also, when I read driver.page_source I get the original page ("https://safer.fmcsa.dot.gov/CompanySnapshot.aspx"), not the newly loaded page with the data. How do I get data from the newly loaded page, which has this info:

Entity Type: CARRIER
Operating Status: ACTIVE
Out of Service Date: None
Legal Name: PRINTLINK SHORT RUN BUSINESS FORMS
DBA Name: PRINTLINK PALMER
Physical Address: 309 FRITZ KEIPER BLVD BATTLE CREEK, MI 49037
Phone: (269) 965-1336
Mailing Address: P O BOX 428 BATTLE CREEK, MI 49016-0428
USDOT Number: 348313
State Carrier ID Number:
MC/MX/FF Number(s):
DUNS Number: 72-574-015
Power Units: 1
Drivers: 2
MCS-150 Form Date: 05/09/2019
MCS-150 Mileage (Year): 52,025 (2013)
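
The two suggestions from metulburr's reply (switching to a new tab, and putting the search in the URL) could be sketched roughly as follows. Both function names are made up for this example. The URL builder only reproduces the name-search URL quoted in the thread; the parameters for a USDOT-number search are not given here, so it does not attempt one. It is also not confirmed that this site opens its results in a new window at all; the second helper only covers that case if it happens.

```python
from urllib.parse import urlencode

# Endpoint copied from the URL quoted in the thread.
SEARCH_URL = "https://safer.fmcsa.dot.gov/keywordx.asp"


def name_search_url(name):
    """Build the name-search URL, wrapping the name in * wildcards as in the quoted URL."""
    query = urlencode({"searchstring": f"*{name}*", "SEARCHTYPE": ""})
    return f"{SEARCH_URL}?{query}"


def switch_to_new_window(driver, handles_before):
    """Switch to a window or tab that was not open before the form was submitted.

    Returns True if a new window was found and switched to, False otherwise.
    """
    for handle in driver.window_handles:
        if handle not in handles_before:
            driver.switch_to.window(handle)
            return True
    return False
```

In use, handles_before = driver.window_handles would be recorded before sending Keys.ENTER, and switch_to_new_window(driver, handles_before) called afterwards, before reading driver.page_source. If the results turn out to be inside a frame instead, driver.switch_to.frame(...) is the analogous call. For the URL route, driver.get(name_search_url("JIM")) loads the search results page directly.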