Selenium Parsing (unable to Parse page after loading) - Python Forum (https://python-forum.io), Web Scraping & Web Development, Thread: /thread-15171.html
Selenium Parsing (unable to Parse page after loading) - oneclick - Jan-07-2019

I'm trying to scrape a torrentz URL. I get the "page loading" HTML instead of the search result HTML. I tried adding sleep time, but that didn't work. Does anyone know how to do this?

    import time
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from bs4 import BeautifulSoup
    import requests

    browser = webdriver.Firefox()
    browser.get("https://torrentz.eu/")
    time.sleep(10)
    #class selenium.webdriver.support.expected_conditions.title_contains(Torrent)
    search = browser.find_element_by_id('thesearchbox')
    search.send_keys('xxxxx')
    search.send_keys(Keys.RETURN)  # hit return after you enter search text
    time.sleep(10)
    tempurl = browser.current_url
    print(tempurl)
    tempcont = requests.get(tempurl, timeout=10)
    soup = BeautifulSoup(tempcont.content, "html.parser")
    print(soup.prettify())

RE: Selenium Parsing (unable to Parse page after loading) - hbknjr - Jan-07-2019

Make sure your ISP doesn't block torrent sites. You can wait until the element is available. Try waits:

    from selenium.webdriver.support.ui import WebDriverWait

    # WebDriverWait(DRIVER, TIMEOUT_SECONDS)
    search = WebDriverWait(browser, 5).until(lambda x: x.find_element_by_id('thesearchbox'))
    browser.find_element_by_id('thesearchbutton').click()

It keeps handling the NoSuchElementException error for the specified number of seconds.

RE: Selenium Parsing (unable to Parse page after loading) - oneclick - Jan-08-2019

(Jan-07-2019, 07:34 AM)hbknjr Wrote: Make sure your ISP doesn't block torrent sites.

I am able to get the search result and see the result page. The code executes perfectly up to print(tempurl). The next two lines do not give me the parsed HTML of the search result; instead I get the HTML of the loading page. You can try this code for yourself. Is there any way around this?
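hbknjr's remark that WebDriverWait "keeps handling NoSuchElementException" for the timeout period can be illustrated without a browser. The following is a minimal sketch of that polling idea, not Selenium's actual implementation: `wait_until`, `ElementNotFound`, and `find_search_box` are hypothetical names made up for this example, and the "page" is simulated so that the search box only appears on the third lookup.

```python
import time


class ElementNotFound(Exception):
    """Hypothetical stand-in for Selenium's NoSuchElementException."""


def wait_until(condition, timeout, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    ElementNotFound raised by condition() is swallowed and the call is
    retried until `timeout` seconds have passed; then TimeoutError is raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ElementNotFound:
            pass  # element is not there yet, keep polling
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated page: the search box "appears" on the third lookup.
attempts = {"count": 0}


def find_search_box():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ElementNotFound("thesearchbox")
    return "<input id='thesearchbox'>"


element = wait_until(find_search_box, timeout=5, poll_interval=0.01)
print(element, "found after", attempts["count"], "attempts")
```

Unlike a fixed time.sleep(10), a loop of this shape returns as soon as the element shows up and only spends the full timeout when the element never appears.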
RE: Selenium Parsing (unable to Parse page after loading) - metulburr - Jan-08-2019

torrentz.eu is no longer active
https://tribune.com.pk/story/1156409/torrent-search-engine-torrentz-eu-offline-no-one-knows/

Quote: Although the home page of Torrentz.eu is still active, it has completely disabled its search functionality and has removed all torrent links. It is still not clear why the website has been shut down.

RE: Selenium Parsing (unable to Parse page after loading) - oneclick - Jan-11-2019

(Jan-08-2019, 04:15 AM)metulburr Wrote: torrentz.eu is no longer active https://tribune.com.pk/story/1156409/torrent-search-engine-torrentz-eu-offline-no-one-knows/ Quote: Although the home page of Torrentz.eu is still active, it has completely disabled its search functionality and has removed all torrent links. It is still not clear why the website has been shut down.

I have tried with torrentz2.eu; same problem.

RE: Selenium Parsing (unable to Parse page after loading) - metulburr - Jan-11-2019

Your code does work with that site for me. It printed out the HTML. However, you don't need requests or BeautifulSoup if you use Selenium. Selenium can make requests and parse HTML. You can also run it in the background so it doesn't bring up a browser. You should also use a wait instead of time.sleep. It will be faster.
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    browser = webdriver.Firefox()
    browser.get("https://torrentz2.eu/")
    # wait up to 10 seconds for the search form instead of sleeping
    WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.ID, 'search')))
    #class selenium.webdriver.support.expected_conditions.title_contains(Torrent)
    search = browser.find_element_by_id('thesearchbox')
    search.send_keys('xxxxx')
    search.send_keys(Keys.RETURN)  # hit return after you enter search text
    # wait up to 10 seconds for the results to be present
    WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'results')))

RE: Selenium Parsing (unable to Parse page after loading) - oneclick - Jan-13-2019

(Jan-11-2019, 03:39 PM)metulburr Wrote: Your code does work with that site for me. It printed out the HTML. However, you don't need requests or BeautifulSoup if you use Selenium. Selenium can make requests and parse HTML. You can also run it in the background so it doesn't bring up a browser. You should also use a wait instead of time.sleep. It will be faster.

Thank you. By the time I read your post, I had found another way around:

    code = browser.page_source

This code helped me read the HTML content. Thanks for this generous help.

RE: Selenium Parsing (unable to Parse page after loading) - tomalex - Oct-30-2020

(Jan-07-2019, 03:08 AM)oneclick Wrote: I'm trying to scrape a torrentz URL. I get the "page loading" HTML instead of the search result HTML. I tried adding sleep time, but that didn't work.

Amazing one, sir. I have used it before, but it seems this website is not working any more. Could you please create one for this site? torrentzeu.org

Thank you.
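For later readers, here is a rough sketch that combines the pieces of advice in this thread: run Firefox in the background, wait for the results instead of sleeping, and read browser.page_source rather than re-fetching the URL with requests. A second requests.get call is a separate HTTP request that runs no JavaScript, which is probably (an assumption, not confirmed in the thread) why it returned the loading page. The names `fetch_rendered_html`, `extract_links`, and `LinkCollector` are hypothetical, the 'results' class name is taken from metulburr's post and may not match any current site, and the standard library's html.parser is swapped in for BeautifulSoup only to keep the example dependency-free.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string (for example browser.page_source)."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch_rendered_html(url, results_class="results", timeout=10):
    """Load url in headless Firefox and return the HTML the browser holds."""
    # Imported here so extract_links() also works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("-headless")  # no visible browser window
    browser = webdriver.Firefox(options=options)
    try:
        browser.get(url)
        # Assumed class name; adjust to whatever marks the loaded results.
        WebDriverWait(browser, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, results_class))
        )
        return browser.page_source
    finally:
        browser.quit()


# Offline demonstration with a hand-written HTML snippet.
sample = '<div class="results"><a href="/abc">First result</a></div>'
print(extract_links(sample))
```

With a real browser and a reachable results page, the two functions would be chained as extract_links(fetch_rendered_html(url)); the offline sample above only exercises the parsing half.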