 Selenium Parsing (unable to Parse page after loading)
#1
I'm trying to scrape a torrentz URL, but I get the "page loading" HTML instead of the search-result HTML. I tried adding sleep time, but that didn't help.

Does anyone know how to fix this?
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import requests

browser = webdriver.Firefox()
browser.get("https://torrentz.eu/")
time.sleep(10)

#class selenium.webdriver.support.expected_conditions.title_contains(Torrent)

search = browser.find_element_by_id('thesearchbox')
search.send_keys('xxxxx')
search.send_keys(Keys.RETURN) # hit return after you enter search text
time.sleep(10)
    
tempurl = browser.current_url
print(tempurl)
tempcont = requests.get(tempurl, timeout=10)
soup = BeautifulSoup(tempcont.content, "html.parser")

print(soup.prettify())
#2
Make sure your ISP doesn't block torrent sites.

You can wait until the element is present instead of sleeping for a fixed time.
Try an explicit wait:

from selenium.webdriver.support.ui import WebDriverWait

# WebDriverWait(DRIVER, TIMEOUT_SECONDS)
search = WebDriverWait(browser, 5).until(lambda x: x.find_element_by_id('thesearchbox'))
browser.find_element_by_id('thesearchbutton').click()
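WebDriverWait keeps retrying the lookup and ignores NoSuchElementException until the timeout (5 seconds here) runs out; if the element still has not appeared, it raises TimeoutException.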
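To make the retry behaviour concrete, here is a rough pure-Python sketch of the polling idea behind an explicit wait. It is not Selenium's implementation: `wait_until`, its parameters, and the use of `LookupError` as a stand-in for NoSuchElementException are all assumptions made for illustration.

```python
import time


def wait_until(condition, timeout, poll_interval=0.5, ignored=(LookupError,)):
    """Call condition() repeatedly until it returns something truthy.

    Exceptions listed in `ignored` are swallowed and treated as "not ready yet".
    Raises TimeoutError once `timeout` seconds have passed without success.
    (Hypothetical helper; LookupError stands in for NoSuchElementException.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # element not there yet, try again
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

The point of the design is that the call returns as soon as the condition holds, so a 5 second timeout only costs 5 seconds in the failure case, unlike a fixed time.sleep(5).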
#3
I am able to run the search and see the result page; the code executes fine up to
print(tempurl)
but the next two lines do not give me the parsed HTML of the search results. Instead I get the HTML of the loading page.

You can try the code yourself.

Is there any way around this?

#4
torrentz.eu is no longer active

https://tribune.com.pk/story/1156409/tor...one-knows/

Quote:Although the home page of Torrentz.eu is still active, it has completely disabled its search functionality and has removed all torrent links. It is still not clear why the website has been shut down.
#5
I have tried with torrentz2.eu and get the same problem.
#6
Your code does work with that site for me; it printed out the HTML. However, you don't need requests or BeautifulSoup if you use Selenium: the browser it drives has already loaded the page, and Selenium can locate elements in it and hand you the HTML. You can also run the browser in the background (headless) so no window comes up. You should also use explicit waits instead of time.sleep; a wait returns as soon as the condition is met, so the script is faster.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

 
browser = webdriver.Firefox()
browser.get("https://torrentz2.eu/")
WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.ID, 'search')))
 
#class selenium.webdriver.support.expected_conditions.title_contains(Torrent)
 
search = browser.find_element_by_id('thesearchbox')
search.send_keys('xxxxx')
search.send_keys(Keys.RETURN) # hit return after you enter search text
WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'results')))
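The "in the background" part is not shown in the code above, so here is a minimal sketch of how a headless run might look. The helper names `make_headless_firefox` and `rendered_html` are made up for this example, and it assumes a recent Selenium release with geckodriver available. Note that newer Selenium releases (4.x) use `find_element(By.ID, ...)` in place of `find_element_by_id(...)`.

```python
def make_headless_firefox():
    """Start Firefox without a visible window (hypothetical helper).

    Selenium is imported inside the function so this module can still be
    loaded on a machine where Selenium is not installed.
    """
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")  # Firefox flag for running with no window
    return webdriver.Firefox(options=options)


def rendered_html(browser, url, ready, timeout=10):
    """Load `url`, wait until ready(browser) is truthy, return the page HTML.

    `ready` can be any callable taking the driver, for example an
    expected_conditions object. Raises TimeoutException if it never succeeds.
    """
    from selenium.webdriver.support.ui import WebDriverWait

    browser.get(url)
    WebDriverWait(browser, timeout).until(ready)
    return browser.page_source


# Possible usage (the locator is taken from the post above and may not
# match the live site):
#
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support import expected_conditions as EC
#
#   browser = make_headless_firefox()
#   try:
#       html = rendered_html(
#           browser,
#           "https://torrentz2.eu/",
#           EC.presence_of_element_located((By.ID, "thesearchbox")),
#       )
#       print(html)
#   finally:
#       browser.quit()  # always close the background browser
```

Calling `browser.quit()` in a `finally` block matters more in headless mode, since a leftover browser process has no window to remind you it is still running.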
     

#7

Thank you!

By the time I read your post I had found another way around:
code = browser.page_source
This gave me the HTML content of the result page as the browser sees it.
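For anyone landing here later: once you have the string from `browser.page_source`, you can parse it like any other HTML. In the original script that would be `BeautifulSoup(browser.page_source, "html.parser")`. The sketch below swaps in the standard library's `html.parser` so it runs without third-party packages; `LinkCollector` and `extract_links` are names invented for this example, and the sample markup in the usage note is made up, not taken from the real site.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in `html`."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

With a live session this would be called as `extract_links(browser.page_source)`. Anchors without an href attribute are skipped, so `extract_links('<a href="/abc">Example</a><a name="x">skip</a>')` yields only the first link.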

Thanks for the generous help.