Python Forum
Click a button to get next page - Printable Version



Click a button to get next page - ian - Apr-14-2018

I have worked up to the following code and can get the page with the button. I need to click it to go to the next page. I would prefer to use Requests or BeautifulSoup. I also tried to install mechanize, but it failed with the error "mechanize only works on python 2.x". I use Python 3.6.2.

soup = BeautifulSoup(page.content, "html.parser")
html = list(soup.children)[1]
body = list(soup.children)[2]
btn = body.find("button")
print(btn)
<button class="actionbtn" id="getlinks" type="submit"><table class="gl1"><tr><td class="gl2"><img alt="Get download links" src="img/links.png"/></td><td class="gl3">Get download links!<br/></td></tr></table></button>
>>>


RE: Click a button to get next page - snippsat - Apr-14-2018

(Apr-14-2018, 03:32 PM)ian Wrote: I need to click it to go to the next page. I would prefer to use Requests or BeautifulSoup.
That is not a job these two are well suited for.

If you need to interact with a web page (click buttons, scroll, etc.), you need a tool that drives a real browser, like Selenium.
I write more about this in Web-scraping part-2.


RE: Click a button to get next page - ian - Apr-14-2018

This button's type is 'submit'. I'm wondering if I can use requests.Session().post.


RE: Click a button to get next page - snippsat - Apr-14-2018

(Apr-14-2018, 05:05 PM)ian Wrote: This button's type is 'submit'. I'm wondering if I can use requests.Session().post.
Submitting a form is possible with Requests.
Here is an example with this form Pen.
With Requests:
import requests
 
headers = {'Content-type': 'application/x-www-form-urlencoded'}
data = {"email": "[email protected]"}
response = requests.post('http://127.0.0.1:5000/email', headers=headers, data=data)
I have run this example before: it sends the data to a Flask app, which picks it up on the server with email = request.form['email'].
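To make the link between a submit button and a POST more concrete, here is a minimal sketch of what "clicking" such a button means at the HTTP level: the browser collects the named fields of the surrounding form and sends them, form-encoded, to the form's action URL. The thread never shows the form around the "getlinks" button, so the page URL, the action and the field names below are all made up for illustration. The sketch uses only the standard library (html.parser and urllib.parse) so it runs without extra packages, and it handles plain input fields only, not select, textarea or checkbox rules.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormCollector(HTMLParser):
    """Collect the action, method and named <input> fields of the first form."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Only inputs that carry a name are sent with the form
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


# Hypothetical page: the real form is not shown in the thread,
# so this URL, the action and the field names are assumptions.
PAGE_URL = "http://127.0.0.1:5000/start"
PAGE_HTML = """
<form action="/links" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="email" value="user@example.com">
  <button class="actionbtn" id="getlinks" type="submit">Get download links!</button>
</form>
"""

collector = FormCollector()
collector.feed(PAGE_HTML)

# Resolve the (possibly relative) action against the page URL
target_url = urljoin(PAGE_URL, collector.action)
# This is the form-encoded body a browser would send on submit
body = urlencode(collector.fields)

print(collector.method, target_url)
print(body)
```

With Requests, the matching call would be requests.post(target_url, data=collector.fields). Passing a dict as data makes Requests form-encode it and set the Content-type itself, so the explicit header in the example above is optional. If the page builds its request with JavaScript instead of a plain form, this approach will likely not be enough and a real browser is the safer route.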


RE: Click a button to get next page - ian - Apr-14-2018

I tried Selenium. It asks for a webdriver. I tried to install one, but it hangs. I tried downloading the webdriver for Edge, IE, Firefox and Chrome, all with the same result. I use Windows 10.


RE: Click a button to get next page - snippsat - Apr-15-2018

(Apr-14-2018, 08:57 PM)ian Wrote: I tried to install one, but it hangs.
I don't know what you mean by hanging.

As a first test, run pip install -U selenium.
C:\1_py\scrape
λ pip install -U selenium
Collecting selenium
  Downloading selenium-3.11.0-py2.py3-none-any.whl (943kB)
    100% |████████████████████████████████| 952kB 484kB/s
Installing collected packages: selenium
  Found existing installation: selenium 3.9.0
    Uninstalling selenium-3.9.0:
      Successfully uninstalled selenium-3.9.0
Successfully installed selenium-3.11.0
Download chromedriver and unzip it to a folder, e.g. C:\scrape.
The script below goes in the same folder.
C:\scrape
  |-- get_doc.py
  |-- chromedriver.exe
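# get_doc.py
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

# Start Chrome; chromedriver.exe sits in the same folder as this script
browser = webdriver.Chrome()
browser.get("http://www.python.org")
time.sleep(5)  # crude fixed pause so the page can finish loading
# Locate the "Docs" link in the top navigation bar and click it
doc = browser.find_element(By.XPATH, '//*[@id="top"]/nav/ul/li[3]/a')
doc.click()
time.sleep(5)  # keep the result visible for a moment
browser.quit()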
Now run get_doc.py from the command line.
C:\1_py\scrape
λ python get_doc.py
It should start the browser, and after 5 seconds it clicks on Docs, so you end up in the Python 3.6.5 documentation.


RE: Click a button to get next page - ian - Apr-15-2018

Works great! Thank you very much.
I thought I needed to install the webdrivers manually, but a command-line window popped up and hung there.

Now I can find buttons/links and click to go to the next page OK.
Is there a way to replace time.sleep in your sample with something else, so I can search for elements as soon as the next page has fully loaded? I tried 'Wait.until' but cannot figure out how to wait until ALL elements are loaded. It would be similar to Microsoft PowerShell's While ($ie.busy) { Start-Sleep -Seconds 1 }. Thanks.


RE: Click a button to get next page - snippsat - Apr-15-2018

(Apr-15-2018, 06:42 PM)ian Wrote: Is there a way to replace time.sleep in your sample with something else, so I can search for elements as soon as the next page has fully loaded? I tried 'Wait.until' but cannot figure out how to wait until ALL elements are loaded. It would be similar to Microsoft PowerShell's While ($ie.busy) { Start-Sleep -Seconds 1 }. Thanks.
There are two kinds of waits (doc): explicit waits and implicit waits.
So time.sleep(...) is a forced wait, no matter how fast the site loads its elements.
It can be used as a first test, but after that it is better to look at these waits.
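As a rough illustration of the idea behind an explicit wait, here is a hand-rolled polling loop, close in spirit to the PowerShell While ($ie.busy) loop. It is not Selenium's own implementation: wait_until, page_loaded and FakeBrowser are names invented for this sketch, and FakeBrowser only stands in for a real driver so the code runs without a browser. The one Selenium call assumed here is the driver's execute_script method, used to read document.readyState.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() every `poll` seconds until it returns a truthy value.

    Returns that value, or raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


def page_loaded(browser):
    """True once the browser reports that the document has finished loading."""
    return browser.execute_script("return document.readyState") == "complete"


class FakeBrowser:
    """Stand-in for a Selenium driver so the sketch runs without a browser."""

    def __init__(self, states):
        self._states = iter(states)

    def execute_script(self, script):
        # A real driver would run the script in the page; here we replay states
        return next(self._states)


browser = FakeBrowser(["loading", "interactive", "complete"])
wait_until(lambda: page_loaded(browser), timeout=5, poll=0.01)
print("page loaded")
```

With a real driver you would normally not write this loop yourself. Selenium's explicit wait does the same kind of polling, e.g. WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.ID, "some-id"))), where "some-id" is a placeholder for an element you expect on the next page, and browser.implicitly_wait(10) sets an implicit wait for all element lookups. Note that a readyState of "complete" does not guarantee that content added later by JavaScript is already there, so waiting for a specific element is usually the more reliable test.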