Python Forum
Python requests to loop over the list and click button
#1
I edited my original post after doing more research.

This is my first post on Stack Overflow. For my question I read the post "post request using python to asp.net page" and found what I was looking for, but I need some minor help.

My question is: I want to scrape the website http://up-rera.in/. Using inspect element, the website redirects to a different link, which is http://upreraportal.cloudapp.net/View_projects.aspx

It is using ASPX (ASP.NET Web Forms).

My query is: how can I loop over all the dropdown options and click Search to get the page content? For example, select Agra and click Search and you will get the page details.

Since this is my learning phase, I am avoiding Selenium for now to get the page details.

Can anyone guide me and help me amend my code, which is mentioned below?

import requests
from bs4 import BeautifulSoup
import os
import time
import csv
from lxml import html


url = "http://upreraportal.cloudapp.net/View_projects.aspx"

headers= {'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
          'Accept-Encoding':'gzip, deflate',
          'Accept-Language':'en-US,en;q=0.8',
          'Cache-Control':'max-age=0',
          'Connection':'keep-alive',
          'Content-Length':'2922',   # note: requests computes Content-Length itself; hard-coding it (and the Cookie) is fragile
          'Content-Type':'application/x-www-form-urlencoded',
          'Cookie':'ASP.NET_SessionId=syk2ntjk3ceffibbwq4nert4',
          'Host':'upreraportal.cloudapp.net',
          'Origin':'http://upreraportal.cloudapp.net',
          'Referer':'http://upreraportal.cloudapp.net/View_projects.aspx',
          'Upgrade-Insecure-Requests':'1',
          'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}

formfields={'VIEWSTATE':'9VAv5iAKM/uLKHgQ6U91ShYmoKdKfrPqrxB2y86PhSY8pOPAulcgfrsPDINzwmvXGr+vdlE7FT6eBQCKtAFsJPQ9gQ9JIBTBCGCIjYwFuixL3vz6Q7R0OZTH2cwYmyfPHLOqxh8JbLDfyKW3r3e2UgP5N/4pI1k6DNoNAcEmNYGPzGwFHgUdJz3LYfYuFDZSydsVrwSB5CHAy/pErTJVDMmOackTy1q6Y+TNw7Cnq2imnKnBc70eldJn0gH/rtkrlPMS+WP3CXke6G7nLOzaUVIlnbHVoA232CPRcWuP1ykPjSfX12hAao6srrFMx5GUicO3Dvpir+z0U1BDEjux86Cu5/aFML2Go+3k9iHiaS3+WK/tNNui5vNAbQcPiZrnQy9wotJnw18bfHZzU/77uy22vaC+8vX1cmomiV70Ar33szSWTQjbrByyhbFbz9PHd3IVebHPlPGpdaUPxju5xkFQIJRnojsOARjc76WzTYCf479BiXUKNKflMFmr3Fp5S3BOdKFLBie1fBDgwaXX4PepOeZVm1ftY0YA4y8ObPxkJBcGh5YLxZ4vJr2z3pd8LT2i/2fyXJ9aXR9+SJzlWziu9bV8txiuJHSQNojr10mQv8MSCUAKUjT/fip8F3UE9l+zeQBOC++LEeQiTurHZD0GkNix8zQAHbNpGLBfvgocXZd/4KqqnBCLLwBVQobhRbJhbQJXbGYNs6zIXrnkx7CD9PjGKvRx9Eil19Yb5EqRLJQHSg5OdwafD1U+oyZwr3iUMXP/pJw5cTHMsK3X+dH4VkNxsG+KFzBzynKPdF17fQknzqwgmcQOxD6NN6158pi+9cM1UR4R7iwPwuBCOK04UaW3V1A9oWFGvKLls9OXbLq2DS4L3EyuorEHnxO+p8rrGWIS4aXpVVr4TxR3X79j4i8OVHhIUt8H+jo5deRZ6aG13+mXgZQd5Qu1Foo66M4sjUGs7VUcwYCXE/DP/NHToeU0hUi0sJs7+ftRy07U2Be/93TZjJXKIrsTQxxeNfyxQQMwBYZZRPPlH33t3o3gIo0Hx18tzGYj2v0gaBb+xBpx9mU9ytkceBdBPnZI1kJznArLquQQxN3IPjt6+80Vow74wy4Lvp7D+JCThAnQx4K8QbdKMWzCoKR63GTlBwLK2TiYMAVisM77XdrlH6F0g56PlGQt/RMtU0XM1QXgZvWr3KJDV8UTe0z1bj29sdTsHVJwME9eT62JGZFQAD4PoiqYl7nAB61ajAkcmxu0Zlg7+9N9tXbL44QOcY672uOQzRgDITmX6QdWnBqMjgmkIjSo1qo/VpUEzUXaVo5GHUn8ZOWI9xLrJWcOZeFl0ucyKZePMnIxeUU32EK/NY34eE6UfSTUkktkguisYIenZNfoPYehQF9ASL7t4qLiH5jca4FGgZW2kNKb3enjEmoKqbWDFMkc8/1lsk2eTd/GuhcTysVSxtvpDSlR0tjg8A2hVpR67t2rYm8iO/L1m8ImY48=',
            "__VIEWSTATEGENERATOR":'4F1A7E70',
            '__EVENTVALIDATION':'jVizPhFNJmo9F/GVlIrlMWMsjQe1UKHfYE4jlpTDfXZHWu9yAcpHUvT/1UsRpbgxYwZczJPd6gsvas8ilVSPkfwP1icGgOTXlWfzykkU86LyIEognwkhOfO1+suTK2e598vAjyLXRf555BXMtCO+oWoHcMjbVX2cHKtpBS1GyyqyyVB8IchAAtDEMD3G5bbzhvof6PX4Iwt5Sv1gXkHRKOR333OcYzmSGJvZgLsmo3qQ+5EOUIK5D71x/ZENmubZXvwbU0Ni6922E96RjCLh5cKgFSne5PcRDUeeDuEQhJLyD04K6N45Ow2RKyu7HN1n1YQGFfgAO3nMCsP51i7qEAohXK957z3m/H+FasHWF2u05laAWGVbPwT35utufotpPKi9qWAbCQSw9vW9HrvN01O97scG8HtWxIOnOdI6/nhke44FSpnvY1oPq+BuY2XKrb2404fKl5EPR4sjvNSYy1/8mn6IDH0eXvzoelNMwr/pKtKBESo3BthxTkkx5MR0J42qhgHURB9eUKlsGulAzjF27pyK4vjXxzlOlHG1pRiQm/wzB4om9dJmA27iaD7PJpQGgSwp7cTpbOuQgnwwrwUETxMOxuf3u1P9i+DzJqgKJbQ+pbKqtspwYuIpOR6r7dRh9nER2VXXD7fRfes1q2gQI29PtlbrRQViFM6ZlxqxqoAXVM8sk/RfSAL1LZ6qnlwGit2MvVYnAmBP9wtqcvqGaWjNdWLNsueL6DyUZ4qcLv42fVcOrsi8BPRnzJx0YiOYZ7gg7edHrJwpysSGDR1P/MZIYFEEUYh238e8I2EAeQZM70zHgQRsviD4o5r38VQf/cM9fjFii99E/mZ+6e0mIprhlM/g69MmkSahPQ5o/rhs8IJiM/GibjuZHSNfYiOspQYajMg0WIGeKWnywfaplt6/cqvcEbqt77tIx2Z0yGcXKYGehmhyHTWfaVkMuKbQP5Zw+F9X4Fv5ws76uCZkOxKV3wj3BW7+T2/nWwWMfGT1sD3LtQxiw0zhOXfY1bTB2XfxuL7+k5qE7TZWhKF4EMwLoaML9/yUA0dcXhoZBnSc',
            'ctl00$ContentPlaceHolder1$DdlprojectDistrict':'Agra',
            'ctl00$ContentPlaceHolder1$txtProject': '',
            'ctl00$ContentPlaceHolder1$btnSearch':'Search'}

# I don't know how to build the payload correctly
resp = requests.post(url, data=formfields, headers=headers)
data=resp.text
#print(data)

soup = BeautifulSoup(data,"html.parser")
get_details = soup.find_all(id="ContentPlaceHolder1_GridView1")

print(get_details)  # whenever I run this code it prints an empty list
## I don't know how to get the details for each item in the dropdown
This code prints the HTML page, which I don't need. What I need is:

How can I print the data for an individual district (getting the page details), and how can I loop over all the districts to get the data for every district and every page?
#2
You can parse the website for information with bs4, but you'll need Selenium for interaction with the website.

As for your current problem, you need to find every dropdown option and then get its value attribute.

Something like:
get_details = soup.find_all('option')   # gets a list of all <option> tags
for element in get_details:
    print(element['value'])             # gets the value attribute of each option tag
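To see this working in isolation, here is a self-contained sketch; the HTML snippet is a made-up stand-in for the real district dropdown on the page:

```python
from bs4 import BeautifulSoup

# made-up stand-in for the district dropdown markup
html = """
<select id="ContentPlaceHolder1_DdlprojectDistrict">
    <option value="Agra">Agra</option>
    <option value="Aligarh">Aligarh</option>
    <option value="Allahabad">Allahabad</option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")
# collect the value attribute of every <option> tag
values = [opt["value"] for opt in soup.find_all("option")]
print(values)  # ['Agra', 'Aligarh', 'Allahabad']
```

Each of those values can then be posted back as the `ctl00$ContentPlaceHolder1$DdlprojectDistrict` field of the form payload.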
#3
This is my concern, actually. I am trying different ways to scrape the website; I want to try it with requests and bs4 while avoiding Selenium. Let me know, sir, if you find a way to scrape through the whole list.
#4
See, there's a difference between scraping data from a static page and interacting with the page and the elements on it (using the DOM and JavaScript).

E.g.:
The bs4 and requests modules can scrape data from both URLs below:
https://www.google.com and https://www.google.co.in/search?q=python

But what they can't do is type a search query ("python") into Google's search box and press enter.

For that, you need to use Selenium, Splinter, or another browser-automation tool.

In your case, you are required to interact with the page in order to get to the URL where the actual data resides.
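The distinction shows up in how the request is built. When the query lives in the URL, a plain GET reproduces it; a short sketch using only the standard library to show the URL construction:

```python
from urllib.parse import urlencode

# a query that lives in the URL can be fetched directly with requests.get()
base = "https://www.google.com/search"
query = urlencode({"q": "python"})
url = f"{base}?{query}"
print(url)  # https://www.google.com/search?q=python

# By contrast, "typing into the box and pressing Search" on an ASP.NET
# Web Forms page is a POST that carries hidden form state, which is why
# the payload needs the __VIEWSTATE and __EVENTVALIDATION fields.
```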
#5
Hi sir,

Just a small mistake in the code: the leading __ was missing from __VIEWSTATE. When I added it, I was able to scrape the city AGRA and the data came through. Now, minor help: how can I loop over all the districts?

Do we also need Selenium here to loop through all the districts? Below is the code. If we can get one district, can we get the others the same way?


import requests
from bs4 import BeautifulSoup
import os
import time
import csv

final= []

url = "http://upreraportal.cloudapp.net/View_projects.aspx"

headers= {'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
          'Content-Type':'application/x-www-form-urlencoded',
          'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}

formfields={'__VIEWSTATE':'9VAv5iAKM/uLKHgQ6U91ShYmoKdKfrPqrxB2y86PhSY8pOPAulcgfrsPDINzwmvXGr+vdlE7FT6eBQCKtAFsJPQ9gQ9JIBTBCGCIjYwFuixL3vz6Q7R0OZTH2cwYmyfPHLOqxh8JbLDfyKW3r3e2UgP5N/4pI1k6DNoNAcEmNYGPzGwFHgUdJz3LYfYuFDZSydsVrwSB5CHAy/pErTJVDMmOackTy1q6Y+TNw7Cnq2imnKnBc70eldJn0gH/rtkrlPMS+WP3CXke6G7nLOzaUVIlnbHVoA232CPRcWuP1ykPjSfX12hAao6srrFMx5GUicO3Dvpir+z0U1BDEjux86Cu5/aFML2Go+3k9iHiaS3+WK/tNNui5vNAbQcPiZrnQy9wotJnw18bfHZzU/77uy22vaC+8vX1cmomiV70Ar33szSWTQjbrByyhbFbz9PHd3IVebHPlPGpdaUPxju5xkFQIJRnojsOARjc76WzTYCf479BiXUKNKflMFmr3Fp5S3BOdKFLBie1fBDgwaXX4PepOeZVm1ftY0YA4y8ObPxkJBcGh5YLxZ4vJr2z3pd8LT2i/2fyXJ9aXR9+SJzlWziu9bV8txiuJHSQNojr10mQv8MSCUAKUjT/fip8F3UE9l+zeQBOC++LEeQiTurHZD0GkNix8zQAHbNpGLBfvgocXZd/4KqqnBCLLwBVQobhRbJhbQJXbGYNs6zIXrnkx7CD9PjGKvRx9Eil19Yb5EqRLJQHSg5OdwafD1U+oyZwr3iUMXP/pJw5cTHMsK3X+dH4VkNxsG+KFzBzynKPdF17fQknzqwgmcQOxD6NN6158pi+9cM1UR4R7iwPwuBCOK04UaW3V1A9oWFGvKLls9OXbLq2DS4L3EyuorEHnxO+p8rrGWIS4aXpVVr4TxR3X79j4i8OVHhIUt8H+jo5deRZ6aG13+mXgZQd5Qu1Foo66M4sjUGs7VUcwYCXE/DP/NHToeU0hUi0sJs7+ftRy07U2Be/93TZjJXKIrsTQxxeNfyxQQMwBYZZRPPlH33t3o3gIo0Hx18tzGYj2v0gaBb+xBpx9mU9ytkceBdBPnZI1kJznArLquQQxN3IPjt6+80Vow74wy4Lvp7D+JCThAnQx4K8QbdKMWzCoKR63GTlBwLK2TiYMAVisM77XdrlH6F0g56PlGQt/RMtU0XM1QXgZvWr3KJDV8UTe0z1bj29sdTsHVJwME9eT62JGZFQAD4PoiqYl7nAB61ajAkcmxu0Zlg7+9N9tXbL44QOcY672uOQzRgDITmX6QdWnBqMjgmkIjSo1qo/VpUEzUXaVo5GHUn8ZOWI9xLrJWcOZeFl0ucyKZePMnIxeUU32EK/NY34eE6UfSTUkktkguisYIenZNfoPYehQF9ASL7t4qLiH5jca4FGgZW2kNKb3enjEmoKqbWDFMkc8/1lsk2eTd/GuhcTysVSxtvpDSlR0tjg8A2hVpR67t2rYm8iO/L1m8ImY48=',
            "__VIEWSTATEGENERATOR":'4F1A7E70',
            '__EVENTVALIDATION':'jVizPhFNJmo9F/GVlIrlMWMsjQe1UKHfYE4jlpTDfXZHWu9yAcpHUvT/1UsRpbgxYwZczJPd6gsvas8ilVSPkfwP1icGgOTXlWfzykkU86LyIEognwkhOfO1+suTK2e598vAjyLXRf555BXMtCO+oWoHcMjbVX2cHKtpBS1GyyqyyVB8IchAAtDEMD3G5bbzhvof6PX4Iwt5Sv1gXkHRKOR333OcYzmSGJvZgLsmo3qQ+5EOUIK5D71x/ZENmubZXvwbU0Ni6922E96RjCLh5cKgFSne5PcRDUeeDuEQhJLyD04K6N45Ow2RKyu7HN1n1YQGFfgAO3nMCsP51i7qEAohXK957z3m/H+FasHWF2u05laAWGVbPwT35utufotpPKi9qWAbCQSw9vW9HrvN01O97scG8HtWxIOnOdI6/nhke44FSpnvY1oPq+BuY2XKrb2404fKl5EPR4sjvNSYy1/8mn6IDH0eXvzoelNMwr/pKtKBESo3BthxTkkx5MR0J42qhgHURB9eUKlsGulAzjF27pyK4vjXxzlOlHG1pRiQm/wzB4om9dJmA27iaD7PJpQGgSwp7cTpbOuQgnwwrwUETxMOxuf3u1P9i+DzJqgKJbQ+pbKqtspwYuIpOR6r7dRh9nER2VXXD7fRfes1q2gQI29PtlbrRQViFM6ZlxqxqoAXVM8sk/RfSAL1LZ6qnlwGit2MvVYnAmBP9wtqcvqGaWjNdWLNsueL6DyUZ4qcLv42fVcOrsi8BPRnzJx0YiOYZ7gg7edHrJwpysSGDR1P/MZIYFEEUYh238e8I2EAeQZM70zHgQRsviD4o5r38VQf/cM9fjFii99E/mZ+6e0mIprhlM/g69MmkSahPQ5o/rhs8IJiM/GibjuZHSNfYiOspQYajMg0WIGeKWnywfaplt6/cqvcEbqt77tIx2Z0yGcXKYGehmhyHTWfaVkMuKbQP5Zw+F9X4Fv5ws76uCZkOxKV3wj3BW7+T2/nWwWMfGT1sD3LtQxiw0zhOXfY1bTB2XfxuL7+k5qE7TZWhKF4EMwLoaML9/yUA0dcXhoZBnSc',
            'ctl00$ContentPlaceHolder1$DdlprojectDistrict':'Agra',
            'ctl00$ContentPlaceHolder1$txtProject': '',
            'ctl00$ContentPlaceHolder1$btnSearch':'Search'}

r = requests.post(url, data=formfields, headers=headers)
data=r.text

soup = BeautifulSoup(data, "html.parser")

get_list = soup.find_all('option')   # gets a list of all <option> tags
for element in get_list:
    cities = element["value"]
    final.append(cities)

get_details = soup.find_all("table", attrs={"id":"ContentPlaceHolder1_GridView1"})

for details in get_details:
    text = details.find_all("td")
    print(text)
    
#6
Hi, I got it solved. This is the solution:

import requests
from bs4 import BeautifulSoup

url = "http://upreraportal.cloudapp.net/View_projects.aspx"
response = requests.get(url).text
soup = BeautifulSoup(response,"html.parser")

VIEWSTATE = soup.select("#__VIEWSTATE")[0]['value']
EVENTVALIDATION = soup.select("#__EVENTVALIDATION")[0]['value']

for title in soup.select("#ContentPlaceHolder1_DdlprojectDistrict [value]")[:-1]:
    search_item = title.text
    # print(search_item)

    headers= {'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
              'Content-Type':'application/x-www-form-urlencoded',
              'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}

    formfields = {'__VIEWSTATE':VIEWSTATE,           # value scraped from the GET above
                '__VIEWSTATEGENERATOR':'4F1A7E70',
                '__EVENTVALIDATION':EVENTVALIDATION, # value scraped from the GET above
                'ctl00$ContentPlaceHolder1$DdlprojectDistrict':search_item,
                'ctl00$ContentPlaceHolder1$txtProject':'',
                'ctl00$ContentPlaceHolder1$btnSearch':'Search'}

    # earlier I could scrape only Agra; posting the form once per district loops over all of them
    res = requests.post(url, data=formfields, headers=headers).text
    soup = BeautifulSoup(res, "html.parser")

    get_list = soup.find_all('option')   # gets a list of all <option> tags (leftover from the earlier attempt)
    for element in get_list:
        cities = element["value"]
        #final.append(cities)
        #print(final)

    get_details = soup.find_all("table", attrs={"id":"ContentPlaceHolder1_GridView1"})

    for details in get_details:
        text = details.find_all("tr")[1:]
        for tds in text:
            td = tds.find_all("td")[1]
            rera = td.find_all("span")
            rnumber = ""
            for num in rera:
                rnumber = num.text
                print(rnumber)
With this code I can scrape this website completely.
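As a follow-up: once the GridView table is in hand, its rows can be turned into structured records instead of bare prints. A self-contained sketch; the table markup below is a made-up stand-in for the real ContentPlaceHolder1_GridView1 output:

```python
from bs4 import BeautifulSoup

# made-up stand-in for the GridView table returned by the search postback
html = """
<table id="ContentPlaceHolder1_GridView1">
  <tr><th>S.No</th><th>RERA No</th><th>Project</th></tr>
  <tr><td>1</td><td><span>UPRERAPRJ0001</span></td><td>Project A</td></tr>
  <tr><td>2</td><td><span>UPRERAPRJ0002</span></td><td>Project B</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"id": "ContentPlaceHolder1_GridView1"})

records = []
for row in table.find_all("tr")[1:]:      # skip the header row
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    records.append({"serial": cells[0], "rera": cells[1], "project": cells[2]})

print(records)
```

From here the records list can be written out with the csv module, one row per project.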
