Nov-09-2017, 05:13 AM
(This post was last modified: Nov-09-2017, 05:13 AM by Prince_Bhatia.)
This is another post of mine about the Python POST method. I am trying to scrape this website: https://maharerait.mahaonline.gov.in/SearchList/Search
I edited this post after doing some research and got this far, but I am still unable to loop through the list.
On the website you first have to select "Registered Agents", and then you can submit a 4-digit number. The site requires you to send a 4-digit number to get the data; my numbers start from A500.
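Because the search form is an ASP.NET page, the POST also has to include the page's __RequestVerificationToken hidden field. A stdlib-only sketch of pulling such a hidden value out of the HTML (the sample markup and token value here are made up; the code below does the same job with BeautifulSoup):

```python
from html.parser import HTMLParser

# Find the value of a hidden <input> by its name attribute
class TokenFinder(HTMLParser):
    def __init__(self, field_name):
        super().__init__()
        self.field_name = field_name
        self.value = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "input" and a.get("name") == self.field_name:
            self.value = a.get("value")

# Hypothetical snippet of the search page's form
html = '<form><input type="hidden" name="__RequestVerificationToken" value="abc123"></form>'
finder = TokenFinder("__RequestVerificationToken")
finder.feed(html)
print(finder.value)  # abc123
```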
I now have a CSV file with the numbers, starting from the first cell, in the range A500 to A599.
After sending the 4-digit numbers, some return data and some don't. I have written the code below, but it is not printing anything to the CSV file. Can anyone tell me where I am making a mistake?
Attached is my CSV file.
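The reason nothing reaches the output CSV is visible in a minimal sketch (with made-up sample values): appending the split list into `file` nests it, so `range(len(file))` is `range(1)`, the loop body runs once, and `search_item` becomes the index `0`, never a certificate number:

```python
file = []
datas = "1234 5678 9012".split()  # stands in for file_data.split()
file.append(datas)                # file is now [['1234', '5678', '9012']] -- one element
print(len(file))                  # 1, so range(len(file)) iterates only once

# Iterate over the values themselves instead:
for search_item in datas:
    print(search_item)            # 1234, 5678, 9012
```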
import requests
from bs4 import BeautifulSoup
import csv

final_data = []
url = "https://maharerait.mahaonline.gov.in/SearchList/Search"

# Fetch the search page once to pick up the anti-forgery token
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
RequestVerificationToken = soup.find(attrs={"name": "__RequestVerificationToken"})["value"]

# Read the 4-digit certificate numbers from the CSV file
with open("Agents.csv", "r") as f:
    numbers = f.read().split()
print(numbers)

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
    "content-type": "application/x-www-form-urlencoded",
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
}

# Loop over the numbers themselves. The old version appended the whole
# list into `file`, so range(len(file)) was range(1): the loop ran once
# and search_item was the index 0, never a certificate number.
for search_item in numbers:
    formfields = {
        "__RequestVerificationToken": RequestVerificationToken,
        "Type": "Agent",
        "ID": "0",
        "pageTraverse": "1",
        "Project": "", "hdnProject": "",
        "Promoter": "", "hdnPromoter": "",
        "CertiNo": search_item, "hdnCertiNo": search_item,
        "Division": "", "hdnDivision": "",
        "District": "", "hdnDistrict": "",
        "Taluka": "", "hdnDTaluka": "",
        "Village": "", "hdnVillage": "",
        "CompletionDate_From": "", "hdnfromdate": "",
        "CompletionDate_To": "", "hdntodate": "",
        "PType": "", "hdnPType": "",
        "btnSearch": "Search",
    }

    r = requests.post(url, data=formfields, headers=headers)
    soup = BeautifulSoup(r.text, "html.parser")

    # Numbers with no data simply produce no grid-wrap table
    for details in soup.find_all(class_="grid-wrap"):
        for row in details.find_all("tr")[1:]:
            cells = row.find_all("td")
            rnumber = cells[1].get_text(strip=True)
            prj = cells[2].get_text(strip=True)
            final_data.append([rnumber, prj])

with open("maharera_agents5.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(final_data)

I have removed the functions and kept it simple, because the functions were confusing me.
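When writing the collected rows out, `csv.writer.writerows` handles the whole list in one call, and opening the file with `newline=""` avoids blank lines between rows on Windows. A small self-contained sketch (the sample rows are hypothetical), using an in-memory buffer in place of a file:

```python
import csv
import io

rows = [["1234", "Agent A"], ["5678", "Agent B"]]

buf = io.StringIO()                # stands in for open("out.csv", "w", newline="")
writer = csv.writer(buf)
writer.writerows(rows)             # one call instead of a manual index loop
print(buf.getvalue())
```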
Attached Files
Agents.csv (Size: 600 bytes / Downloads: 645)