Oct-04-2017, 10:25 AM
I have a CSV file which has list of url that needed to be scraped. Website i am scraping is http://www.rera-rajasthan.in/ProjectSearch which is real estate website which has property name and a link which has property details. I was able to scrape those links into csv, now i need to loop through all the links which i extracted for further web scraping.
This website requires post method to search project. I applied same method on the extracted links too.
But when i run this code it prints nothing :
i am attaching the CSV. Can someone please guide?
This website requires post method to search project. I applied same method on the extracted links too.
But when i run this code it prints nothing :
import requests from bs4 import BeautifulSoup from urllib.request import urlopen import csv import json #links = [] links = [] reranumber = [] table_attr = {"class":"table table-bordered"} with open("RajLinks.csv", newline= '') as f: reader = csv.reader(f) for row in reader: reranumber = row[0] link = row[1] links.append(link) def getData(url): url = "http://www.rera-rajasthan.in/Home/GetProjectsList" user_agent = {"User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0"} payload = {'certificateNo': '', 'PageSize': '50', 'District': '', 'v': '', 'projectName': '', 'promoterName': '', 'page': '1', 'tehsil': ''} r = requests.post(url, headers=user_agent, params=payload) data = r.text return data #getdata for sublist in links: htmldata = getData(link) soup = BeautifulSoup(htmldata, "html.parser") tables = soup.find_all("table", table_attr) for table in tables: txt = table.text if txt.find("Contact Address"): trs = table.find_all("tr") for data in trs: name = data[1].text print(name)it should print first tr in contact address that it founds. i am extracting the links column
i am attaching the CSV. Can someone please guide?