Redirect Url
Python Forum (https://python-forum.io)
Forum: Python Coding (https://python-forum.io/forum-7.html)
Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
Thread: Redirect Url (/thread-23701.html)
Redirect Url - calancathy - Jan-13-2020

Hi,
I am trying to load the file offered through [Save As] into a dataframe for the URL below.
'https://www.nseindia.com/api/historical/cm/equity?symbol=INFY&series=[%22EQ%22]&from=01-12-2019&to=31-12-2019&csv=true'
Please share Python code for this request. Thanks in advance.

RE: Redirect Url - Larz60+ - Jan-13-2020

Please show us what you've tried so far. If you need basics on web scraping, please see these two threads: web scraping part 1, web scraping part 2

RE: Redirect Url - calancathy - Jan-13-2020

Hi,
Beautiful Soup won't be the best fit for this, because as soon as you hit the URL the CSV file is downloaded to the default download location, and there is no page data to capture. My approach is to call the URL with the required parameters and capture the content of the downloaded CSV file in a dataframe. Thanks for sharing the two web crawling links. Any alternative approach would be welcome. Thanks

# Python script
import requests
import pandas as pd
from datetime import date, timedelta

pd.options.display.width = 1500
pd.options.display.max_rows = 1000
pd.options.display.max_columns = 50
pd.options.display.max_colwidth = 75

url = "https://www.nseindia.com/api/equity-stockIndices?index=SECURITIES%20IN%20F%26O"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36",
    "accept-encoding": "gzip, deflate",
    "accept-language": "en-US,en;q=0.9"}
cookie_dict = {
    'bm_sv': "93ECE40F315004D8086198FE4F2FAAFF~hVZZVk7LTIm/q1Wmp/HrMN12nAXZSE0FQZm7ForAw2DCWqAc5GZWhYxDtHCMq3S2X6HmPBhbmqHEUMA348kdDtLmFV4lizcCvxdC+xcKgBJ6B5AhDKN9UOlU2/kL0xbFNicj1pv6n9ezYv1PPSoOEL35C/FD6R7rFRz1qHWlRVc"}

def prevMonth_First_Last_Date():
    # Last day of the previous month: the day before the 1st of this month.
    last_day_of_prev_month = date.today().replace(day=1) - timedelta(days=1)
    # First day of the previous month: step back by that month's length.
    start_day_of_prev_month = date.today().replace(day=1) - timedelta(days=last_day_of_prev_month.day)
    last_day_of_prev_month = last_day_of_prev_month.strftime("%d-%m-%Y")
    # The post is cut off inside this call; the same format as above and
    # the return below are assumed from how the caller unpacks the result.
    start_day_of_prev_month = start_day_of_prev_month.strftime("%d-%m-%Y")
    return start_day_of_prev_month, last_day_of_prev_month

def downloadHistoricalData(ind):
    fdate, ldate = prevMonth_First_Last_Date()
    session = requests.session()
    for cookie in cookie_dict:
        session.cookies.set(cookie, cookie_dict[cookie])
    url = "https://www.nseindia.com/api/historical/cm/equity?symbol=" + ind + "&series=[%22EQ%22]&from=" + fdate + "&to=" + ldate + "&csv=true"
    # Go through the session so the cookies and headers are actually sent;
    # the form-style data= payload is dropped because this is a plain GET.
    r = session.get(url, headers=headers, allow_redirects=True)
    print(url)
    print(r)
    return r.content

print(downloadHistoricalData('Infy'))
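
One possible way to get from the URL to a dataframe without saving a file is sketched below. It is not a confirmed solution from this thread and has not been checked against the live site. It assumes that the endpoint returns plain CSV in the response body, and that visiting the home page first in the same session yields whatever cookies the site expects, instead of a hard-coded `bm_sv` value. The names `prev_month_range`, `build_url`, `fetch_dataframe`, `BASE_URL` and `HOME_URL` are made up for this illustration.

```python
import io
from datetime import date, timedelta
from urllib.parse import urlencode

# Endpoint taken from the first post; HOME_URL is an assumption about
# where the session cookies come from.
BASE_URL = "https://www.nseindia.com/api/historical/cm/equity"
HOME_URL = "https://www.nseindia.com"


def prev_month_range(today=None):
    """Return (first_day, last_day) of the previous month as dd-mm-YYYY strings."""
    today = today or date.today()
    last = today.replace(day=1) - timedelta(days=1)  # day before the 1st
    first = last.replace(day=1)
    fmt = "%d-%m-%Y"
    return first.strftime(fmt), last.strftime(fmt)


def build_url(symbol, from_date, to_date):
    """Build the query string; urlencode percent-encodes the ["EQ"] value."""
    query = urlencode({
        # The URL in the first post uses an upper-case symbol (INFY).
        "symbol": symbol.upper(),
        "series": '["EQ"]',
        "from": from_date,
        "to": to_date,
        "csv": "true",
    })
    return f"{BASE_URL}?{query}"


def fetch_dataframe(symbol, today=None):
    """Download the CSV into memory and parse it with pandas."""
    # Imported here so the two helpers above work without third-party packages.
    import pandas as pd
    import requests

    from_date, to_date = prev_month_range(today)
    with requests.Session() as session:
        session.headers.update({
            "user-agent": "Mozilla/5.0",
            "accept-language": "en-US,en;q=0.9",
        })
        # Assumption: an ordinary page visit sets the cookies the API wants.
        session.get(HOME_URL, timeout=30)
        response = session.get(build_url(symbol, from_date, to_date), timeout=30)
        response.raise_for_status()
    # Parse the downloaded bytes directly; nothing is written to disk.
    return pd.read_csv(io.BytesIO(response.content))
```

Calling `df = fetch_dataframe("INFY")` would then give the previous month's rows, provided the assumptions above hold. If the site rejects the request, the response status and body are the first things to inspect.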