Python Forum
Extracting Data from Calendar - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: Extracting Data from Calendar (/thread-24467.html)



Extracting Data from Calendar - AgileAVS - Feb-15-2020

I have a problem scraping data from a website that does not change its URL when you pick certain dates from the calendar widget on the page. I tried the Network tab in Developer Tools, but that did not help: the new request it showed had no values in it.
With reference to this thread:
https://stackoverflow.com/questions/54560084/python-web-scraping-interacting-with-calendar/54616315#54616315
Could someone please explain to me how to apply this to another website:
https://www.sharesansar.com/today-share-price
This is my code, but it only prints the date and not the data:
from bs4 import BeautifulSoup
import requests

url = 'https://www.sharesansar.com/today-share-price'
dates = ['2020-02-12', '2020-02-13']

for date in dates:
    req = requests.post(url, {'date' : date})      #Sends data to the server
    soup = BeautifulSoup(req.content,'lxml') #Response in bytes

    print(f'\n{date}\n')

    for article in soup.find_all('table', class_='table table-bordered table-striped table-hover dataTable compact'):
        data = article.text
        print(data)
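For reference, once a response actually contains the table, the rows can be pulled out cell by cell with `.find_all(['th', 'td'])` per row. The markup below is made up for illustration; only the table class name comes from the site:

```python
from bs4 import BeautifulSoup

# Hypothetical sample of the table markup -- the class name matches the
# site, but the symbols and prices here are invented for the example.
sample_html = """
<table class="table table-bordered table-striped table-hover dataTable compact">
  <tr><th>Symbol</th><th>Close</th></tr>
  <tr><td>ADBL</td><td>401</td></tr>
  <tr><td>NICA</td><td>880</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')
table = soup.find('table')
# One list of cell texts per row, headers included
rows = [[cell.get_text(strip=True) for cell in tr.find_all(['th', 'td'])]
        for tr in table.find_all('tr')]
print(rows)  # [['Symbol', 'Close'], ['ADBL', '401'], ['NICA', '880']]
```

Grabbing `article.text` in one go flattens the whole table into a single string; keeping the cells as lists makes the data usable afterwards.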



RE: Extracting Data from Calendar - snippsat - Feb-15-2020

Look at Web-scraping part-2, God dammit JavaScript, why do i not get all content.
So Selenium would work.

Here's one way it can be done by looking at the site and trying to understand what's going on.
The same cookies and token work in an Ajax call for different dates, so it can be done like this (the harder way Doh, if not using Selenium).
This parses the summary data at the bottom of the page.
import requests
from bs4 import BeautifulSoup

def stock_data(date):
    cookies = {
        '__cfduid': 'd9beb9712d05492d153761becc05fa2f01581793084',
        'XSRF-TOKEN': 'eyJpdiI6Ilc2VTVTRmd2OWJQQlwvakxmRHMzMGNRPT0iLCJ2YWx1ZSI6IkRLMHQ0Y1MzVUY5bVl4d2UwWVNrSlBGS3hNSmJaXC85MjZTNDNCUFlEM3V3ajV6RDFzS1dxYWY4SXFINFJFUVY5IiwibWFjIjoiYmFmZmVhMzRlNDcxZWU0N2FmN2EzOTQ1MmU1NmEwYTI1MzViM2UyOTA3ZTZhM2Y0YWVjMzYwMjQ2ZDkwYWI0ZSJ9',
        'sharesansar_session': 'eyJpdiI6IkxINjZUakgzdTR3eWplbWVXSThhUEE9PSIsInZhbHVlIjoiM3FqdFIxMnpBbmt5UFFVZXZlN2NNbWpYbkZ1WVljcG5YSGRFNVNpQ1RIV0laRW5DcVZONG1VSXJSTEdLS1NVWCIsIm1hYyI6IjJlMjEwN2RhNWExYzczYjNiMzhmMTkzMmM0ODc4YmFkZWFmNmFiNTMxZjYzNTAxYjM4NGJlMGFlMzM3ZDM1YzcifQ%3D%3D',
    }

    headers = {
        'Connection': 'keep-alive',
        'Accept': '*/*',
        'X-Requested-With': 'XMLHttpRequest',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-Mode': 'cors',
        'Referer': 'https://www.sharesansar.com/today-share-price',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'nb-NO,nb;q=0.9,no;q=0.8,nn;q=0.7,en-US;q=0.6,en;q=0.5',
    }

    params = (
        ('_token', 'nhM8Q4PLiX1hZu1K8W8yCqfP0iF9IsEgnO90e8dT'),
        ('sector', 'all_sec'),
        ('date', date),
    )

    response = requests.get('https://www.sharesansar.com/ajaxtodayshareprice', headers=headers, params=params, cookies=cookies)
    return response

def parse_data(response, date):
    soup = BeautifulSoup(response.content, 'lxml')
    # Both summary values sit in <h4 class="text-right"> tags
    summary = soup.find_all('h4', class_="text-right")
    print(f'-----| {date} |-----')
    print(summary[0].text)  # Total number of companies
    print(summary[1].text)  # Total traded shares

if __name__ == '__main__':
    for day in range(1,4):
        date = f'2020-02-1{day}'
        response = stock_data(date)
        parse_data(response, date)
Output:
-----| 2020-02-11 |-----
Total number of Compaines: 175
Total Traded Shares: 3,996,877
-----| 2020-02-12 |-----
Total number of Compaines: 173
Total Traded Shares: 3,986,769
-----| 2020-02-13 |-----
Total number of Compaines: 171
Total Traded Shares: 3,720,085
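Note the hardcoded cookies and `_token` will expire; a fresh token can usually be read from the page itself before making the Ajax call. Where exactly the token sits in the markup is an assumption here (Laravel sites commonly expose it in a `csrf-token` meta tag), so inspect the real page to confirm:

```python
from bs4 import BeautifulSoup

# Assumed page markup: a <meta name="csrf-token"> tag in <head>.
# The token value below is the one from the code above, reused as sample data.
sample_page = """
<html><head>
<meta name="csrf-token" content="nhM8Q4PLiX1hZu1K8W8yCqfP0iF9IsEgnO90e8dT">
</head><body></body></html>
"""

def extract_token(html):
    """Pull the CSRF token out of the page HTML, or None if absent."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('meta', attrs={'name': 'csrf-token'})
    return tag['content'] if tag else None

token = extract_token(sample_page)
print(token)
```

In practice one would do `s = requests.Session()`, `r = s.get('https://www.sharesansar.com/today-share-price')`, then `extract_token(r.text)` and reuse `s` for the Ajax call so the session cookies come along automatically.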



RE: Extracting Data from Calendar - AgileAVS - Feb-16-2020

(Feb-15-2020, 08:35 PM)snippsat Wrote: [full post quoted above]

Thank you so, so much; I had been trying to do this for months. I tried to learn Selenium but failed, so thanks again.