Python Forum
Scrap Yahoo Finance using BS4 - Printable Version


Scrap Yahoo Finance using BS4 - mr_byte31 - Aug-23-2018

Hi All,

I am trying to scrape some information from Yahoo Finance.
I can collect most of what I want, but I have trouble scraping the price!
I attached a picture showing the relevant part of the page source for:
https://finance.yahoo.com/quote/INTC/key-statistics?p=INTC


[attachment=461]
LastPrice = soup.find_all('span',attrs={'class':'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)'})
print(LastPrice)
The output is always empty:
Output:
[]
Is there any reason why it is empty?
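
One possible reason, offered as a guess since the thread never settles it: when Beautiful Soup is given a string containing several class names, it compares that string against the whole class attribute, so an extra class or a different order in the served markup gives an empty list. It is also possible that the HTML a script receives differs from what the browser shows. The sketch below uses made-up markup (the Extra class and the 47.10 value are invented for illustration) to show the first effect, plus a looser search on a single class name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the span from the question plus one extra class.
html = (
    '<div><span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib) Extra" '
    'data-reactid="35">47.10</span></div>'
)
soup = BeautifulSoup(html, "html.parser")

# A multi-class string must equal the full class attribute,
# so the extra class makes this search come back empty.
exact = soup.find_all(
    "span", attrs={"class": "Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)"}
)

# A single class name matches any span that carries that class.
loose = soup.find_all("span", class_="Fz(36px)")
prices = [tag.get_text(strip=True) for tag in loose]

print(exact)   # []
print(prices)  # ['47.10']
```

If the looser search is also empty on the real page, the element is probably not in the downloaded HTML at all, which would point to the second explanation.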


RE: Scrap Yahoo Finance using BS4 - buran - Aug-23-2018

Don't parse the website with BeautifulSoup. Take a closer look at the requests that are sent by the browser and you will see that you can get all the information directly as JSON.


RE: Scrap Yahoo Finance using BS4 - mr_byte31 - Aug-23-2018

(Aug-23-2018, 10:48 AM)buran Wrote: Don't parse the website with BeautifulSoup. Take a closer look at the requests that are sent by the browser and you will see that you can get all the information directly as JSON.

The JSON doesn't have all the information that is on the website! A lot is missing. For example, look for the 50-day and 200-day moving averages: they aren't there!
That's why I would rather scrape the website itself.


RE: Scrap Yahoo Finance using BS4 - Larz60+ - Aug-23-2018

Quote: Don't parse the website with BeautifulSoup. Take a closer look at the requests that are sent by the browser and you will see that you can get all the information directly as JSON.
How is this done? I don't know about this.


RE: Scrap Yahoo Finance using BS4 - buran - Aug-23-2018

(Aug-23-2018, 05:47 PM)Larz60+ Wrote: How is this done? I don't know about this.

[attachment=463]

As you can see, several JSON files are used to transfer information, e.g.

https://query1.finance.yahoo.com/v8/finance/chart/INTC?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d

You can look at them in more detail if you wish, to see what information is available. However, the OP was right: I was not able to find the 50-day and 200-day MA. Maybe I didn't search thoroughly, or they are calculated.

I also have a script to download option chains using these JSON files:
https://github.com/boyank/yoc
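
A minimal sketch of working with that chart URL, using only the standard library. The function names chart_url and last_close are made up for illustration, and the payload layout (chart, result, indicators, quote, close) is an assumption about what the endpoint returns, not something confirmed in this thread; the sample values are invented.

```python
from urllib.parse import urlencode

BASE = "https://query1.finance.yahoo.com/v8/finance/chart/"


def chart_url(symbol, interval="2m", range_="1d"):
    """Build a chart URL with the same query parameters as the one above."""
    params = {
        "region": "US",
        "lang": "en-US",
        "includePrePost": "false",
        "interval": interval,
        "range": range_,
    }
    return BASE + symbol + "?" + urlencode(params)


def last_close(payload):
    """Return the most recent non-null close, or None if there is none.

    Assumes the layout payload["chart"]["result"][0]["indicators"]["quote"][0]["close"].
    """
    result = payload["chart"]["result"][0]
    closes = result["indicators"]["quote"][0]["close"]
    valid = [c for c in closes if c is not None]
    return valid[-1] if valid else None


# Invented sample in the assumed layout; no network access needed.
sample = {
    "chart": {
        "result": [
            {
                "timestamp": [1, 2, 3],
                "indicators": {"quote": [{"close": [46.9, None, 47.1]}]},
            }
        ],
        "error": None,
    }
}

print(chart_url("INTC"))
print(last_close(sample))
```

If the real response is laid out differently, only last_close needs to change.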


RE: Scrap Yahoo Finance using BS4 - Larz60+ - Aug-23-2018

OK, now I understand: this applies to the specific URL.
I was thinking (hoping) there might be a hidden JSON menu or button in my browser.


RE: Scrap Yahoo Finance using BS4 - mr_byte31 - Aug-24-2018

(Aug-23-2018, 06:16 PM)buran Wrote:
(Aug-23-2018, 05:47 PM)Larz60+ Wrote: How is this done? I don't know about this.

As you can see, several JSON files are used to transfer information, e.g.

https://query1.finance.yahoo.com/v8/finance/chart/INTC?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d

You can look at them in more detail if you wish, to see what information is available. However, the OP was right: I was not able to find the 50-day and 200-day MA. Maybe I didn't search thoroughly, or they are calculated.

I also have a script to download option chains using these JSON files:
https://github.com/boyank/yoc

The 50-day and 200-day MA are collected from other websites.
Yahoo states this in footnote 3: "Data derived from multiple sources or calculated by Yahoo Finance."
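
If the averages are indeed calculated, they could in principle be recomputed from daily closing prices. The sketch below assumes a plain simple moving average over the last N closes; whether Yahoo computes its figures this way is not confirmed anywhere in this thread, and simple_moving_average is a made-up helper name.

```python
def simple_moving_average(closes, window):
    """Average of the last `window` closes, or None if there are too few."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(closes) < window:
        return None
    return sum(closes[-window:]) / window


# Invented closing prices, oldest first.
closes = [1, 2, 3, 4, 5]
print(simple_moving_average(closes, 3))   # 4.0
print(simple_moving_average(closes, 50))  # None (fewer than 50 values)
```

For the 50-day and 200-day figures, the same function would be called with window=50 and window=200 on a long enough list of daily closes.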

I still need help with my Beautiful Soup scraping!
Does anyone know how to get the price from the link I shared above?


RE: Scrap Yahoo Finance using BS4 - Larz60+ - Aug-24-2018

see: https://python-forum.io/Thread-Web-Scraping-part-1
and
https://python-forum.io/Thread-Web-Scraping-part-2

also take a look at: https://gist.github.com/scrapehero/516fc801a210433602fe9fd41a69b496
which does it all using lxml

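But it is best to use buran's suggestion.
You can fetch the JSON data with requests and save it to a file; loading it is then as simple as:
import json

# filename: path to the file that holds the saved JSON response
with open(filename) as fp:
    mydata = json.load(fp)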
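
A fuller sketch of the save-then-load round trip. It swaps in the standard library's urllib.request for requests so that nothing needs installing; fetch_to_file and load_json are made-up helper names, and the User-Agent header is only a guess that the server may want a browser-like client. The demonstration at the end runs offline on an invented sample instead of calling the network.

```python
import json
import tempfile
from pathlib import Path
from urllib.request import Request, urlopen


def fetch_to_file(url, filename):
    """Download url and write the raw response body to filename."""
    # Assumption: a browser-like User-Agent may be needed; not verified here.
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=10) as resp:
        Path(filename).write_bytes(resp.read())


def load_json(filename):
    """Read a saved JSON file back into Python objects."""
    with open(filename, encoding="utf-8") as fp:
        return json.load(fp)


# Offline demonstration with an invented sample file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.json"
    path.write_text('{"symbol": "INTC", "close": [46.9, 47.1]}', encoding="utf-8")
    mydata = load_json(path)

print(mydata["symbol"], mydata["close"][-1])
```

With requests installed, the download step can be replaced by requests.get(url).json(), which returns the parsed data without going through a file.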