Python Forum

parsing comment tag - ian - Jan-22-2018

I am trying to parse a webpage with Python 3.6.2 and BeautifulSoup4.
I need to retrieve the text of the news headlines that sit in comment tags (<!-- ... -->), such as on https://ca.finance.yahoo.com/quote/GOOG/news?p=GOOG
Is there any sample code? Thanks.


RE: parsing comment tag - snippsat - Jan-22-2018

What have you tried?

Here's a hint: <!-- react-text: 226 -->
So the news is generated by JavaScript using React.
Can BeautifulSoup alone parse JavaScript?
The answer is no.

Search this forum for Selenium,
there are many examples using it alone or together with BeautifulSoup.
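
As for the comment markers themselves: in React-rendered markup of this kind, the headline text normally sits between an opening <!-- react-text: N --> comment and a closing <!-- /react-text --> comment, not inside the comment. Below is a minimal sketch of picking up that text once you already have the rendered HTML (for example, the page source from a browser driven by Selenium). It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. SAMPLE_HTML, ReactTextParser and extract_react_text are made-up names for illustration, and the sample markup is an assumption, not copied from the Yahoo page.

```python
from html.parser import HTMLParser

# Hypothetical snippet imitating React-style markup; not taken from the real page.
SAMPLE_HTML = (
    '<h3><a href="/news/1">'
    '<!-- react-text: 226 -->Example headline<!-- /react-text -->'
    '</a></h3>'
)


class ReactTextParser(HTMLParser):
    """Collect text nodes that sit between react-text comment markers."""

    def __init__(self):
        super().__init__()
        self.inside = False   # True while between an opening and a closing marker
        self.comments = []    # every comment body seen, whitespace stripped
        self.texts = []       # non-empty text found between markers

    def handle_comment(self, data):
        marker = data.strip()
        self.comments.append(marker)
        if marker.startswith("react-text"):
            self.inside = True
        elif marker.startswith("/react-text"):
            self.inside = False

    def handle_data(self, data):
        if self.inside and data.strip():
            self.texts.append(data.strip())


def extract_react_text(html):
    """Return (comment bodies, text between react-text markers) for an HTML string."""
    parser = ReactTextParser()
    parser.feed(html)
    parser.close()
    return parser.comments, parser.texts


comments, texts = extract_react_text(SAMPLE_HTML)
print(comments)
print(texts)
```

With BeautifulSoup the same idea should work by looking at the string nodes next to each comment, but either way the HTML has to contain the headlines first, which is why the JavaScript has to run.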


RE: parsing comment tag - Larz60+ - Jan-22-2018

You should take a look at: http://meumobi.github.io/stocks%20apis/2016/03/13/get-realtime-stock-quotes-yahoo-finance-api.html
Especially the legal parts. It looks like it's OK to extract the data as long as it's not for commercial use, but not otherwise.
But it does give some basic information on extracting what you are interested in.