Python Forum

Full Version: parsing comment tag
I am trying to parse a webpage with Python 3.6.2 and BeautifulSoup4.
What I need is to retrieve the text of news headlines that sit inside comment tags (<!-- .. -->), such as on https://ca.finance.yahoo.com/quote/GOOG/news?p=GOOG
Is there any sample code? Thanks.
What have you tried?

Here's a hint: <!-- react-text: 226 -->
So the news items are generated by JavaScript using React.
Can BeautifulSoup alone parse JavaScript?
The answer is no.
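
For what it's worth, the headline text is not inside the comment itself; React puts a pair of comment markers around an ordinary text node. Once you have the rendered HTML (for example from a browser-driven tool), something like the rough sketch below could pull out the text between the markers. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages; with BeautifulSoup the comments show up as bs4.Comment strings that you can filter for. The sample markup, the class name ReactTextParser and the function extract_react_text are made up for illustration, and the closing marker format is an assumption based on the opening marker quoted above.

```python
from html.parser import HTMLParser


class ReactTextParser(HTMLParser):
    """Collect text that sits between React's comment markers (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while between an opening and a closing marker
        self.texts = []

    def handle_comment(self, data):
        marker = data.strip()
        if marker.startswith("react-text:"):
            # Opening marker, e.g. <!-- react-text: 226 -->
            self.inside = True
        elif marker == "/react-text":
            # Assumed closing marker: <!-- /react-text -->
            self.inside = False

    def handle_data(self, data):
        # Keep only non-empty text found between a marker pair
        if self.inside and data.strip():
            self.texts.append(data.strip())


def extract_react_text(html):
    """Return the list of text fragments wrapped in react-text markers."""
    parser = ReactTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts


# Made-up markup, modelled on the marker quoted above (not real page content)
sample = (
    "<h3><!-- react-text: 226 -->Example headline one<!-- /react-text --></h3>"
    "<h3><!-- react-text: 230 -->Example headline two<!-- /react-text --></h3>"
)
print(extract_react_text(sample))
```

Keep in mind this only helps if the markers and the text are actually present in the HTML you downloaded. If the page builds them in the browser, a plain HTTP request will not contain them.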

Search this forum for Selenium,
there are many examples using it alone or together with BeautifulSoup.
You should take a look at: http://meumobi.github.io/stocks%20apis/2...e-api.html
Especially the legal parts. It looks like it's OK to extract the data so long as it's not for commercial use.
But it does give some basic information on extracting what you are interested in.