parsing comment tag - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: parsing comment tag (/thread-7709.html)
parsing comment tag - ian - Jan-22-2018

I am trying to parse a webpage with Python 3.6.2 and BeautifulSoup4. What I need is to retrieve the text of news headlines that sit inside comment tags (<!-- .. -->), such as on https://ca.finance.yahoo.com/quote/GOOG/news?p=GOOG

Is there any sample code? Thanks.

RE: parsing comment tag - snippsat - Jan-22-2018

What have you tried?

Here is a hint: <!-- react-text: 226 -->

So the news items are generated by JavaScript using React. Can BeautifulSoup alone parse JavaScript? The answer is no.

Search this forum for Selenium; there are many examples using it alone or together with BeautifulSoup.

RE: parsing comment tag - Larz60+ - Jan-22-2018

You should take a look at: http://meumobi.github.io/stocks%20apis/2016/03/13/get-realtime-stock-quotes-yahoo-finance-api.html

Especially the legal parts. It looks like it's OK to extract the data as long as it's not for commercial use, but not otherwise. It also gives some basic information on extracting what you are interested in.
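
As a rough illustration of the hint above: React wraps rendered text in marker comments of the form `<!-- react-text: N -->` ... `<!-- /react-text -->`, so the headline is the text node between the two comments, not the comment itself. The sketch below assumes you already have the rendered HTML as a string (for example from Selenium's page source) and uses only the standard library's `html.parser` rather than BeautifulSoup, so it runs without extra installs. The sample markup, the class name `ReactTextParser`, and the helper `extract` are made up for illustration and are not taken from the actual Yahoo page.

```python
from html.parser import HTMLParser


class ReactTextParser(HTMLParser):
    """Collect HTML comments and the text wrapped by react-text markers."""

    def __init__(self):
        super().__init__()
        self.comments = []    # every comment body, whitespace-stripped
        self.texts = []       # text found between react-text markers
        self._inside = False  # True while between an opening and closing marker

    def handle_comment(self, data):
        data = data.strip()
        self.comments.append(data)
        if data.startswith("react-text:"):
            self._inside = True
        elif data == "/react-text":
            self._inside = False

    def handle_data(self, data):
        # Keep only non-empty text that sits between the two markers.
        if self._inside and data.strip():
            self.texts.append(data.strip())


# Hypothetical markup imitating the react-text pattern; not real page content.
SAMPLE_HTML = """
<h3><a href="/news/1"><!-- react-text: 226 -->Example headline one<!-- /react-text --></a></h3>
<h3><a href="/news/2"><!-- react-text: 230 -->Example headline two<!-- /react-text --></a></h3>
"""


def extract(html):
    """Return (comments, texts) parsed from an HTML string."""
    parser = ReactTextParser()
    parser.feed(html)
    parser.close()
    return parser.comments, parser.texts


comments, texts = extract(SAMPLE_HTML)
print(comments)
print(texts)
```

With BeautifulSoup4 the comment nodes themselves can be selected in a similar spirit by filtering strings with `isinstance(text, Comment)` after `from bs4 import Comment`. Either way, this only helps once the JavaScript has run; the raw HTML fetched without a browser may not contain the headlines at all.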