Thread: Beautiful Soup find_all() (Python Forum, Web Scraping & Web Development, /thread-19098.html)
Beautiful Soup find_all() - kirito85 - Jun-13-2019

Hi, I am having problems web scraping Google Search with the code below. I want to get the top 10 result titles for any search made in Google Search, but all I get is the output shown under "My search results". I also tried changing the search tag to h3, but then the code does not even run. I'd appreciate any help, thanks.

My search results are:

Enter your search string : engineer
['engineer0 - Google Search']
['engineer1 - Google Search']
['engineer2 - Google Search']
['engineer3 - Google Search']
['engineer4 - Google Search']
['engineer5 - Google Search']
['engineer6 - Google Search']
['engineer7 - Google Search']
['engineer8 - Google Search']
['engineer9 - Google Search']

titles = soup.find_all("title")
for title in titles:
    print(title.contents)
    title_list.append(title)

RE: Beautiful Soup find_all() - snippsat - Jun-13-2019

If you look at the data you get back when using Requests and Beautiful Soup, you will see that it is a big mess. This is because Google Search uses JavaScript heavily. You will not find any h3 tags in the output, and the only title is the page title, which just contains the search word. I did this task 2-3 years ago for a question like this one, and back then I used PhantomJS. Today, Selenium with the Chrome or Firefox web driver is what is used. Look at Web-scraping part-2, under:

Quote: God dammit JavaScript, why do i not get all content

I tested my old code, rewritten to use Selenium, with a search for python forum. I am only posting the output; give it some effort and try to use these tools yourself.
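For anyone who wants a starting point, the approach described above (let a real browser render the page with Selenium, then hand the rendered HTML to Beautiful Soup) might look roughly like the sketch below. This is not snippsat's code, which is not shown in the thread. The function names build_search_url, extract_h3_titles and fetch_rendered_titles are made up for illustration. The sketch assumes selenium and beautifulsoup4 are installed, that a Chrome driver is available, and that result titles sit in h3 tags, which Google can change at any time. Google may also show a consent page or block automated queries.

```python
from urllib.parse import urlencode


def build_search_url(query):
    """Build a Google Search URL for the query (hypothetical helper)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def extract_h3_titles(page_source, limit=10):
    """Return the text of up to `limit` h3 tags found in rendered HTML.

    Assumption: result titles are rendered as h3 tags.
    """
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    soup = BeautifulSoup(page_source, "html.parser")
    return [h3.get_text(strip=True) for h3 in soup.find_all("h3", limit=limit)]


def fetch_rendered_titles(query, limit=10):
    """Load the search page in headless Chrome and parse the rendered HTML."""
    from selenium import webdriver  # third-party: selenium

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # run without opening a browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(build_search_url(query))
        # page_source is the DOM after JavaScript has run, unlike the
        # raw response body that Requests returns.
        return extract_h3_titles(driver.page_source, limit)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling fetch_rendered_titles("python forum") and printing each item would be the equivalent of the original loop. The third-party imports are kept inside the functions so the URL helper can be tried on its own without Selenium installed.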
RE: Beautiful Soup find_all() - kirito85 - Jun-14-2019

Hi snippsat, thanks for the reply. I was using some chromedriver code previously but did not know what it was for. I will look at your old post first. Thanks.