Python Forum
Allintitle Search with Python
#1
I'm trying to use Python to perform an allintitle Google search and then return the number of results.

I'm able to use requests to perform the google search, and then I tried using bs4 to find the div/class, but I'm not able to get it to work.

I'm fairly new to both Python and web scraping. This seems like something that should be possible, but I'm out of ideas for what to try next.

Any help or ideas to try would be greatly appreciated.
#2
I don't really know what you mean by "allintitle google search", but doesn't Google have a search API that you can use instead of scraping? APIs are meant for programmatic use, of course.
#3
(Jan-10-2021, 08:14 AM)ndc85430 Wrote: I don't really know what you mean by "allintitle google search"
allintitle search

(Jan-10-2021, 06:32 AM)lradue Wrote: I'm able to use requests to perform the google search, and then I tried using bs4 to find the div/class, but I'm not able to get it to work.
This will not work with standard Requests and BeautifulSoup scraping; you have to use Selenium (to deal with the content returned by JavaScript).
Look at this link

As mentioned, there is an API: the Custom Search JSON API.
With it you can use only Requests and get JSON back.
Quick example; you get the API key and cx by following the instructions.
import requests

api_key = 'your key'
cx = 'your cx'
search = 'allintitle:car bmw'

# Pass the query via params so Requests URL-encodes the spaces and colon
response = requests.get(
    'https://www.googleapis.com/customsearch/v1',
    params={'key': api_key, 'cx': cx, 'q': search},
)
Usage test:
>>> response
<Response [200]>

>>> data = response.json()
>>> print(data['items'][0]['title'])
BMW X6 M: A powerful sports car that fails to deliver - MarketWatch
>>> print(data['items'][0]['htmlTitle'])
<b>BMW</b> X6 M: A powerful sports <b>car</b> that fails to deliver - MarketWatch
>>> print(data['items'][0]['link'])
https://www.google.com/newsstand/s/posts/CAIiEEPZe5Su2nbj_ZPYU2sYsHUqGAgEKg8IACoHCAowjujJATDXzBUwiJS0AQ/BMW+X6+M%3A+A+powerful+sports+car+that+fails+to+deliver
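
Since the original question was about the number of results rather than the individual hits, here is a rough sketch of how the count could be pulled out of the same response. As far as I know, the Custom Search JSON API reports an estimated total as a string under searchInformation -> totalResults. The helper names total_results and count_allintitle are made up for illustration, and the API key and cx are placeholders as above.

```python
def total_results(data):
    """Return the estimated result count from a parsed API response."""
    # totalResults arrives as a string (e.g. '1230'), so convert it.
    # A missing field is treated as zero results (an assumption of this sketch).
    info = data.get('searchInformation', {})
    return int(info.get('totalResults', 0))


def count_allintitle(words, api_key, cx):
    """Run an allintitle: query and return the estimated result count."""
    import requests  # imported here so total_results() works without it

    params = {'key': api_key, 'cx': cx, 'q': f'allintitle:{words}'}
    response = requests.get(
        'https://www.googleapis.com/customsearch/v1',
        params=params,
        timeout=10,
    )
    response.raise_for_status()  # surface a bad key or exhausted quota early
    return total_results(response.json())
```

Usage would look like count_allintitle('car bmw', api_key, cx). Keep in mind that the figure is an estimate, and a custom search engine may not report the same count that google.com shows for the same query.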
#4
I'm not sure, I will look into the Google API stuff to see if there's a solution there for this, thanks.
#5
Ok, I'll take a look at the selenium solution, thanks.