Python Forum

Full Version: Searching yahoo with selenium
The last time I automated a search (on Google), I used only bs4 and webbrowser. Now I'm trying to use selenium first. This is what I want to achieve:
1. open a browser
2. open yahoo search page
3. search for term 'seleniumhq'
4. open first 5 results in 5 different tabs

At a later stage, once I improve the algorithm, I plan to bundle it into functions and take the search term with input(), but first I want my basic code to work. Help is appreciated.

code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import bs4

browser = webdriver.Firefox()
browser.get('http://www.yahoo.com')
assert 'Yahoo' in browser.title

elem = browser.find_element_by_name('p') # find the search box
res = elem.send_keys('seleniumhq' + Keys.RETURN)

soup = bs4.BeautifulSoup(res.text, "html.parser")
linkElems = soup.select('a')
for link in linkElems:
	print(f'link: {link}')
numOpen = min(5, len(linkElems))
for i in range(numOpen):
	webbrowser.open(linkElems[i].get('href'))

browser.quit()
Error:
Traceback (most recent call last):
  File "C:\Python36\kodovi\sel4.py", line 12, in <module>
    soup = bs4.BeautifulSoup(res.text, "html.parser")
AttributeError: 'NoneType' object has no attribute 'text'
According to this, res is a NoneType object. I don't know why, and I don't know how to fix it so that the code runs to the end (although I'm not sure the rest is correct either).
find_element_by_name matches on an element's name attribute, not on its tag name, if that's what you're looking for.

browser.find_element_by_name('darla-assets-js-top')

as in:
Output:
<div id="darla-assets-js-top">
To find an element by tag name, use find_element_by_tag_name.
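To make the distinction concrete, here is a small sketch that needs no browser. The markup is made up for illustration, and the Collector class is a hypothetical helper built on the standard library's html.parser, not part of Selenium. It only mimics what the three lookups inspect: find_element_by_name looks at the name attribute, find_element_by_id at the id attribute, and find_element_by_tag_name at the tag itself. Note that a div carrying only an id, like the one quoted above, is matched by the id lookup rather than the name lookup.

```python
from html.parser import HTMLParser

# Hypothetical markup: a div with an id, and a search box with a name.
PAGE = (
    '<div id="darla-assets-js-top"></div>'
    '<form><input type="text" name="p"></form>'
)


class Collector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs)))


collector = Collector()
collector.feed(PAGE)

# What a lookup by name, by id, and by tag name would each match.
by_name = [tag for tag, attrs in collector.elements if attrs.get("name") == "p"]
by_id = [
    tag
    for tag, attrs in collector.elements
    if attrs.get("id") == "darla-assets-js-top"
]
by_tag = [tag for tag, attrs in collector.elements if tag == "input"]

print(by_name, by_id, by_tag)  # ['input'] ['div'] ['input']
```

So find_element_by_name('p') in the original code is a reasonable way to reach the search box, as long as the input really carries name="p".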
In line 9 I'm looking for a search box.
And later I want to use BeautifulSoup to parse the result page and open 5 results.
I think that line 9 is fine as it manages to find the search box and do the search in the next line.
But...from line 12 trouble begins...
(Oct-10-2018, 11:54 PM)Truman Wrote:
res = elem.send_keys('seleniumhq' + Keys.RETURN)

soup = bs4.BeautifulSoup(res.text, "html.parser")


What does element.send_keys() return?
It returns the search result.
Are you sure? The error says it returns None.
That's line 12; line 11 works just fine. Although I don't quite understand this error.
(Oct-10-2018, 11:54 PM)Truman Wrote:
soup = bs4.BeautifulSoup(res.text, "html.parser")
AttributeError: 'NoneType' object has no attribute 'text'
(Oct-10-2018, 11:54 PM)Truman Wrote:
res = elem.send_keys('seleniumhq' + Keys.RETURN)

Right, so res is None, and None doesn't have a .text, so elem.send_keys isn't returning a new page.

Which makes sense, because why would typing into a random text field render a new page? That's crazy talk lol.
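The same error can be reproduced without a browser. This is only a sketch: FakeElement is a hypothetical stand-in, not Selenium's element class. It shows the general Python rule at work here: a method that performs an action and has no return statement evaluates to None, so anything chained onto its result fails.

```python
class FakeElement:
    """Hypothetical stand-in for a web element; not part of Selenium."""

    def __init__(self):
        self.typed = []

    def send_keys(self, keys):
        # Performs an action (records the keystrokes) and has no return
        # statement, so the call itself evaluates to None.
        self.typed.append(keys)


elem = FakeElement()
res = elem.send_keys("seleniumhq")
print(repr(res))  # None

try:
    res.text
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'text'
```

The keystrokes did reach the element (elem.typed holds them); the call simply has nothing to hand back.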
Try it yourself: put the first 10 lines in a file and run it.
res will be None, as mentioned.
This is not how you pass the page source to BeautifulSoup; use browser.page_source instead.
When I test it, I also have to click the agree button before I can move on.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import bs4
 
browser = webdriver.Chrome()
browser.get('http://www.yahoo.com')
assert 'Yahoo' in browser.title
 
agree = browser.find_element_by_xpath('/html/body/div[1]/div[2]/div[4]/div/div[2]/form[1]/div/input')
agree.click()
elem = browser.find_element_by_name('p') # find the search box
res = elem.send_keys('seleniumhq' + Keys.RETURN)
print(repr(res))
Output:
None
It should look like this.
soup = bs4.BeautifulSoup(browser.page_source, "html.parser")
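Once the soup is built from browser.page_source, the remaining step from the original plan is opening the first five results. Here is a minimal sketch of that step, using only the standard library. The helper names pick_urls and open_in_tabs and the sample list are made up for illustration. It assumes hrefs is the list you get from link.get('href') over soup.select('a'), which can contain None (anchors without an href), fragments, and relative paths that webbrowser cannot open on their own.

```python
import webbrowser
from urllib.parse import urlparse


def pick_urls(hrefs, limit=5):
    """Return up to `limit` distinct absolute http(s) URLs, in page order."""
    seen = set()
    urls = []
    for href in hrefs:
        # link.get('href') returns None for anchors without an href.
        if not href:
            continue
        # Skip fragments, relative paths, javascript: links and the like.
        if urlparse(href).scheme not in ("http", "https"):
            continue
        if href in seen:
            continue
        seen.add(href)
        urls.append(href)
        if len(urls) == limit:
            break
    return urls


def open_in_tabs(urls):
    """Open each URL in a new tab of the default browser."""
    for url in urls:
        webbrowser.open_new_tab(url)


# With the soup from browser.page_source, the call would look like:
#   hrefs = [link.get('href') for link in soup.select('a')]
#   open_in_tabs(pick_urls(hrefs))
sample = [
    None,
    "#top",
    "/settings",
    "https://www.seleniumhq.org/",
    "https://www.seleniumhq.org/",
    "https://example.com/a",
]
print(pick_urls(sample))
# ['https://www.seleniumhq.org/', 'https://example.com/a']
```

Keep in mind that soup.select('a') returns every anchor on the page, including navigation links, so the first five are not necessarily search results. Narrowing that down needs a more specific selector, which depends on the markup of the results page and is not covered in this thread.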