Posts: 2,953
Threads: 48
Joined: Sep 2016
Google uses JavaScript heavily. I wrote a translation script back in the day and wondered why it only worked with the downloaded page.
Posts: 5,151
Threads: 396
Joined: Sep 2016
Dec-22-2017, 12:50 AM
(This post was last modified: Dec-22-2017, 06:40 PM by metulburr.)
(Dec-21-2017, 09:10 PM)DevinGP Wrote: (Dec-21-2017, 07:33 PM)metulburr Wrote: Then it is probably using JavaScript and you are only left with Selenium as an option.
I didn't know the results might be generated by JavaScript, though.
Do you mind telling me how I would integrate Selenium into my current code, or at least pointing me to a tutorial where someone uses it to scrape the titles and summaries? Thank you!
from selenium import webdriver
import time
from bs4 import BeautifulSoup

# Path to the local chromedriver binary; adjust for your machine.
DRIVERPATH = '/home/metulburr/chromedriver'

class Data:
    def __init__(self, search):
        self.url = 'https://www.google.com/'
        self.setup_driver(self.url)
        #self.browser.delete_all_cookies()
        self.search = search
        self.handle_search()
        self.get_data()
        # Debugging leftover that keeps the browser open; take it out for real use.
        time.sleep(1010000000)

    def get_data(self):
        # Hand the JavaScript-rendered page source to BeautifulSoup for parsing.
        soup = BeautifulSoup(self.browser.page_source, 'html.parser')
        divs = soup.find_all('div', {'class':'g'})
        for div in divs:
            print(div.a.text)
            print(div.a['href'])
            desc = div.find('span', {'class':'st'})
            # Some results have no description span, so guard against None.
            if desc is not None:
                print(desc.text)

    def handle_search(self):
        # Click the search box, type the query, then click the search button.
        self.browser.find_element_by_xpath('//*[@id="lst-ib"]').click()
        self.browser.find_element_by_id("lst-ib").send_keys(self.search)
        time.sleep(1)
        self.browser.find_element_by_xpath('//*[@id="sbtc"]/div[2]/div[2]/div[1]/div/ul/li[7]/div/span[1]/span/input').click()
        time.sleep(1)

    def setup_driver(self, url):
        self.browser = webdriver.Chrome(DRIVERPATH)
        self.browser.set_window_position(0,0)
        self.browser.get(url)

data = Data('python forum')
data.browser.quit()
Output:
Forums | Python.org
https://www.python.org/community/forums/
The official home of the Python Programming Language.
Python Forum
https://python-forum.io/
The official forum for Python programming language.
What are the best Python forums to hang out in? : Python - Reddit
https://www.reddit.com/r/Python/comments/1lqr8j/what_are_the_best_python_forums_to_hang_out_in/
I'm a noob, I've completed the Python Codecademy course and as much as I practice, I feel that I'll learn much quicker if I talk to other Python...
Python For Beginners Forum | Codecademy
https://www.codecademy.com/en/forums/python-for-beginners
Why am I getting a ValueError for this array? B1671a1b260e84a63ac485357c417c42?s=140&d=retro HighKingOfGondor over 2 years ago. 0 answers. Can anyone tell me why this won't work? Picture Samik Maini over 2 years ago. 1 answer. Web Scraping Homework - Error. 526ff7f280ff3369df003e2c_43087875 ...
Python Forum | Dream.In.Code
http://www.dreamincode.net/forums/forum/29-python/
This sub-forum is for Python programmers and professionals to discuss topical and non-help related Python topics, start and participate in fun challenges (NOT HOMEWORK), and share news about the languages and related technologies. Please Post HOMEWORK/ACADEMIC questions in the main Python Help Forum.
13 Answers - What are some active Python forums for beginners? - Quora
https://www.quora.com/What-are-some-active-Python-forums-for-beginners
Here are some great places to get started: * Python FAQs * Attend a Conference * Diversity Statement Success Stories > My experience with the Python community has been awesome. I have met some fantastic people through local meetups and gotten grea...
Python Programming - Dev Shed Forums
http://forums.devshed.com/python-programming-11/
Python Programming - Python Programming forum discussing coding techniques, tips and tricks, and Zope related information. Python was designed from the.
FlaskBB - A Lightweight Forum Software in Python
https://flaskbb.org/
FlaskBB - A Lightweight Forum Software in Python.
Python - Raspberry Pi Forums
https://www.raspberrypi.org/forums/viewforum.php?f=32
STICKY: Python Usage Guide by ben_nuttall » Tue Jun 10, 2014 5:29 pm. 24 Replies: 49647 Views: Last post by gkreidl. Mon May 15, 2017 6:33 am. STICKY: Use code tags when posting Python code by mahjongg » Wed Aug 13, 2014 11:33 pm. 0 Replies: 14426 Views: Last post by mahjongg. Wed Aug 13, 2014 11:33 ...
Python - CodingForums
https://www.codingforums.com/python/
Model.predict() always returning the same value of 1 for opencv. Started by nastyheatnor, 12-14-2017 09:17 AM. box, face, import, print, python. Replies: 0; Views: 172; Rating0 / 5. Last Post By. nastyheatnor · View Profile · View Forum Posts. 12-14-2017, 09:17 AM Go to last post ...
Posts: 22
Threads: 9
Joined: Nov 2017
The search results are generated with JavaScript and bs4 can't render JavaScript.
Posts: 2,953
Threads: 48
Joined: Sep 2016
BeautifulSoup doesn't render anything. It parses the file, builds a tree, and gives you methods to search that tree.
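To make that concrete, here is a minimal sketch using only the standard library's html.parser (the parser behind BeautifulSoup's 'html.parser' option) rather than bs4 itself. The page markup and the TagCollector name are made up for illustration; the point is only that a parser reads the markup it is given and never runs the script.

```python
from html.parser import HTMLParser

# Hypothetical page: the result <div class="g"> only exists after a
# browser has executed the script. The raw HTML does not contain it.
PAGE = """
<html><body>
<div id="results"></div>
<script>
var d = document.createElement("div");
d.className = "g";
d.textContent = "Python Forum";
document.getElementById("results").appendChild(d);
</script>
</body></html>
"""


class TagCollector(HTMLParser):
    """Records (tag, class attribute) for every start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs).get("class")))


collector = TagCollector()
collector.feed(PAGE)

# The script body is treated as plain text, so no class="g" element appears.
print(("div", "g") in collector.seen)  # False
```

That is why the rendered page_source from a real browser has to be handed to the parser when the content is built by JavaScript.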
Posts: 7,310
Threads: 123
Joined: Sep 2016
(Dec-22-2017, 06:19 AM)RickyWilson Wrote: The search results are generated with JavaScript and bs4 can't render JavaScript.
That's why you first use Selenium with PhantomJS, then give the page source to BeautifulSoup for parsing, as shown by metulburr.
time.sleep(1010000000)
Posts: 5,151
Threads: 396
Joined: Sep 2016
Quote: time.sleep(1010000000)
Oh yeah, I forgot to take that out before posting. I use it to keep things simple when answering general questions; otherwise I use WebDriverWait/EC/NoSuchElementException etc.
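For anyone wondering what replaces the fixed sleeps: the idea behind an explicit wait is to poll a condition until it succeeds or a timeout expires. The sketch below is a plain-Python stand-in for that idea, not Selenium's WebDriverWait itself. The names wait_until and WaitTimeout are hypothetical, and LookupError merely plays the role that NoSuchElementException would play with a real driver.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll=0.5, ignored=(LookupError,)):
    """Call condition() repeatedly until it returns something truthy.

    Exceptions listed in `ignored` are treated as "not ready yet".
    Raises WaitTimeout once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # element not there yet, try again
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Simulated element lookup that fails twice before succeeding.
attempts = []

def find_search_box():
    attempts.append(1)
    if len(attempts) < 3:
        raise LookupError("element not in the page yet")
    return "search box"

print(wait_until(find_search_box, timeout=2, poll=0.01))  # search box

# With a Selenium 3 driver the call might look like this (assumption):
# wait_until(lambda: driver.find_element_by_id("lst-ib"),
#            ignored=(NoSuchElementException,))
```

Unlike time.sleep(1), this returns as soon as the element shows up and fails loudly if it never does.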
Posts: 7,310
Threads: 123
Joined: Sep 2016
Dec-22-2017, 10:23 PM
(This post was last modified: Dec-22-2017, 10:23 PM by snippsat.)
Another one: I dropped driving the search box with Chrome/PhantomJS and put the query straight into the URL with search?q= instead.
I have also added next-page search: Google_Search('python forum', page=1).
from selenium import webdriver
from bs4 import BeautifulSoup

class Google_Search:
    def __init__(self, search, page=0):
        self.search = search
        self.page = page
        # The start parameter is the result offset: page N begins at N*10.
        self.url = (
            f'https://www.google.com/search?q={self.search}'
            f'&ei=m3w9WuyXNJHMwALelovwAQ&start={self.page * 10}&sa=N&biw=848&bih=972'
        )
        self.result()

    def result(self):
        browser = webdriver.PhantomJS()
        try:
            browser.get(self.url)
            soup = BeautifulSoup(browser.page_source, 'lxml')
        finally:
            # Always shut the headless browser down, even if loading fails.
            browser.quit()
        # Result titles sit in <h3 class="r">, the displayed URLs in <cite>.
        name_link = soup.find_all('h3', class_='r')
        link = soup.find_all('cite')
        for n_link, l in zip(name_link, link):
            print(f'{n_link.text}\n{l.text}')
            print('---------')

if __name__ == '__main__':
    Google_Search('python forum')
    #Google_Search('python forum', page=1)
Output:
Forums | Python.org
https://www.python.org/community/forums/
---------
Python Forum
https://python-forum.io/
---------
What are the best Python forums to hang out in? : Python - Reddit
https://www.reddit.com/.../Python/.../what_are_the_best_python_forums_to_ hang_out_in/
---------
Python Forum | Dream.In.Code
www.dreamincode.net/forums/forum/29-python/
---------
Python For Beginners Forum | Codecademy
https://www.codecademy.com/en/forums/python-for-beginners
---------
Python Syntax Forum | Codecademy
https://www.codecademy.com/.../forums/introduction-to-python-6WeG3
---------
Nytt Norsk Python Forum - Scriptingspråk (Python, Perl, Ruby o.l ...
https://www.diskusjon.no/index.php?showtopic=828151
---------
Python Programming - Dev Shed Forums
forums.devshed.com/python-programming-11/
---------
Python - Raspberry Pi Forums
https://www.raspberrypi.org/forums/viewforum.php?f=32
---------
Python - thenewboston Forum
https://thenewboston.com/forum/category.php?id=15
---------
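A small aside on the URL in the script above: the query goes into the string unescaped, which works here because the browser tidies it up. A hedged alternative is to let urllib.parse build the query string. In this sketch the google_search_url helper and the RESULTS_PER_PAGE constant are names made up for illustration, and only the q and start parameters from the original URL are kept.

```python
from urllib.parse import urlencode

# Assumption: ten results per page, matching the start offsets used above.
RESULTS_PER_PAGE = 10


def google_search_url(search, page=0):
    """Build a search URL; page 0 is the first page of results."""
    params = {"q": search, "start": page * RESULTS_PER_PAGE}
    # urlencode escapes spaces and other special characters in the query.
    return "https://www.google.com/search?" + urlencode(params)


print(google_search_url("python forum"))
# https://www.google.com/search?q=python+forum&start=0
print(google_search_url("python forum", page=1))
# https://www.google.com/search?q=python+forum&start=10
```

The result could be passed to browser.get() in place of the hand-built f-string.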