With Selenium, create a Google Search list in Incognito mode with a specific location (Printable Version)
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: /thread-27605.html
With Selenium, create a Google Search list in Incognito mode with a specific location - tsurubaso - Jun-13-2020

Hello to all,

A bit of context: I have to do a deadly repetitive task for a colleague that begins Monday. I need to make a list of search results from Google Search in Incognito/Secret mode: 300 results for each of 17 locations, and for each result the link, title and short description. ARGGHHHHH. I also want to help my colleague not to explode.

What I did until now: I tried Selenium a first time, and weird errors occurred (I will come back to that below). I then switched to fake_useragent and BeautifulSoup. The code is working, but I don't know if it is possible to implement location and Incognito mode with it.

Here is the code:

    import csv
    import re
    import urllib.parse

    import requests
    from bs4 import BeautifulSoup
    from fake_useragent import UserAgent

    # CSV header: rank, title, summary, link, related keywords
    csv_list = [["順位", "タイトル", "要約", "リンク", "関連キーワード"]]

    query = "'tour eifelle'"
    query = urllib.parse.quote_plus(query)  # Format into URL encoding
    number_result = 20

    ua = UserAgent()
    google_url = "https://www.google.com/search?q=" + query + "&num=" + str(number_result)
    # Send the user agent as a header; a positional dict would be treated as query parameters.
    # The class names below depend on the HTML variant Google serves and may need adjusting.
    response = requests.get(google_url, headers={"User-Agent": ua.random})
    soup = BeautifulSoup(response.text, "html.parser")

    result_div = soup.find_all('div', attrs={'class': 'ZINbbc'})

    links = []
    titles = []
    descriptions = []

    for r in result_div:
        # Skip the result if any of the three elements is missing
        try:
            link = r.find('a', href=True)
            title = r.find('div', attrs={'class': 'vvjwJb'}).get_text()
            description = r.find('div', attrs={'class': 's3v9rd'}).get_text()
        except AttributeError:
            continue
        # Check to make sure everything is present before appending
        if link and title and description:
            href = link['href']
            # Remove the "/url?q=" prefix and the trailing "&sa..." part
            if href.startswith('/url?q='):
                href = href[len('/url?q='):]
            links.append(re.sub(r'&sa.*', "", href))
            titles.append(title)
            descriptions.append(description)

    # to_remove = []
    # clean_links = []
    # for i, l in enumerate(links):
    #     clean = re.search('\/url\?q\=(.*)\&sa', l)
    #     # Anything that doesn't fit the above pattern will be removed
    #     if clean is None:
    #         to_remove.append(i)
    #         continue
    #     clean_links.append(clean.group(1))
    # # Remove the corresponding titles & descriptions
    # for x in to_remove:
    #     del titles[x]
    #     del descriptions[x]

    for i in range(len(titles)):
        add_list = [i + 1, titles[i], descriptions[i], links[i]]
        csv_list.append(add_list)

    # Save the result list to CSV
    with open('Search_word.csv', 'w', encoding="utf-8_sig") as f:
        writecsv = csv.writer(f, lineterminator='\n')
        writecsv.writerows(csv_list)

    # links
    # titles
    # descriptions

After that I tried to go back to Selenium. Here is the code:

    import csv
    import time  # needed for sleep

    from selenium import webdriver  # automates the web browser (python -m pip install selenium)
    import chromedriver_binary  # puts chromedriver on the PATH


    def ranking(driver):
        i = 1  # loop counter / page number
        title_list = []  # empty list for the titles
        link_list = []  # empty list for the URLs
        summary_list = []
        RelatedKeywords = []

        # Loop until the current page exceeds the maximum number of pages to analyse
        while i <= i_max:
            # Titles and links are inside class="r"
            class_group = driver.find_elements_by_class_name('r')
            class_group1 = driver.find_elements_by_class_name('s')
            class_group2 = driver.find_elements_by_class_name('nVcaUb')

            # Extract the titles and links and append them to the lists
            for elem in class_group:
                title_list.append(elem.find_element_by_class_name('LC20lb').text)  # title (class="LC20lb")
                link_list.append(elem.find_element_by_tag_name('a').get_attribute('href'))  # link (href of the a tag)
            for elem in class_group1:
                summary_list.append(elem.find_element_by_class_name('st').text)  # summary (class="st")
            for elem in class_group2:
                RelatedKeywords.append(elem.text)  # related keyword text

            # There is only one "Next" link, but search with find_elements on purpose:
            # an empty list means this is the last page.
            if driver.find_elements_by_id('pnnext') == []:
                i = i_max + 1
            else:
                # The URL of the next page is the href of id="pnnext"
                next_page = driver.find_element_by_id('pnnext').get_attribute('href')
                driver.get(next_page)  # go to the next page
                i = i + 1  # update i
                time.sleep(3)  # wait 3 seconds

        # Return the titles, links, summaries and related keywords
        return title_list, link_list, summary_list, RelatedKeywords


    driver = webdriver.Chrome()  # prepare Chrome
    driver.get('https://www.google.com/')  # open Google
    i_max = 5  # maximum number of pages to analyse
    search = driver.find_element_by_name('q')  # locate the search box (name='q') in the HTML
    search.send_keys('Test blender')  # send the search term
    search.submit()  # run the search
    time.sleep(1.5)  # wait 1.5 seconds

    # Run ranking() to get the titles, URLs, summaries and related keywords
    title, link, summary, RelatedKeywords = ranking(driver)

    # CSV header: rank, title, summary, link, related keywords
    csv_list = [["順位", "タイトル", "要約", "リンク", "関連キーワード"]]
    for i in range(len(title)):
        add_list = [i + 1, title[i], summary[i], link[i]]
        csv_list.append(add_list)

    # Save the result list to CSV
    with open('Search_word.csv', 'w', encoding="utf-8_sig") as f:
        writecsv = csv.writer(f, lineterminator='\n')
        writecsv.writerows(csv_list)

    driver.quit()

I specified the path

Quote: C:\Users\Name\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\chromedriver_binary

But I get this error message.

After that, I tried to specify the path directly in the code, on this line:

    driver = webdriver.Chrome()

But I encountered another problem with:

    driver = webdriver.Chrome(r'C:\Users\Name\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\chromedriver_binary')

I tried all the solutions here; none of them worked. If you can find something to help, I will be extremely happy.

RE: With Selenium, create a Google Search list in Incognito mode with a specific location - mlieqo - Jun-14-2020

I am not familiar with the chromedriver-binary package. Just try downloading chromedriver from here: https://chromedriver.chromium.org/downloads and then

    browser = webdriver.Chrome(executable_path=r"C:\path\to\chromedriver.exe")

RE: With Selenium, create a Google Search list in Incognito mode with a specific location - Yoriz - Jun-14-2020

Same error discussed in the thread WebDriverException: 'chromedriver' executable needs to be in PATH

RE: With Selenium, create a Google Search list in Incognito mode with a specific location - tsurubaso - Jun-15-2020

mlieqo, believe me, I tried.
Yoriz, thank you also for the link. I don't know whether working on a Japanese computer changes some parameters; I am just not finding solutions. I spent 2 days on this. I have to take another route. Thanks.
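
A note for readers who hit the same error: the path passed to webdriver.Chrome in the first post points at the chromedriver_binary folder, while the reply from mlieqo passes the path of the chromedriver.exe file itself. That difference may be one cause of the second problem, although the thread does not show the error text, so this is only a guess. Below is a minimal sketch of a check that accepts either a file or a folder and returns the executable. The names resolve_chromedriver and make_driver are hypothetical helpers, not part of Selenium, and make_driver assumes the Selenium 3 executable_path keyword used in the thread.

```python
from pathlib import Path


def resolve_chromedriver(path):
    """Return the chromedriver executable for a path that may be a file or a folder.

    Hypothetical helper: it only inspects the file system, it does not start a browser.
    """
    candidate = Path(path)
    if candidate.is_file():
        return candidate
    if candidate.is_dir():
        # Look for the binary inside the folder (Windows name first, then Unix name).
        for name in ("chromedriver.exe", "chromedriver"):
            inner = candidate / name
            if inner.is_file():
                return inner
    raise FileNotFoundError(f"No chromedriver executable found at {path!r}")


def make_driver(driver_path):
    """Start Chrome with an explicit driver path (Selenium 3 style, as in the thread)."""
    # Imported lazily so resolve_chromedriver can be used without Selenium installed.
    from selenium import webdriver

    exe = resolve_chromedriver(driver_path)
    return webdriver.Chrome(executable_path=str(exe))
```

Calling resolve_chromedriver on the folder from the first post would either return the executable inside it or raise FileNotFoundError, which makes a wrong path visible before Selenium is involved.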
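
The original question (Incognito mode plus a location, 300 results per location) is never answered in the thread, so here is a rough sketch of one possible approach, not a confirmed solution. It assumes Chrome's --incognito command-line switch for the private window, and it assumes Google honors the gl (country) and hl (language) URL parameters together with num and start for paging; only num appears in the first post, the others are assumptions, and gl works at country level only, so it may not cover city-level locations. The function names are hypothetical.

```python
from urllib.parse import urlencode


def build_search_urls(query, total=300, per_page=100, country=None, language=None):
    """Build paginated Google Search URLs.

    `gl` and `hl` are added only when given; `start` and `gl`/`hl` are assumptions
    about Google's URL parameters, not something the thread confirms.
    """
    urls = []
    for start in range(0, total, per_page):
        params = {"q": query, "num": per_page, "start": start}
        if country:
            params["gl"] = country
        if language:
            params["hl"] = language
        urls.append("https://www.google.com/search?" + urlencode(params))
    return urls


def make_incognito_driver():
    """Start Chrome in Incognito mode through Selenium."""
    # Imported lazily so build_search_urls can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--incognito")  # Chrome switch that opens a private window
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    # One list of URLs per location; the country codes here are placeholders.
    for country in ("fr", "jp"):
        for url in build_search_urls("tour eiffel", country=country):
            print(url)
```

Each URL could then be opened with driver.get(url) and parsed with the ranking function from the first post, with a pause between requests.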