Hey Guys,
I have tweaked some code that scrapes data from Google search results and places it into a CSV file called data.csv.
I have already set up an API connection to Google Sheets (gsheet), but I can't get it working. It fails with "TypeError: open() takes 2 positional arguments but 3 were given".
I don't know whether the commands I have used are correct.
I will post the original code and what I have changed so far.
Original:
from urllib.parse import urlencode, urlparse, parse_qs
from lxml.html import fromstring
from requests import get
import csv

def scrape_run():
    with open('searches.txt') as searches:
        for search in searches:
            userQuery = search
            raw = get("https://www.google.com/search?q=" + userQuery).text
            page = fromstring(raw)
            links = page.cssselect('.r a')
            csvfile = 'data.csv'
            for row in links:
                raw_url = row.get('href')
                title = row.text_content()
                if raw_url.startswith("/url?"):
                    url = parse_qs(urlparse(raw_url).query)['q']
                    csvRow = [userQuery, url[0], title]
                    with open(csvfile, 'a') as data:
                        writer = csv.writer(data)
                        writer.writerow(csvRow)

scrape_run()

My edit:
import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
credentials = ServiceAccountCredentials.from_json_keyfile_name('Api koppeling Python-67433e13bb00.json', scope)
gc = gspread.authorize(credentials)
wks = gc.open('Python koppeling').sheet1

from urllib.parse import urlencode, urlparse, parse_qs
from lxml.html import fromstring
from requests import get

def scrape_run():
    with open('searches.txt', encoding='utf-8') as searches:
        for search in searches:
            userQuery = search
            raw = get("https://www.google.com/search?q=" + userQuery).text
            page = fromstring(raw)
            links = page.cssselect('.r a')
            csvfile = wks
            for row in links:
                raw_url = row.get('href')
                title = row.text_content()
                if raw_url.startswith("/url?"):
                    url = parse_qs(urlparse(raw_url).query)['q']
                    wks.append_row = [userQuery, url[0], title]
                    with gc.open(csvfile, 'a') as data:
                        writer = gspread.writer(data)
                        writer.writerow(csvRow)

scrape_run()

Can you help me see what I am doing wrong? :)
Thanks for your help and effort!
Tummerke
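
A note on the likely cause, offered as a sketch rather than a definitive fix: the TypeError appears to come from `gc.open(csvfile, 'a')`. Unlike the built-in `open`, gspread's `Client.open` looks up a spreadsheet by its title and has no file mode, so the extra `'a'` is one argument too many. In addition, `wks.append_row = [...]` assigns a list over the method instead of calling it, and as far as I know gspread has no `writer` object. Assuming `wks` is the worksheet already opened at the top of the script, calling `wks.append_row(row)` should be enough. The helper names `build_row` and `append_rows` and the `FakeWorksheet` class below are hypothetical; the fake only stands in for the real worksheet so the example runs without credentials.

```python
from urllib.parse import urlparse, parse_qs


def build_row(query, raw_url, title):
    """Return [query, target_url, title] for a Google redirect link, else None."""
    if not raw_url or not raw_url.startswith("/url?"):
        return None
    targets = parse_qs(urlparse(raw_url).query).get("q")
    if not targets:
        return None
    # strip() drops the trailing newline that comes with each line of searches.txt
    return [query.strip(), targets[0], title]


def append_rows(worksheet, rows):
    """Append each row by calling worksheet.append_row(row).

    `worksheet` is assumed to be a gspread worksheet, e.g.
    gc.open('Python koppeling').sheet1. No second open() call and no
    writer object is involved.
    """
    count = 0
    for row in rows:
        if row is not None:
            worksheet.append_row(row)
            count += 1
    return count


class FakeWorksheet:
    """Hypothetical stand-in for a gspread worksheet (assumption, for the demo only)."""

    def __init__(self):
        self.rows = []

    def append_row(self, values):
        self.rows.append(values)


demo = FakeWorksheet()
links = [("/url?q=https://example.com/&sa=U", "Example"), ("/search?q=x", "skip")]
added = append_rows(demo, (build_row("python\n", href, text) for href, text in links))
```

In the real script, `demo` would be replaced by `wks`, and the `(href, text)` pairs would come from `row.get('href')` and `row.text_content()` inside the existing loop.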