Python Forum

change source from csv data to gsheet - Tummerke - May-16-2019

Hey Guys,

I have tweaked a script that scrapes Google search results and writes them to a CSV file called data.csv.
I have already set up an API connection to Google Sheets (gsheet), but I can't get it working. It says "TypeError: open() takes 2 positional arguments but 3 were given".

I don't know if the commands I have used are correct.

I will post the original code and what I have changed so far.

Original:
 
from urllib.parse import urlencode, urlparse, parse_qs

from lxml.html import fromstring
from requests import get
import csv

def scrape_run():
    with open('searches.txt') as searches:
        for search in searches:
            userQuery = search
            raw = get("https://www.google.com/search?q=" + userQuery).text
            page = fromstring(raw)
            links = page.cssselect('.r a')
            csvfile = 'data.csv'
            for row in links:
                raw_url = row.get('href')
                title = row.text_content()
                if raw_url.startswith("/url?"):
                    url = parse_qs(urlparse(raw_url).query)['q']
                    csvRow = [userQuery, url[0], title]
                    with open(csvfile, 'a') as data:
                        writer = csv.writer(data)
                        writer.writerow(csvRow)

scrape_run()

My edit:

import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']

credentials = ServiceAccountCredentials.from_json_keyfile_name('Api koppeling Python-67433e13bb00.json', scope)

gc = gspread.authorize(credentials)

wks = gc.open('Python koppeling').sheet1

from urllib.parse import urlencode, urlparse, parse_qs

from lxml.html import fromstring
from requests import get

def scrape_run():
    with open('searches.txt', encoding='utf-8') as searches:
        for search in searches:
            userQuery = search
            raw = get("https://www.google.com/search?q=" + userQuery).text
            page = fromstring(raw)
            links = page.cssselect('.r a')
            csvfile = wks
            for row in links:
                raw_url = row.get('href')
                title = row.text_content()
                if raw_url.startswith("/url?"):
                    url = parse_qs(urlparse(raw_url).query)['q']
                    wks.append_row = [userQuery, url[0], title]
                    with gc.open(csvfile, 'a') as data:
                        writer = gspread.writer(data)
                        writer.writerow(csvRow)
scrape_run()

Can you help me see what I am doing wrong? :)

Thanks for your help and effort!

Tummerke


RE: change source from csv data to gsheet - heiner55 - May-21-2019

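This line is probably wrong:
with gc.open(csvfile, 'a') as data:
gc.open() only takes the spreadsheet title, so the extra 'a' argument is what raises the TypeError.
I think it should be written without the "gc." prefix.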
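
Dropping "gc." on its own may not be enough, though: the built-in open() expects a file path, not a worksheet, and `wks.append_row = [...]` assigns to the method instead of calling it. Below is a minimal sketch of how the write step might look, assuming `wks` is the gspread worksheet from the post and that calling `append_row` with a list adds one row. The helper names `extract_target` and `save_result` are made up for illustration and are not part of gspread.

```python
from urllib.parse import parse_qs, urlparse


def extract_target(raw_url):
    """Return the real URL from a Google '/url?q=...' redirect link, or None."""
    if not raw_url or not raw_url.startswith("/url?"):
        return None
    values = parse_qs(urlparse(raw_url).query).get("q")
    return values[0] if values else None


def save_result(worksheet, query, raw_url, title):
    """Append one result row to the worksheet; return True if a row was written.

    `worksheet` is assumed to be any object with an append_row(list) method,
    such as the `wks` object created in the post.
    """
    url = extract_target(raw_url)
    if url is None:
        return False
    # append_row is a method: call it with a list instead of assigning to it.
    # strip() drops the trailing newline that each line of searches.txt carries.
    worksheet.append_row([query.strip(), url, title])
    return True
```

Inside the loop, the `with gc.open(...)` block and the two writer lines could then be replaced by a single call such as `save_result(wks, userQuery, raw_url, title)`. No open() or writer is needed, because the worksheet is already opened once at the top of the script.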