pytrends problem

Karen · (This post was last modified: Apr-12-2017, 08:44 PM by zivoni.)

I've looked at pytrends, but I still can't understand why the system always reports invalid syntax on the 44th line. Can anyone help me? Thanks a million.

# -*- coding: utf-8 -*-
import time
import codecs
import random
import glob  
from pytrends.request import TrendReq

google_username = "[email protected]"
google_password = "XXXXXXX"

f = open('stocks_tw.txt', 'r')
stocks_no_name = []
for line in f.readlines():
    data = line.split('\t')
    stock_no = data[0].strip()
    stock_name = data[1].strip()
    stocks_no_name.append([stock_no, stock_name])
f.close()


files=glob.glob('*.csv')  
downloaded_files = [fd.title().lower()[0:4] for fd in files]

stocks_no_name_new = [] 
for stock_no_name in stocks_no_name:
    if not stock_no_name[0] in downloaded_files:
        stocks_no_name_new.append(stock_no_name)
stocks_no_name = stocks_no_name_new       
    
print(len(stocks_no_name))

# connect to Google
pytrends = TrendReq(google_username, google_password, custom_useragent='My Pytrends Script')

while stocks_no_name:
    stock_index = random.randint(0,len(stocks_no_name)-1)
    stock_no_name = stocks_no_name[stock_index]
    stock_no = stock_no_name[0]
    stock_name = stock_no_name[1]
    print(stock_no, stock_name)

    try:
        one_stock_data = []    
        trend_payload = {'q': stock_name, 'date': ''2013-12-29 2016-12-31', 'geo': 'TW','tz': 'Etc/GMT+8'}
        # trend
        trend = pytrend.trend(trend_payload)
        time.sleep(random.randint(120, 360))
    
        table = trend['table']
        rows = table['rows']
        for i in range(len(rows)):
            row_data = []
            for j in range(len(rows[0]['c'])):
                row_data.append(rows[i]['c'][j]['v'])
            one_stock_data.append(row_data) 
                
        # output one_stock_data to a file
        filename = unicode(stock_no, errors='ignore') + '.csv'
        outfile = codecs.open(filename, "wb", "utf-8")
        for i in range(len(one_stock_data)):
            one_stock_data_str =  str(one_stock_data[i][0]) + ", " + str(one_stock_data[i][1])
            if i != len(one_stock_data) - 1:
                one_stock_data_str =  one_stock_data_str + "\r\n"
            outfile.write(one_stock_data_str)
        outfile.close()
        
        stocks_no_name.pop(stock_index)
    
    except:
        time.sleep(random.randint(120, 360))
        continue

Moderator zivoni: removed login information

***zivoni*** · Apr-12-2017, 05:10 PM

There are two single quotes at the start of the 2013-.. string, ''2013-12-29 2016-12-31'; there should be only one.
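A quick way to confirm this kind of problem without running the whole script is to hand the suspect line to Python's built-in compile(). The sketch below is only an illustration: the helper name has_valid_syntax is made up for this example, and the two strings are shortened copies of the payload line from the script.

```python
def has_valid_syntax(source):
    """Return True if Python can compile the given source text (hypothetical helper)."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

# Shortened copy of the line as posted: two single quotes open the date string.
broken = "trend_payload = {'date': ''2013-12-29 2016-12-31', 'geo': 'TW'}"

# Same line with a single opening quote.
fixed = "trend_payload = {'date': '2013-12-29 2016-12-31', 'geo': 'TW'}"

print(has_valid_syntax(broken))  # False
print(has_valid_syntax(fixed))   # True
```

With the extra quote, Python reads '' as an empty string and then cannot make sense of the numbers that follow, which is why the error points at that line.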

Karen · Apr-12-2017, 08:27 PM

After fixing that line, I have another problem.
Below is what the IPython console shows. I use Anaconda Python. The script runs, but no file is downloaded. Why?

debugfile('C:/Users/user/Desktop/論文/資料來源/download_trend_past3years.py', wdir='C:/Users/user/Desktop/論文/資料來源')
Traceback (most recent call last):

File "<ipython-input-1-3457bb3b7f77>", line 1, in <module>
debugfile('C:/Users/user/Desktop/論文/資料來源/download_trend_past3years.py', wdir='C:/Users/user/Desktop/論文/資料來源')

File "C:\Users\user\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 888, in debugfile
debugger.run("runfile(%r, args=%r, wdir=%r)" % (filename, args, wdir))

File "C:\Users\user\Anaconda2\lib\bdb.py", line 400, in run
exec cmd in globals, locals

File "<string>", line 1, in <module>

File "C:\Users\user\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)

File "C:\Users\user\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 80, in execfile
scripttext = builtins.open(fname).read()+ '\n'

IOError: [Errno 22] invalid mode ('r') or filename: 'c:/users/user/desktop/\xe8\xab\x96\xe6?/\xe8\xb3\x87\xe6?\xe4\xbe\x86\xe6?/download_trend_past3years.py'

runfile('C:/Users/user/Desktop/論文/資料來源/download_trend_past3years.py', wdir='C:/Users/user/Desktop/論文/資料來源')
849
('1718', '\xa4\xa4\xc5\xd6')
('5608', '\xa5|\xba\xfb\xaf\xe8')
('2395', '\xac\xe3\xb5\xd8')
('2103', '\xa5x\xbe\xf3')

('2430', '\xc0\xe9\xa9[\xb9\xea\xb7~')

**buran** · Apr-12-2017, 08:36 PM

That's not related to your problem, but perhaps you want to edit your first post and remove your password from the script. And change it! :-)

wavic · Apr-12-2017, 08:40 PM

(Apr-12-2017, 08:36 PM)buran Wrote: That's not related to your problem, but perhaps you want to edit your first post and remove your password from the script. And change it! :-)

Wow

I'm coming, I'm coming

***zivoni*** · Apr-12-2017, 09:05 PM

From that error output it looks like it could be a Spyder problem. Maybe Spyder has trouble running a file with Chinese characters in its path? Try running your file directly from the command prompt, from your working directory, with python download_trend_past3years.py

Karen · (This post was last modified: Apr-12-2017, 10:30 PM by Karen.)

Can anyone tell me how to do that? I haven't run a .py file from the command prompt before.

I've figured out how to use the command prompt, but it still doesn't work.

Karen · Apr-12-2017, 10:50 PM

What do these lines in the script mean?

files=glob.glob('*.csv')
downloaded_files = [fd.title().lower()[0:4] for fd in files]

***zivoni*** · Apr-13-2017, 09:08 AM

glob.glob(pattern) returns a list of paths matching the given pattern. So if your working directory contains the files

Output:
second.csv  test_file.csv  Tracker.csv  test.txt  boo.doc

then files will be a list containing the names of the csv files.

Output:
>>> files = glob.glob("*.csv")
>>> files
['Tracker.csv', 'second.csv', 'test_file.csv']

downloaded_files = [fd.title().lower()[0:4] for fd in files]

The right side is a list comprehension. It takes each name from the files list, capitalizes the first character of each word in the name, then lowercases everything, then truncates the result to its first four characters.

Output:
>>> downloaded_files = [fd.title().lower()[0:4] for fd in files]
>>> downloaded_files
['trac', 'seco', 'test']

The .title() part is redundant (because of the following .lower()), and the zero can be omitted too.
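To make that concrete, here is a small sketch comparing the original comprehension with the simplified one. The function name downloaded_prefixes is made up for this example, and '1718.csv' is a hypothetical file name modelled on the stock numbers the script prints; the list is hard-coded so the snippet does not depend on what glob finds in your directory.

```python
def downloaded_prefixes(filenames):
    """Return the lowercased first four characters of each file name."""
    return [name.lower()[:4] for name in filenames]

# Hard-coded stand-in for glob.glob('*.csv'); '1718.csv' is a hypothetical example.
files = ['Tracker.csv', 'second.csv', 'test_file.csv', '1718.csv']

original = [fd.title().lower()[0:4] for fd in files]
simplified = downloaded_prefixes(files)

print(simplified)              # ['trac', 'seco', 'test', '1718']
print(original == simplified)  # True
print('1718' in simplified)    # True
```

In the script this list is what the later check "if not stock_no_name[0] in downloaded_files" looks at: a stock whose number already appears as the first four characters of a csv file name is treated as downloaded and skipped.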

Karen · Apr-13-2017, 05:06 PM

The code before the 42nd line seems to work, but pytrends doesn't. I've rewritten the 44th to 46th lines as

    try:
        one_stock_data = []
        # trend
        trend = pytrends.build_payload(kw_list=[stock_name],timeframe='2013-12-29 2016-12-31', geo='TW')
        time.sleep(random.randint(120, 360))

But it still doesn't work.
