Python Forum
pytrends problem
#1
I've referred to the pytrends documentation, but I still can't work out why Python always reports invalid syntax on the 44th line. Can anyone help me? Thanks a million.
# -*- coding: utf-8 -*-
import time
import codecs
import random
import glob  
from pytrends.request import TrendReq

google_username = "[email protected]"
google_password = "XXXXXXX"

f = open('stocks_tw.txt', 'r')
stocks_no_name = []
for line in f.readlines():
    data = line.split('\t')
    stock_no = data[0].strip()
    stock_name = data[1].strip()
    stocks_no_name.append([stock_no, stock_name])
f.close()


files=glob.glob('*.csv')  
downloaded_files = [fd.title().lower()[0:4] for fd in files]

stocks_no_name_new = [] 
for stock_no_name in stocks_no_name:
    if not stock_no_name[0] in downloaded_files:
        stocks_no_name_new.append(stock_no_name)
stocks_no_name = stocks_no_name_new       
    
print(len(stocks_no_name))

# connect to Google
pytrends = TrendReq(google_username, google_password, custom_useragent='My Pytrends Script')

while stocks_no_name:
    stock_index = random.randint(0,len(stocks_no_name)-1)
    stock_no_name = stocks_no_name[stock_index]
    stock_no = stock_no_name[0]
    stock_name = stock_no_name[1]
    print(stock_no, stock_name)

    try:
        one_stock_data = []    
        trend_payload = {'q': stock_name, 'date': ''2013-12-29 2016-12-31', 'geo': 'TW','tz': 'Etc/GMT+8'}
        # trend
        trend = pytrend.trend(trend_payload)
        time.sleep(random.randint(120, 360))
    
        table = trend['table']
        rows = table['rows']
        for i in range(len(rows)):
            row_data = []
            for j in range(len(rows[0]['c'])):
                row_data.append(rows[i]['c'][j]['v'])
            one_stock_data.append(row_data) 
                
        # output one_stock_data to a file
        filename = unicode(stock_no, errors='ignore') + '.csv'
        outfile = codecs.open(filename, "wb", "utf-8")
        for i in range(len(one_stock_data)):
            one_stock_data_str =  str(one_stock_data[i][0]) + ", " + str(one_stock_data[i][1])
            if i != len(one_stock_data) - 1:
                one_stock_data_str =  one_stock_data_str + "\r\n"
            outfile.write(one_stock_data_str)
        outfile.close()
        
        stocks_no_name.pop(stock_index)
    
    except:
        time.sleep(random.randint(120, 360))
        continue
Moderator zivoni: removed login information
#2
There are two single quotes at the start of the date string, ''2013-12-29 2016-12-31'. There should be only one.
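If the reported line number is hard to match up in an editor, one way to confirm where the parser gives up is to compile the source text yourself and read the line number off the SyntaxError. This is only a minimal sketch: find_syntax_error and the two-line sample are made up for illustration and are not part of pytrends.

```python
def find_syntax_error(source, filename="<script>"):
    """Return (line_number, message) for the first syntax error, or None."""
    try:
        # compile() only parses the text; nothing in it is executed.
        compile(source, filename, "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# Hypothetical two-line sample reproducing the doubled quote from the script.
broken = (
    "stock_name = 'demo'\n"
    "trend_payload = {'q': stock_name, 'date': ''2013-12-29 2016-12-31', 'geo': 'TW'}\n"
)
fixed = broken.replace("''2013", "'2013")

print(find_syntax_error(broken))  # reports line 2 of the sample
print(find_syntax_error(fixed))   # None once the extra quote is removed
```

The exact wording of the message differs between Python versions, so the line number is the part to rely on.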
#3
After fixing that line, I have another problem.
Below is what the IPython console shows. I am using Anaconda Python. The script runs, but no file is downloaded. Why?

debugfile('C:/Users/user/Desktop/論文/資料來源/download_trend_past3years.py', wdir='C:/Users/user/Desktop/論文/資料來源')
Traceback (most recent call last):

File "<ipython-input-1-3457bb3b7f77>", line 1, in <module>
debugfile('C:/Users/user/Desktop/論文/資料來源/download_trend_past3years.py', wdir='C:/Users/user/Desktop/論文/資料來源')

File "C:\Users\user\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 888, in debugfile
debugger.run("runfile(%r, args=%r, wdir=%r)" % (filename, args, wdir))

File "C:\Users\user\Anaconda2\lib\bdb.py", line 400, in run
exec cmd in globals, locals

File "<string>", line 1, in <module>

File "C:\Users\user\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)

File "C:\Users\user\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 80, in execfile
scripttext = builtins.open(fname).read()+ '\n'

IOError: [Errno 22] invalid mode ('r') or filename: 'c:/users/user/desktop/\xe8\xab\x96\xe6?/\xe8\xb3\x87\xe6?\xe4\xbe\x86\xe6?/download_trend_past3years.py'


runfile('C:/Users/user/Desktop/論文/資料來源/download_trend_past3years.py', wdir='C:/Users/user/Desktop/論文/資料來源')
849
('1718', '\xa4\xa4\xc5\xd6')
('5608', '\xa5|\xba\xfb\xaf\xe8')
('2395', '\xac\xe3\xb5\xd8')
('2103', '\xa5x\xbe\xf3')

('2430', '\xc0\xe9\xa9[\xb9\xea\xb7~')
#4
That's not related to your problem, but perhaps you want to edit your first post and remove your password from the script. And change it! :-)
#5
(Apr-12-2017, 08:36 PM)buran Wrote: That's not related to your problem, but perhaps you want to edit your first post and remove your password from the script. And change it! :-)

Wow LOL I'm coming, I'm coming
#6
From that error output, it looks like it could be a Spyder problem: the IOError is raised inside Spyder's debugfile call, so Spyder may have trouble opening a file whose path contains Chinese characters. Try running your file directly from a command prompt in your working directory with python download_trend_past3years.py
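Independently of the Spyder error, it may be worth checking whether the bare except at the end of the loop is hiding a failure. Any exception raised inside the try block is discarded, the script sleeps, and the loop moves on without writing a file. For instance, the posted code creates pytrends but later calls pytrend.trend(...), and a NameError like that would never be shown. Below is a minimal sketch of reporting the error instead of swallowing it; fetch_trend and download_all are hypothetical stand-ins for illustration, not pytrends functions.

```python
import traceback


def fetch_trend(stock_name):
    # Hypothetical stand-in for the real request. It fails the same way a
    # misspelled variable name would, so there is something to report.
    raise NameError("name 'pytrend' is not defined")


def download_all(stocks):
    """Try each stock once and collect the errors instead of hiding them."""
    failures = []
    for stock_no, stock_name in stocks:
        try:
            fetch_trend(stock_name)
        except Exception as err:
            # Print the full traceback so the cause is visible in the console.
            traceback.print_exc()
            failures.append((stock_no, repr(err)))
    return failures


failures = download_all([("1718", "demo")])
print(failures)
```

Once the cause is known, the except clause can be narrowed to the errors that are really worth retrying.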
#7
Can anyone tell me how to do that? I haven't run a .py file from the command prompt before.

I've worked out how to use the command prompt, but the script still doesn't work.
#8
What do these two lines of the script do?

files=glob.glob('*.csv')
downloaded_files = [fd.title().lower()[0:4] for fd in files]
#9
glob.glob(pattern) returns a list of paths matching the given pattern. So if your working directory contains the files
Output:
second.csv  test_file.csv  Tracker.csv  test.txt  boo.doc
then files will be a list containing the names of the csv files.
Output:
>>> files = glob.glob("*.csv")
>>> files
['Tracker.csv', 'second.csv', 'test_file.csv']
downloaded_files = [fd.title().lower()[0:4] for fd in files]
The right side is a list comprehension. For each name in the files list, it capitalizes the first character of each word, then lowercases the whole name, then truncates it to the first four characters.
Output:
>>> downloaded_files = [fd.title().lower()[0:4] for fd in files]
>>> downloaded_files
['trac', 'seco', 'test']
The .title() call is redundant (the following .lower() undoes it), and the zero in the slice can be omitted too.
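To check that claim, the two spellings can be compared on the same file names. This is a small sketch that uses a hard-coded list instead of a real glob.glob call so it runs anywhere; the equivalence is only assumed for plain ASCII names like these.

```python
# Hard-coded stand-in for glob.glob("*.csv").
files = ['Tracker.csv', 'second.csv', 'test_file.csv']

original = [fd.title().lower()[0:4] for fd in files]
simplified = [fd.lower()[:4] for fd in files]

print(simplified)              # ['trac', 'seco', 'test']
print(original == simplified)  # True
```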
#10
Up to the 42nd line, the code seems to work, but the pytrends part doesn't. I've rewritten lines 44 to 46 as

    try:
        one_stock_data = []
        # trend
        trend = pytrends.build_payload(kw_list=[stock_name], timeframe='2013-12-29 2016-12-31', geo='TW')
        time.sleep(random.randint(120, 360))

But it still doesn't work.