Jul-06-2019, 03:55 AM
Hi all,
I copied the code from Sentdex (https://youtu.be/j0zW_KXyQJ4) to produce the output, but I got the error below:
ValueError: columns overlap but no suffix specified: Index(['Unnamed: 0'], dtype='object')
Any suggestions on how to change my code?
Here is my code:
```python
import bs4 as bs
import datetime as dt
import os
import pandas as pd
import pandas_datareader.data as web
import pickle
import requests


def save_sp500_tickers():
    resp = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text,'lxml')
    table = soup.find('table', {'class':'wikitable'})
    tickers = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text.replace('.','-')
        ticker = ticker[:-1]
        tickers.append(ticker)
    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)
    print(tickers)
    return(tickers)

save_sp500_tickers()


def get_data_from_yahoo(reload_sp500=False):
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    start = dt.datetime(2019,6,8)
    end = dt.datetime.now()
    for ticker in tickers:
        print(ticker)
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = web.DataReader(ticker, 'yahoo', start, end)
            df.reset_index(inplace=True)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have{}'.format(ticker))


def compile_data():
    with open("sp500tickers.pickle","rb") as f:
        tickers = pickle.load(f)
    main_df = pd.DataFrame()
    for count,ticker in enumerate(tickers):
        df = pd.read_csv('stock_dfs/{}.csv'.format(ticker))
        df.set_index('Date',inplace=True)
        df.rename(columns = {'Adj Close':ticker}, inplace=True)
        df.drop(['Open','High','Low','Close','Volume'], 1, inplace=True)
        if main_df.empty:
            main_df = df
        else:
            main_df = main_df.join(df, how='outer')
        if count & 10 ==0:
            print(count)
    print(main_df.head())
    main_df.to_csv('sp500_joined_closed.csv')

compile_data()
```

I ran it in debug mode in VS Code and found that the error is raised in the second iteration of the `for count,ticker in enumerate(tickers)` loop.
Here is what my dataframes look like:
Input dataframe
![[Image: ZXZqVV9.jpg]](https://i.imgur.com/ZXZqVV9.jpg)
Output dataframe: the error occurs in the 2nd iteration, for ABT.csv
![[Image: 8JQREU3.jpg]](https://i.imgur.com/8JQREU3.jpg)
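
For what it's worth, the `'Unnamed: 0'` in the message looks like the integer index that `to_csv` writes out after `reset_index()`. `read_csv` reads it back as an extra column, so every ticker frame carries the same `'Unnamed: 0'` column and the second `join` reports an overlap. The sketch below is a small reproduction under that assumption, not the original script. The helper names `make_csv` and `load` and the sample prices are made up for illustration, and in-memory buffers stand in for the files under `stock_dfs/`.

```python
import io

import pandas as pd


def make_csv(first_price, write_index):
    """Hypothetical helper: mimic the save step (reset_index, then to_csv)."""
    df = pd.DataFrame(
        {"Adj Close": [first_price, first_price + 1.0]},
        index=pd.Index(["2019-06-10", "2019-06-11"], name="Date"),
    )
    df.reset_index(inplace=True)  # Date becomes a column; index is now 0, 1
    buf = io.StringIO()
    df.to_csv(buf, index=write_index)  # write_index=True is the to_csv default
    buf.seek(0)
    return buf


def load(buf, ticker):
    """Hypothetical helper: mimic the read step from compile_data()."""
    df = pd.read_csv(buf)
    df.set_index("Date", inplace=True)
    df.rename(columns={"Adj Close": ticker}, inplace=True)
    return df


# Saving with the default index=True leaves an 'Unnamed: 0' column behind.
a = load(make_csv(1.0, write_index=True), "MMM")
b = load(make_csv(2.0, write_index=True), "ABT")
print(list(a.columns))  # ['Unnamed: 0', 'MMM']

error_message = None
try:
    a.join(b, how="outer")
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Saving with index=False avoids the extra column, so the join goes through.
a_ok = load(make_csv(1.0, write_index=False), "MMM")
b_ok = load(make_csv(2.0, write_index=False), "ABT")
joined = a_ok.join(b_ok, how="outer")
print(list(joined.columns))  # ['MMM', 'ABT']
```

If this is indeed the cause, passing `index=False` to `df.to_csv(...)` in `get_data_from_yahoo()` and regenerating the files should be enough. For files that are already saved, dropping the column after reading, for example `df.drop(columns=['Unnamed: 0'])`, is another option.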