Jun-17-2018, 05:24 PM
Hi,
I'm trying to scrape historical prices from Yahoo Finance.
I managed to get the data, but only for the most recent 4-5 months.
I can't figure out how to set the time period so that I can pass a start and end date.
Any help would be really appreciated!
Below is an example for Apple, where you can see the time period I'm trying to access:
https://finance.yahoo.com/quote/AAPL/history?p=AAPL
import bs4 as bs
import urllib.request
import pandas as pd

def get_ticker(ticker):
    url = 'https://finance.yahoo.com/quote/' + ticker + '/history?p=' + ticker
    source = urllib.request.urlopen(url).read()
    soup = bs.BeautifulSoup(source, 'lxml')
    tr = soup.find_all('tr')
    data = []
    for table in tr:
        td = table.find_all('td')
        row = [i.text for i in td]
        data.append(row)
    data = data[1:-2]
    df = pd.DataFrame(data)
    df.columns = columns
    df.set_index(columns[0], inplace=True)
    df = df.convert_objects(convert_numeric=True)
    df = df.iloc[::-1]
    df.dropna(inplace=True)
    return df
(Jun-17-2018, 05:24 PM)Jens89 Wrote: [ -> ]Hi,
I'm trying to scrape historical prices from Yahoo Finance.
I managed to get the data, but only for the most recent 4-5 months.
I can't figure out how to set the time period so that I can pass a start and end date.
Any help would be really appreciated!
Below is an example for Apple, where you can see the time period I'm trying to access:
https://finance.yahoo.com/quote/AAPL/history?p=AAPL
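import bs4 as bs
import urllib.request
import pandas as pd

def get_ticker(ticker):
    # Download the history page and parse every table row
    url = 'https://finance.yahoo.com/quote/' + ticker + '/history?p=' + ticker
    source = urllib.request.urlopen(url).read()
    soup = bs.BeautifulSoup(source, 'lxml')
    tr = soup.find_all('tr')
    data = []
    for table in tr:
        td = table.find_all('td')
        row = [i.text for i in td]
        data.append(row)
    [b]columns = ['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'][/b]
    # Drop the header row and the two trailing footer rows
    data = data[1:-2]
    df = pd.DataFrame(data)
    df.columns = columns
    df.set_index(columns[0], inplace=True)
    # convert_objects() is gone from current pandas; strip thousands separators
    # and coerce to numbers instead (unparseable cells become NaN)
    df = df.replace(',', '', regex=True).apply(pd.to_numeric, errors='coerce')
    # Oldest row first
    df = df.iloc[::-1]
    # Remove incomplete rows such as dividend entries
    df.dropna(inplace=True)
    return df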
I forgot to add the columns list. It is added in bold above.
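On the start and end date question, here is a rough sketch of one thing to try. It assumes the history page accepts period1 and period2 query parameters holding Unix timestamps in seconds, plus an interval parameter, which is what the address bar shows after picking a time period on the page. The helper name build_history_url is made up for this example. As far as I can tell the page also loads older rows with JavaScript while scrolling, so a plain urlopen may still return only part of a long date range.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_history_url(ticker, start, end):
    """Build a history URL for dates given as 'YYYY-MM-DD' strings.

    Assumption: the page reads period1/period2 as Unix timestamps (seconds)
    and interval as the row frequency. This helper is hypothetical and is
    not part of any Yahoo library.
    """
    def to_epoch(day):
        # Interpret the date as midnight UTC and convert to whole seconds
        parsed = datetime.strptime(day, '%Y-%m-%d').replace(tzinfo=timezone.utc)
        return int(parsed.timestamp())

    query = urlencode({
        'period1': to_epoch(start),
        'period2': to_epoch(end),
        'interval': '1d',
    })
    return 'https://finance.yahoo.com/quote/' + ticker + '/history?' + query


url = build_history_url('AAPL', '2018-01-01', '2018-06-17')
print(url)
```

The resulting URL could replace the url line in get_ticker, with start and end passed in as extra arguments.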