Python Forum
webscraping yahoo data - custom date implementation
#2
Please don't try to add bold or color inside the Python tags. It mucks everything up when others are trying to run the code.

I was able to get data for specific dates using the following code. It appears to be limited to a date range of somewhat less than 90 days. I also had to adjust the start date by one day to get the same printout as the web site shows when run manually. The following should help you get started.
import bs4 as bs
import urllib.request
import pandas as pd
import time
 
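def get_ticker(ticker, day_one, day_two):
    # day_one and day_two are epoch seconds, passed as strings
    url = 'https://finance.yahoo.com/quote/' + ticker + '/history?period1=' + day_one + '&period2=' + day_two + '&interval=1d&filter=history&frequency=1d'
    source = urllib.request.urlopen(url).read()
    soup = bs.BeautifulSoup(source, 'lxml')

    # Collect the text of every <td> cell, one list per table row
    data = []
    for tr in soup.find_all('tr'):
        row = [td.text for td in tr.find_all('td')]
        data.append(row)

    columns = ['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']

    # Skip the first row and the last two rows, which do not hold price data
    data = data[1:-2]
    df = pd.DataFrame(data)
    df.columns = columns
    df.set_index(columns[0], inplace=True)
    # convert_objects() was removed in pandas 1.0. Strip the thousands separators
    # and coerce to numbers instead (Volume then prints without commas).
    df = df.apply(lambda col: pd.to_numeric(col.str.replace(',', ''), errors='coerce'))
    # The page lists newest first; reverse to chronological order
    df = df.iloc[::-1]
    # Drop incomplete rows (any row with a missing or non-numeric value)
    df.dropna(inplace=True)

    return df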
    

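# April 3, 2018 = 1522728000  (seconds since the UNIX epoch, January 1, 1970 UTC)
# June 12, 2018 = 1528776000
# Both values are midnight at UTC-4 (US Eastern daylight time); time.mktime() below uses the local time zone
# https://finance.yahoo.com/quote/AAPL/history?period1=1522728000&period2=1528776000&interval=1d&filter=history&frequency=1d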


format_string='%Y-%m-%d %H:%M:%S'

# A one-day (86400-second) adjustment to the start date is required for the printed dates to match the web site's manual output
date1='2018-04-03 00:00:00'
date1_epoch = str(int(time.mktime(time.strptime(date1,format_string)))- 86400)
print("")
print(date1, date1_epoch)

date2='2018-06-12 00:00:00'
date2_epoch = str(int(time.mktime(time.strptime(date2,format_string))))
print(date2, date2_epoch)

df = get_ticker('AAPL', date1_epoch, date2_epoch)
print(df)    
Abridged output:

2018-04-03 00:00:00 1522728000
2018-06-12 00:00:00 1528776000
                Open    High     Low   Close  Adj Close      Volume
Date
Apr 03, 2018  167.64  168.75  164.88  168.39     167.74  30,278,000
Apr 04, 2018  164.88  172.01  164.77  171.61     170.95  34,605,500
Apr 05, 2018  172.58  174.23  172.08  172.80     172.14  26,933,200
... Intermediate data was there - deleted by Lewis to save space
Jun 11, 2018  191.35  191.97  190.21  191.23     191.23  18,308,500
Jun 12, 2018  191.39  192.61  191.15  192.28     192.28  16,911,100
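
If the range limit mentioned above gets in the way, one possible workaround is to request several shorter windows and join the results. The sketch below is only an illustration using the standard library: to_epoch() and split_range() are hypothetical helper names, not part of the code above, and the 30-day window size is an arbitrary assumption. to_epoch() also pins the UTC offset explicitly, so the result does not depend on the local time zone the way time.mktime() does.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400


def to_epoch(date_string, utc_offset_hours=0):
    """Epoch seconds for midnight on date_string (YYYY-MM-DD) at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    midnight = datetime.strptime(date_string, '%Y-%m-%d').replace(tzinfo=tz)
    return int(midnight.timestamp())


def split_range(start_epoch, end_epoch, max_days=60):
    """Split start_epoch..end_epoch into consecutive windows of at most max_days days."""
    step = max_days * SECONDS_PER_DAY
    windows = []
    window_start = start_epoch
    while window_start < end_epoch:
        window_end = min(window_start + step, end_epoch)
        windows.append((window_start, window_end))
        window_start = window_end
    return windows


# UTC-4 (US Eastern daylight time) reproduces the epoch values in the comments above
start = to_epoch('2018-04-03', utc_offset_hours=-4)
end = to_epoch('2018-06-12', utc_offset_hours=-4)

for window_start, window_end in split_range(start, end, max_days=30):
    # Each pair could be passed to get_ticker() as str(window_start), str(window_end)
    print(window_start, window_end)
```

Adjacent windows share a boundary timestamp, so a date on the boundary may come back twice; if the frames are joined with pd.concat(), duplicate index entries would need to be dropped afterwards.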
Lewis
To paraphrase: 'Throw out your dead' code. https://www.youtube.com/watch?v=grbSQ6O6kbs Forward to 1:00


Messages In This Thread
RE: webscraping yahoo data - custom date implementation - by ljmetzger - Jun-17-2018, 11:58 PM
