Yahoo_fin, Pandas: how to convert data table structure in csv file

detlefschmitt · Feb-12-2021, 02:42 AM

The below Yahoo_fin script exports 59 stock stats for each ticker to csv. The program is working fine.

import pandas as pd
from yahoo_fin import stock_info as si
import glob
  
stock_list = "MSFT", "AAPL"
 
stats = {}
  
for ticker in stock_list:
  data2 = si.get_stats(ticker)
  data2 = data2.iloc[:,:2]
  data2.columns = ["Attribute", "Recent"]
  stats[ticker] = data2
  
combined2 = pd.concat(stats)
combined2 = combined2.reset_index()
del combined2["level_1"]
combined2.columns = ["Ticker", "Attribute", "Recent"]
  
df = pd.DataFrame(combined2)
df2 = df.drop_duplicates(subset=None, keep="first", inplace=False)
df2.to_csv(r'stats.csv')

The csv output file looks as follows:

Output:	Ticker	Attribute	               Recent
0	MSFT	Market Cap	               1.84T
1	MSFT	Enterprise Value	       1.78T
.
.
58	MSFT	Free Cash Flow	               34.79B
59	AAPL	Market Cap	               2.27T
60	AAPL	Enterprise Value	       2.31T
.
.
117	AAPL	Free Cash Flow	               66.89B

Is there are a way to modified the yahoo_fin script so that the output looks like below instead?

Output:Ticker     Market Cap     Enterprise Value   ..   Free Cash Flow
MSFT       1.84T          1.78T                   34.79B
AAPL       2.27T          2.31T                   66.89B

**buran** · Feb-12-2021, 09:47 AM

You can use pandas.DataFrame.pivot()

import pandas as pd
df = pd.read_csv('test.csv')
print(df)
print('--------------')
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')
print(df2)
df2.to_csv('test2.csv')

output

Output:  Ticker         Attribute  Recent
0   MSFT        Market Cap   1.84T
1   MSFT  Enterprise Value   1.78T
2   MSFT    Free Cash Flow  34.79B
3   AAPL        Market Cap   2.27T
4   AAPL  Enterprise Value   2.31T
5   AAPL    Free Cash Flow  66.89B
--------------
Attribute Enterprise Value Free Cash Flow Market Cap
Ticker                                              
AAPL                 2.31T         66.89B      2.27T
MSFT                 1.78T         34.79B      1.84T

test2.csv

Output:Ticker,Enterprise Value,Free Cash Flow,Market Cap
AAPL,2.31T,66.89B,2.27T
MSFT,1.78T,34.79B,1.84T

detlefschmitt · (This post was last modified: Feb-12-2021, 11:47 AM by detlefschmitt.)

yes, that works! Thanks. I added the pivot line to my script. The question asked in this post is solved.

import pandas as pd
from yahoo_fin import stock_info as si

stock_list = "AAPL", "MSFT"

stats = {}

for ticker in stock_list:
  data = si.get_stats(ticker)
  stats[ticker] = data

combined = pd.concat(stats)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]

df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')
df2.to_csv(r'stats.csv')

detlefschmitt · (This post was last modified: Feb-14-2021, 05:44 AM by detlefschmitt.)

To make the program more useful I added a few functions. I added importing ticker symbols from a csv file, deleting unwanted columns, renaming columns and changing the columns order. The program in this example imports the csv file column with the header name "Ticker". How to create a csv file containing e.g. the ticker symbols from the NASDAQ can be found in this post.

The program worked fine with about 350 stocks at once. However, running this program several times with that many tickers in a short time can cause a "HTTPError: Service Unavailable" type error. Also make sure that all data is retrieved. It happened to me that data in a couple columns was missing one time. The next morning then the program again was able to get all data. So, if you run into such a problem, it is not caused by an error in the program. It is an error related to retrieving the information from the website. A website access problem.

import pandas as pd
from yahoo_fin import stock_info as si
import glob
 
ti = pd.read_csv('TickerList.csv')
ti2 = ti["Ticker"].tolist()
 
stock_list = ti2
 
stats = {}
 
for ticker in stock_list:
  data = si.get_stats(ticker)
  stats[ticker] = data
 
combined = pd.concat(stats)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]
 
df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')
 
cols = [3,7,12,13,14,17,18,19,20,22,23,24,25,26,27,28,30,31,32,33,34,35,38,39,41,43,45,49,51,56,57,58]
df2.drop(df2.columns[cols],axis=1,inplace=True)
 
df2.rename(columns={ df2.columns[0]: "% Insider" }, inplace = True)
df2.rename(columns={ df2.columns[1]: "% Institution" }, inplace = True)
df2.rename(columns={ df2.columns[2]: "200-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[3]: "50-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[4]: "52-Week H" }, inplace = True)
df2.rename(columns={ df2.columns[5]: "52-Week L" }, inplace = True)
df2.rename(columns={ df2.columns[6]: "Avg Vol 10D" }, inplace = True)
df2.rename(columns={ df2.columns[7]: "Avg Vol 3M" }, inplace = True)
df2.rename(columns={ df2.columns[8]: "Beta" }, inplace = True)
df2.rename(columns={ df2.columns[9]: "Book/Share" }, inplace = True)
df2.rename(columns={ df2.columns[10]: "EBITDA" }, inplace = True)
df2.rename(columns={ df2.columns[11]: "Enterprice V" }, inplace = True)
df2.rename(columns={ df2.columns[12]: "Float" }, inplace = True)
df2.rename(columns={ df2.columns[13]: "Market Cap" }, inplace = True)
df2.rename(columns={ df2.columns[14]: "Price/Book" }, inplace = True)
df2.rename(columns={ df2.columns[15]: "Price/Sales" }, inplace = True)
df2.rename(columns={ df2.columns[16]: "Q Rev (YOY)" }, inplace = True)
df2.rename(columns={ df2.columns[17]: "Return on Equity" }, inplace = True)
df2.rename(columns={ df2.columns[18]: "Revenue/Share" }, inplace = True)
df2.rename(columns={ df2.columns[19]: "Shares" }, inplace = True)
df2.rename(columns={ df2.columns[20]: "Shares Short" }, inplace = True)
df2.rename(columns={ df2.columns[21]: "Short prior" }, inplace = True)
df2.rename(columns={ df2.columns[22]: "% Short" }, inplace = True)
df2.rename(columns={ df2.columns[23]: "Cash" }, inplace = True)
df2.rename(columns={ df2.columns[24]: "Cash/Share" }, inplace = True)
df2.rename(columns={ df2.columns[25]: "Debt" }, inplace = True)
df2.rename(columns={ df2.columns[26]: "Debt/Equity" }, inplace = True)
 
order = [13,11,15,14,8,4,5,3,2,7,6,19,12,0,1,20,22,21,17,18,16,10,23,24,25,26,9]
df3 = df2[[df2.columns[i] for i in order]]
 
df3.to_csv(r'stats.csv')

detlefschmitt · (This post was last modified: Feb-15-2021, 05:55 AM by detlefschmitt.)

As alternative to reading the csv with python one can use an Excel VBA to convert an Excel column containing ticker symbols to a string that can then be pasted into the python code. The Excel VBA code is posted below.

Instructions for the Excel VBA: Put your tickers in column A starting in cell A1. Then execute the macro. The ticker string will be placed in cell B1.

How to export the tickers using python you can find here.

Sub CombineTickersWithQuotationMark()

    Application.ScreenUpdating = False
   
    Dim i As Long
    Dim LastRowA As Long

    LastRowA = Range("A" & Rows.Count).End(xlUp).Row
    
    For i = 1 To LastRowA - 1
        Cells(1, 2) = Cells(1, 2) & """" & Cells(i, 1) & """, "
    Next i
    
    Cells(1, 2) = Cells(1, 2) & """" & Cells(LastRowA, 1) & """"
    
    Application.ScreenUpdating = True

End Sub

**buran** · Feb-13-2021, 08:39 AM

why not learn how to read from a file, e.g. csv/text file with all tickers. instead of constructing a big literal.
And, please, make an effort first. You can read it in number of ways, incl. pandas obviously

detlefschmitt · (This post was last modified: Feb-14-2021, 02:49 AM by detlefschmitt.)

removed due to too personal

**buran** · (This post was last modified: Feb-13-2021, 09:30 AM by buran.)

I've been working with VBA/Office automation for years. Almost never looked back after I started with python. And you already know how to read data from a file into pandas dataframe and loop over it - what's the problem to read a list of tickers from cav or excel? Basically there is no learning here.
On a broader scale - isn't it better to store data in a DB and then connect from Excel, either with Power Query or VBA/ADODB if you prefer to manipulate your data in Excel?

detlefschmitt · (This post was last modified: Feb-15-2021, 05:56 AM by detlefschmitt.)

This is the same code as posted above.

import pandas as pd
from yahoo_fin import stock_info as si
import glob

ti = pd.read_csv('TickerList.csv')
ti2 = ti["Ticker"].tolist()

stock_list = ti2

stats = {}

for ticker in stock_list:
  data = si.get_stats(ticker)
  stats[ticker] = data

combined = pd.concat(stats)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]

df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')

cols = [3,7,12,13,14,17,18,19,20,22,23,24,25,26,27,28,30,31,32,33,34,35,38,39,41,43,45,49,51,56,57,58]
df2.drop(df2.columns[cols],axis=1,inplace=True)

df2.rename(columns={ df2.columns[0]: "% Insider" }, inplace = True)
df2.rename(columns={ df2.columns[1]: "% Institution" }, inplace = True)
df2.rename(columns={ df2.columns[2]: "200-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[3]: "50-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[4]: "52-Week H" }, inplace = True)
df2.rename(columns={ df2.columns[5]: "52-Week L" }, inplace = True)
df2.rename(columns={ df2.columns[6]: "Avg Vol 10D" }, inplace = True)
df2.rename(columns={ df2.columns[7]: "Avg Vol 3M" }, inplace = True)
df2.rename(columns={ df2.columns[8]: "Beta" }, inplace = True)
df2.rename(columns={ df2.columns[9]: "Book/Share" }, inplace = True)
df2.rename(columns={ df2.columns[10]: "EBITDA" }, inplace = True)
df2.rename(columns={ df2.columns[11]: "Enterprice V" }, inplace = True)
df2.rename(columns={ df2.columns[12]: "Float" }, inplace = True)
df2.rename(columns={ df2.columns[13]: "Market Cap" }, inplace = True)
df2.rename(columns={ df2.columns[14]: "Price/Book" }, inplace = True)
df2.rename(columns={ df2.columns[15]: "Price/Sales" }, inplace = True)
df2.rename(columns={ df2.columns[16]: "Q Rev (YOY)" }, inplace = True)
df2.rename(columns={ df2.columns[17]: "Return on Equity" }, inplace = True)
df2.rename(columns={ df2.columns[18]: "Revenue/Share" }, inplace = True)
df2.rename(columns={ df2.columns[19]: "Shares" }, inplace = True)
df2.rename(columns={ df2.columns[20]: "Shares Short" }, inplace = True)
df2.rename(columns={ df2.columns[21]: "Short prior" }, inplace = True)
df2.rename(columns={ df2.columns[22]: "% Short" }, inplace = True)
df2.rename(columns={ df2.columns[23]: "Cash" }, inplace = True)
df2.rename(columns={ df2.columns[24]: "Cash/Share" }, inplace = True)
df2.rename(columns={ df2.columns[25]: "Debt" }, inplace = True)
df2.rename(columns={ df2.columns[26]: "Debt/Equity" }, inplace = True)

order = [13,11,15,14,8,4,5,3,2,7,6,19,12,0,1,20,22,21,17,18,16,10,23,24,25,26,9]
df3 = df2[[df2.columns[i] for i in order]]

df3.to_csv(r'stats.csv')

**buran** · Feb-14-2021, 07:05 AM

Please, don't remove post content, especially after conversation went and you get advise. I deleted one post that you marked as too personal and left the other with content removed, so that it's clear how the conversation went. However I reverted the 2 posts that have code published.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	[SOLVED] Correct way to convert file from cp-1252 to utf-8?	Winfried	8	873	Feb-29-2024, 12:30 AM Last Post: Winfried
	Better python library to create ER Diagram by using pandas data frames as tables	klllmmm	0	1,139	Oct-19-2023, 01:01 PM Last Post: klllmmm
	Trying to get counts/sum/percentages from pandas similar to pivot table	cubangt	6	1,415	Oct-06-2023, 04:32 PM Last Post: cubangt
	Convert word into pdf and copy table to outlook body in a prescribed format	email2kmahe	1	755	Sep-22-2023, 02:33 PM Last Post: carecavoador
	Using pyodbc&pandas to load a Table data to df	tester_V	3	822	Sep-09-2023, 08:55 PM Last Post: tester_V
	Hard disk structure like a file selection dialog	malonn	2	799	Aug-09-2023, 09:14 PM Last Post: malonn
	Convert File to Data URL	michaelnicol	3	1,174	Jul-08-2023, 11:35 AM Last Post: DeaD_EyE
	Python Script to convert Json to CSV file	chvsnarayana	8	2,525	Apr-26-2023, 10:31 PM Last Post: DeaD_EyE
	python pandas sql table with header	mg24	3	1,968	Dec-08-2022, 08:31 PM Last Post: Larz60+
	openpyxl convert data to float	jacklee26	13	6,006	Nov-19-2022, 11:59 AM Last Post: deanhystad

Yahoo_fin, Pandas: how to convert data table structure in csv file

User Panel Messages

Announcements