Python Forum
Yahoo_fin, Pandas: how to convert data table structure in csv file
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Yahoo_fin, Pandas: how to convert data table structure in csv file
#1
The below Yahoo_fin script exports 59 stock stats for each ticker to csv. The program is working fine.

import pandas as pd
from yahoo_fin import stock_info as si
import glob
  
stock_list = "MSFT", "AAPL"
 
stats = {}
  
for ticker in stock_list:
  data2 = si.get_stats(ticker)
  data2 = data2.iloc[:,:2]
  data2.columns = ["Attribute", "Recent"]
  stats[ticker] = data2
  
combined2 = pd.concat(stats)
combined2 = combined2.reset_index()
del combined2["level_1"]
combined2.columns = ["Ticker", "Attribute", "Recent"]
  
df = pd.DataFrame(combined2)
df2 = df.drop_duplicates(subset=None, keep="first", inplace=False)
df2.to_csv(r'stats.csv')
The csv output file looks as follows:

Output:
Ticker Attribute Recent 0 MSFT Market Cap 1.84T 1 MSFT Enterprise Value 1.78T . . 58 MSFT Free Cash Flow 34.79B 59 AAPL Market Cap 2.27T 60 AAPL Enterprise Value 2.31T . . 117 AAPL Free Cash Flow 66.89B
Is there are a way to modified the yahoo_fin script so that the output looks like below instead?

Output:
Ticker Market Cap Enterprise Value .. Free Cash Flow MSFT 1.84T 1.78T 34.79B AAPL 2.27T 2.31T 66.89B
Reply
#2
You can use pandas.DataFrame.pivot()

import pandas as pd
df = pd.read_csv('test.csv')
print(df)
print('--------------')
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')
print(df2)
df2.to_csv('test2.csv')
output

Output:
Ticker Attribute Recent 0 MSFT Market Cap 1.84T 1 MSFT Enterprise Value 1.78T 2 MSFT Free Cash Flow 34.79B 3 AAPL Market Cap 2.27T 4 AAPL Enterprise Value 2.31T 5 AAPL Free Cash Flow 66.89B -------------- Attribute Enterprise Value Free Cash Flow Market Cap Ticker AAPL 2.31T 66.89B 2.27T MSFT 1.78T 34.79B 1.84T
test2.csv
Output:
Ticker,Enterprise Value,Free Cash Flow,Market Cap AAPL,2.31T,66.89B,2.27T MSFT,1.78T,34.79B,1.84T
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#3
yes, that works! Thanks. I added the pivot line to my script. The question asked in this post is solved.

import pandas as pd
from yahoo_fin import stock_info as si

stock_list = "AAPL", "MSFT"

stats = {}

for ticker in stock_list:
  data = si.get_stats(ticker)
  stats[ticker] = data

combined = pd.concat(stats)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]

df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')
df2.to_csv(r'stats.csv')
Reply
#4
To make the program more useful I added a few functions. I added importing ticker symbols from a csv file, deleting unwanted columns, renaming columns and changing the columns order. The program in this example imports the csv file column with the header name "Ticker". How to create a csv file containing e.g. the ticker symbols from the NASDAQ can be found in this post.

The program worked fine with about 350 stocks at once. However, running this program several times with that many tickers in a short time can cause a "HTTPError: Service Unavailable" type error. Also make sure that all data is retrieved. It happened to me that data in a couple columns was missing one time. The next morning then the program again was able to get all data. So, if you run into such a problem, it is not caused by an error in the program. It is an error related to retrieving the information from the website. A website access problem.

import pandas as pd
from yahoo_fin import stock_info as si
import glob
 
ti = pd.read_csv('TickerList.csv')
ti2 = ti["Ticker"].tolist()
 
stock_list = ti2
 
stats = {}
 
for ticker in stock_list:
  data = si.get_stats(ticker)
  stats[ticker] = data
 
combined = pd.concat(stats)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]
 
df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')
 
cols = [3,7,12,13,14,17,18,19,20,22,23,24,25,26,27,28,30,31,32,33,34,35,38,39,41,43,45,49,51,56,57,58]
df2.drop(df2.columns[cols],axis=1,inplace=True)
 
df2.rename(columns={ df2.columns[0]: "% Insider" }, inplace = True)
df2.rename(columns={ df2.columns[1]: "% Institution" }, inplace = True)
df2.rename(columns={ df2.columns[2]: "200-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[3]: "50-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[4]: "52-Week H" }, inplace = True)
df2.rename(columns={ df2.columns[5]: "52-Week L" }, inplace = True)
df2.rename(columns={ df2.columns[6]: "Avg Vol 10D" }, inplace = True)
df2.rename(columns={ df2.columns[7]: "Avg Vol 3M" }, inplace = True)
df2.rename(columns={ df2.columns[8]: "Beta" }, inplace = True)
df2.rename(columns={ df2.columns[9]: "Book/Share" }, inplace = True)
df2.rename(columns={ df2.columns[10]: "EBITDA" }, inplace = True)
df2.rename(columns={ df2.columns[11]: "Enterprice V" }, inplace = True)
df2.rename(columns={ df2.columns[12]: "Float" }, inplace = True)
df2.rename(columns={ df2.columns[13]: "Market Cap" }, inplace = True)
df2.rename(columns={ df2.columns[14]: "Price/Book" }, inplace = True)
df2.rename(columns={ df2.columns[15]: "Price/Sales" }, inplace = True)
df2.rename(columns={ df2.columns[16]: "Q Rev (YOY)" }, inplace = True)
df2.rename(columns={ df2.columns[17]: "Return on Equity" }, inplace = True)
df2.rename(columns={ df2.columns[18]: "Revenue/Share" }, inplace = True)
df2.rename(columns={ df2.columns[19]: "Shares" }, inplace = True)
df2.rename(columns={ df2.columns[20]: "Shares Short" }, inplace = True)
df2.rename(columns={ df2.columns[21]: "Short prior" }, inplace = True)
df2.rename(columns={ df2.columns[22]: "% Short" }, inplace = True)
df2.rename(columns={ df2.columns[23]: "Cash" }, inplace = True)
df2.rename(columns={ df2.columns[24]: "Cash/Share" }, inplace = True)
df2.rename(columns={ df2.columns[25]: "Debt" }, inplace = True)
df2.rename(columns={ df2.columns[26]: "Debt/Equity" }, inplace = True)
 
order = [13,11,15,14,8,4,5,3,2,7,6,19,12,0,1,20,22,21,17,18,16,10,23,24,25,26,9]
df3 = df2[[df2.columns[i] for i in order]]
 
df3.to_csv(r'stats.csv')
Reply
#5
As alternative to reading the csv with python one can use an Excel VBA to convert an Excel column containing ticker symbols to a string that can then be pasted into the python code. The Excel VBA code is posted below.

Instructions for the Excel VBA: Put your tickers in column A starting in cell A1. Then execute the macro. The ticker string will be placed in cell B1.

How to export the tickers using python you can find here.

Sub CombineTickersWithQuotationMark()

    Application.ScreenUpdating = False
   
    Dim i As Long
    Dim LastRowA As Long

    LastRowA = Range("A" & Rows.Count).End(xlUp).Row
    
    For i = 1 To LastRowA - 1
        Cells(1, 2) = Cells(1, 2) & """" & Cells(i, 1) & """, "
    Next i
    
    Cells(1, 2) = Cells(1, 2) & """" & Cells(LastRowA, 1) & """"
    
    Application.ScreenUpdating = True

End Sub
Reply
#6
why not learn how to read from a file, e.g. csv/text file with all tickers. instead of constructing a big literal.
And, please, make an effort first. You can read it in number of ways, incl. pandas obviously
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#7
removed due to too personal
Reply
#8
I've been working with VBA/Office automation for years. Almost never looked back after I started with python. And you already know how to read data from a file into pandas dataframe and loop over it - what's the problem to read a list of tickers from cav or excel? Basically there is no learning here.
On a broader scale - isn't it better to store data in a DB and then connect from Excel, either with Power Query or VBA/ADODB if you prefer to manipulate your data in Excel?
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#9
This is the same code as posted above.

import pandas as pd
from yahoo_fin import stock_info as si
import glob

ti = pd.read_csv('TickerList.csv')
ti2 = ti["Ticker"].tolist()

stock_list = ti2

stats = {}

for ticker in stock_list:
  data = si.get_stats(ticker)
  stats[ticker] = data

combined = pd.concat(stats)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]

df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')

cols = [3,7,12,13,14,17,18,19,20,22,23,24,25,26,27,28,30,31,32,33,34,35,38,39,41,43,45,49,51,56,57,58]
df2.drop(df2.columns[cols],axis=1,inplace=True)

df2.rename(columns={ df2.columns[0]: "% Insider" }, inplace = True)
df2.rename(columns={ df2.columns[1]: "% Institution" }, inplace = True)
df2.rename(columns={ df2.columns[2]: "200-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[3]: "50-Day MA" }, inplace = True)
df2.rename(columns={ df2.columns[4]: "52-Week H" }, inplace = True)
df2.rename(columns={ df2.columns[5]: "52-Week L" }, inplace = True)
df2.rename(columns={ df2.columns[6]: "Avg Vol 10D" }, inplace = True)
df2.rename(columns={ df2.columns[7]: "Avg Vol 3M" }, inplace = True)
df2.rename(columns={ df2.columns[8]: "Beta" }, inplace = True)
df2.rename(columns={ df2.columns[9]: "Book/Share" }, inplace = True)
df2.rename(columns={ df2.columns[10]: "EBITDA" }, inplace = True)
df2.rename(columns={ df2.columns[11]: "Enterprice V" }, inplace = True)
df2.rename(columns={ df2.columns[12]: "Float" }, inplace = True)
df2.rename(columns={ df2.columns[13]: "Market Cap" }, inplace = True)
df2.rename(columns={ df2.columns[14]: "Price/Book" }, inplace = True)
df2.rename(columns={ df2.columns[15]: "Price/Sales" }, inplace = True)
df2.rename(columns={ df2.columns[16]: "Q Rev (YOY)" }, inplace = True)
df2.rename(columns={ df2.columns[17]: "Return on Equity" }, inplace = True)
df2.rename(columns={ df2.columns[18]: "Revenue/Share" }, inplace = True)
df2.rename(columns={ df2.columns[19]: "Shares" }, inplace = True)
df2.rename(columns={ df2.columns[20]: "Shares Short" }, inplace = True)
df2.rename(columns={ df2.columns[21]: "Short prior" }, inplace = True)
df2.rename(columns={ df2.columns[22]: "% Short" }, inplace = True)
df2.rename(columns={ df2.columns[23]: "Cash" }, inplace = True)
df2.rename(columns={ df2.columns[24]: "Cash/Share" }, inplace = True)
df2.rename(columns={ df2.columns[25]: "Debt" }, inplace = True)
df2.rename(columns={ df2.columns[26]: "Debt/Equity" }, inplace = True)

order = [13,11,15,14,8,4,5,3,2,7,6,19,12,0,1,20,22,21,17,18,16,10,23,24,25,26,9]
df3 = df2[[df2.columns[i] for i in order]]

df3.to_csv(r'stats.csv')
Reply
#10
Please, don't remove post content, especially after conversation went and you get advise. I deleted one post that you marked as too personal and left the other with content removed, so that it's clear how the conversation went. However I reverted the 2 posts that have code published.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Convert MultiLayer XML to DataFrame using Pandas vsingh17 0 41 Yesterday, 03:50 PM
Last Post: vsingh17
  what data structure to use? Winfried 4 297 Mar-16-2021, 12:11 PM
Last Post: buran
  Exporting Stock Fundamental Data to a CSV file with yahoo_fin DustinKlent 1 964 Mar-02-2021, 02:07 AM
Last Post: detlefschmitt
  Pandas: how to split one row of data to multiple rows and columns in Python GerardMoussendo 4 401 Feb-22-2021, 06:51 PM
Last Post: eddywinch82
  Yahoo_fin question: how does one export all stock tickers to a csv file detlefschmitt 9 640 Feb-12-2021, 07:46 AM
Last Post: detlefschmitt
  write mariadb table rows query to each file? shams 1 206 Feb-02-2021, 04:10 PM
Last Post: buran
  Convert table in pandas tgottsc1 2 266 Jan-26-2021, 01:58 PM
Last Post: tgottsc1
  Data extraction from a table based on column and row names tgottsc1 1 440 Jan-09-2021, 10:04 PM
Last Post: buran
  Json File more pages #pandas #dataframe nio74maz 0 245 Dec-30-2020, 05:32 AM
Last Post: nio74maz
  Pandas Extract data from two dataframe nio74maz 1 253 Dec-26-2020, 09:52 PM
Last Post: nio74maz

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020