Is there a more elegant way to concatenate data frames?

db042190 · Jun-12-2023, 09:49 PM

Hi, I'm picturing a script that runs, inside a loop, everything after the setting of "today" and before the "print" in the prototype below. The loop will read symbols from a text file line by line, and the symbol just read will replace the hardcoded Ticker settings you see below. For the first record read, the download call where AAPL is now a placeholder would run. For every later record, the call where WMT is the placeholder would run instead.

The download method allows passing multiple tickers in one call, but long term I think one call per ticker will be more flexible. From what I can tell, if you don't pass multiple symbols in a call, you don't get a column for Ticker.

Can I make this more elegant? It seems awkward.

import yfinance as yf
import pandas as pd
from datetime import date

today = date.today()

Ticker="AAPL"
data1 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
data1["Ticker"]=Ticker

Ticker="WMT"
data2 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
data2["Ticker"]=Ticker

data1=[data1,data2]
data1 = pd.concat(data1)

print(data1)
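
The "read symbols from a text file" step described above could be sketched roughly as follows. This is only an illustration: the helper name `read_tickers`, the file name `tickers.txt`, and the choice to upper-case symbols and skip blank lines are assumptions, not something stated in the post.

```python
import tempfile
from pathlib import Path


def read_tickers(path):
    """Return the non-blank lines of a text file as upper-case symbols.

    Hypothetical helper: upper-casing and blank-line skipping are assumptions.
    """
    tickers = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            symbol = line.strip().upper()
            if symbol:  # ignore empty lines
                tickers.append(symbol)
    return tickers


# Demo with a throwaway file; "tickers.txt" is a made-up name.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "tickers.txt"
    sample.write_text("AAPL\nwmt\n\n", encoding="utf-8")
    tickers = read_tickers(sample)

print(tickers)  # ['AAPL', 'WMT']
```

The resulting list can be looped over in place of the hardcoded Ticker assignments.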

**deanhystad** · (This post was last modified: Jun-13-2023, 04:35 PM by deanhystad.)

Looks fine to me.

import yfinance as yf
import pandas as pd
from datetime import date, timedelta

tickers = ("AAPL", "WMT")  # Or read from file
today = date.today()
start = today - timedelta(days=7)
data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))
data = data.sort_index()
print(data)

Output:
           Ticker    Open    High     Low   Close  Adj Close    Volume
Date
2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500

I moved the ticker column. I think it makes more sense to place it ahead of the financial information. I also sorted the resulting table by the date index and changed the starting date to a calculation instead of a string. Just for fun.
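
A possible variation on the loop above, offered only as a sketch: collect the per-ticker frames in a list and call pd.concat once, which avoids the `data is None` check. So that it runs without network access, it uses `fake_download`, a hypothetical stand-in for yf.download that returns made-up numbers; `load_all` is likewise an invented name.

```python
import pandas as pd


def fake_download(ticker, start, end):
    """Hypothetical stand-in for yf.download; the values are made up."""
    index = pd.date_range(start, end, freq="D", name="Date")
    base = 100.0 if ticker == "AAPL" else 50.0
    return pd.DataFrame({"Close": [base + i for i in range(len(index))]}, index=index)


def load_all(tickers, start, end, fetch=fake_download):
    """Fetch one frame per ticker, tag it, and concatenate once at the end."""
    frames = []
    for ticker in tickers:
        frame = fetch(ticker, start, end).round(2)
        frame.insert(0, "Ticker", ticker)  # ticker column first, as above
        frames.append(frame)
    # mergesort is stable, so rows sharing a date keep the ticker order
    return pd.concat(frames).sort_index(kind="mergesort")


data = load_all(("AAPL", "WMT"), "2023-06-06", "2023-06-08")
print(data)
```

With real data, something like `fetch=lambda t, s, e: yf.download(t, start=s, end=e, progress=False)` could be passed in, assuming yfinance is imported as yf.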

db042190 · Jun-13-2023, 03:28 PM

Much more elegant. Thank you.

***snippsat*** · Jun-13-2023, 05:08 PM

Some tips about dates in Pandas. Also, if you look at the output, Date sits on a lower line than the other column headers because it is the index name rather than a column, and that needs a fix.
Here I have removed the datetime import and used Pandas' own date functionality.
Either approach works fine, but once Pandas is imported there is no need for an additional datetime import.

import yfinance as yf
import pandas as pd

tickers = ("AAPL", "WMT")  # Or read from file
today = pd.to_datetime("today")
start = today - pd.Timedelta(days=7)
data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))
data = data.sort_index()
print(data)

>>> data
           Ticker    Open    High     Low   Close  Adj Close    Volume
Date                                                                  
2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500
2023-06-13   AAPL  182.80  184.15  182.47  183.13     183.13  27582874
2023-06-13    WMT  154.52  155.49  154.07  155.40     155.40   1844848

>>> data.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 12 entries, 2023-06-06 to 2023-06-13
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Ticker     12 non-null     object 
 1   Open       12 non-null     float64
 2   High       12 non-null     float64
 3   Low        12 non-null     float64
 4   Close      12 non-null     float64
 5   Adj Close  12 non-null     float64
 6   Volume     12 non-null     int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 768.0+ bytes

In the info() output there is no Date column, because Date is the index. To fix this, reset the index:

>>> data = data.reset_index()
>>> data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 8 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Date       12 non-null     datetime64[ns]
 1   Ticker     12 non-null     object        
 2   Open       12 non-null     float64       
 3   High       12 non-null     float64       
 4   Low        12 non-null     float64       
 5   Close      12 non-null     float64       
 6   Adj Close  12 non-null     float64       
 7   Volume     12 non-null     int64         
dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
memory usage: 900.0+ bytes
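
The effect of reset_index can be checked on a small synthetic frame. This is a sketch with made-up rows (the Close values are borrowed from the AAPL output above, but the calendar dates are just consecutive days), not downloaded data.

```python
import pandas as pd

# Tiny frame indexed by Date, mimicking the shape of the downloaded data
idx = pd.date_range("2023-06-06", periods=3, freq="D", name="Date")
frame = pd.DataFrame({"Close": [179.21, 177.82, 180.57]}, index=idx)

# The dates live in the index, so Date is not listed among the columns
print(isinstance(frame.index, pd.DatetimeIndex))  # True
print("Date" in frame.columns)                    # False

# reset_index moves the index into an ordinary column named Date
flat = frame.reset_index()
print(flat.columns.tolist())  # ['Date', 'Close']
print(pd.api.types.is_datetime64_any_dtype(flat["Date"]))  # True
```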

Now we have a working DataFrame in which Date is a regular datetime64[ns] column.
We can then, for example, plot the High and Low prices against Date for the last 90 days using seaborn:

import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

tickers = ("AAPL", "WMT")  # Or read from file
today = pd.to_datetime("today")
start = today - pd.Timedelta(days=90)
data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))
data = data.sort_index()
#print(data)
data = data.reset_index()
# Plot
plt.figure(figsize=(15, 6))
sns.set_style("darkgrid")
sns.lineplot(data=data, x='Date', y='High', label='High')
sns.lineplot(data=data, x='Date', y='Low', label='Low')
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('High and Low Stock Prices')
plt.legend()
plt.show()
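
One thing to be aware of with the plot above: the frame holds two rows per date (one per ticker), and seaborn's lineplot aggregates repeated x values by default, so each line shows a mean across AAPL and WMT rather than one line per ticker. Passing hue="Ticker" to sns.lineplot is one way to separate them. Another, sketched below with pandas only, is to pivot so each ticker becomes its own column; the four rows are copied from the output earlier in the thread, and `high_by_ticker` is just an illustrative name.

```python
import pandas as pd

# A few rows in the same long format as the reset_index() result above
data = pd.DataFrame({
    "Date": pd.to_datetime(["2023-06-06", "2023-06-06", "2023-06-07", "2023-06-07"]),
    "Ticker": ["AAPL", "WMT", "AAPL", "WMT"],
    "High": [180.12, 150.19, 181.21, 150.36],
})

# One row per date, one column per ticker
high_by_ticker = data.pivot(index="Date", columns="Ticker", values="High")
print(high_by_ticker)
```

The pivoted frame can then be drawn with high_by_ticker.plot(), which gives one line per ticker.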
