Python Forum
Is there a more elegant way to concatenate data frames?
#1
Hi, I'm picturing a script that wraps everything between the setting of "today" and the "print" in the prototype below in a loop. The loop will read symbols line by line from a text file (edited in Notepad), and I would replace the hardcoded Ticker settings you see below with the symbol just read. For the first record, the download call where AAPL is now a placeholder would run; for every later record, the call where WMT is the placeholder would run.

The download method allows passing multiple tickers in one call, but long term I think one call per symbol will be more flexible. From what I can tell, if you pass only a single symbol in a call, you don't get a column for Ticker.

Can I make this more elegant? It seems awkward.

import yfinance as yf
import pandas as pd
from datetime import date

today = date.today()

Ticker="AAPL"
data1 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
data1["Ticker"]=Ticker

Ticker="WMT"
data2 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
data2["Ticker"]=Ticker

data1=[data1,data2]
data1 = pd.concat(data1)

print(data1)
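
The file-reading step described in the question might be sketched like this. The file name `symbols.txt`, its one-symbol-per-line layout, and the helper `read_symbols` are assumptions made for illustration; they are not part of the original script.

```python
from pathlib import Path
import tempfile


def read_symbols(path):
    """Return the non-empty lines of a text file as upper-case ticker symbols."""
    lines = Path(path).read_text().splitlines()
    return [line.strip().upper() for line in lines if line.strip()]


# Demo with a throwaway file standing in for the real symbol list.
with tempfile.TemporaryDirectory() as tmp:
    symbols_file = Path(tmp) / "symbols.txt"
    symbols_file.write_text("AAPL\nwmt\n\n  MSFT  \n")
    symbols = read_symbols(symbols_file)

print(symbols)
```

Blank lines and stray whitespace are dropped, so the resulting list can be fed straight into a `for ticker in symbols:` loop.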
#2
Looks fine to me.
import yfinance as yf
import pandas as pd
from datetime import date, timedelta

tickers = ("AAPL", "WMT")  # Or read from file
today = date.today()
start = today - timedelta(days=7)
data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))
data = data.sort_index()
print(data)
Output:
           Ticker    Open    High     Low   Close  Adj Close    Volume
Date
2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500
I moved the ticker column; I think it makes more sense to place it ahead of the financial information. I also sorted the resulting table by the date index and changed the starting date to a calculation instead of a string. Just for fun.
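
A possible variation, as a sketch rather than a correction: collect the per-ticker frames in a list and call `pd.concat` once after the loop, which removes the `data is None` check. So that the example runs without network access, `fake_download` is a hypothetical stand-in for `yf.download` that returns made-up numbers.

```python
import pandas as pd


def fake_download(ticker, start, end):
    """Hypothetical stand-in for yf.download: one row per business day."""
    index = pd.bdate_range(start=start, end=end, name="Date")
    base = float(len(ticker))  # arbitrary value, just so the tickers differ
    return pd.DataFrame({"Close": [base + i for i in range(len(index))]}, index=index)


tickers = ("AAPL", "WMT")
frames = []
for ticker in tickers:
    frame = fake_download(ticker, "2023-06-06", "2023-06-12")
    frame.insert(0, "Ticker", ticker)  # ticker column ahead of the price data
    frames.append(frame)

# One concat for all frames, then sort by the date index.
data = pd.concat(frames).sort_index(kind="stable")
print(data)
```

Concatenating once also avoids copying the growing frame on every iteration, which starts to matter when the symbol list is long.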
#3
much more elegant. thank you.
#4
Some tips about dates in pandas. Also note that Date sits on a lower line than the other column headers in the output, because it is the index rather than a column; that needs a fix.
Here I have removed the datetime import and used pandas' own date functionality.
Either approach works fine, but once pandas is imported there is no need for an additional datetime import.
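
The pandas date arithmetic can be tried in isolation first. This is a minimal sketch; the fixed end date is an assumption chosen only to make the result reproducible.

```python
import pandas as pd

# A fixed date is used here so the result is reproducible;
# pd.to_datetime("today") would give the current timestamp instead.
end = pd.Timestamp("2023-06-13")
start = end - pd.Timedelta(days=7)
print(start)  # 2023-06-06 00:00:00

# "today" includes the time of day; normalize() resets it to midnight.
today = pd.to_datetime("today").normalize()
print(today.hour, today.minute)
```

Whether the time of day matters depends on how the download function treats its `start` and `end` arguments, so normalizing is optional here.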
import yfinance as yf
import pandas as pd

tickers = ("AAPL", "WMT")  # Or read from file
today = pd.to_datetime("today")
start = today - pd.Timedelta(days=7)
data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))
data = data.sort_index()
print(data)
>>> data
           Ticker    Open    High     Low   Close  Adj Close    Volume
Date                                                                  
2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500
2023-06-13   AAPL  182.80  184.15  182.47  183.13     183.13  27582874
2023-06-13    WMT  154.52  155.49  154.07  155.40     155.40   1844848

>>> data.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 12 entries, 2023-06-06 to 2023-06-13
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Ticker     12 non-null     object 
 1   Open       12 non-null     float64
 2   High       12 non-null     float64
 3   Low        12 non-null     float64
 4   Close      12 non-null     float64
 5   Adj Close  12 non-null     float64
 6   Volume     12 non-null     int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 768.0+ bytes
So info() lists no Date column, because Date is the index. To fix this:
>>> data = data.reset_index()
>>> data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 8 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Date       12 non-null     datetime64[ns]
 1   Ticker     12 non-null     object        
 2   Open       12 non-null     float64       
 3   High       12 non-null     float64       
 4   Low        12 non-null     float64       
 5   Close      12 non-null     float64       
 6   Adj Close  12 non-null     float64       
 7   Volume     12 non-null     int64         
dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
memory usage: 900.0+ bytes
So now we have a working DataFrame, with Date as a datetime64[ns] column.
We can then, for example, plot High and Low against Date for the last 90 days using seaborn:
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

tickers = ("AAPL", "WMT")  # Or read from file
today = pd.to_datetime("today")
start = today - pd.Timedelta(days=90)
data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))
data = data.sort_index()
#print(data)
data = data.reset_index()
# Plot
plt.figure(figsize=(15, 6))
sns.set_style("darkgrid")
sns.lineplot(data=data, x='Date', y='High', label='High')
sns.lineplot(data=data, x='Date', y='Low', label='Low')
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('High and Low Stock Prices')
plt.legend()
plt.show()
[Image: jluiKZ.png]
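
One caveat, offered as a hedged note: with two tickers per date, seaborn's `lineplot` by default aggregates the repeated x values, so each line above is an average across AAPL and WMT rather than one line per stock. Passing `hue="Ticker"` to `lineplot` is one way to separate them. Another, sketched below with pandas only, is to pivot the long table so each ticker gets its own column. The sample frame is hand-made from a few rows of the output shown earlier.

```python
import pandas as pd

# Small hand-made sample in the same long layout as `data` after reset_index().
data = pd.DataFrame({
    "Date": pd.to_datetime(["2023-06-06", "2023-06-06", "2023-06-07", "2023-06-07"]),
    "Ticker": ["AAPL", "WMT", "AAPL", "WMT"],
    "High": [180.12, 150.19, 181.21, 150.36],
})

# One row per date, one column per ticker.
high_by_ticker = data.pivot(index="Date", columns="Ticker", values="High")
print(high_by_ticker)
```

Calling `high_by_ticker.plot()` on the result would then draw one line per ticker column.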