Is there a more elegant way to concatenate data frames?
Python Forum (https://python-forum.io), General Coding Help, thread /thread-40166.html
Is there a more elegant way to concatenate data frames? - db042190 - Jun-12-2023

Hi, I'm picturing a script that wraps in a loop everything after the setting of "today" and before the "print" in the prototype below. The loop will read symbols from a notepad file line by line, and I would substitute the hardcoded Ticker settings you see below with the symbol just read. If the record read is the first, the download call where AAPL is now a placeholder would run. Otherwise the call where WMT is the placeholder would run. The download method allows passing multiple tickers in one call, but long term I think this will be more flexible. From what I can tell, if you don't pass multiple symbols in a call, you don't get a column for Ticker.

Can I make this more elegant? It seems awkward.

    import yfinance as yf
    import pandas as pd
    from datetime import date

    today = date.today()

    Ticker="AAPL"
    data1 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
    data1["Ticker"]=Ticker

    Ticker="WMT"
    data2 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
    data2["Ticker"]=Ticker

    data1=[data1,data2]
    data1 = pd.concat(data1)
    print(data1)

RE: Is there a more elegant way to concatenate data frames? - deanhystad - Jun-13-2023

Looks fine to me.

    import yfinance as yf
    import pandas as pd
    from datetime import date, timedelta

    tickers = ("AAPL", "WMT")  # Or read from file
    today = date.today()
    start = today - timedelta(days=7)  # One week before today

    data = None
    for ticker in tickers:
        x = yf.download(ticker, start=start, end=today, progress=False).round(2)
        x.insert(0, "Ticker", ticker)  # Put the ticker ahead of the price columns
        if data is None:
            data = x
        else:
            data = pd.concat((data, x))
    data = data.sort_index()
    print(data)

I moved the ticker column. I think it makes more sense to place it ahead of the financial information. I also sorted the resulting table by the date index and changed the starting date to a calculation instead of a string. Just for fun.
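A variation on the loop above is to collect the per-ticker frames in a list and call `pd.concat` once at the end, which removes the `data is None` check. The following is only a minimal sketch under stated assumptions: `read_tickers`, `collect`, and `fake_download` are hypothetical names, and `fake_download` is a stand-in so the example runs without network access. In a real script you would pass a small wrapper around `yf.download` instead.

```python
import io

import pandas as pd


def read_tickers(lines):
    """Return non-empty, upper-cased symbols from an iterable of text lines."""
    return [line.strip().upper() for line in lines if line.strip()]


def fake_download(ticker, start, end):
    """Hypothetical stand-in for yf.download(); returns a tiny frame indexed by Date."""
    index = pd.date_range(start, end, freq="D", name="Date")
    closes = [float(len(ticker) + i) for i in range(len(index))]
    return pd.DataFrame({"Close": closes}, index=index)


def collect(tickers, start, end, download=fake_download):
    """Download each ticker, tag it with its symbol, and concatenate once."""
    frames = []
    for ticker in tickers:
        frame = download(ticker, start, end).round(2)
        frame.insert(0, "Ticker", ticker)  # Ticker ahead of the price columns
        frames.append(frame)
    # A single concat over the whole list; raises ValueError if the list is empty.
    return pd.concat(frames).sort_index()


# io.StringIO plays the role of the text file of symbols, one per line.
symbols = read_tickers(io.StringIO("aapl\nwmt\n\n"))
combined = collect(symbols, "2023-06-06", "2023-06-08")
print(combined)
```

With a real file, `read_tickers(open("tickers.txt"))` would work the same way, where `tickers.txt` is an assumed file name. Concatenating once also avoids re-copying the accumulated frame on every pass, which may matter for long symbol lists.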
RE: Is there a more elegant way to concatenate data frames? - db042190 - Jun-13-2023

much more elegant. thank you.

RE: Is there a more elegant way to concatenate data frames? - snippsat - Jun-13-2023

Some tips about dates in pandas. If you look at the output, Date sits lower than the other column headers, and that needs a fix. Here I have removed the datetime import and used pandas' own date functionality. Either works fine, but once pandas is imported you don't need an additional import of datetime.

    import yfinance as yf
    import pandas as pd

    tickers = ("AAPL", "WMT")  # Or read from file
    today = pd.to_datetime("today")
    start = today - pd.Timedelta(days=7)

    data = None
    for ticker in tickers:
        x = yf.download(ticker, start=start, end=today, progress=False).round(2)
        x.insert(0, "Ticker", ticker)
        if data is None:
            data = x
        else:
            data = pd.concat((data, x))
    data = data.sort_index()
    print(data)

    >>> data
               Ticker    Open    High     Low   Close  Adj Close    Volume
    Date
    2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
    2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
    2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
    2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
    2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
    2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
    2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
    2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
    2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
    2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500
    2023-06-13   AAPL  182.80  184.15  182.47  183.13     183.13  27582874
    2023-06-13    WMT  154.52  155.49  154.07  155.40     155.40   1844848

    >>> data.info()
    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 12 entries, 2023-06-06 to 2023-06-13
    Data columns (total 7 columns):
     #   Column     Non-Null Count  Dtype
    ---  ------     --------------  -----
     0   Ticker     12 non-null     object
     1   Open       12 non-null     float64
     2   High       12 non-null     float64
     3   Low        12 non-null     float64
     4   Close      12 non-null     float64
     5   Adj Close  12 non-null     float64
     6   Volume     12 non-null     int64
    dtypes: float64(5), int64(1), object(1)
    memory usage: 768.0+ bytes

So in info we see no Date column, because Date is the name of the index rather than a column. To fix this:

    >>> data = data.reset_index()
    >>> data.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 12 entries, 0 to 11
    Data columns (total 8 columns):
     #   Column     Non-Null Count  Dtype
    ---  ------     --------------  -----
     0   Date       12 non-null     datetime64[ns]
     1   Ticker     12 non-null     object
     2   Open       12 non-null     float64
     3   High       12 non-null     float64
     4   Low        12 non-null     float64
     5   Close      12 non-null     float64
     6   Adj Close  12 non-null     float64
     7   Volume     12 non-null     int64
    dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
    memory usage: 900.0+ bytes

So now we have a working DataFrame in which Date is a regular column of type datetime64[ns]. Then we can, for example, plot the high and low over the last 90 days against Date, using e.g. seaborn.

    import yfinance as yf
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    tickers = ("AAPL", "WMT")  # Or read from file
    today = pd.to_datetime("today")
    start = today - pd.Timedelta(days=90)

    data = None
    for ticker in tickers:
        x = yf.download(ticker, start=start, end=today, progress=False).round(2)
        x.insert(0, "Ticker", ticker)
        if data is None:
            data = x
        else:
            data = pd.concat((data, x))
    data = data.sort_index()
    #print(data)
    data = data.reset_index()  # Turn the Date index into a column

    # Plot
    plt.figure(figsize=(15, 6))
    sns.set_style("darkgrid")
    sns.lineplot(data=data, x='Date', y='High', label='High')
    sns.lineplot(data=data, x='Date', y='Low', label='Low')
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('High and Low Stock Prices')
    plt.legend()
    plt.show()
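One thing to keep in mind with the combined frame: each date appears once per ticker, so plotting `High` against `Date` without separating the tickers makes seaborn's `lineplot` aggregate the rows that share a date. If one line per ticker is wanted, a possible approach is to reshape the long frame to a wide one first. This is only a sketch on a small hand-made frame (four rows copied from the output printed above), and `high_by_ticker` is a name chosen for illustration.

```python
import pandas as pd

# Hand-made frame in the same long layout as the thread's output
# (values taken from the printed table; only two dates are included).
data = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2023-06-06", "2023-06-06", "2023-06-07", "2023-06-07"]
        ),
        "Ticker": ["AAPL", "WMT", "AAPL", "WMT"],
        "High": [180.12, 150.19, 181.21, 150.36],
        "Low": [177.43, 148.51, 177.32, 149.04],
    }
)

# Reshape: one row per date, one column per ticker.
high_by_ticker = data.pivot(index="Date", columns="Ticker", values="High")
print(high_by_ticker)
```

The wide frame can then be plotted column by column. Alternatively, staying with the long frame and passing `hue="Ticker"` to `sns.lineplot` should also draw a separate line per ticker.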