Python Forum

Full Version: How does one combine 2 data sets ?
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
In the below script I am creating 2 datasets (stats1, stats2). One in each for loop. Now I want to combine the two data sets. That is, adding the columns from stats2 after the last column in stats1. I don't know how to do that. If somebody could help that would be great. Note: I need to delete "level_1" in each data set.

import pandas as pd
from yahoo_fin import stock_info as si

ti = pd.read_csv('/Users/detlefschmitt/Desktop/TickerList.csv')
stock_list = ti["Ticker"].tolist()

stats1 = {}
stats2 = {}

for ticker in stock_list:
  data = si.get_stats(ticker)
  stats1[ticker] = data

for ticker in stock_list:
  data = si.get_stats_valuation(ticker)
  stats2[ticker] = data
  
combined = pd.concat(stats1)
combined = combined.reset_index()
del combined["level_1"]
combined.columns = ["Ticker", "Attribute", "Recent"]

df = pd.DataFrame(combined)
df2 = df.pivot(index='Ticker', columns='Attribute', values='Recent')

df2.to_csv(r'/Users/detlefschmitt/Desktop/stats.csv')
Look at pandas concat function. Use axis=1 to concatenate columns. See docs
Thanks for the hint. I was able to solve the problem using pandas join function.

import pandas as pd
from yahoo_fin import stock_info as si

ti = pd.read_csv('/Users/detlefschmitt/Desktop/TickerList.csv')
stock_list = ti["Ticker"].tolist()

stats1 = {}
stats2 = {}

for ticker in stock_list:
  data = si.get_stats(ticker)
  stats1[ticker] = data

for ticker in stock_list:
  data = si.get_stats_valuation(ticker)
  stats2[ticker] = data

combined1 = pd.concat(stats1)
combined1 = combined1.reset_index()
del combined1["level_1"]
combined1.columns = ["Ticker", "Attribute", "Recent"]

combined2 = pd.concat(stats2)
combined2 = combined2.reset_index()
del combined2["level_1"]
combined2.columns = ["Ticker", "Attribute", "Recent"]

df1 = pd.DataFrame(combined1)
df2 = pd.DataFrame(combined2)

df11 = df1.pivot(index='Ticker', columns='Attribute', values='Recent')
df22 = df2.pivot(index='Ticker', columns='Attribute', values='Recent')

df3 = df11.join(df22)

df3.to_csv(r'/Users/detlefschmitt/Desktop/stats.csv')
The screenshot in the attachment illustrates what I was trying to achieve. I had df11 and df22 and tried to join them to get df3.