DataFrame.concat()

nafshar · Jul-14-2023, 01:24 PM

I am trying to migrate from DataFrame.append(), which has been deprecated, to pd.concat(), and I am having issues with code that otherwise works, shown below. The commented-out section works fine, except that it produces the deprecation warning.

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

symbol_groups = list(chunks(stocks['Ticker'], 100))
symbol_chunks = []
for i in range(0, len(symbol_groups)):
    symbol_chunks.append(','.join(symbol_groups[i]))

final_dataframe = pd.DataFrame(columns = my_columns)

for symbol_chunk in symbol_chunks:
    api_url = f'https://api.iex.cloud/v1/data/core/quote/{symbol_chunk}'
    response = requests.get(api_url, params=params)
    data = pd.DataFrame(response.json())
    data = data.set_index('symbol')
    #display(data) <------------------ At this point everything displays fine.

    for symbol in symbol_chunk.split(','):
        new_row = pd.Series([symbol, 
                             data.loc[symbol]['latestPrice'], 
                             data.loc[symbol]['marketCap'], 
                             'N/A'], 
                            index=my_columns)

        final_dataframe = pd.concat([final_dataframe, new_row], ignore_index=True)

        '''
        The following section works well, although I get a warning for Deprecation on the append()

        final_dataframe = final_dataframe.append(
                                        pd.Series([symbol, 
                                                   data.loc[symbol]['latestPrice'], 
                                                   data.loc[symbol]['marketCap'], 
                                                   'N/A'], 
                                                  index = my_columns), 
                                        ignore_index = True)
        '''
display(final_dataframe)

Output:
      Ticker  Price  Market Capitalization  Number Of Shares to Buy            0
0        NaN    NaN                    NaN                      NaN            A
1        NaN    NaN                    NaN                      NaN       119.35
2        NaN    NaN                    NaN                      NaN  35253103878
3        NaN    NaN                    NaN                      NaN          N/A
...      ...    ...                    ...                      ...          ...
2015     NaN    NaN                    NaN                      NaN          N/A
2016     NaN    NaN                    NaN                      NaN          ZTS
2017     NaN    NaN                    NaN                      NaN        171.0
2018     NaN    NaN                    NaN                      NaN  79021175940

**deanhystad** · (This post was last modified: Jul-14-2023, 06:24 PM by deanhystad.)

I don't understand why you are making all these dataframes. I would collect the data and make one. I don't know the shape of the data returned by the request, so this might be a little off.

rows = []
for symbol_chunk in symbol_chunks:
    api_url = f"https://api.iex.cloud/v1/data/core/quote/{symbol_chunk}"
    for d in requests.get(api_url, params=params).json():  # assuming response is a list of dictionaries.
        rows.append([d["symbol"], d["latestPrice"], d["marketCap"], "N/A"])

final_dataframe = pd.DataFrame(rows, columns=my_columns)
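
Here is a minimal offline sketch of the collect-then-build approach above. The fake_responses list is a hypothetical stand-in for the API payload (the thread never shows its real shape), using the values visible in the posted output; my_columns is taken from the output header.

```python
import pandas as pd

my_columns = ["Ticker", "Price", "Market Capitalization", "Number Of Shares to Buy"]

# Hypothetical stand-in for two API responses, each assumed to be a list of dicts.
fake_responses = [
    [{"symbol": "A", "latestPrice": 119.35, "marketCap": 35253103878}],
    [{"symbol": "ZTS", "latestPrice": 171.0, "marketCap": 79021175940}],
]

# Collect plain Python lists first; no dataframe is created inside the loop.
rows = []
for records in fake_responses:
    for d in records:
        rows.append([d["symbol"], d["latestPrice"], d["marketCap"], "N/A"])

# Build the dataframe once, after all rows are collected.
final_dataframe = pd.DataFrame(rows, columns=my_columns)
print(final_dataframe)
```

Appending to a list is cheap, whereas every pd.concat() call copies the whole accumulated dataframe, so building once at the end avoids repeated copying.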

If you really want to use pd.concat(), you should concatenate each chunk's data dataframe in one call rather than row by row.

final_dataframe = None
for symbol_chunk in symbol_chunks:
    api_url = f"https://api.iex.cloud/v1/data/core/quote/{symbol_chunk}"
    response = requests.get(api_url, params=params).json()
    data = pd.DataFrame(response)[["symbol", "latestPrice", "marketCap"]]  # Extract the columns I want (a list of names needs double brackets)
    final_dataframe = data if final_dataframe is None else pd.concat([final_dataframe, data], ignore_index=True)

final_dataframe["na"] = "N/A"  # Add the column that contains "N/A"
final_dataframe.columns = my_columns  # Rename the columns to my_columns
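
A variation on the same idea, as a sketch only: collect the per-chunk dataframes in a list and call pd.concat() once at the end. The fake_responses data is again a hypothetical stand-in for the API payload, and the frames list is a name introduced here for illustration.

```python
import pandas as pd

my_columns = ["Ticker", "Price", "Market Capitalization", "Number Of Shares to Buy"]

# Hypothetical stand-in for two API responses, each assumed to be a list of dicts.
fake_responses = [
    [{"symbol": "A", "latestPrice": 119.35, "marketCap": 35253103878}],
    [{"symbol": "ZTS", "latestPrice": 171.0, "marketCap": 79021175940}],
]

frames = []
for records in fake_responses:
    # Select the wanted columns with a list of names (double brackets).
    data = pd.DataFrame(records)[["symbol", "latestPrice", "marketCap"]]
    frames.append(data)

# One concat over the whole list instead of one per chunk.
final_dataframe = pd.concat(frames, ignore_index=True)
final_dataframe["na"] = "N/A"          # Add the column that contains "N/A"
final_dataframe.columns = my_columns   # Rename the columns to my_columns
print(final_dataframe)
```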

And if you really want to append data a row at a time, new_row needs to have columns that match final_dataframe. You named the rows, not the columns.

    for symbol in symbol_chunk.split(','):
        new_row = pd.DataFrame([[symbol, 
                             data.loc[symbol]['latestPrice'], 
                             data.loc[symbol]['marketCap'], 
                             'N/A']], 
                            columns=my_columns)
        final_dataframe = pd.concat([final_dataframe, new_row], ignore_index=True)
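
To illustrate the difference, here is a small sketch with made-up starting data (the existing frame and the variable names stacked and fixed are introduced here for illustration). Concatenating a Series along the rows treats it as a single column, which is where the extra column 0 in the posted output comes from; converting it to a one-row dataframe lines the values up with the columns.

```python
import pandas as pd

my_columns = ["Ticker", "Price", "Market Capitalization", "Number Of Shares to Buy"]

# A frame that already holds one row, standing in for final_dataframe.
existing = pd.DataFrame([["A", 119.35, 35253103878, "N/A"]], columns=my_columns)

# The Series labels its values with my_columns, but those labels are its index (rows).
new_row = pd.Series(["ZTS", 171.0, 79021175940, "N/A"], index=my_columns)

# The Series is stacked as a new column named 0, with one row per value.
stacked = pd.concat([existing, new_row], ignore_index=True)
print(stacked.shape)

# Turning the Series into a one-row dataframe makes its labels the columns.
fixed = pd.concat([existing, new_row.to_frame().T], ignore_index=True)
print(fixed)
```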

nafshar · (This post was last modified: Jul-14-2023, 04:41 PM by nafshar.)

(Jul-14-2023, 04:29 PM)deanhystad Wrote: I don't understand why you are making all these dataframes. I would collect the data and make one. I don't know the shape of the data returned by the request, so this might be a little off.

rows = []
for symbol_chunk in symbol_chunks:
    api_url = f"https://api.iex.cloud/v1/data/core/quote/{symbol_chunk}"
    data = requests.get(api_url, params=params).json()

    for symbol in symbol_chunk.split(","):  # Should get symbols from the request response.  Need to see response
        rows.append(
            [symbol, data[symbol]["latestPrice"], data[symbol]["marketCap"], "N/A"]  # Not sure if this is the way to get the values.  Need to see response.
        )

final_dataframe = pd.DataFrame(rows, columns=my_columns)

If you really want to use pd.concat(), you should concatenate each chunk's data dataframe in one call rather than row by row.

final_dataframe = pd.DataFrame(columns=["symbol", "latestPrice", "marketCap"])  # Names of columns in response.  Will rename later
for symbol_chunk in symbol_chunks:
    api_url = f"https://api.iex.cloud/v1/data/core/quote/{symbol_chunk}"
    response = requests.get(api_url, params=params)
    data = pd.DataFrame(response.json())[["symbol", "latestPrice", "marketCap"]]  # Extract the columns I want (a list of names needs double brackets)
    final_dataframe = pd.concat([final_dataframe, data], ignore_index=True)

final_dataframe["na"] = "N/A"  # Add the column that contains "N/A"
final_dataframe.columns = my_columns  # Rename the columns to my_columns

And if you really want to append data a row at a time, new_row needs to have columns that match final_dataframe. You named the rows, not the columns.

    for symbol in symbol_chunk.split(','):
        new_row = pd.DataFrame([[symbol, 
                             data.loc[symbol]['latestPrice'], 
                             data.loc[symbol]['marketCap'], 
                             'N/A']], 
                            columns=my_columns)
        final_dataframe = pd.concat([final_dataframe, new_row], ignore_index=True)

nafshar · Jul-14-2023, 04:41 PM

Dean - Thank you for the brilliant suggestions. Sometimes a fresh set of eyes works best. I was trying to adapt existing code with minimal changes and did not even think of altering the structure. The new forms you suggested work far better and are much clearer.

Thanks again
