Apr-02-2024, 03:07 AM
(This post was last modified: Apr-02-2024, 03:08 AM by sawtooth500.)
Another thought - so in my while loop, each time the loop runs, I get a result row that I assemble into a results dataframe.
In the pandas version, I can directly insert the values for a new row into resultdf on every pass through the loop.
In polars however, you can't insert a row into a dataframe - I need to create a new dataframe each iteration with the new row of data, then I use vstack to combine the new and old result dataframes.
This is in pandas:
resultdf.loc[loc_counter, 'eastern_time'] = convert_nano_timestamp(time_ender)
resultdf.loc[loc_counter, 'price'] = tempdf.loc[tempdf.index[-1], 'price']
resultdf.loc[loc_counter, 'high'] = tempdf['price'].max()
resultdf.loc[loc_counter, 'low'] = tempdf['price'].min()
resultdf.loc[loc_counter, 'volwa-price'] = wa
resultdf.loc[loc_counter, 'size'] = weightsum

This is in polars:
new_row_data = {
    'int_calc': int_calc2,
    'int_dur': int_dur2,
    'eastern_time': convert_nano_timestamp(time_ender),
    'price': tempdf.select(pl.last('price')).to_series()[0],
    'high': tempdf['price'].max(),
    'low': tempdf['price'].min(),
    'volwa-price': wa,
    'volwa%': 0,
    'size': weightsum
}
resultdf = resultdf.vstack(
    pl.DataFrame(
        {name: [value] for name, value in new_row_data.items()},
        schema={name: dtype for name, dtype in schema.items()}
    )
)

Could this creation and combination of a new dataframe in each loop iteration in polars be what is killing my performance? I could just put the result dataframe into a pandas dataframe - that's no big deal.