Multiprocessing on python

sawtooth500 · (This post was last modified: Apr-02-2024, 03:08 AM by sawtooth500.)

Another thought - so in my while loop, each time the loop runs, I get a result row that I assemble into a results dataframe.

In the pandas version, I can directly insert the values for a new row into resultdf on every pass through the loop.

In polars however, you can't insert a row into a dataframe - I need to create a new dataframe each iteration with the new row of data, then I use vstack to combine the new and old result dataframes.

This is in pandas:

resultdf.loc[loc_counter, 'eastern_time'] = convert_nano_timestamp(time_ender)
resultdf.loc[loc_counter, 'price'] = tempdf.loc[tempdf.index[-1], 'price']
resultdf.loc[loc_counter, 'high'] = tempdf['price'].max()
resultdf.loc[loc_counter, 'low'] = tempdf['price'].min()
resultdf.loc[loc_counter, 'volwa-price'] = wa
resultdf.loc[loc_counter, 'size'] = weightsum

This is in polars:

new_row_data = {
    'int_calc': int_calc2,
    'int_dur': int_dur2,
    'eastern_time': convert_nano_timestamp(time_ender),
    'price': tempdf.select(pl.last('price')).to_series()[0],
    'high': tempdf['price'].max(),
    'low': tempdf['price'].min(),
    'volwa-price': wa,
    'volwa%': 0,
    'size': weightsum
}
# Wrap each value in a one-element list to build a single-row dataframe, then stack it onto the results
new_row = pl.DataFrame({name: [value] for name, value in new_row_data.items()}, schema=schema)
resultdf = resultdf.vstack(new_row)

Could creating a new dataframe and combining it with the old one on every loop iteration in polars be what is killing my performance? I could just keep the results in a pandas dataframe instead - that's no big deal.
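One way to test that suspicion without switching libraries would be to append each result row to a plain Python list and build the dataframe once, after the loop. Below is a minimal sketch of that idea, not my actual code: summarize_interval, collect_rows, to_polars and the sample intervals are hypothetical stand-ins for my loop body and data. It assumes pl.DataFrame accepts a list of dicts, which it does in current polars releases.

```python
def summarize_interval(label, prices, sizes):
    """Build one result row as a plain dict (hypothetical stand-in for the loop body)."""
    weightsum = sum(sizes)
    # Volume-weighted average price for the interval
    wa = sum(p * s for p, s in zip(prices, sizes)) / weightsum
    return {
        'eastern_time': label,
        'price': prices[-1],   # last trade price in the interval
        'high': max(prices),
        'low': min(prices),
        'volwa-price': wa,
        'size': weightsum,
    }


def collect_rows(intervals):
    """Accumulate rows in a list; no dataframe is created inside the loop."""
    rows = []
    for label, prices, sizes in intervals:
        rows.append(summarize_interval(label, prices, sizes))
    return rows


def to_polars(rows, schema=None):
    """Build the result dataframe once, after the loop has finished."""
    import polars as pl  # imported here so the row-collection code runs without polars
    return pl.DataFrame(rows, schema=schema)


# Made-up sample data: (label, prices, sizes) per interval
intervals = [
    ('09:30', [10.0, 11.0, 10.5], [100, 200, 100]),
    ('09:31', [10.5, 10.0], [50, 150]),
]
rows = collect_rows(intervals)
# resultdf = to_polars(rows)  # single dataframe construction instead of one vstack per iteration
```

Timing the loop with this change against the vstack version should show whether the per-iteration dataframe creation is really the bottleneck, or whether the slowdown is somewhere else in the loop.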
