Jan-16-2020, 08:36 PM

I'm new to pandas have tried going through the docs and experiment with various examples, but this problem I'm tacking has really stumped me.

I have the following two dataframes (DataA/DataB) which I would like to merge on a per global_index/item/values basis.

The global_index could roughly be thought of as a unit of "time"

The mapping between each data frame (DataA/DataB) and the global_index is done via the following two mapper DFs:

I would like to merge the DFs so that I get the following dataframe:

With the requirement being if there is only one value for a given global_index/item (eg: valueA but no valueB) for the last value of the missing one to be used.

I have the following two dataframes (DataA/DataB) which I would like to merge on a per global_index/item/values basis.

DataA DataB row item_id valueA row item_id valueB 0 x A1 0 x B1 1 y A2 1 y B2 2 z A3 2 x B3 3 x A4 3 y B4 4 z A5 4 z B5 5 x A6 5 x B6 6 y A7 6 y B7 7 z A8 7 z B8The list of items(item_ids) is finite and each of the two dataframes represent a the value of a trait (trait A, trait B) for an item at a given global_index value.

The global_index could roughly be thought of as a unit of "time"

The mapping between each data frame (DataA/DataB) and the global_index is done via the following two mapper DFs:

DataA_mapper global_index start_row num_rows 0 0 3 1 3 2 3 5 3 DataB_mapper global_index start_row num_rows 0 0 2 2 2 3 4 5 3Simply put for a given global_index the mapper will define a list of rows into its respective DF (DataA or DataB) that are associated with that global_index.

I would like to merge the DFs so that I get the following dataframe:

row global_index item_id valueA valueB 0 0 x A1 B1 1 0 y A2 B2 2 0 z A3 NaN 3 1 x A4 B1 4 1 z A5 NaN 5 2 x A4 B3 6 2 y A2 B4 7 2 z A5 B5 8 3 x A6 B3 9 3 y A7 B4 10 3 z A8 B5 11 4 x A6 B6 12 4 y A7 B7 13 4 z A8 B8In the final datafram any pair of global_index/item_id there will ever be either:

- a value for both valueA and valueB

- a value only for valueA

- a value only for valueB

With the requirement being if there is only one value for a given global_index/item (eg: valueA but no valueB) for the last value of the missing one to be used.