Merging two DataFrames based on indexes from two other DataFrames - Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html), Thread: /thread-23782.html
Merging two DataFrames based on indexes from two other DataFrames - lucinda_rigeitti - Jan-16-2020

I'm new to pandas and have tried going through the docs and experimenting with various examples, but the problem I'm tackling has really stumped me.

I have the following two DataFrames (DataA/DataB), which I would like to merge on a per global_index/item_id basis.

    DataA                       DataB
    row  item_id  valueA        row  item_id  valueB
    0    x        A1            0    x        B1
    1    y        A2            1    y        B2
    2    z        A3            2    x        B3
    3    x        A4            3    y        B4
    4    z        A5            4    z        B5
    5    x        A6            5    x        B6
    6    y        A7            6    y        B7
    7    z        A8            7    z        B8

The list of items (item_ids) is finite, and each of the two DataFrames represents the value of a trait (trait A, trait B) for an item at a given global_index value. The global_index could roughly be thought of as a unit of "time".

The mapping between each DataFrame (DataA/DataB) and the global_index is done via the following two mapper DataFrames:

    DataA_mapper
    global_index  start_row  num_rows
    0             0          3
    1             3          2
    3             5          3

    DataB_mapper
    global_index  start_row  num_rows
    0             0          2
    2             2          3
    4             5          3

Simply put, for a given global_index the mapper defines a block of rows in its respective DataFrame (DataA or DataB) that are associated with that global_index: num_rows rows beginning at start_row. For example, global_index 1 in DataA_mapper covers rows 3 and 4 of DataA.

I would like to merge the DataFrames so that I get the following DataFrame:

    row  global_index  item_id  valueA  valueB
    0    0             x        A1      B1
    1    0             y        A2      B2
    2    0             z        A3      NaN
    3    1             x        A4      B1
    4    1             z        A5      NaN
    5    2             x        A4      B3
    6    2             y        A2      B4
    7    2             z        A5      B5
    8    3             x        A6      B3
    9    3             y        A7      B4
    10   3             z        A8      B5
    11   4             x        A6      B6
    12   4             y        A7      B7
    13   4             z        A8      B8

In the final DataFrame, for any global_index/item_id pair there will only ever be either:
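The requirement is that if there is only one value for a given global_index/item_id pair (e.g. valueA but no valueB), the last known value of the missing one for that item is used. If no earlier value exists, it stays NaN (as for item z's valueB at global_index 0 and 1 in the table above).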
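One possible way to get there, offered as a minimal sketch rather than a definitive answer: label every data row with its global_index using the mapper, outer-merge the two labelled frames on global_index/item_id, then forward-fill each value column per item. The variable names (data_a, mapper_a, and so on) and the helper attach_global_index are made up for illustration, and the sketch assumes the "row" column in the question is simply the default positional index.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
data_a = pd.DataFrame({
    "item_id": list("xyzxzxyz"),
    "valueA": ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"],
})
data_b = pd.DataFrame({
    "item_id": list("xyxyzxyz"),
    "valueB": ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"],
})
mapper_a = pd.DataFrame(
    {"global_index": [0, 1, 3], "start_row": [0, 3, 5], "num_rows": [3, 2, 3]}
)
mapper_b = pd.DataFrame(
    {"global_index": [0, 2, 4], "start_row": [0, 2, 5], "num_rows": [2, 3, 3]}
)


def attach_global_index(data, mapper):
    """Hypothetical helper: tag each mapped row of `data` with its global_index.

    Assumes start_row/num_rows are positional (0-based) row offsets.
    """
    # Positions of every row covered by the mapper, block by block.
    rows = np.concatenate([
        np.arange(start, start + count)
        for start, count in zip(mapper["start_row"], mapper["num_rows"])
    ])
    # One global_index label per covered row.
    labels = np.repeat(mapper["global_index"].to_numpy(),
                       mapper["num_rows"].to_numpy())
    labelled = data.iloc[rows].copy()
    labelled["global_index"] = labels
    return labelled.reset_index(drop=True)


a = attach_global_index(data_a, mapper_a)
b = attach_global_index(data_b, mapper_b)

# Keep every global_index/item_id pair that appears in either frame.
merged = a.merge(b, on=["global_index", "item_id"], how="outer")
merged = merged.sort_values(["global_index", "item_id"]).reset_index(drop=True)

# Within each item, carry the last known value forward in global_index order.
value_cols = ["valueA", "valueB"]
merged[value_cols] = merged.groupby("item_id")[value_cols].ffill()

merged = merged[["global_index", "item_id", "valueA", "valueB"]]
print(merged)
```

The outer merge is what keeps pairs that exist in only one of the frames, and the grouped forward-fill only looks backwards within the same item_id, so item z's valueB stays NaN until its first B value shows up at global_index 2.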