Python script merging some columns to one column with new name

**Gribouillis** · (This post was last modified: Feb-11-2020, 11:31 PM by Gribouillis.)

One option is to transform the rows as they are read, using a 'merging plan' derived from the input headers. For example, suppose that the input file has the headers ['header2', 'header5', 'header3', 'spam', 'header10', 'header13']. The merging plan would then be: combine header2 and header3 into a column newheaderA, rename the column header5 to newheaderC, leave the spam column unchanged, and combine header10 and header13 into a column newheaderD. This plan can be represented by the Python list:

[('newheaderA', ['header2', 'header3']), ('newheaderC', ['header5']), ('spam', ['spam']), ('newheaderD', ['header10', 'header13'])]

The following code shows how one can automatically compute the merging plan from the input headers and how one can transform the input rows according to this plan. After that you can output the new rows as if they were the actual input rows, which you already know how to do.

I'm using the function more_itertools.unique_everseen(), which yields each element only the first time it appears while preserving order. If you don't want to import more_itertools, you can simply copy the implementation of unique_everseen given in the "Itertools Recipes" section near the end of the official documentation of the itertools module.
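For readers who prefer to avoid the extra dependency, here is a minimal sketch of a stand-in for unique_everseen. It is a simplified variant, not the exact recipe from the documentation: it assumes hashable elements and leaves out the optional key argument that the real function accepts.

```python
def unique_everseen(iterable):
    """Yield each element the first time it appears, preserving order.

    Simplified stand-in: no `key` argument, elements must be hashable.
    """
    seen = set()
    for element in iterable:
        if element not in seen:
            seen.add(element)
            yield element


# Duplicated header names collapse to their first occurrence.
print(list(unique_everseen(['newheaderA', 'newheaderC', 'newheaderA', 'spam'])))
```

This is enough for the code in this post, which only passes strings (header names) to the function.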

from more_itertools import unique_everseen

rules = [
    ('newheaderA', ['header1', 'header2', 'header3']),
    ('newheaderB', ['header4']),
    ('newheaderC', ['header5']),
    ('newheaderD', ['header6', 'header7', 'header8',
                    'header9', 'header10', 'header11',
                    'header12', 'header13', 'header14']),
]

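# map each old header to the new header it is merged into
inverse_rules = {old: new for new, olds in rules for old in olds}
# map each new header to the full list of old headers it can absorb
drules = dict(rules)

def merging_plan(headers):
    """Return a list of (new_header, [old headers present in the input])."""
    headers = list(headers)
    # new headers in order of first appearance; unknown headers are kept as is
    news = list(unique_everseen(inverse_rules.get(h, h) for h in headers))
    s = set(headers)
    plan = []
    for new in news:
        # keep only the old headers that actually occur in the input
        plan.append((new, [old for old in drules.get(new, [new]) if old in s]))
    return plan

def merge(plan, row):
    """Build the output row, joining the merged values with a space."""
    return {k: ' '.join(row[x] for x in v) for k, v in plan}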

def main():
    # compute the merging plan for a given sequence of input headers
    headers = ['header2', 'header5', 'header3', 'spam', 'header10', 'header13']
    plan = merging_plan(headers)
    print(plan)
    
    # compute the merged row corresponding to an input row
    r = {'header2': 'v2', 'header5': 'v5',
         'header3': 'v3', 'spam': 'vspam',
         'header10': 'v10', 'header13': 'v13',}
    print(merge(plan, r))

if __name__ == '__main__':
    main()

Output:
[('newheaderA', ['header2', 'header3']), ('newheaderC', ['header5']), ('spam', ['spam']), ('newheaderD', ['header10', 'header13'])]
{'spam': 'vspam', 'newheaderD': 'v10 v13', 'newheaderC': 'v5', 'newheaderA': 'v2 v3'}
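To show how the plan could be wired to actual file input and output, here is a rough sketch using the standard csv module. Several things in it are assumptions made for illustration, not part of the code above: the convert() function name, the shortened rules list, the in-memory sample data, and the use of dict.fromkeys (order-preserving in Python 3.7+) in place of unique_everseen so that the block runs on its own.

```python
import csv
import io

# Shortened copy of the rules above, so that this sketch runs on its own.
rules = [
    ('newheaderA', ['header1', 'header2', 'header3']),
    ('newheaderC', ['header5']),
    ('newheaderD', ['header10', 'header13']),
]
inverse_rules = {old: new for new, olds in rules for old in olds}
drules = dict(rules)

def merging_plan(headers):
    headers = list(headers)
    # dict.fromkeys keeps first-seen order; it stands in for unique_everseen here
    news = list(dict.fromkeys(inverse_rules.get(h, h) for h in headers))
    present = set(headers)
    return [(new, [old for old in drules.get(new, [new]) if old in present])
            for new in news]

def merge(plan, row):
    return {new: ' '.join(row[old] for old in olds) for new, olds in plan}

def convert(infile, outfile):
    """Hypothetical helper: read CSV rows, merge columns, write the result."""
    reader = csv.DictReader(infile)
    # the plan is computed once, from the header line of the input
    plan = merging_plan(reader.fieldnames)
    writer = csv.DictWriter(outfile, fieldnames=[new for new, _ in plan])
    writer.writeheader()
    for row in reader:
        writer.writerow(merge(plan, row))

# Made-up sample data standing in for a real input file
sample = io.StringIO(
    "header2,header5,header3,spam,header10,header13\n"
    "v2,v5,v3,vspam,v10,v13\n"
)
result = io.StringIO(newline='')
convert(sample, result)
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by files opened with newline='', as the csv documentation recommends.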
