Python Forum
Python script merging some columns to one column with new name
What you could do is transform the rows as they are read, by defining a 'merging plan' that depends on the input headers. For example, suppose that the input file has the headers ['header2', 'header5', 'header3', 'spam', 'header10', 'header13']. Then the merging plan would be: combine header2 and header3 into a column newheaderA, rename the column header5 to newheaderC, leave the spam column unchanged, and combine header10 and header13 into a column newheaderD. This plan can be represented by the Python list [('newheaderA', ['header2', 'header3']), ('newheaderC', ['header5']), ('spam', ['spam']), ('newheaderD', ['header10', 'header13'])].

The following code shows how one can automatically compute the merging plan from the input headers and how one can transform the input rows according to this plan. After that you can output the new rows as if they were the actual input rows, which you already know how to do.

I'm using the function more_itertools.unique_everseen(), which yields each element the first time it appears and preserves the original order. If you don't want to import more_itertools, you can simply copy the implementation of unique_everseen given in the recipes section at the end of the official documentation page of the itertools module.
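If you go the copy route, a simplified stand-in could look like the sketch below. It is an assumption-laden reduction of the documented recipe: it drops the optional key argument and requires hashable elements, which is enough for header names.

```python
def unique_everseen(iterable):
    """Yield each element the first time it is seen, preserving order.

    Simplified stand-in: no `key` argument, elements must be hashable.
    """
    seen = set()
    for element in iterable:
        if element not in seen:
            seen.add(element)
            yield element


# Duplicated new header names collapse to their first occurrence.
print(list(unique_everseen(['newheaderA', 'newheaderC', 'newheaderA', 'spam'])))
# prints ['newheaderA', 'newheaderC', 'spam']
```

With this definition in your script, the import line at the top of the code can be removed.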
from more_itertools import unique_everseen

# Each rule maps a new header to the old headers that are merged into it.
rules = [
    ('newheaderA', ['header1', 'header2', 'header3']),
    ('newheaderB', ['header4']),
    ('newheaderC', ['header5']),
    ('newheaderD', ['header6', 'header7', 'header8',
                    'header9', 'header10', 'header11',
                    'header12', 'header13', 'header14']),
]

# Reverse lookup: old header -> new header.
inverse_rules = {old: new for new, olds in rules for old in olds}
# Forward lookup: new header -> list of old headers.
drules = dict(rules)

def merging_plan(headers):
    """Return a list of (new_header, [old_headers present in the input])."""
    headers = list(headers)
    # New headers in order of first appearance; unknown headers keep their name.
    news = list(unique_everseen(inverse_rules.get(h, h) for h in headers))
    s = set(headers)
    plan = []
    for new in news:
        # Keep only the old headers that actually occur in the input.
        plan.append((new, [old for old in drules.get(new, [new]) if old in s]))
    return plan

def merge(plan, row):
    """Build the merged row, joining the old values with a space."""
    return {k: ' '.join(row[x] for x in v) for k, v in plan}

def main():
    # compute the merging plan for a given sequence of input headers
    headers = ['header2', 'header5', 'header3', 'spam', 'header10', 'header13']
    plan = merging_plan(headers)
    print(plan)

    # compute the merged row corresponding to an input row
    r = {'header2': 'v2', 'header5': 'v5',
         'header3': 'v3', 'spam': 'vspam',
         'header10': 'v10', 'header13': 'v13'}
    print(merge(plan, r))

if __name__ == '__main__':
    main()
Output:
[('newheaderA', ['header2', 'header3']), ('newheaderC', ['header5']), ('spam', ['spam']), ('newheaderD', ['header10', 'header13'])]
{'spam': 'vspam', 'newheaderD': 'v10 v13', 'newheaderC': 'v5', 'newheaderA': 'v2 v3'}
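For the last step, feeding the merged rows to your output code, here is a rough end-to-end sketch. It assumes the data is CSV and uses in-memory io.StringIO objects in place of real files. The convert() helper and the shortened RULES list are made up for illustration, and dict.fromkeys stands in for unique_everseen so that only the standard library is needed.

```python
import csv
import io

# Hypothetical, shortened rule set used only for this illustration.
RULES = [
    ('newheaderA', ['header1', 'header2', 'header3']),
    ('newheaderC', ['header5']),
    ('newheaderD', ['header10', 'header13']),
]
INVERSE_RULES = {old: new for new, olds in RULES for old in olds}
DRULES = dict(RULES)


def merging_plan(headers):
    headers = list(headers)
    # dict.fromkeys keeps first-seen order (swapped in for unique_everseen).
    news = list(dict.fromkeys(INVERSE_RULES.get(h, h) for h in headers))
    present = set(headers)
    return [(new, [old for old in DRULES.get(new, [new]) if old in present])
            for new in news]


def merge(plan, row):
    return {new: ' '.join(row[old] for old in olds) for new, olds in plan}


def convert(infile, outfile):
    """Read CSV rows from infile and write the merged rows to outfile."""
    reader = csv.DictReader(infile)
    plan = merging_plan(reader.fieldnames)
    writer = csv.DictWriter(outfile,
                            fieldnames=[new for new, _ in plan],
                            lineterminator='\n')
    writer.writeheader()
    for row in reader:
        writer.writerow(merge(plan, row))


# Assumed sample input, matching the headers used in the post.
source = io.StringIO(
    "header2,header5,header3,spam,header10,header13\n"
    "v2,v5,v3,vspam,v10,v13\n"
)
target = io.StringIO()
convert(source, target)
print(target.getvalue())
```

With real files you would pass objects opened with open(path, newline='') instead of the StringIO buffers. The plan is computed once from the header row, so each data row only costs one call to merge().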
Messages In This Thread
RE: Python script merging some columns to one column with new name - by Gribouillis - Feb-11-2020, 11:31 PM
