Python Forum
Looping a function
#1
Hello,

I am trying to run a function that creates additional columns in a data frame in case the column is missing.

The columns are for months. If the month is missing, the function should create an empty column for the missing month(s).

Below is what I have. However, the function stops once the first missing month is found. I would like it to run all the way to the end.

Can you see what I am missing? Thanks!

def add_column_missing(dataframe):

    if 'February' not in list(dataframe):
        return dataframe.assign(February='')

    elif 'March' not in list(dataframe):
        return dataframe.assign(March='')

    elif 'April' not in list(dataframe):
        return dataframe.assign(April='')

    elif 'May' not in list(dataframe):
        return dataframe.assign(May='')

    elif 'June' not in list(dataframe):
        return dataframe.assign(June='')

    elif 'July' not in list(dataframe):
        return dataframe.assign(July='')

    elif 'August' not in list(dataframe):
        return dataframe.assign(August='')

    elif 'September' not in list(dataframe):
        return dataframe.assign(September='')

    elif 'October' not in list(dataframe):
        return dataframe.assign(October='')

    elif 'November' not in list(dataframe):
        return dataframe.assign(November='')

    elif 'December' not in list(dataframe):
        return dataframe.assign(December='')

    else:
        print('done!')

n = 12

    while n > 0:
        n -= 1
    
    return add_column_missing(grossadds_tableau)
Reply
#2
The problem is that every branch has its own return statement, so the function exits as soon as the first missing month is handled. Try this function:
import datetime as dt
months = [dt.date(2019, i, 1).strftime('%B') for i in range(2, 13)]

def add_column_missing(dataframe):
    L = list(dataframe)
    for m in (x for x in months if x not in L):
        # assign() returns a new dataframe, so keep the result
        dataframe = dataframe.assign(**{m: ''})
    print('done!')
    return dataframe
or perhaps even this
import datetime as dt
months = [dt.date(2019, i, 1).strftime('%B') for i in range(2, 13)]

def add_column_missing(dataframe):
    L = list(dataframe)
    D = {x: '' for x in months if x not in L}
    if D:
        # assign() returns a new dataframe, so keep the result
        dataframe = dataframe.assign(**D)
    print('done!')
    return dataframe
Reply
#3
It's not clear to me what the lines starting at n = 12 (the while loop at the end) are supposed to do.

When a function reaches a return statement, control goes back to the caller. This means the function is done as soon as it hits its first return.
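To make the point about return concrete, here is a minimal sketch that uses plain lists instead of a dataframe. The shortened MONTHS list and both function names are made up for illustration.

```python
# Hypothetical, shortened month list to keep the example small.
MONTHS = ['February', 'March', 'April']

def first_missing(columns):
    """Mirrors the if/elif/return chain: stops at the first missing month."""
    for month in MONTHS:
        if month not in columns:
            return [month]  # the function ends here, later months are never checked
    return []

def all_missing(columns):
    """Collects every missing month and returns once, at the end."""
    missing = []
    for month in MONTHS:
        if month not in columns:
            missing.append(month)
    return missing

columns = ['January']
only_one = first_missing(columns)
everything = all_missing(columns)
```

With these inputs, first_missing reports a single month while all_missing reports all three, which is the same difference as between the original function and a loop with one return at the end.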

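If you modify the dataframe in place, you don't need to return anything. Note that dataframe.assign() does not modify in place: it returns a new dataframe, so its result has to be kept.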

I notice that January is not handled. There are also ways to express this logic in a more DRY (Don't Repeat Yourself) style:

from calendar import month_name
for month in month_name[1:]:  # month_name[0] is an empty string
    if month not in dataframe.columns:
        dataframe[month] = ''  # adds the column in place
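Two details are easy to trip over when looping over calendar.month_name with keyword arguments. The sketch below shows both using only the standard library; keyword_names is a hypothetical helper that stands in for any function accepting keyword arguments, such as assign().

```python
from calendar import month_name

def keyword_names(**kwargs):
    """Hypothetical helper: report the keyword names a call received."""
    return list(kwargs)

# month_name has 13 entries; index 0 is an empty placeholder string.
entries = list(month_name)
real_months = month_name[1:]  # the 12 actual month names

month = real_months[2]
literal = keyword_names(month='')        # keyword is the literal text "month"
unpacked = keyword_names(**{month: ''})  # keyword is the value stored in month
```

So a call written as assign(month='') asks for a column literally named "month" on every pass through the loop, while assign(**{month: ''}) uses the current month's name.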
Reply
#4
Thank you. This runs, but the new columns are not being added to the original dataframe, and I can't add dataframe2 = dataframe.assign(month='').

So how can I append the new columns to the original dataframe and create a new copy?

def add_column_missing(dataframe):
    
    from calendar import month_name
    for month in month_name:
        if month not in list(dataframe):
              dataframe.assign(month='')
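
One possible way to get a new copy with the missing columns, sketched under some assumptions: the sample dataframe below is made up to stand in for grossadds_tableau, January is included in the month list (the original function skipped it), and the names add_missing_months, MONTHS and fill are invented for this example.

```python
import calendar
import pandas as pd

# January..December; index 0 of calendar.month_name is an empty string.
MONTHS = list(calendar.month_name)[1:]

def add_missing_months(dataframe, months=MONTHS, fill=''):
    """Return a new dataframe with a filler column for each missing month."""
    missing = {m: fill for m in months if m not in dataframe.columns}
    # assign() leaves the input untouched and returns a copy with the extra columns.
    return dataframe.assign(**missing)

# Hypothetical sample data standing in for grossadds_tableau.
original = pd.DataFrame({'January': [10, 20], 'March': [30, 40]})
completed = add_missing_months(original)
```

Here original keeps its two columns and completed is a separate dataframe with all twelve. Calling the function as grossadds_tableau = add_missing_months(grossadds_tableau) would instead replace the old name with the completed copy.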
                
Reply

