Python Forum

How to insert different types of data into a function - DrData82 - Feb-10-2022

I'm trying to write a function that takes a different "oldTable", string, and column name on each call. The "withColumn" calculation below works fine, but the "withColumnRenamed" and "where" lines do not.

What I want, taking newTable1 as an example, is "oldVar2" renamed to "string1_newVar2" and any rows with null values in the "oldVar_dropNull" column dropped.

import pyspark.sql.functions as F

def functionName(x,y,z):
    return x.withColumn("newVar1", F.when(F.col("oldVar1") > 0, x.oldVar1*100/x.oldVar1)\
                                                    .otherwise(0)) \
               .withColumnRenamed("oldVar2", (y,"_newVar2")) \
               .where(F.col(z).isNotNull())
        
newTable1 = functionName(oldTable1,"string1","oldVar_dropNull")
newTable2 = functionName(oldTable2,"string2","oldVar_dropNull")
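
One likely cause of the rename failing is that (y,"_newVar2") builds a two-element tuple rather than a single string, while withColumnRenamed expects the new name as a string. The plain-Python sketch below only illustrates that difference; the helper name new_column_name is made up for this example and is not part of the posted code.

```python
def new_column_name(prefix, suffix="_newVar2"):
    """Build the new column name as one string, e.g. 'string1_newVar2'."""
    return f"{prefix}{suffix}"

y = "string1"
as_tuple = (y, "_newVar2")      # what the posted code passes: a 2-tuple
as_string = new_column_name(y)  # a single str, which is what a column name needs to be

print(type(as_tuple).__name__, as_tuple)    # tuple ('string1', '_newVar2')
print(type(as_string).__name__, as_string)  # str string1_newVar2
```

In the posted function this would presumably mean writing .withColumnRenamed("oldVar2", y + "_newVar2"). The "where" line looks fine on its own; it may only appear broken because the chained call fails at the rename before reaching it.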
Some sample data:

import pandas as pd

df = {'oldVar1':['18.50', '649.27', '523.52'],
      'oldVar2':['24.56', '4564.56', '34.45'],
      'oldVar_dropNull':['12.54', '656.89', '0']
     }
 
oldTable1 = pd.DataFrame(df)
print(oldTable1)
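
Note that the sample above stores every value as a string in a pandas DataFrame and contains no nulls ('0' is not null), so nothing in it would be dropped. To pin down the result described for newTable1 without needing Spark, here is a plain-Python model that treats the table as a list of dicts. The helper name rename_and_drop_nulls and the None in the second row are assumptions added for illustration only.

```python
def rename_and_drop_nulls(rows, prefix, null_check_column):
    """Rename 'oldVar2' to '<prefix>_newVar2' and drop rows where null_check_column is None."""
    new_name = f"{prefix}_newVar2"
    result = []
    for row in rows:
        if row[null_check_column] is None:
            continue  # drop rows with a null in the checked column
        # copy the row, swapping only the 'oldVar2' key for the new name
        renamed = {(new_name if key == "oldVar2" else key): value
                   for key, value in row.items()}
        result.append(renamed)
    return result

# Hypothetical rows; the None is added so the filter has something to drop.
rows = [
    {"oldVar1": 18.50, "oldVar2": 24.56, "oldVar_dropNull": 12.54},
    {"oldVar1": 649.27, "oldVar2": 4564.56, "oldVar_dropNull": None},
]

new_rows = rename_and_drop_nulls(rows, "string1", "oldVar_dropNull")
print(new_rows)
```

To run the posted function on the sample itself, the pandas DataFrame would first need to become a Spark DataFrame, for example with spark.createDataFrame(oldTable1), assuming an active SparkSession named spark.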