Python Forum

Full Version: dynamic expression
Hello,
The code below works fine: it reads col_Expr and applies it as an expression to the columns where required.
from pyspark.sql.functions import expr
actual_df = actual_df.withColumn(col_name, expr(col_Expr))
For example,
col_name = "Client"
col_Expr = "upper(Client)"

Then actual_df will contain the Client column values in upper case.
Question:
I am not sure how to get the above code to work if col_Expr is more complicated than just upper or lower.
For example, to format a date field with to_date("LoadDate", "MMM dd yyyy"), simply putting this to_date call into col_Expr gives an error when the code above tries to apply the expression.

error:
== SQL ==
to_date("LoadDate"
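
The SQL fragment in the error stops at the first comma, so the expression may already have been cut short before expr received it, for example if col_Expr is read from a comma-delimited source (an assumption; the post does not say where it is read from). A minimal sketch of a rough sanity check, using a hypothetical helper named looks_truncated that only counts parentheses and quotes and does not handle escaped quotes:

```python
def looks_truncated(sql_text):
    """Return True if parentheses or quotes in sql_text are left unbalanced."""
    depth = 0
    quote = None  # the quote character we are currently inside, if any
    for ch in sql_text:
        if quote:
            if ch == quote:
                quote = None  # closing quote found
        elif ch in ("'", '"', "`"):
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return True  # closing parenthesis with no opener
    return depth != 0 or quote is not None


# The fragment from the error message versus two complete expressions.
for text in ['to_date("LoadDate"', "upper(Client)", "to_date(LoadDate, 'MMM dd yyyy')"]:
    print(repr(text), looks_truncated(text))
```

Running a check like this on each col_Expr right after it is read would show whether the string is damaged on the way in, rather than by expr itself.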

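Basically, I can get the code to work as follows, but I want it to work without the else branch, for any expression applied:
from pyspark.sql.functions import date_format, expr
if row["ColumnName"] != "LoadDate":
    # col_Expr is parsed as a SQL expression, e.g. upper(Client)
    actual_df = actual_df.withColumn(col_name, expr(col_Expr))
else:
    # here col_Expr is passed to date_format as the date pattern
    actual_df = actual_df.withColumn(col_name, date_format(col_name, col_Expr))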
Any suggestions?

Thanks
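I'm not sure I completely understand the question, but it sounds like apply may work. df.apply() lets you define a function, as long or complex as you like, and then apply it to the dataframe. Note that apply() is a pandas DataFrame method, so this assumes the data is in (or has been converted to) a pandas DataFrame rather than a PySpark one.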
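
Staying in PySpark, one way to drop the if/else is to store a complete Spark SQL expression for every column, so that each one goes through the same code path. By default Spark SQL treats both single- and double-quoted text as string literals, so the column name is written unquoted (or in backticks) and only the pattern is quoted. The sketch below is plain Python that only builds the expression strings. The helper name build_select_exprs, the rules dict and the Amount column are hypothetical, made up for illustration:

```python
def build_select_exprs(columns, rules):
    """Return one SQL string per column, suitable for DataFrame.selectExpr.

    columns: column names of the DataFrame, in order.
    rules:   dict mapping a column name to a Spark SQL expression string.
    Columns without a rule are passed through unchanged.
    """
    exprs = []
    for name in columns:
        quoted = f"`{name}`"  # backticks mark an identifier in Spark SQL
        if name in rules:
            exprs.append(f"{rules[name]} AS {quoted}")
        else:
            exprs.append(quoted)
    return exprs


# Hypothetical rules: column names unquoted, string literals in single quotes.
rules = {
    "Client": "upper(Client)",
    "LoadDate": "to_date(LoadDate, 'MMM dd yyyy')",
}
select_exprs = build_select_exprs(["Client", "LoadDate", "Amount"], rules)
for line in select_exprs:
    print(line)

# With a real PySpark DataFrame this would be used as:
#     actual_df = actual_df.selectExpr(*select_exprs)
```

The same rule strings should also work unchanged in the original loop, as actual_df.withColumn(col_name, expr(col_Expr)). One difference to keep in mind is that selectExpr evaluates every expression against the input columns in a single pass, whereas chained withColumn calls see the results of earlier steps.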