Jan-05-2022, 07:51 AM
Hello,
The code below (which works fine) reads col_Expr and applies the expression to the columns where required:

from pyspark.sql.functions import expr

actual_df = actual_df.withColumn(col_name, expr(col_Expr))

For example,
col_name = "Client"
col_Expr = "upper(Client)"

Then actual_df will return the Client column values in upper case.
Question:
I am not sure how to get the above Python to work when col_Expr is more complicated than a simple upper or lower.
For example, to format a date field, i.e. to_date("LoadDate", "MMM dd yyyy"): simply putting this to_date expression into col_Expr gives an error when the code above tries to apply the expression.
error:

== SQL ==
to_date("LoadDate"

(note the expression arrives truncated at the first comma)
Basically I can get the code to work as follows, but I want the code to work for any expression without needing the else branch:

from pyspark.sql.functions import date_format, expr

if row["ColumnName"] != "LoadDate":
    actual_df = actual_df.withColumn(col_name, expr(col_Expr))
else:
    actual_df = actual_df.withColumn(col_name, date_format(col_name, col_Expr))

Any suggestions?
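One possible cause, purely a guess since the post does not show where col_Expr is read from: if the column/expression pairs come from a comma-delimited file, the unescaped comma inside to_date(...) would split the field and leave exactly the truncated string shown in the error above. A stdlib-only sketch of that pitfall (the config line here is hypothetical):

```python
import csv
import io

# Hypothetical unquoted config line: the comma inside to_date(...) is
# treated as a field separator, so the expression is cut in half.
config = 'LoadDate,to_date("LoadDate", "MMM dd yyyy")'
row = next(csv.reader(io.StringIO(config)))
print(row[1])  # to_date("LoadDate"  <- truncated, matching the error

# Quoting the expression field (doubling the inner quotes, per CSV rules)
# keeps the whole expression together in one field.
quoted = 'LoadDate,"to_date(""LoadDate"", ""MMM dd yyyy"")"'
row2 = next(csv.reader(io.StringIO(quoted)))
print(row2[1])  # to_date("LoadDate", "MMM dd yyyy")
```

If that turns out to be the cause, quoting the expression field (or switching the config to a delimiter that cannot appear in an expression, such as |) would let the full to_date(...) string reach expr() and remove the need for the special-case else branch.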
Thanks