Apr-03-2025, 12:02 PM
Team,
We have three columns in a table (Mobile Number, Home Phone Number, Business Phone Number).
I am writing a Python script to identify the Valid and Invalid contact Records.
1) If Any of the Phone Number is Valid US Number - we have to mark that record as Valid Number.
2) If all the three Phone Numbers are Invalid, then we have to mark that record as Invalid contact.
I have used this code - but the code always shows Incorrect Number even one of the column has a valid number. Can someone help me on this. Thank you!
We have three columns in a table (Mobile Number, Home Phone Number, Business Phone Number).
I am writing a Python script to identify the Valid and Invalid contact Records.
1) If Any of the Phone Number is Valid US Number - we have to mark that record as Valid Number.
2) If all the three Phone Numbers are Invalid, then we have to mark that record as Invalid contact.
I have used this code - but the code always shows Incorrect Number even one of the column has a valid number. Can someone help me on this. Thank you!
import phonenumbers from pyspark.sql.functions import udf from pyspark.sql.types import StringType # Define a UDF to parse and validate US phone numbers def parse_phone_number(mobile_phone_number, home_phone_number, busn_phone_number): try: is_valid_mobile = False is_valid_home = False is_valid_busn = False if mobile_phone_number: parsed_mobile_number = phonenumbers.parse(mobile_phone_number, "US") if phonenumbers.is_valid_number(parsed_mobile_number): is_valid_mobile = True if home_phone_number: parsed_home_number = phonenumbers.parse(home_phone_number, "US") if phonenumbers.is_valid_number(parsed_home_number): is_valid_home = True if busn_phone_number: parsed_busn_number = phonenumbers.parse(busn_phone_number, "US") if phonenumbers.is_valid_number(parsed_busn_number): is_valid_busn = True if is_valid_mobile or is_valid_home or is_valid_busn: return "Correct Number" else: return "Incorrect Number" except Exception as e: return f"Error parsing phone number: {e}" parse_phone_number_udf = udf(parse_phone_number, StringType()) Query = """ select * from datalabs.lab_tech_bi.KYC_dash_analysis where BUSN_PH_NBR = '5852330889' """ df = spark.sql(Query) df = df.withColumn('Correct_number', parse_phone_number_udf(df['MOBILE_PH_NBR'], df['HOME_PH_NBR'], df['BUSN_PH_NBR'])) # Create a temporary view from the DataFrame df.createOrReplaceTempView("final_output") # Use SQL to create a table from the temporary view spark.sql(""" CREATE OR REPLACE TABLE datalabs.lab_tech_bi.final_output_table AS SELECT * FROM final_output """) display(df)