Python Forum

Full Version: Python Alteryx QS-Passing pandas dataframe column inside SQL query where condition
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Hi Team
I have a alteryx module that uses python query component

Input to python is 1 column that has ~ 3M values. I read it inside python using Alteryx.read("#1")

This column has 3 M numeric values.

I need to pull all details for these 3 M values from 1 of larger Athena database table 'A' by using where condition on column_A from the table and passing these values as condition on the column_A

I am trying code as below but it is giving me different syntax or logical errors: Please can anyone help me with correcting it or suggesting alternatives to pass these 3 M values of a column into SQL query where condition;

My code which is failing is as follow:

pin_list = Alteryx.read("#1")    (Note: contains 3 Million numeric values like 1 , 2 ,3 etc. ) 

df1= pd.DataFrame(pin_list)
pins= df1['ColumnName'].tolist()
con= pyodbc.connect(DSN= 'Athena' , DATABASE= 'Random')
sql2=  "Select * from A where date = 'randomdate' and column_A in ({}) "\ .format(',  'join(['?' for _  in pins])) 
df= pd.read_sql(sql2, con, params= [pins])
print(df)
Alteryx.write(df,1)
How can i pass 3M numeric values present in 'pins' into where conditon of query on 'column_A' ?
Note: Also Column_A has to be passed With numeric values as required by table 'A' . passing string or characters wont retrieve correct data from the table