May-29-2019, 11:20 AM
All,
I have an XML file which contains a tag called Relation. This tag will have Custom SQL Query something like below.
<relation connection='oraRLe.1vt9plg0hto4k31akb7rk07dlyaz' name='Custom SQL Query' type='text'>Select i.RL_PROGRAM_ID ,i.RL_INVESTMENT_ID ,p.INVESTMENT_ID||': '||p.INVESTMENT_NAME as "Program" ,p.INVESTMENT_ID as "Program ID" ,p.INVESTMENT_NAME as "Program Name" ,CASE WHEN p.IS_ACTIVE = 1 THEN 'Yes' ELSE 'No' END as "Program Is Active?" ,p.MANAGER_FULL_NAME as "Program Manager" ,p.TECH_OWNER_FULL_NAME as "Program Tech Owner" ,sRBS.RBS_NAME1 as "Program Sponsor Lvl1" ,sRBS.RBS_NAME2 as "Program Sponsor Lvl2" ,sRBS.RBS_NAME3 as "Program Sponsor Lvl3" ,p.MC_SPONSOR_RBS_FULL_PATH as "Program Sponsor Path" ,p.INVESTMENT_RBS_FULL_PATH as "Program Investment RBS" ,p.RD_DEMAND_REGION_DESC as "Program Demand Region" ,p.RD_FUND_SGMNT_DESC as "Program Funding Segment" ,p.ACCOUNTING_REGION_DESC as "Program Accounting Region" From RL_PROJ_PROG_XREF i LEFT JOIN RL_PROJECT_DIM p ON i.RL_PROGRAM_ID = p.RL_INVESTMENT_ID LEFT JOIN RL_RBS sRBS ON p.MC_SPONSOR_RBS_UNIT_ID = sRBS.RBS_UNIT_ID Where (p.RELEASE_INDICATOR is null or p.RELEASE_INDICATOR = 'RD_notinarelease') UNION Select i.RL_PROGRAM_ID ,i.RL_PROGRAM_ID as RL_INVESTMENT_ID ,p.INVESTMENT_ID||': '||p.INVESTMENT_NAME as Program ,p.INVESTMENT_ID as "Program ID" ,p.INVESTMENT_NAME as "Program Name" ,CASE WHEN p.IS_ACTIVE = 1 THEN 'Yes' ELSE 'No' END as "Program Is Active?" ,p.MANAGER_FULL_NAME as "Program Manager" ,p.TECH_OWNER_FULL_NAME as "Program Tech Owner" ,sRBS.RBS_NAME1 as "Program Sponsor Lvl1" ,sRBS.RBS_NAME2 as "Program Sponsor Lvl2" ,sRBS.RBS_NAME3 as "Program Sponsor Lvl3" ,p.MC_SPONSOR_RBS_FULL_PATH as "Program Sponsor Path" ,p.INVESTMENT_RBS_FULL_PATH as "Program Investment RBS" ,p.RD_DEMAND_REGION_DESC as "Program Demand Region" ,p.RD_FUND_SGMNT_DESC as "Program Funding Segment" ,p.ACCOUNTING_REGION_DESC as "Program Accounting Region" From RL_PROJ_PROG_XREF i LEFT JOIN RL_PROJECT_DIM p ON i.RL_PROGRAM_ID = p.RL_INVESTMENT_ID LEFT JOIN RL_RBS sRBS ON p.MC_SPONSOR_RBS_UNIT_ID = sRBS.RBS_UNIT_ID Where (p.RELEASE_INDICATOR is null or p.RELEASE_INDICATOR = 'RD_notinarelease')</relation>
My requirement is to get TABLE and its relevant COLUMN NAMES used in he query. I have 120 XML file like this.
Expected Output:
FileName TableName ColumnName
F1 T1 C1
F1 T1 C2
F1 T2 C1
F2 T1 C1
F2 T2 C1
There are 2 cases:
#1 - If alias found in the column name then we have to find the same alias name after the table name and match it with that particular table.
#2 - If alias not found, then we have to map each and very column to every tables used after the FROM clause ( It is like a cartesian product)
I will take the actual table name & columns name from ALL_TAB_COLUMNS( Oracle) and match with the output point#2 to eliminate the unwanted combinations.
Inputs are welcome!
This is my 2nd post and 2nd task in Python...
Let me tell you what are all the options i explored so far.. I dont know is this the right approach to get the desired result.
Note the below result is somehow ok, but i tried to replace JUNK character, with the help of REPLACE(), But no luck.
I have an XML file which contains a tag called Relation. This tag will have Custom SQL Query something like below.
<relation connection='oraRLe.1vt9plg0hto4k31akb7rk07dlyaz' name='Custom SQL Query' type='text'>Select i.RL_PROGRAM_ID ,i.RL_INVESTMENT_ID ,p.INVESTMENT_ID||': '||p.INVESTMENT_NAME as "Program" ,p.INVESTMENT_ID as "Program ID" ,p.INVESTMENT_NAME as "Program Name" ,CASE WHEN p.IS_ACTIVE = 1 THEN 'Yes' ELSE 'No' END as "Program Is Active?" ,p.MANAGER_FULL_NAME as "Program Manager" ,p.TECH_OWNER_FULL_NAME as "Program Tech Owner" ,sRBS.RBS_NAME1 as "Program Sponsor Lvl1" ,sRBS.RBS_NAME2 as "Program Sponsor Lvl2" ,sRBS.RBS_NAME3 as "Program Sponsor Lvl3" ,p.MC_SPONSOR_RBS_FULL_PATH as "Program Sponsor Path" ,p.INVESTMENT_RBS_FULL_PATH as "Program Investment RBS" ,p.RD_DEMAND_REGION_DESC as "Program Demand Region" ,p.RD_FUND_SGMNT_DESC as "Program Funding Segment" ,p.ACCOUNTING_REGION_DESC as "Program Accounting Region" From RL_PROJ_PROG_XREF i LEFT JOIN RL_PROJECT_DIM p ON i.RL_PROGRAM_ID = p.RL_INVESTMENT_ID LEFT JOIN RL_RBS sRBS ON p.MC_SPONSOR_RBS_UNIT_ID = sRBS.RBS_UNIT_ID Where (p.RELEASE_INDICATOR is null or p.RELEASE_INDICATOR = 'RD_notinarelease') UNION Select i.RL_PROGRAM_ID ,i.RL_PROGRAM_ID as RL_INVESTMENT_ID ,p.INVESTMENT_ID||': '||p.INVESTMENT_NAME as Program ,p.INVESTMENT_ID as "Program ID" ,p.INVESTMENT_NAME as "Program Name" ,CASE WHEN p.IS_ACTIVE = 1 THEN 'Yes' ELSE 'No' END as "Program Is Active?" ,p.MANAGER_FULL_NAME as "Program Manager" ,p.TECH_OWNER_FULL_NAME as "Program Tech Owner" ,sRBS.RBS_NAME1 as "Program Sponsor Lvl1" ,sRBS.RBS_NAME2 as "Program Sponsor Lvl2" ,sRBS.RBS_NAME3 as "Program Sponsor Lvl3" ,p.MC_SPONSOR_RBS_FULL_PATH as "Program Sponsor Path" ,p.INVESTMENT_RBS_FULL_PATH as "Program Investment RBS" ,p.RD_DEMAND_REGION_DESC as "Program Demand Region" ,p.RD_FUND_SGMNT_DESC as "Program Funding Segment" ,p.ACCOUNTING_REGION_DESC as "Program Accounting Region" From RL_PROJ_PROG_XREF i LEFT JOIN RL_PROJECT_DIM p ON i.RL_PROGRAM_ID = p.RL_INVESTMENT_ID LEFT JOIN RL_RBS sRBS ON p.MC_SPONSOR_RBS_UNIT_ID = sRBS.RBS_UNIT_ID Where (p.RELEASE_INDICATOR is null or p.RELEASE_INDICATOR = 'RD_notinarelease')</relation>
My requirement is to get TABLE and its relevant COLUMN NAMES used in he query. I have 120 XML file like this.
Expected Output:
FileName TableName ColumnName
F1 T1 C1
F1 T1 C2
F1 T2 C1
F2 T1 C1
F2 T2 C1
There are 2 cases:
#1 - If alias found in the column name then we have to find the same alias name after the table name and match it with that particular table.
#2 - If alias not found, then we have to map each and very column to every tables used after the FROM clause ( It is like a cartesian product)
I will take the actual table name & columns name from ALL_TAB_COLUMNS( Oracle) and match with the output point#2 to eliminate the unwanted combinations.
Inputs are welcome!
This is my 2nd post and 2nd task in Python...
Let me tell you what are all the options i explored so far.. I dont know is this the right approach to get the desired result.
Note the below result is somehow ok, but i tried to replace JUNK character, with the help of REPLACE(), But no luck.
from pandas import DataFrame a = DataFrame([{'var1': ' c.Type ,c.RL_INVESTMENT_ID -- ,c.RL_RESOURCE_ID ', 'var2': 1}, {'var1': 'd,e,f', 'var2': 2}]) a= a.replace(" ","") a= a.replace(""",'"') def tidy_split(df, column, sep='|', keep=False): """ Split the values of a column and expand so the new DataFrame has one split value per row. Filters rows where the column is missing. Params ------ df : pandas.DataFrame dataframe with the column to split and expand column : str the column to split and expand sep : str the string used to split the column's values keep : bool whether to retain the presplit value as it's own row Returns ------- pandas.DataFrame Returns a dataframe with the same columns as `df`. """ indexes = list() new_values = list() df = df.dropna(subset=[column]) print ("df as ", df) for i, presplit in enumerate(df[column].astype(str)): values = presplit.split(sep) print ("values as ", values) print ("i as" , i) if keep and len(values) > 1: indexes.append(i) new_values.append(presplit) for value in values: indexes.append(i) new_values.append(value) new_df = df.iloc[indexes, :].copy() new_df[column] = new_values print( new_df[column]) def main(): print (a) tidy_split(a, 'var1', sep=',') if __name__ == '__main__': main()