Sep-02-2022, 06:34 AM
Hello All,
I am trying to parse a complex JSON string from the AWS Pricing JSON file and convert it into a CSV file based on some conditions. I got the parsing working using for loops and pandas, but the performance is poor and I need a more efficient implementation. I would like to understand how to parse complex JSON without for loops, or whether there is any other more efficient approach. I also tried json_normalize in pandas. I would really appreciate any input.
Below is the code snippet I have used -
import json

import pandas as pd


def create_all_ri_regions_pricing_file():
    # INDEX_JSON_1 is the path to the AWS pricing index file (defined elsewhere)
    with open(INDEX_JSON_1) as pricing:
        read_content = json.load(pricing)

    reserved_df = read_content["terms"]["Reserved"]
    df_final = pd.DataFrame()

    for key in reserved_df:
        for key2 in reserved_df[key]:
            # One small DataFrame per offer term
            df = pd.DataFrame(reserved_df[key][key2]).reset_index(drop=True)
            print("Data frame 0--> ", df.columns)
            df["LeaseContractLength"] = reserved_df[key][key2]["termAttributes"]["LeaseContractLength"]
            df["PurchaseOption"] = reserved_df[key][key2]["termAttributes"]["PurchaseOption"]
            df["OfferingClass"] = reserved_df[key][key2]["termAttributes"]["OfferingClass"]
            df = df.drop(columns=["termAttributes"])
            # Expand the priceDimensions dicts into their own columns
            df = pd.concat(
                [df["priceDimensions"].apply(pd.Series), df.drop("priceDimensions", axis=1)],
                axis=1,
            )
            df = df[
                [
                    "offerTermCode",
                    "sku",
                    "LeaseContractLength",
                    "OfferingClass",
                    "PurchaseOption",
                    "rateCode",
                    "description",
                    "unit",
                    "pricePerUnit",
                ]
            ]
            print("Data frame --> ", df)
            df_final = pd.concat([df_final, df]).reset_index(drop=True)

    # Keep only 1-year, standard, All Upfront terms priced per hour
    df_final = df_final.loc[
        (df_final["LeaseContractLength"] == "1yr")
        & (df_final["OfferingClass"] == "standard")
        & (df_final["PurchaseOption"] == "All Upfront")
        & (df_final["unit"] == "Hrs")
    ]
    # print("final Data frame --> ", df_final)
    df_final.to_csv("./final.csv")

I have attached the JSON file to be parsed and the expected CSV file for reference.
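One direction that might help, as a rough sketch only: much of the cost in the snippet above probably comes from building a small DataFrame for every term and concatenating it onto df_final on each iteration, not from the loops themselves. The sketch below instead filters on termAttributes first and collects plain dict rows, so nothing is concatenated repeatedly. It assumes the layout implied by my loops (terms -> Reserved -> sku -> term -> priceDimensions keyed by rate code). The names WANTED_ATTRS, COLUMNS, reserved_rows and write_rows are made up for illustration.

```python
import csv

# Filter values copied from the original snippet.
WANTED_ATTRS = {
    "LeaseContractLength": "1yr",
    "OfferingClass": "standard",
    "PurchaseOption": "All Upfront",
}

COLUMNS = [
    "offerTermCode", "sku", "LeaseContractLength", "OfferingClass",
    "PurchaseOption", "rateCode", "description", "unit", "pricePerUnit",
]


def reserved_rows(reserved, unit="Hrs"):
    """Yield one flat dict per matching price dimension.

    Assumed layout: reserved[sku][term_key] is a dict holding
    "offerTermCode", "sku", "termAttributes" and "priceDimensions".
    """
    for terms in reserved.values():
        for term in terms.values():
            attrs = term.get("termAttributes", {})
            # Skip non-matching terms before doing any further work.
            if any(attrs.get(k) != v for k, v in WANTED_ATTRS.items()):
                continue
            for dim in term.get("priceDimensions", {}).values():
                if dim.get("unit") != unit:
                    continue
                yield {
                    "offerTermCode": term.get("offerTermCode"),
                    "sku": term.get("sku"),
                    "LeaseContractLength": attrs["LeaseContractLength"],
                    "OfferingClass": attrs["OfferingClass"],
                    "PurchaseOption": attrs["PurchaseOption"],
                    "rateCode": dim.get("rateCode"),
                    "description": dim.get("description"),
                    "unit": dim.get("unit"),
                    "pricePerUnit": dim.get("pricePerUnit"),
                }


def write_rows(rows, path):
    """Write the rows to a CSV file with the column order used above."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

With read_content loaded as before, this would be called as write_rows(reserved_rows(read_content["terms"]["Reserved"]), "./final.csv"). If a DataFrame is still needed, the same rows can be passed once to pd.DataFrame(list(rows), columns=COLUMNS), which avoids the repeated pd.concat. Unlike DataFrame.to_csv with its defaults, this writes no index column.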