Nov-28-2020, 05:36 PM
(This post was last modified: Nov-28-2020, 09:01 PM by Larz60+. Edit Reason: added proper code tags)
My apologies for the similar question asked previously; this question is in Python, but I can't find a correct solution. I have the following dataframe df1:
Output:
SomeJson
[{ "Number": "1234", "Color": "blue", "size": "Medium" }, { "Number": "2222", "Color": "red", "size": "Small" } ]
and I am trying to write just the contents of this dataframe as JSON:
df0.coalesce(300).write.mode('append').json(<json_Path>)
It brings in the first key as well, like:
{ "SomeJson": [{ "Number": "1234", "Color": "blue", "size": "Medium" }, { "Number": "2222", "Color": "red", "size": "Small" } ] }
But I would not like to have { "SomeJson": } in the output file. I tried the approach below, but I am getting lost writing the custom Python function to eliminate the first header. Any assistance is highly appreciated.
df0.rdd.map(<custom_function>).saveAsTextFile(<json_Path>)
I also tried:
df0.rdd.map(lambda x: json.dumps(x["SomeJson"])).saveAsTextFile("filepath")
but this gives only the values in square brackets, not the keys. Also, I would like to remove SomeJson from the output. Any help is much appreciated.
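For reference, a minimal sketch of what the custom function might look like, assuming SomeJson is an array-of-struct column (so each element arrives in Python as a pyspark.sql.Row). Row is a tuple subclass, which is likely why json.dumps emits only the values; Row.asDict() restores the field names. The helper names some_json_to_text and write_some_json are hypothetical, and if SomeJson is actually a plain string column this sketch does not apply.

```python
import json


def some_json_to_text(row):
    """Serialize only the SomeJson array of one row, without the outer key.

    Assumption: each element of row["SomeJson"] is a pyspark.sql.Row (struct).
    json.dumps on a Row treats it as a tuple and drops the field names,
    so each element is converted with asDict() first.
    """
    return json.dumps([item.asDict() for item in row["SomeJson"]])


def write_some_json(df, path):
    """Write one JSON array per output line (hypothetical helper).

    df is the dataframe from the question (df0); path is the output directory.
    """
    df.rdd.map(some_json_to_text).saveAsTextFile(path)
```

It would be called as write_some_json(df0, <json_Path>). Note that saveAsTextFile fails if the output directory already exists, so it does not behave like mode('append').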