Python Forum
pyspark dataframe to json without header
#1
My apologies if this is similar to a question asked previously; this one is about Python, and I can't find a correct solution. I have the following dataframe, df0:

Output:
SomeJson [{ "Number": "1234", "Color": "blue", "size": "Medium" }, { "Number": "2222", "Color": "red", "size": "Small" } ]
I am trying to write just the contents of this dataframe as JSON:

df0.coalesce(300).write.mode('append').json(<json_Path>)
The output includes the top-level key as well, like this:
{
    "SomeJson": [
        {
            "Number": "1234",
            "Color": "blue",
            "size": "Medium"
        },
        {
            "Number": "2222",
            "Color": "red",
            "size": "Small"
        }
    ]
}
However, I do not want the { "SomeJson": } wrapper in the output file. I tried the approach below, but I am stuck on writing the custom Python function that removes this top-level key. Any assistance is highly appreciated.

df0.rdd.map(<custom_function>).saveAsTextFile(<json_Path>)

I also tried df0.rdd.map(lambda x: json.dumps(x["SomeJson"])).saveAsTextFile("filepath"), but this writes only the values inside the square brackets, not the keys. I would also like to remove SomeJson from the output. Any help is much appreciated.
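One likely reason the keys disappear in the attempt above: a PySpark Row is a tuple subclass, so json.dumps serializes each nested Row as a plain JSON array of values. A possible custom function, sketched here under the assumption that SomeJson is an array-of-structs column (so each element arrives as a Row), converts the nested Rows to dicts before dumping. The names to_plain and row_to_json are made up for this illustration and are not part of PySpark.

```python
import json


def to_plain(value):
    """Recursively turn Row-like objects into plain dicts and lists."""
    # Assumption: nested structs arrive as pyspark.sql.Row, which has asDict().
    # This check must come before the tuple check, because Row is a tuple.
    if hasattr(value, "asDict"):
        return {key: to_plain(item) for key, item in value.asDict().items()}
    if isinstance(value, dict):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    return value


def row_to_json(row, column="SomeJson"):
    """Serialize only the contents of one column, without the column name."""
    return json.dumps(to_plain(row[column]))
```

It would then slot into the call from the question as df0.rdd.map(row_to_json).saveAsTextFile(<json_Path>), producing one JSON array per input row with the keys kept and no SomeJson wrapper. If the column actually holds a JSON string rather than structs, json.dumps would quote it again, and writing the string as-is would be the simpler route.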