Python Forum
PySpark Equivalent Code
Hello Community,

I have coded the following logic in SQL:

Join very_large_dataframe to small_product_dimension_dataframe on column [B]
Only join records to small_product_dimension_dataframe where O is greater than 10
Keep only Column [P]

SELECT small_product_dimension_dataframe.P
FROM dbo.small_product_dimension_dataframe
INNER JOIN dbo.very_large_dataframe
    ON small_product_dimension_dataframe.B = very_large_dataframe.B
WHERE small_product_dimension_dataframe.O > 10

I would like help with the equivalent code in PySpark.

I have made a start with the following:

df = very_large_dataframe.join(
    small_product_dimension_dataframe,
    very_large_dataframe.B == small_product_dimension_dataframe.B,
)
I would like help amending the PySpark code so that it keeps only column P and applies the filter WHERE small_product_dimension_dataframe.O > 10.
Larz60+ write Jan-14-2022, 10:53 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.

You can use for SQL.


Messages In This Thread
PySpark Equivalent Code - by cpatte7372 - Jan-14-2022, 08:59 PM

