Python Forum
Pyspark dataframe
#4
(Apr-25-2023, 07:16 AM)siddhi1919 Wrote: We are looking for a solution in pyspark where we can compare/match the one col4 value with entire table col3 value.
Next time you post, give it a try with some code to show some effort, rather than just posting the task.
Something like this.
import pandas as pd

data = {
    'Col1': [1, 2, 3, 4],
    'Col2': ['A', 'B', 'C', 'D'],
    'Col3': [101, 102, 103, 104],
    'Col4': ['arn:aws:savingsplans::104:savingsplan/f001', '', 'arn:aws:savingsplans::101:savingsplan/f002', '']
}
df = pd.DataFrame(data)

# Use a regex to extract the three-digit ids (104 and 101) from Col4; empty rows become NaN
df['Col4_extracted'] = df['Col4'].str.extract(r':(\d{3}):', expand=False)
# Check whether the first extracted id (104) appears anywhere in Col3
match = df['Col3'] == int(df['Col4_extracted'].iloc[0])
print(df['Col3'][match])
Output:
3    104
Name: Col3, dtype: int64
Spark's SparkSession provides a createDataFrame(pandas_dataframe) method that converts a pandas DataFrame to a Spark DataFrame.
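The snippet above only checks the first extracted value. If the goal is to match every Col4 value against the whole Col3 column, one possible approach in pandas is isin(). This is just a sketch on the same sample data; the names extracted, wanted_ids and matches are made up for illustration, and it assumes the id is always exactly three digits between colons.

```python
import pandas as pd

df = pd.DataFrame({
    'Col1': [1, 2, 3, 4],
    'Col2': ['A', 'B', 'C', 'D'],
    'Col3': [101, 102, 103, 104],
    'Col4': ['arn:aws:savingsplans::104:savingsplan/f001', '',
             'arn:aws:savingsplans::101:savingsplan/f002', ''],
})

# Pull the three-digit id out of each Col4 value; rows without a match become NaN
extracted = df['Col4'].str.extract(r':(\d{3}):', expand=False)

# Drop the NaN rows and convert the remaining ids to integers
wanted_ids = extracted.dropna().astype(int)

# Keep every row whose Col3 value appears among the extracted ids
matches = df[df['Col3'].isin(wanted_ids)]
print(matches[['Col1', 'Col2', 'Col3']])
```

With the sample data this keeps the rows where Col3 is 101 and 104, since those are the two ids found in Col4.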


Messages In This Thread
Pyspark dataframe - by siddhi1919 - Apr-24-2023, 07:48 PM
RE: Pyspark dataframe - by deanhystad - Apr-24-2023, 09:17 PM
RE: Pyspark dataframe - by siddhi1919 - Apr-25-2023, 07:16 AM
RE: Pyspark dataframe - by snippsat - Apr-25-2023, 12:39 PM