Python Forum
How to add multiple tables to pyspark sql
#1
Hello community,

Can someone let me know how to add multiple tables to my query?

As you can see from the code below, I have two tables: i) Person_Person and ii) appl_stock. The problem is that the code won't work with both tables; it only works with a single table. I have tried the following, but it didn't work:

df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/"Person_Person.csv", "appl_stock.csv"',inferSchema=True,header=True)
#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/Person_Person.csv, appl_stock.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('Person_Person, appl_stock')
results = spark.sql("SELECT \
appl_stock.Open\
, appl_stock.Close\
 FROM appl_stock\
 WHERE appl_stock.Close < 500")
carl = spark.sql("SELECT * FROM Person_Person")
results.show()
Any help will be greatly appreciated.

Cheers

Carlton
#2
In plain SQL, the usual way to combine two tables in one query is an inner join.
You can find a Spark example here: http://bailiwick.io/2015/07/12/joining-d...spark-sql/
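Before any join can work, each CSV file has to be loaded into its own DataFrame and registered under its own view name. A single comma-separated path string is treated as one file name, and one DataFrame can only be registered as one view. Here is a minimal sketch of that, assuming pyspark is importable and that the folder and file names are the ones from the first post. The helper names register_csv_views and main are made up for illustration and are not part of PySpark. No join is shown because the thread does not say which column the two tables share.

```python
import os


def register_csv_views(spark, folder, table_names):
    """Read <folder>/<name>.csv for every name and register it as a temp view.

    Hypothetical helper: one DataFrame and one temp view per CSV file.
    Returns a dict mapping each view name to its DataFrame.
    """
    frames = {}
    for name in table_names:
        path = os.path.join(folder, name + ".csv")
        # Each file gets its own read call, so each table keeps its own schema.
        df = spark.read.csv(path, inferSchema=True, header=True)
        # The view name is what spark.sql() will see in the FROM clause.
        df.createOrReplaceTempView(name)
        frames[name] = df
    return frames


def main():
    # Imported here so the helper above can be used without starting Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName('ops').getOrCreate()

    # Folder and table names are taken from the original post.
    register_csv_views(
        spark,
        '/home/packt/Downloads/Spark_DataFrames',
        ['Person_Person', 'appl_stock'],
    )

    # Both views are now visible to SQL under their own names.
    results = spark.sql("""
        SELECT appl_stock.Open, appl_stock.Close
        FROM appl_stock
        WHERE appl_stock.Close < 500
    """)
    carl = spark.sql("SELECT * FROM Person_Person")

    results.show()
    carl.show()
```

Call main() to run it. Once both views exist, a join is just a query that names both of them with an ON condition over a column they have in common.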
#3
Larz60+,

Thanks for reaching out. I will check out the link you provided.

In the meantime, I'm happy for this question to be closed.

Cheers