Hello community,
Can someone let me know how to add multiple tables to my query?
As the code below shows, I have two tables: i) Person_Person and ii) appl_stock. The problem is that the code only works with a single table, not with both. I tried the following, but it didn't work:
```python
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/"Person_Person.csv", "appl_stock.csv"', inferSchema=True, header=True)
```
Here is my full script:

```python
#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('ops').getOrCreate()

df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/Person_Person.csv, appl_stock.csv',
                    inferSchema=True, header=True)
df.createOrReplaceTempView('Person_Person, appl_stock')

results = spark.sql("SELECT \
    appl_stock.Open\
    , appl_stock.Close\
    FROM appl_stock\
    WHERE appl_stock.Close < 500")
carl = spark.sql("SELECT * FROM Person_Person")
results.show()
```

Any help will be greatly appreciated.
Cheers
Carlton