Aug-31-2019, 07:28 AM
Hi,
I am looking for help.
I am trying to use a SerDe with Hive through pyspark.sql.
Here is my SQL:
CREATE EXTERNAL TABLE IF NOT EXISTS store_user (
  user_id VARCHAR(36),
  weekstartdate date,
  user_name VARCHAR(36),
  user_age int,
  ...
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
FIELDS TERMINATED BY '|t'
STORED AS TEXTFILE
LOCATION 's3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/customer_attributes_weekly'
TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal'='true')

With that, I receive the error:
Error: pyspark.sql.utils.ParseException: "\nmismatched input 'FIELDS' expecting <EOF>... "
FIELDS TERMINATED BY '|t'
-----------------------^^^
If, instead of
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
I put something like
ROW FORMAT DELIMITED
there is no error.
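To make the comparison easier to reproduce, here is a minimal Python sketch that builds the statement with different ROW FORMAT clauses. `build_ddl`, `VARIANTS` and the variant names are made up for this post, and the column list is cut down to the four columns shown in my SQL. The third variant is only a guess from reading the Hive DDL grammar, where FIELDS TERMINATED BY appears under ROW FORMAT DELIMITED and the SERDE form takes WITH SERDEPROPERTIES. I have not confirmed that it parses in Spark or behaves the same way.

```python
# Minimal reproduction sketch: only builds the SQL strings, no Spark needed.
# Names here (build_ddl, VARIANTS) are hypothetical helpers for this post.

SERDE = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
DELIM = "|t"  # copied as-is from my original statement

# Only the columns visible in my statement; the elided ones are left out.
COLUMNS = (
    "user_id VARCHAR(36), weekstartdate date, "
    "user_name VARCHAR(36), user_age int"
)
LOCATION = (
    "s3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/"
    "customer_attributes_weekly"
)


def build_ddl(row_format: str) -> str:
    """Return the CREATE TABLE statement with the given ROW FORMAT clause."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS store_user ({COLUMNS}) "
        f"{row_format} "
        "STORED AS TEXTFILE "
        f"LOCATION '{LOCATION}' "
        "TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal'='true')"
    )


VARIANTS = {
    # What I have now: raises the ParseException on FIELDS.
    "serde_plus_fields": (
        f"ROW FORMAT SERDE '{SERDE}' FIELDS TERMINATED BY '{DELIM}'"
    ),
    # The DELIMITED form, which does not raise an error for me.
    "delimited": f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{DELIM}'",
    # Untested guess: pass the delimiter as a SerDe property instead.
    "serde_with_properties": (
        f"ROW FORMAT SERDE '{SERDE}' "
        f"WITH SERDEPROPERTIES ('field.delim'='{DELIM}')"
    ),
}

if __name__ == "__main__":
    for name, row_format in VARIANTS.items():
        print(f"-- {name}")
        print(build_ddl(row_format))
```

Each string can then be passed to `spark.sql(...)` on an existing SparkSession with Hive support, for example `spark.sql(build_ddl(VARIANTS["delimited"]))`. Because the statement uses IF NOT EXISTS, a variant that succeeds creates the table, so later variants would need the table dropped first to be tested properly.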
Obviously I am using the wrong syntax somewhere, but I cannot work out exactly what is wrong. I took the line
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
from the Hive documentation.
Any ideas? I would really, really appreciate any help.
Thank you!