Python Forum
Pyspark "mismatched input FIELDS"
#1
Hi,
I am looking for help.
I am trying to use a Hive SerDe in a CREATE TABLE statement that I run through pyspark.sql.
Here is my SQL:

CREATE EXTERNAL TABLE IF NOT EXISTS store_user (
user_id VARCHAR(36),
weekstartdate date, 
user_name VARCHAR(36), 
user_age int, ... )
                       ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
                       FIELDS TERMINATED BY '|t' 
                       STORED AS TEXTFILE
                       LOCATION 's3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/customer_attributes_weekly'
                       TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal'='true')
                       
With that, I receive the error:
pyspark.sql.utils.ParseException: "\nmismatched input 'FIELDS' expecting <EOF>... " FIELDS TERMINATED BY '|t' -----------------------^^^
If, instead of

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

I put something like

ROW FORMAT DELIMITED

there is no error.

Obviously I am using the wrong syntax, but I cannot work out what exactly is wrong.
I took the line
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
from the Hive documentation.

Any ideas? I would really, really appreciate any help.
Thank you!
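
A possible explanation, offered as a sketch rather than a confirmed fix: in the Hive and Spark SQL grammar, FIELDS TERMINATED BY is a sub-clause of ROW FORMAT DELIMITED only. After ROW FORMAT SERDE '...' the parser accepts an optional WITH SERDEPROPERTIES (...) clause, which would explain why it stops at FIELDS. For LazySimpleSerDe the field delimiter is set with the 'field.delim' property. The helper name build_store_user_ddl below is hypothetical, the column list contains only the four columns shown in the post (the elided ones are left out), and the delimiter is a parameter because it is only a guess that '|t' was meant to be a tab.

```python
def build_store_user_ddl(field_delim: str) -> str:
    """Build the CREATE TABLE statement with the delimiter set as a SerDe property.

    Only the four columns visible in the original post are listed; the
    columns elided there ("...") are not reconstructed here.
    """
    return f"""
CREATE EXTERNAL TABLE IF NOT EXISTS store_user (
    user_id VARCHAR(36),
    weekstartdate DATE,
    user_name VARCHAR(36),
    user_age INT
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('field.delim' = '{field_delim}')
STORED AS TEXTFILE
LOCATION 's3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/customer_attributes_weekly'
TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal' = 'true')
"""


# Assumption: the intended delimiter is a tab. The raw string keeps the
# backslash so that the SQL text contains '\t' literally.
ddl = build_store_user_ddl(r"\t")
print(ddl)

# With an existing Hive-enabled SparkSession named `spark` (not created here):
# spark.sql(ddl)
```

ROW FORMAT DELIMITED uses LazySimpleSerDe by default, so if no custom SerDe class is needed, keeping ROW FORMAT DELIMITED FIELDS TERMINATED BY ... should give an equivalent table definition.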
#2

Please disregard this post. I posted the same question twice by mistake (I am new to this forum) and couldn't find out how to delete a post.