Python Forum
Pyspark "mismatched input FIELDS"
#1
Hi,
I am looking for help.
I am trying to use a SerDe with a Hive table through PySpark SQL.
Here is my SQL:

CREATE EXTERNAL TABLE IF NOT EXISTS store_user (
user_id VARCHAR(36),
weekstartdate date, 
user_name VARCHAR(36), 
user_age int, ... )
                       ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
                       FIELDS TERMINATED BY '|t' 
                       STORED AS TEXTFILE
                       LOCATION 's3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/customer_attributes_weekly'
                       TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal'='true')
                       
With that, I receive the error:
pyspark.sql.utils.ParseException: "\nmismatched input 'FIELDS' expecting <EOF>... "
FIELDS TERMINATED BY '|t'
-----------------------^^^
If I replace

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
with something like
ROW FORMAT DELIMITED
there is no error.

Obviously I am using the wrong syntax, but I cannot work out what exactly is wrong.
I took the line
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
from the Hive documentation.

Any ideas? I would really, really appreciate any help.
Thank you!
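
For anyone landing on this thread with the same error: in the Hive DDL grammar, ROW FORMAT SERDE and ROW FORMAT DELIMITED are alternatives, and FIELDS TERMINATED BY belongs only to the DELIMITED form. That would explain why the parser stops at FIELDS. One way to keep LazySimpleSerDe is to pass the delimiter through WITH SERDEPROPERTIES. The sketch below is untested against the poster's cluster. The helper names build_store_user_ddl and create_store_user are hypothetical, the column list is cut down to the four columns shown above, and the '|' delimiter is an assumption (the original '|t' may have been meant as a tab).

```python
# Sketch only: builds the DDL text; nothing here touches a cluster.
# The delimiter value and the shortened column list are assumptions.

def build_store_user_ddl(location, field_delim="|"):
    """Return a CREATE TABLE statement that keeps LazySimpleSerDe.

    ROW FORMAT SERDE and ROW FORMAT DELIMITED are alternatives in the
    grammar, so FIELDS TERMINATED BY cannot follow a SERDE clause.
    With SERDE, the delimiter goes into WITH SERDEPROPERTIES instead.
    """
    return f"""
CREATE EXTERNAL TABLE IF NOT EXISTS store_user (
    user_id VARCHAR(36),
    weekstartdate DATE,
    user_name VARCHAR(36),
    user_age INT
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('field.delim' = '{field_delim}')
STORED AS TEXTFILE
LOCATION '{location}'
TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal' = 'true')
"""


def create_store_user(spark, location, field_delim="|"):
    # `spark` is assumed to be a SparkSession created with Hive support enabled.
    spark.sql(build_store_user_ddl(location, field_delim))
```

If the data really is tab-separated, pass field_delim="\t" instead. If the default SerDe is all that is needed, the ROW FORMAT DELIMITED form that already parsed without error is the simpler choice.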
#2

Please disregard this post. I posted the same question twice by mistake (I am new to this forum) and could not find how to delete it.