Thread: Pyspark "mismatched input FIELDS" (Python Forum, General Coding Help, https://python-forum.io, /thread-20809.html)
Pyspark "mismatched input FIELDS" - Mabooka - Aug-31-2019

Hi, I am looking for help. I am trying to use SerDes with Hive in pyspark.sql. Here is my SQL:

    CREATE EXTERNAL TABLE IF NOT EXISTS store_user (
        user_id VARCHAR(36),
        weekstartdate date,
        user_name VARCHAR(36),
        user_age int,
        ...
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    FIELDS TERMINATED BY '|t'
    STORED AS TEXTFILE
    LOCATION 's3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/customer_attributes_weekly'
    TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal'='true')

With that, I receive the "mismatched input FIELDS" error.

If, instead of

    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

I put something like

    ROW FORMAT DELIMITED

there is no error. Obviously I am using some wrong syntax, but I cannot find out what exactly is wrong: I took the line

    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

from the Hive documentation.

Any ideas? I would really, really appreciate any help. Thank you!

RE: Pyspark "mismatched input FIELDS" - Mabooka - Aug-31-2019

Please disregard. Posted twice by mistake (I am new to this forum), and I couldn't find how to remove a post.
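On the question in the first post: in both the Hive and Spark SQL grammars, FIELDS TERMINATED BY is a sub-clause of ROW FORMAT DELIMITED, while ROW FORMAT SERDE may only be followed by an optional WITH SERDEPROPERTIES (...). That would explain why the parser stops at FIELDS. A minimal sketch of one possible rewrite follows, passing the delimiter through the LazySimpleSerDe property 'field.delim'. The helper build_ddl is hypothetical, the delimiter '|' is an assumption (the '|t' in the post could mean '|' or a tab), and the columns elided with "..." in the post are left out rather than guessed.

```python
# Sketch only: builds the DDL as a string so it can be inspected without a cluster.
SERDE = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"


def build_ddl(table, columns, location, field_delim):
    """Return a CREATE EXTERNAL TABLE statement that keeps ROW FORMAT SERDE.

    The delimiter is passed as the SerDe property 'field.delim' instead of a
    FIELDS TERMINATED BY clause, which belongs to ROW FORMAT DELIMITED.
    """
    cols = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"    {cols}\n"
        f")\n"
        f"ROW FORMAT SERDE '{SERDE}'\n"
        f"WITH SERDEPROPERTIES ('field.delim' = '{field_delim}')\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'\n"
        f"TBLPROPERTIES ('hive.lazysimple.extended_boolean_literal' = 'true')"
    )


ddl = build_ddl(
    table="store_user",
    # Only the columns visible in the post; the elided ones are not filled in.
    columns=[
        ("user_id", "VARCHAR(36)"),
        ("weekstartdate", "DATE"),
        ("user_name", "VARCHAR(36)"),
        ("user_age", "INT"),
    ],
    location="s3://stx-apollo-pr-datascience-shared/unloads/testdata/v1/mphd/customer_attributes_weekly",
    field_delim="|",  # assumption: the post's '|t' is ambiguous
)
print(ddl)
# With a Hive-enabled SparkSession bound to `spark` (assumed), run: spark.sql(ddl)
```

If no custom SerDe behaviour is needed, keeping ROW FORMAT DELIMITED with FIELDS TERMINATED BY (the variant reported to work in the post) should be equivalent for plain delimited text, since that form uses LazySimpleSerDe by default.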