Python Forum
Facing error while executing below Python code
#1
Hi Team,

I am facing an error while executing the code below. Please help me with this.

# Defining the schema (column names)
from pyspark.sql.functions import date_format

orders_df = spark.read.csv('/content/sample_data/orders.txt') \
    .toDF('order_id', 'order_date', 'order_customer_id', 'order_status')
orders_df.withColumn('order_month', date_format(orders_df.order_date, 'YYYYMM')).show()
Error Details:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-150-588c7016c28b> in <module>()
      1 from pyspark.sql.functions import date_format
----> 2 orders_df.withColumn('order_month',date_format(orders_df.order_date,'YYYYMM')).show()

3 frames
/usr/local/lib/python3.6/dist-packages/pyspark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o865.showString.
: org.apache.spark.SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'YYYYMM' pattern in the DateTimeFormatter.
1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0.
2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
	at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkLegacyFormatter$1.applyOrElse(DateTimeFormatterHelper.scala:196)
	...
Caused by: java.lang.IllegalArgumentException: All week-based patterns are unsupported since Spark 3.0, detected: Y, Please use the SQL function EXTRACT instead
	at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$.$anonfun$convertIncompatiblePattern$4(DateTimeFormatterHelper.scala:323)
	... 73 more
#2
(Jan-26-2021, 10:45 AM)ramu4651 Wrote: Fail to recognize 'YYYYMM' pattern in the DateTimeFormatter
I don't know Spark, but the datetime pattern guide linked in your traceback shows that lowercase "y" designates the (calendar) year, while uppercase "Y" is the week-based year, which Spark 3.0 no longer supports. That is exactly what the "Caused by" line is complaining about ("All week-based patterns are unsupported since Spark 3.0, detected: Y"). So perhaps you should write 'yyyyMM' instead of 'YYYYMM'.
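For example, here is a minimal sketch of the corrected call, reusing the orders_df from your post (untested on my side, since I don't have your data):

from pyspark.sql.functions import date_format

# 'yyyy' = calendar year, 'MM' = two-digit month.
# Uppercase 'Y' means week-based year and is rejected since Spark 3.0.
orders_df.withColumn('order_month', date_format(orders_df.order_date, 'yyyyMM')).show()

If you really need the old behaviour instead, the error message itself offers a second option:

# Restore the pre-3.0 parser (option 1 from the SparkUpgradeException)
spark.conf.set('spark.sql.legacy.timeParserPolicy', 'LEGACY')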