get year information from a timestamp data frame - Printable Version
Python Forum (https://python-forum.io), Thread: get year information from a timestamp data frame (/thread-31895.html)
get year information from a timestamp data frame - asli - Jan-08-2021

Hi all,

I am new to Python. I am reading a data file that contains timestamp values stored as strings. I want to get the distinct years from this dataframe and keep them in an array. My attempt below doesn't work. Could you help me with how to do it?

    import pyspark
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName('pyspark-by-examples').getOrCreate()
    from pyspark.sql.types import StructType,StructField, StringType, IntegerType,ArrayType
    from pyspark.sql.functions import split, explode
    import pyspark.sql.types
    import calendar
    import datetime
    import pandas as pd
    from pyspark.sql import functions as F
    from pyspark.sql import types as T
    import datetime as dt

    arrayData = spark.read.format("delta").load("/mnt/datalake/....something")
    #arraySchema = StructType([ \
    #    StructField("repair_year",StringType(),True), \
    #])
    arrayData['repair_year']= arrayData.select('repair_date').withColumn("repair_date", F.col("repair_date").cast(T.TimestampType()))
    #df = arraySchema
    #df.printSchema()
    #df.show()
    arraySchema.show()

RE: get year information from a timestamp data frame - Larz60+ - Jan-08-2021

    import datetime
    import time

    # create a timestamp -- you won't have to do this as you already have timestamp
    timestamp = time.time()
    print(f"\n\ntimestamp: {timestamp}")
    year = datetime.date.fromtimestamp(timestamp).year
    print(f"Year: {year}")
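Note that `datetime.date.fromtimestamp` expects a numeric epoch value, while the question describes timestamps stored as strings and asks for the distinct years. A minimal plain-Python sketch of that step follows. The sample values, the `"%Y-%m-%d %H:%M:%S"` format, and the helper name `distinct_years` are assumptions for illustration, since the thread does not show the actual data.

```python
from datetime import datetime

# Hypothetical sample values: the real 'repair_date' strings and their
# format are not shown in the thread.
repair_dates = [
    "2019-03-14 08:30:00",
    "2020-11-02 17:45:10",
    "2019-07-21 00:00:00",
    "2021-01-08 12:00:00",
]


def distinct_years(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the sorted distinct years found in a list of timestamp strings."""
    # Parse each string, keep only the year, and let the set drop duplicates.
    years = {datetime.strptime(ts, fmt).year for ts in timestamps}
    return sorted(years)


years = distinct_years(repair_dates)
print(years)  # [2019, 2020, 2021]
```

On a Spark DataFrame the same idea would presumably be expressed with `F.year(F.to_timestamp("repair_date"))`, followed by `.distinct()` and `.collect()`; that variant is untested here and depends on how the strings in `repair_date` are formatted.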