Python Forum
Trying to get year not the entire year & time
#1
I am trying to convert a string into a date format.

Here is my code:

#!/usr/bin/env python

# make sure to install these packages before running:
# pip install pandas
# pip install sodapy

import pandas as pd
from sodapy import Socrata
import datetime as dt

# Client for the Maryland open data portal. "##KEY##" is a placeholder for
# the application token; no username or password is passed:
client = Socrata("opendata.maryland.gov", "##KEY##")

# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("q4mw-f34p", limit=2000)

# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)

#creating dataframe + adding first field from the originating data frame
cleandata = results_df[['legal_description_line_2_mdp_field_legal2_sdat_field_18']].copy()

#changing column name in data frame
cleandata.rename(columns = {'legal_description_line_2_mdp_field_legal2_sdat_field_18':'address'}, inplace = True)

#copying over columns from originating data frame
cleandata['accountnumber'] = results_df['record_key_account_number_sdat_field_3']
cleandata['housetype'] = results_df['mdp_street_address_type_code_mdp_field_resityp']
cleandata['landuse'] = results_df['land_use_code_mdp_field_lu_desclu_sdat_field_50']
cleandata['exemptclass'] = results_df['exempt_class_mdp_field_exclass_descexcl_sdat_field_49']
cleandata['assessmentyear'] = results_df['assessment_cycle_year_sdat_field_399']
cleandata['currentyeartotalassessment'] = results_df['current_assessment_year_total_phase_in_value_sdat_field_171']
cleandata['owneroccupancycode'] = results_df['record_key_owner_occupancy_code_mdp_field_ooi_sdat_field_6']
cleandata['homesteadcreditqualificationcode'] = results_df['homestead_qualification_code_mdp_field_homqlcod_sdat_field_259']
cleandata['homesteadqualificationdate'] = results_df['homestead_qualification_date_mdp_field_homqldat_sdat_field_260']
cleandata['yearbuilt'] = results_df['c_a_m_a_system_data_year_built_yyyy_mdp_field_yearblt_sdat_field_235']
cleandata['datepurchased'] = results_df['sales_segment_1_transfer_date_yyyy_mm_dd_mdp_field_tradate_sdat_field_89']
cleandata['zoning'] = results_df['zoning_code_mdp_field_zoning_sdat_field_45']

cleandata['assessmentyear'] = [dt.datetime.strptime(x, '%Y')
                for x in cleandata['assessmentyear']]


#Saving to CSV
cleandata.to_csv('cleandata.csv')

#printing data frame to screen (a bare expression prints nothing when run as a script)
print(cleandata)
The output for assessmentyear is 2023-01-01 00:00:00. For this instance I just want it to be 2023.

Any help would be great!
#2
Why do you convert what is clearly a year stored as a string into a datetime object?

cleandata['assessmentyear'] = [dt.datetime.strptime(x, '%Y') for x in cleandata['assessmentyear']]
A datetime object always has all of its components (date and time). Showing just the year is only a matter of how it is represented.
If you want a number, consider:
cleandata['assessmentyear'] = [dt.datetime.strptime(x, '%Y').year for x in cleandata['assessmentyear']]
Note: I didn't test the code above.
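A quick way to check the idea is to run it on a few sample strings. The sketch below assumes the column holds four-digit year strings; the sample list is made up for illustration and is not taken from the actual data set. It also shows that plain int() gives the same numbers without going through datetime.

```python
import datetime as dt

# Made-up sample values standing in for the 'assessmentyear' column
years_as_text = ['2023', '2022', '2024']

# Parsing with '%Y' fills in the missing parts with defaults (Jan 1, midnight)
parsed = [dt.datetime.strptime(x, '%Y') for x in years_as_text]

# .year pulls out just the year as an integer
years = [d.year for d in parsed]

# Converting the strings directly gives the same integers
years_direct = [int(x) for x in years_as_text]

print(parsed[0])
print(years)
print(years_direct)
```

If the strings are already clean years, int() is probably the simpler choice; strptime additionally rejects values that do not parse as a year.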
#3
(Jan-09-2023, 03:33 AM)mbrown009 Wrote: The output for assessmentyear is 2023-01-01 00:00:00. For this instance I just want it to be 2023.
First convert the column to pandas datetime64, then you can convert it to the year only.
Example:
import pandas as pd

data = {
    'date': ['2023-01-02 00:00:01', '2023-01-03 00:00:02', '2023-01-04 00:00:02'],
    'country': ['USA', 'USA', 'USA'],
    'population': [328000000, 328000000, 328000000],
}

df = pd.DataFrame(data)
# Convert to pandas datetime
df['date'] = pd.to_datetime(df['date'])
>>> df
                 date country  population
0 2023-01-02 00:00:01     USA   328000000
1 2023-01-03 00:00:02     USA   328000000
2 2023-01-04 00:00:02     USA   328000000

# Always check this; see that the date column's dtype is now correct
>>> df.dtypes
date          datetime64[ns]
country               object
population             int64
dtype: object

# Convert to year only
>>> df['date'] = df['date'].dt.year
>>> df
   date country  population
0  2023     USA   328000000
1  2023     USA   328000000
2  2023     USA   328000000
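Applied to the original question, the same approach might look like the sketch below. It assumes the assessmentyear column contains year strings such as '2023'; the small DataFrame is a made-up stand-in for the API results, not the real data.

```python
import pandas as pd

# Made-up stand-in for the DataFrame built from the API results
cleandata = pd.DataFrame({'assessmentyear': ['2023', '2022', '2024']})

# Parse the year strings as datetimes (dtype becomes datetime64)
parsed = pd.to_datetime(cleandata['assessmentyear'], format='%Y')

# Keep only the year component as an integer column
cleandata['assessmentyear'] = parsed.dt.year

print(cleandata)
```

The column then holds plain integers, so it is written to the CSV as 2023 rather than 2023-01-01 00:00:00.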