Trying to get year not the entire year & time
Trying to get year not the entire year & time - mbrown009 - Jan-09-2023

I am trying to convert a string into a date format. Here is my code:

    #!/usr/bin/env python

    # make sure to install these packages before running:
    # pip install pandas
    # pip install sodapy

    import pandas as pd
    from sodapy import Socrata
    import datetime as dt

    # Unauthenticated client only works with public data sets. Note 'None'
    # in place of application token, and no username or password:
    client = Socrata("opendata.maryland.gov", "##KEY##")

    # First 2000 results, returned as JSON from API / converted to Python list of
    # dictionaries by sodapy.
    results = client.get("q4mw-f34p", limit=2000)

    # Convert to pandas DataFrame
    results_df = pd.DataFrame.from_records(results)

    #creating dataframe + adding first field from the originating data frame
    cleandata = results_df[['legal_description_line_2_mdp_field_legal2_sdat_field_18']].copy()

    #changing column name in data frame
    cleandata.rename(columns = {'legal_description_line_2_mdp_field_legal2_sdat_field_18':'address'}, inplace = True)

    #copying over columns from originating data frame
    cleandata['accountnumber'] = results_df['record_key_account_number_sdat_field_3']
    cleandata['housetype'] = results_df['mdp_street_address_type_code_mdp_field_resityp']
    cleandata['landuse'] = results_df['land_use_code_mdp_field_lu_desclu_sdat_field_50']
    cleandata['exemptclass'] = results_df['exempt_class_mdp_field_exclass_descexcl_sdat_field_49']
    cleandata['assessmentyear'] = results_df['assessment_cycle_year_sdat_field_399']
    cleandata['currentyeartotalassessment'] = results_df['current_assessment_year_total_phase_in_value_sdat_field_171']
    cleandata['owneroccupancycode'] = results_df['record_key_owner_occupancy_code_mdp_field_ooi_sdat_field_6']
    cleandata['homesteadcreditqualificationcode'] = results_df['homestead_qualification_code_mdp_field_homqlcod_sdat_field_259']
    cleandata['homesteadqualificationdate'] = results_df['homestead_qualification_date_mdp_field_homqldat_sdat_field_260']
    cleandata['yearbuilt'] = results_df['c_a_m_a_system_data_year_built_yyyy_mdp_field_yearblt_sdat_field_235']
    cleandata['datepurchased'] = results_df['sales_segment_1_transfer_date_yyyy_mm_dd_mdp_field_tradate_sdat_field_89']
    cleandata['zoning'] = results_df['zoning_code_mdp_field_zoning_sdat_field_45']

    cleandata['assessmentyear'] = [dt.datetime.strptime(x, '%Y') for x in cleandata['assessmentyear']]

    #Saving to CSV
    cleandata.to_csv('cleandata.csv')

    #printing data frame to screen
    cleandata

The output for assessmentyear is 2023-01-01 00:00:00. For this instance I just want it to be 2023. Any help would be great!

RE: Trying to get year not the entire year & time - buran - Jan-09-2023

Why do you convert what is clearly a year stored as a string into a datetime object?

    cleandata['assessmentyear'] = [dt.datetime.strptime(x, '%Y') for x in cleandata['assessmentyear']]

A datetime object always has all of its components; showing just the year is only a matter of representation. If you want a number, consider:

    cleandata['assessmentyear'] = [dt.datetime.strptime(x, '%Y').year for x in cleandata['assessmentyear']]

Note: I didn't test the code above.

RE: Trying to get year not the entire year & time - snippsat - Jan-09-2023

(Jan-09-2023, 03:33 AM)mbrown009 Wrote: The output for assessmentyear is 2023-01-01 00:00:00. For this instance I just want it to be 2023.

First convert the date to pandas datetime64, then you can convert it to year only. Example:
    import pandas as pd

    data = {
        'date': ['2023-01-02 00:00:01', '2023-01-03 00:00:02', '2023-01-04 00:00:02'],
        'country': ['USA', 'USA', 'USA'],
        'population': [328000000, 328000000, 328000000],
    }
    df = pd.DataFrame(data)
    # Convert to pandas datetime
    df['date'] = pd.to_datetime(df['date'])

    >>> df
                     date country  population
    0 2023-01-02 00:00:01     USA   328000000
    1 2023-01-03 00:00:02     USA   328000000
    2 2023-01-04 00:00:02     USA   328000000

    # Always check this, see that date type is now ok
    >>> df.dtypes
    date          datetime64[ns]
    country               object
    population             int64
    dtype: object

    # Convert to year only
    >>> df['date'] = df['date'].dt.year
    >>> df
       date country  population
    0  2023     USA   328000000
    1  2023     USA   328000000
    2  2023     USA   328000000
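
For the original assessmentyear column, the suggestions in this thread might be applied roughly as follows. This is only a sketch: it assumes the API returns the years as plain strings such as '2023', and the sample values and the extra column names (year_strptime, year_int, year_pandas) are made up for illustration rather than taken from the real data set.

```python
import datetime as dt

import pandas as pd

# Hypothetical stand-in for cleandata['assessmentyear']: years stored as strings.
cleandata = pd.DataFrame({'assessmentyear': ['2023', '2022', '2023']})

# buran's suggestion: parse each string, then keep only the .year attribute.
cleandata['year_strptime'] = [
    dt.datetime.strptime(x, '%Y').year for x in cleandata['assessmentyear']
]

# The values are already years, so a plain integer cast may be all that is needed.
cleandata['year_int'] = cleandata['assessmentyear'].astype(int)

# snippsat's approach: convert to datetime64 first, then take .dt.year.
parsed = pd.to_datetime(cleandata['assessmentyear'], format='%Y')
cleandata['year_pandas'] = parsed.dt.year

print(cleandata)
print(cleandata.dtypes)
```

All three new columns should hold the same integer years. The datetime-based routes also reject strings that do not match '%Y', which can be useful if the source column is not guaranteed to be clean.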