Python Forum

Problem with replacing strings
Hi guys,

I have a data set with dates that look like this:

Jan. 01
Feb. 01
Mar. 01

and so on. It seems pandas can't convert this date format using

df['Date'] = pd.to_datetime(df['Date'])
For that to work, I guess the format has to be:

Jan. 2001
Feb. 2001
Mar. 2001

What I have been trying, and I am 100% sure there is a more elegant way, is to replace every "01" with "2001" and so on. The code looks like this:

for i in range(len(df["Date"])):
        df["Date"][i] = df["Date"][i].replace("00","2000")
        df["Date"][i] = df["Date"][i].replace("01","2001")
        df["Date"][i] = df["Date"][i].replace("02","2002")
        df["Date"][i] = df["Date"][i].replace("03","2003")
The problem with this code is that strings that were already transformed, e.g. 2020, get replaced again because they contain the substring "02". As a result, I get dates like: Jan 202000
Is there a way to fix my approach, or a different approach that solves my problem?
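One way to keep the replace idea but avoid the double replacement is to anchor the match to the end of the string, so that only a trailing two-digit year is touched. This is a minimal sketch using the standard `re` module; the helper name `expand_year` and the assumption that every two-digit year belongs to the 2000s are mine, not from the original post.

```python
import re

# Match exactly two digits at the very end of the string that are not
# preceded by another digit, so an expanded year like "2020" is left alone.
TWO_DIGIT_YEAR = re.compile(r"(?<!\d)(\d{2})$")

def expand_year(date_str, century="20"):
    """Hypothetical helper: 'Jan. 01' -> 'Jan. 2001' (assumes years 2000-2099)."""
    return TWO_DIGIT_YEAR.sub(lambda m: century + m.group(1), date_str)

dates = ["Jan. 00", "Feb. 01", "Mar. 20", "Apr. 2020"]
expanded = [expand_year(d) for d in dates]
print(expanded)  # ['Jan. 2000', 'Feb. 2001', 'Mar. 2020', 'Apr. 2020']
```

Because the pattern only matches once per string and only at the end, running it a second time changes nothing. On a DataFrame column the same pattern should work with `df["Date"].str.replace(..., regex=True)`, which avoids the explicit loop.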
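def y2k(datestr):
    # The year is the last whitespace-separated token, e.g. '01' or '1999'.
    year = datestr.split()[-1]
    if len(year) < 4:
        # Two-digit year: insert the century '20' before the last two characters.
        return datestr[:-2] + '20' + datestr[-2:]
    return datestr

print(y2k('Jan. 01'))    # Jan. 2001
print(y2k('Jan. 1999'))  # Jan. 1999 (already four digits, unchanged)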
Maybe something along these lines:

>>> import pandas as pd
>>> df = pd.DataFrame({'date': ['Jan. 01', 'Feb. 01', 'Mar. 01']})
>>> df
      date
0  Jan. 01
1  Feb. 01
2  Mar. 01
>>> df['date'] = pd.to_datetime(df['date'], format='%b. %y')
>>> df
        date
0 2001-01-01
1 2001-02-01
2 2001-03-01
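A note on `%y`, in case the data also contains years before 2000: Python's `strptime` maps two-digit years 00 to 68 onto 2000 to 2068 and 69 to 99 onto 1969 to 1999. The sketch below shows this with the standard library only; the helper name `parse_month_year` is made up for illustration, and `%b` assumes English month abbreviations (the default C locale). `pd.to_datetime` with the same format string is expected to follow the same convention.

```python
from datetime import datetime

def parse_month_year(date_str):
    # Hypothetical helper: parse strings such as 'Jan. 01'.
    # '%b' = abbreviated month name, '%y' = two-digit year.
    return datetime.strptime(date_str, "%b. %y")

print(parse_month_year("Jan. 01"))       # 2001-01-01 00:00:00
print(parse_month_year("Jan. 68").year)  # 2068
print(parse_month_year("Jan. 69").year)  # 1969
```

If all dates are known to be in the 2000s but run past 2068, expanding the year to four digits first and parsing with `%Y` is the safer route.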
Both methods worked! Thank you so much.