Python Forum

Full Version: Converting data object
I am pulling a CSV from a database. The date column is read in as an object dtype, and I am attempting to convert it to datetime64[ns].

My code is as follows:

#!/usr/bin/env python
__author__ = "Michael Brown"
__license__ = "Based off of script by Sreenivas Bhattiprolu of Python for Microscopists"

import pandas as pd
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib

CVD = pd.read_csv('https://opendata.arcgis.com/datasets/18582de727934249b92c52542395a3bf_0.csv')
#print(CVD.head())
print(CVD.dtypes)
CVD['DATE'] = [dt.datetime.strptime(x,'%Y/%m/%d %H:%M:%S') 
               for x in CVD['DATE']] 
print(CVD.dtypes)
The output is the following:

ValueError: time data '2020/03/04 15:00:00+00' does not match format '%Y/%m/%d %H:%M:%S%z'
I am not sure what I am missing. I believe it has something to do with the +00 format of the timezone offset.

Any assistance would be great!
Your code doesn't seem to match your error.

But you're right. For %z, a timezone offset would be a + or - followed by 4 digits (hours and minutes). You only have 2. So you can either manually strip the timezone and not parse it, or you could add a couple of zeros as the minute portion and parse it with %z.
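As a rough sketch of those two options, using the sample value from the error message (the variable names here are made up for illustration, and this only handles a single string, not the whole column):

```python
from datetime import datetime

raw = "2020/03/04 15:00:00+00"  # sample value copied from the error message

# Option 1: drop the trailing "+00" and parse the rest as a naive datetime.
naive = datetime.strptime(raw[:-3], "%Y/%m/%d %H:%M:%S")

# Option 2: pad the offset to "+0000" so that %z can parse it.
aware = datetime.strptime(raw + "00", "%Y/%m/%d %H:%M:%S%z")

print(naive)
print(aware)
```

Option 1 throws the offset away, so it is only safe if every value is known to be in the same timezone. Option 2 keeps the offset and gives a timezone-aware datetime.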
Sorry about that. I thought I had pasted the updated copy. Here is the updated code.

#!/usr/bin/env python
__author__ = "Michael Brown"
__license__ = "Based off of script by Sreenivas Bhattiprolu of Python for Microscopists"
 
import pandas as pd
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib
 
CVD = pd.read_csv('https://opendata.arcgis.com/datasets/18582de727934249b92c52542395a3bf_0.csv')
#print(CVD.head())
print(CVD.dtypes)
CVD['DATE'] = [dt.datetime.strptime(x,'%Y/%m/%d %H:%M:%S%z') 
               for x in CVD['DATE']] 
print(CVD.dtypes)
Thank you for the information. I appended two additional "0"s to each value, which resolved the issue. What is your opinion on this solution?

#!/usr/bin/env python
__author__ = "Michael Brown"
__license__ = "Based off of script by Sreenivas Bhattiprolu of Python for Microscopists"

import pandas as pd
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib

CVD = pd.read_csv('https://opendata.arcgis.com/datasets/18582de727934249b92c52542395a3bf_0.csv')
#print(CVD.head())
CVD['ndate'] = CVD['DATE']+'00'

print(CVD.dtypes)
CVD['ndate'] = [dt.datetime.strptime(x,'%Y/%m/%d %H:%M:%S%z') 
               for x in CVD['ndate']] 
print(CVD.dtypes)
If you're sure the format is always the same (like it's generated by a program and that's always used), then it seems workable to me.
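If you would rather not rely on that assumption, one possible safeguard is to check each value before padding it. This is only a sketch: parse_short_offset and SHORT_OFFSET are hypothetical names, and the pattern assumes the exact layout seen in the error message.

```python
import re
from datetime import datetime

# Assumed layout: "YYYY/MM/DD HH:MM:SS" followed by an hours-only offset such as "+00".
SHORT_OFFSET = re.compile(r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}[+-]\d{2}")

def parse_short_offset(text):
    """Parse a timestamp whose offset has hours only, e.g. '+00'."""
    if SHORT_OFFSET.fullmatch(text) is None:
        # Fail loudly instead of silently padding an unexpected value.
        raise ValueError(f"unexpected timestamp format: {text!r}")
    # Append "00" as the minutes part so that %z accepts the offset.
    return datetime.strptime(text + "00", "%Y/%m/%d %H:%M:%S%z")

print(parse_short_offset("2020/03/04 15:00:00+00"))
```

If the data source ever switches to a full offset like +0000, this raises an error rather than producing a wrong timestamp.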
(May-24-2021, 03:56 AM) bowlofred wrote: If you're sure the format is always the same (like it's generated by a program and that's always used), then it seems workable to me.

Yes, it is always in the same format. I am not sure why they do not use the full offset format.