Python Forum

Full Version: Problem with date type (object to datetime)
Hi all,
so I have this issue with the conversion to date format.
Here is what my CSV file looks like: [image: csv]

and after setting headers and selecting some columns I get this: [image: df]

but when I want to convert to datetime64[ns] I get this error: [image: error]

I also tried this:
new_data['Date_Time'] = pd.to_datetime(new_data['Date_Time'])
new_data.dtypes
but got almost the same error:
~/anaconda3/lib/python3.7/site-packages/dateutil/parser/_parser.py in _build_naive(self, res, default)
   1225                 repl['day'] = monthrange(cyear, cmonth)[1]
   1226 
-> 1227         naive = default.replace(**repl)
   1228 
   1229         if res.weekday is not None and not res.day:

ValueError: year 1552231082 is out of range
Thanks for your help.
Best regards,
Karlito
While using read_csv one can pass something like

df = pd.read_csv('my.csv', header=None, converters={0: lambda x: datetime.datetime.strptime(x, '%d.%m.%Y %H:%M')})
and the column is formatted as datetime while reading, with no need to convert it later (if passing header names, the converters key must correspond to that header name).
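A related sketch using pandas' own parse_dates/date_parser arguments (the filename, separator and format string here are only assumptions about the file):

import pandas as pd

# parse column 0 into datetime64 while reading; the format string assumes
# timestamps like '31.12.2018 23:54' in the first column
df = pd.read_csv('my.csv', header=None, sep=';',
                 parse_dates=[0],
                 date_parser=lambda x: pd.to_datetime(x, format='%d.%m.%Y %H:%M'))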
(Oct-15-2019, 01:48 PM)perfringo Wrote: While using read_csv one can pass something like

df = pd.read_csv('my.csv', header=None, converters={0: lambda x: datetime.datetime.strptime(x, '%d.%m.%Y %H:%M')})
and the column is formatted as datetime while reading, with no need to convert it later (if passing header names, the converters key must correspond to that header name).

Thanks, perfringo,

but I got this new error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-76-5dc85874d75f> in <module>
      7 
      8 #data = pd.read_csv('data.csv', low_memory = False)
----> 9 data = pd.read_csv('data.csv', header=None, converters={0: lambda x: datetime.datetime.strptime(x, '%d.%m.%Y %H:%M')})
     10 data.head()

~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
    676                     skip_blank_lines=skip_blank_lines)
    677 
--> 678         return _read(filepath_or_buffer, kwds)
    679 
    680     parser_f.__name__ = name

~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    444 
    445     try:
--> 446         data = parser.read(nrows)
    447     finally:
    448         parser.close()

~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in read(self, nrows)
   1034                 raise ValueError('skipfooter not supported for iteration')
   1035 
-> 1036         ret = self._engine.read(nrows)
   1037 
   1038         # May alter columns / col_dict

~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in read(self, nrows)
   1846     def read(self, nrows=None):
   1847         try:
-> 1848             data = self._reader.read(nrows)
   1849         except StopIteration:
   1850             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()

pandas/_libs/parsers.pyx in pandas._libs.parsers._apply_converter()

<ipython-input-76-5dc85874d75f> in <lambda>(x)
      7 
      8 #data = pd.read_csv('data.csv', low_memory = False)
----> 9 data = pd.read_csv('data.csv', header=None, converters={0: lambda x: datetime.datetime.strptime(x, '%d.%m.%Y %H:%M')})
     10 data.head()

AttributeError: type object 'datetime.datetime' has no attribute 'datetime'
It might be related to how you import it: 'import datetime' or 'from datetime import datetime'.
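A minimal illustration of the difference (standard library only):

# Style 1: import the module, qualify twice
import datetime
datetime.datetime.strptime('31.12.2018 23:54', '%d.%m.%Y %H:%M')

# Style 2: import the class, qualify once
from datetime import datetime
datetime.strptime('31.12.2018 23:54', '%d.%m.%Y %H:%M')

# Mixing the two raises the AttributeError above: after the 'from' import,
# the name 'datetime' is the class, which has no '.datetime' attribute.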
(Oct-16-2019, 06:22 AM)perfringo Wrote: It might be related to how you import it: 'import datetime' or 'from datetime import datetime'.
Thanks for answering! :)

I tried 3 cases :) ... first
just 'import datetime',
then 'from datetime import datetime',
and also both together.
You should try to make .strptime work first and then use it in a lambda. Something like:

>>> datetime.datetime.strptime('31.12.2018 23:54', '%d.%m.%Y %H:%M')      # whichever version of it works in your settings                      
datetime.datetime(2018, 12, 31, 23, 54)
I personally use a startup.py file which loads before the interactive session starts, and in this file I use 'import datetime'.
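Once strptime works on a sample string, plugging it back into the converter might look like this (a sketch assuming 'from datetime import datetime' and that column 0 holds the timestamps; add sep=';' if the file is semicolon-separated):

import pandas as pd
from datetime import datetime

data = pd.read_csv('data.csv', header=None,
                   converters={0: lambda x: datetime.strptime(x, '%d.%m.%Y %H:%M')})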
edited: didn't see your reply before posting this :)

OK, this is what I've done so far and it's working.

import pandas as pd
import datetime as dt
#from datetime import datetime 
import matplotlib.pyplot as plt
import numpy as np
import random, string

data = pd.read_csv('dec18_oct19.csv', sep = ';', header = None, low_memory = False)
data.head()
I have this dataframe -> [image: df]
I don't really know what happened, but data[0].dtype has become datetime64, even though yesterday it was an object type (which was the reason for my post)! lol

and then

#Because I just want to work with specific columns
dat = data[[0, 5, 6, 7, 8, 9, 10, 14, 15, 16]].copy(deep=False)
#dat[0] = pd.to_datetime(data[0], errors = 'coerce')
dat[5] = pd.to_numeric(dat[5], errors = 'coerce')
dat.head()
result -> [image: slicing]
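If the first column were still an object dtype, a sketch of the commented-out conversion with an explicit format (assuming timestamps like '31.12.2018 23:54') would be:

# convert column 0 explicitly; unparseable rows become NaT instead of raising
dat[0] = pd.to_datetime(data[0], format='%d.%m.%Y %H:%M', errors='coerce')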

Now I can set the headers:

dat.columns = [
'Date_Time',
'Netzspannung1', 
'Netzspannung2', 
'Netzspannung3', 
'Phasenstrom (Wirkstrom)1', 
'Phasenstrom (Wirkstrom)2', 
'Phasenstrom (Wirkstrom)3',
'Netzstrom1', 
'Netzstrom2', 
'Netzstrom3']
result -> [image: final df]

Are there any ideas to improve this? Thanks.