Access specific value in df

ju21878436312 · (This post was last modified: Aug-16-2021, 09:45 AM by ju21878436312.)

Hi there,

Is there a way to assign a time difference to the whole column, from a certain row on?
How do I access e.g. the value in row 2, column 0?
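
On the second question, a single cell can be read by position with `iloc` or `iat`, or by label with `loc`. A minimal sketch on a made-up frame (the three rows copy the sample timestamps from this post; the frame itself is an assumption, not the real file):

```python
import pandas as pd

# Hypothetical stand-in for the real file: three timestamps and a channel column.
df = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2021-06-10 09:50:09",
        "2021-06-10 09:50:20",
        "2021-06-10 09:50:31",
    ]),
    "Channel": [1, 1, 1],
})

by_position = df.iloc[2, 0]        # row position 2, column position 0
by_position_fast = df.iat[2, 0]    # same lookup with the scalar-only accessor
by_label = df.loc[2, "Date_Time"]  # row label 2, column label "Date_Time"

print(by_position)
```

With the default integer index, row position and row label coincide, so all three lookups return the same `Timestamp`.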

import pandas as pd
import numpy as np

df = pd.read_csv('2021-06-15_data.txt', delimiter= '\t',parse_dates=[[0, 1]], header=None, names=["Date","Time","Channel","time","0.3","0.5","1.0","3.0","5.0","10.0"])
# adjust year in the column
df['Date_Time'] = df['Date_Time'] + pd.Timedelta(days = 365*20)

df.head(10)

# Output:
     Date_Time 	            Channel 	...
0 	2021-06-10 09:50:09 	1 	            ...
1 	2021-06-10 09:50:20 	1 	            ...
2 	2021-06-10 09:50:31 	1 	            ...
3 	2021-06-10 09:50:42 	1 	            ...
4 	2021-06-10 09:50:53 	1 	            ...
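
A side note on the year adjustment: `pd.Timedelta(days=365*20)` ignores leap days, so a 20-year shift comes out a few days short. A calendar-aware alternative is `pd.DateOffset(years=20)`. The sketch below assumes a raw timestamp in 2001, which the post does not state:

```python
import pandas as pd

# Hypothetical raw timestamp; the year 2001 is an assumption for illustration.
raw = pd.Series(pd.to_datetime(["2001-06-15 09:50:09"]))

# Fixed number of days: skips the leap days between 2001 and 2021.
shifted_by_days = raw + pd.Timedelta(days=365 * 20)

# Calendar-aware shift: same month and day, 20 years later.
shifted_by_years = raw + pd.DateOffset(years=20)

print(shifted_by_days.iloc[0])
print(shifted_by_years.iloc[0])
```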


# Desired output:
     Date_Time 	            Channel 	...
0 	2021-06-10 09:50:09 	1 	            ...
1 	2021-06-10 09:50:20 	1 	            ...
2 	00:00:00 	         1 	            ...
3 	00:00:11 	         1 	            ...
4 	00:00:22 	         1 	            ...

What I've tried:
1.) df.loc to access time range
2.) Access value in row 2, column 0, to subtract this value from the others
3.) Assign values to first column with pandas time difference

df.loc[(df['Date_Time']  > '2021-06-10 09:50:20'),'Date_Time']=(df['Date_Time'] - "2021-06-10 09:50:20")

I am encountering several errors that are hard for me to interpret, such as

TypeError                                 Traceback (most recent call last)
<ipython-input-44-d1ca9152d6b1> in <module>
      2 df.iloc[0:5]
      3 
----> 4 df.loc[(df['Date_Time']  >= '2021-06-10 09:50:20'),'Date_Time']=(df['Date_Time'] - "2021-06-10 09:50:20")

~\AppData\Roaming\Python\Python37\site-packages\pandas\core\ops\common.py in new_method(self, other)
     62         other = item_from_zerodim(other)
     63 
---> 64         return method(self, other)
     65 
     66     return new_method

~\AppData\Roaming\Python\Python37\site-packages\pandas\core\ops\__init__.py in wrapper(left, right)
    501         lvalues = extract_array(left, extract_numpy=True)
    502         rvalues = extract_array(right, extract_numpy=True)
--> 503         result = arithmetic_op(lvalues, rvalues, op, str_rep)
    504 
    505         return _construct_result(left, result, index=left.index, name=res_name)

~\AppData\Roaming\Python\Python37\site-packages\pandas\core\ops\array_ops.py in arithmetic_op(left, right, op, str_rep)
    191         #  by dispatch_to_extension_op.
    192         # Timedelta is included because numexpr will fail on it, see GH#31457
--> 193         res_values = dispatch_to_extension_op(op, lvalues, rvalues)
    194 
    195     else:

~\AppData\Roaming\Python\Python37\site-packages\pandas\core\ops\dispatch.py in dispatch_to_extension_op(op, left, right)
    123     # The op calls will raise TypeError if the op is not defined
    124     # on the ExtensionArray
--> 125     res_values = op(left, right)
    126     return res_values

TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'str'

Could someone give me a hint?
Kind regards,
Julia

jamesaarr · Aug-17-2021, 12:33 PM

Hello mate,

The issue looks like the data is being stored as different data types. When you declare the variable that stores each bit of data, just set the variable to the desired type. E.g.:

a = datetimearray(10.10.10.0.0.0)
b = datetimearray(11.10.10.0.0.0)
c = str("Hello")
d = int(123)

etc

Try something similar to that.

ju21878436312 · (This post was last modified: Aug-18-2021, 08:24 AM by ju21878436312.)

Hi,

I think the first mistake was to subtract a 'str' from the datetime column, as the error was

TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'str'
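
That reading can be checked in isolation. In a comparison pandas parses the string for you, but in arithmetic it does not, so the reference has to be converted first. A minimal sketch with made-up sample values:

```python
import pandas as pd

# Hypothetical sample column with the timestamps from the first post.
times = pd.Series(pd.to_datetime([
    "2021-06-10 09:50:09",
    "2021-06-10 09:50:20",
    "2021-06-10 09:50:31",
]))

# Convert the string to a Timestamp before subtracting.
reference = pd.Timestamp("2021-06-10 09:50:20")
elapsed = times - reference  # Series with a timedelta dtype

# Subtracting the bare string is what triggers the TypeError.
try:
    times - "2021-06-10 09:50:20"
except TypeError as exc:
    print("TypeError:", exc)

print(elapsed)
```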

But if I change the 'str' to 'Timestamp', I get the following:

import pandas as pd
import numpy as np

df = pd.read_csv('2021-06-15_data.txt', delimiter= '\t',parse_dates=[[0, 1]], header=None, names=["Date","Time","Channel","time","0.3","0.5","1.0","3.0","5.0","10.0"])
# adjust year in the column
df['Date_Time'] = df['Date_Time'] + pd.Timedelta(days = 365*20)

df.head(5)

# Try with pandas.to_datetime
df.loc[(df['Date_Time']  >= '2021-06-10 09:50:20'),'Date_Time']=(df['Date_Time'] - pd.to_datetime(1623318620, unit='s'))
df

with the output:

Date_Time 	Channel 	...
0 	2021-06-10 09:50:09 	1 	...
1 	2021-06-10 09:50:20 	1 	...
2 	2021-06-10 09:50:31 	1 	...
3 	2021-06-10 09:50:42 	1 	...
4 	2021-06-10 09:50:53 	1 	...

TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Timestamp'
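
One possible explanation, which the thread does not confirm: an earlier attempt in the same notebook session may already have written non-datetime values into `Date_Time`, leaving the column with `object` dtype. In that case the subtraction falls through to a plain NumPy array. Checking the dtype before doing arithmetic shows whether this has happened. The sketch uses two made-up Series to show the difference:

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical column that holds a mix of Timestamp and Timedelta objects.
mixed = pd.Series(
    [pd.Timestamp("2021-06-10 09:50:09"), pd.Timedelta(seconds=11)],
    dtype=object,
)

# Hypothetical column that holds only datetimes.
clean = pd.Series(pd.to_datetime(["2021-06-10 09:50:09", "2021-06-10 09:50:31"]))

print(mixed.dtype, is_datetime64_any_dtype(mixed))
print(clean.dtype, is_datetime64_any_dtype(clean))
```

If `df['Date_Time'].dtype` reports `object`, re-running the `read_csv` cell restores a clean datetime column before trying the subtraction again.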

P.S. I have also tried out

import pandas as pd
import numpy as np
import datetime

df = pd.read_csv('2021-06-15_data.txt', delimiter= '\t',parse_dates=[[0, 1]], header=None, names=["Date","Time","Channel","time","0.3","0.5","1.0","3.0","5.0","10.0"])
# adjust year in the column
df['Date_Time'] = df['Date_Time'] + pd.Timedelta(days = 365*20)

df.head(5)

# Try with datetimearray
b = datetimearray(2021-06-10 09:50:20)
df.loc[(df['Date_Time']  >= '2021-06-10 09:50:20'),'Date_Time']=(df['Date_Time'] - b)
df

when I got:

NameError: name 'datetimearray' is not defined

(importing datetime or not)

The original plan was to refer to a cell in the column, which raises another error.

df.loc[(df['Date_Time']  >= '2021-06-10 09:50:20'),'Date_Time']=(df['Date_Time'] - df['Date_Time',1])
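
In that last line, `df['Date_Time', 1]` is read as a single column label, the tuple `('Date_Time', 1)`, not as "column `Date_Time`, row 1". Selecting the column first and then the row gives the cell. The sketch below also writes the elapsed time to a separate column, since a datetime column is not meant to hold timestamps and time differences together. The `Elapsed` column name and the sample frame are assumptions for illustration, and row 2 is used as the reference to match the desired output:

```python
import pandas as pd

# Hypothetical stand-in for the real file, using the sample rows from the post.
df = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2021-06-10 09:50:09",
        "2021-06-10 09:50:20",
        "2021-06-10 09:50:31",
        "2021-06-10 09:50:42",
        "2021-06-10 09:50:53",
    ]),
    "Channel": [1, 1, 1, 1, 1],
})

# Select the column first, then the row, to get a single Timestamp.
reference = df["Date_Time"].iloc[2]

# Time since the reference row; rows before the reference are left as NaT.
elapsed = df["Date_Time"] - reference
df["Elapsed"] = elapsed.where(df["Date_Time"] >= reference)

print(df)
```

Keeping `Date_Time` untouched means the original timestamps stay available for plotting or filtering, while `Elapsed` carries the time differences.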
