Python Forum
key error
#1
The following code is not working and I believe it should.

q = df.loc[df2['machine_status']==0]['time_period']
p = df.loc[df2['machine_status']==1]['time_period'][:q.shape[0]]

pq = np.sum(p * np.log(p/q))
qp = np.sum(q * np.log(q/p))
print('KL(P || Q) : %.3f' % pq)
print('KL(Q || P) : %.3f' % qp)
It gives me the following error:

Error:
KeyError                                  Traceback (most recent call last)
File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\indexes\base.py:3621, in Index.get_loc(self, key, method, tolerance)
   3620 try:
-> 3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\_libs\index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\_libs\index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()

File pandas\_libs\hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas\_libs\hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'time_period'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Input In [50], in <cell line: 1>()
----> 1 q = df.loc[df2['machine_status']==0]['time_period']
      2 p = df.loc[df2['machine_status']==1]['time_period'][:q.shape[0]]
      4 pq = np.sum(p * np.log(p/q))

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\frame.py:3505, in DataFrame.__getitem__(self, key)
   3503 if self.columns.nlevels > 1:
   3504     return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
   3506 if is_integer(indexer):
   3507     indexer = [indexer]

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\indexes\base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     # InvalidIndexError. Otherwise we fall through and re-raise
   3627     # the TypeError.
   3628     self._check_indexing_error(key)

KeyError: 'time_period'
Now, time period is not an array of numbers, so Kullback-Leibler cannot work.

It cannot compare anything but numbers. For now, my only question is this: is that the correct explanation for this error?

I do know what a key error is. Please explain.

I am going to show the last columns so anyone can see that this is the error.

I decided to attach a screenshot; it is a better and easier way to show the error.

Respectfully,

LZ

#2
"time_period" is not a column in df. This is easy to test by printing "df" and looking at the columns. "time_period" will not be there. I don't know if this is because there is no time_period column, or if the string doesn't match ("time_period" does not match "time period" or "time_period ").
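
A minimal sketch of that check. The real df is not shown in the thread, so the DataFrame here is a made-up stand-in, and the trailing space in the column name is only an assumed example of how a mismatch could arise.

```python
import pandas as pd

# Hypothetical stand-in for the real df; note the trailing space in 'time_period '.
df = pd.DataFrame({"machine_status": [0, 1, 0, 1],
                   "time_period ": [0.1, 0.2, 0.3, 0.4]})

# repr() of each column name makes stray whitespace visible.
column_names = [repr(c) for c in df.columns]
print(column_names)

# Exact membership test: this is what df['time_period'] relies on.
has_exact = "time_period" in df.columns

# Names that would match once surrounding whitespace is ignored.
near_matches = [c for c in df.columns if c.strip() == "time_period"]
print(has_exact, near_matches)

# If stray whitespace turns out to be the cause, stripping the names is one possible fix.
cleaned = df.rename(columns=str.strip)
print("time_period" in cleaned.columns)
```

If near_matches comes back empty as well, the column is probably missing altogether rather than misspelled.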

As far as I can tell, df.loc[df2['machine_status']==0]['time_period'] and df[df2['machine_status']==0]['time_period'] produce the same result.
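
A small check of that equivalence, using made-up data because the real frames are not shown. It assumes a single df supplies both the mask and the values, whereas the original code builds the mask from df2.

```python
import pandas as pd

# Hypothetical data; the column names follow the question.
df = pd.DataFrame({"machine_status": [0, 1, 0, 1],
                   "time_period": [0.1, 0.2, 0.3, 0.4]})

mask = df["machine_status"] == 0

via_loc = df.loc[mask]["time_period"]       # .loc with a boolean mask, then column lookup
via_brackets = df[mask]["time_period"]      # plain [] with the same mask, then column lookup
single_step = df.loc[mask, "time_period"]   # rows and column selected in one .loc call

# All three selections hold the same values under the same index labels.
same = via_loc.equals(via_brackets) and via_loc.equals(single_step)
print(same)
```

Both chained forms end with the same ['time_period'] lookup, so both raise the same KeyError when that column is absent.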