Python Forum
using function in np.where
#1
I'm using np.where and trying to reference a function call instead of putting in a static value, something like:
np.where(condition, function(arg1, arg2), 0)
so that where the condition is true the function is called, and otherwise the value is set to 0.

My problem is that I can't just pass the current row and I don't know how to deal with an entire array in the function.

Without vectorization it would be something like:

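result = []
for y, z in zip(y_values, z_values):  # one (y, z) pair per row
    if y > z:
        result.append(function(arg1, arg2))  # only call the function when needed
    else:
        result.append(0)
Does anyone know what to do here?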
Reply
#2
Where do arg1 and arg2 come from? I mean what are y and z?
Reply
#3
Each row in my dataframe is for a certain date and I'm looking to only call the function on dates where it's relevant, to save computation power.

So I need to pass the date (different for each row) and a more static variable (a string).

Usually with np.where I just do something like: np.where(condition, sales_column/average_price, 0)

But I don't know if it's even possible with np.where to reference a function.

Could be that I would need to map a function. But I like the conditional element in the np.where function.
Reply
#4
You could do
np.where(condition, func(sales_column, average_price), 0)
but func would be called before the np.where() is executed, like in any function composition.
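To make the evaluation order concrete, here is a minimal sketch. The arrays and the body of func are made up for illustration; only the np.where call mirrors the line above. The calls list records how many elements func receives each time it runs.

```python
import numpy as np

calls = []  # records the size of every array that func receives

def func(sales, price):
    # Hypothetical stand-in for the real calculation.
    calls.append(len(sales))
    return sales / price

# Made-up example data, one entry per row.
sales_column = np.array([10.0, 20.0, 30.0, 40.0])
average_price = np.array([2.0, 4.0, 5.0, 8.0])
condition = np.array([True, False, True, False])

# func runs once, on the full arrays, before np.where sees anything.
result = np.where(condition, func(sales_column, average_price), 0)

print(result)  # [5. 0. 6. 0.]
print(calls)   # [4] -> one call covering all four rows, not only the two True rows
```

So np.where only picks between two already computed results; it does not skip any work for rows where the condition is False.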
Reply
#5
But then I would be passing an array to the function, and the function would have to work on the entire array and return a modified array. np.where would then have to make sure the values end up in the right rows.
Maybe that's doable, I don't know. It's been surprisingly hard to find anyone describing this use case.
Reply
#6
When you write np.where(condition,sales_column/average_price,0), the situation is similar: the result of sales_column/average_price is computed before np.where() is called. I think it works because the length of the resulting array is the same as the number of rows.
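A small sketch of that point, with made-up numbers: the division is carried out for every row first, including a row that the condition later rejects, and np.where then selects element by element from an array that has the same length as the condition.

```python
import numpy as np

# Made-up example data; the second price is zero on purpose.
sales_column = np.array([10.0, 20.0, 30.0])
average_price = np.array([2.0, 0.0, 5.0])
condition = average_price > 0

# The ratio is computed for all rows, so the zero price still gets divided by.
# errstate only silences the resulting warning for this demonstration.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = sales_column / average_price

# np.where picks ratio[i] where condition[i] is True, otherwise 0.
result = np.where(condition, ratio, 0)

print(ratio.shape == condition.shape)  # True
print(result)                          # [5. 0. 6.]
```

The unwanted value in the second row is computed and then thrown away, which is why np.where by itself does not save any computation.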
Reply
#7
(Feb-17-2022, 07:14 PM)glidecode Wrote: I'm looking to only call the function on dates where it's relevant, to save computation power.

Is computation power (really) a bottleneck?

Maybe a straightforward approach will work: create a mask, add a new column of zeros, then apply the function only to the rows matching the mask. Something like:

mask = (df[some_column] some_condition)  

df[new_column] = 0
df.loc[mask, new_column] = some_calculated_value
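A runnable version of that idea could look like the sketch below. The column names, the data, the threshold and the expensive function are all hypothetical; it just assumes a function that takes a date and a fixed string, as described earlier in the thread.

```python
import pandas as pd

calls = []  # records every date the function is actually called with

def expensive(date, label):
    # Hypothetical per-row calculation taking a date and a fixed string.
    calls.append(date)
    return date.dayofyear + len(label)

# Made-up example data, one row per date.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-02-15", "2022-02-16", "2022-02-17"]),
    "sales": [0, 12, 7],
})

mask = df["sales"] > 10  # rows where the function is relevant

df["new_column"] = 0
# Only the masked rows are passed to the function; assignment aligns on the index.
df.loc[mask, "new_column"] = df.loc[mask, "date"].apply(lambda d: expensive(d, "abc"))

print(df["new_column"].tolist())  # [0, 50, 0]
print(len(calls))                 # 1 -> the function ran for the matching row only
```

Unlike the np.where version, the function is never called for rows outside the mask, so this is the variant that actually avoids the extra computation.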
Reply