Posts: 164
Threads: 88
Joined: Feb 2021
Sep-08-2022, 12:42 PM
(This post was last modified: Sep-08-2022, 12:43 PM by Led_Zeppelin.)
I am using the following statement in my program
df2["machine_status"] = df2["machine_status"].map(
    lambda x: 0
    if x == "NORMAL"
    else 1
)
This clearly refers to the machine status in the program. I am changing the values normal and broken to numeric values. If normal, then use 0 else use 1.
I would like to use the same logic on my time period column: if morning, insert 1 in place of morning; if afternoon, insert 2 in place of afternoon; if evening, insert 3 in place of evening; and if night, insert 4 in place of night. I tried that and I got an error.
The error said expected else after if statement. In my current situation I have four conditions, not two. Thus, there will be three more if statements, and else follows the last if statement.
But the interpreter does not like it and it gives me an error.
How can I fix this?
The data that I am trying to change is in a dataframe like this
time period
night
morning
afternoon
evening
That is what I am applying my Python 3 code to. How do I write the code?
Respectfully,
LZ
Posts: 6,813
Threads: 20
Joined: Feb 2020
Sep-08-2022, 02:02 PM
(This post was last modified: Sep-08-2022, 05:07 PM by deanhystad.)
df2["machine_status"] = df2["machine_status"].map(lambda x: 0 if x == "NORMAL" else 1)
or
func = lambda x: 0 if x == "NORMAL" else 1
df2["machine_status"] = df2["machine_status"].map(func)
Python doesn't have code block markers like C/C++, so indenting and linefeeds have meaning. They cannot be used willy-nilly just because you think it makes the code pretty. The way you wrote your lambda expression must have confused Python into thinking it was an if/else statement instead of a conditional expression (sometimes called the ternary operator). Though a conditional expression contains if and else, it is not the same as an if/else statement.
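To make the difference between the two forms concrete, here is a minimal sketch in plain Python. The names `encode_status` and `encode_status_stmt` are made up for this illustration. As far as I can tell, a conditional expression that sits inside parentheses may be split across lines, so the line breaks by themselves should not turn it into a statement.

```python
# A conditional expression is one expression, so it can be the body of a lambda.
# Inside the surrounding parentheses it may span several lines.
encode_status = (
    lambda x: 0
    if x == "NORMAL"
    else 1
)


# The same logic as an if/else statement. Statements need a def, not a lambda.
def encode_status_stmt(x):
    if x == "NORMAL":
        return 0
    else:
        return 1


print(encode_status("NORMAL"), encode_status("BROKEN"))
print(encode_status_stmt("NORMAL"), encode_status_stmt("BROKEN"))
```

Both functions give the same result for any input; only the second one could hold additional statements in its branches.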
Posts: 164
Threads: 88
Joined: Feb 2021
I know that indentation is critical in Python. I believe you misunderstood my post. I used an expanded version of the code in my first post.
Here is a sample.
f4["machine_status"] = df["machine_status"].map(
# lambda x: 1
# if x == "Morning"
# lambda x: 2
# if x == "Aftenoon"
# lambda x: 3
# if x == "Evening"
# else "Night"
#)
That is a variation of the code in my first post. It gave the error "expected else after if statement". To me that means it only supports two conditions. But in this case, I have four conditions.
Thus, I need to modify the code to handle that. I want to use map and lambda because I think the code segment will run faster that way.
My original code from my initial post is running ok. I thought that I would use something like it in the current situation. But it gave an error.
R,
LZ
Posts: 6,813
Threads: 20
Joined: Feb 2020
Sep-08-2022, 05:02 PM
(This post was last modified: Sep-08-2022, 05:04 PM by deanhystad.)
If you want to ask a question about this:
f4["machine_status"] = df["machine_status"].map(
    lambda x: 1
    if x == "Morning"
    lambda x: 2
    if x == "Aftenoon"
    lambda x: 3
    if x == "Evening"
    else "Night"
)
Why did you post this?
df2["machine_status"] = df2["machine_status"].map(
    lambda x: 0
    if x == "NORMAL"
    else 1
)
Oh well.
You are confused about lambda expressions. A lambda expression defines a function, it does not call a function. In your examples you pass a function to map(). The function is defined using a lambda expression, but it could just as well be defined using def. The map() function expects a function as an argument, and it calls the function for each element in the series.
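As a small sketch of that point, the example below passes the same function to `map()` twice, once defined with `def` and once with a lambda expression. The sample series and the names `encode` and `encode_lambda` are invented for this illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the machine_status column.
status = pd.Series(["NORMAL", "BROKEN", "NORMAL"])


# A function defined with def.
def encode(x):
    return 0 if x == "NORMAL" else 1


# The same function defined with a lambda expression.
encode_lambda = lambda x: 0 if x == "NORMAL" else 1

# map() receives the function object and calls it once per element.
by_def = status.map(encode)
by_lambda = status.map(encode_lambda)

print(by_def.tolist())
print(by_lambda.tolist())
```

In both calls the function is handed over without parentheses; `map()` is what calls it.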
In your more complex example you are trying to use the lambda expression as a FUNCTION CALL. Even then the coding is wrong. This code is syntactically correct, but it does not do what you want.
import pandas as pd
df = pd.DataFrame({"status": ["Morining", "Evening", "Afternoon", "Night"]})
df["value"] = df["status"].map(
    lambda x: 1
    if x == "Morning"
    else lambda x: 2
    if x == "Afternoon"
    else lambda x: 3
    if x == "Evening"
    else 4
)
print(df)
Output:
status value
Morining <function <lambda>.<locals>.<lambda> at 0x0000...1
Evening <function <lambda>.<locals>.<lambda> at 0x0000...2
Afternoon <function <lambda>.<locals>.<lambda> at 0x0000...3
Night <function <lambda>.<locals>.<lambda> at 0x0000...
The function is added to the dataframe, not the value that would be returned by the function if it were called.
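The same effect can be reproduced without pandas. This is only a sketch with a shortened, made-up version of the nested lambda (the name `outer` is hypothetical), but it shows that the else branch hands back a function object that still has to be called.

```python
# When x is not "Morning", the outer lambda returns another lambda,
# that is, a function object rather than a number.
outer = lambda x: 1 if x == "Morning" else lambda x: 2

result = outer("Evening")
print(callable(result))  # the result is itself a function

# Only calling that inner function produces the value.
print(result("Evening"))
```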
You could write something like this:
import pandas as pd
df = pd.DataFrame({"status": ["Morning", "Evening", "Afternoon", "Night"]})
df["value"] = df["status"].map(
    lambda x: 1
    if x == "Morning"
    else 2
    if x == "Evening"
    else 3
    if x == "Afternoon"
    else 4
)
print(df)
Output:
status value
0 Morning 1
1 Evening 2
2 Afternoon 3
3 Night 4
I think this code is hideous, but I don't like this much either.
df2["machine_status"] = df2["machine_status"].map(
    lambda x: 0
    if x == "NORMAL"
    else 1
)
It is not obvious to me at all that this evaluates to 0 if x == "NORMAL" else 1. The way it reads is very clunky and the blocking obscures the meaning. When you have something complex, define a function. Limit lambda expressions to equations.
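For comparison, here is a rough sketch of the four-way encoding from this post written three ways: as a chained conditional expression, with the grouping made explicit by parentheses, and as a named function. The function names are invented for the illustration; the value assignments follow the working example above (Morning 1, Evening 2, Afternoon 3, anything else 4).

```python
def encode_chained(x):
    # Chained conditional expression, as in the lambda version.
    return 1 if x == "Morning" else 2 if x == "Evening" else 3 if x == "Afternoon" else 4


def encode_grouped(x):
    # The same chain with explicit parentheses: each else holds the rest of the chain.
    return 1 if x == "Morning" else (2 if x == "Evening" else (3 if x == "Afternoon" else 4))


def encode_named(x):
    # Plain if/elif statements in a named function.
    if x == "Morning":
        return 1
    elif x == "Evening":
        return 2
    elif x == "Afternoon":
        return 3
    return 4


for period in ["Morning", "Evening", "Afternoon", "Night"]:
    print(period, encode_chained(period), encode_grouped(period), encode_named(period))
```

All three return the same values; which one reads best is a matter of taste.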
Posts: 164
Threads: 88
Joined: Feb 2021
Sep-08-2022, 05:12 PM
(This post was last modified: Sep-08-2022, 05:20 PM by Led_Zeppelin.)
The following Python code works in my program and is very fast.
df2["machine_status"] = df2["machine_status"].map(
    lambda x: 0
    if x == "NORMAL"
    else 1
)
It uses 0 for normal and 1 for anything else. It works and it is fast. It goes through 220320 data items like lightning.
That code handles only two situations.
I tried to expand it to four conditions to deal with a different set of data, which I will show you now.
df4["machine_status"] = df["machine_status"].map(
    lambda x: 1
    if x == "Morning"
    lambda x: 2
    if x == "Aftenoon"
    lambda x: 3
    if x == "Evening"
    else "Night"
)
This code does not work. I must encode the data because I cannot do Kullback-Leibler on data such as morning, afternoon, evening and night. Just like the original short code, I am encoding, but for four conditions, not two. But the code I am using does not work. It throws an error:
expected else statement after if.
I like the shorter code currently in use. It is very fast. I just want to use the longer version to do something similar, but it fails.
What am I doing wrong?
R,
LZ
Posts: 6,813
Threads: 20
Joined: Feb 2020
Sep-08-2022, 06:09 PM
(This post was last modified: Sep-08-2022, 08:20 PM by deanhystad.)
You were using lambda functions incorrectly. You used lambda expressions inside your lambda expression, expecting them to act like a function call. lambda expressions define a function, they don't call it. If you want to use a lambda function with multiple if's do it like this:
df["value"] = df["status"].map(
    lambda x: 1
    if x == "Morning"
    else 2
    if x == "Afternoon"
    else 3
    if x == "Evening"
    else 4
)
That code works and will be about as fast as your two-state lambda code.
Using a lambda expression is not what makes your two-state code fast. Named functions and lambda expressions should take about the same amount of time. Here I test using a lambda expression against using a named function. For good measure I tossed in using a dictionary. I measured the time to process 10 million entries.
import pandas as pd
import random
from time import time
states = ["Morning", "Afternoon", "Evening", "Night"]
df = pd.DataFrame({"state": random.choices(states, k=10000000)})
mapping = dict(zip(states, (1, 2, 3, 4)))
def func(x):
    if x == "Morning":
        return 1
    elif x == "Afternoon":
        return 2
    elif x == "Evening":
        return 3
    return 4
a = time()
df["a"] = df["state"].map(func)
b = time()
df["b"] = df["state"].map(
    lambda x: 1
    if x == "Morning"
    else 2
    if x == "Afternoon"
    else 3
    if x == "Evening"
    else 4
)
c = time()
df["c"] = df["state"].map(mapping)
d = time()
print("Function", b - a, " Lambda ", c - b, " Dictionary", d - c)
Output: Function 3.0800023078918457 Lambda 3.074204206466675 Dictionary 0.2832789421081543
I ran this several times and even changed the order, testing the lambda expression first. That made no difference. The named function and lambda expression results are nearly identical and both are 10 times slower than using a dictionary.
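One difference worth keeping in mind when switching to a dictionary: a chained lambda sends every unknown value to the final else branch, while `map()` with a plain dictionary leaves values that are not keys as NaN. The sketch below uses a made-up series containing the misspelling "Aftenoon" from earlier in the thread; `periods`, `codes` and `unmapped` are names chosen for this illustration.

```python
import pandas as pd

mapping = {"Morning": 1, "Afternoon": 2, "Evening": 3, "Night": 4}

# Hypothetical data with one misspelled entry.
periods = pd.Series(["Night", "Morning", "Aftenoon", "Evening"])

# Values that are not keys of the dictionary come back as NaN.
codes = periods.map(mapping)

# Collect the original values that could not be encoded.
unmapped = periods[codes.isna()]

print(codes.tolist())
print(unmapped.tolist())
```

Checking `unmapped` before going further is one way to catch spelling or capitalization problems in the source column.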
Posts: 6,813
Threads: 20
Joined: Feb 2020
Sep-08-2022, 07:19 PM
(This post was last modified: Sep-08-2022, 08:17 PM by deanhystad.)
Egg on my face. You can embed lambda expressions inside lambda expressions. It just doesn't make any sense to do so.
This code uses lambda expressions inside lambda expressions. It runs slower than either of my previous examples, about 40% slower. I still can't believe it works.
import pandas as pd
import random
from time import time
states = ["Morning", "Afternoon", "Evening", "Night"]
df = pd.DataFrame({"state": random.choices(states, k=10000000)})
a = time()
df["a"] = df["state"].map(
    lambda x: 1
    if x == "Morning"
    else (
        lambda x: 2 if x == "Afternoon" else (lambda x: 3 if x == "Evening" else 4)(x)  # <- Calls (3,4) lambda
    )(x)  # Calls (2, (3,4)) lambda
)
print(time() - a)
Even though it works, DO NOT USE IT!!! lambda expressions only make sense when you need to define a short function that is used elsewhere. It makes sense to use a lambda expression with map() because map takes a function as an argument. It does not make sense to define a lambda expression when you don't need a function. You can do this.
x = (lambda a, b: a + b)(1, 2)
But you should do this.
x = 1 + 2
Posts: 1,950
Threads: 8
Joined: Jun 2018
Sep-09-2022, 08:16 AM
(This post was last modified: Sep-09-2022, 08:16 AM by perfringo.)
(Sep-08-2022, 12:42 PM)Led_Zeppelin Wrote: This clearly refers to the machine status in the program. I am changing the values normal and broken to numeric values. If normal, then use 0 else use 1.
I would achieve this objective with something like this (because it's a simple boolean comparison):
import pandas as pd
col = ("NORMAL", "", "NORMAL", "something", "NORMAL")
df = pd.DataFrame(col, columns=["machine_state"])
df["machine_state"] = (df["machine_state"] != "NORMAL").astype(int)
# initial df
machine_state
0 NORMAL
1
2 NORMAL
3 something
4 NORMAL
# after
machine_state
0 0
1 1
2 0
3 1
4 0
With more than two output values this (boolean) approach can't be used. One way would be to change one value at a time:
df.loc[df["machine_state"] == "NORMAL"] = 0
#
machine_state
0 0
1
2 0
3 something
4 0
One can make a loop to avoid repetition (untested idea):
states = ("morning", "afternoon", "evening", "night")
for i, state in enumerate(states, start=1):
    df.loc[df["machine_learning"] == state] = i
Posts: 6,813
Threads: 20
Joined: Feb 2020
Sep-09-2022, 07:27 PM
(This post was last modified: Sep-09-2022, 07:56 PM by deanhystad.)
import pandas as pd
import random
from time import time
states = ["Morning", "Afternoon", "Evening", "Night"]
df = pd.DataFrame({"state": random.choices(states, k=10000000)})
a = time()
for i, state in enumerate(states, start=1):
    df.loc[df["state"] == state, ["state"]] = i
print(time() - a)
Output: 1.6639153957366943
Almost twice as fast as calling a function/lambda. Nowhere near as fast as passing a dictionary.
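As a quick sanity check that the loop and the dictionary give the same encoding, here is a small sketch on a handful of made-up rows. It differs from the timed code in one respect, which is my own assumption rather than something from the thread: the loop writes into a separate integer column (`code_loop`, a hypothetical name) so that strings and integers are not mixed in one column.

```python
import pandas as pd

states = ["Morning", "Afternoon", "Evening", "Night"]

# Hypothetical sample rows.
df = pd.DataFrame({"state": ["Night", "Morning", "Afternoon", "Evening", "Morning"]})

# Loop approach: one boolean mask per state, written into a separate integer column.
df["code_loop"] = 0
for i, state in enumerate(states, start=1):
    df.loc[df["state"] == state, "code_loop"] = i

# Dictionary approach: build the mapping once and pass it to map().
mapping = {state: i for i, state in enumerate(states, start=1)}
df["code_map"] = df["state"].map(mapping)

print(df)
```

Both columns should hold identical codes; any row left at 0 in `code_loop` would point to a value missing from `states`.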