Changing a string value to a numerical value using Python code and a lambda function

**deanhystad** · Jul-05-2022, 05:45 PM

I don't like the index idea. list.index() is relatively slow, and it raises a ValueError when the status name is not in the list. A better solution is a dictionary, which is both faster and handles unexpected status names more gracefully.
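To make the difference concrete, here is a minimal sketch in plain Python, outside pandas. The helper names `code_by_index` and `code_by_dict` are made up for illustration, and returning `None` for an unknown name is just one possible choice of default.

```python
states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}
names = list(states)  # ["NORMAL", "BROKEN", "RECOVERING"]


def code_by_index(name):
    """Hypothetical helper: linear search, raises ValueError for unknown names."""
    return names.index(name)


def code_by_dict(name, default=None):
    """Hypothetical helper: hash lookup, returns `default` for unknown names."""
    return states.get(name, default)


print(code_by_dict("BROKEN"))   # known name
print(code_by_dict("UNKNOWN"))  # unknown name falls back to the default

try:
    code_by_index("UNKNOWN")
except ValueError as err:
    print("index failed:", err)
```

With the list, every caller has to guard against the exception. With the dictionary, the fallback is decided in one place.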

I tried using map with your if statement and with a dictionary. The dictionary is faster. I also tried using apply instead of map with the if statement, and the two are about the same.

The only way I could figure out to vectorize the substitution is using replace().
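As a side note, replace() also accepts a dictionary, so the substitution can be written as a single call. This is only a sketch on a tiny made-up Series, not the code I timed. The `dtype=object` is my assumption, to keep the behavior the same across pandas versions.

```python
import pandas as pd

states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}

# Small illustrative Series; "" stands in for an unexpected status name.
s = pd.Series(["NORMAL", "BROKEN", "RECOVERING", ""], dtype=object, name="State")

# One replace() call with a dict. Values that are not keys are left unchanged.
replaced = s.replace(states)
print(replaced.tolist())
```

Unlike map with a dictionary, replace leaves unknown values as they are instead of turning them into NaN.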

Here are my tests. Printed times are how long it took to create a new column of 20,000 values. I had to make a special accommodation to prevent the index method from crashing (the invalid state is added to the key list so that index() can find it):

```python
import pandas as pd
import numpy as np
from random import choice
from time import time

states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}
keys = list(states.keys()) + [""]  # Add an invalid state

df = pd.DataFrame({"State": [choice(keys) for _ in range(20000)]})

# map with a chained conditional expression
start = time()
df["if"] = df["State"].map(
    lambda x: 0
    if x == "NORMAL"
    else 1
    if x == "BROKEN"
    else 2
    if x == "RECOVERING"
    else np.nan  # np.NaN was removed in NumPy 2.0
)
print("if map", time() - start)

# map with a dictionary; missing keys become NaN
start = time()
df["dict"] = df["State"].map(states)
print("dict map", time() - start)

# apply with the same conditional expression
start = time()
df["if apply"] = df["State"].apply(
    lambda x: 0
    if x == "NORMAL"
    else 1
    if x == "BROKEN"
    else 2
    if x == "RECOVERING"
    else np.nan
)
print("if apply", time() - start)

# map with list.index(); only works because "" is in keys
start = time()
df["index map"] = df["State"].map(lambda x: keys.index(x))
print("index map", time() - start)

# replace(), one call per state
start = time()
df["replace"] = df["State"].replace("NORMAL", 0)
df["replace"] = df["replace"].replace("BROKEN", 1)
df["replace"] = df["replace"].replace("RECOVERING", 2)
print("replace", time() - start)

print(df[:10])
```

Output:

```
if map 0.0060176849365234375
dict map 0.0010302066802978516
if apply 0.007014036178588867
index map 0.005976438522338867
replace 0.003970146179199219
        State   if  dict  if apply  index map replace
0  RECOVERING  2.0   2.0       2.0          2       2
1              NaN   NaN       NaN          3
2              NaN   NaN       NaN          3
3  RECOVERING  2.0   2.0       2.0          2       2
4              NaN   NaN       NaN          3
5      NORMAL  0.0   0.0       0.0          0       0
6      NORMAL  0.0   0.0       0.0          0       0
7  RECOVERING  2.0   2.0       2.0          2       2
8      BROKEN  1.0   1.0       1.0          1       1
9      BROKEN  1.0   1.0       1.0          1       1
```
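A single time() measurement of a few milliseconds is noisy, so the numbers above are only a rough guide. Below is a sketch of how the lookups could be compared more steadily with timeit. It uses plain Python lists rather than pandas, so it is not the same measurement as my test, and the function names `with_dict` and `with_index` are made up for illustration.

```python
from random import choice
from timeit import repeat

states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}
keys = list(states) + [""]  # include an invalid state, as in the test above
values = [choice(keys) for _ in range(20000)]

PASSES = 10  # passes over the data per timing run


def with_dict():
    # Dictionary lookup; unknown names give None.
    return [states.get(v) for v in values]


def with_index():
    # Linear search; works only because "" was added to keys.
    return [keys.index(v) for v in values]


# repeat() returns one total per run. The minimum is the run least
# disturbed by other activity on the machine.
timings = {
    "dict": min(repeat(with_dict, number=PASSES, repeat=5)),
    "index": min(repeat(with_index, number=PASSES, repeat=5)),
}

for name, seconds in timings.items():
    print(f"{name}: {seconds / PASSES:.6f} s per pass")
```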
