Python Forum
Changing a string value to a numerical value using Python code and a lambda function
#3
I don't like the index idea. It is relatively slow, and list.index() raises a ValueError when the status name is not in the list. A better solution is to use a dictionary, which is both faster and handles unexpected status names more gracefully.
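To make the difference concrete, here is a minimal sketch in plain Python. The names STATES, NAMES, code_by_index and code_by_dict are made up for this illustration and are not from the original question.

```python
# Hypothetical names, used only to illustrate the point above.
STATES = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}
NAMES = list(STATES)  # ["NORMAL", "BROKEN", "RECOVERING"]


def code_by_index(name):
    # list.index() scans the list and raises ValueError for unknown names.
    return NAMES.index(name)


def code_by_dict(name, default=None):
    # dict.get() is a hash lookup and returns the default for unknown names.
    return STATES.get(name, default)


print(code_by_dict("BROKEN"))   # 1
print(code_by_dict("UNKNOWN"))  # None

try:
    code_by_index("UNKNOWN")
except ValueError as exc:
    print("index lookup failed:", exc)
```

With the dictionary you choose what an unknown status becomes (None here); with index() you have to catch the exception or pad the list.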

I tried using map with your if statement and with a dictionary. The dictionary is faster (in the run below, about 0.001 s versus 0.006 s for the if lambda). I also tried using apply instead of map, and they are about the same.

The only way I could figure out to vectorize the substitution is using replace().
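As a sketch of the replace() idea: replace() also accepts a dictionary, so the substitution can be written as one call instead of one call per status. The sample Series below is invented for the example, and dtype=object is an assumption I added so the mixed string/integer result behaves the same whatever the default string dtype is.

```python
import pandas as pd

states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}

# Invented sample data; "" stands in for an unexpected status.
# dtype=object is an assumption so str and int values can share the column.
s = pd.Series(["NORMAL", "BROKEN", "", "RECOVERING"], dtype=object)

# One call handles every key in the dictionary.
# Values that are not keys (the "" here) are left unchanged.
replaced = s.replace(states)
print(replaced.tolist())
```

Unlike map with a dictionary, replace() leaves unknown values as they are rather than turning them into NaN, so the resulting column can still contain strings.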

Here are my tests. Printed times are how long, in seconds, it took to create a new column of 20,000 values. I had to make special accommodations to prevent the index method from crashing:
import pandas as pd
import numpy as np
from random import choice
from time import time

states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}
keys = list(states.keys()) + [""]  # Add an invalid state

df = pd.DataFrame({"State": [choice(keys) for _ in range(20000)]})

start = time()
df["if"] = df["State"].map(
    lambda x: 0
    if x == "NORMAL"
    else 1
    if x == "BROKEN"
    else 2
    if x == "RECOVERING"
    else np.nan  # np.NaN was removed in NumPy 2.0
)
print("if map", time() - start)

start = time()
df["dict"] = df["State"].map(states)
print("dict map", time() - start)

start = time()
df["if apply"] = df["State"].apply(
    lambda x: 0
    if x == "NORMAL"
    else 1
    if x == "BROKEN"
    else 2
    if x == "RECOVERING"
    else np.nan  # np.NaN was removed in NumPy 2.0
)
print("if apply", time() - start)

start = time()
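# keys includes the invalid state "", so index() finds every value here.
# A name missing from keys would raise ValueError.
df["index map"] = df["State"].map(lambda x: keys.index(x))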
print("index map", time() - start)

start = time()
df["replace"] = df["State"].replace("NORMAL", 0)
df["replace"] = df["replace"].replace("BROKEN", 1)
df["replace"] = df["replace"].replace("RECOVERING", 2)
print("replace", time() - start)

print(df[:10])
Output:
if map 0.0060176849365234375
dict map 0.0010302066802978516
if apply 0.007014036178588867
index map 0.005976438522338867
replace 0.003970146179199219
        State   if  dict  if apply  index map replace
0  RECOVERING  2.0   2.0       2.0          2       2
1              NaN   NaN       NaN          3
2              NaN   NaN       NaN          3
3  RECOVERING  2.0   2.0       2.0          2       2
4              NaN   NaN       NaN          3
5      NORMAL  0.0   0.0       0.0          0       0
6      NORMAL  0.0   0.0       0.0          0       0
7  RECOVERING  2.0   2.0       2.0          2       2
8      BROKEN  1.0   1.0       1.0          1       1
9      BROKEN  1.0   1.0       1.0          1       1
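The mapped columns come out as floats (2.0, NaN) because NaN forces a float dtype. If you want integer codes, two options are sketched below: pandas' nullable "Int64" dtype, or filling the missing values with a sentinel. The sample Series and the -1 sentinel are my own choices for the illustration.

```python
import pandas as pd

states = {"NORMAL": 0, "BROKEN": 1, "RECOVERING": 2}

# Invented sample data; "" is not a key in states.
s = pd.Series(["RECOVERING", "", "NORMAL"])

# map() with a dict yields NaN for unknown names, so the dtype is float64.
codes = s.map(states)

# Option 1: nullable integers, missing values shown as <NA>.
nullable = codes.astype("Int64")

# Option 2: plain integers with a sentinel (-1 is an arbitrary choice).
filled = codes.fillna(-1).astype(int)

print(nullable)
print(filled)
```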


Messages In This Thread
RE: Changing a string value to a numerical value using Python code and a lambda function - by deanhystad - Jul-05-2022, 05:45 PM
