[pandas] Convert categorical data to numbers - pradeep_as400 - Jun-15-2019

Hello,

I have a data frame df_train which has a column sub_division. The values in the column look like this:

    ABC_commercial
    ABC_Private
    Test ROM DIV
    ROM DIV
    TEST SEC ROM

I am trying to:

1. convert anything that starts with ABC to a number (for example, 1)
2. convert anything that contains ROM to a number (for example, 2)

Can you suggest how to do this, please? Thanks in advance.

RE: Convert categorical data to numbers - ThomasL - Jun-15-2019

One possibility that might be useful for you:

    import pandas as pd

    s = pd.Series(['ABC_commercial', 'ABC_Private', 'Test ROM DIV',
                   'ROM DIV', 'TEST SEC ROM'], dtype="object")
    df = pd.DataFrame(s, columns=['sub_division'])
    # str.find returns -1 when the substring is absent, so "> -1" means
    # "found anywhere in the string"; multiplying by 1 turns True/False into 1/0.
    df['ABC'] = (df.sub_division.str.find('ABC_') > -1) * 1
    df['ROM'] = (df.sub_division.str.find('ROM') > -1) * 1
    print(df)

Output:

         sub_division  ABC  ROM
    0  ABC_commercial    1    0
    1     ABC_Private    1    0
    2    Test ROM DIV    0    1
    3         ROM DIV    0    1
    4    TEST SEC ROM    0    1
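The question asks for a single number per row (1 for values starting with ABC, 2 for values containing ROM) rather than one indicator column per pattern. A minimal sketch of that variant follows, using `Series.str.startswith`, `Series.str.contains` and `numpy.select`. The column name `sub_division_code` and the fallback code 0 for rows that match neither rule are assumptions made for illustration; they do not appear in the thread.

```python
import numpy as np
import pandas as pd

# Sample values taken from the question.
df_train = pd.DataFrame({
    "sub_division": [
        "ABC_commercial",
        "ABC_Private",
        "Test ROM DIV",
        "ROM DIV",
        "TEST SEC ROM",
    ]
})

col = df_train["sub_division"]

# Conditions are checked in order; the first one that matches wins.
# na=False treats missing values as "no match" instead of propagating NaN.
conditions = [
    col.str.startswith("ABC", na=False),             # rule 1: starts with ABC
    col.str.contains("ROM", regex=False, na=False),  # rule 2: contains ROM
]
codes = [1, 2]

# Assumption: rows matching neither rule get 0.
# "sub_division_code" is a hypothetical column name.
df_train["sub_division_code"] = np.select(conditions, codes, default=0)

print(df_train)
```

Because `numpy.select` takes the first matching condition, a value such as `ABC ROM` would get code 1 under this ordering; swap the two conditions if the ROM rule should take precedence.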