Python Forum
[pandas] Convert categorical data to numbers



[pandas] Convert categorical data to numbers - pradeep_as400 - Jun-15-2019

Hello,

I have a data frame df_train which has a column sub_division.

The values in the column look like this:

ABC_commercial
ABC_Private
Test ROM DIV
ROM DIV
TEST SEC ROM

I am trying to:
1. convert anything that starts with ABC to a number (for example, 1)
2. convert anything that contains ROM to a number (for example, 2)

Can you suggest how to do this, please?

Thanks in advance.


RE: Convert categorical data to numbers - ThomasL - Jun-15-2019

A possibility that might be useful for you:
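import pandas as pd

# Sample data using the values from the question
s = pd.Series(['ABC_commercial', 'ABC_Private', 'Test ROM DIV', 'ROM DIV', 'TEST SEC ROM'], dtype="object")
df = pd.DataFrame(s, columns=['sub_division'])

# str.find returns -1 when the substring is absent, so "> -1" is True
# wherever it occurs; multiplying by 1 turns True/False into 1/0.
# Note that this matches 'ABC_' anywhere in the string, not only at the start.
df['ABC'] = (df.sub_division.str.find('ABC_') > -1) * 1
df['ROM'] = (df.sub_division.str.find('ROM') > -1) * 1

print(df)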
Output:
     sub_division  ABC  ROM
0  ABC_commercial    1    0
1     ABC_Private    1    0
2    Test ROM DIV    0    1
3         ROM DIV    0    1
4    TEST SEC ROM    0    1
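
The approach above produces one 0/1 indicator column per pattern. If a single numeric column is wanted instead, as the question describes (1 for values starting with ABC, 2 for values containing ROM), a minimal sketch could use numpy.select. The column name sub_division_code and the fallback value 0 for rows matching neither rule are assumptions made for illustration, not something from the thread.

```python
import numpy as np
import pandas as pd

# Sample data using the values from the question
df = pd.DataFrame({
    'sub_division': ['ABC_commercial', 'ABC_Private', 'Test ROM DIV',
                     'ROM DIV', 'TEST SEC ROM']
})

# Conditions are checked in order; the first one that is True wins
conditions = [
    df['sub_division'].str.startswith('ABC'),             # rule 1: starts with ABC
    df['sub_division'].str.contains('ROM', regex=False),  # rule 2: contains ROM
]
codes = [1, 2]

# 'sub_division_code' and default=0 are illustrative choices (assumptions)
df['sub_division_code'] = np.select(conditions, codes, default=0)

print(df)
```

Because the conditions are evaluated in order, a value that both starts with ABC and contains ROM would get 1. Swap the order of the two lists if ROM should take priority.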