Python Forum
Converting text data into numeric to use SVM
#1
Hello,
I am very new to Python, so I apologize if this is a silly question with an easy answer. I have recently been assigned a task: take text input and apply machine learning to classify it (true or not true). I have 3 text fields, and a decision has to be made based on them. I want to use SVM, but unfortunately I don't know enough Python to convert my text data into ints. If someone can point me in the right direction (a tutorial or an example) for converting text data to ints, that would be greatly appreciated. I learn most things by taking courses on Pluralsight, but I have not figured out which course would guide me through the conversion, hence my question.
Thanks in advance for your help.
#2
(Jan-17-2019, 06:33 PM)mukhan169 Wrote: I don't know enough Python to convert my text data into ints. If someone can point me in the right direction (a tutorial or an example) for converting text data to ints, that would be greatly appreciated.

In Python, you convert a string to an int with the built-in function int().

Some examples:

>>> int('10')
10
>>> int('0010')
10
>>> int('1.0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '1.0'
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
#3
Maybe I wasn't clear in my question. I have some data in strings like "clutch repair" and "Customer complaint noise replace the ball joint". Analysts review each record to determine whether it was correct. If it was correct, they set a certain flag to True; if it wasn't, they set it to False. I have to run this data through machine learning, so I have to convert "clutch repair" and "Customer complaint noise replace the ball joint" into ints so I can run supervised machine learning on them, because it only takes integers as parameters.
#4
(Jan-18-2019, 01:28 PM)mukhan169 Wrote: Maybe I wasn't clear in my question. I have some data in strings like "clutch repair" and "Customer complaint noise replace the ball joint". Analysts review each record to determine whether it was correct. If it was correct, they set a certain flag to True; if it wasn't, they set it to False. I have to run this data through machine learning, so I have to convert "clutch repair" and "Customer complaint noise replace the ball joint" into ints so I can run supervised machine learning on them, because it only takes integers as parameters.


That doesn't make it any clearer. Do you want to replace a specific text like "Customer complaint noise replace the ball joint" with a specific int like 42? And "clutch repair" with 43 or something?
#5
The solution is to use a dictionary:
table = {
    "Customer complaint noise replace the ball joint": 42,
    "clutch repair": 43,
}

data = 'clutch repair'
print(table[data]) # prints 43
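
With many distinct strings, the table does not have to be typed by hand. Here is a minimal sketch, assuming the records are already in a Python list; the `records` sample and the `build_table` helper are made up for illustration. Each string gets the next free integer the first time it appears:

```python
def build_table(values):
    """Map each distinct string to an integer, in order of first appearance."""
    table = {}
    for value in values:
        if value not in table:
            table[value] = len(table)  # next unused integer
    return table


# Hypothetical sample records
records = [
    "clutch repair",
    "Customer complaint noise replace the ball joint",
    "clutch repair",
]

table = build_table(records)
encoded = [table[r] for r in records]
print(encoded)  # prints [0, 1, 0]
```

These integers are only arbitrary identifiers. 1 is not "greater" than 0 in any meaningful sense, so a model may read an ordering into them that is not really there.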
#6
(Jan-18-2019, 07:18 AM)perfringo Wrote:
(Jan-17-2019, 06:33 PM)mukhan169 Wrote: I don't know enough Python to convert my text data into ints. If someone can point me in the right direction (a tutorial or an example) for converting text data to ints, that would be greatly appreciated.

In Python, you convert a string to an int with the built-in function int().

Some examples:

>>> int('10')
10
>>> int('0010')
10
>>> int('1.0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '1.0'
Yes, I am, but I want to use tf and idf (TF-IDF). I can't hard-code a number for each string, as there will be over 100k records. Once I have converted the strings to TF-IDF values, I should be able to use the SVM algorithm.
Does it make any more sense?
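
To make the idea concrete, here is a rough sketch of what a TF-IDF conversion computes, using only the standard library. It assumes the plain textbook weighting (term frequency times log(N / document frequency)) and simple whitespace tokenization; the `tfidf` helper and the sample `documents` are made up for illustration, and real libraries typically use smoothed variants of the formula.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return (vocabulary, vectors): one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word occur?
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty text
        vectors.append([
            (counts[word] / total) * math.log(n_docs / df[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical sample records
documents = [
    "clutch repair",
    "Customer complaint noise replace the ball joint",
    "clutch noise",
]

vocabulary, vectors = tfidf(documents)
for doc, vec in zip(documents, vectors):
    print(doc, [round(x, 3) for x in vec])
```

Every record becomes a list of numbers of the same length (one slot per word in the vocabulary), so nothing has to be hard-coded per string. Words that occur in few records get a higher weight than words that occur in many.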
#7
I guess something like this, but a lot simpler. Again, I am very new to this.
https://www.kaggle.com/sudhirnl7/simple-...es-xgboost
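
A simpler version might look like the sketch below, assuming scikit-learn is installed. It joins the three text fields into one string per record, turns the text into TF-IDF features with `TfidfVectorizer`, and trains a linear SVM with `LinearSVC`. The `rows` data is invented for illustration, and joining the fields is just one possible choice, not necessarily the best one for real data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical records: three text fields plus the analyst's True/False flag
rows = [
    ("clutch repair", "clutch slipping", "replaced clutch", True),
    ("Customer complaint noise replace the ball joint",
     "noise in front", "replaced ball joint", True),
    ("clutch repair", "noise in front", "replaced wiper blades", False),
    ("brake repair", "brake squeal", "replaced radio", False),
]

# Join the three text fields into a single string per record
texts = [" ".join(fields) for *fields, _ in rows]
labels = [flag for *_, flag in rows]

# The vectorizer converts text to TF-IDF numbers; the SVM learns from them
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

# Predict the flag for a new, unseen record
print(model.predict(["clutch slipping replaced clutch"]))
```

With over 100k real records, part of the data should be held out for testing (for example with `sklearn.model_selection.train_test_split`) so the accuracy is measured on records the model has not seen.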