Python Forum
Converting text data into numeric to use SVM - Printable Version


Converting text data into numeric to use SVM - mukhan169 - Jan-17-2019

Hello,
I am very new to Python, so I apologize if this is a silly question with an easy answer. I have recently been assigned a task: take text input and apply machine learning to classify it (true or not true). I have three text fields, and a decision has to be made based on them. I want to use an SVM, but unfortunately I don't know enough Python to convert my text data into integers. If someone could point me in the right direction (a tutorial or an example) for converting text data to integers, it would be greatly appreciated. I learn most things by taking courses on Pluralsight, but I have not figured out which course would guide me through the conversion, hence my question.
Thanks in advance for your help.


RE: Converting text data into numeric to use SVM - perfringo - Jan-18-2019

(Jan-17-2019, 06:33 PM)mukhan169 Wrote: I don't know enough Python to convert my text data into integers. If someone could point me in the right direction (a tutorial or an example) for converting text data to integers, it would be greatly appreciated.

In Python, you convert a string to an int with the built-in function int().

Some examples:

>>> int('10')
10
>>> int('0010')
10
>>> int('1.0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '1.0'
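
As the last example shows, int() rejects decimal strings such as '1.0'. A minimal sketch of one way around that, assuming truncation toward zero is acceptable; the helper name to_int is hypothetical and not part of Python:

```python
def to_int(text):
    """Convert a numeric string to int, also accepting decimal strings like '1.0'."""
    try:
        return int(text)
    except ValueError:
        # Fall back to float parsing, then truncate toward zero.
        # Non-numeric text such as 'clutch repair' still raises ValueError here.
        return int(float(text))


print(to_int('10'))
print(to_int('1.0'))
```

This only helps with strings that already look like numbers; free text still cannot be converted this way.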



RE: Converting text data into numeric to use SVM - mukhan169 - Jan-18-2019

Maybe I wasn't clear in my question. I have string data such as "clutch repair" and "Customer complaint noise replace the ball joint". Analysts review each record to determine whether it was correct: if it was, they set a flag to True; if it wasn't, they set it to False. I have to run this data through machine learning, so I need to convert "clutch repair" and "Customer complaint noise replace the ball joint" into integers, because the supervised learning algorithm only takes numbers as input.


RE: Converting text data into numeric to use SVM - perfringo - Jan-18-2019

(Jan-18-2019, 01:28 PM)mukhan169 Wrote: Maybe I wasn't clear in my question. I have string data such as "clutch repair" and "Customer complaint noise replace the ball joint". Analysts review each record to determine whether it was correct: if it was, they set a flag to True; if it wasn't, they set it to False. I have to run this data through machine learning, so I need to convert "clutch repair" and "Customer complaint noise replace the ball joint" into integers, because the supervised learning algorithm only takes numbers as input.


That doesn't make it any clearer. Do you want to replace specific text like "Customer complaint noise replace the ball joint" with a specific int like 42, and "clutch repair" with 43 or something?


RE: Converting text data into numeric to use SVM - Gribouillis - Jan-18-2019

The solution is to use a dictionary:
table = {
    "Customer complaint noise replace the ball joint": 42,
    "clutch repair": 43,
}

data = 'clutch repair'
print(table[data]) # prints 43
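
With many records, writing such a table by hand is not practical. A minimal sketch of building it from the data instead; the helper name build_table and the sample records list are assumptions for illustration. Note that the resulting integers are arbitrary IDs: two strings that share words get unrelated codes, so this suits category labels better than free text.

```python
def build_table(texts):
    """Assign each distinct string the next unused integer, in order of first appearance."""
    table = {}
    for text in texts:
        if text not in table:
            table[text] = len(table)
    return table


# Hypothetical sample records; real data would come from a file or database.
records = [
    "clutch repair",
    "Customer complaint noise replace the ball joint",
    "clutch repair",
]

table = build_table(records)
codes = [table[text] for text in records]
print(codes)
```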



RE: Converting text data into numeric to use SVM - mukhan169 - Jan-22-2019

Yes, I do, but I want to use TF-IDF (term frequency and inverse document frequency). I can't hard-code a number for each string, as there will be over 100k records. Once I have converted the strings to TF-IDF features, I should be able to use the SVM algorithm.
Does that make any more sense?
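
To make the idea concrete, here is a minimal pure-Python sketch of TF-IDF, assuming the plain textbook definition: term frequency is a word's count divided by the record's length, and inverse document frequency is log(N / number of records containing the word). The function name tfidf_vectors and the third sample record are made up for illustration. Library implementations such as scikit-learn's TfidfVectorizer use a smoothed and normalized variant by default, so their numbers will differ.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, rows); rows[i][j] is the TF-IDF weight of vocabulary[j] in documents[i]."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(tokenized)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))

    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = []
        for word in vocabulary:
            tf = counts[word] / len(tokens) if tokens else 0.0
            idf = math.log(n_docs / doc_freq[word])
            row.append(tf * idf)
        rows.append(row)
    return vocabulary, rows


# The first two strings come from the thread; the third is a made-up record.
records = [
    "clutch repair",
    "Customer complaint noise replace the ball joint",
    "replace the clutch",
]

vocabulary, rows = tfidf_vectors(records)
for record, row in zip(records, rows):
    print(record, [round(weight, 3) for weight in row])
```

Each record becomes a list of numbers of the same length, which is the kind of input an SVM expects. Unlike a lookup table, records that share words get overlapping nonzero entries.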


RE: Converting text data into numeric to use SVM - mukhan169 - Jan-23-2019

I guess something like this, but a lot simpler. Again, I am very new to this.
https://www.kaggle.com/sudhirnl7/simple-naive-bayes-xgboost
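
A simpler version of that idea could look like the sketch below. It assumes scikit-learn is installed, uses LinearSVC as one possible SVM variant, and joins the three text fields into a single string per record; the field contents and flags are invented sample data, not real records.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical records: three text fields plus the analyst's flag (1 = True, 0 = False).
records = [
    ("clutch repair", "clutch slipping", "replaced clutch", 1),
    ("Customer complaint noise replace the ball joint", "noise from front", "replaced ball joint", 1),
    ("clutch repair", "noise from front", "replaced wiper blades", 0),
    ("ball joint replacement", "clutch slipping", "replaced battery", 0),
]

# Combine the three text fields of each record into one string.
texts = [" ".join(fields) for *fields, flag in records]
labels = [flag for *fields, flag in records]

# TfidfVectorizer turns text into numeric features; LinearSVC learns from them.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

# Classify a new (also hypothetical) record.
new_record = " ".join(("clutch repair", "clutch slipping", "replaced clutch"))
prediction = model.predict([new_record])
print(prediction)
```

Four records are far too few to learn anything meaningful; with the real 100k records, the data would normally be split into training and test sets to check how well the model generalizes.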