Mar-14-2020, 02:31 AM
You posted a while ago and got no responses, so I am going to take a stab at it. First, you need to add "header = None" to the statement that imports your data, because the first row is currently being read as the column names, which it isn't. Next, for your question, I would use the "one hot encoding" technique (you can read about it on many sites) using .map . Others will hopefully be able to give you better ideas, but this should give you a start.
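Here is a rough sketch of both steps. I am assuming you are reading the file with pandas read_csv; the sample data, the column names and the "color" categories are made up for illustration, so swap in your own. One caveat: .map on its own gives each category a single integer code, while pd.get_dummies produces the separate 0/1 columns that "one hot encoding" usually refers to, so I show both.

```python
import io
import pandas as pd

# Hypothetical stand-in for your file: no header row, last column is a category.
raw = io.StringIO("5.1,3.5,red\n4.9,3.0,blue\n6.2,2.9,red\n")

# header=None tells pandas that the first row is data, not column names.
df = pd.read_csv(raw, header=None)
df.columns = ["a", "b", "color"]  # made-up names, use your own

# Option 1: .map with a dictionary turns each category into one integer code.
color_codes = {"red": 0, "blue": 1}
df["color_code"] = df["color"].map(color_codes)

# Option 2: get_dummies builds one indicator column per category (one hot).
one_hot = pd.get_dummies(df["color"], prefix="color")

print(df)
print(one_hot)
```

Which one is better depends on what you feed the result into. Integer codes imply an ordering between categories, which the separate indicator columns avoid.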