Jan-06-2020, 02:21 PM
Hi, thanks for answering! My intention was to remove all the Unicode characters by normalizing the text, and then to split the text into sentences and store them in an array, which is why I used the sent_tokenize method. I don't know how to do that while preserving the normalized text.
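To make the goal concrete, here is a minimal sketch of the order of operations I have in mind: normalize first, then tokenize the normalized string, so the sentences in the list keep the normalization. The function names (`strip_non_ascii`, `simple_sent_split`, `normalized_sentences`) are made up for illustration, and the regex splitter is only a naive stand-in so the snippet runs without extra downloads. Assuming NLTK and its punkt data are installed, `nltk.sent_tokenize` could be passed as the `tokenizer` argument instead.

```python
import re
import unicodedata


def strip_non_ascii(text):
    """Decompose accented characters (NFKD), then drop anything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def simple_sent_split(text):
    """Naive stand-in for a sentence tokenizer: split after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def normalized_sentences(text, tokenizer=simple_sent_split):
    # Normalize first, then tokenize the *normalized* string,
    # so every sentence in the resulting list is already normalized.
    return tokenizer(strip_non_ascii(text))


sentences = normalized_sentences("Café déjà vu. Naïve résumé! Is it ok?")
print(sentences)  # ['Cafe deja vu.', 'Naive resume!', 'Is it ok?']
```

Note that this approach silently drops characters with no ASCII decomposition, which may or may not be what is wanted.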