Python Forum

Full Version: Issue on text decoding and encoding problem


Hello,

When I read data from the text, it is not read properly as it is. Please tell me how I can solve it.

sentences = nltk.sent_tokenize(texts.decode('utf-8'))

Output:
u2013specific

Thanks for your help.

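Looks like the text contains Unicode escape sequences (such as \u2013, the en dash) that were never un-escaped, so decoding as UTF-8 leaves them as literal characters. What is the output of print repr(texts)?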
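If repr(texts) does show a literal backslash followed by u2013, something like the following might help. This is only a sketch: the sample bytes in raw are made up for illustration, and it assumes the data is plain ASCII apart from the \uXXXX escapes (the unicode_escape codec treats any non-ASCII bytes as Latin-1, so it would mangle real UTF-8 text).

```python
# Hypothetical sample: the bytes hold a literal backslash-u escape,
# not an actual en dash character.
raw = b"domain\\u2013specific"

# repr() makes the stray backslash visible.
print(repr(raw))

# Assumption: the input is ASCII plus \uXXXX escapes only.
# The built-in unicode_escape codec turns each escape into the real character.
fixed = raw.decode("unicode_escape")
print(repr(fixed))
```

The resulting string can then be passed to nltk.sent_tokenize in place of texts.decode('utf-8').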

desul