Thread: lexical diversity calculation (/thread-27886.html) - Python Forum (https://python-forum.io)
lexical diversity calculation - AOCL1234 - Jun-25-2020

How does one limit text length when calculating lexical diversity? For instance, say I would like to calculate the lexical diversity of Text A. While its total text length is 10,000, I would like to consider only the first 1,000 tokens of Text A. Thank you.

RE: lexical diversity calculation - Larz60+ - Jun-26-2020

You can slice off what you want:

>>> zz = 'This is a rather long rambling, uninformative sentence.'
>>> zstr = zz[:20]
>>> len(zstr)
20
>>> zstr
'This is a rather lon'

But that slices characters, not tokens, so it is not exactly what you are asking for. To keep only the first few tokens, split the text first and slice the resulting list:

>>> tokens = zz.split()
>>> tokens
['This', 'is', 'a', 'rather', 'long', 'rambling,', 'uninformative', 'sentence.']
>>> yy = ' '.join(tokens[:5])
>>> yy
'This is a rather long'
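To tie the token slice back to the original question, here is a minimal sketch. It assumes "lexical diversity" means the type-token ratio (unique tokens divided by total tokens) and that plain whitespace splitting is an acceptable tokenizer. The thread confirms neither, and the function name `lexical_diversity` and its `max_tokens` parameter are made up for illustration.

```python
def lexical_diversity(text, max_tokens=None):
    """Type-token ratio over the first max_tokens whitespace-separated tokens.

    Assumption: lexical diversity = unique tokens / total tokens.
    With max_tokens=None the whole text is used.
    """
    tokens = text.split()
    if max_tokens is not None:
        tokens = tokens[:max_tokens]  # keep only the leading tokens
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    return len(set(tokens)) / len(tokens)


sample = "the cat sat on the mat and the dog sat on the rug"
full = lexical_diversity(sample)                   # all 13 tokens
first_five = lexical_diversity(sample, max_tokens=5)  # "the cat sat on the"
print(full, first_five)
```

For the case in the question you would call something like `lexical_diversity(text_a, max_tokens=1000)`. Note that `split()` leaves punctuation attached ('rambling,' and 'rambling' count as different types) and is case-sensitive, so you may want to lowercase the text or use a proper tokenizer first.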