I'm a bit confused by the terminology used: tokens, windows, words, slices of letters...
But this is what I did: I created a mock text, in this case the numbers 0 to 2999 as strings, so the overlap is easy to see.
Then I sliced it into segments of up to 999 characters, starting a new segment every 500 characters, based on what I proposed in post #2.
Instead of printing the segment, do your calculations on it.
Hope this helps.
Paul
totalText = ''
for x in range(3000):
    totalText += str(x) + ' '

# Start a new segment every 500 characters; each one is up to 999 characters long.
for x in range(0, len(totalText), 500):
    segment = totalText[x:x+999]
    print(f'Slice length: {len(segment)}')
    print(segment)
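If you want to run calculations instead of printing, the same loop could be wrapped in a small generator. This is only a sketch: the function name overlapping_segments, its size and step parameters, and the word count used as the "calculation" are my own placeholders, not something from the earlier posts.

```python
def overlapping_segments(text, size=999, step=500):
    """Yield (start, segment) pairs.

    A new segment starts every `step` characters and is at most `size`
    characters long, so neighbouring segments share size - step characters.
    """
    for start in range(0, len(text), step):
        yield start, text[start:start + size]


# Same mock text as before: the numbers 0..2999, each followed by a space.
mock_text = ''.join(f'{n} ' for n in range(3000))

# Placeholder calculation: count the whitespace-separated items per segment.
# Replace len(segment.split()) with whatever you need to compute.
counts = [len(segment.split()) for _, segment in overlapping_segments(mock_text)]
print(f'{len(counts)} segments processed')
```

Note that the last segment will usually be shorter than 999 characters, so make sure your calculation can cope with that.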
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.