Python Forum

Full Version: unicode weirdness
this is probably a feature. when i run this code (python3 required):
out = '\n'*8
for n in range(0,1024):
    if n%32 == 0:
        if out:
            print(out)
        out = hex(0x100000000+n)[-8:] + '  '
    if len(repr(chr(n))) < 4:
        out = out + ' ' + chr(n)
    else:
        out = out + ' .'
print(out)
it outputs the ASCII characters and then a bunch of non-ASCII characters, 32 per line. but starting at the line labeled 00000300, a few lines are shorter and more compact, despite the space the code puts before every character (see line 8 in the code). even weirder, on the lines labeled 00000320 and 00000340, the characters have backed up over the spaces that came before them.

i am wondering how i can detect this given a character code so i can format my output correctly.
I had a similar issue with printing filenames. I have some files with Chinese and other characters, which breaks the formatting.
To fit them on one line, I used wcwidth instead of the len function to get the real display width of the Unicode chars.
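Something like this is roughly what I mean. It is only a sketch: it assumes the third-party wcwidth package (pip install wcwidth) and its wcswidth() function. The helper name pad_to_width is made up for this example, and the stdlib fallback is a rough approximation I added so the snippet still runs without the package.

```python
import unicodedata

try:
    # Third-party package; assumed to be installed with `pip install wcwidth`.
    from wcwidth import wcswidth
except ImportError:
    # Rough stdlib fallback (an approximation, not equivalent to wcwidth).
    def wcswidth(text):
        width = 0
        for ch in text:
            if unicodedata.combining(ch):
                continue  # combining marks occupy no column of their own
            # East Asian Wide ('W') and Fullwidth ('F') take two columns.
            width += 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
        return width


def pad_to_width(text, columns):
    """Pad text with spaces to fill `columns` terminal cells (hypothetical helper)."""
    width = wcswidth(text)
    if width < 0:
        # wcwidth reports -1 when the string contains control characters.
        width = len(text)
    return text + ' ' * max(0, columns - width)


for name in ['report.txt', '日本語.txt']:
    # The '|' should line up in a terminal that renders CJK as double width.
    print(pad_to_width(name, 16) + '|')
```

Whether the columns really line up still depends on the terminal and font, so treat the widths as a best guess.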
i know there are some double wide characters, but the odd thing is that the space i put between them gets canceled away.
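The space is not really canceled. U+0300 through U+036F are combining diacritical marks, and a terminal draws each one on top of the character before it, which here is the space. So every ' ' + chr(n) pair takes one column instead of two. Below is a minimal sketch for detecting this from a character code using only the standard unicodedata module. The names cell_width and table_row are made up for this example, and the width rules are an approximation, since the actual rendering depends on the terminal and font.

```python
import unicodedata


def cell_width(n):
    """Approximate terminal columns used by code point n (hypothetical helper)."""
    ch = chr(n)
    category = unicodedata.category(ch)
    # Mn = nonspacing mark, Me = enclosing mark: drawn over the previous character.
    if category in ('Mn', 'Me'):
        return 0
    # Control, format, surrogate, private-use and unassigned code points.
    if category[0] == 'C':
        return -1
    # East Asian Wide ('W') and Fullwidth ('F') take two columns.
    if unicodedata.east_asian_width(ch) in ('W', 'F'):
        return 2
    return 1


def table_row(start, count=32):
    """Build one row of the table, replacing anything not one column wide with '.'."""
    cells = [chr(n) if cell_width(n) == 1 else '.'
             for n in range(start, start + count)]
    return f'{start:08x}   ' + ' '.join(cells)


for start in range(0, 1024, 32):
    print(table_row(start))
```

With this check every row has the same number of characters, and the rows from 00000300 on show '.' in place of the combining marks. To actually see a mark, it could be printed on a visible base character instead, such as the dotted circle chr(0x25cc) followed by chr(n).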