Python Forum
unicode weirdness
#1
this is probably a feature. when i run this code (python3 required):
out = '\n'*8
for n in range(0,1024):
    if n%32 == 0:
        if out:
            print(out)
        out = hex(0x100000000+n)[-8:] + '  '
    if chr(n).isprintable():  # the repr() length test also rejected the backslash
        out = out + ' ' + chr(n)
    else:
        out = out + ' .'
print(out)
it outputs a bunch of ASCII then a bunch more non-ASCII characters. but starting at line 00000300 a few lines are shorter and more compact, despite the space included between characters (see line 8 in the code). even weirder, at 00000320 and 00000340, characters have backed up over the spaces that came before them.

i am wondering how i can detect this given a character code so i can format my output correctly.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
I had a similar issue with printing filenames. I have some files with Chinese and other characters, which breaks the formatting.
To fit them in one line, I had to use wcwidth to get the real width of the Unicode chars.
Instead of using the len function to get the width, I used wcwidth.
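For anyone without the third-party wcwidth package, here is a rough stdlib-only sketch of the same idea. It is an approximation, not a replacement for wcwidth: the names `display_width` and `pad_to` are made up for this example, it assumes East Asian wide and fullwidth characters take two columns and nonspacing or enclosing marks take none, and it does not try to handle control characters or emoji sequences.

```python
import unicodedata


def display_width(text):
    """Estimate how many terminal columns `text` occupies (approximation)."""
    width = 0
    for ch in text:
        # Nonspacing (Mn) and enclosing (Me) marks attach to the previous
        # character and take no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue
        # East Asian wide (W) and fullwidth (F) characters take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def pad_to(text, columns):
    """Right-pad `text` with spaces up to `columns` display columns."""
    return text + " " * max(0, columns - display_width(text))


if __name__ == "__main__":
    for name in ("readme.txt", "中文.txt", "cafe\u0301.txt"):
        print(pad_to(name, 16) + "|")
```

The point is only that padding should be computed from the estimated column count rather than from len(), which counts code points.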
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
i know there are some double wide characters, but the odd thing is that the space i put between them gets canceled away.
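The code points U+0300 through U+036F are the Combining Diacritical Marks block. A combining mark has no width of its own and is drawn on the character before it, which in the loop above is the space, so each space-plus-mark pair collapses into a single cell. A minimal sketch of how this could be detected from a character code, using the stdlib unicodedata module; the helper names `is_combining_mark` and `cell` are invented for this example, and placing the mark on a dotted circle (U+25CC) is just one possible way to show it, whose rendering depends on the terminal and font.

```python
import unicodedata


def is_combining_mark(code):
    """True if the code point is a nonspacing (Mn) or enclosing (Me) mark."""
    return unicodedata.category(chr(code)) in ("Mn", "Me")


def cell(code, placeholder="."):
    """Return the text to append to the dump line for one code point."""
    ch = chr(code)
    if is_combining_mark(code):
        # Give the mark a visible base so it does not attach to the space.
        return " \u25cc" + ch
    if ch.isprintable():
        return " " + ch
    return " " + placeholder


if __name__ == "__main__":
    # Dump only the rows around the combining block as a quick check.
    for start in range(0x2E0, 0x380, 32):
        row = "".join(cell(n) for n in range(start, start + 32))
        print(format(start, "08x") + " " + row)
```

Replacing the mark with the placeholder instead of the dotted circle would also keep the columns aligned, at the cost of hiding which mark it is.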