 Unicode character widths
#1
while trying to print a display map of various Unicode characters paired with their UTF-8 bytes in hex, i am finding that the number of columns a character occupies is unpredictable, so i can't tell how many spaces need to follow it. it looks like i need some form of absolute cursor positioning around each character whose width i don't know (most of them).

is there a database of this info in Python, somewhere?

do the tools that manage text screen displays support the full Unicode set? they'd need to know how to handle them the right way to format the screen correctly.
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
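import wcwidth  # third-party package: pip install wcwidth


# Space characters whose display widths differ
symbols = [
    '\N{ZERO WIDTH SPACE}',
    '\N{NARROW NO-BREAK SPACE}',
    '\N{MEDIUM MATHEMATICAL SPACE}',
    '\N{IDEOGRAPHIC SPACE}',
]
for symbol in symbols:
    # wcwidth() returns the number of terminal columns the character occupies
    print(symbol, wcwidth.wcwidth(symbol))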
Output:
0   1   1   2
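If installing wcwidth is not an option, the standard library's unicodedata module can give a rough approximation. The following is only a sketch, not a replacement for wcwidth: the helper names approx_width and chart_row are made up for this example, the rule "combining marks and format characters are zero width, East Asian Wide/Fullwidth are two columns, everything else is one" is a simplifying assumption, and the terminal and font still have the final say. It also shows one way to pad each character to a fixed cell before printing its UTF-8 bytes in hex.

```python
import unicodedata


def approx_width(ch):
    """Rough terminal column count for one code point (stdlib-only heuristic)."""
    # Assumption: combining marks and format characters occupy no column.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide (W) and Fullwidth (F) characters usually take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def chart_row(ch, cell=2):
    """Build one chart line: code point, the character padded to `cell` columns, UTF-8 hex."""
    pad = " " * max(cell - approx_width(ch), 0)
    hex_bytes = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    return f"U+{ord(ch):04X} {ch}{pad} {hex_bytes}"


for ch in "A\u00e9\u200b\u3000\u4e2d":
    print(chart_row(ch))
```

For the four space characters above, this heuristic gives the same widths as wcwidth, but the two will disagree on some other characters (for example control characters), so treat it as a starting point.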
My code examples are always for Python >=3.6.0
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
it looks like Unicode has lots of oddities that will make a character code chart very hard to make. but at least my program to show UTF-8 byte codes for Unicode and beyond works (codes all the way up to 2**42 can be encoded if you don't mind having FE and FF in the results).
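The program itself is not shown in this thread, so the following is only a guess at how such an encoder could work. It assumes the usual UTF-8 bit layout is simply continued past its standard limit: a lead byte of FE followed by six continuation bytes carries 36 bits, and a lead byte of FF followed by seven carries 42 bits. The function name encode_extended is made up for this sketch. Standard UTF-8 stops at U+10FFFF and never contains FE or FF, so standard decoders will reject the longer forms.

```python
def encode_extended(cp):
    """Encode an integer below 2**42 with the UTF-8 bit layout, extended to 8 bytes.

    Non-standard sketch: sequences longer than 4 bytes are not valid UTF-8.
    """
    if not 0 <= cp < 1 << 42:
        raise ValueError("code must be in range(2**42)")
    if cp < 0x80:
        return bytes([cp])  # plain ASCII stays a single byte

    # An n-byte sequence carries max(7 - n, 0) payload bits in the lead byte
    # plus 6 bits in each of the n - 1 continuation bytes.
    n = 2
    while cp >= 1 << (max(7 - n, 0) + 6 * (n - 1)):
        n += 1

    # Lead byte: n high bits set (C0, E0, F0, F8, FC, FE, FF), then the top payload bits.
    lead = ((0xFF << (8 - n)) & 0xFF) | (cp >> (6 * (n - 1)))
    # Continuation bytes: 10xxxxxx, most significant 6-bit group first.
    tail = [0x80 | ((cp >> (6 * i)) & 0x3F) for i in range(n - 2, -1, -1)]
    return bytes([lead] + tail)


for code in (0x41, 0xE9, 0x20AC, 0x1F600, 1 << 31, (1 << 42) - 1):
    print(hex(code), encode_extended(code).hex(" ").upper())
```

For real Unicode characters (other than surrogates, which Python's own encoder refuses) the result matches str.encode("utf-8"); only the values above 0x10FFFF use the longer, non-standard forms.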
#4
Yes, Unicode is complicated.
Ever heard of ligatures? https://github.com/tonsky/FiraCode