Python Forum
Unicode identifiers
#1
I found a total of 112333 Unicode characters (counting ASCII) that are valid inside an identifier, according to str.isidentifier().

Only 109808 of them are valid as a single-character identifier. The others are probably modifiers, such as digits and combining marks, that can be part of an identifier but cannot start one.

>>> len(''.join(chr(x) for x in range(0,0x110000) if ('abc'+chr(x)+'abc').isidentifier()))
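To get both numbers in one pass, something like the sketch below should work. The function name count_identifier_chars is made up for illustration, and the totals depend on the Unicode database bundled with your Python, so no expected numbers are shown.

```python
import sys


def count_identifier_chars():
    """Return (start, part) counts over every code point.

    start: characters that are a valid identifier on their own.
    part:  characters that are valid after a leading letter.
    Both rely only on str.isidentifier(), as in the one-liner above.
    """
    start = 0
    part = 0
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if ch.isidentifier():
            start += 1
        if ('a' + ch).isidentifier():
            part += 1
    return start, part


if __name__ == '__main__':
    start, part = count_identifier_chars()
    print('valid as a single character:', start)
    print('valid inside an identifier: ', part)
    print('only valid after the start: ', part - start)
```

The difference between the two counts is the set of characters that can continue an identifier but not begin one, for example '0' through '9'.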
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
I get more identifier characters. Maybe the database of allowed symbols has changed.
The versions I used report different values for unicodedata.unidata_version, but they seem to allow the same number of identifier characters.

sys.version: 3.7.3 (default, Apr 2 2019, 20:16:32) [GCC 8.2.1 20181127]
Result: 128491
unicodedata.unidata_version: 11.0.0

sys.version: 3.8.0a3+ (heads/master:ddbb978, Mar 30 2019, 13:28:31) [GCC 8.2.1 20181127]
Result: 128491
unicodedata.unidata_version: 12.0.0
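For anyone who wants to compare their own interpreter, here is a rough sketch that prints the same three values. It assumes the count is taken with the 'abc' + chr(x) + 'abc' test from post #1; identifier_report is just a name chosen for this example.

```python
import sys
import unicodedata


def identifier_report():
    """Collect the interpreter version, Unicode data version and count."""
    count = sum(
        1
        for cp in range(sys.maxunicode + 1)
        # Same test as in post #1: the character sits inside an identifier.
        if ('abc' + chr(cp) + 'abc').isidentifier()
    )
    return {
        'version': sys.version,
        'unidata_version': unicodedata.unidata_version,
        'count': count,
    }


if __name__ == '__main__':
    report = identifier_report()
    print(repr(report['version']))
    print('Result:', report['count'])
    print('unicodedata.unidata_version:', report['unidata_version'])
```

Running it under each installed Python makes it easy to see whether the count follows the Python version, the Unicode data version, or both.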
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
It could be the version of Python. I have 3.5.

