Python Forum
anyone here use UTF-16?
#1
does anyone here use UTF-16? i'm wondering if we even need it anymore.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
Here's a response to consider:
If you found a plant growing in your back yard that had been declared extinct, and you were pretty sure it was the last one in the world, would you pick it and throw it on the compost heap, or call the Botanical Society?
#3
(Jul-26-2018, 06:29 PM)Skaperen Wrote: does anyone here use UTF-16? i'm wondering if we even need it anymore.
actually, this week there was a thread about problems reading a text file, and it turned out that the encoding was UTF-16
https://python-forum.io/Thread-Text-file...2#pid53292
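for reference, a minimal sketch of what reading such a file looks like when the encoding is named explicitly. this is not the code from the linked thread; the helper name `read_utf16_text` and the sample file are made up for illustration, and the file is written first only so the sketch runs on its own.

```python
import os
import tempfile


def read_utf16_text(path):
    """Read a whole text file as UTF-16 (hypothetical helper for this sketch)."""
    # the "utf-16" codec uses the byte order mark (BOM) at the start
    # of the file to decide the byte order, then drops it
    with open(path, encoding="utf-16") as f:
        return f.read()


# hypothetical sample file, created here so the example is self-contained
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w", encoding="utf-16") as f:
    f.write("héllo wörld\n")

text = read_utf16_text(sample_path)

# decoding the same file as UTF-8 fails: the BOM bytes 0xFF and 0xFE
# can never appear in valid UTF-8
try:
    with open(sample_path, encoding="utf-8") as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

when a file's encoding is unknown, a `UnicodeDecodeError` at the very first byte is one hint that it may be UTF-16 with a BOM.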
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
#4
for the utf encodings there is more to it than just the "bit work" to derive the code values. they are intended for use in particular data architectures. i do know Microsoft uses UTF-16 for file names, and has architected things around two-byte units for characters, so UTF-16 is a good fit there. i don't know how they deal with UTF-16's byte order mechanisms. i have also not yet determined what they do with data. i do know that data containing Unicode that i get from people using various Microsoft systems consistently comes across as UTF-8, in single-byte units. but that doesn't tell me how they store such data in files.
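as a small illustration of the byte order mechanisms mentioned above, here is a sketch using Python's standard codecs. it says nothing about what Microsoft actually does internally; it only shows the two options UTF-16 offers: a leading byte order mark (BOM), or a byte order agreed on in advance.

```python
import codecs

text = "hi"

# "utf-16" prepends a BOM and uses the machine's native byte order
with_bom = text.encode("utf-16")

# the "-le" and "-be" variants fix the byte order and write no BOM
little = text.encode("utf-16-le")
big = text.encode("utf-16-be")

# the first two bytes of the BOM form identify the byte order
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# when decoding with "utf-16", the BOM is consumed and selects the order
decoded = with_bom.decode("utf-16")
decoded_be = (codecs.BOM_UTF16_BE + big).decode("utf-16")
```

UTF-8 has no such issue because its code units are single bytes, so there is no order to get wrong.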

but you can store UTF-16 in 32 bits. you can store UTF-8 in 32 bits, too. but i doubt they store file data that way.

what i mean in my question is whether anyone deliberately works with those UTF-16 values directly in their programming, as opposed to merely setting parameters to let something else deal with what is or is not a 16-bit or dual octet world.

IMHO, we can do without UTF-16 because there is very little history of need or use cases to put characters in 16-bit units. and UTF-16 comes with baggage ... some code points that are reserved in various ways for UTF-16. UTF-8 didn't do that. if UTF-16 had never come about the entire Unicode code space could be used as Unicode specifies, and it all can be transported in UTF-8.
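to make the "baggage" concrete, here is a rough sketch of how UTF-16 represents a code point above U+FFFF using two reserved code points (surrogates, in the range U+D800 to U+DFFF). the function name `to_surrogate_pair` is made up for this example; the arithmetic follows the standard UTF-16 scheme and is cross-checked against Python's own codec.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000        # 20 bits remain after subtracting
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1F600)   # U+1F600 is an emoji outside the 16-bit range

# cross-check against the built-in codec (big-endian, no BOM)
encoded = chr(0x1F600).encode("utf-16-be")
codec_pair = (int.from_bytes(encoded[:2], "big"),
              int.from_bytes(encoded[2:], "big"))

# a surrogate on its own is not a character, so Python's UTF-8 codec refuses it
try:
    chr(0xD800).encode("utf-8")
    lone_surrogate_ok = True
except UnicodeEncodeError:
    lone_surrogate_ok = False
```

the last part shows the cost being described: those 2048 code points exist only to serve UTF-16 and cannot carry characters of their own in any encoding.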

there are plenty of use cases for UTF-8 that UTF-16 cannot meet. but are there any use cases for UTF-16 that UTF-8 cannot meet?
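one argument that sometimes comes up here is size rather than capability. the sketch below, with sample strings picked arbitrarily for illustration, compares encoded lengths. it does not show anything UTF-8 cannot do, only that the byte counts differ depending on the script.

```python
# arbitrary sample strings, chosen only for illustration
samples = {
    "ascii": "hello world",
    "japanese": "こんにちは世界",
}

# map each name to (UTF-8 byte count, UTF-16 byte count without a BOM)
sizes = {
    name: (len(s.encode("utf-8")), len(s.encode("utf-16-le")))
    for name, s in samples.items()
}

for name, (utf8_len, utf16_len) in sizes.items():
    print(f"{name}: utf-8={utf8_len} bytes, utf-16={utf16_len} bytes")
```

ASCII text doubles in size under UTF-16, while text in these Japanese characters is about a third smaller than in UTF-8, so which one is more compact depends on the data.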

i want to have a campaign to abandon UTF-16, including such things as eventually opening up for assignment the code points that UTF-16 has caused to be reserved, like the surrogates (U+D800 through U+DFFF).