Python Forum
Who converts data when writing to a database with an encoding different from utf8?
#1
Python 3.7.2

I write strings from my Python code into my database. The strings contain Latin and Cyrillic characters, so the database uses the single-byte encoding koi8-r. Surprisingly, the strings are written to the database without distortion, even though utf8 and koi8-r order their characters completely differently (unlike, for example, ascii and utf8). Occasionally characters from other scripts appear in the text, and then write errors occur.

This raises two questions:
1. Who converts the strings: the database, or the aiomysql library that I use to write to it?
2. How can I quickly remove non-koi8-r characters, in Python or MariaDB, to avoid the errors?

Thank you in advance for participating in the conversation.

Perhaps there are databases that support an "economical" (for my case) multibyte encoding that stores Latin and Cyrillic characters in a single byte and characters from other scripts in additional bytes?
#2
If I understood the code correctly, the query is implicitly encoded to utf8mb4 if no other default encoding has been set.
Note that utf8mb4 is a variable-width encoding: each character takes between 1 and 4 bytes (1 for ASCII, 2 for Cyrillic), not a fixed 4 bytes.
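
To illustrate the byte counts, here is a minimal sketch using only Python's built-in codecs. The helper name `utf8_bytes_per_char` and the sample strings are made up for the illustration; Python's `utf-8` codec produces the same byte sequences that MySQL calls utf8mb4.

```python
def utf8_bytes_per_char(text: str) -> list[int]:
    """Return the number of UTF-8 bytes each character of `text` occupies."""
    return [len(ch.encode("utf-8")) for ch in text]


# One ASCII letter, one Cyrillic letter, the euro sign, and an emoji.
sizes = utf8_bytes_per_char("aя€😀")

# The same Cyrillic word in both encodings.
word = "Привет"
utf8_size = len(word.encode("utf-8"))    # 2 bytes per Cyrillic letter
koi8r_size = len(word.encode("koi8_r"))  # 1 byte per Cyrillic letter

print(sizes, utf8_size, koi8r_size)
```

So for text that is mostly Cyrillic, koi8-r needs about half the bytes of utf8, while Latin text costs the same in both.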

Where the encoding happens: https://github.com/aio-libs/aiomysql/blo...on.py#L426
DEFAULT_CHARSET from external dependency: https://github.com/PyMySQL/PyMySQL/blob/...ons.py#L91

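If you have a str, it is already Unicode text: a Python 3 str is a sequence of code points, not bytes in utf8 or any other particular encoding.
If your input was produced in koi8-r, a conversion must happen somewhere.
For example, if you enter form data on a web page, you should receive the parameters as raw bytes.
They then need to be decoded to be represented as str.

If the query is a str, it is automatically encoded to utf8mb4, unless another encoding has been set somewhere.
The MySQL/MariaDB server then converts the data from the connection character set to the column's character set (koi8r in your case), which would explain why Cyrillic text arrives intact and why characters that koi8r cannot represent cause errors.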
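
A minimal sketch of that decoding step, assuming the raw bytes really are koi8-r. The function name `decode_form_value` and the sample value are hypothetical, not part of any web framework.

```python
def decode_form_value(raw: bytes, encoding: str = "koi8_r") -> str:
    """Decode raw request bytes into a str using the given encoding."""
    return raw.decode(encoding)


# Simulate bytes arriving from a client that sent koi8-r.
raw = "Привет".encode("koi8_r")
text = decode_form_value(raw)

# The same text gives different bytes in utf-8, so the encoding
# used for decoding has to match the one used by the sender.
same_bytes_as_utf8 = raw == text.encode("utf-8")

print(text, len(raw), same_bytes_as_utf8)
```

Once decoded, `text` is an ordinary str and no longer tied to koi8-r; it can be re-encoded to whatever the connection expects.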
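
For the second question, one possible approach on the Python side is to round-trip the string through the `koi8_r` codec before sending it to the database. This is only a sketch: the function names are made up, and whether to drop or replace the offending characters depends on your data.

```python
def strip_non_koi8r(text: str) -> str:
    """Drop every character that cannot be represented in koi8-r."""
    return text.encode("koi8_r", errors="ignore").decode("koi8_r")


def find_non_koi8r(text: str) -> list[str]:
    """List the characters of `text` that koi8-r cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode("koi8_r")
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


sample = "Hello Привет 你好"
cleaned = strip_non_koi8r(sample)
rejected = find_non_koi8r(sample)

print(repr(cleaned), rejected)
```

Passing `errors="replace"` instead of `errors="ignore"` would put a `?` in place of each unsupported character, which makes the loss visible in the stored data.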