Inserting binary data results in "Data too long for column…"

Issue

I have a MySQL table created with the following code:

CREATE TABLE `uuid` (
    `id` BINARY(16) NOT NULL,
    PRIMARY KEY (`id`) USING BTREE
)
COLLATE='utf8mb4_unicode_ci'
ENGINE=InnoDB
;

I’m trying to insert a binary value to the id record.

If I do SELECT UUID_TO_BIN('7be7daf5-73ef-4f73-a784-46ea44403293',1); it shows me Ossï{çÚõ§„FêD@2“

Now if I try to insert that output with this query INSERT INTO uuid VALUES ('Ossï{çÚõ§„FêD@2“'); I get SQL Error (1406): Data too long for column 'id' at row 1.

This question might seems wired but that is how I receive my binary data from the back-end and I’m trying to troubleshoot here. Any help why I’m getting that error please?

Thanks.

Solution

The reason for the length discrepancy is that some of the weird characters produced by your binary output are being encoded as UTF-8 characters, which take more than one byte per character.

For example, look at the first five characters, "Ossï{"

mysql> select hex(UUID_TO_BIN('7be7daf5-73ef-4f73-a784-46ea44403293',1)) as hex;
+----------------------------------+
| hex                              |
+----------------------------------+
| 4F7373EF7BE7DAF5A78446EA44403293 |
+----------------------------------+
  O s s ï {

The byte EF is the Latin1 encoding for ï, so the nearest printable character to that byte is output as ï.

But then you copy & paste the string, and it gets converted to UTF-8 encoding, where ï has a multibyte sequence.

mysql> select hex('Ossï{çÚõ§„FêD@2“') as hex;
+------------------------------------------------------+
| hex                                                  |
+------------------------------------------------------+
| 4F7373C3AF7BC3A7C39AC3B5C2A7E2809E46C3AA444032E2809C |
+------------------------------------------------------+
  O s s ï   {

You can see that ï is encoded in UTF-8 as C3AF (two bytes), followed by the { as 7B.

There are several other multibyte characters in this string; I’m only calling attention to the first one.

When you copy & paste a string with funny non-ASCII characters, it’s not as binary bytes, it’s a string of UTF-8 characters, some of which are multibyte.

If you avoid copying & pasting the string, you can get the expected length of binary bytes:

mysql> SELECT LENGTH(UUID_TO_BIN('7be7daf5-73ef-4f73-a784-46ea44403293',1)) as len;
+------+
| len  |
+------+
|   16 |
+------+

So use that when you insert into your table:

mysql> insert into uuid set id = 'Ossï{çÚõ§„FêD@2“';
ERROR 1406 (22001): Data too long for column 'id' at row 1

mysql> insert into uuid set id = UUID_TO_BIN('7be7daf5-73ef-4f73-a784-46ea44403293',1);
Query OK, 1 row affected (0.00 sec)

Answered By – Bill Karwin

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published