MySQL - length() vs char_length()
MysqlStringMysql Problem Overview
What's the main difference between length()
and char_length()
?
I believe it has something to do with binary and non-binary strings. Is there any practical reason to store strings as binary?
mysql> select length('MySQL'), char_length('MySQL');
+-----------------+----------------------+
| length('MySQL') | char_length('MySQL') |
+-----------------+----------------------+
| 5 | 5 |
+-----------------+----------------------+
1 row in set (0.01 sec)
Mysql Solutions
Solution 1 - Mysql
LENGTH()
returns the length of the string measured in bytes.
CHAR_LENGTH()
returns the length of the string measured in characters.
This is especially relevant for Unicode, in which most characters are encoded in two bytes. Or UTF-8, where the number of bytes varies. For example:
select length(_utf8 '€'), char_length(_utf8 '€')
--> 3, 1
As you can see the Euro sign occupies 3 bytes (it's encoded as 0xE282AC
in UTF-8) even though it's only one character.
Solution 2 - Mysql
varchar(10) will store 10 characters, which may be more than 10 bytes. In indexes, it will allocate the maximium length of the field - so if you are using UTF8-mb4, it will allocate 40 bytes for the 10 character field.
Solution 3 - Mysql
Though here answer provided by @Andomar is correct, I am tempted to provide a more detailed answer.
NAME IN ENGLISH -
Suppose I create a variable @name
storing my name -
SET @name = "Payel Senapati";
Now, I create a variable total_characters
to store the number of characters my name occupy -
SET @total_characters = CHAR_LENGTH(@name);
SELECT @total_characters;
+-------------------+
| @total_characters |
+-------------------+
| 14 |
+-------------------+
Case 1:
I create a variable @test
to store @name
converted to latin1
character set -
SET @test = CONVERT(@name USING latin1);
I create a variable @total_bytes
and store the length of @test
in terms of bytes -
SET @total_bytes = LENGTH(@test);
SELECT @total_bytes;
+--------------+
| @total_bytes |
+--------------+
| 14 |
+--------------+
Now, latin1
character set allocated 1 byte per character.
Thus, @total_characters = @total_bytes
Case 2:
Now, in variable @test
I store @name
converted to ucs2
character set -
SET @test = CONVERT(@name USING ucs2);
Now, in variable @total_bytes
I store the length of @test
in terms of bytes -
SET @total_bytes = LENGTH(@test);
SELECT @total_bytes;
+--------------+
| @total_bytes |
+--------------+
| 28 |
+--------------+
Now, ucs2
character set allocated 2 bytes per character.
Thus, 2 * @total_characters = @total_bytes
NAME IN HINDI -
Now, I store my name in variable @name
in Hindi -
SET @name = "पायल सेनापति";
Now, in variable total_characters
I store the number of characters my name occupy in Hindi -
SET @total_characters = CHAR_LENGTH(@name);
SELECT @total_characters;
+-------------------+
| @total_characters |
+-------------------+
| 14 |
+-------------------+
Case 1:
Now, in variable @test
I store @name
converted to ucs2
character set -
SET @test = CONVERT(@name USING ucs2);
Now, in variable @total_bytes
I store the length of @test
in terms of bytes -
SET @total_bytes = LENGTH(@test);
SELECT @total_bytes;
+--------------+
| @total_bytes |
+--------------+
| 28 |
+--------------+
Now, ucs2
character set allocated 2 bytes per character.
Thus, 2 * @total_characters = @total_bytes
Case 2:
Now, in variable @test
I store @name
converted to utf32
character set -
SET @test = CONVERT(@name USING utf32);
Now, in variable @total_bytes
I store the length of @test
in terms of bytes -
SET @total_bytes = LENGTH(@test);
SELECT @total_bytes;
+--------------+
| @total_bytes |
+--------------+
| 56 |
+--------------+
Now, utf32
character set allocates 4 bytes per character.
Thus, 4 * @total_characters = @total_bytes
To see all character sets supported by MySQL use -
SHOW CHARACTER SET;