Use of 'use utf8;' gives me 'Wide character in print'


Perl Problem Overview


If I run the following Perl program:

perl -e 'use utf8; print "鸡\n";'

I get this warning:

Wide character in print at -e line 1.

If I run this Perl program:

perl -e 'print "鸡\n";'

I do not get a warning.

I thought use utf8 was required to use UTF-8 characters in a Perl script. Why does this not work, and how can I fix it? I'm using Perl 5.16.2. I have the same issue if the code is in a file instead of a one-liner on the command line.

Perl Solutions


Solution 1 - Perl

Without use utf8 Perl interprets your string as a sequence of single byte characters. There are four bytes in your string as you can see from this:

$ perl -E 'say join ":", map { ord } split //, "鸡\n";'
233:184:161:10

The first three bytes make up your character, the last one is the line-feed.

The call to print sends these four bytes to STDOUT unchanged. Your console then works out how to display them. If your console is set to use UTF-8, it will interpret the first three bytes as your single character, and that is what is displayed.

If we add in the utf8 module, things are different. In this case, Perl interprets your string as just two characters.

$ perl -Mutf8 -E 'say join ":", map { ord } split //, "鸡\n";'
40481:10

By default, Perl's IO layer assumes that it is working with single-byte characters. So when you try to print a character whose code point is above 255 (a "wide character"), Perl thinks that something is wrong and gives you a warning. As ever, you can get more explanation for this warning by including use diagnostics. It will say this:

> (S utf8) Perl met a wide character (>255) when it wasn't expecting one. This warning is by default on for I/O (like print). The easiest way to quiet this warning is simply to add the :utf8 layer to the output, e.g. binmode STDOUT, ':utf8'. Another way to turn off the warning is to add no warnings 'utf8'; but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and perlfunc/binmode.
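To make the "wide character" idea concrete, here is a minimal sketch of what an encoding layer does on output: it turns one character into several bytes. It uses the core Encode module directly instead of a filehandle layer, and writes the character as the escape "\x{9E21}" (code point 40481) so the example does not depend on how the file itself is saved. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{9E21}" is code point 40481, the character from the question.
my $chars = "\x{9E21}";

# Encoding converts the character string into a UTF-8 byte string.
my $bytes = encode('UTF-8', $chars);

# One character becomes three bytes once encoded.
my $summary = sprintf "%d character -> %d bytes: %s",
    length($chars), length($bytes),
    join ':', map { ord } split //, $bytes;

print "$summary\n";   # 1 character -> 3 bytes: 233:184:161
```

These are the same three byte values that appear in the output of the first command above.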

As others have pointed out, you need to tell Perl to accept multi-byte output. There are many ways to do this (see the Perl Unicode Tutorial for some examples). One of the simplest is the -CS command line flag, which tells the three standard filehandles (STDIN, STDOUT and STDERR) to deal with UTF-8.

$ perl -Mutf8 -e 'print "鸡\n";'
Wide character in print at -e line 1.

vs

$ perl -Mutf8 -CS -e 'print "鸡\n";'

Unicode is a big and complex area. As you've seen, many simple programs appear to do the right thing, but for the wrong reasons. When you start to fix part of the program, things will often get worse until you've fixed all of the program.

Solution 2 - Perl

All use utf8; does is tell Perl the source code is encoded using UTF-8. You need to tell Perl how to encode your text:

use open ':std', ':encoding(UTF-8)';
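As a rough sketch of how that line fits into a complete script (assuming the file is saved as UTF-8), the following combines use utf8 with the open pragma, captures any warnings in a handler, and lists the layers on STDOUT. The @warnings and @layers variables are only there for illustration.

```perl
use strict;
use warnings;
use utf8;                               # the source code is UTF-8
use open ':std', ':encoding(UTF-8)';    # STDIN, STDOUT, STDERR use UTF-8

# Collect warnings instead of printing them, so we can count them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Ask Perl which IO layers are currently on STDOUT.
my @layers = PerlIO::get_layers(STDOUT);

print "鸡\n";
print "STDOUT layers: @layers\n";
print "warnings: ", scalar(@warnings), "\n";
```

With the encoding layer in place, the print should no longer produce the "Wide character" warning.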

Solution 3 - Perl

Encode all standard output as UTF-8:

binmode STDOUT, ":utf8";
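The effect of binmode can be seen on any filehandle, not just STDOUT. The sketch below is an illustration under assumptions: it uses a temporary file from the core File::Temp module and a warning handler to compare printing before and after the :utf8 layer is added.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Collect warnings instead of printing them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($fh, $path) = tempfile(UNLINK => 1);

print {$fh} "\x{9E21}\n";   # no layer yet: "Wide character in print"
binmode $fh, ':utf8';       # from here on, output is encoded as UTF-8
print {$fh} "\x{9E21}\n";   # no warning this time
close $fh or die "close failed: $!";

print "warnings seen: ", scalar(@warnings), "\n";   # warnings seen: 1
```

Only the print issued before binmode warns.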

Solution 4 - Perl

You can get close to "just do utf8 everywhere" by using the CPAN module utf8::all.

perl -Mutf8::all -e 'print "鸡\n";'

When print receives something that it can't print (a character with a code point above 255 when no :encoding layer is provided), it assumes you meant to encode it using UTF-8. It does so, after warning about the problem.
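That fallback can be checked by writing a wide character to a handle with no encoding layer and reading the raw bytes back. This is a minimal sketch; the temporary file and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Collect warnings instead of printing them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;                 # raw bytes, no encoding layer
print {$fh} "\x{9E21}\n";    # warns, then writes UTF-8 anyway
close $fh or die "close failed: $!";

# Read the file back as raw bytes.
open my $in, '<:raw', $path or die "open failed: $!";
my $bytes = do { local $/; <$in> };
close $in;

my $byte_list = join ':', map { ord } split //, $bytes;
print "warnings: ", scalar(@warnings), "\n";   # warnings: 1
print "bytes written: $byte_list\n";           # bytes written: 233:184:161:10
```

The bytes on disk are the UTF-8 encoding of the character, which is why the original one-liner still displays correctly on a UTF-8 console despite the warning.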

Solution 5 - Perl

You can use this:

perl -CS filename

It will also get rid of that warning.
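To check whether the flag actually took effect, a script can inspect the special variable ${^UNICODE}, which holds the numeric value of the -C switch. In the sketch below, the bit values 1, 2 and 4 for STDIN, STDOUT and STDERR follow the perlrun documentation; the %handle_bit hash and the output format are made up for illustration.

```perl
use strict;
use warnings;

# ${^UNICODE} reflects the -C switch (or the PERL_UNICODE variable).
my $flags = ${^UNICODE};

# Bits documented in perlrun: I = 1, O = 2, E = 4 (S is all three).
my %handle_bit = (STDIN => 1, STDOUT => 2, STDERR => 4);

my %mode;
for my $name (sort keys %handle_bit) {
    $mode{$name} = ($flags & $handle_bit{$name}) ? 'UTF-8' : 'bytes';
    printf "%-6s %s\n", $name, $mode{$name};
}
```

Run as perl -CS filename, all three handles should be reported as UTF-8; without the flag they should be reported as bytes.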

Solution 6 - Perl

When working with Spanish text, you can also hit this error if, in addition to using:

use utf8;

your editor saves the file in a different encoding. What you see in the editor is then not what Perl reads. To fix the error, change the editor's encoding to Unicode/UTF-8.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Eric Johnson | View Question on Stackoverflow
Solution 1 - Perl | Dave Cross | View Answer on Stackoverflow
Solution 2 - Perl | ikegami | View Answer on Stackoverflow
Solution 3 - Perl | Boris Ivanov | View Answer on Stackoverflow
Solution 4 - Perl | Joel Berger | View Answer on Stackoverflow
Solution 5 - Perl | Karthikeyan.R.S | View Answer on Stackoverflow
Solution 6 - Perl | DiegoAr | View Answer on Stackoverflow