How to write unicode strings into a file?
Python Problem Overview
I am using Python 2.6.5 and want to write some Japanese characters to a file. I get the error below and don't know how to change the encoding.
Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
>>> s = u'\u5E73\u621015'
>>> with open("yop", "wb") as f:
...     f.write( s + "\n" );
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> type( s )
<type 'unicode'>
Python Solutions
Solution 1 - Python
You have to encode the Unicode string into bytes before writing it:
s = u'\u5E73\u621015'
with open("yop", "wb") as f:
    f.write(s.encode("UTF-8"))
For a friendly introduction to Unicode in Python, try this: http://farmdev.com/talks/unicode/
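The snippet in the question also appends a newline, which the code above leaves out. A minimal sketch of the same approach with the newline included, plus a read-back to check the result, might look like this. The temporary directory is an assumption made so the example does not clutter the working directory; the original writes to "yop" in the current directory.

```python
import os
import tempfile

s = u'\u5E73\u621015'

# Hypothetical scratch location; the original answer uses "yop" in the
# current directory.
path = os.path.join(tempfile.mkdtemp(), "yop")

# Build the whole line as Unicode first, then encode it once.
# Mixing unicode and byte strings is what triggers the implicit
# ASCII encoding in Python 2.
with open(path, "wb") as f:
    f.write((s + u"\n").encode("utf-8"))

# Read the raw bytes back and decode them to confirm the round trip.
with open(path, "rb") as f:
    raw = f.read()

decoded = raw.decode("utf-8")
```

The two Japanese characters take three bytes each in UTF-8, so the file ends up longer in bytes than the string is in characters.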
Solution 2 - Python
As an alternative, you can use the codecs module:
import codecs
s = u'\u5E73\u621015'
with codecs.open("yop", "w", encoding="utf-8") as f:
    f.write(s)
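Reading the file back works the same way: codecs.open() decodes on read, so you get unicode objects rather than byte strings. A rough sketch, assuming a temporary path instead of "yop":

```python
import codecs
import os
import tempfile

s = u'\u5E73\u621015'

# Hypothetical scratch location, used here only to keep the example
# self-contained.
path = os.path.join(tempfile.mkdtemp(), "yop")

# The file object encodes to UTF-8 on write, so unicode can be passed directly.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(s + u"\n")

# The same encoding argument makes read() return decoded text.
with codecs.open(path, "r", encoding="utf-8") as f:
    text = f.read()
```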
Solution 3 - Python
The codecs.open() function in 2.6 is very similar to the built-in open() function in Python 3.x (which makes sense, since Py3k strings are always Unicode). To future-proof your code in case it is used under Py3k, you could do the following:
import sys
if sys.version_info[0] < 3:
    import codecs
    _open_func_bak = open  # Make a backup, just in case
    open = codecs.open
with open('myfile', 'w', encoding='utf-8') as f:
    f.write(u'\u5E73\u621015')
Now your code should work the same in both 2.x and 3.3+.
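Another option that avoids rebinding the open name is io.open(), which is available from Python 2.6 onward and is the same function as the built-in open() in Python 3. A minimal sketch, assuming a temporary path in place of 'myfile':

```python
import io
import os
import tempfile

# Hypothetical scratch location; the answer above writes to 'myfile'
# in the current directory.
path = os.path.join(tempfile.mkdtemp(), "myfile")

# io.open() in text mode accepts an encoding argument on both
# Python 2.6+ and Python 3, and expects unicode text on write.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u'\u5E73\u621015\n')

# Reading in text mode returns decoded text.
with io.open(path, "r", encoding="utf-8") as f:
    line = f.readline()
```

On Python 2, write() on such a file raises a TypeError if it is given a plain byte string, so keep the u prefix on literals.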
Solution 4 - Python
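Inserting this at the beginning of my script tends to solve Unicode problems in Python 2. Be aware that it changes the default encoding for the whole process, and that it does not work in Python 3, where reload is not a built-in and sys.setdefaultencoding does not exist.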
import sys
reload(sys)
sys.setdefaultencoding('utf8')