hash unicode string in python

PythonUnicodeUtf 8

Python Problem Overview


I try to hash some unicode strings:

hashlib.sha1(s).hexdigest()
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-81: 
ordinal not in range(128)

where s is something like:

> œ∑¡™£¢∞§¶•ªº–≠œ∑´®†¥¨ˆøπ“‘åß∂ƒ©˙∆˚¬…æΩ≈ç√∫˜µ≤≥÷åйцукенгшщзхъфывапролджэячсмитьбююю..юбьтијџўќ†њѓѕ'‘“«««\dzћ÷…•∆љl«єђxcvіƒm≤≥ї!@#$©^&*(()––––––––––∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆•…÷ћzdzћ÷…•∆љlљ∆•…÷ћzћ÷…•∆љ∆•…љ∆•…љ∆•…∆љ•…∆љ•…љ∆•…∆•…∆•…∆•∆…•÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…

what should I fix?

Python Solutions


Solution 1 - Python

Apparently hashlib.sha1 isn't expecting a unicode object, but rather a sequence of bytes in a str object. Encoding your unicode string to a sequence of bytes (using, say, the UTF-8 encoding) should fix it:

>>> import hashlib
>>> s = u'é'
>>> hashlib.sha1(s.encode('utf-8'))
<sha1 HASH object @ 029576A0>

The error is because it is trying to convert the unicode object to a str automatically, using the default ascii encoding, which can't handle all those non-ASCII characters (since your string isn't pure ASCII).

A good starting point for learning more about Unicode and encodings is the Python docs, and this article by Joel Spolsky.

Solution 2 - Python

Use encoding format utf-8, Try this easy way,

>>> import hashlib
>>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest()
'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'

Solution 3 - Python

You hash bytes, not strings. So you gotta know what bytes you really want to hash, for example an utf8 memory representation of the string or a utf16 memory representation of the string, etc.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionVladimir KeleshevView Question on Stackoverflow
Solution 1 - PythonCameronView Answer on Stackoverflow
Solution 2 - PythonJaykumar PatelView Answer on Stackoverflow
Solution 3 - PythondrakorgView Answer on Stackoverflow