Does Python forbid two similar-looking Unicode identifiers?

Python, Unicode

Python Problem Overview


I was playing around with Unicode identifiers and stumbled upon this:

>>> 𝑓, x = 1, 2
>>> 𝑓, x
(1, 2)
>>> 𝑓, f = 1, 2
>>> 𝑓, f
(2, 2)

What's going on here? Why does Python replace the object referenced by 𝑓, but only sometimes? Where is that behavior described?

Python Solutions


Solution 1 - Python

PEP 3131 -- Supporting Non-ASCII Identifiers says:

> All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC.

You can use unicodedata to test the conversions:

import unicodedata

unicodedata.normalize('NFKC', '𝑓')
# f

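This shows that '𝑓' is converted to 'f' during parsing, so both spellings name the same variable. That is why `𝑓, f = 1, 2` in the question assigns to one name twice and leaves it at 2, while `𝑓, x = 1, 2` assigns to two different names. It also leads to the expected:

𝑓 = "Some String"
print(f)
# Some String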
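To see the normalization happen, here is a minimal sketch (not part of the original answer) that compiles an assignment to '𝑓' and inspects the name that ends up in the namespace. The helper `normalized_identifier` and the variable names are made up for illustration; only `unicodedata.normalize` and `exec` come from the standard library.

```python
import unicodedata


def normalized_identifier(name):
    """Return the NFKC form of a name (hypothetical helper for illustration)."""
    return unicodedata.normalize("NFKC", name)


# U+1D453 MATHEMATICAL ITALIC SMALL F, the character used in the question.
math_f = "\U0001D453"

# exec() compiles the source text, so the parser gets a chance to normalize the name.
namespace = {}
exec(f"{math_f} = 1", namespace)

# Ignore the '__builtins__' entry that exec() adds to the namespace.
stored_names = [key for key in namespace if not key.startswith("__")]

print(stored_names)                          # ['f']
print(normalized_identifier(math_f) == "f")  # True
```

The namespace only ever contains the plain ASCII `f`, which is why a later assignment to `f` overwrites the value.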

Solution 2 - Python

Here's a small example, just to show how horrible this "feature" is:

𝕋𝐡ᵢ𝔰_ｆ𝔢𝘢𝚝𝓊ᵣₑ_𝕤ₕ𝔬𝔲𝖑𝔡_dₑ𝕗ᵢ𝘯ｉ𝘵𝚎ℓy_𝒷𝘦_𝐚_𝚋ᵘg = 42
print(T𝗵ℹ𝚜_𝒇e𝖆𝚝𝙪ᵣe_ₛ𝔥º𝓾𝗹𝙙_𝚍e𝒇ᵢ𝒏ⁱｔᵉ𝕝𝘆_𝖻ℯ_𝔞_𝖇𝖚𝓰)
# => 42

[Try it online!](https://tio.run/##JU45DoJAFO09haW2aust7GkkaqPE0FBKCESNC5sCOia4FcZENJkYNVh4E@YCzA3wocWf92feNpIit3vdapZx4o45mYcJ3XLiREIa60DsHiYgnNijhO6YagpQ7pnqgj1jbpwsTGAoNEGCW/4SvEsaDwEU5inTbAU2646HI5Z5gCMYJ9RrFevFWqUg9TtdudTgZEmZ9gC3zuWGiGz93@6f0C4KTF2h6/B54j9vyKH1/TxsIuaGX7U1Y4NrGjsJxQdceD0disWLaRegs8kvBiZARlTOsi8 "Python 3 – Try It Online") (But please don't use it)

And as mentioned by @MarkMeyer, two identifiers can be distinct even though they look exactly the same (for example, "CYRILLIC CAPITAL LETTER A" and "LATIN CAPITAL LETTER A"):

А = 42
print(A)
# => NameError: name 'A' is not defined
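Since the two letters render identically, one way to tell them apart is to print the Unicode name of each character. The following is only a sketch; `describe` is a hypothetical helper, and the only library calls used are `unicodedata.name` and `unicodedata.normalize`.

```python
import unicodedata


def describe(identifier):
    """List the code point and Unicode name of each character (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in identifier]


cyrillic_a = "\u0410"  # looks like the Latin letter, but is a different character
latin_a = "A"

print(describe(cyrillic_a))  # ['U+0410 CYRILLIC CAPITAL LETTER A']
print(describe(latin_a))     # ['U+0041 LATIN CAPITAL LETTER A']

# NFKC does not map one to the other, so Python treats them as different names.
print(unicodedata.normalize("NFKC", cyrillic_a) == latin_a)  # False
```

NFKC only folds compatibility variants such as mathematical or fullwidth letters. It does not unify look-alike letters from different scripts, which is why this case raises a NameError.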

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Erik Cederstrand | View Question on Stackoverflow |
| Solution 1 - Python | Mark | View Answer on Stackoverflow |
| Solution 2 - Python | Eric Duminil | View Answer on Stackoverflow |