How does Zalgo text work?

HtmlUnicodeZalgo

Html Problem Overview


I've seen weirdly formatted text called Zalgo like below written on various forums. It's kind of annoying to look at, but it really bothers me because it undermines my notion of what a character is supposed to be. My understanding is that a character is supposed to move horizontally across a line and stay within a certain "container". Obviously the Zalgo text is moving vertically and doesn't seem to be restricted to any space.

Is this a bug/flaw/exploit/hack in Unicode? Are these individual characters with weird properties? "What" is happening here?

>
> H̡̫̤̤̣͉̤ͭ̓̓̇͗̎̀ơ̯̗̱̘̮͒̄̀̈ͤ̀͡w͓̲͙͖̥͉̹͋ͬ̊ͦ̂̀̚ ͎͉͖̌ͯͅͅd̳̘̿̃̔̏ͣ͂̉̕ŏ̖̙͋ͤ̊͗̓͟͜e͈͕̯̮̙̣͓͌ͭ̍̐̃͒s͙͔̺͇̗̱̿̊̇͞ ̸̤͓̞̱̫ͩͩ͑̋̀ͮͥͦ̊Z̆̊͊҉҉̠̱̦̩͕ą̟̹͈̺̹̋̅ͯĺ̡̘̹̻̩̩͋͘g̪͚͗ͬ͒o̢̖͇̬͍͇͓̔͋͊̓ ̢͈͙͂ͣ̏̿͐͂ͯ͠t̛͓̖̻̲ͤ̈ͣ͝e͋̄ͬ̽͜҉͚̭͇ͅx͎̬̠͇̌ͤ̓̂̓͐͐́͋͡ț̗̹̝̄̌̀ͧͩ̕͢ ̮̗̩̳̱̾w͎̭̤͍͇̰̄͗ͭ̃͗ͮ̐o̢̯̻̰̼͕̾ͣͬ̽̔̍͟ͅr̢̪͙͍̠̀ͅǩ̵̶̗̮̮ͪ́?̙͉̥̬͙̟̮͕ͤ̌͗ͩ̕͡ >
>
>

Html Solutions


Solution 1 - Html

The text uses combining characters, also known as combining marks. See section 2.11 of Combining Characters in the Unicode Standard (PDF).

In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box with given height. Combining marks may be rendered above, below, or inside a base character

So you can easily construct a character sequence, consisting of a base character and “combining above” marks, of any length, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model. Such a sequence has no meaning of course, and even a monkey could produce it (e.g., given a keyboard with suitable driver).

And you can mix “combining above” and “combining below” marks.

The sample text in the question starts with:

Solution 2 - Html

Zalgo text works because of combining characters. These are special characters that allow to modify character that comes before.

enter image description here

OR

y + ̆ = y̆ which actually is

y + ̆ = y̆

Since you can stack them one atop the other you can produce the following:


y̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆

which actually is:

y̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆

The same goes for putting stuff underneath:


y̰̰̰̰̰̰̰̰̰̰̰̰̰̰̰̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆



that in fact is:

y̰̰̰̰̰̰̰̰̰̰̰̰̰̰̰̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆

In Unicode, the main block of combining diacritics for European languages and the International Phonetic Alphabet is U+0300–U+036F.

More about it here

To produce a list of combining diacritical marks you can use the following script (since links keep on dying)

for(var i=768; i<879; i++){console.log(new DOMParser().parseFromString("&#"+i+";", "text/html").documentElement.textContent +"  "+"&#"+i+";");}

Also check em out



Mͣͭͣ̾ Vͣͥͭ͛ͤͮͥͨͥͧ̾

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMikeView Question on Stackoverflow
Solution 1 - HtmlJukka K. KorpelaView Answer on Stackoverflow
Solution 2 - HtmlMatas VaitkeviciusView Answer on Stackoverflow