\u200b (Zero width space) characters in my JS code. Where did they come from?

JavascriptHtmlGoogle ChromeNetbeansUnicode

Javascript Problem Overview


I am developing a front end of a web app using NetBeans IDE 7.0.1. Recently I had a very nasty bug, which I finally fixed.

Say I have code

var element = '<input size="3" id="foo" name="elements[foo][0]" />';
$('#bar').append(element);

I noticed that something gone wrong when I saw that size attribute doesn't work in Chrome (didn't checked in other browsers). When I opened that element in Inspector, it was interpreted as something like

<input id="&quot;3&quot;" name="&quot;elements[foo][0]&quot;" 
    size="&quot;foo&quot;" />

Which was rather strange. After manually retyping the element string character-in-character, the bug was gone. When I undo'ed that change I noticed that Netbeans alerted me about some Unicode characters in my old code. It was \u200b - a zero width spaces after each '=', between '][' and in the end of the string. So the string appeared normal because zero width spaces wasn't displayed, but after escaping them my string was

'<input size=\u200b"3" id=\u200b"foo" name=\u200b"elements[foo]\u200b[0]" />\u200b'

Now where the hell did I get them?

I'm not sure where did I copied the code of element from, but it's definitely one of the following:

  • Other pane of Netbeans Editor with HTML template file;
  • Google Chrome Inspector, 'Copy as HTML' action;
  • Google Chrome source view page (very doubtfully).

But I can't reproduce the bug with neither of that.

I use Netbeans 7.0.1 and Google Chrome 13.0 under Windows 7. No keyboard switchers or anything like it is running. Also I'm using Git for version control, but I didn't pulled that code, so it is very unlikely that Git is to blame. It can't be a stupid joke of my colleagues, because they are quite well-mannered.

Any suggestions who messed up my code?

Javascript Solutions


Solution 1 - Javascript

Here's a stab in the dark.

My bet would be on Google Chrome Inspector. Searching through the Chromium source, I spotted the following block of code

    if (hasText)
        attrSpanElement.appendChild(document.createTextNode("=\u200B\""));

    if (linkify && (name === "src" || name === "href")) {
        var rewrittenHref = WebInspector.resourceURLForRelatedNode(node, value);
        value = value.replace(/([\/;:\)\]\}])/g, "$1\u200B");
        attrSpanElement.appendChild(linkify(rewrittenHref, value, "webkit-html-attribute-value", node.nodeName().toLowerCase() === "a"));
    } else {
        value = value.replace(/([\/;:\)\]\}])/g, "$1\u200B");
        var attrValueElement = attrSpanElement.createChild("span", "webkit-html-attribute-value");
        attrValueElement.textContent = value;
    }

It's quite possible that I'm simply barking up the wrong tree here, but it looks like zero-width spaces were being inserted (to handle soft text wrapping?) during the display of attributes. Perhaps the "Copy as HTML" function had not properly removed them?


Update

After fiddling with the Chrome element inspector, I'm almost convinced that's where your stray \u200b came from. Notice how the line can wrap not only at visible space but also after = or chars matched by /([\/;:\)\]\}])/ thanks to the inserted zero-width space.

chrome inspector screenshot

Unfortunately, I am unable to replicate your problem where they inadvertently get included into your clipboard (I used Chrome 13.0.782.112 on Win XP).

It would certainly be worth submitting a bug report should your be able to reproduce the behaviour.

Solution 2 - Javascript

This happened to me when I copied source code from another site into my editor. If your using visual studio code or Atom editor, this will highlight those pesky characters zero-width space \u200b) etc.

https://github.com/nhoizey/vscode-gremlins/raw/master/images/screenshot.png"/>

Solution 3 - Javascript

As Mr Shawn Chin has addressed it already. I just happened to replicate the issue while I was copy pasting jquery code from a webpage.

When it happened: Copying text from Google Chrome Version 41.0.2272.118 m ( not tested with other browsers ) to Dreamweaver code window. This copied unwanted characters along the code like this happening here

you copied text from a webpage as

$('.btn-pageMenu').css('display'​​​​​​​​​​​​​​​​​​​​​​​​​​​,'block');​​​​​​

behind the scenes, this is what makes that line

<code><span class="pun">&#8203;</span><span class="pln">$</span><span class="pun">(</span><span class="str">'.btn-pageMenu'</span><span class="pun">).</span><span class="pln">css</span><span class="pun">(</span><span class="str">'display'</span><span class="pun">&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;,</span><span class="str">'block'</span><span class="pun">);&#8203;&#8203;&#8203;&#8203;&#8203;&#8203;</span></code>

Copied to an advanced editor like those you mentioned or Dreamweaver gives errors in browser, probably failing of javascript code too

Uncaught SyntaxError: Unexpected token ILLEGAL

Solution: When it happens, embrace the worth of notepad until this gets fixed by the big guys. It is more related to the editor then the browsers.

Solution 4 - Javascript

After longer than 6 years I am getting the same issue but I am able to reproduce it.

I am learning JavaScript from this blog which contains code snippets. Whenever I copy all the code from a snippet and paste it into the JavaScript editors of JS Fiddle or JS Bin I get some red tokens spread into the code. Here are screenshots of the first code snippet from the above blog post in the JS Fiddle and JS Bin. Hovering the mouse over one of these red tokens shows up the tip: "\u200b" (the zero-width space).

I'm working on Linux Ubuntu 16.04 and if I paste the code into one of my editors (Atom 1.22.1 or Geany 1.32) and then open the file in web browsers, I get the following errors in the console:

  • Chrome 63 --> SyntaxError: Invalid or unexpected token
  • Firefox 57 --> SyntaxError: illegal character

I hope this may help a bit to clarify why these zero-width spaces get copied into the clipboard.

Solution 5 - Javascript

I have similar issue with '\u200b' zero-width space character in my current project. I need to work on JSON object returned from server. Email object with '[at]' need to be replaced with '@' character. Surprisingly, several of the objects had the email address littered with 'space' in and around the '@'.

Long story short, I checked using Postman and scrutinised the returned JSON as RAW. Here is the raw example:

johndoe[at]\u200bxyz.org

I could see the character '\u200b' on all those trouble email address. Since only few email addresses affected, I removed the character manually. The server gets the data from Sharepoint.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionHnattView Question on Stackoverflow
Solution 1 - JavascriptShawn ChinView Answer on Stackoverflow
Solution 2 - JavascriptClifford FajardoView Answer on Stackoverflow
Solution 3 - JavascriptmobbyView Answer on Stackoverflow
Solution 4 - JavascriptGeorge DuneeView Answer on Stackoverflow
Solution 5 - JavascriptYazidEFView Answer on Stackoverflow