CSS: text-transform not working properly for Turkish characters

HtmlCssInternationalizationUppercaseLang

Html Problem Overview


The implementations of the major browsers seem to have problems with text-transform: uppercase with Turkish characters. As far as I know (I'm not Turkish.) there are four different i characters: ı i I İ where the last two are the uppercase representations of the former two.

However applying text-transform:uppercase to ı i, the browsers (checked IE, Firefox, Chrome and Safari) results in I I which is not correct and may change the meaning of the words so much so that they become insults. (That's what I've been told)

As my research for solutions did not reveal any my question is: Are there workarounds for this issue? The first workaround might be to remove text-transform: uppercase entirely but that's some sort of last resort.

Funny thing, the W3C has tests for this problem on their site, but lack of further information about this issue. http://www.w3.org/International/tests/tests-html-css/tests-text-transform/generate?test=5

I appreciate any help and looking forward to your answers :-)

Here's a codepen

Html Solutions


Solution 1 - Html

You can add lang attribute and set its value to tr to solve this:

<html lang="tr"> or <div lang="tr">

Here is working example.

Solution 2 - Html

Here's a quick and dirty workaround example - it's faster than I thought (tested in a document with 2400 tags -> no delay). But I see that js workarounds are not the very best solution

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-3">
</head>
<body>
<div style="text-transform:uppercase">a b c ç d e f g ğ h ı i j k l m n o ö p r s ş t u ü v y z (source)</div> <div>A B C Ç D E F G Ğ H I İ J K L M N O Ö P R S Ş T U Ü V Y Z (should be like this)</div>

<script>
    function getStyle(element, style) {
        var result;

        if (document.defaultView && document.defaultView.getComputedStyle) {
            result = document.defaultView.getComputedStyle(element, '').getPropertyValue(style);
        } else if(element.currentStyle) {
            style = style.replace(/\-(\w)/g, function (strMatch, p1) {
                return p1.toUpperCase();
            });
            result = element.currentStyle[style];
        }
        return result;
    }

    function replaceRecursive(element) {
        if (element && element.style && getStyle(element, 'text-transform') == 'uppercase') {
            element.innerHTML = element.innerHTML.replace(/ı/g, 'I');
            element.innerHTML = element.innerHTML.replace(/i/g, 'İ');    // replaces 'i' in tags too, regular expression should be extended if necessary
        }

        if (!element.childNodes || element.childNodes.length == 0) return;

        for (var n in element.childNodes) {
            replaceRecursive(element.childNodes[n]);
        }
    }

    window.onload = function() {    // as appropriate 'ondomready'
        alert('before...');
        replaceRecursive(document.getElementsByTagName('body')[0]);
        alert('...after');
    }
</script>

</body>
</html>

Solution 3 - Html

Here's my enhanced version of alex's code that I am using in production:

(function($) {
  function getStyle(element, style) {
    var result;

    if (document.defaultView && document.defaultView.getComputedStyle) {
      result = document.defaultView.getComputedStyle(element, '').getPropertyValue(style);
    } else if(element.currentStyle) {
      style = style.replace(/\-(\w)/g, function (strMatch, p1) {
        return p1.toUpperCase();
      });
      result = element.currentStyle[style];
    }
    return result;
  }

  function replaceRecursive(element, lang) {
    if(element.lang) {
      lang = element.lang; // Maintain language context
    }

    if (element && element.style && getStyle(element, 'text-transform') == 'uppercase') {
      if (lang == 'tr' && element.value) {
        element.value = element.value.replace(/ı/g, 'I');
        element.value = element.value.replace(/i/g, 'İ');
      }

      for (var i = 0; i < element.childNodes.length; ++i) {
        if (lang == 'tr' && element.childNodes[i].nodeType == Node.TEXT_NODE) {
          element.childNodes[i].textContent = element.childNodes[i].textContent.replace(/ı/g, 'I');
          element.childNodes[i].textContent = element.childNodes[i].textContent.replace(/i/g, 'İ');
        } else {
          replaceRecursive(element.childNodes[i], lang);
        }
      }
    } else {
      if (!element.childNodes || element.childNodes.length == 0) return;

      for (var i = 0; i < element.childNodes.length; ++i) {
        replaceRecursive(element.childNodes[i], lang);
      }
    }
  }

  $(document).ready(function(){ replaceRecursive(document.getElementsByTagName('html')[0], ''); })
})(jQuery);

Note that I am using jQuery here only for the ready() function. The jQuery compatibility wrapper is also as a convenient way to namespace the functions. Other than that, the two functions do not rely on jQuery at all, so you could pull them out.

Compared to alex's original version this one solves a couple problems:

  • It keeps track of the lang attribute as it recurses through, since if you have mixed Turkish and other latin content you will get improper transforms on the non-Turkish without it. Pursuant to this I pass in the base html element, not the body. You can stick lang="en" on any tag that is not Turkish to prevent improper capitalization.

  • It applies the transformation only to TEXT_NODES because the previous innerHTML method did not work with mixed text/element nodes such as labels with text and checkboxes inside them.

While having some notable deficiencies compared to a server side solution, it also has some major advantages, the chief of which is guaranteed coverage without the server-side having to be aware of what styles are applied to what content. If any of the content is being indexed and shown in Google summaries (for example) it is much better if it stays lowercase when served.

Solution 4 - Html

The next version of Firefox Nightly (which should become Firefox 14) has a fix for this problem and should handle the case without any hack (as the CSS3 specs request it).

The gory details are available in that bug : https://bugzilla.mozilla.org/show_bug.cgi?id=231162

They also fixed the problem for font-variant I think (For those not knowing what font-variant does, see https://developer.mozilla.org/en/CSS/font-variant , not yet up-to-date with the change but the doc is browser-agnostic and a wiki, so...)

Solution 5 - Html

The root cause of this problem must be incorrect handling of these turkish characters by unicode library used in all these browsers. So I doubt there is an front-end-side fix for that.

Someone has to report this issue to the developers of these unicode libs, and it would be fixed in few weeks/months.

Solution 6 - Html

If you can't rely on text-transform and browsers you will have to render your text in uppercase yourself on the server (hope you're not uppercasing the text as the user types it). You should have a better support for internationalisation there.

Solution 7 - Html

This work-around requires some Javascript. If you don't want to do that, but have something server side that can preprocess the text, this idea will work there too (I think).

First, detect if you are running in Turkish. If you are, then scan whatever you are going to uppercase to see if it contains the problem characters. If they do, replace all of those characters with the uppercase version of them. Then apply the uppercase CSS. Since the problem characters are already uppercase, that should be a totally fine (ghetto) work around. For Javascript, I envision having to deal with some .innerHTML on your impacted elements.

Let me know if you need any implementation details, I have a good idea of how to do this in Javascript using Javascript string manipulation methods. This general idea should get you most of the way there (and hopefully get me a bounty!)

-Brian J. Stinar-

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMalaxView Question on Stackoverflow
Solution 1 - HtmlHkanView Answer on Stackoverflow
Solution 2 - HtmlalexView Answer on Stackoverflow
Solution 3 - HtmlgtdView Answer on Stackoverflow
Solution 4 - HtmlteoliView Answer on Stackoverflow
Solution 5 - HtmlBarsMonsterView Answer on Stackoverflow
Solution 6 - HtmlJakub KoneckiView Answer on Stackoverflow
Solution 7 - HtmlBrian StinarView Answer on Stackoverflow