How long can a TLD possibly be?

Php, Regex, Email Validation, Tld

Php Problem Overview


I'm working on an email validation regex in PHP and I need to know how long the TLD could possibly be and still be valid. I did a few searches but couldn't find much information on the topic. So how long can a TLD possibly be?

Php Solutions


Solution 1 - Php

DNS allows for a maximum of 63 characters for an individual label.
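As a rough illustration of that limit, a regex check on the TLD can cap the label at 63 octets. This is only a sketch: the function name `isPlausibleTld` is hypothetical, and the pattern assumes the usual letter-digit-hyphen rule for labels. It does not check that the TLD actually exists.

```php
<?php
// Hypothetical helper: checks only that a TLD looks like a DNS label
// (1 to 63 octets, letters/digits/hyphens, no leading or trailing hyphen).
function isPlausibleTld(string $tld): bool
{
    // 1 leading char + up to 61 middle chars + 1 trailing char = 63 at most.
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/iD', $tld) === 1;
}
```

For example, `isPlausibleTld('com')` passes, while a 64-character string of letters does not.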

Solution 2 - Php

The longest TLD currently in existence is 24 characters long, and subject to change. The maximum TLD length specified by RFC 1034 is 63 octets.

To get the length of the longest existing TLD:

wget -qO - http://data.iana.org/TLD/tlds-alpha-by-domain.txt | tail -n+2 | wc -L

Here's what that command does:

  1. Get the latest list of actual existing TLDs from IANA
  2. Strip the first line, which is a long-ish comment
  3. Use wc -L to print the length of the longest remaining line

Alternative using curl thanks to Stefan:

curl -s http://data.iana.org/TLD/tlds-alpha-by-domain.txt | tail -n+2 | wc -L
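Since the question is about PHP, the same idea could be sketched there as well. The helper name `longestTld` is hypothetical, and the sample text below is a tiny stand-in for the real IANA file so that the snippet runs without network access.

```php
<?php
// Hypothetical helper: returns the longest entry in the text of IANA's
// tlds-alpha-by-domain.txt, skipping blank lines and "#" comment lines.
function longestTld(string $listText): string
{
    $longest = '';
    foreach (preg_split('/\R/', $listText) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // ignore blanks and the heading comment
        }
        if (strlen($line) > strlen($longest)) {
            $longest = $line;
        }
    }
    return $longest;
}

// Small inline sample instead of the live list.
$sample = "# Version 0, sample data\nCOM\nMUSEUM\nXN--VERMGENSBERATUNG-PWB\n";
$longest = longestTld($sample);
echo $longest, ' ', strlen($longest), "\n"; // prints "XN--VERMGENSBERATUNG-PWB 24"
```

To run it against the live list, the text could be fetched with `file_get_contents('http://data.iana.org/TLD/tlds-alpha-by-domain.txt')`, assuming `allow_url_fopen` is enabled.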

Solution 3 - Php

-EDIT-

According to RFC 2606, .localhost is a reserved domain name, and its length is 9 characters. That is the longest I am aware of.

-END OF EDIT-

However, I think you should care about the length of the whole email address, not only the TLD. Below is a quote from this article. The maximum email address length is 254 characters:

> There appears to be some confusion over the maximum valid email address size. Most people believe it to be 320 characters (64 characters for the username + 255 characters for the domain + 1 character for the @ symbol). Other sources suggest 129 (64 + 1 + 64) or 384 (128+1+255, assuming the username doubles in length in the future).

> This confusion means you should heed the 'robustness principle' ("developers should carefully write software that adheres closely to extant RFCs but accept and parse input from peers that might not be consistent with those RFCs." - Wikipedia) when writing software that deals with email addresses. Furthermore, some software may be crippled by naive assumptions, e.g. thinking that 50 characters is adequate (examples). Your 200 character email address may be technically valid but that will not help you if most websites or applications reject it.

> The actual maximum email length is currently 254 characters:

>> "The original version of RFC 3696 did indeed say 320 was the maximum length, but John Klensin (ICANN) subsequently accepted this was wrong."

>> "This arises from the simple arithmetic of maximum length of a domain (255 characters) + maximum length of a mailbox (64 characters) + the @ symbol = 320 characters. Wrong. This canard is actually documented in the original version of RFC3696. It was corrected in the errata. There's actually a restriction from RFC5321 on the path element of an SMTP transaction of 256 characters. But this includes angled brackets around the email address, so the maximum length of an email address is 254 characters."
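A length check based on the figures quoted above might look like the following sketch. The function name `hasAcceptableEmailLength` is hypothetical, and the code applies only the two limits mentioned in the quote (254 characters overall, 64 for the part before the `@`). It says nothing about whether the address is otherwise valid.

```php
<?php
// Hypothetical helper: applies only the length limits quoted above.
function hasAcceptableEmailLength(string $email): bool
{
    // Position of the last "@" equals the length of the local part.
    $at = strrpos($email, '@');
    if ($at === false || $at === 0 || $at === strlen($email) - 1) {
        return false; // need a non-empty local part and a non-empty domain
    }
    return strlen($email) <= 254 && $at <= 64;
}
```

For instance, `hasAcceptableEmailLength('user@example.com')` returns `true`, while an address with a 65-character local part returns `false`.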

Solution 4 - Php

The longest TLD made of Latin letters is .MUSEUM (source), but there are also TLDs with non-Latin characters, which are encoded in Punycode. The longest of those is XN--CLCHC0EA0B2G2A9GCD. Also, it will soon be possible to register your own TLD for a high price, so longer TLDs may appear.

Solution 5 - Php

Since I'm a .NET developer, the following is a JavaScript way of determining the longest TLD currently available. It logs the longest TLD and its length, which you can then use in your regex.

Please try the following code snippet:

function getTLD() {
    var length = 0;
    var longest;
    var request = new XMLHttpRequest();

    request.open('GET', 'http://data.iana.org/TLD/tlds-alpha-by-domain.txt', true);
    // Register the handler before sending the request.
    request.onreadystatechange = function () {
        if (request.readyState === 4 && request.status === 200) {
            var type = request.getResponseHeader('Content-Type');
            // indexOf returns -1 (not 1) when "text" is absent
            if (type && type.indexOf("text") !== -1) {
                var tldArr = request.responseText.split('\n');
                tldArr.splice(0, 1); // drop the leading comment line

                for (var i = 0; i < tldArr.length; i++) {
                    if (tldArr[i].length > length) {
                        length = tldArr[i].length;
                        longest = tldArr[i];
                    }
                }

                // The request is asynchronous, so the result is logged here
                // rather than returned from getTLD().
                console.log("Longest >> " + longest + " >> " + length);
            }
        }
    };
    request.send(null);
}

<button onclick="getTLD()">Get TLD</button>

Solution 6 - Php

This PHP code fetches the up-to-date list of TLDs and returns it as a UTF-8 string separated by vertical bars, ready to be used directly in a regular expression:

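<?php
  // Needs allow_url_fopen (for file() on a URL) and the intl extension (for idn_to_utf8()).
  function getTLDs($separator){
    $tlds=file('http://data.iana.org/TLD/tlds-alpha-by-domain.txt',FILE_IGNORE_NEW_LINES|FILE_SKIP_EMPTY_LINES);
    array_shift($tlds); // remove heading comment
    // lower-case each entry and decode Punycode (xn--...) to UTF-8
    $tlds=array_map(function($e){ return idn_to_utf8(trim(strtolower($e))); },$tlds);
    usort($tlds,function($a,$b){ return strlen($b)-strlen($a); }); // sort from longest to shortest
    return implode($separator,$tlds);
  }
  echo getTLDs('|');
?>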

To match a host name you could use it like this:

$tlds=getTLDs('|');
if (preg_match("{([\da-z\.-]+)\.($tlds)}u",$address)) {
  // ..
}
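A slightly stricter variant of that usage is sketched below. The anchoring, the `i` modifier, and the `preg_quote()` call are assumptions added for illustration, not part of the answer above. The helper name `buildHostPattern` is hypothetical, and a hard-coded sample list stands in for the live IANA data.

```php
<?php
// Hypothetical helper: builds an anchored, case-insensitive host name
// pattern from a list of TLDs.
function buildHostPattern(array $tlds): string
{
    // Longest first, so longer TLDs are tried before shorter ones.
    usort($tlds, function ($a, $b) { return strlen($b) - strlen($a); });
    // preg_quote() guards against regex metacharacters in the entries.
    $alternation = implode('|', array_map(function ($t) {
        return preg_quote(strtolower($t), '/');
    }, $tlds));
    return '/^[\da-z.-]+\.(?:' . $alternation . ')$/iuD';
}

// Hard-coded sample instead of the live list.
$pattern = buildHostPattern(['com', 'museum', 'travelersinsurance']);
var_dump(preg_match($pattern, 'example.museum') === 1);  // bool(true)
var_dump(preg_match($pattern, 'example.invalid') === 1); // bool(false)
```

Anchoring with `^` and `$` means the whole string has to be a host name ending in a listed TLD, rather than merely containing one somewhere.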

Solution 7 - Php

A TLD can be any length at all. New TLDs happen all the time. In the future there will be more TLDs not regulated by the entity currently regulating the majority of TLDs. We also won't use email in the future as we presently do. That said:

You don't need to validate an email address ever. If you want to slow people down and have an idea as to whether they're actually human, include a CAPTCHA. If you need to confirm a working email address, send an email with a validation link they can open. If you aren't throttling submissions that trigger things like verification emails, it won't matter whether you're confirming the address is technically valid anyway; it will be abused at that point regardless.

Solution 8 - Php

The longest TLD to date is .xn--vermgensberatung-pwb, at 24 characters in Punycode and 17 when decoded (vermögensberatung). Leaving Punycode TLDs aside, the longest are .northwesternmutual and .travelersinsurance, both at 18 characters.

However, a domain name label, the part that goes before a TLD, can be up to 63 characters long, as seen here: http://www.thelongestdomainnameintheworldandthensomeandthensomemoreandmore.com

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stack Overflow |
| --- | --- | --- |
| Question | HellaMad | View Question on Stack Overflow |
| Solution 1 - Php | tripleee | View Answer on Stack Overflow |
| Solution 2 - Php | Dan Dascalescu | View Answer on Stack Overflow |
| Solution 3 - Php | aviad | View Answer on Stack Overflow |
| Solution 4 - Php | axiomer | View Answer on Stack Overflow |
| Solution 5 - Php | Chathura Edirisinghe | View Answer on Stack Overflow |
| Solution 6 - Php | Meisner | View Answer on Stack Overflow |
| Solution 7 - Php | Jan Kyu Peblik | View Answer on Stack Overflow |
| Solution 8 - Php | TobiaGames | View Answer on Stack Overflow |