How to check for a valid URL in Java?

JavaValidationUrl

Java Problem Overview


What is the best way to check if a URL is valid in Java?

If tried to call new URL(urlString) and catch a MalformedURLException, but it seems to be happy with anything that begins with http://.

I'm not concerned about establishing a connection, just validity. Is there a method for this? An annotation in Hibernate Validator? Should I use a regex?

Edit: Some examples of accepted URLs are http://*** and http://my favorite site!.

Java Solutions


Solution 1 - Java

Consider using the Apache Commons UrlValidator class

UrlValidator urlValidator = new UrlValidator();
urlValidator.isValid("http://my favorite site!");

There are several properties that you can set to control how this class behaves, by default http, https, and ftp are accepted.

Solution 2 - Java

Here is way I tried and found useful,

URL u = new URL(name); // this would check for the protocol
u.toURI(); // does the extra checking required for validation of URI 

Solution 3 - Java

I'd love to post this as a comment to Tendayi Mawushe's answer, but I'm afraid there is not enough space ;)

This is the relevant part from the Apache Commons UrlValidator source:

/**
 * This expression derived/taken from the BNF for URI (RFC2396).
 */
private static final String URL_PATTERN =
        "/^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?/";
//         12            3  4          5       6   7        8 9

/**
 * Schema/Protocol (ie. http:, ftp:, file:, etc).
 */
private static final int PARSE_URL_SCHEME = 2;

/**
 * Includes hostname/ip and port number.
 */
private static final int PARSE_URL_AUTHORITY = 4;

private static final int PARSE_URL_PATH = 5;

private static final int PARSE_URL_QUERY = 7;

private static final int PARSE_URL_FRAGMENT = 9;

You can easily build your own validator from there.

Solution 4 - Java

My favorite approach, without external libraries:

try {
    URI uri = new URI(name);

    // perform checks for scheme, authority, host, etc., based on your requirements

    if ("mailto".equals(uri.getScheme()) {/*Code*/}
    if (uri.getHost() == null) {/*Code*/}

} catch (URISyntaxException e) {
}

Solution 5 - Java

I didn't like any of the implementations (because they use a Regex which is an expensive operation, or a library which is an overkill if you only need one method), so I ended up using the java.net.URI class with some extra checks, and limiting the protocols to: http, https, file, ftp, mailto, news, urn.

And yes, catching exceptions can be an expensive operation, but probably not as bad as Regular Expressions:

final static Set<String> protocols, protocolsWithHost;

static {
  protocolsWithHost = new HashSet<String>( 
      Arrays.asList( new String[]{ "file", "ftp", "http", "https" } ) 
  );
  protocols = new HashSet<String>( 
      Arrays.asList( new String[]{ "mailto", "news", "urn" } ) 
  );
  protocols.addAll(protocolsWithHost);
}

public static boolean isURI(String str) {
  int colon = str.indexOf(':');
  if (colon < 3)                      return false;

  String proto = str.substring(0, colon).toLowerCase();
  if (!protocols.contains(proto))     return false;

  try {
    URI uri = new URI(str);
    if (protocolsWithHost.contains(proto)) {
      if (uri.getHost() == null)      return false;

      String path = uri.getPath();
      if (path != null) {
        for (int i=path.length()-1; i >= 0; i--) {
          if ("?<>:*|\"".indexOf( path.charAt(i) ) > -1)
            return false;
        }
      }
    }

    return true;
  } catch ( Exception ex ) {}

  return false;
}

Solution 6 - Java

The most "foolproof" way is to check for the availability of URL:

public boolean isURL(String url) {
  try {
     (new java.net.URL(url)).openStream().close();
     return true;
  } catch (Exception ex) { }
  return false;
}

Solution 7 - Java

Judging by the source code for URI, the

public URL(URL context, String spec, URLStreamHandler handler)

constructor does more validation than the other constructors. You might try that one, but YMMV.

Solution 8 - Java

###validator package: There seems to be a nice package by Yonatan Matalon called UrlUtil. Quoting its API:

isValidWebPageAddress(java.lang.String address, boolean validateSyntax, 
                      boolean validateExistance) 
Checks if the given address is a valid web page address.

###Sun's approach - check the network address Sun's Java site offers connect attempt as a solution for validating URLs.

###Other regex code snippets: There are regex validation attempts at Oracle's site and weberdev.com.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionEric WilsonView Question on Stackoverflow
Solution 1 - JavaTendayi MawusheView Answer on Stackoverflow
Solution 2 - JavaPrasanna PillaView Answer on Stackoverflow
Solution 3 - Javauser123444555621View Answer on Stackoverflow
Solution 4 - JavaAndrei VolginView Answer on Stackoverflow
Solution 5 - JavaisapirView Answer on Stackoverflow
Solution 6 - Javauser2670200View Answer on Stackoverflow
Solution 7 - JavauckelmanView Answer on Stackoverflow
Solution 8 - JavaAdam MatanView Answer on Stackoverflow