How to check if a URL exists or returns 404 with Java?

Tags: Java, URL, HTTP status code 404

Java Problem Overview


String urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_001.pdf";
URL url = new URL(urlString);
if(/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exist");
}
urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_190.pdf";
url = new URL(urlString);
if(/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exist");
}

This should print:

exists
does not exist

TEST

public static String URL = "http://www.nbc.com/Heroes/novels/downloads/";

public static int getResponseCode(String urlString) throws MalformedURLException, IOException {
    URL u = new URL(urlString);
    HttpURLConnection huc = (HttpURLConnection) u.openConnection();
    huc.setRequestMethod("GET");
    huc.connect();
    return huc.getResponseCode();
}

System.out.println(getResponseCode(URL + "Heroes_novel_001.pdf"));
System.out.println(getResponseCode(URL + "Heroes_novel_190.pdf"));
System.out.println(getResponseCode("http://www.example.com"));
System.out.println(getResponseCode("http://www.example.com/junk"));

Output

200
200
200
404

SOLUTION

Adding the following line before the call to connect() changes the output to 200, 404, 200, 404:

huc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");

Java Solutions


Solution 1 - Java

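If you don't want redirects (3xx responses) to be followed, you may want to add

HttpURLConnection.setFollowRedirects(false);
// Note: this static call applies to every HttpURLConnection in the JVM.
// To disable redirects for a single connection only, use instead:
//        huc.setInstanceFollowRedirects(false)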

Instead of doing a "GET", a "HEAD" is all you need, because only the status code matters and not the response body.

huc.setRequestMethod("HEAD");
return (huc.getResponseCode() == HttpURLConnection.HTTP_OK);
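
The two suggestions above can be combined into one small helper. This is only a minimal sketch: the class name HeadCheck, the method name exists, and the 5-second timeouts are assumptions for illustration and do not come from the answer. It uses the per-connection setInstanceFollowRedirects(false) instead of the static setter so that other connections in the same JVM are left alone.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class; the name is not part of the original answer.
final class HeadCheck {

    /**
     * Sends a HEAD request and reports whether the server answered 200 OK.
     * Redirects are not followed, so a 3xx answer counts as "does not exist".
     */
    static boolean exists(String urlString) throws IOException {
        HttpURLConnection huc =
                (HttpURLConnection) URI.create(urlString).toURL().openConnection();
        try {
            huc.setInstanceFollowRedirects(false); // affects this connection only
            huc.setRequestMethod("HEAD");          // headers only, no body download
            huc.setConnectTimeout(5000);           // assumed timeout values
            huc.setReadTimeout(5000);
            // getResponseCode() sends the request if it has not been sent yet
            return huc.getResponseCode() == HttpURLConnection.HTTP_OK;
        } finally {
            huc.disconnect();
        }
    }
}
```

A call such as HeadCheck.exists("http://www.example.com/junk") returns a boolean instead of the raw status code. If you need to tell a 404 apart from a 302 or a 500, return the int from getResponseCode() instead.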

Solution 2 - Java

This worked for me:

URL u = new URL("http://www.example.com/");
HttpURLConnection huc = (HttpURLConnection) u.openConnection();
huc.setRequestMethod("GET");  // OR huc.setRequestMethod("HEAD");
huc.connect();
int code = huc.getResponseCode();
System.out.println(code);

Thanks for the suggestions above.

Solution 3 - Java

Use HttpURLConnection by calling openConnection() on your URL object.

getResponseCode() will give you the HTTP status code of the response; calling it sends the request if that has not happened yet.

e.g.

   URL u = new URL("http://www.example.com/");
   HttpURLConnection huc = (HttpURLConnection) u.openConnection();
   huc.setRequestMethod("GET");
   huc.connect();
   // A GET has no request body, so no output stream is opened
   // (getOutputStream() would throw without setDoOutput(true)).
   int code = huc.getResponseCode();

(not tested)

Solution 4 - Java

There is nothing wrong with your code; NBC.com is playing tricks on you. When NBC.com decides that your browser is not capable of displaying PDFs, it simply sends back a web page regardless of what you requested, even if the requested file doesn't exist.

You need to trick it back by telling it that your browser is capable, with something like:

conn.setRequestProperty("User-Agent",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) Gecko/2009073021 Firefox/3.0.13");

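One way to see whether a site behaves this way is to request the same URL twice, once with Java's default User-Agent (which has the form "Java/<version>") and once with a browser-like one, then compare the status codes. The following is a minimal sketch under that assumption; the class UserAgentProbe and its statusCode method are hypothetical names, and the timeouts are arbitrary.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class for comparing responses per User-Agent.
final class UserAgentProbe {

    /** Browser-like User-Agent string taken from the answer above. */
    static final String BROWSER_UA =
            "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) "
            + "Gecko/2009073021 Firefox/3.0.13";

    /**
     * Returns the HTTP status code for a GET request.
     * If userAgent is null, Java's default User-Agent header is sent.
     */
    static int statusCode(String urlString, String userAgent) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(urlString).toURL().openConnection();
        try {
            conn.setRequestMethod("GET");
            if (userAgent != null) {
                conn.setRequestProperty("User-Agent", userAgent);
            }
            conn.setConnectTimeout(5000); // assumed timeout values
            conn.setReadTimeout(5000);
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

If statusCode(url, null) and statusCode(url, UserAgentProbe.BROWSER_UA) differ for the same URL, the server is varying its answer by User-Agent, which is the situation described in this answer.
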
Solution 5 - Java

Based on the given answers and information in the question, this is the code you should use:

public static boolean doesURLExist(URL url) throws IOException
{
	// We want to check the current URL
	HttpURLConnection.setFollowRedirects(false);

	HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection();

	// We don't need to get data
	httpURLConnection.setRequestMethod("HEAD");

	// Some websites don't like programmatic access so pretend to be a browser
	httpURLConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
	int responseCode = httpURLConnection.getResponseCode();

	// We only accept response code 200
	return responseCode == HttpURLConnection.HTTP_OK;
}

This is tested and working.
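
On Java 11 or later, the same check can also be written with java.net.http.HttpClient instead of HttpURLConnection. This is an alternative sketch, not the method used in this answer: the class name HttpClientUrlCheck, the shortened User-Agent string, and the 5-second timeouts are assumptions. Redirects are disabled on the client instance, so no JVM-wide setting is changed.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical class; requires Java 11+ for java.net.http.
final class HttpClientUrlCheck {

    // One shared client that never follows redirects
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NEVER)
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    /** Returns true only if a HEAD request to the URI is answered with 200. */
    static boolean doesUrlExist(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                // HEAD has no request body, hence the empty publisher
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                // Assumed browser-like value; some sites reject unknown agents
                .header("User-Agent", "Mozilla/5.0 (compatible; UrlCheck)")
                .timeout(Duration.ofSeconds(5))
                .build();
        HttpResponse<Void> response =
                CLIENT.send(request, HttpResponse.BodyHandlers.discarding());
        return response.statusCode() == 200;
    }
}
```

Usage would look like HttpClientUrlCheck.doesUrlExist(URI.create("http://www.example.com/junk")). Unlike the HttpURLConnection version, the caller also has to handle InterruptedException.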

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Sergio del Amo | View Question on Stackoverflow
Solution 1 - Java | RealHowTo | View Answer on Stackoverflow
Solution 2 - Java | mmm | View Answer on Stackoverflow
Solution 3 - Java | Brian Agnew | View Answer on Stackoverflow
Solution 4 - Java | ZZ Coder | View Answer on Stackoverflow
Solution 5 - Java | BullyWiiPlaza | View Answer on Stackoverflow