Can you explain the HttpURLConnection connection process?


Java Problem Overview


I am using HttpURLConnection to connect to a web service. I know how to use HttpURLConnection, but I want to understand how it works. Basically, I want to know the following:

  • At which point does HttpURLConnection try to establish a connection to the given URL?
  • At which point can I know that I was able to successfully establish a connection?
  • Are establishing a connection and sending the actual request done in one step/method call? What method is it?
  • Can you explain the function of getOutputStream and getInputStream in layman's terms? I notice that when the server I'm trying to connect to is down, I get an Exception at getOutputStream. Does it mean that HttpURLConnection will only start to establish a connection when I invoke getOutputStream? How about getInputStream? Since I'm only able to get the response at getInputStream, does it mean that I didn't send any request at getOutputStream yet but simply established a connection? Does HttpURLConnection go back to the server to request the response when I invoke getInputStream?
  • Am I correct to say that openConnection simply creates a new connection object but does not establish any connection yet?
  • How can I measure the read overhead and connect overhead?

Java Solutions


Solution 1 - Java

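import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;

try {
	// URLEncoder.encode(String, String) can throw
	// UnsupportedEncodingException (an IOException), so it is called
	// inside the try block
	String message = URLEncoder.encode("my message", "UTF-8");
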
	// instantiate the URL object with the target URL of the resource to
	// request
	URL url = new URL("http://www.example.com/comment");
	
	// create the HttpURLConnection from the URL object. openConnection
	// only asks the protocol handler for this URL to create a new
	// connection object.
	// 1. No network connection is opened at this point yet.
	HttpURLConnection connection = (HttpURLConnection) url
			.openConnection();
	// set connection output to true
	connection.setDoOutput(true);
	// instead of a GET, we're going to send using method="POST"
	connection.setRequestMethod("POST");
	
	// instantiate OutputStreamWriter using the output stream, returned
	// from getOutputStream, that writes to this connection.
	// 2. getOutputStream connects if the connection is not open yet, so
	// this is the point where you'll know if the connection was
	// successfully established. If an I/O error occurs while connecting
	// or creating the output stream, you'll see an IOException.
	OutputStreamWriter writer = new OutputStreamWriter(
			connection.getOutputStream(), "UTF-8");
	
	// write data to the connection. This is data that you are sending
	// to the server
	// 3. No. Connecting and sending are separate steps. The connection
	// was established by getOutputStream; the request body is written
	// here. In the default (non-streaming) mode the body is buffered
	// locally and goes out when the response is requested below.
	writer.write("message=" + message);
	
	// Closes this output stream and releases any system resources
	// associated with this stream. At this point, we've written all the
	// data. Only the outputStream is closed at this point, not the
	// actual connection
	writer.close();
	// if there is a response code AND that response code is 200 OK, do
	// stuff in the first if block
	if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
		// OK
		
		// otherwise, if any other status code is returned, or no status
		// code is returned, do stuff in the else block
	} else {
		// Server returned HTTP error code.
	}
} catch (MalformedURLException e) {
	// ...
} catch (IOException e) {
	// ...
}

The first 3 answers to your questions are listed as inline comments, beside each method, in the example HTTP POST above.
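
For anyone who wants to run the flow end to end without a real web service, here is a minimal sketch. It assumes Java 9+ (for readAllBytes) and uses the JDK's built-in com.sun.net.httpserver.HttpServer as a throwaway local echo endpoint; the class name PostSketch, the demo() helper, and the echo behaviour are my own assumptions, not part of the original example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class PostSketch {

    // Sends a form-encoded POST and returns "<status> <response body>".
    static String post(URL url, String message) throws IOException {
        byte[] body = ("message=" + URLEncoder.encode(message, "UTF-8"))
                .getBytes(StandardCharsets.UTF_8);

        // Local object only; nothing has gone over the network yet.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setDoOutput(true);
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");

        // getOutputStream() connects if necessary, so an unreachable
        // server surfaces as an IOException on this line.
        try (OutputStream out = connection.getOutputStream()) {
            out.write(body);
        }

        // Asking for the status code completes the request and reads
        // the response headers.
        int status = connection.getResponseCode();
        try (InputStream in = connection.getInputStream()) {
            return status + " " + new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Starts a hypothetical echo server on a free local port, posts to
    // it, and returns what post() produced.
    static String demo() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/comment", exchange -> {
            byte[] received = exchange.getRequestBody().readAllBytes();
            exchange.sendResponseHeaders(200, received.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(received);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return post(new URL("http://127.0.0.1:" + port + "/comment"), "my message");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Stopping the local server before calling post() is an easy way to see the IOException move to the getOutputStream line.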

From getOutputStream:

> Returns an output stream that writes to this connection.

Basically, I think you have a good understanding of how this works, so let me just reiterate in layman's terms. getOutputStream basically opens a connection stream, with the intention of writing data to the server. In the above code example "message" could be a comment that we're sending to the server that represents a comment left on a post. When you see getOutputStream, you're opening the connection stream for writing, but you don't actually write any data until you call writer.write("message=" + message);.

From getInputStream():

> Returns an input stream that reads from this open connection. A SocketTimeoutException can be thrown when reading from the returned input stream if the read timeout expires before data is available for read.

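getInputStream does the opposite. Like getOutputStream, it connects if necessary and returns a stream, but the intent is to read data from the server, not write to it. If the connection cannot be established, you'll see an IOException (for example a ConnectException). If a read timeout is set and expires before data is available, you'll see a SocketTimeoutException.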
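
To make the SocketTimeoutException case concrete, here is a small sketch. It assumes Java 9+ and uses a plain ServerSocket that accepts TCP connections but never answers, so the read timeout is what fires; the class name TimeoutSketch, the fetch() and demo() helpers, and the timeout values are assumptions of mine.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

class TimeoutSketch {

    // Tries a GET with explicit timeouts and reports how it ended.
    static String fetch(URL url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setConnectTimeout(connectTimeoutMs); // limit for opening the connection
            connection.setReadTimeout(readTimeoutMs);       // limit for each blocking read
            try (InputStream in = connection.getInputStream()) {
                return "ok, " + in.readAllBytes().length + " bytes";
            }
        } catch (SocketTimeoutException e) {
            return "timed out: " + e.getMessage();
        } catch (IOException e) {
            return "failed: " + e;
        }
    }

    // A listening socket that never writes a response: the TCP connect
    // succeeds through the accept backlog, then the client waits in vain.
    static String demo() throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            URL url = new URL("http://127.0.0.1:" + silent.getLocalPort() + "/");
            return fetch(url, 1000, 200);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Both timeouts default to 0, which means wait indefinitely, so without the two setter calls this demo would simply hang.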

> How about the getInputStream? Since I'm only able to get the response at getInputStream, then does it mean that I didn't send any request at getOutputStream yet but simply establishes a connection?

Keep in mind that establishing a connection and sending data are two different operations. url.openConnection() only creates the connection object. When you invoke getOutputStream, getInputStream, or getResponseCode (or call connect() explicitly), the connection to the server is established: there is a handshake in which the server acknowledges that the connection is open. It is only at that point that you're prepared to send or receive data. Thus, you do not need to call getOutputStream at all unless your purpose for making the request is to send data.

In layman's terms, making a getInputStream request is the equivalent of making a phone call to your friend's house to say "Hey, is it okay if I come over and borrow that pair of vice grips?" and your friend establishes the handshake by saying, "Sure! Come and get it". Then, at that point, the connection is made, you walk to your friend's house, knock on the door, request the vice grips, and walk back to your house.

Using a similar example for getOutputStream would involve calling your friend and saying "Hey, I have that money I owe you, can I send it to you"? Your friend, needing money and sick inside that you kept it for so long, says "Sure, come on over you cheap bastard". So you walk to your friend's house and "POST" the money to him. He then kicks you out and you walk back to your house.

Now, continuing with the layman's example, let's look at some Exceptions. If you called your friend and he wasn't home, that could be a 500 error. If you called and got a disconnected number message because your friend is tired of you borrowing money all the time, that's a 404 page not found. If your phone is dead because you didn't pay the bill, that could be an IOException. (NOTE: This section may not be 100% correct. It's intended to give you a general idea of what's happening in layman's terms.)
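
In code, the difference between "the server answered with an error status" and "I never reached the server" looks roughly like the sketch below. It assumes Java 9+ and a hypothetical local server that always answers 404; StatusSketch, describe(), and demo() are names I made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StatusSketch {

    // Reports either the HTTP status plus body, or the I/O failure type.
    static String describe(URL url) {
        try {
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            // A 404 or 500 is a normal response: getResponseCode() returns it.
            int status = connection.getResponseCode();
            // For error statuses the body is on getErrorStream();
            // getInputStream() would throw an IOException instead.
            InputStream raw = status >= 400
                    ? connection.getErrorStream()
                    : connection.getInputStream();
            if (raw == null) {
                return "HTTP " + status + " (no body)";
            }
            try (InputStream in = raw) {
                return "HTTP " + status + ": "
                        + new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            // No usable HTTP response at all, e.g. connection refused.
            return "I/O failure: " + e.getClass().getSimpleName();
        }
    }

    // Queries a hypothetical 404 endpoint, then the same port after the
    // server has been stopped.
    static String[] demo() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/missing", exchange -> {
            byte[] body = "no such page".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(404, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/missing");
        String whileRunning;
        try {
            whileRunning = describe(url);
        } finally {
            server.stop(0);
        }
        String afterStop = describe(url);
        return new String[] {whileRunning, afterStop};
    }

    public static void main(String[] args) throws IOException {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```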

Question #5:

Yes, you are correct that openConnection simply creates a new connection object but does not establish the network connection. The connection is established when you call connect(), getInputStream, getOutputStream, or getResponseCode.

From the URL.openConnection javadocs:

> A new connection is opened every time by calling the openConnection method of the protocol handler for this URL.

Despite the word "opened", this refers to creating the URLConnection object. More recent versions of the javadoc state explicitly that a URLConnection instance does not establish the actual network connection on creation; that happens only when connect() is called, either directly or implicitly by one of the methods above.
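
One way to check for yourself at which call the network is first touched is to point the URL at a port where nothing is listening. This is only a sketch: LazyConnectSketch and probe() are hypothetical names, and it assumes the port it just released stays unused for the moment it takes to run.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.URL;

class LazyConnectSketch {

    // Reports which of the two calls notices that nobody is listening.
    static String probe() throws IOException {
        // Grab a free port and release it again so that it is closed.
        int closedPort;
        try (ServerSocket socket = new ServerSocket(0)) {
            closedPort = socket.getLocalPort();
        }
        URL url = new URL("http://127.0.0.1:" + closedPort + "/");

        // Succeeds even though nothing is listening on the port.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        String result = "openConnection: ok";

        try {
            // The first call that actually touches the network.
            connection.connect();
            return result + ", connect: ok";
        } catch (IOException e) {
            return result + ", connect failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(probe());
    }
}
```

Replacing connect() with getInputStream(), getOutputStream() (after setDoOutput(true)), or getResponseCode() moves the failure to that call instead.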

Question #6:

To measure the overhead, I generally wrap some very simple timing code around the entire connection block, like so:

long start = System.currentTimeMillis();
log.info("Time so far = " + (System.currentTimeMillis() - start) + " ms");

// run the above example code here
log.info("Total time to send/receive data = " + (System.currentTimeMillis() - start) + " ms");

I'm sure there are more advanced methods for measuring the request time and overhead, but this generally is sufficient for my needs.

For information on closing connections, which you didn't ask about, see https://stackoverflow.com/questions/272910/in-java-when-does-a-url-connection-close.

Solution 2 - Java

Tim Bray presented a concise step-by-step, stating that openConnection() does not establish an actual connection. Rather, an actual HTTP connection is not established until you call methods such as getInputStream() or getOutputStream().

http://www.tbray.org/ongoing/When/201x/2012/01/17/HttpURLConnection

Solution 3 - Java

> At which point does HttpURLConnection try to establish a connection to the given URL?

On the port named in the URL if any, otherwise 80 for HTTP and 443 for HTTPS. I believe this is documented.

> At which point can I know that I was able to successfully establish a connection?

When you call getInputStream() or getOutputStream() or getResponseCode() without getting an exception.

> Are establishing a connection and sending the actual request done in one step/method call? What method is it?

No and none.

> Can you explain the function of getOutputStream() and getInputStream() in layman's terms?

Either of them first connects if necessary, then returns the required stream.

> I notice that when the server I'm trying to connect to is down, I get an Exception at getOutputStream(). Does it mean that HttpURLConnection will only start to establish a connection when I invoke getOutputStream()? How about getInputStream()? Since I'm only able to get the response at getInputStream(), does it mean that I didn't send any request at getOutputStream() yet but simply established a connection? Does HttpURLConnection go back to the server to request the response when I invoke getInputStream()?

See above.

> Am I correct to say that openConnection() simply creates a new connection object but does not establish any connection yet?

Yes.

> How can I measure the read overhead and connect overhead?

Connect: take the time that getInputStream() or getOutputStream() takes to return, whichever you call first. Read: the time from starting the first read to reaching end of stream (EOS).
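
A rough sketch of that measurement follows. It is a variation rather than exactly the recipe above: it calls connect() explicitly so that socket setup is timed separately from waiting for the response headers. It assumes Java 9+ and a hypothetical local server that returns 64 KiB; TimingSketch, measure(), and demo() are illustrative names.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

class TimingSketch {

    // Returns {connect ns, wait-for-response ns, read-body ns, body bytes}.
    static long[] measure(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();

        long t0 = System.nanoTime();
        connection.connect();                         // socket setup only
        long t1 = System.nanoTime();
        InputStream in = connection.getInputStream(); // sends the request, waits for headers
        long t2 = System.nanoTime();

        long bytes = 0;
        try (InputStream body = in) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = body.read(buffer)) != -1) {   // read until end of stream
                bytes += n;
            }
        }
        long t3 = System.nanoTime();
        return new long[] {t1 - t0, t2 - t1, t3 - t2, bytes};
    }

    // Measures one GET against a hypothetical local endpoint.
    static long[] demo() throws IOException {
        byte[] payload = new byte[64 * 1024];
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/data", exchange -> {
            exchange.sendResponseHeaders(200, payload.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(payload);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return measure(new URL("http://127.0.0.1:" + port + "/data"));
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] r = demo();
        System.out.printf("connect=%dus wait=%dus read=%dus bytes=%d%n",
                r[0] / 1000, r[1] / 1000, r[2] / 1000, r[3]);
    }
}
```

A single run is noisy, so repeating the measurement and looking at the spread is more informative than one number.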

Solution 4 - Java

> At which point does HttpURLConnection try to establish a connection to the given URL?

It's worth clarifying that there's the URLConnection instance and then there's the underlying TCP/IP (and, for HTTPS, SSL) socket connection: two different concepts. The URLConnection or HttpURLConnection instance corresponds to a single HTTP page request and is created when you call url.openConnection(). But if you call url.openConnection() multiple times on the same url instance then, if you're lucky, the requests will reuse the same TCP/IP socket and SSL handshake. That is good if you're making lots of page requests to the same server, and especially good if you're using SSL, where the overhead of establishing the socket is very high.

See: https://stackoverflow.com/questions/3460990/httpurlconnection-implementation
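
If you want to see whether reuse happens on your setup, one option is to have a test server record the client-side port of each request. The sketch below assumes Java 9+ and a hypothetical local server; ReuseSketch, get(), and demo() are illustrative names. Seeing the same port twice suggests the socket was reused, but that is up to the JDK's keep-alive handling and is not guaranteed.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ReuseSketch {

    // One GET per HttpURLConnection instance. The body is drained and the
    // stream closed, and disconnect() is deliberately not called, which
    // leaves the underlying socket eligible for reuse.
    static int get(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        int status = connection.getResponseCode();
        try (InputStream in = connection.getInputStream()) {
            in.readAllBytes();
        }
        return status;
    }

    // Returns the client port the server saw for each of two requests.
    static List<Integer> demo() throws IOException {
        List<Integer> clientPorts = new CopyOnWriteArrayList<>();
        byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            clientPorts.add(exchange.getRemoteAddress().getPort());
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            get(url);
            get(url);
        } finally {
            server.stop(0);
        }
        return clientPorts;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("client ports seen by the server: " + demo());
    }
}
```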

Solution 5 - Java

I captured the low-level packet exchange and found that the network connection is only triggered by operations such as getInputStream, getOutputStream, getResponseCode, and getResponseMessage.

Here is the packet exchange captured while running a small program that uploads a file to Dropbox.

[Image: packet capture]

Below is my toy program with annotations:

    /* Create a connection LOCAL object,
     * the openConnection() function DOES NOT initiate
     * any packet exchange with the remote server.
     * 
     * The configurations only setup the LOCAL
     * connection object properties.
     */
    HttpURLConnection connection = (HttpURLConnection) dst.openConnection();
    connection.setDoOutput(true);
    connection.setRequestMethod("POST");
    ...//headers setup
    byte[] testContent = {0x32, 0x32};
            
    /*
     * This triggers a packet exchange with the remote
     * server to set up the connection. But writing/flushing
     * to the output stream does not send out any data;
     * the payload is buffered locally.
     */
    try (BufferedOutputStream outputStream = new BufferedOutputStream(connection.getOutputStream())) {
        outputStream.write(testContent);
        outputStream.flush();
    }

    /*
     * This triggers sending the payload to the server.
     * The client gets the whole response (response code,
     * message, and content payload).
     */
    int responseCode = connection.getResponseCode();
    System.out.println(responseCode);

    /* Here no further exchange happens with remote server, since
     * the input stream content has already been buffered
     * in previous step
     */
    try (InputStream is = connection.getInputStream()) {
        Scanner scanner = new Scanner(is);
        StringBuilder stringBuilder = new StringBuilder();
        while (scanner.hasNextLine()) {
            stringBuilder.append(scanner.nextLine()).append(System.lineSeparator());
        }
    }

    /*
     * getResponseMessage() reads from the response that has already
     * been received; disconnect() triggers the disconnection from
     * the server.
     */
    String responsemsg = connection.getResponseMessage();
    System.out.println(responsemsg);
    connection.disconnect();
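
The local buffering described above is the default behaviour. If the body length is known up front, setFixedLengthStreamingMode asks HttpURLConnection to stream the body without buffering it internally. Here is a minimal sketch, assuming Java 9+ and a hypothetical local server that replies with the number of bytes it received; StreamingPostSketch, post(), and demo() are names I chose for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StreamingPostSketch {

    // POSTs the body in fixed-length streaming mode and returns
    // "<status> <response text>".
    static String post(URL url, byte[] body) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setDoOutput(true);
        connection.setRequestMethod("POST");
        // Declares the exact body length so the body does not have to be
        // buffered in memory before the request is sent.
        connection.setFixedLengthStreamingMode(body.length);

        try (OutputStream out = connection.getOutputStream()) {
            out.write(body);
        }
        int status = connection.getResponseCode();
        try (InputStream in = connection.getInputStream()) {
            return status + " " + new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Posts two bytes to a hypothetical endpoint that counts what it got.
    static String demo() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/upload", exchange -> {
            int received = exchange.getRequestBody().readAllBytes().length;
            byte[] reply = String.valueOf(received).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(reply);
            }
        });
        server.start();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/upload");
            return post(url, new byte[] {0x32, 0x32});
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

When the length is not known in advance, setChunkedStreamingMode is the corresponding option.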

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      | Original Author | Original Content on Stackoverflow
Question          | Arci            | View Question on Stackoverflow
Solution 1 - Java | jmort253        | View Answer on Stackoverflow
Solution 2 - Java | anonymous       | View Answer on Stackoverflow
Solution 3 - Java | user207421      | View Answer on Stackoverflow
Solution 4 - Java | Tim Cooper      | View Answer on Stackoverflow
Solution 5 - Java | HarryQ          | View Answer on Stackoverflow