HttpClient 4 - how to capture last redirect URL

Java, Apache HttpClient 4.x

Java Problem Overview


I have fairly simple HttpClient 4 code that calls HttpGet to fetch HTML output. The HTML comes back with script and image locations all set to local (relative) paths (e.g. <img src="/images/foo.jpg"/>), so I need the calling URL to turn them into absolute ones (<img src="http://foo.com/images/foo.jpg"/>). Now comes the problem: during the call there may be one or two 302 redirects, so the original URL no longer reflects the location of the HTML.

How do I get the final URL of the returned content, given all the redirects I may (or may not) have had?

I looked at HttpGet#getAllHeaders() and HttpResponse#getAllHeaders() - couldn't find anything.

Edited: HttpGet#getURI() returns original calling address

Java Solutions


Solution 1 - Java

That would be the current URL, which you can get by calling

  HttpGet#getURI();

EDIT: You didn't mention how you are handling redirects. That works for us because we handle the 302 ourselves.

It sounds like you are using DefaultRedirectHandler. We used to do that. Getting the current URL is a bit tricky: you need to pass in your own context. Here are the relevant code snippets:

        HttpGet httpget = new HttpGet(url);
        HttpContext context = new BasicHttpContext(); 
        HttpResponse response = httpClient.execute(httpget, context); 
        if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
            throw new IOException(response.getStatusLine().toString());
        HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute( 
                ExecutionContext.HTTP_REQUEST);
        HttpHost currentHost = (HttpHost)  context.getAttribute( 
                ExecutionContext.HTTP_TARGET_HOST);
        String currentUrl = (currentReq.getURI().isAbsolute()) ? currentReq.getURI().toString() : (currentHost.toURI() + currentReq.getURI());

The default redirect handling didn't work for us, so we changed it, but I have forgotten what the problem was.
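The last line of the snippet above (absolute request URI versus host plus path) can be tried in isolation. This is only a sketch using java.net.URI from the JDK, not HttpClient itself; the class name CurrentUrlSketch and the method currentUrl are made up for illustration, and the host is passed in as a plain string where the real code uses HttpHost#toURI().

```java
import java.net.URI;

/** Hypothetical helper mirroring the URL-assembly step, using only the JDK. */
final class CurrentUrlSketch {

    /**
     * @param targetHostUri scheme://host[:port] of the last target host, e.g. "http://foo.com"
     * @param requestUri    URI of the last request, either absolute or path-only
     */
    static String currentUrl(String targetHostUri, URI requestUri) {
        // An absolute request URI already carries its own scheme and host.
        if (requestUri.isAbsolute()) {
            return requestUri.toString();
        }
        // Otherwise prepend the target host, as the original snippet does.
        return targetHostUri + requestUri;
    }

    public static void main(String[] args) {
        System.out.println(currentUrl("http://foo.com", URI.create("/moved/page.html?x=1")));
        System.out.println(currentUrl("http://foo.com", URI.create("http://bar.org/index.html")));
    }
}
```

The string concatenation assumes the relative request URI starts with a slash, which is what a path-only request line looks like.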

Solution 2 - Java

In HttpClient 4, if you are using LaxRedirectStrategy or any subclass of DefaultRedirectStrategy, this is the recommended way (see the source code of DefaultRedirectStrategy):

HttpContext context = new BasicHttpContext();
HttpResult<T> result = client.execute(request, handler, context);
URI finalUrl = request.getURI();
RedirectLocations locations = (RedirectLocations) context.getAttribute(DefaultRedirectStrategy.REDIRECT_LOCATIONS);
if (locations != null) {
    finalUrl = locations.getAll().get(locations.getAll().size() - 1);
}

Since HttpClient 4.3.x, the above code can be simplified as:

HttpClientContext context = HttpClientContext.create();
HttpResult<T> result = client.execute(request, handler, context);
URI finalUrl = request.getURI();
List<URI> locations = context.getRedirectLocations();
if (locations != null) {
    finalUrl = locations.get(locations.size() - 1);
}
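The "take the last entry, otherwise keep the request URI" step is the same in both variants, so it can be pulled into a small helper. A minimal sketch with JDK types only; FinalLocationSketch and finalLocation are hypothetical names, and the extra check for an empty list is a defensive assumption on my part rather than something the snippets above require.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: choose the final URL from a list of redirect locations. */
final class FinalLocationSketch {

    static URI finalLocation(URI original, List<URI> redirects) {
        // No redirects recorded: the original request URI is still the right answer.
        if (redirects == null || redirects.isEmpty()) {
            return original;
        }
        // Otherwise the last recorded location is where the content came from.
        return redirects.get(redirects.size() - 1);
    }

    public static void main(String[] args) {
        URI original = URI.create("http://foo.com/start");
        List<URI> hops = Arrays.asList(
                URI.create("http://foo.com/step2"),
                URI.create("http://bar.org/final"));
        System.out.println(finalLocation(original, hops)); // http://bar.org/final
        System.out.println(finalLocation(original, null)); // http://foo.com/start
    }
}
```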

Solution 3 - Java

    // A HEAD request is enough to follow the redirects without fetching the body.
    HttpHead httpHead = new HttpHead("<put your URL here>");
    HttpClient httpClient = HttpClients.createDefault();
    HttpClientContext context = HttpClientContext.create();
    httpClient.execute(httpHead, context);
    List<URI> redirectURIs = context.getRedirectLocations();
    if (redirectURIs != null && !redirectURIs.isEmpty()) {
        for (URI redirectURI : redirectURIs) {
            System.out.println("Redirect URI: " + redirectURI);
        }
        URI finalURI = redirectURIs.get(redirectURIs.size() - 1);
    }

Solution 4 - Java

I found this in the HttpComponents Client documentation:

CloseableHttpClient httpclient = HttpClients.createDefault();
HttpClientContext context = HttpClientContext.create();
HttpGet httpget = new HttpGet("http://localhost:8080/");
CloseableHttpResponse response = httpclient.execute(httpget, context);
try {
    HttpHost target = context.getTargetHost();
    List<URI> redirectLocations = context.getRedirectLocations();
    URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations);
    System.out.println("Final HTTP location: " + location.toASCIIString());
    // Expected to be an absolute URI
} finally {
    response.close();
}
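With the final location in hand, the relative references from the question (such as /images/foo.jpg) can be made absolute. Here is a minimal sketch using only java.net.URI from the JDK; AbsolutizeSketch and toAbsolute are made-up names for illustration and are not part of HttpClient.

```java
import java.net.URI;

/** Hypothetical helper: resolve a reference found in the HTML against the final URL. */
final class AbsolutizeSketch {

    static String toAbsolute(URI finalLocation, String reference) {
        // URI#resolve handles root-relative, path-relative and already-absolute references.
        return finalLocation.resolve(reference).toString();
    }

    public static void main(String[] args) {
        URI page = URI.create("http://foo.com/a/page.html");
        System.out.println(toAbsolute(page, "/images/foo.jpg"));              // http://foo.com/images/foo.jpg
        System.out.println(toAbsolute(page, "img/x.png"));                    // http://foo.com/a/img/x.png
        System.out.println(toAbsolute(page, "http://cdn.example.org/lib.js")); // unchanged
    }
}
```

URI#resolve throws IllegalArgumentException for strings that are not valid URIs, so real-world HTML attributes may need cleaning up first.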

Solution 5 - Java

An (IMHO) improved approach based on ZZ Coder's solution is to use a response interceptor that simply tracks the last redirect location. That way you don't lose information such as the fragment after a hash sign (#), which is lost without the response interceptor. Example: http://j.mp/OxbI23

private static HttpClient createHttpClient() throws NoSuchAlgorithmException, KeyManagementException {
	SSLContext sslContext = SSLContext.getInstance("SSL");
	TrustManager[] trustAllCerts = new TrustManager[] { new TrustAllTrustManager() };
	sslContext.init(null, trustAllCerts, new java.security.SecureRandom());

	SSLSocketFactory sslSocketFactory = new SSLSocketFactory(sslContext);
	SchemeRegistry schemeRegistry = new SchemeRegistry();
	schemeRegistry.register(new Scheme("https", 443, sslSocketFactory));
	schemeRegistry.register(new Scheme("http", 80, new PlainSocketFactory()));

	HttpParams params = new BasicHttpParams();
	ClientConnectionManager cm = new org.apache.http.impl.conn.SingleClientConnManager(schemeRegistry);

	// some pages require a user agent
	AbstractHttpClient httpClient = new DefaultHttpClient(cm, params);
	HttpProtocolParams.setUserAgent(httpClient.getParams(), "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0.1");

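	// RedirectStrategy is an interface in HttpClient and cannot be instantiated directly;
	// DefaultRedirectStrategy is assumed here unless RedirectStrategy is a class of your own.
	httpClient.setRedirectStrategy(new DefaultRedirectStrategy());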

	httpClient.addResponseInterceptor(new HttpResponseInterceptor() {
		@Override
		public void process(HttpResponse response, HttpContext context)
				throws HttpException, IOException {
			if (response.containsHeader("Location")) {
				Header[] locations = response.getHeaders("Location");
				if (locations.length > 0)
					context.setAttribute(LAST_REDIRECT_URL, locations[0].getValue());
			}
		}
	});

	return httpClient;
}

private String getUrlAfterRedirects(HttpContext context) {
	String lastRedirectUrl = (String) context.getAttribute(LAST_REDIRECT_URL);
	if (lastRedirectUrl != null)
		return lastRedirectUrl;
	else {
		HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(ExecutionContext.HTTP_REQUEST);
		HttpHost currentHost = (HttpHost)  context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
		String currentUrl = (currentReq.getURI().isAbsolute()) ? currentReq.getURI().toString() : (currentHost.toURI() + currentReq.getURI());
		return currentUrl;
	}
}

public static final String LAST_REDIRECT_URL = "last_redirect_url";

Use it just like ZZ Coder's solution:

HttpResponse response = httpClient.execute(httpGet, context);
String url = getUrlAfterRedirects(context);

Solution 6 - Java

I think an easier way to find the last URL is to subclass DefaultRedirectHandler.

package ru.test.test;

import java.net.URI;

import org.apache.http.HttpResponse;
import org.apache.http.ProtocolException;
import org.apache.http.impl.client.DefaultRedirectHandler;
import org.apache.http.protocol.HttpContext;

public class MyRedirectHandler extends DefaultRedirectHandler {

	public URI lastRedirectedUri;

	@Override
	public boolean isRedirectRequested(HttpResponse response, HttpContext context) {
		
		return super.isRedirectRequested(response, context);
	}

	@Override
	public URI getLocationURI(HttpResponse response, HttpContext context)
			throws ProtocolException {

		lastRedirectedUri = super.getLocationURI(response, context);
		
		return lastRedirectedUri;
	}

}

Code to use this handler:

  DefaultHttpClient httpclient = new DefaultHttpClient();
  MyRedirectHandler handler = new MyRedirectHandler();
  httpclient.setRedirectHandler(handler);

  HttpGet get = new HttpGet(url);
    
  HttpResponse response = httpclient.execute(get);

  HttpEntity entity = response.getEntity();
  String lastUrl = url; // fall back to the original URL when no redirect occurred
  if(handler.lastRedirectedUri != null){
      lastUrl = handler.lastRedirectedUri.toString();
  }

Solution 7 - Java

In version 2.3, Android still does not support following redirects (HTTP code 302). I just read the Location header and download again:

if (statusCode != HttpStatus.SC_OK) {
    Header[] headers = response.getHeaders("Location");
    
    if (headers != null && headers.length != 0) {
        String newUrl = headers[headers.length - 1].getValue();
        // call again the same downloading method with new URL
        return downloadBitmap(newUrl);
    } else {
        return null;
    }
}

There is no protection against circular redirects here, so be careful. More on my blog: Follow 302 redirects with AndroidHttpClient
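One way to add that protection is to remember the URLs already visited and to cap the number of hops. This is a rough sketch, not the code from the blog post: RedirectGuardSketch, follow and MAX_REDIRECTS are made-up names, the limit of 10 is arbitrary, and the nextLocation function stands in for "download the URL and read its Location header" so the example runs without a network.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical guard against circular or endless redirects. */
final class RedirectGuardSketch {

    static final int MAX_REDIRECTS = 10; // arbitrary limit chosen for this sketch

    /**
     * @param nextLocation stand-in for the HTTP call; returns the Location value,
     *                     or null when the response is not a redirect
     */
    static String follow(String startUrl, Function<String, String> nextLocation) {
        Set<String> visited = new HashSet<>();
        String current = startUrl;
        for (int hops = 0; hops <= MAX_REDIRECTS; hops++) {
            // Seeing the same URL twice means the redirects loop.
            if (!visited.add(current)) {
                throw new IllegalStateException("Circular redirect at " + current);
            }
            String next = nextLocation.apply(current);
            if (next == null) {
                return current; // not a redirect: this is the final URL
            }
            current = next;
        }
        throw new IllegalStateException("More than " + MAX_REDIRECTS + " redirects");
    }

    public static void main(String[] args) {
        Map<String, String> fakeServer = new HashMap<>();
        fakeServer.put("http://foo.com/a", "http://foo.com/b");
        fakeServer.put("http://foo.com/b", "http://foo.com/c");
        System.out.println(follow("http://foo.com/a", fakeServer::get)); // http://foo.com/c
    }
}
```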

Solution 8 - Java

This is how I managed to get the redirect URL:

Header[] arr = httpResponse.getHeaders("Location");
for (Header head : arr){
    String whatever = head.getValue();
}

Or, if you are sure that there is only one redirect location, do this:

httpResponse.getFirstHeader("Location").getValue();

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Bostone | View Question on Stackoverflow
Solution 1 - Java | ZZ Coder | View Answer on Stackoverflow
Solution 2 - Java | david_p | View Answer on Stackoverflow
Solution 3 - Java | Atharva | View Answer on Stackoverflow
Solution 4 - Java | AmirHossein | View Answer on Stackoverflow
Solution 5 - Java | Michael Pollmeier | View Answer on Stackoverflow
Solution 6 - Java | ydanila | View Answer on Stackoverflow
Solution 7 - Java | Nikola | View Answer on Stackoverflow
Solution 8 - Java | Salman | View Answer on Stackoverflow