Get final URL after curl is redirected

LinuxRedirectCurlWget

Linux Problem Overview


I need to get the final URL after a page redirect preferably with curl or wget.

For example http://google.com may redirect to http://www.google.com.

The contents are easy to get(ex. curl --max-redirs 10 http://google.com -L), but I'm only interested in the final url (in the former case http://www.google.com).

Is there any way of doing this by using only Linux built-in tools? (command line only)

Linux Solutions


Solution 1 - Linux

curl's -w option and the sub variable url_effective is what you are looking for.

Something like

curl -Ls -o /dev/null -w %{url_effective} http://google.com

More info

-L         Follow redirects
-s         Silent mode. Don't output anything
-o FILE    Write output to <file> instead of stdout
-w FORMAT  What to output after completion

More

You might want to add -I (that is an uppercase i) as well, which will make the command not download any "body", but it then also uses the HEAD method, which is not what the question included and risk changing what the server does. Sometimes servers don't respond well to HEAD even when they respond fine to GET.

Solution 2 - Linux

Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":

#!/bin/bash
curl $1 -s -L -I -o /dev/null -w '%{url_effective}'
  • -o output to /dev/null
  • -I don't actually download, just discover the final URL
  • -s silent mode, no progressbars

This made it possible to call the command from other scripts like this:

echo `finalurl http://someurl/`

Solution 3 - Linux

as another option:

$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

But it doesn't go past the first one.

Solution 4 - Linux

Thank you. I ended up implementing your suggestions: curl -i + grep

curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1

Returns blank if the website doesn't redirect, but that's good enough for me as it works on consecutive redirections.

Could be buggy, but at a glance it works ok.

Solution 5 - Linux

You can do this with wget usually. wget --content-disposition "url" additionally if you add -O /dev/null you will not be actually saving the file.

wget -O /dev/null --content-disposition example.com

Solution 6 - Linux

curl can only follow http redirects. To also follow meta refresh directives and javascript redirects, you need a full-blown browser like headless chrome:

#!/bin/bash
real_url () {
    printf 'location.href\nquit\n' | \
    chromium-browser --headless --disable-gpu --disable-software-rasterizer \
    --disable-dev-shm-usage --no-sandbox --repl "$@" 2> /dev/null \
    | tr -d '>>> ' | jq -r '.result.value'
}

If you don't have chrome installed, you can use it from a docker container:

#!/bin/bash
real_url () {
    printf 'location.href\nquit\n' | \
    docker run -i --rm --user "$(id -u "$USER")" --volume "$(pwd)":/usr/src/app \
    zenika/alpine-chrome --no-sandbox --repl "$@" 2> /dev/null \
    | tr -d '>>> ' | jq -r '.result.value'
}

Like so:

$ real_url http://dx.doi.org/10.1016/j.pgeola.2020.06.005 
https://www.sciencedirect.com/science/article/abs/pii/S0016787820300638?via%3Dihub

Solution 7 - Linux

This would work:

 curl -I somesite.com | perl -n -e '/^Location: (.*)$/ && print "$1\n"'

Solution 8 - Linux

The parameters -L (--location) and -I (--head) still doing unnecessary HEAD-request to the location-url.

If you are sure that you will have no more than one redirect, it is better to disable follow location and use a curl-variable %{redirect_url}.

This code do only one HEAD-request to the specified URL and takes redirect_url from location-header:

curl --head --silent --write-out "%{redirect_url}\n" --output /dev/null "https://""goo.gl/QeJeQ4"

Speed test

all_videos_link.txt - 50 links of goo.gl+bit.ly which redirect to youtube

1. With follow location
time while read -r line; do
    curl -kIsL -w "%{url_effective}\n" -o /dev/null  $line
done < all_videos_link.txt

Results:

real    1m40.832s
user    0m9.266s
sys     0m15.375s
2. Without follow location
time while read -r line; do
    curl -kIs -w "%{redirect_url}\n" -o /dev/null  $line
done < all_videos_link.txt

Results:

real    0m51.037s
user    0m5.297s
sys     0m8.094s

Solution 9 - Linux

I'm not sure how to do it with curl, but libwww-perl installs the GET alias.

$ GET -S -d -e http://google.com
GET http://google.com --> 301 Moved Permanently
GET http://www.google.com/ --> 302 Found
GET http://www.google.ca/ --> 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Sat, 19 Jun 2010 04:11:01 GMT
Server: gws
Content-Type: text/html; charset=ISO-8859-1
Expires: -1
Client-Date: Sat, 19 Jun 2010 04:11:01 GMT
Client-Peer: 74.125.155.105:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=a1925ca9f8af11b9:TM=1276920661:LM=1276920661:S=ULFrHqOiFDDzDVFB; expires=Mon, 18-Jun-2012 04:11:01 GMT; path=/; domain=.google.ca
Title: Google
X-XSS-Protection: 1; mode=block

Solution 10 - Linux

Can you try with it?

#!/bin/bash 
LOCATION=`curl -I 'http://your-domain.com/url/redirect?r=something&a=values-VALUES_FILES&e=zip' | perl -n -e '/^Location: (.*)$/ && print "$1\n"'` 
echo "$LOCATION"

Note: when you execute the command curl -I http://your-domain.com have to use single quotes in the command like curl -I 'http://your-domain.com'

Solution 11 - Linux

You could use grep. doesn't wget tell you where it's redirecting too? Just grep that out.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionviseView Question on Stackoverflow
Solution 1 - LinuxDaniel StenbergView Answer on Stackoverflow
Solution 2 - LinuxJan KoriĆ„ákView Answer on Stackoverflow
Solution 3 - LinuxGavin MoganView Answer on Stackoverflow
Solution 4 - LinuxviseView Answer on Stackoverflow
Solution 5 - LinuxCeagleView Answer on Stackoverflow
Solution 6 - LinuxDave BairdView Answer on Stackoverflow
Solution 7 - LinuxMike QView Answer on Stackoverflow
Solution 8 - LinuxGeographView Answer on Stackoverflow
Solution 9 - LinuxGavin MoganView Answer on Stackoverflow
Solution 10 - LinuxLakshmikandanView Answer on Stackoverflow
Solution 11 - LinuxSpliFFView Answer on Stackoverflow