How to get the contents of a webpage in a shell variable?

Linux, Bash, Shell, Wget

Linux Problem Overview


In Linux, how can I fetch a URL and store its contents in a variable in a shell script?

Linux Solutions


Solution 1 - Linux

You can use the wget command to download the page and read it into a variable:

content=$(wget google.com -q -O -)
echo "$content"

We use the -O option of wget, which lets us specify the name of the file into which wget dumps the page contents. We specify - to send the dump to standard output and collect it into the variable content. The -q (quiet) option turns off wget's own output.

You can use the curl command for this as well:

content=$(curl -L google.com)
echo "$content"

We use the -L option because the page we are requesting might have moved; in that case we need to fetch it from the new location. The -L or --location option tells curl to follow such redirects.
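If you want the capture to be more robust, you can also make curl fail on HTTP errors and check the exit status. A minimal sketch (example.com is a placeholder URL):

# -f fails on HTTP error codes, -s silences the progress meter,
# -S still prints errors, -L follows redirects.
if content=$(curl -fsSL https://example.com); then
    printf '%s\n' "$content"
else
    echo "download failed" >&2
fi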

Solution 2 - Linux

There are many ways to get a page from the command line... but it also depends on whether you want the HTML source or the rendered page itself:

If you need the HTML source:

with curl:

curl "$url"

with wget:

wget -O - "$url"

But if you want what you can see in a browser, lynx can be useful:

lynx -dump "$url"

You can find many solutions to this little problem; it's worth reading the man pages for these commands. And don't forget to replace $url with your URL :)
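For example, a small sketch that captures the rendered text into a variable (the URL is a placeholder):

url="https://example.com"
page_text=$(lynx -dump "$url")
printf '%s\n' "$page_text"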

Good luck :)

Solution 3 - Linux

You can use either the wget command or curl.

With wget you typically save the page to a file and work with that file afterwards; with curl you can process the stream directly.
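A sketch of both styles (the URL is a placeholder):

wget -q https://example.com -O page.html    # save the page to a file
curl -s https://example.com | head -n 5     # process the stream directly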


Solution 4 - Linux

content=$(wget -O - "$url")

Solution 5 - Linux

You can use curl or wget to retrieve the raw HTML, or you can use w3m -dump to get a nice text rendering of the page.

$ foo=$(w3m -dump http://www.example.com/); echo $foo
You have reached this web page by typing "example.com", "example.net", "example.org" or "example.edu" into your web browser. These domain names are reserved for use in documentation and are not available for registration. See RFC 2606, Section 3.
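Note that the unquoted $foo collapses w3m's line breaks into single spaces; quote the variable to preserve the original formatting:

$ printf '%s\n' "$foo"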

Solution 6 - Linux

If you have LWP installed, it provides a binary simply named "GET".

$ GET http://example.com
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
<TITLE>Example Web Page</TITLE>
</HEAD>
<body>
<p>You have reached this web page by typing &quot;example.com&quot;, &quot;example.net&quot;, &quot;example.org&quot; or &quot;example.edu&quot; into your web browser.</p>
<p>These domain names are reserved for use in documentation and are not available for registration. See <a href="http://www.rfc-editor.org/rfc/rfc2606.txt">RFC 2606</a>, Section 3.</p>
</BODY>
</HTML>

wget -O-, curl, and lynx -source behave similarly.
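Capturing the page into a variable works the same way as with the other tools:

$ content=$(GET http://example.com)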

Solution 7 - Linux

No curl, no wget, no ncat, nothing? Use telnet. Type the request by hand and end the headers with an empty line:

$ content=$(telnet localhost 80)
GET / HTTP/1.1
Host: localhost
Connection: close
 
Connection closed by foreign host.

$ echo $content
HTTP/1.1 200 OK Date: Mon, 22 Mar 2021 12:45:02 GMT Server:
Apache/2.4.46 (Fedora) OpenSSL/1.1.1j Last-Modified: Mon, 31 Dec 2018
15:56:45 GMT ETag: "a4-57e5375ad21bd" Accept-Ranges: bytes
Content-Length: 164 Connection: close Content-Type: text/html;
charset=UTF-8 Success! 192.168.1.1
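To script this instead of typing the request interactively, you can pipe the request into telnet. A sketch assuming the same server on localhost (the sleep gives the server time to respond before stdin closes):

$ content=$( { printf 'GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n'; sleep 1; } | telnet localhost 80 )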

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Aillyn | View Question on Stackoverflow
Solution 1 - Linux | codaddict | View Answer on Stackoverflow
Solution 2 - Linux | julianvdb | View Answer on Stackoverflow
Solution 3 - Linux | Colin Hebert | View Answer on Stackoverflow
Solution 4 - Linux | Jim Lewis | View Answer on Stackoverflow
Solution 5 - Linux | Giacomo | View Answer on Stackoverflow
Solution 6 - Linux | ephemient | View Answer on Stackoverflow
Solution 7 - Linux | user15452187 | View Answer on Stackoverflow