InputStream from a URL

Java, URL, InputStream

Java Problem Overview


How do I get an InputStream from a URL?

For example, I want to take the file at the URL www.somewebsite.com/a.txt and read it as an InputStream in Java, through a servlet.

I've tried

InputStream is = new FileInputStream("www.somewebsite.com/a.txt");

but what I got was an error:

java.io.FileNotFoundException

Java Solutions


Solution 1 - Java

Use java.net.URL#openStream() with a proper URL (including the protocol!). E.g.

InputStream input = new URL("http://www.somewebsite.com/a.txt").openStream();
// ...
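
A stream opened this way should be closed again once it has been read. The following is a minimal sketch of that, assuming Java 9 or later (for InputStream.readAllBytes) and UTF-8 text that fits in memory. The names UrlText and readText are made up for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class UrlText {

    // Opens the URL, reads the whole body and closes the stream again.
    // Assumes the content is UTF-8 text and small enough to fit in memory.
    static String readText(URL url) throws IOException {
        try (InputStream input = url.openStream()) {
            return new String(input.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass a full URL including the protocol,
        // e.g. http://www.somewebsite.com/a.txt
        System.out.println(readText(new URL(args[0])));
    }
}
```

The try-with-resources statement closes the stream even when reading fails partway through.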

Solution 2 - Java

Try:

final InputStream is = new URL("http://www.somewebsite.com/a.txt").openStream();

Solution 3 - Java

(a) www.somewebsite.com/a.txt isn't a 'file URL'. It isn't a URL at all. If you put http:// on the front of it, it would be an HTTP URL, which is clearly what you intend here.

(b) FileInputStream is for files, not URLs.

(c) The way to get an input stream from any URL is via URL.openStream(), or URL.openConnection().getInputStream(), which is equivalent but gives you the URLConnection to configure first if you have a reason to.
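
To illustrate point (c), the sketch below goes through URLConnection so that timeouts and a request header can be set before the stream is opened. The class name, method name, header and timeout values are arbitrary choices for this example and are not taken from the answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

class ConnectionStream {

    // Yields the same stream as url.openStream(), but the connection
    // can be configured before it is used.
    static InputStream open(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(10_000); // milliseconds, arbitrary value
        connection.setReadTimeout(10_000);    // milliseconds, arbitrary value
        connection.setRequestProperty("Accept", "text/plain");
        return connection.getInputStream();   // connects implicitly if needed
    }

    public static void main(String[] args) throws IOException {
        try (InputStream input = open(new URL(args[0]))) {
            input.transferTo(System.out);     // requires Java 9+
        }
    }
}
```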

Solution 4 - Java

Your original code uses FileInputStream, which is for accessing files on the local file system.

The constructor you used will attempt to locate a file named a.txt in the www.somewebsite.com subfolder of the current working directory (the value of system property user.dir). The name you provide is resolved to a file using the File class.
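
This resolution can be observed without touching the network. A small sketch, with the hypothetical class name FileResolution, that prints where the relative name ends up:

```java
import java.io.File;

class FileResolution {

    // Returns the absolute path that a relative file name resolves to.
    static String resolve(String name) {
        return new File(name).getAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println("user.dir = " + System.getProperty("user.dir"));
        // The result is a path below user.dir, not a web address.
        System.out.println(resolve("www.somewebsite.com/a.txt"));
    }
}
```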

URL objects are the generic way to solve this. You can use URLs to access local files but also network hosted resources. The URL class supports the file:// protocol besides http:// or https:// so you're good to go.
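
As a sketch of the file:// case, the example below writes a temporary file, turns its path into a file: URL and reads it through the same URL.openStream() call that would be used for http://. The class and method names are illustrative only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileUrlDemo {

    // Builds a file: URL for a local path.
    static URL toFileUrl(Path path) throws IOException {
        return path.toUri().toURL();
    }

    // Counts the bytes behind any URL, local or remote.
    static long countBytes(URL url) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream input = url.openStream()) {
            int read;
            while ((read = input.read(buffer)) != -1) {
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".txt");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
        URL url = toFileUrl(file);
        System.out.println(url.getProtocol()); // file
        System.out.println(countBytes(url));   // 5
        Files.delete(file);
    }
}
```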

Solution 5 - Java

Pure Java:

urlToInputStream(url, httpHeaders);

I have used this method with some success. It handles redirects, and any number of HTTP headers can be passed as a Map<String, String>. It also allows redirects from HTTP to HTTPS.

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Map;
import java.util.Map.Entry;

private InputStream urlToInputStream(URL url, Map<String, String> args) {
	try {
		// Only http:// and https:// URLs are supported; other protocols fail the cast.
		HttpURLConnection con = (HttpURLConnection) url.openConnection();
		con.setConnectTimeout(15000);
		con.setReadTimeout(15000);
		if (args != null) {
			for (Entry<String, String> e : args.entrySet()) {
				con.setRequestProperty(e.getKey(), e.getValue());
			}
		}
		con.connect();
		int responseCode = con.getResponseCode();
		/* By default the connection will follow redirects. The following
		 * block is only entered if the implementation of HttpURLConnection
		 * does not perform the redirect. The exact behavior depends on
		 * the actual implementation (e.g. sun.net).
		 * Attention: this block allows the connection to switch protocols
		 * (e.g. HTTP to HTTPS), which is not the default behavior.
		 * See https://stackoverflow.com/questions/1884230 for more info.
		 */
		if (responseCode < 400 && responseCode > 299) {
			String redirectUrl = con.getHeaderField("Location");
			if (redirectUrl == null) {
				throw new IOException("Redirect response without Location header");
			}
			// Resolves both absolute and relative Location values against the current URL.
			URL newUrl = new URL(url, redirectUrl);
			return urlToInputStream(newUrl, args);
		}

		return con.getInputStream();
	} catch (IOException e) {
		throw new RuntimeException(e);
	}
}
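
The Location header of a redirect may hold an absolute URL or one that is relative to the requested address. One way to see what each form resolves to is java.net.URI.resolve. This is a sketch of the idea rather than part of the method above, and the names RedirectTarget and resolve are made up.

```java
import java.net.URI;

class RedirectTarget {

    // Resolves a Location header value against the address that was requested.
    static URI resolve(String requested, String location) {
        return URI.create(requested).resolve(location);
    }

    public static void main(String[] args) {
        String requested = "http://www.somewebsite.com/docs/a.txt";
        // Absolute target with a protocol switch: returned unchanged.
        System.out.println(resolve(requested, "https://www.somewebsite.com/docs/a.txt"));
        // Target starting with a slash: replaces the whole path.
        System.out.println(resolve(requested, "/b.txt")); // http://www.somewebsite.com/b.txt
        // Relative target: replaces the last path segment.
        System.out.println(resolve(requested, "c.txt"));  // http://www.somewebsite.com/docs/c.txt
    }
}
```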

Full example call

import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

private InputStream getInputStreamFromUrl(URL url, String user, String passwd) throws IOException {
	// HTTP Basic authentication: "user:password", Base64-encoded
	String encoded = Base64.getEncoder().encodeToString((user + ":" + passwd).getBytes(StandardCharsets.UTF_8));
	Map<String, String> httpHeaders = new HashMap<>();
	httpHeaders.put("Accept", "application/json");
	httpHeaders.put("User-Agent", "myApplication");
	httpHeaders.put("Authorization", "Basic " + encoded);
	return urlToInputStream(url, httpHeaders);
}

Solution 6 - Java

Here is a full example which reads the contents of the given web page. The address of the web page is read from an HTML form. We use standard InputStream classes, but it could be done more easily with the JSoup library.

<dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <version>3.1.0</version>
    <scope>provided</scope>
    
</dependency>

<dependency>
    <groupId>commons-validator</groupId>
    <artifactId>commons-validator</artifactId>
    <version>1.6</version>
</dependency>  

These are the Maven dependencies. We use the Apache Commons Validator library to validate URL strings.

package com.zetcode.web;

import com.zetcode.service.WebPageReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "ReadWebPage", urlPatterns = {"/ReadWebPage"})
public class ReadWebpage extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {

        response.setContentType("text/plain;charset=UTF-8");
        
        String page = request.getParameter("webpage");
        
        String content = new WebPageReader().setWebPageName(page).getWebPageContent();
        
        ServletOutputStream os = response.getOutputStream();
        os.write(content.getBytes(StandardCharsets.UTF_8));
    }
}

The ReadWebPage servlet reads the contents of the given web page and sends it back to the client in plain text format. The task of reading the page is delegated to WebPageReader.

package com.zetcode.service;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;
import org.apache.commons.validator.routines.UrlValidator;

public class WebPageReader {

    private String webpage;
    private String content;

    public WebPageReader setWebPageName(String name) {

        webpage = name;
        return this;
    }

    public String getWebPageContent() {

        try {

            boolean valid = validateUrl(webpage);

            if (!valid) {

                content = "Invalid URL; use http(s)://www.example.com format";
                return content;
            }

            URL url = new URL(webpage);

            try (InputStream is = url.openStream();
                    BufferedReader br = new BufferedReader(
                            new InputStreamReader(is, StandardCharsets.UTF_8))) {

                content = br.lines().collect(
                      Collectors.joining(System.lineSeparator()));
            }

        } catch (IOException ex) {

            content = String.format("Cannot read webpage %s", ex);
            Logger.getLogger(WebPageReader.class.getName()).log(Level.SEVERE, null, ex);
        }

        return content;
    }

    private boolean validateUrl(String webpage) {

        UrlValidator urlValidator = new UrlValidator();

        return urlValidator.isValid(webpage);
    }
}

WebPageReader validates the URL and reads the contents of the web page. It returns a string containing the HTML code of the page.
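
Where adding Commons Validator is not an option, a much rougher check can be written with java.net.URI alone. The sketch below only tests for an http or https scheme and a host. It is not equivalent to UrlValidator, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class SimpleUrlCheck {

    // True if the text parses as a URI with an http(s) scheme and a host.
    static boolean looksLikeHttpUrl(String text) {
        if (text == null) {
            return false;
        }
        try {
            URI uri = new URI(text);
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            return http && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeHttpUrl("http://www.somewebsite.com/a.txt")); // true
        System.out.println(looksLikeHttpUrl("www.somewebsite.com/a.txt"));        // false
    }
}
```

The second call returns false because the string has no protocol, which is the same mistake as in the original question.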

<!DOCTYPE html>
<html>
    <head>
        <title>Home page</title>
        <meta charset="UTF-8">
    </head>
    <body>
        <form action="ReadWebPage">
            
            <label for="page">Enter a web page name:</label>
            <input  type="text" id="page" name="webpage">
            
            <button type="submit">Submit</button>
            
        </form>
    </body>
</html>

Finally, this is the home page containing the HTML form. This is taken from my tutorial about this topic.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Whitebear | View Question on Stackoverflow
Solution 1 - Java | BalusC | View Answer on Stackoverflow
Solution 2 - Java | whiskeysierra | View Answer on Stackoverflow
Solution 3 - Java | user207421 | View Answer on Stackoverflow
Solution 4 - Java | Cristian Botiza | View Answer on Stackoverflow
Solution 5 - Java | jschnasse | View Answer on Stackoverflow
Solution 6 - Java | Jan Bodnar | View Answer on Stackoverflow