How to check if a URL is valid

Ruby

Ruby Problem Overview


How can I check if a string is a valid URL?

For example:

http://hello.it => yes
http:||bra.ziz, => no

If it is a valid URL, how can I check whether it points to an image file?

Ruby Solutions


Solution 1 - Ruby

Notice:

As @CGuess pointed out, this approach is flawed: the Ruby bug tracker has documented for over 9 years that validation is not the purpose of this regular expression (see https://bugs.ruby-lang.org/issues/6520).




Use the URI module distributed with Ruby:

require 'uri'

if url =~ URI::regexp
    # Correct URL
end

As Alexander Günther said in the comments, this checks whether a string contains a URL.

To check whether the entire string is a URL, use:

url =~ /\A#{URI::regexp}\z/

If you only want to check for web URLs (http or https), use this:

url =~ /\A#{URI::regexp(['http', 'https'])}\z/
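The difference between "contains a URL" and "is a URL" can be sketched as follows. This is only an illustration: the helper names are made up, and the pattern is built from an explicit URI::RFC2396_Parser instance (an assumption on my part) rather than through the URI::regexp shortcut shown above.

```ruby
require 'uri'

# Assumption: build the http/https pattern from an explicit RFC 2396 parser
# instead of the URI::regexp shortcut.
def http_pattern
  URI::RFC2396_Parser.new.make_regexp(%w[http https])
end

# Hypothetical helper: true when an http(s) URL appears anywhere in the text.
def contains_http_url?(text)
  http_pattern.match?(text)
end

# Hypothetical helper: true only when the whole string is an http(s) URL.
def http_url?(text)
  /\A#{http_pattern}\z/.match?(text)
end

puts contains_http_url?("see http://hello.it for details")  # => true
puts http_url?("see http://hello.it for details")           # => false
puts http_url?("http://hello.it")                           # => true
```

The \A and \z anchors are what turn the "contains" check into an "is" check.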

Solution 2 - Ruby

Similar to the answers above, I find using this regex to be slightly more accurate:

URI::DEFAULT_PARSER.regexp[:ABS_URI]

That rejects URLs containing spaces, unlike URI.regexp, which allows spaces for some reason.

I recently found a shortcut for the different URI regexps: any key in URI::DEFAULT_PARSER.regexp.keys can be accessed directly as URI::#{key}.

For example, the :ABS_URI regexp can be accessed from URI::ABS_URI.
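A small sketch of how the :ABS_URI pattern behaves. The helper name is hypothetical, and I'm assuming an explicit URI::RFC2396_Parser instance here so the example doesn't depend on which parser URI::DEFAULT_PARSER points to.

```ruby
require 'uri'

# Hypothetical helper wrapping the :ABS_URI pattern.
# Assumption: an explicit RFC 2396 parser is used instead of URI::DEFAULT_PARSER.
def abs_uri?(string)
  URI::RFC2396_Parser.new.regexp[:ABS_URI].match?(string)
end

puts abs_uri?("http://hello.it")        # => true
puts abs_uri?("http://hello world.it")  # => false (contains a space)
puts abs_uri?("urn:isbn:0451450523")    # => true (any absolute URI passes, not just web URLs)
```

Note that this still accepts any absolute URI, so a scheme check is needed if only web URLs should pass.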

Solution 3 - Ruby

The problem with the current answers is that a URI is not a URL.

> A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").

Since URLs are a subset of URIs, it is clear that matching specifically for URIs will successfully match undesired values. For example, URNs:

 "urn:isbn:0451450523" =~ URI::regexp
 => 0 

That being said, as far as I know, Ruby doesn't have a default way to parse URLs, so you'll most likely need a gem to do so. If you need to match URLs specifically in HTTP or HTTPS format, you could do something like this:

require 'uri'

# Note: URI.parse raises URI::InvalidURIError for malformed input.
uri = URI.parse(my_possible_url)
if uri.kind_of?(URI::HTTP) || uri.kind_of?(URI::HTTPS)
  # do your stuff
end
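Since URI.parse raises on malformed input, the check above is usually wrapped in a method. Here is a minimal sketch; the name web_url? and the extra host check are my own assumptions, not part of the URI module.

```ruby
require 'uri'

# Hypothetical helper: true only for http/https URIs that carry a host.
# URI::HTTPS is a subclass of URI::HTTP, so one is_a? check covers both.
def web_url?(string)
  uri = URI.parse(string)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

puts web_url?("http://hello.it")        # => true
puts web_url?("https://hello.it/path")  # => true
puts web_url?("urn:isbn:0451450523")    # => false (a URN, parsed as URI::Generic)
puts web_url?("http:||bra.ziz")         # => false
```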

Solution 4 - Ruby

I prefer the Addressable gem. I have found that it handles URLs more intelligently.

require 'addressable/uri'

SCHEMES = %w(http https)

def valid_url?(url)
  parsed = Addressable::URI.parse(url) or return false
  SCHEMES.include?(parsed.scheme)
rescue Addressable::URI::InvalidURIError
  false
end

Solution 5 - Ruby

This is a fairly old entry, but I thought I'd go ahead and contribute:

require 'uri'

String.class_eval do
    def is_valid_url?
        uri = URI.parse self
        uri.kind_of? URI::HTTP
    rescue URI::InvalidURIError
        false
    end
end

Now you can do something like:

if "http://www.omg.wtf".is_valid_url?
    p "huzzah!"
end

Solution 6 - Ruby

I use this regular expression:

/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix

Options:

  • i - case insensitive
  • x - ignore whitespace in regex

You can wrap it in a method to validate URLs:

def valid_url?(url)
  return false if url.include?("<script")
  url_regexp = /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix
  url =~ url_regexp ? true : false
end

To use it:

valid_url?("http://stackoverflow.com/questions/1805761/check-if-url-is-valid-ruby")

Tests with invalid URLs:

  • http://ruby3arabi - result is invalid
  • http://http://ruby3arabi.com - result is invalid
  • http:// - result is invalid
  • http://test.com\n<script src=\"nasty.js\"> - result is invalid (caught by the simple "<script" check)
  • 127.0.0.1 - result is invalid (IP addresses are not supported)

Test with correct URLs:

  • http://ruby3arabi.com - result is valid
  • http://www.ruby3arabi.com - result is valid
  • https://www.ruby3arabi.com - result is valid
  • https://www.ruby3arabi.com/article/1 - result is valid
  • https://www.ruby3arabi.com/websites/58e212ff6d275e4bf9000000?locale=en - result is valid
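A few more cases are worth probing before relying on this pattern. The sketch below repeats the method from above so it runs on its own; the sample URLs are hypothetical inputs I picked to show where the pattern is strict.

```ruby
# Same pattern and guard as the method above, repeated so the sketch is self-contained.
def valid_url?(url)
  return false if url.include?("<script")
  url_regexp = /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix
  url =~ url_regexp ? true : false
end

# Hypothetical sample inputs chosen to probe the pattern's limits.
samples = [
  "http://ruby3arabi.com:8080/path",  # accepted: port and path are allowed
  "http://example.museum",            # rejected: TLD longer than five letters
  "http://127.0.0.1",                 # rejected: last label must be letters
  "http://localhost"                  # rejected: no dot-separated TLD
]

samples.each { |url| puts "#{url} => #{valid_url?(url)}" }
```

So the {2,5} limit on the last label rejects longer TLDs, which may or may not be what you want.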

Solution 7 - Ruby

This is a little bit old, but here is how I do it. Use Ruby's URI module to parse the URL. If it can be parsed, then it's a valid URL. (That doesn't mean it is reachable.)

URI supports many schemes, plus you can add custom schemes yourself:

irb> uri = URI.parse "http://hello.it" rescue nil
=> #<URI::HTTP:0x10755c50 URL:http://hello.it>

irb> uri.instance_values
=> {"fragment"=>nil,
 "registry"=>nil,
 "scheme"=>"http",
 "query"=>nil,
 "port"=>80,
 "path"=>"",
 "host"=>"hello.it",
 "password"=>nil,
 "user"=>nil,
 "opaque"=>nil}

irb> uri = URI.parse "http:||bra.ziz" rescue nil
=> nil


irb> uri = URI.parse "ssh://hello.it:5888" rescue nil
=> #<URI::Generic:0x105fe938 URL:ssh://hello.it:5888>
irb> uri.instance_values
=> {"fragment"=>nil,
 "registry"=>nil,
 "scheme"=>"ssh",
 "query"=>nil,
 "port"=>5888,
 "path"=>"",
 "host"=>"hello.it",
 "password"=>nil,
 "user"=>nil,
 "opaque"=>nil}

See the documentation for more information about the URI module.
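The parsed path can also help with the image part of the question. This is a rough sketch under stated assumptions: the helper name and the extension list are hypothetical, and a matching extension only suggests an image; it does not prove what the server actually returns.

```ruby
require 'uri'

# Hypothetical helper: true for http/https URLs whose path ends in a known
# image extension. The extension list is an assumption; extend it as needed.
def image_url?(string)
  image_extensions = %w[.png .jpg .jpeg .gif .webp]
  uri = URI.parse(string)
  return false unless uri.is_a?(URI::HTTP) && uri.host

  # uri.path excludes the query string, so "?size=large" does not interfere.
  image_extensions.include?(File.extname(uri.path).downcase)
rescue URI::InvalidURIError
  false
end

puts image_url?("http://hello.it/logo.PNG")              # => true
puts image_url?("http://hello.it/photo.jpg?size=large")  # => true
puts image_url?("http://hello.it")                       # => false
puts image_url?("http:||bra.ziz")                        # => false
```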

Solution 8 - Ruby

In general,

/^#{URI::regexp}$/

will work well, but if you only want to match http or https, you can pass those in as options to the method:

/^#{URI::regexp(%w(http https))}$/

That tends to work a little better if you want to reject protocols like ftp://.
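One caveat with ^ and $: in Ruby they match at line boundaries, not only at the start and end of the string, so multi-line input can slip through. The sketch below uses a deliberately simplified stand-in pattern (an assumption for illustration, not URI::regexp itself) to show the difference from \A and \z.

```ruby
# Simplified stand-in pattern, anchored per line with ^ and $.
def line_anchored_match?(string)
  /^https?:\/\/\S+$/.match?(string)
end

# The same stand-in pattern, anchored to the whole string with \A and \z.
def string_anchored_match?(string)
  /\Ahttps?:\/\/\S+\z/.match?(string)
end

input = "http://hello.it\n<script src=\"nasty.js\">"

puts line_anchored_match?(input)    # => true (the first line alone matches)
puts string_anchored_match?(input)  # => false
```

For validation, \A and \z are usually the safer anchors.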

Solution 9 - Ruby

You could also use a regex, such as the one at http://www.geekzilla.co.uk/View2D3B0109-C1B2-4B4E-BFFD-E8088CBC85FD.htm. Assuming this regex is correct (I haven't fully checked it), the following shows whether each URL is valid.

# A regexp literal keeps \w and \d intact; inside a double-quoted string they collapse to plain "w" and "d".
url_regex = %r{((https?|ftp|file):((//)|(\\\\))+[\w\d:\#@%/;$()~_?\+-=\\.&]*)}

urls = [
    "http://hello.it",
    "http:||bra.ziz"
]

urls.each { |url|
    if url =~ url_regex then
        puts "%s is valid" % url
    else
        puts "%s not valid" % url
    end
}

The above example outputs:

http://hello.it is valid
http:||bra.ziz not valid

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author    Original Content on Stackoverflow
Question            Luca Romagnoli     View Question on Stackoverflow
Solution 1 - Ruby   Mikael S           View Answer on Stackoverflow
Solution 2 - Ruby   jonuts             View Answer on Stackoverflow
Solution 3 - Ruby   fotanus            View Answer on Stackoverflow
Solution 4 - Ruby   David J.           View Answer on Stackoverflow
Solution 5 - Ruby   Wilhelm Murdoch    View Answer on Stackoverflow
Solution 6 - Ruby   Komsun K.          View Answer on Stackoverflow
Solution 7 - Ruby   nyzm               View Answer on Stackoverflow
Solution 8 - Ruby   user2275806        View Answer on Stackoverflow
Solution 9 - Ruby   Jamie              View Answer on Stackoverflow