Rails: What's a good way to validate links (URLs)?

Ruby on-Rails, Ruby, Regex, Validation, Url

Ruby on-Rails Problem Overview


I was wondering how I would best validate URLs in Rails. I was thinking of using a regular expression, but am not sure if this is the best practice.

And, if I were to use a regex, could someone suggest one to me? I am still new to Regex.

Ruby on-Rails Solutions


Solution 1 - Ruby on-Rails

Validating a URL is a tricky job. It's also a very broad request.

What do you want to do, exactly? Do you want to validate the format of the URL, the existence, or what? There are several possibilities, depending on what you want to do.

A regular expression can validate the format of the URL. But even a complex regular expression cannot ensure you are dealing with a valid URL.

For instance, if you take a simple regular expression, it will probably reject the following host

http://invalid##host.com

but it will allow

http://invalid-host.foo

that is a valid host, but not a valid domain if you consider the existing TLDs. Indeed, the solution would work if you want to validate the hostname, not the domain because the following one is a valid hostname

http://host.foo

as well as the following one

http://localhost

Now, let me give you some solutions.

If you want to validate a domain, then you need to forget about regular expressions. The best solution available at the moment is the Public Suffix List, a list maintained by Mozilla. I created a Ruby library to parse and validate domains against the Public Suffix List, and it's called PublicSuffix.

If you want to validate the format of a URI/URL, then you might want to use regular expressions. Instead of searching for one, use the built-in Ruby URI.parse method.

require 'uri'

def valid_url?(url)
  uri = URI.parse(url)
  # A string can parse successfully and still have no host (e.g. "foo")
  !uri.host.nil?
rescue URI::InvalidURIError
  false
end

You can even decide to make it more restrictive. For instance, if you want the URL to be an HTTP/HTTPS URL, then you can make the validation more accurate.

require 'uri'

def valid_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

Of course, there are tons of improvements you can apply to this method, including checking for a path or a scheme.
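
As a rough illustration of such improvements, here is a minimal sketch that also checks the scheme against a whitelist and optionally requires a dot in the host. The names strict_http_url?, ALLOWED_SCHEMES and require_dot are made up for this example and are not part of the original answer.

```ruby
require 'uri'

# Hypothetical whitelist of schemes, chosen for illustration only.
ALLOWED_SCHEMES = %w[http https].freeze

# Returns true only for parseable URLs with an allowed scheme and a host.
# With require_dot: true, single-label hosts such as "localhost" are rejected.
def strict_http_url?(value, require_dot: true)
  uri = URI.parse(value.to_s)
  return false unless ALLOWED_SCHEMES.include?(uri.scheme)
  return false if uri.host.nil? || uri.host.empty?
  return false if require_dot && !uri.host.include?('.')

  true
rescue URI::InvalidURIError
  false
end
```

Whether to reject hosts like localhost depends on the application, which is why the sketch leaves that as an option.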

Last but not least, you can also package this code into a validator:

class HttpUrlValidator < ActiveModel::EachValidator

  def self.compliant?(value)
    uri = URI.parse(value)
    uri.is_a?(URI::HTTP) && !uri.host.nil?
  rescue URI::InvalidURIError
    false
  end

  def validate_each(record, attribute, value)
    unless value.present? && self.class.compliant?(value)
      record.errors.add(attribute, "is not a valid HTTP URL")
    end
  end

end

# in the model
validates :example_attribute, http_url: true

Solution 2 - Ruby on-Rails

I use a one liner inside my models:

validates :url, format: URI::regexp(%w[http https])

I think it is good enough and simple to use. Moreover, it should be theoretically equivalent to Simone's method, as it uses the very same regexp internally.

Solution 3 - Ruby on-Rails

Following Simone's idea, you can easily create your own validator.

class UrlValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    return if value.blank?
    begin
      uri = URI.parse(value)
      resp = uri.kind_of?(URI::HTTP)
    rescue URI::InvalidURIError
      resp = false
    end
    unless resp == true
      record.errors.add(attribute, options[:message] || "is not a URL")
    end
  end
end

and then use

validates :url, :presence => true, :url => true

in your model.

Solution 4 - Ruby on-Rails

There is also the validate_url gem (which is just a nice wrapper for the Addressable::URI.parse solution).

Just add

gem 'validate_url'

to your Gemfile, and then in models you can

validates :click_through_url, url: true

Solution 5 - Ruby on-Rails

This question is already answered, but what the heck, I propose the solution I'm using.

The regexp works fine with all the URLs I've come across. The setter method handles the case where no protocol is given (it assumes http://).

And finally, we try to fetch the page. Maybe I should accept redirects and not only HTTP 200 OK.

# app/models/my_model.rb
validates :website, :allow_blank => true, :uri => { :format => /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix }

def website= url_str
  unless url_str.blank?
    unless url_str.split(':')[0] == 'http' || url_str.split(':')[0] == 'https'
        url_str = "http://" + url_str
    end
  end  
  write_attribute :website, url_str
end

and...

# app/validators/uri_validator.rb
require 'net/http'

# Thanks Ilya! http://www.igvita.com/2006/09/07/validating-url-in-ruby-on-rails/
# Original credits: http://blog.inquirylabs.com/2006/04/13/simple-uri-validation/
# HTTP Codes: http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTPResponse.html

class UriValidator < ActiveModel::EachValidator
  def validate_each(object, attribute, value)
    raise(ArgumentError, "A regular expression must be supplied as the :format option of the options hash") unless options[:format].nil? or options[:format].is_a?(Regexp)
    configuration = { :message => I18n.t('errors.events.invalid_url'), :format => URI::regexp(%w(http https)) }
    configuration.update(options)
    
    if value =~ configuration[:format]
      begin # check header response
        case Net::HTTP.get_response(URI.parse(value))
          when Net::HTTPSuccess then true
          else object.errors.add(attribute, configuration[:message]) and false
        end
      rescue # Recover on DNS failures..
        object.errors.add(attribute, configuration[:message]) and false
      end
    else
      object.errors.add(attribute, configuration[:message]) and false
    end
  end
end
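
On the point about accepting redirects: a minimal sketch of how the response check could be widened, assuming 3xx responses should count as "the page exists". The helper name acceptable_response? is made up for this example.

```ruby
require 'net/http'

# Hypothetical helper: treats 2xx and 3xx responses as acceptable.
# Net::HTTPSuccess and Net::HTTPRedirection are the standard library's
# parent classes for those two status ranges.
def acceptable_response?(response)
  case response
  when Net::HTTPSuccess, Net::HTTPRedirection then true
  else false
  end
end
```

Inside the validator, the same two classes could be listed in the existing case statement instead of Net::HTTPSuccess alone.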

Solution 6 - Ruby on-Rails

The solution that worked for me was:

validates_format_of :url, :with => /\A(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?\Z/i

I tried some of the examples posted here, but I needed to support URLs like the ones listed below.

Notice the use of \A and \Z: if you use ^ and $ instead, Rails validators will raise a security warning.

 Valid ones:
 'www.crowdint.com'
 'crowdint.com'
 'http://crowdint.com'
 'http://www.crowdint.com'

 Invalid ones:
  'http://www.crowdint. com'
  'http://fake'
  'http:fake'
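
A quick way to try the pattern against the examples listed above, as a plain-Ruby sketch outside of Rails. The constant and method names are made up for this example.

```ruby
# The pattern from this answer, copied verbatim.
URL_FORMAT = /\A(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?\Z/i

VALID_EXAMPLES = [
  'www.crowdint.com',
  'crowdint.com',
  'http://crowdint.com',
  'http://www.crowdint.com'
].freeze

INVALID_EXAMPLES = [
  'http://www.crowdint. com',
  'http://fake',
  'http:fake'
].freeze

# Hypothetical helper wrapping the match.
def matches_url_format?(value)
  URL_FORMAT.match?(value)
end

(VALID_EXAMPLES + INVALID_EXAMPLES).each do |url|
  puts "#{url.inspect}: #{matches_url_format?(url)}"
end
```

One detail worth knowing: \Z also matches just before a trailing newline, so a value ending in "\n" still passes. Using \z instead would reject it.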

Solution 7 - Ruby on-Rails

You can also try the valid_url gem, which allows URLs without the scheme and checks the domain zone and IP hostnames.

Add it to your Gemfile:

gem 'valid_url'

And then in model:

class WebSite < ActiveRecord::Base
  validates :url, :url => true
end

Solution 8 - Ruby on-Rails

Just my 2 cents:

before_validation :format_website
validate :website_validator

private

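def format_website
  # Skip blank values and only treat a full http:// or https:// prefix as a scheme
  self.website = "http://#{website}" unless website.blank? || website.match?(%r{\Ahttps?://}i)
end

def website_validator
  errors.add(:website, I18n.t("activerecord.errors.messages.invalid")) unless website_valid?
end

def website_valid?
  # to_s keeps a nil website from raising NoMethodError
  website.to_s.match?(/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-=\?]*)*\/?$/)
end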

EDIT: changed regex to match parameter urls.
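
A scheme-defaulting step like format_website can also be written as a plain function, which makes it easy to try outside a model. This is only a sketch: the name with_default_scheme and its default_scheme parameter are made up for this example. It is nil-safe and treats only a full http:// or https:// prefix as a scheme, so a host such as httpbin.org still gets prefixed.

```ruby
# Hypothetical helper, not part of the original answer.
# Returns the value unchanged if it already starts with http:// or https://,
# otherwise prepends the default scheme. Blank input yields an empty string.
def with_default_scheme(value, default_scheme: 'http')
  url = value.to_s.strip
  return url if url.empty?
  return url if url.match?(%r{\Ahttps?://}i)

  "#{default_scheme}://#{url}"
end
```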

Solution 9 - Ruby on-Rails

I ran into the same problem lately (I needed to validate urls in a Rails app) but I had to cope with the additional requirement of unicode urls (e.g. http://кц.рф)...

I researched a couple of solutions and came across the following:

Solution 10 - Ruby on-Rails

Here is an updated version of the validator posted by David James. It has been published by Benjamin Fleischer. Meanwhile, I pushed an updated fork which can be found here.

require 'addressable/uri'

# Source: http://gist.github.com/bf4/5320847
# Accepts options[:message] and options[:allowed_protocols]
# spec/validators/uri_validator_spec.rb
class UriValidator < ActiveModel::EachValidator

  def validate_each(record, attribute, value)
    uri = parse_uri(value)
    if !uri
      record.errors[attribute] << generic_failure_message
    elsif !allowed_protocols.include?(uri.scheme)
      record.errors[attribute] << "must begin with #{allowed_protocols_humanized}"
    end
  end

private

  def generic_failure_message
    options[:message] || "is an invalid URL"
  end

  def allowed_protocols_humanized
    allowed_protocols.to_sentence(:two_words_connector => ' or ')
  end

  def allowed_protocols
    @allowed_protocols ||= [(options[:allowed_protocols] || ['http', 'https'])].flatten
  end

  def parse_uri(value)
    uri = Addressable::URI.parse(value)
    uri.scheme && uri.host && uri
  rescue URI::InvalidURIError, Addressable::URI::InvalidURIError, TypeError
  end

end

...

require 'spec_helper'

# Source: http://gist.github.com/bf4/5320847
# spec/validators/uri_validator_spec.rb
describe UriValidator do
  subject do
    Class.new do
      include ActiveModel::Validations
      attr_accessor :url
      validates :url, uri: true
    end.new
  end

  it "should be valid for a valid http url" do
    subject.url = 'http://www.google.com'
    subject.valid?
    subject.errors.full_messages.should == []
  end

  ['http://google', 'http://.com', 'http://ftp://ftp.google.com', 'http://ssh://google.com'].each do |invalid_url|
    it "#{invalid_url.inspect} is a invalid http url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.full_messages.should == []
    end
  end

  ['http:/www.google.com','<>hi'].each do |invalid_url|
    it "#{invalid_url.inspect} is an invalid url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.should have_key(:url)
      subject.errors[:url].should include("is an invalid URL")
    end
  end

  ['www.google.com','google.com'].each do |invalid_url|
    it "#{invalid_url.inspect} is an invalid url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.should have_key(:url)
      subject.errors[:url].should include("is an invalid URL")
    end
  end

  ['ftp://ftp.google.com','ssh://google.com'].each do |invalid_url|
    it "#{invalid_url.inspect} is an invalid url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.should have_key(:url)
      subject.errors[:url].should include("must begin with http or https")
    end
  end
end

Please notice that there are still strange HTTP URIs that are parsed as valid addresses.

http://google  
http://.com  
http://ftp://ftp.google.com  
http://ssh://google.com

Here is an issue for the addressable gem which covers the examples.

Solution 11 - Ruby on-Rails

I use a slight variation on lafeber's solution above. It disallows consecutive dots in the hostname (for instance, www.many...dots.com):

%r"\A(https?://)?[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]{2,6}(/.*)?\Z"i

URI.parse seems to mandate scheme prefixing, which in some cases is not what you may want (e.g. if you want to allow your users to quickly spell URLs in forms such as twitter.com/username)
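
To see the difference in practice, here is a small sketch comparing lafeber's pattern with the variation above. Both patterns are copied verbatim from the answers; the constant and method names are made up for this example.

```ruby
# lafeber's pattern: the host part allows any run of letters, digits, dots and dashes.
LOOSE_PATTERN = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-=\?]*)*\/?$/

# The variation from this answer: each label between dots needs at least one character.
STRICT_PATTERN = %r"\A(https?://)?[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]{2,6}(/.*)?\Z"i

# Hypothetical helpers wrapping the two matches.
def loose_match?(value)
  LOOSE_PATTERN.match?(value)
end

def strict_match?(value)
  STRICT_PATTERN.match?(value)
end

['twitter.com/username', 'www.many...dots.com'].each do |url|
  puts "#{url}: loose=#{loose_match?(url)} strict=#{strict_match?(url)}"
end
```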

Solution 12 - Ruby on-Rails

I have been using the 'activevalidators' gem and it works pretty well (not just for URL validation).

you can find it here

It's all documented, but basically, once the gem is added, you'll want to add the following few lines in an initializer, say /config/environments/initializers/active_validators_activation.rb:

# Activate all the validators
ActiveValidators.activate(:all)

(Note : you can replace :all by :url or :whatever if you just want to validate specific types of values)

And then back in your model something like this

class Url < ActiveRecord::Base
   validates :url, :presence => true, :url => true
end

Now Restart the server and that should be it

Solution 13 - Ruby on-Rails

If you want simple validation and a custom error message:

  validates :some_field_expecting_url_value,
            format: {
              with: URI.regexp(%w[http https]),
              message: 'is not a valid URL'
            }

Solution 14 - Ruby on-Rails

I liked to monkeypatch the URI module to add the valid? method

inside config/initializers/uri.rb

module URI
  def self.valid?(url)
    uri = URI.parse(url)
    uri.is_a?(URI::HTTP) && !uri.host.nil?
  rescue URI::InvalidURIError
    false
  end
end

Solution 15 - Ruby on-Rails

You can validate multiple urls using something like:

validates_format_of [:field1, :field2], with: URI.regexp(['http', 'https']), allow_nil: true

Solution 16 - Ruby on-Rails

https://github.com/perfectline/validates_url is a nice and simple gem that will do pretty much everything for you

Solution 17 - Ruby on-Rails

Recently I had this same issue and I found a workaround for validating URLs.

require 'net/http'

validates_format_of :url, :with => URI::regexp(%w(http https))
validate :validate_url

def validate_url
  return if url.blank?

  begin
    source = URI.parse(url)
    # Sending a request confirms that the host resolves and responds
    Net::HTTP.get_response(source)
  rescue URI::InvalidURIError, SocketError
    errors.add(:url, 'is Invalid')
  end
end

The first part of the validate_url method is enough to validate url format. The second part will make sure the url exists by sending a request.

Solution 18 - Ruby on-Rails

And as a module

module UrlValidator
  extend ActiveSupport::Concern
  included do
    validates :url, presence: true, uniqueness: true
    validate :url_format
  end

  def url_format
    begin
      errors.add(:url, "Invalid url") unless URI(self.url).is_a?(URI::HTTP)
    rescue URI::InvalidURIError
      errors.add(:url, "Invalid url")
    end
  end
end

Then just include UrlValidator in any model whose URLs you want to validate. I'm adding this as one more option.

Solution 19 - Ruby on-Rails

URL validation cannot be handled simply by using a regular expression, as the number of websites keeps growing and new domain naming schemes keep coming up.

In my case, I simply write a custom validator that checks for a successful response.

require 'net/http'

class UrlValidator < ActiveModel::Validator
  def validate(record)
    url = URI.parse(record.path)
    # get_response returns a Net::HTTPResponse object;
    # Net::HTTP.get would return only the body as a String
    response = Net::HTTP.get_response(url)
    record.errors.add(:path, 'Web address is invalid') unless response.is_a?(Net::HTTPSuccess)
  rescue StandardError
    record.errors.add(:path, 'Web address is invalid')
  end
end

I am validating the path attribute of my model by using record.path. I am also adding the error to the respective attribute name by using record.errors.add(:path, ...).

You can simply replace this with any attribute name.

Then I simply call the custom validator in my model.

class Url < ApplicationRecord

  # validations
  validates_presence_of :path
  validates_with UrlValidator

end
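
Because a validator like this makes a network call on every save, it may be worth bounding how long it can block. Below is a minimal sketch, assuming a HEAD request with short timeouts is acceptable for the target servers. The names head_response, reachable?, timeout and fetcher are made up for this example; the fetcher parameter exists only so the check can be exercised without network access.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: sends a HEAD request with short open/read timeouts.
def head_response(uri, timeout: 3)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: timeout,
                  read_timeout: timeout) do |http|
    http.head(uri.request_uri)
  end
end

# Returns true only if the URL parses as HTTP(S) and the fetcher
# returns a 2xx response. Any error (DNS, timeout, parse) counts as false.
def reachable?(url, fetcher: method(:head_response))
  uri = URI.parse(url.to_s)
  return false unless uri.is_a?(URI::HTTP) && uri.host

  fetcher.call(uri).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end
```

In a validator, reachable?(record.path) could replace the inline request. Note that some servers answer HEAD differently from GET, so this is a trade-off rather than a drop-in guarantee.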

Solution 20 - Ruby on-Rails

You could use a regex for this. This one works well for me:

(^|[\s.:;?\-\]<\(])(ftp|https?:\/\/[-\w;\/?:@&=+$\|\_.!~*\|'()\[\]%#,]+[\w\/#](\(\))?)(?=$|[\s',\|\(\).:;?\-\[\]>\)])

Solution 21 - Ruby on-Rails

URI::regexp(%w[http https]) is obsolete and should not be used.

Instead, use URI::DEFAULT_PARSER.make_regexp(%w[http https])
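
One thing to keep in mind with either call: the generated pattern is not anchored, so it matches a URL anywhere inside a longer string. A minimal sketch of wrapping it in \A and \z follows. The constant and method names are made up for this example.

```ruby
require 'uri'

# Pattern generated by the standard library; it carries no anchors of its own.
HTTP_URL_PATTERN = URI::DEFAULT_PARSER.make_regexp(%w[http https])

# Anchored variant: the whole string has to be a single http(s) URL.
ANCHORED_HTTP_URL = /\A#{HTTP_URL_PATTERN}\z/

# Hypothetical helper: true if an http(s) URL appears anywhere in the text.
def embedded_http_url?(text)
  HTTP_URL_PATTERN.match?(text)
end

# Hypothetical helper: true only if the entire text is an http(s) URL.
def whole_http_url?(text)
  ANCHORED_HTTP_URL.match?(text)
end
```

In a model, the anchored constant could be passed to the format validator in place of the bare pattern.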

Solution 22 - Ruby on-Rails

Keep it simple:

validates :url, format: %r{\Ahttps?://.+\z}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | jay | View Question on Stackoverflow
Solution 1 - Ruby on-Rails | Simone Carletti | View Answer on Stackoverflow
Solution 2 - Ruby on-Rails | Matteo Collina | View Answer on Stackoverflow
Solution 3 - Ruby on-Rails | jlfenaux | View Answer on Stackoverflow
Solution 4 - Ruby on-Rails | dolzenko | View Answer on Stackoverflow
Solution 5 - Ruby on-Rails | Stefan Pettersson | View Answer on Stackoverflow
Solution 6 - Ruby on-Rails | Heriberto Magaña | View Answer on Stackoverflow
Solution 7 - Ruby on-Rails | Roman Ralovets | View Answer on Stackoverflow
Solution 8 - Ruby on-Rails | lafeber | View Answer on Stackoverflow
Solution 9 - Ruby on-Rails | severin | View Answer on Stackoverflow
Solution 10 - Ruby on-Rails | JJD | View Answer on Stackoverflow
Solution 11 - Ruby on-Rails | Franco | View Answer on Stackoverflow
Solution 12 - Ruby on-Rails | Arnaud Bouchot | View Answer on Stackoverflow
Solution 13 - Ruby on-Rails | Caleb | View Answer on Stackoverflow
Solution 14 - Ruby on-Rails | Blair Anderson | View Answer on Stackoverflow
Solution 15 - Ruby on-Rails | Damien Roche | View Answer on Stackoverflow
Solution 16 - Ruby on-Rails | stuartchaney | View Answer on Stackoverflow
Solution 17 - Ruby on-Rails | Dilnavaz | View Answer on Stackoverflow
Solution 18 - Ruby on-Rails | MCB | View Answer on Stackoverflow
Solution 19 - Ruby on-Rails | Noman Ur Rehman | View Answer on Stackoverflow
Solution 20 - Ruby on-Rails | spirito_libero | View Answer on Stackoverflow
Solution 21 - Ruby on-Rails | yld | View Answer on Stackoverflow
Solution 22 - Ruby on-Rails | Kirill Platonov | View Answer on Stackoverflow