What's the state of the art in email validation for Rails?
Ruby on-Rails Problem Overview
What are you using to validate users' email addresses, and why?
I had been using validates_email_veracity_of, which actually queries the MX servers. But that fails too often for various reasons, mostly related to network traffic and reliability.
I looked around and I couldn't find anything obvious that a lot of people are using to perform a sanity check on an email address. Is there a maintained, reasonably accurate plugin or gem for this?
P.S.: Please don't tell me to send an email with a link to see if the email works. I'm developing a "send to a friend" feature, so this isn't practical.
Ruby on-Rails Solutions
Solution 1 - Ruby on-Rails
Don't make this harder than it needs to be. Your feature is non-critical; validation's just a basic sanity step to catch typos. I would do it with a simple regex, and not waste the CPU cycles on anything too complicated:
/\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+\z/
That was adapted from <http://www.regular-expressions.info/email.html> -- which you should read if you really want to know all the tradeoffs. If you want a more correct and much more complicated fully RFC822-compliant regex, that's on that page too. But the thing is this: you don't have to get it totally right.
If the address passes validation, you're going to send an email. If the email fails, you're going to get an error message. At which point you can tell the user "Sorry, your friend didn't receive that, would you like to try again?" or flag it for manual review, or just ignore it, or whatever.
These are the same options you'd have to deal with if the address did pass validation. Because even if your validation is perfect and you acquire absolute proof that the address exists, sending could still fail.
The cost of a false positive on validation is low. The benefit of better validation is also low. Validate generously, and worry about errors when they happen.
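As a rough illustration of how generous that regex is, here is a minimal plain-Ruby sketch. The constant name SIMPLE_EMAIL and the helper plausible_email? are made up for this example and are not part of any library:

```ruby
# Hypothetical names; the pattern itself is the one quoted above.
SIMPLE_EMAIL = /\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+\z/

# Returns true when the string looks roughly like an email address.
def plausible_email?(address)
  SIMPLE_EMAIL.match?(address.to_s)
end

plausible_email?('friend+news@example.co.uk')  # passes: tags and subdomains are fine
plausible_email?('friend@example')             # fails: no dot in the domain
plausible_email?("friend@example.com\nextra")  # fails: \A and \z anchor the whole string
```

Anything that slips through this check still goes through the "did the send fail?" path described above, which is the point of validating generously.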
Solution 2 - Ruby on-Rails
With Rails 3.0 you can validate email addresses without a regexp by using the Mail gem.
Here is my implementation (packaged as a gem).
Solution 3 - Ruby on-Rails
I created a gem for email validation in Rails 3. I'm kinda surprised that Rails doesn't include something like this by default.
Solution 4 - Ruby on-Rails
This project seems to have the most watchers on github at the moment (for email validation in rails):
Solution 5 - Ruby on-Rails
From the Rails 4 docs:
class EmailValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    unless value =~ /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
      record.errors[attribute] << (options[:message] || "is not an email")
    end
  end
end

class Person < ActiveRecord::Base
  validates :email, presence: true, email: true
end
Solution 6 - Ruby on-Rails
In Rails 4 simply add validates :email, email: true (assuming your field is called email) to your model and then write a simple (or complex†) EmailValidator to suit your needs.
For example, your model:
class TestUser
  include Mongoid::Document
  field :email, type: String
  validates :email, email: true
end
Your validator (goes in app/validators/email_validator.rb):
class EmailValidator < ActiveModel::EachValidator
  EMAIL_ADDRESS_QTEXT = Regexp.new '[^\\x0d\\x22\\x5c\\x80-\\xff]', nil, 'n'
  EMAIL_ADDRESS_DTEXT = Regexp.new '[^\\x0d\\x5b-\\x5d\\x80-\\xff]', nil, 'n'
  EMAIL_ADDRESS_ATOM = Regexp.new '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+', nil, 'n'
  EMAIL_ADDRESS_QUOTED_PAIR = Regexp.new '\\x5c[\\x00-\\x7f]', nil, 'n'
  EMAIL_ADDRESS_DOMAIN_LITERAL = Regexp.new "\\x5b(?:#{EMAIL_ADDRESS_DTEXT}|#{EMAIL_ADDRESS_QUOTED_PAIR})*\\x5d", nil, 'n'
  EMAIL_ADDRESS_QUOTED_STRING = Regexp.new "\\x22(?:#{EMAIL_ADDRESS_QTEXT}|#{EMAIL_ADDRESS_QUOTED_PAIR})*\\x22", nil, 'n'
  EMAIL_ADDRESS_DOMAIN_REF = EMAIL_ADDRESS_ATOM
  EMAIL_ADDRESS_SUB_DOMAIN = "(?:#{EMAIL_ADDRESS_DOMAIN_REF}|#{EMAIL_ADDRESS_DOMAIN_LITERAL})"
  EMAIL_ADDRESS_WORD = "(?:#{EMAIL_ADDRESS_ATOM}|#{EMAIL_ADDRESS_QUOTED_STRING})"
  EMAIL_ADDRESS_DOMAIN = "#{EMAIL_ADDRESS_SUB_DOMAIN}(?:\\x2e#{EMAIL_ADDRESS_SUB_DOMAIN})*"
  EMAIL_ADDRESS_LOCAL_PART = "#{EMAIL_ADDRESS_WORD}(?:\\x2e#{EMAIL_ADDRESS_WORD})*"
  EMAIL_ADDRESS_SPEC = "#{EMAIL_ADDRESS_LOCAL_PART}\\x40#{EMAIL_ADDRESS_DOMAIN}"
  EMAIL_ADDRESS_PATTERN = Regexp.new "#{EMAIL_ADDRESS_SPEC}", nil, 'n'
  EMAIL_ADDRESS_EXACT_PATTERN = Regexp.new "\\A#{EMAIL_ADDRESS_SPEC}\\z", nil, 'n'

  def validate_each(record, attribute, value)
    unless value =~ EMAIL_ADDRESS_EXACT_PATTERN
      record.errors[attribute] << (options[:message] || 'is not a valid email')
    end
  end
end
This will allow all sorts of valid emails, including tagged emails like "user+tag@example.com" and so on.
To test this with RSpec, put the following in your spec/validators/email_validator_spec.rb:
require 'spec_helper'

describe "EmailValidator" do
  let(:validator) { EmailValidator.new({attributes: [:email]}) }
  let(:model) { double('model') }

  before :each do
    model.stub("errors").and_return([])
    model.errors.stub('[]').and_return({})
    model.errors[].stub('<<')
  end

  context "given an invalid email address" do
    let(:invalid_email) { 'test test tes' }
    it "is rejected as invalid" do
      model.errors[].should_receive('<<')
      validator.validate_each(model, "email", invalid_email)
    end
  end

  context "given a simple valid address" do
    let(:valid_simple_email) { 'user@example.com' }
    it "is accepted as valid" do
      model.errors[].should_not_receive('<<')
      validator.validate_each(model, "email", valid_simple_email)
    end
  end

  context "given a valid tagged address" do
    let(:valid_tagged_email) { 'user+tag@example.com' }
    it "is accepted as valid" do
      model.errors[].should_not_receive('<<')
      validator.validate_each(model, "email", valid_tagged_email)
    end
  end
end
This is how I've done it anyway. YMMV
†Regular expressions are like violence; if they don't work you are not using enough of them.
Solution 7 - Ruby on-Rails
In Rails 3 it's possible to write a reusable validator, as this great post explains:
class EmailValidator < ActiveModel::Validator
  # validates_with passes the record being validated to #validate
  def validate(record)
    record.errors[:email] << "is not valid" unless
      record.email =~ /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
  end
end
and use it with validates_with:
class User < ActiveRecord::Base
  validates_with EmailValidator
end
Solution 8 - Ruby on-Rails
As Hallelujah suggests I think using the Mail gem is a good approach. However, I dislike some of the hoops there.
I use:
def self.is_valid?(email)
  parser = Mail::RFC2822Parser.new
  parser.root = :addr_spec
  result = parser.parse(email)
  # Don't allow a bare hostname without a TLD (sam@localhost)
  # The grammar is: (local_part "@" domain) / local_part ... discard the latter
  result &&
    result.respond_to?(:domain) &&
    result.domain.dot_atom_text.elements.size > 1
end
You could be stricter by demanding that the TLDs (top-level domains) are in this list; however, you would be forced to update that list as new TLDs pop up (as happened with .mobi and .tel).
The advantage of hooking into the parser directly is that the rules in the Mail grammar are fairly permissive for the portions the Mail gem uses: it is designed to parse an address like user<user@example.com>, which is common for SMTP. If you consume it through Mail::Address instead, you are forced to do a bunch of extra checks.
Another note regarding the Mail gem: even though the class is called RFC2822, the grammar has some elements of RFC5322, for example this test.
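The "no bare hostname" rule from the snippet above can also be sketched without the Mail parser. This swaps in Ruby's standard-library URI::MailTo::EMAIL_REGEXP, which is a different and simpler grammar than the Mail gem's, and the helper name address_with_tld? is hypothetical:

```ruby
require 'uri'

# Hypothetical helper: syntax check via the stdlib pattern, then require
# at least two dot-separated labels in the domain (so sam@localhost fails).
def address_with_tld?(email)
  candidate = email.to_s
  return false unless URI::MailTo::EMAIL_REGEXP.match?(candidate)

  # The pattern guarantees exactly one "@", so everything after it is the domain.
  domain = candidate.rpartition('@').last
  domain.split('.').size > 1
end

address_with_tld?('sam@example.com')  # accepted
address_with_tld?('sam@localhost')    # rejected: domain has a single label
```

The stdlib pattern on its own accepts sam@localhost, which is why the extra label count is needed here, just as the domain check is needed in the Mail-based version.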
Solution 9 - Ruby on-Rails
Noting the other answers, the question still remains - why bother being clever about it?
The sheer number of edge cases that many regexes wrongly reject or miss seems problematic.
I think the question is 'what am I trying to achieve?' Even if you 'validate' the email address, you're not actually validating that it is a working email address.
If you go for regexp, just check for the presence of @ on the client side.
As for the incorrect email scenario, have a 'message failed to send' branch to your code.
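A minimal sketch of that shape, assuming a server-side check rather than a client-side one. DeliveryError, send_to_friend, and the deliver callable are all hypothetical stand-ins for whatever your mailer actually provides:

```ruby
# Hypothetical error type; a real app would rescue its mailer's own errors.
class DeliveryError < StandardError; end

# Returns a symbol describing what happened so the caller can react.
# `deliver` is any callable that raises DeliveryError when sending fails.
def send_to_friend(address, deliver:)
  # The only "validation": the address must contain an @.
  return :rejected unless address.to_s.include?('@')

  deliver.call(address)
  :sent
rescue DeliveryError
  # The 'message failed to send' branch.
  :failed_to_send
end

# A sender that always fails, to exercise the failure branch.
flaky = ->(addr) { raise DeliveryError, "no such mailbox: #{addr}" }
send_to_friend('nobody@example.invalid', deliver: flaky)  # takes the :failed_to_send branch
```

The caller can then map :failed_to_send to a "your friend didn't receive that" message, which handles the bad addresses that no regex would have caught.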
Solution 10 - Ruby on-Rails
There are basically three common options:
- Regexp (there is no works-for-all e-mail address regexp, so roll your own)
- MX query (that is what you use)
- Generating an activation token and mailing it (restful_authentication way)
If you don't want to use either validates_email_veracity_of or token generation, I'd go with old-school regexp checking.
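For reference, the MX query option can be sketched with Ruby's standard-library Resolv. This is not how validates_email_veracity_of is implemented, just an illustration of the idea; the helper name mx_record? and its resolver: parameter are assumptions made for this example:

```ruby
require 'resolv'

# Hypothetical helper: true when the domain of `email` publishes at least one
# MX record. `resolver` is injectable so the lookup can be stubbed in tests.
def mx_record?(email, resolver: Resolv::DNS.new)
  _local, at, domain = email.to_s.rpartition('@')
  return false if at.empty? || domain.empty?

  resolver.getresources(domain, Resolv::DNS::Resource::IN::MX).any?
rescue Resolv::ResolvError, Resolv::ResolvTimeout
  # Treat lookup failures as "no MX"; a caller may prefer to let these pass.
  false
end
```

Every call is a network round trip that can be slow or fail for reasons unrelated to the address, which is the reliability problem described in the question.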
Solution 11 - Ruby on-Rails
The Mail gem has a built in address parser.
begin
  Mail::Address.new(email)
  # valid
rescue Mail::Field::ParseError => e
  # invalid
end
Solution 12 - Ruby on-Rails
This solution is based on answers by @SFEley and @Alessandro DS, with a refactor, and usage clarification.
You can use this validator class in your model like so:
class MyModel < ActiveRecord::Base
  # ...
  validates :column, :email => { :allow_nil => true, :message => 'O hai Mark!' }
  # ...
end
Given you have the following in your app/validators folder (Rails 3):
class EmailValidator < ActiveModel::EachValidator
  # Rails skips validate_each for nil values when :allow_nil is set,
  # so a nil that reaches this method is treated as invalid.
  def validate_each(record, attribute, value)
    unless matches?(value)
      record.errors[attribute] << (options[:message] || 'must be a valid email address')
    end
  end

  def matches?(value)
    return false unless value
    !/\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+\z/.match(value).nil?
  end
end
Solution 13 - Ruby on-Rails
For Mailing Lists Validation. (I use Rails 4.1.6)
I got my regexp from here. It seems to be a very complete one, and it's been tested against a great number of combinations. You can see the results on that page.
I adapted it slightly into a Ruby regexp and put it in my lib/validators/email_list_validator.rb.
Here's the code:
require 'mail'
class EmailListValidator < ActiveModel::EachValidator
# Regexp source: https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php
EMAIL_VALIDATION_REGEXP = Regexp.new('\A(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))\z', true)
  def validate_each(record, attribute, value)
    begin
      invalid_emails = Mail::AddressList.new(value).addresses.map do |mail_address|
        # check if domain is present and if it passes validation through the regex
        (mail_address.domain.present? && mail_address.address =~ EMAIL_VALIDATION_REGEXP) ? nil : mail_address.address
      end
      invalid_emails.uniq!
      invalid_emails.compact!
      record.errors.add(attribute, :invalid_emails, :emails => invalid_emails.to_sentence) if invalid_emails.present?
    rescue Mail::Field::ParseError => e
      # Parse error on email field.
      # exception attributes are:
      #   e.element: kind of element that was wrong (for an invalid address it is Mail::AddressListParser)
      #   e.value: mail addresses passed to the parser (string)
      #   e.reason: description of the problem; a message that is not very user friendly
      if e.reason.include?('Expected one of')
        record.errors.add(attribute, :invalid_email_list_characters)
      else
        record.errors.add(attribute, :invalid_emails_generic)
      end
    end
  end
end
And I use it like this in the model:
validates :emails, :presence => true, :email_list => true
It will validate mailing lists like this one, with different separators and syntax:
mail_list = 'John Doe <john@doe.com>, chuck@schuld.dea.th; David G. <david@pink.floyd.division.bell>'
Before using this regexp, I used Devise.email_regexp, but that is a very simple regexp and didn't get all the cases I needed. Some emails bounced.
I tried other regexps from the web, but this one's got the best results till now. Hope it helps in your case.
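If you want to see the list-splitting idea without the Mail gem, here is a much cruder plain-Ruby sketch. It is not the validator above: it swaps in the standard-library URI::MailTo::EMAIL_REGEXP for the long regexp, splits naively on commas and semicolons (so display names containing those characters are not handled), and the helper name invalid_addresses is made up for this example:

```ruby
require 'uri'

# Hypothetical helper: returns the entries of a mailing-list string that do
# not look like valid addresses.
def invalid_addresses(list)
  entries = list.to_s.split(/[,;]/).map(&:strip).reject(&:empty?)

  # Use the part inside <...> when a display name is present.
  addresses = entries.map { |entry| entry[/<([^<>]+)>\z/, 1] || entry }

  addresses.reject { |address| URI::MailTo::EMAIL_REGEXP.match?(address) }
end

mail_list = 'John Doe <john@doe.com>, chuck@schuld.dea.th; David G. <david@pink.floyd.division.bell>'
invalid_addresses(mail_list)                        # nothing is flagged for this list
invalid_addresses('ok@example.com, not an address') # flags "not an address"
```

Mail::AddressList remains the better choice when display names, quoting, or comments matter, since it parses the list according to the mail grammar instead of splitting on punctuation.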