Difference between \A \z and ^ $ in Ruby regular expressions
Ruby Problem Overview
In the documentation I read:
> Use \A and \z to match the start and end of the string, ^ and $ match the start/end of a line.
I am going to apply a regular expression to check a username (or an e-mail address, the case is the same) submitted by a user. Which anchors should I use with validates_format_of in the model? I can't understand the difference: I've always used ^ and $ ...
Ruby Solutions
Solution 1 - Ruby
If you're depending on the regular expression for validation, you always want to use \A and \z. ^ and $ match only at line boundaries, so a match can stop at a newline character. That means someone could submit an email like [email protected]\n<script>dangerous_stuff();</script> and still have it validate, since the regex only needs to match the line before the \n.
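A minimal sketch of that bypass in plain Ruby. The address `user@example.com` and the simplified email patterns are assumptions made for illustration, not the patterns from the question, and real email validation is usually stricter.

```ruby
# Hypothetical, simplified email patterns that differ only in their anchors.
LINE_ANCHORED   = /^[\w.+-]+@[\w-]+(\.[\w-]+)+$/
STRING_ANCHORED = /\A[\w.+-]+@[\w-]+(\.[\w-]+)+\z/

# Made-up payload: a valid-looking address, a newline, then markup.
PAYLOAD = "user@example.com\n<script>dangerous_stuff();</script>"

# ^...$ is satisfied by the first line alone, so the payload passes.
puts LINE_ANCHORED.match?(PAYLOAD)               # true

# \A...\z must cover the whole string, so the payload is rejected.
puts STRING_ANCHORED.match?(PAYLOAD)             # false

# A clean address still passes the stricter pattern.
puts STRING_ANCHORED.match?("user@example.com")  # true
```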
My recommendation would be to strip newlines from a username or email completely beforehand, since there's pretty much no legitimate reason for one. Then you can safely use either \A and \z or ^ and $.
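One way to sketch the stripping step in plain Ruby. The helper name `strip_newlines` and the username rule `/^\w+$/` are hypothetical, chosen only to show that once no newline remains, the line anchors behave like string anchors.

```ruby
# Hypothetical helper: remove every CR and LF character from the input.
def strip_newlines(input)
  input.delete("\r\n")
end

# Assumed username rule, for illustration only.
USERNAME_PATTERN = /^\w+$/

# A trailing newline is simply dropped.
puts strip_newlines("andrew\n").inspect                 # "andrew"

# An injected second line is folded into the first, so it no longer hides
# behind the newline and the pattern rejects it.
injected = strip_newlines("andrew\n<script>")
puts injected.inspect                                   # "andrew<script>"
puts USERNAME_PATTERN.match?(injected)                  # false
puts USERNAME_PATTERN.match?(strip_newlines("andrew\n")) # true
```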
Solution 2 - Ruby
According to Pickaxe:
>
> ^
> Matches the beginning of a line.
>
> $
> Matches the end of a line.
>
> \A
> Matches the beginning of the string.
>
> \z
> Matches the end of the string.
>
> \Z
> Matches the end of the string unless the string ends with a "\n", in which case it matches just before the "\n".
So, use \A and lowercase \z. If you use \Z, someone could sneak in a trailing newline character. I don't think this is dangerous, but it might break algorithms that assume there's no whitespace in the string. Depending on your regex and string-length constraints, someone could even use an invisible name consisting of just a newline character.
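The difference between the two end anchors can be checked directly. This is a small sketch; the patterns and the name `andrew` are assumptions for illustration, and the last pattern uses `\w*` only to show the "invisible name" case.

```ruby
# Hypothetical username patterns that differ only in the end anchor.
UPPER_Z = /\A\w+\Z/
LOWER_Z = /\A\w+\z/

WITH_NEWLINE = "andrew\n"

# \Z tolerates a single trailing newline, so this input is accepted.
puts UPPER_Z.match?(WITH_NEWLINE)   # true

# \z requires the match to end at the very end of the string.
puts LOWER_Z.match?(WITH_NEWLINE)   # false

# With a pattern that allows an empty name, \Z accepts a lone newline.
puts(/\A\w*\Z/.match?("\n"))        # true
puts(/\A\w*\z/.match?("\n"))        # false
```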
JavaScript's regex implementation treats \A as a literal 'A' (ref), so watch yourself out there and test.
Solution 3 - Ruby
The start and end of a string may not necessarily be the same thing as the start and end of a line. Imagine if you used the following as your test string:
> my
> name
> is
> Andrew
Notice that the string has many lines in it. The ^ and $ anchors allow you to match the beginning and end of each of those lines (basically treating the \n character as a delimiter), while \A and \Z allow you to match the beginning and end of the entire string (\Z also accepts a single trailing newline before the end; \z does not).
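As a quick illustration using that test string, the sketch below scans it with line anchors and then tests it with string anchors. The `\w+` patterns are assumptions chosen to match a single word.

```ruby
TEXT = "my\nname\nis\nAndrew"

# ^ and $ anchor at every line, so scan returns each line as its own match.
LINES = TEXT.scan(/^\w+$/)
puts LINES.inspect                 # ["my", "name", "is", "Andrew"]

# \A and \z anchor to the whole string, which contains newlines that \w
# cannot match, so a single-word pattern fails.
puts TEXT.match?(/\A\w+\z/)        # false

# The same pattern succeeds on a string with only one line.
puts "Andrew".match?(/\A\w+\z/)    # true
```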
Solution 4 - Ruby
Difference By Example
/^foo$/ matches any of the following, /\Afoo\z/ does not:
"whatever1\nfoo\nwhatever2"
"foo\nwhatever2"
"whatever1\nfoo"
/^foo$/ and /\Afoo\z/ both match the following:
"foo"