What does preceding a string literal with "r" mean?
Python Problem Overview
I first saw it used in building regular expressions across multiple lines as a method argument to re.compile(), so I assumed that r stands for RegEx.
For example:
import re

regex = re.compile(
    r'^[A-Z]'
    r'[A-Z0-9-]'
    r'[A-Z]$', re.IGNORECASE
)
So what does r mean in this case? Why do we need it?
Python Solutions
Solution 1 - Python
The r means that the string is to be treated as a raw string, which means escape sequences are not processed: every backslash is kept as a literal character.
For example, '\n' is treated as a newline character, while r'\n' is treated as two characters: \ followed by n.
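The difference can be checked directly in an interpreter. The following is a minimal sketch; the variable names normal and raw are only illustrative.

```python
# Compare an ordinary string literal with its raw counterpart.
normal = '\n'    # escape sequence is processed: one newline character
raw = r'\n'      # raw literal: a backslash followed by the letter n

print(len(normal))    # 1
print(len(raw))       # 2
print(raw == '\\n')   # True: same value as an escaped backslash plus n
```

Both forms produce ordinary str objects; the r prefix only changes how the source text of the literal is read.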
> When an 'r' or 'R' prefix is present, a character following a
> backslash is included in the string without change, and all
> backslashes are left in the string. For example, the string literal
> r"\n" consists of two characters: a backslash and a lowercase 'n'.
> String quotes can be escaped with a backslash, but the backslash
> remains in the string; for example, r"\"" is a valid string literal
> consisting of two characters: a backslash and a double quote; r"\"
> is not a valid string literal (even a raw string cannot end in an
> odd number of backslashes). Specifically, a raw string cannot end in
> a single backslash (since the backslash would escape the following
> quote character). Note also that a single backslash followed by a
> newline is interpreted as those two characters as part of the
> string, not as a line continuation.
Source: Python string literals
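The edge cases described in the quoted documentation can be tried out as follows. This is only a sketch: the variable names are illustrative, and compile() is used here purely so that the SyntaxError can be caught at run time instead of stopping the script.

```python
# r"\"" is valid: the backslash keeps the quote from ending the literal,
# but the backslash itself stays in the string.
quoted = r"\""
print(len(quoted))        # 2
print(quoted == '\\"')    # True

# A raw literal cannot end in a single backslash.
# The string below holds the source text  r"\"  which is then compiled.
source = 'r"\\"'
try:
    compile(source, "<example>", "eval")
    ends_in_backslash_ok = True
except SyntaxError:
    ends_in_backslash_ok = False
print(ends_in_backslash_ok)   # False

# In a raw string, backslash + newline stays in the string as two
# characters; it is not treated as a line continuation.
two_lines = r"""a\
b"""
print(len(two_lines))     # 4: a, backslash, newline, b
```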
Solution 2 - Python
It means that escapes won't be translated. For example, r'\n' is a string with a backslash followed by the letter n. (Without the r it would be a newline.)
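As a rough illustration of why this matters for regular expressions, the sketch below uses \b, which Python reads as a backspace character in an ordinary literal but which the re module reads as a word boundary when the backslash survives. The sample text and variable names are made up for the example.

```python
import re

# Ordinary literal: '\b' is one backspace character (U+0008).
# Raw literal: r'\b' is backslash + b, which re treats as a word boundary.
print(len('\b'), len(r'\b'))    # 1 2

text = "cat concat"
raw_matches = re.findall(r'\bcat\b', text)
plain_matches = re.findall('\bcat\b', text)
print(raw_matches)      # ['cat']
print(plain_matches)    # []

# Adjacent string literals are concatenated, so the pattern from the
# question is a single string. It contains no backslashes, so the r
# prefix makes no difference to that particular pattern.
pattern = (
    r'^[A-Z]'
    r'[A-Z0-9-]'
    r'[A-Z]$'
)
print(pattern == '^[A-Z][A-Z0-9-][A-Z]$')   # True
```

Using the r prefix on every regex literal, whether or not it currently contains a backslash, is a common habit because it stays correct if an escape such as \d or \b is added later.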
The b prefix, by contrast, stands for byte string; it is used in Python 3, where strings are Unicode by default. In Python 2.x, strings were byte strings by default, and you would use u to indicate Unicode.
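A short sketch of the prefixes under Python 3 (the variable names are illustrative, and the snippet assumes Python 3.3 or later, where the u prefix is accepted again):

```python
# Python 3: an unprefixed literal is a Unicode str, a b-prefixed one is bytes.
text = 'abc'
data = b'abc'
print(type(text).__name__)             # str
print(type(data).__name__)             # bytes
print(data.decode('ascii') == text)    # True

# The u prefix produces the same str as an unprefixed literal.
legacy = u'abc'
print(legacy == text)                  # True

# Prefixes can be combined: rb gives raw bytes, so the backslash is kept.
raw_bytes = rb'\n'
print(len(raw_bytes))                  # 2
```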