Why doesn't the ignorecase flag (re.I) work in re.sub()?

Python, Regex

Python Problem Overview


From pydoc:

> re.sub = sub(pattern, repl, string, count=0, flags=0)
> Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.

example code:

import re
print re.sub('class', 'function', 'Class object', re.I)

No replacement is made unless I change pattern to 'Class'.

Documentation doesn't mention anything about this limitation, so I assume I may be doing something wrong.

What's the case here?

Python Solutions


Solution 1 - Python

Seems to me that you should be doing:

import re
print(re.sub('class', 'function', 'Class object', flags=re.I))

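Without the flags= keyword, re.I is passed positionally as count, the fourth parameter, so it limits the number of replacements instead of enabling case-insensitive matching.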
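To see why the original call fails silently instead of raising an error, it helps to note that the flag constants behave as integers, so re.I is a perfectly valid count. A minimal sketch on Python 3 (the sample strings and variable names are made up for illustration):

```python
import re

# re.I is an integer-valued flag, so it is accepted wherever an int is expected.
flag_value = int(re.I)
print(flag_value)  # 2

# Used as count, it only limits the number of replacements;
# matching stays case-sensitive.
limited = re.sub('a', 'b', 'aaaa', count=flag_value)
print(limited)  # bbaa

# Passed by keyword, it enables case-insensitive matching as intended.
fixed = re.sub('class', 'function', 'Class object', flags=re.I)
print(fixed)  # function object
```

In the question's call, 'class' never matches 'Class object' case-sensitively, so the accidental count has nothing to limit and the string comes back unchanged.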

Solution 2 - Python

The flags argument is the fifth one - you're passing the value of re.I as the count argument (an easy mistake to make).

Solution 3 - Python

A note for those who still deal with Python 2.6.x or older installations: the Python 2.6 documentation for re gives these signatures:

re.sub(pattern, repl, string[, count])

re.compile(pattern[, flags])

This means sub does not accept a flags argument in those versions. Pass the flags to compile instead and call sub on the compiled pattern:

import re
# Flags go to compile(); sub() is then called on the compiled pattern.
regex = re.compile('class', re.I)
print(regex.sub("function", "Class object"))
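A compiled pattern carries its flags with it, so its sub method has no flags parameter that could be confused with count. A small sketch on a current Python, using a made-up sample string, showing that count is still available:

```python
import re

# Flags are fixed once, at compile time.
regex = re.compile('class', re.I)

# The compiled pattern's sub takes (repl, string, count=0); no flags here.
all_replaced = regex.sub('function', 'Class class CLASS')
first_two = regex.sub('function', 'Class class CLASS', count=2)

print(all_replaced)  # function function function
print(first_two)     # function function CLASS
```

This form also avoids recompiling the pattern when the same substitution is applied to many strings.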

Solution 4 - Python

To avoid mistakes of this kind, the following monkey patching can be used:

import re
re.sub = lambda pattern, repl, string, *, count=0, flags=0, _fun=re.sub: \
    _fun(pattern, repl, string, count=count, flags=flags)

(* is to forbid specifying count, flags as positional arguments. _fun=re.sub is to use the declaration-time re.sub.)

Demo:

$ python
Python 3.4.2 (default, Oct  8 2014, 10:45:20) 
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.sub(r'\b or \b', ',', 'or x', re.X)
'or x'   # ?!
>>> re.sub = lambda pattern, repl, string, *, count=0, flags=0, _fun=re.sub: \
...     _fun(pattern, repl, string, count=count, flags=flags)
>>> re.sub(r'\b or \b', ',', 'or x', re.X)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: <lambda>() takes 3 positional arguments but 4 were given
>>> re.sub(r'\b or \b', ',', 'or x', flags=re.X)
', x'
>>> 
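If replacing re.sub globally feels too invasive, the same keyword-only idea can live in a separate helper. This is only a sketch for Python 3, not part of the re module; the name sub_kw is hypothetical:

```python
import re


def sub_kw(pattern, repl, string, *, count=0, flags=0):
    """Like re.sub, but count and flags must be passed by keyword."""
    return re.sub(pattern, repl, string, count=count, flags=flags)


# The keyword form works as expected.
result = sub_kw('class', 'function', 'Class object', flags=re.I)
print(result)  # function object

# A positional fourth argument is rejected instead of silently becoming count.
try:
    sub_kw('class', 'function', 'Class object', re.I)
except TypeError as exc:
    print('TypeError:', exc)
```

Unlike the monkey patch, this leaves re.sub untouched for other code that imports re, at the cost of having to call the helper explicitly.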

Solution 5 - Python

Just to add to Seppo's answer: according to http://docs.python.org/2.6/library/re.html, there is still a way to apply flags within a 'sub' call in 2.6, which may be useful if you have to make 2.7 code with many sub calls compatible with 2.6. To quote the manual:

> ... if you need to specify regular expression flags, you must use a RE object, or use embedded modifiers in a pattern; for example, sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'

and

> (?iLmsux) (One or more letters from the set 'i', 'L', 'm', 's', 'u', 'x'.) The group matches the empty string; the letters set the corresponding flags: re.I (ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), re.U (Unicode dependent), and re.X (verbose), for the entire regular expression. (The flags are described in Module Contents.) This is useful if you wish to include the flags as part of the regular expression, instead of passing a flag argument to the re.compile() function.

In practice, this means

print re.sub("class", "function", "Class object", flags=re.I)

can be rewritten using the embedded modifier (?i) as

print re.sub("(?i)class", "function", "Class object")
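As a quick check, the flags= form and the embedded-modifier form should give the same result on a Python version that supports both. The sketch below is written for Python 3 and also tries the scoped form (?i:...), which, as far as I know, needs Python 3.6 or later; the variable names are only illustrative:

```python
import re

text = 'Class object'

# Flag passed by keyword.
with_flag = re.sub('class', 'function', text, flags=re.I)

# Same flag embedded at the start of the pattern.
with_modifier = re.sub('(?i)class', 'function', text)

# Flag scoped to one group only (assumed to require Python 3.6+).
with_scoped = re.sub('(?i:class)', 'function', text)

print(with_flag)
print(with_modifier)
print(with_scoped)
```

Keeping the (?i) modifier at the very start of the pattern is the safest choice, since it then applies to the whole expression.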

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | theta | View Question on Stackoverflow
Solution 1 - Python | André Caron | View Answer on Stackoverflow
Solution 2 - Python | ekhumoro | View Answer on Stackoverflow
Solution 3 - Python | Seppo Erviälä | View Answer on Stackoverflow
Solution 4 - Python | Kirill Bulygin | View Answer on Stackoverflow
Solution 5 - Python | Maksym | View Answer on Stackoverflow