Variable-length lookbehind-assertion alternatives for regular expressions

Php, Javascript, Python, Regex, Perl

Php Problem Overview


Is there an implementation of regular expressions in Python/PHP/JavaScript that supports variable-length lookbehind assertions?

/(?<!foo.*)bar/

How can I write a regular expression that has the same meaning, but uses no lookbehind-assertion?

Is there a chance that this type of assertion will be implemented some day?

Things are much better than I thought.

Update:

(1) There are regular expression implementations that already support variable-length lookbehind assertions.

The Python module regex (a separate package, not the standard re module) supports such assertions (and has many other cool features).

>>> import regex
>>> m = regex.search(r'(?<!foo.*)bar', 'f00bar')
>>> print(m.group())
bar
>>> m = regex.search(r'(?<!foo.*)bar', 'foobar')
>>> print(m)
None

It was a really big surprise for me that there is something in regular expressions that Perl can't do and Python can. Perhaps there is an "enhanced regular expression" implementation for Perl as well?

(Thanks and +1 to MRAB).

(2) There is a cool feature \K in modern regular expressions.

This symbol means that when you make a substitution (and from my point of view the most interesting use case for assertions is substitution), the characters matched before \K are left unchanged.

s/unchanged-part\Kchanged-part/new-part/x

That is almost like a look-behind assertion, but not so flexible of course.
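
Python's standard re module does not accept \K, but the same "keep the prefix" effect can be sketched with a capture group that is written back in the replacement. The helper name replace_keep_prefix and the sample strings are illustrative assumptions, not part of the original post.

```python
import re

def replace_keep_prefix(prefix, target, new, text):
    # Rough stand-in for Perl's s/PREFIX\KTARGET/NEW/ (hypothetical helper).
    # prefix and target are regex fragments; new is inserted literally.
    pattern = re.compile('(' + prefix + ')(?:' + target + ')')
    # A function replacement avoids parsing `new` for backreferences.
    # count=1 mirrors a substitution without the /g flag.
    return pattern.sub(lambda m: m.group(1) + new, text, count=1)

result = replace_keep_prefix('unchanged-part', 'changed-part', 'new-part',
                             'unchanged-partchanged-part')
print(result)
```

Group 1 plays the role of everything to the left of \K: it is matched, then put back untouched.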

More about \K:

As far as I understand, you can't use \K twice in the same regular expression, and you can't choose the point up to which the already-matched characters are "killed": it is always everything back to the start of the match.

(Thanks and +1 to ikegami).

My additional questions:

  • Is it possible to say what point must be the final point of \K effect?
  • What about enhanced regular expressions implementations for Perl/Ruby/JavaScript/PHP? Something like regex for Python.

Php Solutions


Solution 1 - Php

Most of the time, you can avoid variable length lookbehinds by using \K.

s/(?<=foo.*)bar/moo/s;

would be

s/foo.*\Kbar/moo/s;

Anything up to the last \K encountered is not considered part of the match (e.g. for the purposes of replacement, $&, etc.).

Negative lookbehinds are a little trickier.

s/(?<!foo.*)bar/moo/s;

would be

s/^(?:(?!foo).)*\Kbar/moo/s;

because (?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR.
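
As a rough Python counterpart of the negative case, the sketch below uses the same (?:(?!foo).)* construct with the standard re module. Since re has no \K, a capture group stands in for it; the function name and the default replacement 'moo' are assumptions made for illustration.

```python
import re

# (?:(?!foo).)* consumes one character at a time and stops before any
# position where "foo" begins, much like [^x]* stops before an "x".
NO_FOO_THEN_BAR = re.compile(r'^((?:(?!foo).)*)bar', re.DOTALL)

def replace_bar_without_foo_before(text, new='moo'):
    # Group 1 is the foo-free prefix; it is written back unchanged.
    # count=1 mirrors a substitution without the /g flag.
    return NO_FOO_THEN_BAR.sub(lambda m: m.group(1) + new, text, count=1)

print(replace_bar_without_foo_before('f00bar'))
print(replace_bar_without_foo_before('foobar'))
```

With these inputs the first string has its bar replaced, while the second is returned unchanged because foo comes before bar.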


If you're just matching, you might not even need the \K.

/foo.*bar/s

/^(?:(?!foo).)*bar/s

Solution 2 - Php

For Python there's a regex implementation which supports variable-length lookbehinds:

http://pypi.python.org/pypi/regex

It's designed to be backwards-compatible with the standard re module.

Solution 3 - Php

You can reverse the string and the pattern, then use a variable-length lookahead:

(rab(?!\w*oof)\w*)

matches in bold:

> raboof rab7790oof **raboo** **rabof** **rab** **rabo** raboooof **rabo**

Original solution as far as I know by:

> Jeff 'japhy' Pinyan
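
A minimal Python sketch of the reversal trick, using only the standard re module. It keeps the answer's pattern, so the check is limited to a single word (\w*) and does not cover the whole string as foo.* would. The function name is an illustrative assumption.

```python
import re

# The answer's reversed pattern: "rab" not followed, later in the
# same word, by "oof".
REVERSED = re.compile(r'(rab(?!\w*oof)\w*)')

def words_with_bar_not_after_foo(text):
    """Return the hits in original orientation and original order."""
    hits = REVERSED.findall(text[::-1])
    # Un-reverse each hit, then restore left-to-right order.
    return [hit[::-1] for hit in reversed(hits)]

# The answer's sample line, which is already written in reversed form.
sample = 'raboof rab7790oof raboo rabof rab rabo raboooof rabo'
reversed_hits = REVERSED.findall(sample)
print(reversed_hits)
print(words_with_bar_not_after_foo('foobar f00bar bar'))
```

The price of this approach is one extra string reversal per search, plus the need to write every pattern backwards by hand.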

Solution 4 - Php

The regexp you show will find any instance of bar which is not preceded by foo.

A simple alternative would be to first match foo against the string, and find the index of the first occurrence. Then search for bar, and see if you can find an occurrence which comes before that index.
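
The two-step search described above could look roughly like this in Python. The function name is hypothetical, and the sketch ignores line boundaries (it behaves like the pattern with the /s flag).

```python
import re

def bars_before_first_foo(text):
    """Start offsets of every 'bar' that has no 'foo' anywhere before it."""
    foo_at = text.find('foo')
    # With no 'foo' at all, every 'bar' qualifies.
    limit = len(text) if foo_at == -1 else foo_at
    # 'bar' cannot overlap 'foo', so starting before the first 'foo'
    # is the same as not being preceded by one.
    return [m.start() for m in re.finditer('bar', text) if m.start() < limit]

print(bars_before_first_foo('f00bar'))
print(bars_before_first_foo('foobar'))
```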

If you want to find instances of bar which are not directly preceded by foo, I could also provide a regexp for that (without using lookbehind), but it will be very ugly. Basically, invert the sense of /foo/ -- i.e. /[^f]oo|[^o]o|[^o]|$/.

Solution 5 - Php

foo.*|(bar)

If foo appears in the string first, the regex will match, but the capture group will not take part in the match (group 1 is None).

Otherwise, it will find bar and assign it to a group.

So you can use this regex and look for your results in the groups found:

>>> import re
>>> m = re.search('foo.*|(bar)', 'f00bar')
>>> if m: print(m.group(1))
bar
>>> m = re.search('foo.*|(bar)', 'foobar')
>>> if m: print(m.group(1))
None
>>> m = re.search('foo.*|(bar)', 'fobas')
>>> if m: print(m.group(1))
>>> 

Source.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Igor Chubin | View Question on Stackoverflow
Solution 1 - Php | ikegami | View Answer on Stackoverflow
Solution 2 - Php | MRAB | View Answer on Stackoverflow
Solution 3 - Php | Benjamin Udink ten Cate | View Answer on Stackoverflow
Solution 4 - Php | Alex D | View Answer on Stackoverflow
Solution 5 - Php | twasbrillig | View Answer on Stackoverflow