Regular expression for a string containing one word but not another
Regex Problem Overview
I'm setting up some goals in Google Analytics and could use a little regex help.
Let's say I have 4 URLs:
http://www.anydotcom.com/test/search.cfm?metric=blah&selector=size&value=1
http://www.anydotcom.com/test/search.cfm?metric=blah2&selector=style&value=1
http://www.anydotcom.com/test/search.cfm?metric=blah3&selector=size&value=1
http://www.anydotcom.com/test/details.cfm?metric=blah&selector=size&value=1
I want to create an expression that will identify any URL that contains the string selector=size but does NOT contain details.cfm.
I know that to find a string that does NOT contain another string I can use this expression:
(^((?!details.cfm).)*$)
But, I'm not sure how to add in the selector=size portion.
Any help would be greatly appreciated!
Regex Solutions
Solution 1 - Regex
This should do it:
^(?!.*details\.cfm).*selector=size.*$
The ^.*selector=size.*$ part should be clear enough. The first bit, (?!.*details\.cfm), is a negative look-ahead: before matching the string, it checks that the string does not contain "details.cfm" (with any number of characters before it).
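To see the look-ahead at work, here is a minimal sketch in Python that runs the pattern against the four URLs from the question. Python's re module is only a stand-in for testing; whether Google Analytics accepts look-aheads is a separate matter, and the names PATTERN, URLS, and matching_urls are made up for this illustration.

```python
import re

# Pattern from this answer: reject "details.cfm" anywhere in the string,
# then require "selector=size" somewhere in it.
PATTERN = re.compile(r"^(?!.*details\.cfm).*selector=size.*$")

# The four sample URLs from the question.
URLS = [
    "http://www.anydotcom.com/test/search.cfm?metric=blah&selector=size&value=1",
    "http://www.anydotcom.com/test/search.cfm?metric=blah2&selector=style&value=1",
    "http://www.anydotcom.com/test/search.cfm?metric=blah3&selector=size&value=1",
    "http://www.anydotcom.com/test/details.cfm?metric=blah&selector=size&value=1",
]


def matching_urls(urls):
    """Return the URLs that the pattern accepts, in their original order."""
    return [u for u in urls if PATTERN.search(u)]


if __name__ == "__main__":
    for url in matching_urls(URLS):
        print(url)
```

Only the first and third URLs should be printed: the second lacks selector=size and the fourth contains details.cfm.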
Solution 2 - Regex
^(?=.*selector=size)(?:(?!details\.cfm).)+$
If your regex engine supports possessive quantifiers (though I suspect Google Analytics does not), then I guess this will perform better for large input sets:
^[^?]*+(?<!details\.cfm).*?selector=size.*$
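The first pattern in this answer works differently from a single leading look-ahead: it first asserts that selector=size occurs somewhere, then consumes the string one character at a time, refusing to step onto any position where details.cfm begins. The sketch below checks that behaviour in Python; it leaves the possessive variant alone because Python's re module only accepts possessive quantifiers from version 3.11 onward. The shortened paths and the helper name is_goal_url are assumptions for illustration.

```python
import re

# (?=.*selector=size)      : the required text must appear somewhere.
# (?:(?!details\.cfm).)+   : consume characters one by one, but never start
#                            a character at a position where "details.cfm" begins.
TEMPERED = re.compile(r"^(?=.*selector=size)(?:(?!details\.cfm).)+$")


def is_goal_url(url):
    """True when the URL has selector=size and no details.cfm."""
    return TEMPERED.search(url) is not None


if __name__ == "__main__":
    for sample in (
        "/test/search.cfm?metric=blah&selector=size&value=1",
        "/test/search.cfm?metric=blah2&selector=style&value=1",
        "/test/details.cfm?metric=blah&selector=size&value=1",
    ):
        print(is_goal_url(sample), sample)
```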
Solution 3 - Regex
The regex could be (Perl syntax):
`/^(?!.*details\.cfm).*selector=size.*$/`
The look-ahead is anchored at the start of the string, so it rejects details.cfm whether it appears before or after selector=size.
Solution 4 - Regex
There is a problem with the regex in the accepted answer. It also matches abcselector=size, selector=sizeabc, etc.
A correct regex can be ^(?!.*\bdetails\.cfm\b).*\bselector=size\b.*$
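The difference the word boundaries make can be checked with a short Python sketch. LOOSE is the pattern without \b and STRICT is the one from this answer; the sample query strings and both names are invented for the comparison.

```python
import re

# Pattern without word boundaries.
LOOSE = re.compile(r"^(?!.*details\.cfm).*selector=size.*$")
# Pattern from this answer, with \b around both literals.
STRICT = re.compile(r"^(?!.*\bdetails\.cfm\b).*\bselector=size\b.*$")

# Hypothetical inputs: one clean parameter and two near misses.
SAMPLES = [
    "search.cfm?metric=blah&selector=size&value=1",
    "search.cfm?abcselector=size",
    "search.cfm?selector=sizeabc",
]


def accepted(pattern, samples):
    """Return the samples that the given compiled pattern matches."""
    return [s for s in samples if pattern.search(s)]


if __name__ == "__main__":
    print("loose :", accepted(LOOSE, SAMPLES))
    print("strict:", accepted(STRICT, SAMPLES))
```

The loose pattern accepts all three samples, while the strict one keeps only the first, because \b requires a non-word character (or the string edge) on each side of selector=size.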
Solution 5 - Regex
I was looking for a way to avoid --line-buffered on a tail in a similar situation to the OP's, and Kobi's solution works great for me. In my case, I exclude lines containing either "bot" or "spider" while including ' / ' (for my root document).
My original command:
tail -f mylogfile | grep --line-buffered -v 'bot\|spider' | grep ' / '
This now becomes (with the -P Perl switch):
tail -f mylogfile | grep -P '^(?!.*(bot|spider)).*\s\/\s.*$'
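The same filter can be tried outside the shell. This is a rough Python sketch of the single-pattern version, not a replacement for the grep pipeline; the log lines are invented examples in a loosely Apache-like format, and keep_line is a hypothetical helper name.

```python
import re

# Same idea as the grep -P pattern: no "bot" or "spider" anywhere on the line,
# and a slash surrounded by whitespace (a request for the root document).
LINE_FILTER = re.compile(r"^(?!.*(bot|spider)).*\s/\s.*$")


def keep_line(line):
    """True when the log line is a root-document hit not made by a bot or spider."""
    return LINE_FILTER.search(line.rstrip("\n")) is not None


# Made-up log lines for illustration only.
LOG_LINES = [
    '1.2.3.4 - - "GET / HTTP/1.1" 200 "Mozilla/5.0"',
    '1.2.3.4 - - "GET / HTTP/1.1" 200 "Googlebot/2.1"',
    '1.2.3.4 - - "GET / HTTP/1.1" 200 "Baiduspider"',
    '1.2.3.4 - - "GET /about HTTP/1.1" 200 "Mozilla/5.0"',
]

if __name__ == "__main__":
    for line in LOG_LINES:
        if keep_line(line):
            print(line)
```

Of the four sample lines, only the first should survive: the next two mention a bot or spider, and the last does not request the root document.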