How do I match any character across multiple lines in a regular expression?

RegexMultiline

Regex Problem Overview


For example, this regex

(.*)<FooBar>

will match:

abcde<FooBar>

But how do I get it to match across multiple lines?

abcde
fghij<FooBar>

Regex Solutions


Solution 1 - Regex

Try this:

((.|\n)*)<FooBar>

It basically says "any character or a newline" repeated zero or more times.

Solution 2 - Regex

It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:

/(.*)<FooBar>/s

The s at the end causes the dot to match all characters including newlines.

Solution 3 - Regex

The question is, can the . pattern match any character? The answer varies from engine to engine. The main difference is whether the pattern is used by a POSIX or non-POSIX regex library.

A special note about [tag:lua-patterns]: they are not considered regular expressions, but . matches any character there, the same as POSIX-based engines.

Another note on [tag:matlab] and [tag:octave]: the . matches any character by default (demo): str = "abcde\n fghij<Foobar>"; expression = '(.*)<Foobar>*'; [tokens,matches] = regexp(str,expression,'tokens','match'); (tokens contain a abcde\n fghij item).

Also, in all of [tag:Boost]'s regex grammars the dot matches line breaks by default. Boost's ECMAScript grammar allows you to turn this off with regex_constants::no_mod_m (source).

As for [tag:oracle] (it is POSIX based), use the n option (demo): select regexp_substr('abcde' || chr(10) ||' fghij<Foobar>', '(.*)<Foobar>', 1, 1, 'n', 1) as results from dual

POSIX-based engines:

A mere . already matches line breaks, so there isn't a need to use any modifiers, see [tag:bash] (demo).

The [tag:tcl] (demo), [tag:postgresql] (demo), [tag:r] (TRE, base R default engine with no perl=TRUE, for base R with perl=TRUE or for stringr/stringi patterns, use the (?s) inline modifier) (demo) also treat . the same way.

However, most POSIX-based tools process input line by line. Hence, . does not match the line breaks just because they are not in scope. Here are some examples how to override this:

  • [tag:sed] - There are multiple workarounds. The most precise, but not very safe, is sed 'H;1h;$!d;x; s/\(.*\)><Foobar>/\1/' (H;1h;$!d;x; slurps the file into memory). If whole lines must be included, sed '/start_pattern/,/end_pattern/d' file (removing from start will end with matched lines included) or sed '/start_pattern/,/end_pattern/{{//!d;};}' file (with matching lines excluded) can be considered.
  • [tag:perl] - perl -0pe 's/(.*)<FooBar>/$1/gs' <<< "$str" (-0 slurps the whole file into memory, -p prints the file after applying the script given by -e). Note that using -000pe will slurp the file and activate 'paragraph mode' where Perl uses consecutive newlines (\n\n) as the record separator.
  • [tag:gnu-grep] - grep -Poz '(?si)abc\K.*?(?=<Foobar>)' file. Here, z enables file slurping, (?s) enables the DOTALL mode for the . pattern, (?i) enables case insensitive mode, \K omits the text matched so far, *? is a lazy quantifier, (?=<Foobar>) matches the location before <Foobar>.
  • [tag:pcregrep] - pcregrep -Mi "(?si)abc\K.*?(?=<Foobar>)" file (M enables file slurping here). Note pcregrep is a good solution for macOS grep users.

See demos.

Non-POSIX-based engines:

  • [tag:php] - Use the s modifier PCRE_DOTALL modifier: preg_match('~(.*)<Foobar>~s', $s, $m) (demo)

  • [tag:C#] - Use RegexOptions.Singleline flag (demo):
    - var result = Regex.Match(s, @"(.*)<Foobar>", RegexOptions.Singleline).Groups[1].Value;
    - var result = Regex.Match(s, @"(?s)(.*)<Foobar>").Groups[1].Value;

  • [tag:powershell] - Use the (?s) inline option: $s = "abcde`nfghij<FooBar>"; $s -match "(?s)(.*)<Foobar>"; $matches[1]

  • [tag:perl] - Use the s modifier (or (?s) inline version at the start) (demo): /(.*)<FooBar>/s

  • [tag:python] - Use the re.DOTALL (or re.S) flags or (?s) inline modifier (demo): m = re.search(r"(.*)<FooBar>", s, flags=re.S) (and then if m:, print(m.group(1)))

  • [tag:java] - Use Pattern.DOTALL modifier (or inline (?s) flag) (demo): Pattern.compile("(.*)<FooBar>", Pattern.DOTALL)

  • [tag:kotlin] - Use RegexOption.DOT_MATCHES_ALL : "(.*)<FooBar>".toRegex(RegexOption.DOT_MATCHES_ALL)

  • [tag:groovy] - Use (?s) in-pattern modifier (demo): regex = /(?s)(.*)<FooBar>/

  • [tag:scala] - Use (?s) modifier (demo): "(?s)(.*)<Foobar>".r.findAllIn("abcde\n fghij<Foobar>").matchData foreach { m => println(m.group(1)) }

  • [tag:javascript] - Use [^] or workarounds [\d\D] / [\w\W] / [\s\S] (demo): s.match(/([\s\S]*)<FooBar>/)[1]

  • [tag:c++] (std::regex) Use [\s\S] or the JavaScript workarounds (demo): regex rex(R"(([\s\S]*)<FooBar>)");

  • [tag:vba] [tag:vbscript] - Use the same approach as in JavaScript, ([\s\S]*)<Foobar>. (NOTE: The MultiLine property of the RegExp object is sometimes erroneously thought to be the option to allow . match across line breaks, while, in fact, it only changes the ^ and $ behavior to match start/end of lines rather than strings, the same as in JavaScript regex) behavior.)

  • [tag:ruby] - Use the /m MULTILINE modifier (demo): s[/(.*)<Foobar>/m, 1]

  • [tag:r][tag:tre][tag:base-R] - Base R PCRE regexps - use (?s): regmatches(x, regexec("(?s)(.*)<FooBar>",x, perl=TRUE))[[1]][2] (demo)

  • [tag:r][tag:icu][tag:stringr][tag:stringi] - in stringr/stringi regex funtions that are powered with the ICU regex engine. Also use (?s): stringr::str_match(x, "(?s)(.*)<FooBar>")[,2] (demo)

  • [tag:go] - Use the inline modifier (?s) at the start (demo): re: = regexp.MustCompile(`(?s)(.*)<FooBar>`)

  • [tag:swift] - Use dotMatchesLineSeparators or (easier) pass the (?s) inline modifier to the pattern: let rx = "(?s)(.*)<Foobar>"

  • [tag:objective-c] - The same as Swift. (?s) works the easiest, but here is how the option can be used: NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionDotMatchesLineSeparators error:&regexError];

  • [tag:re2], [tag:google-apps-script] - Use the (?s) modifier (demo): "(?s)(.*)<Foobar>" (in Google Spreadsheets, =REGEXEXTRACT(A2,"(?s)(.*)<Foobar>"))

NOTES ON (?s):

In most non-POSIX engines, the (?s) inline modifier (or embedded flag option) can be used to enforce . to match line breaks.

If placed at the start of the pattern, (?s) changes the bahavior of all . in the pattern. If the (?s) is placed somewhere after the beginning, only those .s will be affected that are located to the right of it unless this is a pattern passed to Python's re. In Python re, regardless of the (?s) location, the whole pattern . is affected. The (?s) effect is stopped using (?-s). A modified group can be used to only affect a specified range of a regex pattern (e.g., Delim1(?s:.*?)\nDelim2.* will make the first .*? match across newlines and the second .* will only match the rest of the line).

POSIX note:

In non-POSIX regex engines, to match any character, [\s\S] / [\d\D] / [\w\W] constructs can be used.

In POSIX, [\s\S] is not matching any character (as in JavaScript or any non-POSIX engine), because regex escape sequences are not supported inside bracket expressions. [\s\S] is parsed as bracket expressions that match a single character, \ or s or S.

Solution 4 - Regex

If you're using Eclipse search, you can enable the "DOTALL" option to make '.' match any character including line delimiters: just add "(?s)" at the beginning of your search string. Example:

(?s).*<FooBar>

Solution 5 - Regex

In many regex dialects, /[\S\s]*<Foobar>/ will do just what you want. Source

Solution 6 - Regex

([\s\S]*)<FooBar>

The dot matches all except newlines (\r\n). So use \s\S, which will match ALL characters.

Solution 7 - Regex

In Ruby you can use the 'm' option (multiline):

/YOUR_REGEXP/m

See the Regexp documentation on ruby-doc.org for more information.

Solution 8 - Regex

We can also use

(.*?\n)*?

to match everything including newline without being greedy.

This will make the new line optional

(.*?|\n)*?

Solution 9 - Regex

"." normally doesn't match line-breaks. Most regex engines allows you to add the S-flag (also called DOTALL and SINGLELINE) to make "." also match newlines. If that fails, you could do something like [\S\s].

Solution 10 - Regex

For Eclipse, the following expression worked:

> Foo > > >jadajada Bar"

Regular expression:

Foo[\S\s]{1,10}.*Bar*

Solution 11 - Regex

Use:

/(.*)<FooBar>/s

The s causes dot (.) to match carriage returns.

Solution 12 - Regex

Note that (.|\n)* can be less efficient than (for example) [\s\S]* (if your language's regexes support such escapes) and than finding how to specify the modifier that makes . also match newlines. Or you can go with POSIXy alternatives like [[:space:][:^space:]]*.

Solution 13 - Regex

Use RegexOptions.Singleline. It changes the meaning of . to include newlines.

Regex.Replace(content, searchText, replaceText, RegexOptions.Singleline);

Solution 14 - Regex

In a Java-based regular expression, you can use [\s\S].

Solution 15 - Regex

Generally, . doesn't match newlines, so try ((.|\n)*)<foobar>.

Solution 16 - Regex

Solution:

Use pattern modifier sU will get the desired matching in PHP.

Example:

preg_match('/(.*)/sU', $content, $match);

Sources:

Solution 17 - Regex

In notepad++ you can use this

<table (.|\r\n)*</table>

It will match the entire table starting from

rows and columns

You can make it greedy, using the following, that way it will match the first, second and so forth tables and not all at once

<table (.|\r\n)*?</table>

Solution 18 - Regex

In the context of use within languages, regular expressions act on strings, not lines. So you should be able to use the regex normally, assuming that the input string has multiple lines.

In this case, the given regex will match the entire string, since "<FooBar>" is present. Depending on the specifics of the regex implementation, the $1 value (obtained from the "(.*)") will either be "fghij" or "abcde\nfghij". As others have said, some implementations allow you to control whether the "." will match the newline, giving you the choice.

Line-based regular expression use is usually for command line things like egrep.

Solution 19 - Regex

I had the same problem and solved it in probably not the best way but it works. I replaced all line breaks before I did my real match:

mystring = Regex.Replace(mystring, "\r\n", "")

I am manipulating HTML so line breaks don't really matter to me in this case.

I tried all of the suggestions above with no luck. I am using .NET 3.5 FYI.

Solution 20 - Regex

In JavaScript you can use [^]* to search for zero to infinite characters, including line breaks.

$("#find_and_replace").click(function() {
  var text = $("#textarea").val();
  search_term = new RegExp("[^]*<Foobar>", "gi");;
  replace_term = "Replacement term";
  var new_text = text.replace(search_term, replace_term);
  $("#textarea").val(new_text);
});

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<button id="find_and_replace">Find and replace</button>
<br>
<textarea ID="textarea">abcde
fghij&lt;Foobar&gt;</textarea>

Solution 21 - Regex

Try: .*\n*.*<FooBar> assuming you are also allowing blank newlines. As you are allowing any character including nothing before <FooBar>.

Solution 22 - Regex

I wanted to match a particular if block in Java:

   ...
   ...
   if(isTrue){
       doAction();

   }
...
...
}

If I use the regExp

if \(isTrue(.|\n)*}

it included the closing brace for the method block, so I used

if \(!isTrue([^}.]|\n)*}

to exclude the closing brace from the wildcard match.

Solution 23 - Regex

Often we have to modify a substring with a few keywords spread across lines preceding the substring. Consider an XML element:

<TASK>
  <UID>21</UID>
  <Name>Architectural design</Name>
  <PercentComplete>81</PercentComplete>
</TASK>

Suppose we want to modify the 81, to some other value, say 40. First identify .UID.21..UID., then skip all characters including \n till .PercentCompleted.. The regular expression pattern and the replace specification are:

String hw = new String("<TASK>\n  <UID>21</UID>\n  <Name>Architectural design</Name>\n  <PercentComplete>81</PercentComplete>\n</TASK>");
String pattern = new String ("(<UID>21</UID>)((.|\n)*?)(<PercentComplete>)(\\d+)(</PercentComplete>)");
String replaceSpec = new String ("$1$2$440$6");
// Note that the group (<PercentComplete>) is $4 and the group ((.|\n)*?) is $2.

String iw = hw.replaceFirst(pattern, replaceSpec);
System.out.println(iw);

<TASK>
  <UID>21</UID>
  <Name>Architectural design</Name>
  <PercentComplete>40</PercentComplete>
</TASK>

The subgroup (.|\n) is probably the missing group $3. If we make it non-capturing by (?:.|\n) then the $3 is (<PercentComplete>). So the pattern and replaceSpec can also be:

pattern = new String("(<UID>21</UID>)((?:.|\n)*?)(<PercentComplete>)(\\d+)(</PercentComplete>)");
replaceSpec = new String("$1$2$340$5")

and the replacement works correctly as before.

Solution 24 - Regex

Typically searching for three consecutive lines in PowerShell, it would look like:

$file = Get-Content file.txt -raw

$pattern = 'lineone\r\nlinetwo\r\nlinethree\r\n'     # "Windows" text
$pattern = 'lineone\nlinetwo\nlinethree\n'           # "Unix" text
$pattern = 'lineone\r?\nlinetwo\r?\nlinethree\r?\n'  # Both

$file -match $pattern

# output
True

Bizarrely, this would be Unix text at the prompt, but Windows text in a file:

$pattern = 'lineone
linetwo
linethree
'

Here's a way to print out the line endings:

'lineone
linetwo
linethree
' -replace "`r",'\r' -replace "`n",'\n'

# Output
lineone\nlinetwo\nlinethree\n

Solution 25 - Regex

###Option 1

One way would be to use the s flag (just like the accepted answer):

/(.*)<FooBar>/s

###Demo 1

###Option 2

A second way would be to use the m (multiline) flag and any of the following patterns:

/([\s\S]*)<FooBar>/m

or

/([\d\D]*)<FooBar>/m

or

/([\w\W]*)<FooBar>/m

###Demo 2

RegEx Circuit

jex.im visualizes regular expressions:

enter image description here

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionandyukView Question on Stackoverflow
Solution 1 - RegexlevikView Answer on Stackoverflow
Solution 2 - RegexPaige RutenView Answer on Stackoverflow
Solution 3 - RegexWiktor StribiżewView Answer on Stackoverflow
Solution 4 - RegexPaulo MersonView Answer on Stackoverflow
Solution 5 - RegexAbbas ShahzadehView Answer on Stackoverflow
Solution 6 - RegexsamwizeView Answer on Stackoverflow
Solution 7 - Regexuser621354View Answer on Stackoverflow
Solution 8 - RegexNambi_0915View Answer on Stackoverflow
Solution 9 - RegexMarkus JarderotView Answer on Stackoverflow
Solution 10 - RegexGordonView Answer on Stackoverflow
Solution 11 - RegexBillView Answer on Stackoverflow
Solution 12 - RegextyeView Answer on Stackoverflow
Solution 13 - RegexshmallView Answer on Stackoverflow
Solution 14 - RegexKamahireView Answer on Stackoverflow
Solution 15 - RegextloachView Answer on Stackoverflow
Solution 16 - RegexSian Lerk LauView Answer on Stackoverflow
Solution 17 - RegexHammad KhanView Answer on Stackoverflow
Solution 18 - RegexnsayerView Answer on Stackoverflow
Solution 19 - RegexSleeView Answer on Stackoverflow
Solution 20 - RegexPaul Chris JonesView Answer on Stackoverflow
Solution 21 - Regexhafiz031View Answer on Stackoverflow
Solution 22 - RegexSpangenView Answer on Stackoverflow
Solution 23 - Regexuser1348737View Answer on Stackoverflow
Solution 24 - Regexjs2010View Answer on Stackoverflow
Solution 25 - RegexEmmaView Answer on Stackoverflow