Regex, how to match multiple lines?

Ruby, Regex, Rubular

Ruby Problem Overview


I'm trying to match the From line all the way to the end of the Subject line in the following:

....
From: XXXXXX 
Date: Tue, 8 Mar 2011 10:52:42 -0800 
To: XXXXXXX
Subject: XXXXXXX
....

So far I have:

/From:.*Date:.*To:.*Subject/m

But that doesn't match to the end of the subject line. I tried adding $ but that had no effect.

Ruby Solutions


Solution 1 - Ruby

You can use the /m modifier to enable multiline mode (i.e. to allow . to match newlines), and you can append ? to a quantifier (as in .*?) to perform non-greedy matching:

message = <<-MSG
Random Line 1
Random Line 2
From: [email protected]
Date: 01-01-2011
To: friend@example.com
Subject: This is the subject line
Random Line 3
Random Line 4
MSG

message.match(/(From:.*Subject.*?)\n/m)[1]
=> "From: [email protected]\nDate: 01-01-2011\nTo: friend@example.com\nSubject: This is the subject line"

See http://ruby-doc.org/core/Regexp.html and search for "multiline mode" and "greedy by default".
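
To see what the ? changes, here is a minimal sketch that runs the greedy and non-greedy forms side by side. The sample message, its addresses, and the variable names greedy and lazy are illustrative assumptions, not part of the answer.

```ruby
# Sample message (illustrative addresses, not the original data).
message = <<~MSG
  Random Line 1
  From: sender@example.com
  Date: 01-01-2011
  To: friend@example.com
  Subject: This is the subject line
  Random Line 3
  Random Line 4
MSG

# Greedy: with /m, the final .* runs to the end of the string and then
# backtracks only as far as the last newline.
greedy = message[/From:.*Subject.*\n/m]

# Non-greedy: .*? stops at the first newline after "Subject".
lazy = message[/From:.*Subject.*?\n/m]

puts greedy.lines.count  # number of lines the greedy pattern swallowed
puts lazy.lines.count    # number of lines the non-greedy pattern matched
```

The greedy form runs on through the trailing "Random Line" entries, while the non-greedy form ends with the Subject line.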

Solution 2 - Ruby

If you are using Ruby, you can try:

Regexp.new("some reg", Regexp::MULTILINE)
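
As a rough sketch of how that constructor might be applied to the question's header, assuming a pattern string and sample text made up for illustration: Regexp::MULTILINE is the same option that the /m suffix sets on a regex literal.

```ruby
# Build the pattern from a string. The backslash is doubled inside the
# double-quoted string so that the regex engine sees [^\n].
pattern = Regexp.new("From:.*Subject[^\\n]*", Regexp::MULTILINE)

# Illustrative input, not the asker's real data.
text = "Intro\nFrom: a\nDate: b\nTo: c\nSubject: d\nBody\n"
header = text[pattern]
puts header

# A literal written with /m carries the same option bit.
same_flag = (/x/m.options & Regexp::MULTILINE) == Regexp::MULTILINE
puts same_flag
```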

If you are not using Ruby, I suggest working around the problem:

  1. replace all the "\n" with SOME_SPECIAL_TOKEN
  2. search the regexp, and do other operations...
  3. restore: replace SOME_SPECIAL_TOKEN with "\n"
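
The three steps above can be sketched as follows. It is written in Ruby only to stay consistent with the rest of the page; the token "<<NL>>" and the sample text are hypothetical, and the token must be something that cannot occur in the real input.

```ruby
token = "<<NL>>"  # hypothetical placeholder, assumed absent from the text
text = "Intro\nFrom: a\nDate: b\nTo: c\nSubject: d\nBody\n"

# 1. Replace every newline with the token.
flat = text.gsub("\n", token)

# 2. Search the now single-line string. No /m is needed because no
#    newlines remain; the lookahead stops at the token that ended the
#    Subject line.
found = flat[/From:.*Subject.*?(?=#{Regexp.escape(token)})/]

# 3. Restore the newlines in the result.
header = found.gsub(token, "\n")
puts header
```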

Solution 3 - Ruby

If you want to match across linebreaks, one possibility is to first replace all newline characters with some other character (or character sequence) that wouldn't otherwise appear in the text. For example, if you have all of the text in one string variable you can do something like aString.split("\n").join("|") to replace all newlines in the string with pipe characters.
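
One detail worth checking with this approach: in Ruby, split without a limit drops trailing empty fields, so a final newline is lost on the round trip. The sketch below, using a made-up two-line string, compares it with String#tr, which swaps each newline in place.

```ruby
a_string = "From: a\nSubject: d\n"  # illustrative input

# split discards the empty field after the last newline.
joined = a_string.split("\n").join("|")

# tr replaces every newline, including the trailing one.
swapped = a_string.tr("\n", "|")

puts joined
puts swapped

# Either form can then be searched without the /m modifier.
header = swapped[/From:.*Subject[^|]*/]
puts header
```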

Also, look at Alan Moore's answer to your previous question regarding how to match the newline character in a regular expression.

Solution 4 - Ruby


Try:

/...^Subject:[^\n]*/m
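
A small sketch of what the tail of that pattern does. In Ruby, ^ matches at the start of every line, and [^\n]* consumes the rest of that line without crossing the newline. The answer leaves the leading part elided, so the second pattern below is only an assumption that reuses the prefix from the question; the sample text is made up.

```ruby
text = "Random\nFrom: a\nDate: b\nTo: c\nSubject: hello world\nBody text\n"

# The tail on its own: grab the whole Subject line.
subject_line = text[/^Subject:[^\n]*/m]
puts subject_line

# Assumption: the elided prefix is the asker's From:.*Date:.*To:.* pattern.
header = text[/From:.*Date:.*To:.*^Subject:[^\n]*/m]
puts header
```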

Solution 5 - Ruby

Using the following data:

From: XXXXXX
Date: Tue, 8 Mar 2011 10:52:42 -0800
To: XXXXXXX
Subject: XXXXXXX

The following regex will do the magic:

From:([^\r\n]+)[\r\n]+Date:([^\r\n]+)[\r\n]+To:([^\r\n]+)[\r\n]+Subject:([^\r\n]+)[\r\n]+
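
A minimal sketch of applying this pattern and reading its four capture groups. The variable names are illustrative, and the sample string assumes CRLF line endings to show why the pattern uses [\r\n]+ rather than a bare \n.

```ruby
header_re = /From:([^\r\n]+)[\r\n]+Date:([^\r\n]+)[\r\n]+To:([^\r\n]+)[\r\n]+Subject:([^\r\n]+)[\r\n]+/

# Same fields as the data above, with assumed CRLF line endings.
data = "From: XXXXXX\r\nDate: Tue, 8 Mar 2011 10:52:42 -0800\r\nTo: XXXXXXX\r\nSubject: XXXXXXX\r\n"

m = header_re.match(data)
# Each group still has the space that followed the colon, so strip it.
from, date, to, subject = m.captures.map(&:strip)

puts from
puts date
puts to
puts subject
```

Note that the pattern requires at least one line break after the Subject line, so it will not match if Subject is the last line and has no trailing newline.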

But I would recommend that you don't try to do this in one regex. Instead, apply a regex such as "^(\w+):(.+)$" line by line, unless you are sure that the order of From/Date/To/Subject is not going to change ;)
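
The line-by-line approach could look roughly like this. The headers hash and the loop are an illustrative sketch, not code from the answer.

```ruby
data = <<~DATA
  From: XXXXXX
  Date: Tue, 8 Mar 2011 10:52:42 -0800
  To: XXXXXXX
  Subject: XXXXXXX
DATA

headers = {}
data.each_line do |line|
  # Group 1 is the field name, group 2 is everything after the first colon.
  m = line.match(/^(\w+):(.+)$/)
  next unless m  # skip lines that are not "Name: value"
  headers[m[1]] = m[2].strip
end

puts headers["From"]
puts headers["Subject"]
```

Because each line is matched independently, the fields can arrive in any order, and extra header lines do not break the match.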

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | AnApprentice | View Question on Stackoverflow
Solution 1 - Ruby | Pan Thomakos | View Answer on Stackoverflow
Solution 2 - Ruby | Siwei | View Answer on Stackoverflow
Solution 3 - Ruby | bta | View Answer on Stackoverflow
Solution 4 - Ruby | DigitalRoss | View Answer on Stackoverflow
Solution 5 - Ruby | chkdsk | View Answer on Stackoverflow