How to select lines between two marker patterns which may occur multiple times with awk/sed

Shell Problem Overview


Using awk or sed, how can I select the lines that occur between two different marker patterns? There may be multiple sections delimited by these patterns.

For example, suppose the file contains:

abc
def1
ghi1
jkl1
mno
abc
def2
ghi2
jkl2
mno
pqr
stu

The starting pattern is abc and the ending pattern is mno, so I need this output:

def1
ghi1
jkl1
def2
ghi2
jkl2

I am using sed to match the pattern once:

sed -e '1,/abc/d' -e '/mno/,$d' <FILE>

Is there any way in sed or awk to do it repeatedly until the end of file?

Shell Solutions


Solution 1 - Shell

Use awk with a flag to trigger the print when necessary:

$ awk '/abc/{flag=1;next}/mno/{flag=0}flag' file
def1
ghi1
jkl1
def2
ghi2
jkl2

How does this work?

  • /abc/ and /mno/ each match lines containing that text.
  • /abc/{flag=1;next} sets the flag when the text abc is found, then skips to the next line so that the marker itself is not printed.
  • /mno/{flag=0} clears the flag when the text mno is found.
  • The final flag is a pattern with the default action, which is to print $0: the line is printed whenever flag is 1.
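
The flag logic above can be wrapped in a small shell function so that the two markers become parameters. This is only a minimal sketch and not part of the original answer: the function name between and the awk variables first and last are hypothetical, and both markers are treated as regular expressions.

```shell
# Hypothetical helper: print the lines strictly between two marker patterns.
# Usage: between START_REGEX END_REGEX [FILE...]
between() {
    first=$1
    last=$2
    shift 2
    awk -v first="$first" -v last="$last" '
        $0 ~ first { flag = 1; next }   # opening marker: raise the flag, skip the marker line
        $0 ~ last  { flag = 0 }         # closing marker: lower the flag
        flag                            # default action: print while the flag is raised
    ' "$@"
}

# The sample from the question, fed on standard input
printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu | between abc mno
```

With no file argument the function reads standard input, because awk falls back to stdin when it is given no file operands.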

For a more detailed description and examples, together with cases when the patterns are either shown or not, see How to select lines between two patterns?.

Solution 2 - Shell

Using sed:

sed -n -e '/^abc$/,/^mno$/{ /^abc$/d; /^mno$/d; p; }'

The -n option means do not print by default.

The address range selects everything from a line that is exactly abc through a line that is exactly mno, and then executes the commands in the { ... }. The first command deletes the abc line; the second deletes the mno line; and the p prints the remaining lines. You can relax the regexes as required. Any lines outside the range of abc..mno are simply not printed.
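
To see what relaxing the regexes changes, the following sketch compares the anchored form with an unanchored variant. The function names and the tricky input are made up for illustration; only the anchored command comes from the answer.

```shell
# Anchored markers, as in the answer: only lines that are exactly abc or mno count.
strict_between() { sed -n -e '/^abc$/,/^mno$/{ /^abc$/d; /^mno$/d; p; }'; }

# Relaxed markers (illustrative variant): any line containing abc or mno counts.
loose_between() { sed -n -e '/abc/,/mno/{ /abc/d; /mno/d; p; }'; }

# Hypothetical input in which a data line happens to contain the text abc
tricky() { printf '%s\n' abc 'def abc' ghi mno; }

tricky | strict_between   # keeps both data lines: "def abc" and "ghi"
tricky | loose_between    # keeps only "ghi", since "def abc" also matches /abc/
```

The anchored version is the safer default when the markers sit on lines of their own.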

Solution 3 - Shell

This might work for you (GNU sed):

sed '/^abc$/,/^mno$/{//!b};d' file

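This deletes every line except those between a line that is exactly abc and a line that is exactly mno. Inside the range, the empty regex // reuses the last regex that was tested, so //!b branches to the end of the script (which prints the line) for everything except the two marker lines; the markers and all lines outside the range fall through to d.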
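
The one-liner above is labelled GNU sed, probably because it writes } and ;d directly after the b command; other sed implementations may read that text as part of a branch label. A sketch of the same script split into separate -e fragments, which avoids the problem, could look like this (the function name between_branch is hypothetical):

```shell
# Same logic as the one-liner, one command per -e fragment:
#   1. open a block for the abc..mno range
#   2. inside it, branch to the end (and print) unless the line is a marker
#   3. close the block
#   4. delete everything that did not branch: markers and out-of-range lines
between_branch() {
    sed -e '/^abc$/,/^mno$/{' \
        -e '//!b' \
        -e '}' \
        -e 'd' "$@"
}

printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu | between_branch
```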

Solution 4 - Shell

sed '/^abc$/,/^mno$/!d;//d' file

This golfs two characters off potong's {//!b};d.

The empty forward slashes // mean "reuse the last regular expression used", so the command does the same as the more understandable:

sed '/^abc$/,/^mno$/!d;/^abc$/d;/^mno$/d' file

This seems to be POSIX:

> If an RE is empty (that is, no pattern is specified) sed shall behave as if the last RE used in the last command applied (either as an address or as part of a substitute command) was specified.
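
One way to check that the short and the long spelling behave the same is to run both on the sample from the question and compare the results. This is just a sanity-check sketch; the function names sample, short_form and long_form are hypothetical.

```shell
# Sample data from the question
sample() { printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu; }

# Golfed form: // reuses whichever marker regex was tested last
short_form() { sed '/^abc$/,/^mno$/!d;//d'; }

# Spelled-out form: each marker is deleted explicitly
long_form() { sed '/^abc$/,/^mno$/!d;/^abc$/d;/^mno$/d'; }

if [ "$(sample | short_form)" = "$(sample | long_form)" ]; then
    echo "same output"
fi
```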

Solution 5 - Shell

From the previous response's links, the one that did it for me, running ksh on Solaris, was this:

sed '1,/firstmatch/d;/secondmatch/,$d'
  • 1,/firstmatch/d: from line 1 until the first time you find firstmatch, delete.
  • /secondmatch/,$d: from the first occurrence of secondmatch until the end of file, delete.
  • Semicolon separates the two commands, which are executed in sequence.

Note that this keeps only a single section, like the command in the question; it does not repeat for later marker pairs.
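
The following sketch shows what this command keeps when the markers occur more than once. The function name first_section and the extra xyz header line are illustrative assumptions. One detail is easy to miss: in a range such as 1,/abc/ the ending regex is only tested from the line after the one that opened the range, so if the first marker sits on line 1 the deletion runs on to the next abc.

```shell
# The command from this answer, with the question's markers substituted in
first_section() { sed '1,/abc/d;/mno/,$d'; }

# First marker NOT on line 1: only the first section survives
printf '%s\n' xyz abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu | first_section
# prints def1, ghi1, jkl1

# First marker on line 1 (the question's sample): the range 1,/abc/ extends
# to the second abc, so the second section is what remains
printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu | first_section
# prints def2, ghi2, jkl2
```

Either way only one section is printed, which is why the range-based solutions above are needed for the repeated case.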

Solution 6 - Shell

Something like this works for me:

file.awk:

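BEGIN {
    record = 0    # 1 while inside an abc..mno section
}

# Opening marker: start collecting
/^abc$/ {
    record = 1
}

# Closing marker: stop collecting, print what was gathered, and reset
/^mno$/ {
    record = 0
    print "s=" s
    s = ""
}

# Any other line: append it to the buffer while inside a section
!/^(abc|mno)$/ {
    if (record == 1) {
        s = s "\n" $0
    }
}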

using: awk -f file.awk data...

Edit: O_o fedorqui's solution is way better/prettier than mine.

Solution 7 - Shell

Don_crissti's answer from Show only text between 2 matching pattern?

firstmatch="abc"
secondmatch="cdf"
sed "/$firstmatch/,/$secondmatch/!d;//d" infile

According to the linked answer, this is much more efficient than the awk approach.
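
Because the shell variables are pasted straight into the sed script, a marker that contains a / would end the address early. A possible workaround, sketched below, is to use a custom address delimiter. The function name between_vars and the choice of # as delimiter are assumptions for illustration, and the sketch in turn breaks if a marker contains #.

```shell
# Hypothetical wrapper: markers are passed as arguments and may contain slashes.
# \#...# is an address written with # as the regex delimiter instead of /.
between_vars() {
    firstmatch=$1
    secondmatch=$2
    shift 2
    sed "\#$firstmatch#,\#$secondmatch#!d;//d" "$@"
}

# Illustrative input whose markers contain a slash
printf '%s\n' x /begin a b /end y | between_vars /begin /end
```

The markers are still interpreted as regular expressions, so characters such as . or * keep their special meaning.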

Solution 8 - Shell

perl -lne 'print if((/abc/../mno/) && !(/abc/||/mno/))' your_file

Solution 9 - Shell

I wanted to use awk to print the lines between two patterns where pattern2 also matches the pattern1 line, and the pattern1 line itself should be printed too.

For example, given this source:

package AAA
aaa
bbb
ccc
package BBB
ddd
eee
package CCC
fff
ggg
hhh
iii
package DDD
jjj

the output should be:

package BBB
ddd
eee

Where pattern1 is package BBB, pattern2 is package \w*. Note that CCC isn't a known value so can't be literally matched.

In this case, neither @scai's awk '/abc/{a=1}/mno/{print;a=0}a' file nor @fedorqui's awk '/abc/{a=1} a; /mno/{a=0}' file works for me.

Finally, I solved it with awk '/package BBB/{flag=1;print;next}/package \w*/{flag=0}flag' file.

With a little more effort, awk '/package BBB/{flag=1;print;next}flag;/package \w*/{flag=0}' file also prints the pattern2 line, that is:

package BBB
ddd
eee
package CCC

Solution 10 - Shell

This can also be done with logical operations and increment/decrement operations on a flag:

awk '/mno/&&--f||f||/abc/&&f++' file
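
Since && binds tighter than ||, the one-liner is evaluated as (/mno/ && --f) || f || (/abc/ && f++). The sketch below spells that out with explicit if statements; it is an unpacked reading of the expression rather than the author's own code, and the function name between_counter is hypothetical.

```shell
# Expanded equivalent of: awk '/mno/&&--f||f||/abc/&&f++'
between_counter() {
    awk '
    {
        show = 0
        if (/mno/ && --f) show = 1   # closing marker: decrement f; a result of 0 prints nothing
        else if (f)       show = 1   # f is nonzero: we are inside a section, so print
        else if (/abc/)   f++        # opening marker seen outside a section: enter it silently
        if (show) print
    }' "$@"
}

printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu | between_counter
```

One consequence of the decrement is worth knowing: a closing marker that appears before any opening marker drives f to -1, which counts as true, so that line and the lines after it would be printed.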

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | dvai | View Question on Stackoverflow
Solution 1 - Shell | fedorqui | View Answer on Stackoverflow
Solution 2 - Shell | Jonathan Leffler | View Answer on Stackoverflow
Solution 3 - Shell | potong | View Answer on Stackoverflow
Solution 4 - Shell | Ciro Santilli Путлер Капут 六四事 | View Answer on Stackoverflow
Solution 5 - Shell | FanDeLaU | View Answer on Stackoverflow
Solution 6 - Shell | pataluc | View Answer on Stackoverflow
Solution 7 - Shell | Léo Léopold Hertz 준영 | View Answer on Stackoverflow
Solution 8 - Shell | Vijay | View Answer on Stackoverflow
Solution 9 - Shell | Weekend | View Answer on Stackoverflow
Solution 10 - Shell | blhsing | View Answer on Stackoverflow