How to grep for contents after pattern?


Linux Problem Overview


Given a file, for example:

potato: 1234
apple: 5678
potato: 5432
grape: 4567
banana: 5432
sushi: 56789

I'd like to grep for all lines that start with potato: but output only the numbers that follow potato:. So in the above example, the output would be:

1234
5432

How can I do that?

Linux Solutions


Solution 1 - Linux

grep 'potato:' file.txt | sed 's/^.*: //'

grep selects every line that contains the string potato:. For each of those lines, sed replaces (s/// - substitute) everything (.*) from the beginning of the line (^) up to the last occurrence of the sequence ": " (colon followed by a space) with the empty string (s/...// - substitute the first part with the second part, which is empty).

or

grep 'potato:' file.txt | cut -d\   -f2

For each line that contains potato:, cut splits the line into fields delimited by a single space (-d\ - d = delimiter, \ = escaped space character; -d" " would also work) and prints the second field of each such line (-f2).

or

grep 'potato:' file.txt | awk '{print $2}'

For each line that contains potato:, awk prints the second field (print $2). By default, awk splits fields on runs of whitespace (spaces and tabs).

or

grep 'potato:' file.txt | perl -e 'for(<>){s/^.*: //;print}'

All lines that contain potato: are sent to an inline (-e) Perl script that takes all lines from stdin, then, for each of these lines, does the same substitution as in the first example above, then prints it.

or

awk '{if(/potato:/) print $2}' < file.txt

The file is sent via stdin (< file.txt sends the contents of the file via stdin to the command on the left) to an awk script that, for each line that contains potato: (if(/potato:/) returns true if the regular expression /potato:/ matches the current line), prints the second field, as described above.

or

perl -e 'for(<>){/potato:/ && s/^.*: // && print}' < file.txt

The file is sent via stdin (< file.txt, see above) to a Perl script that works similarly to the one above, but this time it also makes sure each line contains the string potato: (/potato:/ is a regular expression that matches if the current line contains potato:, and, if it does (&&), then proceeds to apply the regular expression described above and prints the result).
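One thing worth noting about the variants above: they match potato: anywhere on the line, so a key such as "sweet potato" would be picked up too. Here is a minimal sketch of a stricter version, assuming the key and value are always separated by a colon and a single space. The temporary file and the "sweet potato" line are made up for illustration and are not part of the question.

```shell
# Hypothetical sample data; the "sweet potato" line is added to show the pitfall.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
potato: 1234
apple: 5678
sweet potato: 99
potato: 5432
EOF

# Unanchored: grep also selects "sweet potato: 99".
loose=$(grep 'potato:' "$tmp" | sed 's/^.*: //')

# Stricter: split each line on ": " and require the key to be exactly "potato".
strict=$(awk -F': ' '$1 == "potato" { print $2 }' "$tmp")

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
rm -f "$tmp"
```

The loose pipeline prints 1234, 99 and 5432, while the awk version prints only 1234 and 5432. If your keys can never contain one another, the simpler forms are fine.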

Solution 2 - Linux

Or use a regex lookbehind assertion (-P needs a grep with Perl-compatible regex support, such as GNU grep):

grep -oP '(?<=potato: ).*' file.txt

Solution 3 - Linux

grep -Po 'potato:\s\K.*' file

-P to use Perl-compatible regular expressions

-o to output only the matched part of each line

\s to match the space after potato:

\K to discard everything matched so far (potato: and the space) from the reported match

.* to match the rest of the line
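A practical reason to reach for \K is that the part before it can have variable length, for example \s* to allow any amount of whitespace after the colon. A lookbehind usually cannot do that, because PCRE generally expects a lookbehind to have a fixed or bounded length. The sketch below assumes a grep that supports -P (GNU grep built with PCRE); the sample lines with extra spaces are invented for the demonstration.

```shell
# Hypothetical sample with a varying amount of whitespace after the colon.
tmp=$(mktemp)
printf 'potato: 1234\napple: 5678\npotato:   5432\n' > "$tmp"

# "^potato:" anchors the key at the start of the line; \s* eats the spaces;
# \K then drops everything matched so far, so -o prints only the value.
values=$(grep -Po '^potato:\s*\K.*' "$tmp")

printf '%s\n' "$values"
rm -f "$tmp"
```

On the sample above this prints 1234 and 5432, regardless of how many spaces follow the colon.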

Solution 4 - Linux

sed -n 's/^potato:[[:space:]]*//p' file.txt

One can think of Grep as a restricted Sed, or of Sed as a generalized Grep. In this case, Sed is one good, lightweight tool that does what you want -- though, of course, there exist several other reasonable ways to do it, too.
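To see why this one command both filters and strips, here is a small sketch run against made-up input (a space, a tab, no separator at all, and a key that merely ends in potato). The file name and its contents are assumptions for illustration only.

```shell
tmp=$(mktemp)
# Hypothetical input: one space, a tab, no separator, and a non-matching key.
printf 'potato: 1234\npotato:\t5432\npotato:77\nsweet potato: 99\n' > "$tmp"

# -n suppresses sed's default output. The trailing p prints a line only when
# the substitution succeeded, so non-matching lines are silently dropped.
values=$(sed -n 's/^potato:[[:space:]]*//p' "$tmp")

printf '%s\n' "$values"
rm -f "$tmp"
```

This prints 1234, 5432 and 77. The "sweet potato" line is skipped because the pattern is anchored with ^.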

Solution 5 - Linux

This will print everything after each match, on that same line only:

perl -lne 'print $1 if /^potato:\s*(.*)/' file.txt

This will do the same, except it will also print all subsequent lines:

perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' file.txt

These command-line options are used:

  • -n loop around each line of the input file

  • -l removes newlines before processing, and adds them back in afterwards

  • -e execute the perl code
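The difference between the two one-liners is easiest to see on a file where the first match is not on the first line. The following is a sketch that assumes perl is installed; the sample file is made up for the demonstration.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
apple: 5678
potato: 1234
grape: 4567
potato: 5432
EOF

# Prints only the text after "potato:" on each matching line.
same_line=$(perl -lne 'print $1 if /^potato:\s*(.*)/' "$tmp")

# Prints the text after the first match, then every later line unchanged.
and_rest=$(perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' "$tmp")

printf 'same line only:\n%s\n' "$same_line"
printf 'match plus following lines:\n%s\n' "$and_rest"
rm -f "$tmp"
```

The first command prints 1234 and 5432. The second prints 1234, then "grape: 4567" and "potato: 5432" verbatim, because once $found is set the remaining lines are echoed without any stripping.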

Solution 6 - Linux

Modern Bash has support for regular expressions:

while read -r line; do
  if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
    echo "${BASH_REMATCH[1]}"
  fi
done < file.txt

Solution 7 - Linux

You can use grep, as the other answers state. But you don't need grep, awk, sed, perl, cut, or any external tool. You can do it with pure bash.

Try this (semicolons are there to allow you to put it all on one line):

$ while read line;
  do
    if [[ "${line%%:\ *}" == "potato" ]];
    then
      echo ${line##*:\ };
    fi;
  done< file.txt

## tells bash to delete the longest match of "*: " (anything up to and including ": ") from the front of $line.

$ while read line; do echo ${line##*:\ }; done< file.txt
1234
5678
5432
4567
5432
56789

Or, if you want the key rather than the value, %% tells bash to delete the longest match of ": *" (": " and everything after it) from the end of $line.

$ while read line; do echo ${line%%:\ *}; done< file.txt
potato
apple
potato
grape
banana
sushi

The substring to split on is written ":\ " here, with the space escaped by a backslash. Inside ${...} the escape is harmless, and an unescaped space works as well.

You can find more like these at the Linux Documentation Project.

Solution 8 - Linux

grep potato file | grep -o "[0-9].*"
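Be aware that the second grep starts its match at the first digit on the line, wherever that is. A small sketch of the effect, assuming a made-up key that itself contains a digit, plus one possible way to keep only the trailing number:

```shell
tmp=$(mktemp)
# Hypothetical input: the second key contains a digit.
printf 'potato: 1234\npotato2: 5432\n' > "$tmp"

# As written above: output begins at the first digit found on the line.
first_digit=$(grep potato "$tmp" | grep -o '[0-9].*')

# Anchoring to the end of the line keeps only the trailing run of digits.
trailing=$(grep potato "$tmp" | grep -o '[0-9][0-9]*$')

printf 'first digit onward:\n%s\n' "$first_digit"
printf 'trailing number:\n%s\n' "$trailing"
rm -f "$tmp"
```

The first pipeline prints 1234 and "2: 5432"; the anchored one prints 1234 and 5432.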

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Lexicon | View Question on Stack Overflow
Solution 1 - Linux | rid | View Answer on Stack Overflow
Solution 2 - Linux | mohit6up | View Answer on Stack Overflow
Solution 3 - Linux | tuxutku | View Answer on Stack Overflow
Solution 4 - Linux | thb | View Answer on Stack Overflow
Solution 5 - Linux | Chris Koknat | View Answer on Stack Overflow
Solution 6 - Linux | ceving | View Answer on Stack Overflow
Solution 7 - Linux | mightypile | View Answer on Stack Overflow
Solution 8 - Linux | Wawrzyniec Pruski | View Answer on Stack Overflow