Count all occurrences of a string in lots of files with grep

Grep Problem Overview


I have a bunch of log files. I need to find out how many times a string occurs in all files.

grep -c string *

returns

...
file1:1
file2:0
file3:0
...

Using a pipe I was able to get only files that have one or more occurrences:

grep -c string * | grep -v :0

...
file4:5
file5:1
file6:2
...

How can I get only the combined count? (If it returns file4:5, file5:1, file6:2, I want to get back 8.)

Grep Solutions


Solution 1 - Grep

This works for multiple occurrences per line:

grep -o string * | wc -l
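As a quick check of the command above, here is a minimal sketch run against a throwaway fixture. The temporary directory, the file names a.log and b.log, and the search word error are made up for illustration and are not from the question.

```shell
# Hypothetical fixture: two small log files in a temporary directory.
dir=$(mktemp -d)
printf 'error here\nerror and error again\n' > "$dir/a.log"
printf 'no match\nerror once\n' > "$dir/b.log"

# -o prints each match on its own line; wc -l then counts those lines.
total=$(grep -o error "$dir"/*.log | wc -l)
echo "$total"

rm -rf "$dir"
```

For this fixture the count is 4: one match on the first line of a.log, two on its second line, and one in b.log.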

Solution 2 - Grep

cat * | grep -c string

Solution 3 - Grep

grep -oh string * | wc -w

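This counts multiple occurrences on a line. Because wc -w counts words, it assumes the search string contains no whitespace; otherwise wc -l is the safer choice.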

Solution 4 - Grep

Instead of using -c, just pipe it to wc -l.

grep string * | wc -l

This prints each matching line once and then counts the number of lines.

This will miss instances where the string occurs 2+ times on one line, though.
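The difference between counting lines and counting matches can be seen in a small sketch. The file net.log and its contents are hypothetical and only serve to show the two numbers side by side.

```shell
# Hypothetical fixture: one file where a line holds several matches.
dir=$(mktemp -d)
printf 'tcp tcp tcp\nudp\ntcp\n' > "$dir/net.log"

# Counts matching lines: the first and third line match.
lines=$(grep tcp "$dir/net.log" | wc -l)

# Counts individual matches: three on the first line, one on the third.
matches=$(grep -o tcp "$dir/net.log" | wc -l)

echo "lines=$lines matches=$matches"

rm -rf "$dir"
```

Here the line count is 2 while the match count is 4, so which pipeline is right depends on whether repeated hits on one line should count separately.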

Solution 5 - Grep

cat * | grep -c string

One of the rare useful applications of cat.

Solution 6 - Grep

You can add -R to search recursively (which avoids the need for cat) and -I to ignore binary files. Note that this prints one count per file rather than a combined total.

grep -RIc string .
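If a single combined number is wanted from a recursive search, one possible sketch is to swap -c for -o and -h and let wc -l do the counting. This assumes a grep that supports -R, -I, -o and -h (GNU grep does); the directory layout below is invented for the example.

```shell
# Hypothetical fixture: a file at the top level and one in a subdirectory.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'string string\n' > "$dir/top.txt"
printf 'string\nother\n' > "$dir/sub/nested.txt"

# -R recurses, -I skips binary files, -o emits one line per match,
# -h drops the file name prefix so only the matches are printed.
total=$(grep -RIoh string "$dir" | wc -l)
echo "$total"

rm -rf "$dir"
```

For this fixture the total is 3: two matches in top.txt and one in sub/nested.txt.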

Solution 7 - Grep

Something different than all the previous answers:

perl -lne '$count++ for m/<pattern>/g;END{print $count}' *

Solution 8 - Grep

Obligatory AWK solution:

grep -c string * | awk 'BEGIN{FS=":"}{x+=$2}END{print x}'

Take care if your file names include ":", though: awk would then read the count from the wrong field.
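One way around the colon problem, sketched here under the assumption that your grep supports -h, is to suppress the file names entirely so that awk only sees the counts. The file name app:1.log is a made-up example of a name containing a colon.

```shell
# Hypothetical fixture: one of the file names contains a colon.
dir=$(mktemp -d)
printf 'string\nstring\n' > "$dir/app:1.log"
printf 'string\n' > "$dir/app.log"

# -c prints a count of matching lines per file; -h omits the "file:" prefix.
# "sum + 0" makes awk print 0 instead of an empty line when nothing was read.
total=$(grep -ch string "$dir"/* | awk '{ sum += $1 } END { print sum + 0 }')
echo "$total"

rm -rf "$dir"
```

For this fixture the total is 3. Like any -c based approach, it counts matching lines, not repeated matches within a line.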

Solution 9 - Grep

If you want number of occurrences per file (example for string "tcp"):

grep -RIci "tcp" . | awk -v FS=":" -v OFS="\t" '$2>0 { print $2, $1 }' | sort -hr

Example output:

53	./HTTPClient/src/HTTPClient.cpp
21	./WiFi/src/WiFiSTA.cpp
19	./WiFi/src/ETH.cpp
13	./WiFi/src/WiFiAP.cpp
4	./WiFi/src/WiFiClient.cpp
4	./HTTPClient/src/HTTPClient.h
3	./WiFi/src/WiFiGeneric.cpp
2	./WiFi/examples/WiFiClientBasic/WiFiClientBasic.ino
2	./WiFiClientSecure/src/ssl_client.cpp
1	./WiFi/src/WiFiServer.cpp

Explanation:

  • grep -RIci NEEDLE . - looks for string NEEDLE recursively from current directory (following symlinks), ignoring binaries, counting number of occurrences, ignoring case
  • awk ... - this command ignores files with zero occurrences and formats lines
  • sort -hr - sorts lines in reverse order by numbers in first column

Of course, it works with other grep commands with option -c (count) as well. For example:

grep -c "tcp" *.txt | awk -v FS=":" -v OFS="\t" '$2>0 { print $2, $1 }' | sort -hr

Solution 10 - Grep

The AWK solution which also handles file names including colons:

grep -c string * | sed -r 's/^.*://' | awk 'BEGIN{}{x+=$1}END{print x}'

Keep in mind that this method still does not find multiple occurrences of string on the same line.

Solution 11 - Grep

You can use a simple grep to capture the number of occurrences effectively. I will use the -i option to make sure STRING/StrING/string get captured properly.

Command that prints the file names along with their counts:

grep -oci string * | grep -v :0

Command that omits the file names and prints 0 for each file without occurrences (with -c, grep counts matching lines, so -o does not change the numbers):

grep -ochi string *
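To see what -c reports when combined with -o, here is a small sketch comparing it with the wc -l pipeline. The file one.log is hypothetical, and the behavior shown is what I would expect from GNU grep.

```shell
# Hypothetical fixture: one line with three case variants of the word.
dir=$(mktemp -d)
printf 'String string STRING\nnothing\n' > "$dir/one.log"

# With -c, grep reports the number of matching lines, even with -o.
with_c=$(grep -oci string "$dir/one.log")

# Without -c, -o prints one line per match, which wc -l then counts.
with_wc=$(grep -oi string "$dir/one.log" | wc -l)

echo "with -c: $with_c, with wc -l: $with_wc"

rm -rf "$dir"
```

For this fixture -c gives 1 and the wc -l pipeline gives 3, so the -c form is only equivalent when each line holds at most one match.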

Solution 12 - Grep

A short recursive variant:

find . -type f -exec cat {} + | grep -c 'string'
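The cat-based pipeline above counts matching lines. If every match should count, a possible variation is to let find run grep -oh directly and count the output lines. This is only a sketch with an invented directory tree, and it assumes a grep that supports -o and -h.

```shell
# Hypothetical fixture: files at two different depths.
dir=$(mktemp -d)
mkdir -p "$dir/a/b"
printf 'string string\n' > "$dir/a/one.txt"
printf 'x string\nstring\n' > "$dir/a/b/two.txt"

# find passes the files to grep in batches; -o prints one line per match
# and -h keeps file names out of the output.
total=$(find "$dir" -type f -exec grep -oh string {} + | wc -l)
echo "$total"

rm -rf "$dir"
```

For this fixture the total is 4: two matches in one.txt and two in two.txt.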

Solution 13 - Grep

Here is an AWK alternative (faster than grep, in my experience) that handles multiple matches of <url> per line within a collection of XML files in a directory:

awk '/<url>/{m=gsub("<url>","");total+=m}END{print total}' some_directory/*.xml

This works well in cases where some XML files don't have line breaks.
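A minimal sketch of the same idea on a throwaway fixture follows. The file names are made up, and the sketch uses "&" as the replacement so that gsub reports the number of matches without altering the line; that is a small variation, not the exact command above.

```shell
# Hypothetical fixture: one XML file without any line break, one with.
dir=$(mktemp -d)
printf '<url>a</url><url>b</url>' > "$dir/one.xml"
printf '<url>c</url>\n<other/>\n' > "$dir/two.xml"

# gsub returns how many replacements it made; replacing each match with
# itself ("&") leaves the text unchanged while still counting matches.
total=$(awk '{ total += gsub(/<url>/, "&") } END { print total + 0 }' "$dir"/*.xml)
echo "$total"

rm -rf "$dir"
```

For this fixture the total is 3: two matches on the single unterminated line of one.xml and one in two.xml.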

Solution 14 - Grep

Grep only solution which I tested with grep for windows:

grep -ro "pattern to find in files" "Directory to recursively search" | grep -c "pattern to find in files"

This solution counts all occurrences, even if there are several on one line. -r searches the directory recursively, and -o will "show only the part of a line matching PATTERN". That is what splits up multiple occurrences on a single line and makes grep print each match on its own line. Those newline-separated results are then piped back into grep with -c, which counts them using the same pattern.

Solution 15 - Grep

Another one-liner using basic command-line tools that handles multiple occurrences per line (it relies on sed accepting \n in the replacement, as GNU sed does):

cat * | sed 's/string/\nstring /g' | grep string | wc -l

Solution 16 - Grep

awk -v RS='' -v FPAT='fast' '{print NF,FILENAME}' <file1..N>

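This needs gawk, since FPAT is a gawk extension. FPAT='fast' defines a field as each match of fast, so NF is the number of occurrences in the current record. RS='' makes awk read blank-line-separated paragraphs as records, so the command prints one count and the file name for each paragraph.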

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Željko Filipin | View Question on Stackoverflow
Solution 1 - Grep | Jeremy Lavine | View Answer on Stackoverflow
Solution 2 - Grep | Bombe | View Answer on Stackoverflow
Solution 3 - Grep | Kaofu | View Answer on Stackoverflow
Solution 4 - Grep | Michael Haren | View Answer on Stackoverflow
Solution 5 - Grep | Joachim Sauer | View Answer on Stackoverflow
Solution 6 - Grep | azmeuk | View Answer on Stackoverflow
Solution 7 - Grep | Vijay | View Answer on Stackoverflow
Solution 8 - Grep | mumrah | View Answer on Stackoverflow
Solution 9 - Grep | Andriy Makukha | View Answer on Stackoverflow
Solution 10 - Grep | Kreuvf | View Answer on Stackoverflow
Solution 11 - Grep | Mitul Patel | View Answer on Stackoverflow
Solution 12 - Grep | Dmitry Tarashkevich | View Answer on Stackoverflow
Solution 13 - Grep | Excalibur | View Answer on Stackoverflow
Solution 14 - Grep | Quantic | View Answer on Stackoverflow
Solution 15 - Grep | NTwoO | View Answer on Stackoverflow
Solution 16 - Grep | Alan Tegel | View Answer on Stackoverflow