What is the best way to count "find" results?

Bash, Find

Bash Problem Overview


My current solution is find <expr> -exec printf '.' \; | wc -c, but this takes far too long when there are more than 10000 results. Is there a faster or better way to do this?

Bash Solutions


Solution 1 - Bash

Why not

find <expr> | wc -l

as a simple portable solution? Your original solution spawns a new printf process for every file found, and that's very expensive (as you've just found).

Note that this will overcount if you have filenames with newlines embedded, but if you have that then I suspect your problems run a little deeper.
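The overcount is easy to reproduce. The following is a minimal sketch, assuming a scratch directory created with mktemp; the directory and file names are made up for illustration.

```shell
# Hypothetical scratch directory holding two files,
# one of which has a newline embedded in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt"
touch "$demo_dir/with
newline.txt"

# Line-based count: the embedded newline splits one path across two lines.
line_count=$(find "$demo_dir" -type f | wc -l)

# Reference count taken from a glob, which is not affected by newlines.
actual=0
for f in "$demo_dir"/*; do
    actual=$((actual + 1))
done

echo "find | wc -l reports $line_count, actual number of files is $actual"
rm -rf "$demo_dir"
```

Here the line-based count comes out as 3 although only 2 files exist.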

Solution 2 - Bash

Try this instead (requires find's -printf support, which is a GNU extension):

find <expr> -type f -printf '.' | wc -c

It will be more reliable and faster than counting the lines.

Note that I use find's own -printf, not an external command.


Let's benchmark a bit:

$ ls -1
a
e
l
ll.sh
r
t
y
z

Benchmark of my snippet:

$ time find -type f -printf '.' | wc -c
8

real    0m0.004s
user    0m0.000s
sys     0m0.007s

With full lines:

$ time find -type f | wc -l
8

real    0m0.006s
user    0m0.003s
sys     0m0.000s

So my solution is faster =) (the important part is the real line)
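The two timings above differ by only a couple of milliseconds on eight files, which is too small to say much. A rough sketch of a larger run follows, assuming bash (for the time keyword) and GNU find; the directory, the file names and the figure of 2000 files are arbitrary choices for illustration.

```shell
# Hypothetical setup: 2000 empty files in a scratch directory.
bench_dir=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do
    : > "$bench_dir/file_$i"
    i=$((i + 1))
done

# GNU find only: emit one byte per file, then count the bytes.
time dot_count=$(find "$bench_dir" -type f -printf '.' | wc -c)

# Portable variant: emit one full path per file, then count the lines.
time line_count=$(find "$bench_dir" -type f | wc -l)

echo "printf count: $dot_count, line count: $line_count"
rm -rf "$bench_dir"
```

Both counts should agree here because none of the generated names contains a newline. The timings depend on the machine and on the file system cache, so it is worth running each variant several times.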

Solution 3 - Bash

This solution is certainly slower than some of the other find -> wc solutions here, but if you were inclined to do something else with the file names in addition to counting them, you could read from the find output.

n=0
while IFS= read -r -d '' file; do
    ((n++)) # count
    # maybe perform another action on "$file"
done < <(find <expr> -print0)
echo "$n"

It is just a modification of a solution found in BashGuide that properly handles files with non-standard names by making find delimit its output with NUL bytes using -print0, and reading from it using '' (NUL byte) as the loop delimiter.
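As an illustration of doing something else with the names while counting, here is a minimal sketch that also collects the paths ending in .txt. It assumes bash; the scratch directory, the file names and the txt_files array are hypothetical.

```shell
# Hypothetical scratch directory with three files, one with a newline in its name.
work_dir=$(mktemp -d)
touch "$work_dir/a.txt" "$work_dir/b.log" "$work_dir/odd
name.txt"

n=0          # total number of files seen
txt_files=() # paths ending in .txt, collected as the second action

# IFS= preserves leading and trailing whitespace; -d '' makes read stop at NUL bytes.
while IFS= read -r -d '' path; do
    n=$((n + 1))
    case $path in
        *.txt) txt_files+=("$path") ;;
    esac
done < <(find "$work_dir" -type f -print0)

echo "$n files, ${#txt_files[@]} of them .txt"
rm -rf "$work_dir"
```

This prints "3 files, 2 of them .txt"; the name containing a newline is counted once.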

Solution 4 - Bash

This is my countfiles function in my ~/.bashrc (it's reasonably fast, should work for Linux & FreeBSD find, and does not get fooled by file paths containing newline characters; the final wc just counts NUL bytes):

countfiles ()
{
    command find "${1:-.}" -type f -name "${2:-*}" -print0 |
        command tr -dc '\0' | command wc -c
    return 0
}

countfiles

countfiles ~ '*.txt'

Solution 5 - Bash

POSIX compliant and newline-proof:

find /path -exec printf %c {} + | wc -c

And, from my tests in /, not even two times slower than the other solutions, which are either not newline-proof or not portable.

Note the + instead of \;. That is crucial for performance, as \; spawns one printf command per file name, whereas + passes as many file names as it can to a single printf command. (And if there are too many arguments for one command, find spawns additional printf processes as needed to cope with it, so it would be as if

{ 
  printf %c very long argument list1
  printf %c very long argument list2
  printf %c very long argument list3 
} | wc -c

were called.)
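The difference in the number of spawned processes can be made visible with a small experiment. This is only a sketch: sh -c 'echo run' stands in for printf so that every invocation prints exactly one line, and the scratch directory and file names are made up.

```shell
# Hypothetical scratch directory with three files.
batch_dir=$(mktemp -d)
touch "$batch_dir/a" "$batch_dir/b" "$batch_dir/c"

# With \; the inner shell runs once per path (the directory itself plus 3 files).
per_file_runs=$(find "$batch_dir" -exec sh -c 'echo run' sh {} \; | wc -l)

# With + the paths are packed into as few invocations as possible.
batched_runs=$(find "$batch_dir" -exec sh -c 'echo run' sh {} + | wc -l)

echo "per-file: $per_file_runs, batched: $batched_runs"
rm -rf "$batch_dir"
```

With four paths this reports 4 invocations for \; and 1 for +.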

Solution 6 - Bash

I needed something that does not consume all of find's output, because some of the other commands that run also print things.

Without temporary files, this is only possible with a big caveat: you might get (far) more than one line of output, because the output command is executed once for every 800 to 1600 files or so.

find . -print -exec sh -c 'printf %c "$@" | wc -c' '' '{}' + # just print the numbers
find . -print -exec sh -c 'echo "Processed `printf %c "$@" | wc -c` items."' '' '{}' +

Generates this result:

Processed 1622 items.
Processed 1578 items.
Processed 1587 items.

An alternative is to use a temporary file:

find . -print -fprintf tmp.file .
wc -c <tmp.file # using the file as argument instead causes the file name to be printed after the count

echo "Processed `wc -c <tmp.file` items." # backtick form (legacy sh)
echo "Processed $(wc -c <tmp.file) items." # $(...) form (POSIX sh and bash)

The -print in each of the find commands does not influence the count at all.
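A slightly more defensive version of the temporary-file variant might look like the sketch below. It assumes GNU find (for -fprintf) and mktemp; the directory being scanned and all variable names are hypothetical.

```shell
# Hypothetical directory to scan, containing two files.
src_dir=$(mktemp -d)
touch "$src_dir/one" "$src_dir/two"

# Scratch file outside the scanned tree; it receives one '.' per path found.
count_file=$(mktemp)

# -print still writes every path to stdout, captured here to show it is untouched.
listing=$(find "$src_dir" -print -fprintf "$count_file" .)

# Redirecting the file into wc keeps the file name out of the output.
items=$(wc -c < "$count_file")

printf '%s\n' "$listing"
echo "Processed $items items."
rm -rf "$src_dir" "$count_file"
```

The count is 3 here because find also reports the starting directory itself.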

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | MechMK1 | View Question on Stack Overflow
Solution 1 - Bash | Brian Agnew | View Answer on Stack Overflow
Solution 2 - Bash | Gilles Quenot | View Answer on Stack Overflow
Solution 3 - Bash | John B | View Answer on Stack Overflow
Solution 4 - Bash | carlo | View Answer on Stack Overflow
Solution 5 - Bash | Quasímodo | View Answer on Stack Overflow
Solution 6 - Bash | seyfahni | View Answer on Stack Overflow