Exclude list of files from find

Linux, Shell, Find

Linux Problem Overview


If I have a list of filenames in a text file that I want to exclude when I run find, how can I do that? For example, I want to do something like:

find /dir -name "*.gz" -exclude_from skip_files

and get all the .gz files in /dir except for the files listed in skip_files. But find has no -exclude_from flag. How can I skip all the files in skip_files?

Linux Solutions


Solution 1 - Linux

I don't think find has an option like this. You could build the command using printf and your exclude list:

find /dir -name "*.gz" $(printf "! -name %s " $(cat skip_files))

Which is the same as doing:

find /dir -name "*.gz" ! -name first_skip ! -name second_skip .... etc
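The unquoted command substitution above is split on whitespace and is subject to glob expansion by the shell, so it can misbehave when a listed name contains a space or a `*`. A minimal sketch of the same idea that reads the list line by line is shown here; the function name find_gz_excluding is hypothetical, and it assumes skip_files holds one base name per line.

```shell
# Hypothetical helper: list *.gz files under a directory, skipping every
# base name listed (one per line) in a skip file.
# Note: -name still treats each entry as a glob pattern, as in the original.
find_gz_excluding() {
    dir=$1
    skip_file=$2
    set --    # reuse the positional parameters as an argument list
    # Read line by line so names containing spaces stay intact.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] && set -- "$@" ! -name "$name"
    done < "$skip_file"
    find "$dir" -name '*.gz' "$@"
}

# Usage: find_gz_excluding /dir skip_files
```

Each entry becomes its own `! -name` test, so the resulting find invocation has the same shape as the expanded command shown above.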

Alternatively you can pipe from find into grep:

find /dir -name "*.gz" | grep -vFf skip_files
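Note that grep -F matches each entry as a substring anywhere in the path, so an entry such as a.gz would also drop data.gz. If exact base-name matches are wanted, one option is to swap grep for awk, as in this sketch. The function name find_gz_skip_basenames is hypothetical, and the sketch assumes skip_files lists one base name per line and that no path contains a newline.

```shell
# Hypothetical helper: filter the output of find, dropping paths whose
# base name appears as a whole line in the skip file.
find_gz_skip_basenames() {
    dir=$1
    skip_file=$2
    find "$dir" -name '*.gz' |
        awk '
            # Lines read from the first operand (the skip list) fill a lookup table.
            FILENAME == ARGV[1] { skipped[$0] = 1; next }
            # Remaining lines come from standard input (the find output).
            {
                base = $0
                sub(/.*\//, "", base)    # strip everything up to the last slash
                if (!(base in skipped)) print
            }' "$skip_file" -
}

# Usage: find_gz_skip_basenames /dir skip_files
```

The lookup table makes each check a single hash lookup, so the filter stays fast for long lists.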

Solution 2 - Linux

This is what I usually do to remove some files from the result (in this case, I looked for all text files but wasn't interested in the valgrind memcheck reports we have here and there):

find . -type f -name '*.txt' ! -name '*mem*.txt'

It seems to be working.

Solution 3 - Linux

I think you can try something like:

find /dir \( -name "*.gz" ! -name skip_file1 ! -name skip_file2 ...so on \)

Solution 4 - Linux

find /var/www/test/ -type f \( -iname "*.*" ! -iname "*.php" ! -iname "*.jpg" ! -iname "*.png" \)

The above command lists all files except those with a .php, .jpg, or .png extension. This command works for me in PuTTY.

Solution 5 - Linux

Josh Jolly's grep solution works, but has O(N**2) complexity, making it too slow for long lists. If the lists are sorted first (O(N*log(N)) complexity), you can use comm, which has O(N) complexity:

find /dir -name '*.gz' | sort > everything_sorted
sort skip_files > skip_files_sorted
comm -23 everything_sorted skip_files_sorted | xargs . . . etc

See man comm on your system for details.
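Two details are easy to miss with comm: both inputs must be sorted in the same collation order, and whole lines are compared, so the skip list has to contain paths exactly as find prints them. The sketch below wraps the steps in a function and forces the C locale for both sort and comm. The function name list_gz_not_skipped is hypothetical, and the sketch assumes mktemp is available.

```shell
# Hypothetical helper: print the *.gz paths under a directory that do not
# appear as whole lines in the skip file.
list_gz_not_skipped() {
    dir=$1
    skip_file=$2
    sorted_skip=$(mktemp) || return 1
    # Use the same collation for sort and comm so they agree on ordering.
    LC_ALL=C sort "$skip_file" > "$sorted_skip"
    # comm -23 keeps lines unique to the first input (here: standard input).
    find "$dir" -name '*.gz' | LC_ALL=C sort | LC_ALL=C comm -23 - "$sorted_skip"
    rm -f "$sorted_skip"
}

# Usage: list_gz_not_skipped /dir skip_files | xargs ...
```

Reading the find output from standard input avoids the everything_sorted temporary file; only the sorted skip list is written to disk.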

Solution 6 - Linux

This solution goes through all files (it does not exclude them within the find command itself) but produces output that skips the files in an exclusion list. I found this useful when running a time-consuming command (find /dir -exec md5sum {} \;).

  1. Create a shell script to handle the skipping logic and run commands on the files found (make it executable with chmod, and replace echo with other commands):

    $ cat skip_file.sh
    #!/bin/bash
    # Skip "$1" if it exactly matches a line of files_to_skip.txt
    # (-F: fixed string, -x: whole line, -q: quiet).
    if ! grep -Fxq -- "$1" files_to_skip.txt; then
        # run your command
        echo "$1"
    fi

  2. Create a file named files_to_skip.txt with the list of files to skip (in the directory you are running from). Each entry must match the path exactly as find prints it.

  3. Then run find with it:

    find /dir -name "*.gz" -exec ./skip_file.sh {} \;
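Running one script per file starts a new process for every match. A possible variant, sketched below, uses the `-exec ... {} +` form so that find passes many paths to a single inline shell. The function name process_unskipped is hypothetical, and the skip file is passed as an argument instead of being hard-coded.

```shell
# Hypothetical helper: run a command on every *.gz path under a directory
# that is not listed (as a whole line) in the skip file.
process_unskipped() {
    dir=$1
    skip_file=$2
    find "$dir" -name '*.gz' -exec sh -c '
        skip_file=$1
        shift
        # The remaining arguments are the paths supplied by find.
        for path do
            # -F: fixed string, -x: whole line, -q: quiet
            if ! grep -Fxq -- "$path" "$skip_file"; then
                printf "%s\n" "$path"    # replace with your command
            fi
        done' sh "$skip_file" {} +
}

# Usage: process_unskipped /dir files_to_skip.txt
```

As with the script above, entries in the skip file need to match the paths exactly as find prints them.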

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Thomas Johnson | View Question on Stackoverflow
Solution 1 - Linux | Josh Jolly | View Answer on Stackoverflow
Solution 2 - Linux | Martin G | View Answer on Stackoverflow
Solution 3 - Linux | Jayesh Bhoi | View Answer on Stackoverflow
Solution 4 - Linux | Sohel Pathan | View Answer on Stackoverflow
Solution 5 - Linux | Eric Toombs | View Answer on Stackoverflow
Solution 6 - Linux | jdtogni | View Answer on Stackoverflow