Exclude list of files from find
Linux Problem Overview
If I have a list of filenames in a text file that I want to exclude when I run find, how can I do that? For example, I want to do something like:
find /dir -name "*.gz" -exclude_from skip_files
and get all the .gz files in /dir except for the files listed in skip_files. But find has no -exclude_from flag. How can I skip all the files in skip_files?
Linux Solutions
Solution 1 - Linux
I don't think find has an option like this, but you could build a command using printf and your exclude list:
find /dir -name "*.gz" $(printf "! -name %s " $(cat skip_files))
Which is the same as doing:
find /dir -name "*.gz" ! -name first_skip ! -name second_skip .... etc
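The command substitution above relies on word splitting, so it breaks on names that contain spaces, and the shell may expand glob characters found in the list. A minimal sketch that builds the same `! -name` arguments in a bash array, one per line of the list, might look like this (the function name `find_excluding` is made up for illustration, and bash is assumed):

```shell
#!/bin/bash
# Hypothetical helper: find_excluding DIR PATTERN SKIP_LIST
# Prints files under DIR whose name matches PATTERN, except names listed
# (one per line) in SKIP_LIST. Note that -name still treats each entry as
# a glob pattern, so characters such as * or [ in the list keep their
# special meaning to find.
find_excluding() {
    local dir=$1 pattern=$2 list=$3
    local -a excludes=()
    local name
    # Read the list line by line so names with spaces stay intact.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] && excludes+=( ! -name "$name" )
    done < "$list"
    find "$dir" -name "$pattern" "${excludes[@]}"
}
```

It would be called as `find_excluding /dir '*.gz' skip_files`.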
Alternatively, you can pipe the output of find into grep:
find /dir -name "*.gz" | grep -vFf skip_files
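One caveat with `grep -vFf` is that each line of the list is matched as a substring anywhere in the path, so an entry such as a.gz would also drop data.gz and anything under a directory whose name contains it. If only exact file names should be skipped, a rough sketch using awk could compare the last path component instead (the function name `filter_by_basename` and the `SKIP_LIST` environment variable are illustrative, and paths are assumed to contain no newlines):

```shell
#!/bin/bash
# Hypothetical helper: filter_by_basename SKIP_LIST
# Reads paths on stdin and prints those whose final component is not
# listed (one name per line) in SKIP_LIST.
filter_by_basename() {
    SKIP_LIST=$1 awk '
        BEGIN {
            # Load the skip list once into an associative array.
            list = ENVIRON["SKIP_LIST"]
            while ((getline line < list) > 0) skip[line] = 1
        }
        {
            base = $0
            sub(/.*\//, "", base)   # strip everything up to the last slash
            if (!(base in skip)) print
        }
    '
}
```

It would be used as `find /dir -name "*.gz" | filter_by_basename skip_files`.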
Solution 2 - Linux
This is what I usually do to remove some files from the result (in this case I looked for all text files but wasn't interested in a bunch of valgrind memcheck reports we have here and there):
find . -type f -name '*.txt' ! -name '*mem*.txt'
It seems to be working.
Solution 3 - Linux
I think you can try something like:
find /dir \( -name "*.gz" ! -name skip_file1 ! -name skip_file2 ...so on \)
Solution 4 - Linux
find /var/www/test/ -type f \( -iname "*.*" ! -iname "*.php" ! -iname "*.jpg" ! -iname "*.png" \)
The above command lists all files except those with the .php, .jpg, and .png extensions. This command works for me in PuTTY.
Solution 5 - Linux
Josh Jolly's grep solution works, but has O(N**2) complexity, making it too slow for long lists. If the lists are sorted first (O(N*log(N)) complexity), you can use comm, which has O(N) complexity:
find /dir -name '*.gz' |sort >everything_sorted
sort skip_files >skip_files_sorted
comm -23 everything_sorted skip_files_sorted | xargs . . . etc
See man comm on your system for details.
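Two details matter for the comm approach: comm compares whole lines, so the skip list has to contain paths exactly as find prints them, and both inputs must be sorted under the same collation. A sketch that does this without temporary files, using bash process substitution, might look like the following (the function name `find_minus_list` is hypothetical):

```shell
#!/bin/bash
# Hypothetical helper: find_minus_list DIR PATTERN SKIP_LIST
# Assumes SKIP_LIST holds full paths exactly as find prints them
# (for example /dir/sub/a.gz), one per line, with no newlines in names.
find_minus_list() {
    # Sort both inputs under the same locale so comm sees a consistent order.
    # -23 suppresses lines unique to the skip list and lines common to both.
    LC_ALL=C comm -23 \
        <(find "$1" -name "$2" | LC_ALL=C sort) \
        <(LC_ALL=C sort "$3")
}
```

It would be called as `find_minus_list /dir '*.gz' skip_files`, with the output piped to xargs or another command as needed.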
Solution 6 - Linux
This solution goes through all files (it does not exclude them within the find command itself), but produces output that skips the files in a list of exclusions. I found that useful while running a time-consuming command (find /dir -exec md5sum {} \;).
- You can create a shell script to handle the skipping logic and run commands on the files found (make it executable with chmod, and replace echo with other commands):
$ cat skip_file.sh
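#!/bin/bash
# Run the command only if "$1" is not listed verbatim in files_to_skip.txt.
# -x matches whole lines and -F disables regex, so dots in names stay literal.
if ! grep -qxF -- "$1" files_to_skip.txt; then
    # run your command
    echo "$1"
fi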
- Create a file with the list of files to skip, named files_to_skip.txt (in the directory you are running from).
- Then run find with the script:
find /dir -name "*.gz" -exec ./skip_file.sh {} \;
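Because `-exec ... \;` starts the script, and with it a grep, once per file, this can get slow on large trees. A possible variation, sketched here under the assumption of bash 4 or later (for associative arrays), loads the skip list once and reads NUL-delimited paths from find; the function name `run_unless_skipped` is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: run_unless_skipped SKIP_LIST COMMAND [ARGS...]
# Reads NUL-delimited paths on stdin and runs COMMAND on each path that
# is not listed verbatim in SKIP_LIST. Requires bash 4+.
run_unless_skipped() {
    local list=$1; shift
    local -A skip=()
    local line path
    # Load the skip list once instead of running grep for every file.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] && skip["$line"]=1
    done < "$list"
    while IFS= read -r -d '' path; do
        # Redirect the command's stdin so it cannot consume the path stream.
        [ -n "${skip[$path]+x}" ] || "$@" "$path" < /dev/null
    done
}
```

It would be used as `find /dir -name "*.gz" -print0 | run_unless_skipped files_to_skip.txt md5sum`, with files_to_skip.txt holding paths exactly as find prints them.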