Use find command but exclude files in two directories
Linux Problem Overview
I want to find files that end with _peaks.bed, but exclude files in the tmp and scripts folders.
My command is like this:
find . -type f \( -name "*_peaks.bed" ! -name "*tmp*" ! -name "*scripts*" \)
But it didn't work. The files in the tmp and scripts folders are still displayed.
Does anyone have ideas about this?
Linux Solutions
Solution 1 - Linux
Here's how you can specify that with find:
find . -type f -name "*_peaks.bed" ! -path "./tmp/*" ! -path "./scripts/*"
Explanation:
find . - Start find from the current working directory (recursively by default)
-type f - Specify to find that you only want files in the results
-name "*_peaks.bed" - Look for files with the name ending in _peaks.bed
! -path "./tmp/*" - Exclude all results whose path starts with ./tmp/
! -path "./scripts/*" - Also exclude all results whose path starts with ./scripts/
Testing the Solution:
$ mkdir a b c d e
$ touch a/1 b/2 c/3 d/4 e/5 e/a e/b
$ find . -type f ! -path "./a/*" ! -path "./b/*"
./d/4
./c/3
./e/a
./e/b
./e/5
You were pretty close: the -name option only considers the basename, whereas -path considers the entire path =)
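To see the difference concretely, here is a small sketch that runs both commands in a scratch directory. The directory layout and file names are made up for illustration, and it assumes a find that supports -path (GNU and BSD find both do).

```shell
# Scratch tree; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p tmp scripts results
touch tmp/a_peaks.bed scripts/b_peaks.bed results/c_peaks.bed

# -name tests only the basename (e.g. "a_peaks.bed"), so these exclusions never fire.
by_name=$(find . -type f -name "*_peaks.bed" ! -name "*tmp*" ! -name "*scripts*" | LC_ALL=C sort)

# -path tests the whole path, so files under ./tmp and ./scripts drop out.
by_path=$(find . -type f -name "*_peaks.bed" ! -path "./tmp/*" ! -path "./scripts/*" | LC_ALL=C sort)

printf 'with -name:\n%s\n\nwith -path:\n%s\n' "$by_name" "$by_path"
```

The first listing still contains all three files, while the second contains only the file under results.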
Solution 2 - Linux
Here is one way you could do it...
find . -type f -name "*_peaks.bed" | grep -Ev '^\./(tmp|scripts)/'
Solution 3 - Linux
Use
find \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print
or
find \( -path "./tmp" -o -path "./scripts" \) -prune -false -o -name "*_peaks.bed"
or
find \( -path "./tmp" -o -path "./scripts" \) ! -prune -o -name "*_peaks.bed"
The order is important. It evaluates from left to right. Always begin with the path exclusion.
Explanation
Do not use -not (or !) to exclude a whole directory. Use -prune.
As explained in the manual:
-prune The primary shall always evaluate as true; it shall cause find not to descend the current pathname if it is a directory. If the -depth primary is specified, the -prune primary shall have no effect.
and in the GNU find manual:
-path pattern
[...]
To ignore a whole
directory tree, use -prune rather than checking
every file in the tree.
Indeed, if you use -not -path "./pathname", find will evaluate the expression for each node under "./pathname".
find expressions are just condition evaluation.
\( \) - groups operations (you can use -path "./tmp" -prune -o -path "./scripts" -prune -o, but it is more verbose).
-path "./scripts" -prune - if -path returns true and the path is a directory, return true for that directory and do not descend into it.
-path "./scripts" ! -prune - it evaluates as (-path "./scripts") AND (! -prune). It reverts the "always true" of -prune to always false. It avoids printing "./scripts" as a match.
-path "./scripts" -prune -false - since -prune always returns true, you can follow it with -false to do the same as !.
-o - OR operator. If no operator is specified between two expressions, it defaults to the AND operator.
Hence, \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print
is expanded to:
[ (-path "./tmp" OR -path "./scripts") AND -prune ] OR ( -name "*_peaks.bed" AND -print )
The -print is important here because without it the expression is expanded to:
{ [ (-path "./tmp" OR -path "./scripts" ) AND -prune ] OR (-name "*_peaks.bed" ) } AND -print
-print is added by find, which is why most of the time you do not need to add it to your expression. And since -prune returns true, it will print "./scripts" and "./tmp".
It is not necessary in the other two commands because we switched -prune to always return false.
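The effect of the implicit -print can be checked with a short sketch like the one below. The scratch tree and file names are invented for the demonstration, and the starting point "." is written out explicitly for portability.

```shell
# Scratch tree; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p tmp scripts results
touch tmp/a_peaks.bed scripts/b_peaks.bed results/c_peaks.bed

# Explicit -print on the right-hand branch: only the real match is printed.
with_print=$(find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print | LC_ALL=C sort)

# No action anywhere: find applies an implicit -print to the whole expression,
# so the pruned directories (for which -prune returned true) are printed too.
without_print=$(find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" | LC_ALL=C sort)

printf 'with -print:\n%s\n\nwithout -print:\n%s\n' "$with_print" "$without_print"
```

The second listing contains ./scripts and ./tmp in addition to the matching file, which is the behaviour described above.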
Hint: You can use find -D opt expr 2>&1 1>/dev/null to see how the expression is optimized and expanded, and find -D search expr 2>&1 1>/dev/null to see which paths are checked.
Solution 4 - Linux
For me, this solution didn't work when running a command with -exec. I don't really know why, so my solution is:
find . -type f -path "./a/*" -prune -o -path "./b/*" -prune -o -exec gzip -f -v {} \;
Explanation: same as sampson-chen's, with the additions of:
-prune - ignore the preceding path of ...
-o - if the preceding tests do not match, act on the result (prune the directories and process the remaining results)
18:12 $ mkdir a b c d e
18:13 $ touch a/1 b/2 c/3 d/4 e/5 e/a e/b
18:13 $ find . -type f -path "./a/*" -prune -o -path "./b/*" -prune -o -exec gzip -f -v {} \;
gzip: . is a directory -- ignored
gzip: ./a is a directory -- ignored
gzip: ./b is a directory -- ignored
gzip: ./c is a directory -- ignored
./c/3: 0.0% -- replaced with ./c/3.gz
gzip: ./d is a directory -- ignored
./d/4: 0.0% -- replaced with ./d/4.gz
gzip: ./e is a directory -- ignored
./e/5: 0.0% -- replaced with ./e/5.gz
./e/a: 0.0% -- replaced with ./e/a.gz
./e/b: 0.0% -- replaced with ./e/b.gz
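The "is a directory" messages appear because -type f guards only the first -path test, so the final -exec branch also receives directories. One possible variant (a sketch, not the original poster's command) prunes the two directories themselves and places -type f directly before -exec. It assumes gzip is installed and reuses the a to e layout from the test above.

```shell
# Same layout as the test above, built in a scratch directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir a b c d e
touch a/1 b/2 c/3 d/4 e/5 e/a e/b

# Prune ./a and ./b outright; anything else must be a regular file
# before it is handed to gzip, so no directory ever reaches it.
find . \( -path "./a" -o -path "./b" \) -prune -o -type f -exec gzip -f {} \;

# List what is left to confirm which files were compressed.
remaining=$(find . -type f | LC_ALL=C sort)
printf '%s\n' "$remaining"
```

Files under a and b keep their names, and the others gain a .gz suffix.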
Solution 5 - Linux
You can try below:
find ./ ! \( -path ./tmp -prune \) ! \( -path ./scripts -prune \) -type f -name '*_peaks.bed'
Solution 6 - Linux
Try something like
find . \( -type f -name \*_peaks.bed -print \) -or \( -type d -and \( -name tmp -or -name scripts \) -and -prune \)
and don't be too surprised if I got it a bit wrong. If the goal is an exec (instead of print), just substitute it in place.
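A quick way to check a command like this is to run it against a scratch tree. The sketch below uses invented file names and the POSIX spellings -o and implicit AND in place of -or and -and. Note that matching directories with -name prunes a tmp or scripts directory at any depth, not only at the top level.

```shell
# Scratch tree with a nested tmp directory; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p tmp scripts results/tmp
touch tmp/a_peaks.bed scripts/b_peaks.bed results/c_peaks.bed results/tmp/d_peaks.bed

# Left branch prints matching regular files; right branch prunes any
# directory named tmp or scripts, wherever it sits in the tree.
found=$(find . \( -type f -name '*_peaks.bed' -print \) -o \( -type d \( -name tmp -o -name scripts \) -prune \))

printf '%s\n' "$found"
```

Only results/c_peaks.bed is listed here, because results/tmp is pruned along with the top-level tmp and scripts.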
Solution 7 - Linux
With these explanations you can meet your objective and many others. Just combine the parts as needed.
MODEL (annotated; remove the # comments before running, because a shell comment cannot follow a line-continuation backslash)
find ./ \
-iname "some_arg" -type f \ # File(s) that you want to find at any hierarchical level.
! -iname "some_arg" -type f \ # File(s) NOT to be found at any hierarchical level (exclude).
! -path "./file_name" \ # File(s) NOT to be found at this hierarchical level (exclude).
! -path "./folder_name/*" \ # Folder(s) NOT to be found at this hierarchical level (exclude).
-exec grep -IiFl 'text_content' -- {} \; # Case-insensitive ("-i") text search in the content of the found file(s), excluding binaries ("-I").
EXAMPLE
find ./ \
-iname "*" -type f \
! -iname "*pyc" -type f \
! -path "./.gitignore" \
! -path "./build/*" \
! -path "./__pycache__/*" \
! -path "./.vscode/*" \
! -path "./.git/*" \
-exec grep -IiFl 'title="Brazil - Country of the Future",' -- {} \;
Thanks!
[Ref(s).: https://unix.stackexchange.com/q/73938/61742 ]
EXTRA:
You can use the commands above together with your favorite editor and analyze the contents of the files found, for example...
vim -p $(find ./ \
-iname "*" -type f \
! -iname "*pyc" -type f \
! -path "./.gitignore" \
! -path "./build/*" \
! -path "./__pycache__/*" \
! -path "./.vscode/*" \
! -path "./.git/*" \
-exec grep -IiFl 'title="Brazil - Country of the Future",' -- {} \;)
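One caveat with the command substitution form is that the shell splits its output on whitespace, so a file name containing a space arrives as two arguments. A possible alternative, sketched below with invented file names and search text, uses -exec grep -q as a filter inside find, so each name is passed along intact. It assumes a grep that supports -I (GNU and BSD grep do).

```shell
# Scratch tree; every name and the search text are hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p src build
printf 'needle\n' > "src/my notes.txt"
printf 'needle\n' > build/out.txt
printf 'nothing here\n' > src/other.txt

# -exec ... \; acts as a test: it is true only when grep exits 0, so the
# following -print runs only for files whose content matches.
matches=$(find . -type f ! -path "./build/*" -exec grep -IiFq 'needle' -- {} \; -print)

printf '%s\n' "$matches"
```

Replacing -print with -exec vim -p {} + should open the matching files in tabs without going through word splitting, though that part is not exercised here.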