How do you diff a directory for only files of a specific type?

Linux, Bash

Linux Problem Overview


I have a question about the diff command: if I want a recursive directory diff, but only for a specific file type, how do I do that?

I tried the exclude option, but it seems to accept only one pattern:

$ diff /destination/dir/1 /destination/dir/2 -r -x *.xml

With this command I can only exclude XML files, even though the folders also contain image files (png, gif, jpg), txt, php, etc.

How can I diff only certain file types?

Linux Solutions


Solution 1 - Linux

You can specify -x more than once.

diff -x '*.foo' -x '*.bar' -x '*.baz' /destination/dir/1 /destination/dir/2

From the Comparing Directories section of info diff (on my system, I have to do info -f /usr/share/info/diff.info.gz):

> To ignore some files while comparing directories, use the '-x PATTERN' or '--exclude=PATTERN' option. This option ignores any files or subdirectories whose base names match the shell pattern PATTERN. Unlike in the shell, a period at the start of the base of a file name matches a wildcard at the start of a pattern. You should enclose PATTERN in quotes so that the shell does not expand it. For example, the option -x '*.[ao]' ignores any file whose name ends with '.a' or '.o'.

> This option accumulates if you specify it more than once. For example, using the options -x 'RCS' -x '*,v' ignores any file or subdirectory whose base name is 'RCS' or ends with ',v'.
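As a minimal sketch of the accumulating -x behaviour described above, the following script builds two throwaway directories and compares them. The directory layout and file names are made up for the demonstration; only the diff options come from the answer.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directories, created only for this demonstration.
tmp=$(mktemp -d)
mkdir -p "$tmp/1" "$tmp/2"

# The same three files on each side, all with differing content.
echo "old" > "$tmp/1/page.php"; echo "new" > "$tmp/2/page.php"
echo "old" > "$tmp/1/data.xml"; echo "new" > "$tmp/2/data.xml"
echo "old" > "$tmp/1/logo.png"; echo "new" > "$tmp/2/logo.png"

# -r recurses, -q only reports which files differ, each -x adds one exclusion.
# diff exits with status 1 when it finds differences, hence "|| true".
# LC_ALL=C keeps the messages untranslated.
result=$(cd "$tmp" && LC_ALL=C diff -rq -x '*.xml' -x '*.png' 1 2 || true)
echo "$result"

rm -rf "$tmp"
```

Only page.php should be reported, because the .xml and .png files are skipped by the two -x options.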

Solution 2 - Linux

Taken from (a version of) the man page:

-x PAT  --exclude=PAT
  Exclude files that match PAT.

-X FILE    --exclude-from=FILE
  Exclude files that match any pattern in FILE.

Each -x option takes a single pattern (although the option itself can be repeated). Alternatively, if you put all the patterns you want to exclude in a file, one per line, you can use the second flag like so:

$ diff /destination/dir/1 /destination/dir/2 -r -X exclude.pats

where exclude.pats is:

*.jpg
*.JPG
*.xml
*.XML
*.png
*.gif
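A small end-to-end sketch of the exclude-file approach, assuming GNU diff. The scratch directories and their contents are invented for illustration; exclude.pats holds the same patterns as listed above.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree with one nested directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/1/img" "$tmp/2/img"
echo a > "$tmp/1/index.php"; echo b > "$tmp/2/index.php"
echo a > "$tmp/1/conf.xml";  echo b > "$tmp/2/conf.xml"
echo a > "$tmp/1/img/x.JPG"; echo b > "$tmp/2/img/x.JPG"

# One shell pattern per line; patterns are matched against base names,
# so they also apply to files inside subdirectories.
printf '%s\n' '*.jpg' '*.JPG' '*.xml' '*.XML' '*.png' '*.gif' > "$tmp/exclude.pats"

# "|| true" because diff exits with status 1 when files differ.
result=$(cd "$tmp" && LC_ALL=C diff -rq -X exclude.pats 1 2 || true)
echo "$result"

rm -rf "$tmp"
```

The patterns are case-sensitive, which is why both *.jpg and *.JPG appear in the file.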

Solution 3 - Linux

You can also use find with -exec to call diff:

cd /destination/dir/1
find . -name '*.xml' -exec diff {} /destination/dir/2/{} \;
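The find approach only visits files that exist in the first directory, and diff reports an error for any file with no counterpart in the second. The sketch below is one possible variation that lists the .xml files that differ or are missing on the other side. The scratch directories and file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directories for the demonstration.
tmp=$(mktemp -d)
mkdir -p "$tmp/1/sub" "$tmp/2/sub"
echo a > "$tmp/1/sub/a.xml"; echo b > "$tmp/2/sub/a.xml"   # differs
echo a > "$tmp/1/same.xml";  echo a > "$tmp/2/same.xml"    # identical
echo a > "$tmp/1/only1.xml"                                # missing in 2
echo a > "$tmp/1/note.txt";  echo b > "$tmp/2/note.txt"    # not .xml, ignored

changed=$(
  cd "$tmp/1" &&
  find . -type f -name '*.xml' -exec sh -c '
    other=$1; shift
    for f in "$@"; do
      # diff -q exits non-zero when the files differ or the counterpart is missing
      diff -q "$f" "$other/$f" >/dev/null 2>&1 || printf "%s\n" "$f"
    done
  ' sh "$tmp/2" {} + | sort
)
echo "$changed"

rm -rf "$tmp"
```

Quoting '*.xml' matters: without the quotes the shell expands the pattern against the current directory before find ever sees it. Files that exist only in the second directory are still not detected by this approach.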

Solution 4 - Linux

diff has no --include option to complement --exclude.

One workaround is to build an exclude file that lists every file except the ones we want to include. We create file1 by using find to list all files whose extensions are not among those we want to include, with sed stripping the directory part so that only the file name remains, and then run:

diff --exclude-from=file1  PATH1/ PATH2/

For example:

find  PATH1/ -type f | grep --text -vP "\.(php|html)$" | sed 's/.*\///' | sort -u > file1
diff PATH1/ PATH2/ -rq -X file1 
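The workaround above can be sketched end to end as follows. This variation scans both directories so that unwanted files present only in the second one are excluded as well. The scratch layout is hypothetical, and find's -name tests are used in place of grep.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directories for the demonstration.
tmp=$(mktemp -d)
mkdir -p "$tmp/PATH1" "$tmp/PATH2"
echo a > "$tmp/PATH1/index.php"; echo b > "$tmp/PATH2/index.php"
echo a > "$tmp/PATH1/style.css"; echo b > "$tmp/PATH2/style.css"
echo b > "$tmp/PATH2/only2.png"

result=$(
  cd "$tmp" &&
  # List every file that is NOT .php or .html, keep only the base name,
  # and store the unique names as the exclude list.
  find PATH1 PATH2 -type f ! -name '*.php' ! -name '*.html' \
    | sed 's|.*/||' | sort -u > file1 &&
  # "|| true" because diff exits with status 1 when files differ.
  LC_ALL=C diff -rq -X file1 PATH1 PATH2 || true
)
echo "$result"

rm -rf "$tmp"
```

Each line of file1 is treated as a shell pattern, so a file name containing characters such as * or [ may exclude more than intended.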

Solution 5 - Linux

I used the following command to find the diff of all *.tmpl files between DIR1 and DIR2. In my case this didn't yield any false positives, but it may for you, depending on the contents of your DIRS.

Solution 6 - Linux

In case you find it convenient, you could use the following Makefile. Just run: "make patch"

#Makefile for patches

#Exclude following file endings
SUFFIX += o
SUFFIX += so
SUFFIX += exe
SUFFIX += pdf
SUFFIX += swp

#Exclude following folders
FOLDER += bin
FOLDER += lib
FOLDER += Image
FOLDER += models

OPTIONS = Naur

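# rm -f does not fail when the file does not exist yet.
# The leading '-' tells make to ignore diff's exit status,
# which is 1 whenever differences are found.
patch:
	rm -f test.patch
	-diff -$(OPTIONS) \
	$(foreach element, $(SUFFIX) , -x '*.$(element)') \
	$(foreach element, $(FOLDER) , -x '$(element)*') \
	org/ new/ > test.patch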

# Same as above with the directories swapped, producing the reverse patch.
unpatch:
	rm -f test.unpatch
	-diff -$(OPTIONS) \
	$(foreach element, $(SUFFIX) , -x '*.$(element)') \
	$(foreach element, $(FOLDER) , -x '$(element)*') \
	new/ org/ > test.unpatch

Solution 7 - Linux

The lack of a complementary --include makes it necessary to use such convoluted heuristic patterns as

*.[A-Zb-ik-uw-z]*

to find (mostly) java files!
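To see why the pattern only "mostly" works, the sketch below tests a few made-up file names against it with the shell's case statement. This assumes diff's pattern matching treats the bracket expression the same way the shell does in the C locale.

```shell
#!/usr/bin/env bash
# Use the C locale so that ranges such as b-i follow plain ASCII order.
LC_ALL=C
kept=""
for name in Main.java Main.class build.xml app.jar notes.txt; do
  case $name in
    # The exclude pattern: a dot followed by any uppercase letter or any
    # lowercase letter other than a, j or v.
    *.[A-Zb-ik-uw-z]*) ;;              # would be excluded by -x
    *) kept="$kept${kept:+ }$name" ;;  # would still be compared
  esac
done
echo "$kept"
```

Main.java survives because its extension starts with j, but so does app.jar, which illustrates the kind of false positive the heuristic lets through.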

Solution 8 - Linux

If you want to diff source files and keep it simple:

diff -rqx "*.a" -x "*.o" -x "*.d" ./PATH1 ./PATH2 | grep -E "\.cpp( |$)" | grep "^Files"

Remove the last grep if you want to get the files which exist in only one of the paths.
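A rough sketch of this filtering on a throwaway tree, with invented file names. In diff -q output the "Only in" lines end with the file name, so the sketch matches .cpp followed by either a space or the end of the line.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directories for the demonstration.
tmp=$(mktemp -d)
mkdir -p "$tmp/PATH1" "$tmp/PATH2"
echo a > "$tmp/PATH1/main.cpp"; echo b > "$tmp/PATH2/main.cpp"  # differs
echo a > "$tmp/PATH1/main.o";   echo b > "$tmp/PATH2/main.o"    # excluded by -x
echo a > "$tmp/PATH1/new.cpp"                                   # only in PATH1
echo a > "$tmp/PATH1/README";   echo b > "$tmp/PATH2/README"    # filtered by grep

# Every .cpp line: files that differ plus files present on one side only.
all=$(cd "$tmp" && LC_ALL=C diff -rq -x '*.a' -x '*.o' -x '*.d' PATH1 PATH2 \
        | grep -E '\.cpp( |$)')

# Only the .cpp files that exist on both sides and differ.
changed=$(printf '%s\n' "$all" | grep '^Files')

echo "$all"
echo "---"
echo "$changed"

rm -rf "$tmp"
```

Because grep filters the report after the fact, diff still compares every non-excluded file, so this saves reading time rather than run time.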

Solution 9 - Linux

Although it does not avoid the actual diff of other files, if your goal is to produce a patch file or similar, you can use filterdiff from the patchutils package, e.g. to patch only your .py changes:

diff -ruNp /path/1 /path/2 | filterdiff -i "*.py" | tee /path/to/file.patch

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | de_3 | View Question on Stackoverflow
Solution 1 - Linux | Dennis Williamson | View Answer on Stackoverflow
Solution 2 - Linux | jamesbtate | View Answer on Stackoverflow
Solution 3 - Linux | Alex Harui | View Answer on Stackoverflow
Solution 4 - Linux | Sérgio | View Answer on Stackoverflow
Solution 5 - Linux | Mikhail Golubitsky | View Answer on Stackoverflow
Solution 6 - Linux | Rafiz | View Answer on Stackoverflow
Solution 7 - Linux | Jerry Miller | View Answer on Stackoverflow
Solution 8 - Linux | Alex | View Answer on Stackoverflow
Solution 9 - Linux | cEz | View Answer on Stackoverflow