Recursively look for files with a specific extension

Tags: Linux, Bash, Recursion

Linux Problem Overview


I'm trying to find all files with a specific extension in a directory and its subdirectories using bash (latest Ubuntu LTS release).

This is what's written in a script file:

#!/bin/bash

directory="/home/flip/Desktop"
suffix="in"

browsefolders ()
  for i in "$1"/*; 
  do
	echo "dir :$directory"
	echo "filename: $i"
	#	echo ${i#*.}
	extension=`echo "$i" | cut -d'.' -f2`
	echo "Erweiterung $extension"
	if     [ -f "$i" ]; then 		
	
		if [ $extension == $suffix ]; then
			echo "$i ends with $in"
				
		else
			echo "$i does NOT end with $in"
		fi
	elif [ -d "$i" ]; then  
    browsefolders "$i"
	fi
  done
}
browsefolders  "$directory"

Unfortunately, when I run this script in a terminal, it says:

[: 29: in: unexpected operator

(with $extension instead of 'in')

What's going on here, where's the error? But this curly brace

Linux Solutions


Solution 1 - Linux

find "$directory" -type f -name "*.in"

is a bit shorter than that whole thing (and safer - deals with whitespace in filenames and directory names).

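Your script is probably failing in the test [ $extension == $suffix ]. For entries that don't have a . in their name, cut passes the whole path through unchanged, so the unquoted $extension can split into several words (or vanish if it is empty) and confuse [. Quote both variables, and use = instead of == if the script may be run with sh rather than bash, since == is a bash extension that POSIX [ does not guarantee.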
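For what it's worth, here is a minimal sketch of how the find one-liner could be wrapped for use in a script. The function names find_by_suffix and report_matches are made up for illustration and are not part of the original answer; the only find features assumed are -type, -name and -exec ... {} +.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): print every regular
# file under directory $1 whose name ends in ".$2".
find_by_suffix() {
    # Quoting "$1" keeps a directory name with spaces as one argument;
    # quoting the pattern stops the shell from expanding it before find runs.
    find "$1" -type f -name "*.$2"
}

# Hypothetical helper: act on each match without ever word-splitting the
# file names. find passes the names to the inner sh as separate arguments.
report_matches() {
    find "$1" -type f -name "*.$2" -exec sh -c '
        for f do
            printf "match: %s\n" "$f"
        done
    ' sh {} +
}
```

Called as find_by_suffix "$directory" in, this should behave like the command above; report_matches shows one way to process the results instead of just listing them.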

Solution 2 - Linux

find {directory} -type f -name '*.extension'

Example: To find all csv files in the current directory and its sub-directories, use:

find . -type f -name '*.csv'
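If you need more than one extension, the same pattern can be combined with -o (logical OR). This is only a sketch: the function name find_csv_or_tsv and the choice of .tsv as the second extension are assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical helper: list regular files under directory $1 that end in
# .csv or .tsv. The escaped parentheses group the two -name tests so that
# -type f applies to both of them.
find_csv_or_tsv() {
    find "$1" -type f \( -name '*.csv' -o -name '*.tsv' \)
}
```

Without the grouping, -type f would bind only to the first -name test, so directories ending in .tsv could slip through.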

Solution 3 - Linux

The syntax I use is a bit different than what @Matt suggested:

find "$directory" -type f -name \*.in

(it's one less keystroke).

Solution 4 - Linux

Without using find:

du -a "$directory" | cut -f2- | grep '\.in$'

Solution 5 - Linux

find "$PWD" -type f -name "*.in"

Solution 6 - Linux

Although find is useful here, the shell itself can do this without any external tools. Bash provides extended globbing options that let you collect the file names under recursive paths that match the extensions you want.

The relevant options are set with the shopt builtin: -s enables an option and -u disables it. extglob turns on the extended pattern-matching operators, globstar lets ** recurse through all the directories, and nullglob makes an unmatched glob expand to zero words instead of being left in place as a literal string.

shopt -s extglob nullglob globstar

Now all you need to do is form the glob expression that matches files of a certain extension, as below. We store the glob results in an array because, when the array is expanded with proper quoting, file names with special characters remain intact and are not broken up by the shell's word splitting.

For example, to list all the *.csv files in the recursive paths:

fileList=(**/*.csv)

The ** pattern recurses through the sub-folders, and *.csv is the glob that matches any file with that extension. To print the actual files, just do

printf '%s\n' "${fileList[@]}"

Using an array and doing a proper quoted expansion is the right way when used in shell scripts, but for interactive use, you could simply use ls with the glob expression as

ls -1 -- **/*.csv

This can be extended to match multiple extensions (similar to combining several -name tests in the find command). For example, to get all image files recursively, i.e. those with the extensions *.gif, *.png and *.jpg, all you need to do is

ls -1 -- **/+(*.jpg|*.gif|*.png)

This can also be extended to negate the results. With the same syntax, you can use the glob to exclude files of certain types. Assuming you want to exclude file names with the extensions above, you could do

excludeResults=()
excludeResults=(**/!(*.jpg|*.gif|*.png))
printf '%s\n' "${excludeResults[@]}"

The construct !() is a negation that excludes anything matching the patterns listed inside, and | is an alternation operator, just as in Extended Regular Expressions, that does an OR match of the globs.

Note that this extended glob support is not available in the POSIX Bourne shell; it is specific to bash (globstar requires bash 4.0 or later). So if you are considering portability of scripts across POSIX shells and bash, this option isn't the right one.
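Putting the pieces above together, a script might wrap the glob in a function like the following. Treat it as a sketch under some assumptions: the name list_by_extension is hypothetical, bash 4.0 or later is assumed for globstar, and the body runs in a subshell (note the parentheses instead of braces) so the shopt settings and the cd do not leak into the caller.

```shell
#!/bin/bash
# Hypothetical helper: print paths, relative to directory $1, of everything
# whose name ends in ".$2", searching recursively with globstar.
list_by_extension() (
    # Running in a subshell keeps these option changes local to the call.
    shopt -s nullglob globstar || exit 1
    cd -- "$1" || exit 1

    # "$2" is quoted so the extension is taken literally; **/ matches zero
    # or more directory levels, so files directly in $1 are included too.
    files=(**/*."$2")

    # With nullglob set, no match leaves the array empty: print nothing.
    if (( ${#files[@]} > 0 )); then
        printf '%s\n' "${files[@]}"
    fi
)
```

For example, list_by_extension . csv should print the same names as the fileList example earlier in this answer.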

Solution 7 - Linux

  1. There's a { missing after browsefolders ()
  2. All $in should be $suffix
  3. The line with cut gets you only the middle part of front.middle.extension. You should read up in your shell manual on ${varname%%pattern} and friends.
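As a quick illustration of point 3, this is roughly how the prefix and suffix removal expansions behave on such a name. The sample path and the helper get_extension are invented for the example.

```shell
#!/bin/bash
# Sample path, made up for this illustration.
path="/home/flip/Desktop/front.middle.extension"

name=${path##*/}    # longest prefix matching */ removed -> front.middle.extension
ext=${name##*.}     # longest prefix matching *. removed -> extension
stem=${name%.*}     # shortest suffix matching .* removed -> front.middle
first=${name%%.*}   # longest suffix matching .* removed  -> front

# Hypothetical helper: print the extension of a path, or an empty line if
# the file name contains no dot at all. Without the case guard, ${1##*.}
# would return the whole string for a name that has no dot.
get_extension() {
    local base=${1##*/}
    case $base in
        *.*) printf '%s\n' "${base##*.}" ;;
        *)   printf '\n' ;;
    esac
}
```

Note that a hidden file such as .bashrc would be reported as having the extension bashrc by this sketch; whether that is what you want depends on your use case.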

I assume you do this as an exercise in shell scripting, otherwise the find solution already proposed is the way to go.

To check for proper shell syntax, without running a script, use sh -n scriptname.
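If it is indeed an exercise, one possible corrected version of the script, applying the three points listed above, might look like this. It is a sketch rather than the only way to do it: the case statement on the file name is my substitution for the cut line, and the loop variable is renamed to entry and made local.

```shell
#!/bin/bash

directory="/home/flip/Desktop"
suffix="in"

# Walk directory $1 recursively and report, for every regular file,
# whether its name ends in ".$suffix".
browsefolders() {
    local entry
    for entry in "$1"/*; do
        if [ -f "$entry" ]; then
            # ${entry##*/} strips the directory part, so a dot in a
            # directory name cannot be mistaken for an extension.
            case ${entry##*/} in
                *."$suffix") echo "$entry ends with .$suffix" ;;
                *)           echo "$entry does NOT end with .$suffix" ;;
            esac
        elif [ -d "$entry" ]; then
            browsefolders "$entry"
        fi
    done
}

browsefolders "$directory"
```

If a directory is empty, the pattern "$1"/* stays literal, but since it is neither a file nor a directory the loop body simply does nothing. Hidden files are skipped because * does not match names that start with a dot.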

Solution 8 - Linux

To find all the pom.xml files in your current directory and print them, you can use:

find . -name 'pom.xml' -print

Solution 9 - Linux

find "$directory" -type f -name "*.in" | grep "$substring"

Solution 10 - Linux

for file in "${LOCATION_VAR}"/*.zip
do
  echo "$file"
done 
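Keep in mind that this loop only looks at one directory level, and that when nothing matches, the pattern itself is passed through as a literal string. A small guard takes care of the second point; the function name list_zips is made up for this sketch.

```shell
#!/bin/bash
# Hypothetical helper: print the .zip files directly inside directory $1
# (no recursion), printing nothing at all when there are none.
list_zips() {
    local file
    for file in "$1"/*.zip; do
        # If the glob matched nothing, $file is the literal pattern and
        # does not exist on disk, so skip it.
        [ -e "$file" ] || continue
        printf '%s\n' "$file"
    done
}
```

It could be called as list_zips "${LOCATION_VAR}".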

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | flip | View Question on Stackoverflow
Solution 1 - Linux | Mat | View Answer on Stackoverflow
Solution 2 - Linux | Mohammad AlQanneh | View Answer on Stackoverflow
Solution 3 - Linux | Scott C Wilson | View Answer on Stackoverflow
Solution 4 - Linux | rtrn | View Answer on Stackoverflow
Solution 5 - Linux | kip2 | View Answer on Stackoverflow
Solution 6 - Linux | Inian | View Answer on Stackoverflow
Solution 7 - Linux | Jens | View Answer on Stackoverflow
Solution 8 - Linux | Bharat Yadav | View Answer on Stackoverflow
Solution 9 - Linux | Sergiu | View Answer on Stackoverflow
Solution 10 - Linux | Avinash Kumar Mishra | View Answer on Stackoverflow