How to loop through file names returned by find?


Bash Problem Overview


x=$(find . -name "*.txt")
echo $x

If I run the code above in a Bash shell, I get a single string containing the file names separated by blanks, not a list.

Of course, I could split that string on blanks to get a list, but I'm sure there is a better way to do it.

So what is the best way to loop through the results of a find command?

Bash Solutions


Solution 1 - Bash

TL;DR: If you're just here for the most correct answer, you probably want my personal preference (see the bottom of this post):

# execute `process` once for each file
find . -name '*.txt' -exec process {} \;

If you have time, read through the rest to see several different ways and the problems with most of them.


The full answer:

The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:

for i in $x; do # Not recommended, will break on whitespace
    process "$i"
done

Marginally better, cut out the temporary variable x:

for i in $(find . -name '*.txt'); do # Not recommended, will break on whitespace
    process "$i"
done

It is much better to glob when you can. White-space safe, for files in the current directory:

for i in *.txt; do # Whitespace-safe but not recursive.
    process "$i"
done
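
To make the difference between the two loops above concrete, here is a minimal sketch. The scratch directory and the file name "my notes.txt" are made up for the demonstration, and it assumes the path returned by mktemp contains no whitespace itself.

```bash
#!/usr/bin/env bash
# Hypothetical demo data: a scratch directory holding one awkwardly named file.
demo_dir=$(mktemp -d)
touch "$demo_dir/my notes.txt"

# Unquoted command substitution: the single path is split at the space.
split_count=0
for i in $(find "$demo_dir" -name '*.txt'); do
    split_count=$((split_count + 1))
done

# Glob: the path stays intact as one word.
glob_count=0
for i in "$demo_dir"/*.txt; do
    glob_count=$((glob_count + 1))
done

echo "command substitution: $split_count iterations, glob: $glob_count iteration"
rm -rf "$demo_dir"
```

The loop over the command substitution runs twice for one file, which is exactly the breakage the comments above warn about.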

By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:

# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
    process "$i"
done
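
As a rough illustration of the recursive glob, the sketch below builds a small made-up directory tree and collects the matches in an array. It assumes Bash 4 or later, since older versions (such as the Bash 3.2 shipped with macOS) have no globstar option.

```bash
#!/usr/bin/env bash
shopt -s globstar

# Hypothetical demo tree with files at three depths, one path containing spaces.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub dir/deeper"
touch "$demo_dir/a.txt" "$demo_dir/sub dir/b c.txt" "$demo_dir/sub dir/deeper/d.txt"

# ** matches zero or more directory levels, so a.txt at the top is included.
matches=()
for i in "$demo_dir"/**/*.txt; do
    matches+=("$i")
done

printf 'found: %s\n' "${matches[@]}"
echo "total: ${#matches[@]}"
rm -rf "$demo_dir"
```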

In some cases, e.g. if the file names are already in a file, you may need to use read:

# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
    process "$line"
done < filename

read can be used safely in combination with find by setting the delimiter appropriately:

find . -name '*.txt' -print0 | 
    while IFS= read -r -d '' line; do 
        process "$line"
    done
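
The sketch below compares newline-delimited and NUL-delimited reading on a made-up file whose name contains a newline. It feeds the loop through process substitution rather than a pipe only so that the counter is still visible after the loop; the read options are the same as above.

```bash
#!/usr/bin/env bash
# Hypothetical demo data: one ordinary name and one name containing a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/two"$'\n'"lines.txt"

# Newline-delimited output: the second name shows up as two separate lines.
newline_records=$(( $(find "$demo_dir" -name '*.txt' | wc -l) ))

# NUL-delimited output: one record per file, whatever the name contains.
nul_records=0
while IFS= read -r -d '' line; do
    nul_records=$((nul_records + 1))
done < <(find "$demo_dir" -name '*.txt' -print0)

echo "lines: $newline_records, NUL-terminated records: $nul_records"
rm -rf "$demo_dir"
```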

For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:

# execute `process` once for each file
find . -name \*.txt -exec process {} \;

# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +

# using xargs*
find . -name \*.txt -print0 | xargs -0 process

# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument

find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompt before running the command for each file) using -ok instead of -exec (or -okdir instead of -execdir).

*: Technically, both find and xargs (by default) will run the command with as many arguments as they can fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files it won't matter, and if you exceed the length but need them all on the same command line, you're out of luck and will have to find a different way.
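
The batching described in the footnote can be made visible with a small sketch. The -n 2 option is used here purely to force tiny batches; by default xargs only splits when the argument list would exceed the system limit. The five words stand in for file names.

```bash
#!/usr/bin/env bash
# Five NUL-terminated items, at most two per invocation of echo.
batches=$(printf '%s\0' one two three four five | xargs -0 -n 2 echo)
printf '%s\n' "$batches"

# Each output line corresponds to one separate run of the command.
batch_count=$(( $(printf '%s\n' "$batches" | wc -l) ))
echo "echo was run $batch_count times"
```

A command that needs to see every file in a single invocation would get incomplete input in each of these runs, which is the situation the footnote warns about.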

Solution 2 - Bash

Whatever you do, don't use a for loop:

# Don't do this
for file in $(find . -name "*.txt")
do
    …code using "$file"
done

Three reasons:

  • For the for loop to even start, the find must run to completion.
  • If a file name has any whitespace (including space, tab or newline) in it, it will be split into several separate names.
  • Although it is unlikely to matter now, the shell has to hold the entire output of find in memory and word-split it before the first iteration, so a very large result set is expensive to process this way.

Always use a while read construct:

find . -name "*.txt" -print0 | while IFS= read -r -d $'\0' file
do
    …code using "$file"
done

The loop executes while the find command is still running. It also works when a file name contains whitespace, and because the names are read one at a time, the full list never has to be held in memory.

The -print0 option makes find use the NUL character as the file separator instead of a newline, and -d $'\0' makes read use NUL as its delimiter. IFS= and -r keep read from trimming whitespace and from interpreting backslashes.

Solution 3 - Bash

find . -name "*.txt"|while IFS= read -r fname; do
  echo "$fname"
done

Note: this method and the (second) method shown by bmargulies are safe to use with white space in the file/folder names.

In order to also have the - somewhat exotic - case of newlines in the file/folder names covered, you will have to resort to the -exec predicate of find like this:

find . -name '*.txt' -exec echo "{}" \;

The {} is the placeholder for the found item and the \; is used to terminate the -exec predicate.

And for the sake of completeness let me add another variant - you gotta love the *nix ways for their versatility:

find . -name '*.txt' -print0|xargs -0 -n 1 echo

This separates the printed items with a \0 character, which to my knowledge isn't allowed in file or folder names on any file system, and should therefore cover all bases. xargs then picks them up one by one ...

Solution 4 - Bash

Filenames can include spaces and even control characters. Spaces are among the default delimiters for word splitting in Bash, so x=$(find . -name "*.txt") from the question is not recommended at all. If find returns a filename containing spaces, e.g. "the file.txt", you will get two separate strings when you process x in a loop. You can improve this by changing the delimiter (the Bash IFS variable), e.g. to \r\n, but filenames can also contain control characters, so this is not a (completely) safe method.

From my point of view, there are 2 recommended (and safe) patterns for processing files:

1. Use for loop & filename expansion:

for file in ./*.txt; do
    [[ ! -e $file ]] && continue  # continue, if file does not exist
    # single filename is in $file
    echo "$file"
    # your code here
done

2. Use find-read-while & process substitution

while IFS= read -r -d '' file; do
    # single filename is in $file
    echo "$file"
    # your code here
done < <(find . -name "*.txt" -print0)

Remarks

on Pattern 1:

  1. bash returns the search pattern ("*.txt") if no matching file is found - so the extra line "continue, if file does not exist" is needed. see [Bash Manual, Filename Expansion][1]
  2. shell option nullglob can be used to avoid this extra line.
  3. "If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from Bash Manual above)
  4. shell option globstar: "If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match." see [Bash Manual, Shopt Builtin][2]
  5. other options for filename expansion: extglob, nocaseglob, dotglob & shell variable GLOBIGNORE
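
The effect of nullglob from remark 2 can be sketched as follows. The empty scratch directory is made up for the demonstration, and the sketch assumes failglob is not set.

```bash
#!/usr/bin/env bash
empty_dir=$(mktemp -d)

# Default behaviour: the unmatched pattern is passed through literally,
# so the loop body runs once with the pattern itself in $file.
shopt -u nullglob
without_nullglob=0
for file in "$empty_dir"/*.txt; do
    without_nullglob=$((without_nullglob + 1))
done

# With nullglob: the pattern expands to nothing and the body never runs.
shopt -s nullglob
with_nullglob=0
for file in "$empty_dir"/*.txt; do
    with_nullglob=$((with_nullglob + 1))
done
shopt -u nullglob   # restore the default for the rest of the script

echo "iterations without nullglob: $without_nullglob, with nullglob: $with_nullglob"
rmdir "$empty_dir"
```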

on Pattern 2:

  1. filenames can contain blanks, tabs, spaces, newlines, ... to process filenames in a safe way, find with -print0 is used: filename is printed with all control characters & terminated with NUL. see also [Gnu Findutils Manpage, Unsafe File Name Handling][3], [safe File Name Handling][4], [unusual characters in filenames][5]. See David A. Wheeler below for detailed discussion of this topic.

  2. There are some possible patterns to process find results in a while loop. Others (kevin, David W.) have shown how to do this using pipes:

    files_found=1
    find . -name "*.txt" -print0 |
    while IFS= read -r -d '' file; do
        # single filename in $file
        echo "$file"
        files_found=0 # not working example
        # your code here
    done
    [[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"

    When you try this piece of code, you will see that it does not work: files_found is still 1 after the loop, so the code always echoes "no files found". The reason is that each command of a pipeline is executed in a separate subshell, so a variable changed inside the loop (in the subshell) does not change the variable in the main shell script. This is why I recommend using process substitution as the "better", more useful, more general pattern.
    See [I set variables in a loop that's in a pipeline. Why do they disappear...][6] (from Greg's Bash FAQ) for a detailed discussion on this topic.
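
A minimal side-by-side sketch of the two patterns, using two made-up files and a counter instead of a flag. It assumes Bash's default settings (in particular, that the lastpipe option is not enabled).

```bash
#!/usr/bin/env bash
# Hypothetical demo data.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

# Pipeline: the while loop runs in a subshell, so its increments are lost.
pipe_count=0
find "$demo_dir" -name '*.txt' -print0 |
while IFS= read -r -d '' file; do
    pipe_count=$((pipe_count + 1))
done

# Process substitution: the loop runs in the current shell.
subst_count=0
while IFS= read -r -d '' file; do
    subst_count=$((subst_count + 1))
done < <(find "$demo_dir" -name '*.txt' -print0)

echo "after pipeline: $pipe_count, after process substitution: $subst_count"
rm -rf "$demo_dir"
```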

Additional References & Sources:

  • [Gnu Bash Manual, Pattern Matching][7]

  • [Filenames and Pathnames in Shell: How to do it Correctly, David A. Wheeler][8]

  • [Why you don't read lines with "for", Greg's Wiki][9]

  • [Why you shouldn't parse the output of ls(1), Greg's Wiki][10]

  • [Gnu Bash Manual, Process Substitution][11]

[1]: https://www.gnu.org/software/bash/manual/html_node/Filename-Expansion.html#Filename-Expansion "Bash Manual, Filename Expansion"
[2]: https://www.gnu.org/software/bash/manual/html_node/The-Shopt-Builtin.html#The-Shopt-Builtins "Bash Manual, Shopt Builtin"
[3]: http://www.gnu.org/software/findutils/manual/html_mono/find.html#Unsafe-File-Name-Handling
[4]: http://www.gnu.org/software/findutils/manual/html_mono/find.html#Safe-File-Name-Handling
[5]: http://www.gnu.org/software/findutils/manual/html_mono/find.html#Unusual-Characters-in-File-Names
[6]: http://mywiki.wooledge.org/BashFAQ/024 "I set variables in a loop that's in a pipeline. Why do they disappear ..."
[7]: https://www.gnu.org/software/bash/manual/html_node/Pattern-Matching.html#Pattern-Matching
[8]: http://www.dwheeler.com/essays/filenames-in-shell.html
[9]: http://mywiki.wooledge.org/DontReadLinesWithFor
[10]: http://mywiki.wooledge.org/ParsingLs
[11]: https://www.gnu.org/software/bash/manual/html_node/Process-Substitution.html

Solution 5 - Bash

(Updated to include @Socowi's excellent speed improvement)

With any $SHELL that supports it (dash/zsh/bash...):

# the _ fills $0, so that every file name ends up in "$@"
find . -name "*.txt" -exec "$SHELL" -c '
    for i in "$@" ; do
        echo "$i"
    done
' _ {} +

Done.
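
One detail worth knowing when passing file names to an inline script: the first argument after the script text becomes $0 and is not part of "$@". The sketch below shows this with plain sh and three made-up file names; the underscore is just a conventional dummy value, any word would do.

```bash
#!/usr/bin/env bash
# The same inline script, run with and without a placeholder for $0.
show_args='
    echo "zeroth: $0"
    for i in "$@"; do echo "arg: $i"; done
'

# Without a placeholder, a.txt is consumed as $0 and never reaches the loop.
without_placeholder=$(sh -c "$show_args" a.txt b.txt c.txt)

# With a placeholder, all three names are iterated.
with_placeholder=$(sh -c "$show_args" _ a.txt b.txt c.txt)

printf '%s\n---\n%s\n' "$without_placeholder" "$with_placeholder"
```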


Original answer (shorter, but slower):

find . -name "*.txt" -exec $SHELL -c '
    echo "$0"
' {} \;

Solution 6 - Bash

If you can assume the file names don't contain newlines, you can read the output of find into a Bash array using the following command:

readarray -t x < <(find . -name '*.txt')

Note:

  • -t causes readarray to strip newlines.
  • If readarray is the last command of a pipeline, it runs in a subshell and the array is lost when the pipeline ends, hence the process substitution.
  • readarray is available since Bash 4.

Bash 4.4 and up also supports the -d parameter for specifying the delimiter. Using the null character, instead of newline, to delimit the file names works also in the rare case that the file names contain newlines:

readarray -d '' x < <(find . -name '*.txt' -print0)

readarray can also be invoked as mapfile with the same options.
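
As a hedged usage sketch of the NUL-delimited variant, the following fills an array from three made-up files (one with a space, one with a newline in its name) and loops over it. It assumes Bash 4.4 or later.

```bash
#!/usr/bin/env bash
# Hypothetical demo data.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt" "$demo_dir/with"$'\n'"newline.txt"

# One array element per NUL-terminated name.
readarray -d '' files < <(find "$demo_dir" -name '*.txt' -print0)

echo "found ${#files[@]} files"
for f in "${files[@]}"; do
    # %q prints the name in a shell-quoted form so odd characters are visible
    printf 'file: %q\n' "$f"
done
rm -rf "$demo_dir"
```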

Reference: https://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream

Solution 7 - Bash

# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
  process_one $x
done

or

# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one

Solution 8 - Bash

I like to assign the output of find to a variable first and switch IFS to a newline, as follows:

FilesFound=$(find . -name "*.txt")

IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
	echo "${counter}: ${file}"
	let counter++;
done
IFS="$IFSbkp"

As commented by @Konrad Rudolph, this will not work with newlines in file names. I still think it is handy, as it covers most cases where you need to loop over command output.
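
A possible variation on this approach, sketched under my own assumptions: wrapping the loop in a function (the name list_files is made up) lets local IFS take care of restoring the old value, and set -f keeps the shell from glob-expanding names that happen to contain * or ?. It still cannot handle newlines in file names.

```bash
#!/usr/bin/env bash
# Hypothetical demo data.
demo_dir=$(mktemp -d)
touch "$demo_dir/with space.txt" "$demo_dir/plain.txt"

list_files() {
    local IFS=$'\n'   # restored automatically when the function returns
    local file counter=0
    set -f            # disable globbing while the unquoted expansion is split
    for file in $(find "$1" -name '*.txt'); do
        counter=$((counter + 1))
        echo "${counter}: ${file}"
    done
    set +f
}

listing=$(list_files "$demo_dir")
printf '%s\n' "$listing"
rm -rf "$demo_dir"
```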

Solution 9 - Bash

You can put the filenames returned by find into an array like this:

array=()
while IFS=  read -r -d ''; do
    array+=("$REPLY")
done < <(find . -name '*.txt' -print0)

Now you can just loop through the array to access individual items and do whatever you want with them.

Note: It's white space safe.
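
For illustration, looping over such an array might look like the sketch below, shown with two made-up files. The quotes around "${array[@]}" are what keep each name intact.

```bash
#!/usr/bin/env bash
# Hypothetical demo data.
demo_dir=$(mktemp -d)
touch "$demo_dir/a b.txt" "$demo_dir/c.txt"

array=()
while IFS= read -r -d ''; do
    array+=("$REPLY")
done < <(find "$demo_dir" -name '*.txt' -print0)

# Iterate over the elements.
for file in "${array[@]}"; do
    printf 'file: %s\n' "$file"
done

# Iterate over the indices when the position is needed as well.
for idx in "${!array[@]}"; do
    printf '%d -> %s\n' "$idx" "${array[idx]}"
done
rm -rf "$demo_dir"
```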

Solution 10 - Bash

Based on other answers and a comment by @phk, this version uses file descriptor 3,
which leaves stdin available for use inside the loop:

while IFS= read -r f <&3; do
	echo "$f"

done 3< <(find . -iname "*filename*")
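
To show why the separate descriptor matters, here is a small sketch in which the loop body itself reads from stdin. The two files and the here-document standing in for user input are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical demo data.
demo_dir=$(mktemp -d)
touch "$demo_dir/one.txt" "$demo_dir/two.txt"

answers=()
while IFS= read -r f <&3; do
    # This read takes its input from stdin, not from the find output on fd 3.
    IFS= read -r answer
    answers+=("$answer")
    echo "$f -> $answer"
done 3< <(find "$demo_dir" -name '*.txt') <<'EOF'
yes
no
EOF

rm -rf "$demo_dir"
```

If the file names were fed through stdin instead, the inner read would swallow every second file name.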

Solution 11 - Bash

You can store the output of find in an array if you want to use it later (note that this relies on word splitting, so it breaks on file names containing whitespace):

array=($(find . -name "*.txt"))

Now, to print each element on a new line, you can either use a for loop that iterates over all the elements of the array, or use a printf statement.

for i in "${array[@]}"; do echo "$i"; done

or

printf '%s\n' "${array[@]}"

You can also use:

for file in "`find . -name "*.txt"`"; do echo "$file"; done

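This prints each filename on its own line. Note, however, that the loop body runs only once: the quoted command substitution yields a single string containing all the file names separated by newlines, so it does not iterate over individual files.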

To only print the find output in list form, you can use either of the following:

find . -name "*.txt" -print 2>/dev/null

or

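find . -name "*.txt" -print 2>&1 | grep -v 'Permission denied'

This removes the "Permission denied" error messages and prints only the file names, one per line. (The error messages go to stderr, so the second variant has to redirect stderr into the pipe for grep to see them.)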

If you wish to do something with the filenames, storing it in array is good, else there is no need to consume that space and you can directly print the output from find.

Solution 12 - Bash

As already noted in the top answer by Kevin, the best solution is to use a for loop with a Bash glob. Since Bash globs are not recursive by default, this can be worked around with a recursive Bash function:

#!/bin/bash
set -x
set -eu -o pipefail

all_files=();

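function get_all_the_files()
{
    # local keeps each recursion level from clobbering the caller's variables
    local directory="$1" item;
    for item in "$directory"/* "$directory"/.[^.]*;
    do
        # skip patterns that matched nothing and were left unexpanded
        [[ -e "$item" || -L "$item" ]] || continue;
        if [[ -d "$item" ]];
        then
            get_all_the_files "$item";
        else
            all_files+=("$item");
        fi;
    done;
}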

get_all_the_files "/tmp";

for file_path in "${all_files[@]}"
do
    printf 'My file is "%s"\n' "$file_path";
done;

Related questions:

  1. https://stackoverflow.com/questions/26381807/bash-loop-through-directory-including-hidden-file
  2. https://stackoverflow.com/questions/38945927/recursively-list-files-from-a-given-directory-in-bash
  3. https://stackoverflow.com/questions/1767384/ls-command-how-can-i-get-a-recursive-full-path-listing-one-line-per-file
  4. https://stackoverflow.com/questions/245698/list-files-recursively-in-linux-cli-with-path-relative-to-the-current-directory
  5. https://stackoverflow.com/questions/747465/recursively-list-all-directories-and-files
  6. https://stackoverflow.com/questions/21668471/bash-script-create-array-of-all-files-in-a-directory
  7. https://stackoverflow.com/questions/51191766/how-can-i-creates-array-that-contains-the-names-of-all-the-files-in-a-folder
  8. https://stackoverflow.com/questions/51191766/how-can-i-creates-array-that-contains-the-names-of-all-the-files-in-a-folder
  9. https://stackoverflow.com/questions/2437452/how-to-get-the-list-of-files-in-a-directory-in-a-shell-script

Solution 13 - Bash

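function loop_through(){
        length_="$(find . -name '*.txt' | wc -l)"
        # strip the whitespace padding that some wc implementations add
        length_="${length_#"${length_%%[![:space:]]*}"}"
        length_="${length_%"${length_##*[![:space:]]}"}"
        # {1..$length_} would not work: brace expansion happens before
        # variable expansion, so use an arithmetic for loop instead
        for ((i = 1; i <= length_; i++))
        do
            x=$(find . -name '*.txt' | sort | head -n "$i" | tail -n 1)
            echo "$x"
        done
}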

To get the number of files for the loop, I pipe the find output through "wc -l" and store the result in a variable.
Then I strip the leading and trailing whitespace from that variable so the for loop can use it.
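
A caveat when looping up to a count held in a variable: Bash performs brace expansion before parameter expansion, so a range written with a variable is not turned into numbers. A minimal sketch with a made-up count of 3:

```bash
#!/usr/bin/env bash
n=3

# Brace expansion runs before $n is substituted, so no range is generated
# and the loop body sees the literal text once.
brace_result=$(for i in {1..$n}; do echo "$i"; done)

# An arithmetic for loop evaluates the variable as expected.
arith_result=$(for ((i = 1; i <= n; i++)); do echo "$i"; done)

echo "brace expansion gave: $brace_result"
echo "arithmetic loop gave: $(echo $arith_result)"
```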

Solution 14 - Bash

find <path> -xdev -type f -name '*.txt' -exec ls -l {} \;

This will list the files and give details about attributes.
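
Quoting the pattern passed to -name matters because an unquoted pattern is expanded by the shell before find ever sees it. The sketch below uses a made-up tree with one match in the starting directory and one in a subdirectory, and assumes default glob settings (no nullglob or failglob).

```bash
#!/usr/bin/env bash
# Hypothetical demo data.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a.txt" "$demo_dir/sub/b.txt"

# Unquoted: the shell rewrites the command to `find . -type f -name a.txt`.
unquoted=$(( $(cd "$demo_dir" && find . -type f -name *.txt | wc -l) ))

# Quoted: find receives the pattern itself and applies it at every level.
quoted=$(( $(cd "$demo_dir" && find . -type f -name '*.txt' | wc -l) ))

echo "unquoted pattern found $unquoted file(s), quoted pattern found $quoted"
rm -rf "$demo_dir"
```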

Solution 15 - Bash

Another alternative is to not use Bash, but to call Python to do the heavy lifting. I resorted to this because pure Bash solutions, such as my other answer, were too slow.

With this solution, we build a bash array of files from inline Python script:

#!/bin/bash
set -eu -o pipefail

dsep=":"  # directory_separator
base_directory=/tmp

all_files=()
all_files_string="$(python3 -c '#!/usr/bin/env python3
import os
import sys

dsep="'"$dsep"'"
base_directory="'"$base_directory"'"

def log(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)

def check_invalid_characther(file_path):
    for thing in ("\\", "\n"):
        if thing in file_path:
            raise RuntimeError(f"It is not allowed {thing} on \"{file_path}\"!")
def absolute_path_to_relative(base_directory, file_path):
    relative_path = os.path.commonprefix( [ base_directory, file_path ] )
    relative_path = os.path.normpath( file_path.replace( relative_path, "" ) )

    # if you use Windows Python, it accepts / instead of \\
    # if you have \ on your files names, rename them or comment this
    relative_path = relative_path.replace("\\", "/")
    if relative_path.startswith( "/" ):
        relative_path = relative_path[1:]
    return relative_path

for directory, directories, files in os.walk(base_directory):
    for file in files:
        local_file_path = os.path.join(directory, file)
        local_file_name = absolute_path_to_relative(base_directory, local_file_path)

        log(f"local_file_name {local_file_name}.")
        check_invalid_characther(local_file_name)
        print(f"{base_directory}{dsep}{local_file_name}")
' | dos2unix)";
if [[ -n "$all_files_string" ]];
then
    readarray -t temp <<< "$all_files_string";
    all_files+=("${temp[@]}");
fi;

for item in "${all_files[@]}";
do
    OLD_IFS="$IFS"; IFS="$dsep";
    read -r base_directory local_file_name <<< "$item"; IFS="$OLD_IFS";

    printf 'item "%s", base_directory "%s", local_file_name "%s".\n' \
            "$item" \
            "$base_directory" \
            "$local_file_name";
done;

Related:

  1. https://stackoverflow.com/questions/13454164/os-walk-without-hidden-folders
  2. https://stackoverflow.com/questions/18394147/how-to-do-a-recursive-sub-folder-search-and-return-files-in-a-list
  3. https://stackoverflow.com/questions/10586153/how-to-split-a-string-into-an-array-in-bash

Solution 16 - Bash

How about if you use grep instead of find?

ls | grep '\.txt$' > out.txt

Now you can read this file and the filenames are in the form of a list.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Haiyuan Zhang | View Question on Stackoverflow
Solution 1 - Bash | Kevin | View Answer on Stackoverflow
Solution 2 - Bash | David W. | View Answer on Stackoverflow
Solution 3 - Bash | 0xC0000022L | View Answer on Stackoverflow
Solution 4 - Bash | Michael Brux | View Answer on Stackoverflow
Solution 5 - Bash | user569825 | View Answer on Stackoverflow
Solution 6 - Bash | Seppo Enarvi | View Answer on Stackoverflow
Solution 7 - Bash | bmargulies | View Answer on Stackoverflow
Solution 8 - Bash | Paco | View Answer on Stackoverflow
Solution 9 - Bash | Jahid | View Answer on Stackoverflow
Solution 10 - Bash | Florian | View Answer on Stackoverflow
Solution 11 - Bash | Rakholiya Jenish | View Answer on Stackoverflow
Solution 12 - Bash | user | View Answer on Stackoverflow
Solution 13 - Bash | S.Doe_Dude | View Answer on Stackoverflow
Solution 14 - Bash | chetangb | View Answer on Stackoverflow
Solution 15 - Bash | user | View Answer on Stackoverflow
Solution 16 - Bash | Dhruv Raj Singh Rathore | View Answer on Stackoverflow