Bash function to find newest file matching pattern

LinuxBash

Linux Problem Overview


In Bash, I would like to create a function that returns the filename of the newest file that matches a certain pattern. For example, I have a directory of files like:

Directory/
   a1.1_5_1
   a1.2_1_4
   b2.1_0
   b2.2_3_4
   b2.3_2_0

I want the newest file that starts with 'b2'. How do I do this in bash? I need to have this in my ~/.bash_profile script.

Linux Solutions


Solution 1 - Linux

The ls command has a parameter -t to sort by time. You can then grab the first (newest) with head -1.

ls -t b2* | head -1

But beware: Why you shouldn't parse the output of ls

My personal opinion: parsing ls is only dangerous when the filenames can contain funny characters like spaces or newlines. If you can guarantee that the filenames will not contain funny characters then parsing ls is quite safe.

If you are developing a script which is meant to be run by many people on many systems in many different situations then I very much do recommend to not parse ls.

Here is how to do it "right": How can I find the latest (newest, earliest, oldest) file in a directory?

unset -v latest
for file in "$dir"/*; do
  [[ $file -nt $latest ]] && latest=$file
done

Solution 2 - Linux

The combination of find and ls works well for

  • filenames without newlines
  • not very large amount of files
  • not very long filenames

The solution:

find . -name "my-pattern" -print0 |
    xargs -r -0 ls -1 -t |
    head -1

Let's break it down:

With find we can match all interesting files like this:

find . -name "my-pattern" ...

then using -print0 we can pass all filenames safely to the ls like this:

find . -name "my-pattern" -print0 | xargs -r -0 ls -1 -t

additional find search parameters and patterns can be added here

find . -name "my-pattern" ... -print0 | xargs -r -0 ls -1 -t

ls -t will sort files by modification time (newest first) and print it one at a line. You can use -c to sort by creation time. Note: this will break with filenames containing newlines.

Finally head -1 gets us the first file in the sorted list.

Note: xargs use system limits to the size of the argument list. If this size exceeds, xargs will call ls multiple times. This will break the sorting and probably also the final output. Run

xargs  --show-limits

to check the limits on you system.

Note 2: use find . -maxdepth 1 -name "my-pattern" -print0 if you don't want to search files through subfolders.

Note 3: As pointed out by @starfry - -r argument for xargs is preventing the call of ls -1 -t, if no files were matched by the find. Thank you for the suggesion.

Solution 3 - Linux

This is a possible implementation of the required Bash function:

# Print the newest file, if any, matching the given pattern
# Example usage:
#   newest_matching_file 'b2*'
# WARNING: Files whose names begin with a dot will not be checked
function newest_matching_file
{
    # Use ${1-} instead of $1 in case 'nounset' is set
    local -r glob_pattern=${1-}

    if (( $# != 1 )) ; then
        echo 'usage: newest_matching_file GLOB_PATTERN' >&2
        return 1
    fi

    # To avoid printing garbage if no files match the pattern, set
    # 'nullglob' if necessary
    local -i need_to_unset_nullglob=0
    if [[ ":$BASHOPTS:" != *:nullglob:* ]] ; then
        shopt -s nullglob
        need_to_unset_nullglob=1
    fi

    newest_file=
    for file in $glob_pattern ; do
        [[ -z $newest_file || $file -nt $newest_file ]] \
            && newest_file=$file
    done

    # To avoid unexpected behaviour elsewhere, unset nullglob if it was
    # set by this function
    (( need_to_unset_nullglob )) && shopt -u nullglob

    # Use printf instead of echo in case the file name begins with '-'
    [[ -n $newest_file ]] && printf '%s\n' "$newest_file"

    return 0
}

It uses only Bash builtins, and should handle files whose names contain newlines or other unusual characters.

Solution 4 - Linux

A Bash function to find the newest file under a directory matching a pattern

#1.  Make a bash function:
newest_file_matching_pattern(){ 
    find $1 -name "$2" -print0 | xargs -0 ls -1 -t | head -1  
} 
 
#2. Setup a scratch testing directory: 
mkdir /tmp/files_to_move;
cd /tmp/files_to_move;
touch file1.txt;
touch file2.txt; 
touch foobar.txt; 
 
#3. invoke the function: 
result=$(newest_file_matching_pattern /tmp/files_to_move "file*") 
printf "result: $result\n"

Prints:

result: /tmp/files_to_move/file2.txt

Or if brittle bash parlor tricks subcontracting to python interpreter is more your angle, this does the same thing:

#!/bin/bash 
 
function newest_file_matching_pattern { 
python - <<END 
import glob, os, re  
print(sorted(glob.glob("/tmp/files_to_move/file*"), key=os.path.getmtime)[0]); 
END 
} 
 
result=$(newest_file_matching_pattern) 
printf "result: $result\n" 

Prints:

result: /tmp/files_to_move/file2.txt

Solution 5 - Linux

Use the find command.

Assuming you're using Bash 4.2+, use -printf '%T+ %p\n' for file timestamp value.

find $DIR -type f -printf '%T+ %p\n' | sort -r | head -n 1 | cut -d' ' -f2

Example:

find ~/Downloads -type f -printf '%T+ %p\n' | sort -r | head -n 1 | cut -d' ' -f2

For a more useful script, see the find-latest script here: https://github.com/l3x/helpers

Solution 6 - Linux

Unusual filenames (such as a file containing the valid \n character can wreak havoc with this kind of parsing. Here's a way to do it in Perl:

perl -le '@sorted = map {$_->[0]} 
                    sort {$a->[1] <=> $b->[1]} 
                    map {[$_, -M $_]} 
                    @ARGV;
          print $sorted[0]
' b2*

That's a Schwartzian transform used there.

Solution 7 - Linux

You can use stat with a file glob and a decorate-sort-undecorate with the file time added on the front:

$ stat -f "%m%t%N" b2* | sort -rn | head -1 | cut -f2-

As stated in comments, the best cross platform solution may be with a Python, Perl, Ruby script.

For such things, I tend to use Ruby since it is very awk like in the ease of writing small, throw away scripts yet has the power of Python or Perl right from the command line.

Here is a ruby:

ruby -e '
# index [0] for oldest and [-1] for newest
newest=Dir.glob("*").
	reject { |f| File.directory?(f)}.
	sort_by { |f| File.birthtime(f) rescue File.mtime(f) 
	}[-1]
p newest'

That gets the newest file in the current working directory.

You can also make the glob recursive by using **/* in glob or limit to matched files with b2*, etc

Solution 8 - Linux

For googlers:

ls -t | head -1

  • -t sorts by last modification datetime
  • head -1 only returns the first result

(Don't use in production)

Solution 9 - Linux

There is a much more efficient way of achieving this. Consider the following command:

find . -cmin 1 -name "b2*"

This command finds the latest file produced exactly one minute ago with the wildcard search on "b2*". If you want files from the last two days then you'll be better off using the command below:

find . -mtime 2 -name "b2*"

The "." represents the current directory. Hope this helps.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjlconlinView Question on Stackoverflow
Solution 1 - LinuxLesmanaView Answer on Stackoverflow
Solution 2 - LinuxBoris BrodskiView Answer on Stackoverflow
Solution 3 - LinuxpjhView Answer on Stackoverflow
Solution 4 - LinuxEric LeschinskiView Answer on Stackoverflow
Solution 5 - Linuxl3xView Answer on Stackoverflow
Solution 6 - Linuxglenn jackmanView Answer on Stackoverflow
Solution 7 - LinuxdawgView Answer on Stackoverflow
Solution 8 - LinuxTobias FeilView Answer on Stackoverflow
Solution 9 - LinuxNaufalView Answer on Stackoverflow