How to search a folder and all of its subfolders for files of a certain type

Ruby, File IO, Recursion

Ruby Problem Overview


I am trying to search for all files of a given type in a given folder and copy them to a new folder.

I need to specify a root folder and search through that folder and all of its subfolders for any files that match the given type.

How do I search the root folder's subfolders and their subfolders? It seems like a recursive method would work, but I cannot implement one correctly.

Ruby Solutions


Solution 1 - Ruby

Try this:

Dir.glob("#{folder}/**/*.pdf")

which is the same as

Dir["#{folder}/**/*.pdf"]

Here, folder is the path to the root folder you want to search, and **/ matches that folder and every subfolder beneath it.
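Since the question also asks about copying the matches to a new folder, here is a minimal sketch that combines Dir.glob with FileUtils. The helper name copy_files_of_type and its parameters are hypothetical, and the sketch assumes that files with the same base name in different subfolders may overwrite each other in the destination.

```ruby
require 'fileutils'

# Hypothetical helper (not from the original answer): copies every regular
# file under `root` whose name ends in ".<ext>" into `dest` and returns the
# paths of the copies.
def copy_files_of_type(root, ext, dest)
  FileUtils.mkdir_p(dest)
  # File.join builds "root/**/*.ext" whether or not `root` ends with a slash.
  pattern = File.join(root, '**', "*.#{ext}")
  Dir.glob(pattern).select { |path| File.file?(path) }.map do |path|
    target = File.join(dest, File.basename(path))
    FileUtils.cp(path, target) # same-named files overwrite earlier copies
    target
  end
end
```

It could be called as copy_files_of_type('path/to/search', 'pdf', 'path/to/new_folder'). Note that glob metacharacters such as * or [ inside root would be interpreted as part of the pattern.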

Solution 2 - Ruby

You want the Find module (https://ruby-doc.org/stdlib-2.6.3/libdoc/find/rdoc/Find.html). Find.find takes a string containing a path and passes that path, along with the path of every file and subdirectory beneath it, to the accompanying block. Some example code:

require 'find'

pdf_file_paths = []
Find.find('path/to/search') do |path|
  # \z anchors at the very end of the string, so only a trailing ".pdf" matches
  pdf_file_paths << path if path =~ /\.pdf\z/
end

That recursively searches the path and stores every path ending in .pdf in an array.
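Find.find yields directories as well as files, so a variant that checks File.file? and skips unwanted directories with Find.prune can be useful. The following is only a sketch: the helper name find_files_of_type and the skip_dirs parameter (defaulting to '.git') are assumptions made for illustration.

```ruby
require 'find'

# Hypothetical helper: returns the paths of regular files under `root` whose
# extension matches `ext` (case-insensitively), without descending into any
# directory whose name is listed in `skip_dirs`.
def find_files_of_type(root, ext, skip_dirs: ['.git'])
  matches = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Find.prune stops the traversal from descending into this directory.
      Find.prune if skip_dirs.include?(File.basename(path))
    elsif File.file?(path) && File.extname(path).casecmp?(".#{ext}")
      matches << path
    end
  end
  matches
end
```

For example, find_files_of_type('path/to/search', 'pdf') would collect .pdf and .PDF files alike while ignoring anything inside a .git directory.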

Solution 3 - Ruby

If speed is a concern, prefer Dir.glob over Find.find.

Warming up --------------------------------------
           Find.find   124.000  i/100ms
            Dir.glob   515.000  i/100ms
Calculating -------------------------------------
           Find.find      1.242k (± 4.7%) i/s -      6.200k in   5.001398s
            Dir.glob      5.249k (± 4.5%) i/s -     26.265k in   5.014632s

Comparison:
            Dir.glob:     5248.5 i/s
           Find.find:     1242.4 i/s - 4.22x slower

 

require 'find'
require 'benchmark/ips'

dir = '.'

Benchmark.ips do |x|
  x.report 'Find.find' do
    # Keep only the paths that end in .pdf
    Find.find(dir).select { |f| f =~ /\.pdf\z/ }
  end

  x.report 'Dir.glob' do
    Dir.glob("#{dir}/**/*.pdf")
  end

  x.compare!
end

Using ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin15]
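benchmark/ips is a separate gem. If it is not available, a rough comparison can be made with the Benchmark module from the standard library. This is a sketch only: the helper name compare_search_times and the runs parameter are hypothetical, and the timings it returns will depend on the directory tree and the machine.

```ruby
require 'find'
require 'benchmark'

# Hypothetical helper: runs each search `runs` times over `dir` and returns
# the total wall-clock seconds spent in each approach.
def compare_search_times(dir, runs: 20)
  glob_time = Benchmark.realtime do
    runs.times { Dir.glob(File.join(dir, '**', '*.pdf')) }
  end
  find_time = Benchmark.realtime do
    runs.times { Find.find(dir).select { |f| f.end_with?('.pdf') } }
  end
  { glob: glob_time, find: find_time }
end
```

Calling compare_search_times('.') returns a hash such as { glob: ..., find: ... } whose values can be compared directly.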

Solution 4 - Ruby

As a small improvement on jergason's and Matt's answers above, here is how to condense the search into a single line:

pdf_file_paths = Find.find('path/to/search').select { |p| /.*\.pdf$/ =~ p }

This uses Find.find as above, but relies on the fact that, when called without a block, it returns an Enumerator, so select can be called on it directly to get back an array of the matching paths.
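Because the block-less form returns an Enumerator, it can also be made lazy, which stops the traversal as soon as enough matches have been found. A minimal sketch, assuming a hypothetical helper named first_matches:

```ruby
require 'find'

# Hypothetical helper: returns at most `limit` regular files under `root`
# whose extension is exactly ".<ext>". The lazy enumerator stops walking the
# tree once `limit` matches have been collected.
def first_matches(root, ext, limit)
  Find.find(root).lazy
      .select { |path| File.file?(path) && File.extname(path) == ".#{ext}" }
      .first(limit)
end
```

For instance, first_matches('path/to/search', 'pdf', 10) would return up to ten PDF paths without necessarily scanning the whole tree.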

Solution 5 - Ruby

Another fast way of doing this is delegating the task to the shell command "find" and splitting the output:

pdf_file_paths = `find #{dir} -name "*.pdf"`.split("\n")

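This relies on the Unix find command, so it does not work on Windows. It also breaks if dir contains spaces or shell metacharacters, because the path is interpolated into the command unescaped.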
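One way to avoid quoting problems is to bypass the shell and pass each argument separately through Open3 from the standard library. This is a sketch under assumptions: the helper name find_with_shell is hypothetical, and it presumes a find implementation that supports -print0 (GNU and BSD find both do).

```ruby
require 'open3'

# Hypothetical helper: asks the external `find` command for regular files
# under `dir` whose names end in ".<ext>" and returns their paths.
def find_with_shell(dir, ext)
  # Each argument is passed separately, so no shell is involved and spaces
  # or quotes in `dir` need no escaping.
  output, status = Open3.capture2('find', dir, '-type', 'f',
                                  '-name', "*.#{ext}", '-print0')
  raise "find exited with status #{status.exitstatus}" unless status.success?
  # -print0 separates paths with NUL bytes, which cannot occur in file names.
  output.split("\0")
end
```

It would be called as find_with_shell('path/to/search', 'pdf').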

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | ab217 | View Question on Stack Overflow
Solution 1 - Ruby | rogerdpack | View Answer on Stack Overflow
Solution 2 - Ruby | jergason | View Answer on Stack Overflow
Solution 3 - Ruby | Dennis | View Answer on Stack Overflow
Solution 4 - Ruby | chrisdurheim | View Answer on Stack Overflow
Solution 5 - Ruby | Felipe Zavan | View Answer on Stack Overflow