How do I list all files in a subdirectory in Scala?

Scala Problem Overview


Is there a good "scala-esque" (I guess I mean functional) way of recursively listing files in a directory? What about matching a particular pattern?

For example, recursively list all files matching "a*.foo" in c:\temp.

Scala Solutions


Solution 1 - Scala

Scala code typically uses Java classes for dealing with I/O, including reading directories. So you have to do something like:

import java.io.File
def recursiveListFiles(f: File): Array[File] = {
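  // listFiles returns null if f is not a directory or cannot be read
  val these = Option(f.listFiles).getOrElse(Array.empty[File])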
  these ++ these.filter(_.isDirectory).flatMap(recursiveListFiles)
}

You could collect all the files and then filter using a regex:

myBigFileArray.filter(f => """.*\.html$""".r.findFirstIn(f.getName).isDefined)

Or you could incorporate the regex into the recursive search:

import scala.util.matching.Regex
def recursiveListFiles(f: File, r: Regex): Array[File] = {
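  // listFiles returns null if f is not a directory or cannot be read
  val these = Option(f.listFiles).getOrElse(Array.empty[File])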
  val good = these.filter(f => r.findFirstIn(f.getName).isDefined)
  good ++ these.filter(_.isDirectory).flatMap(recursiveListFiles(_,r))
}
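Note that the pattern in the question ("a*.foo") is a glob, not a regex. Here is a minimal sketch of how the two ideas above could be combined, assuming you only need the '*' and '?' wildcards. The names globToRegex and listMatching are made up for this example and are not part of any library:

```scala
import java.io.File
import java.util.regex.Pattern
import scala.util.matching.Regex

// Hypothetical helper: converts a simple glob ('*' and '?' only) into a Regex.
def globToRegex(glob: String): Regex =
  glob.toList.map {
    case '*' => ".*"                        // any run of characters
    case '?' => "."                         // exactly one character
    case c   => Pattern.quote(c.toString)   // everything else is matched literally
  }.mkString.r

// Hypothetical helper: recursively collects regular files whose whole name matches the regex.
def listMatching(root: File, regex: Regex): List[File] = {
  // listFiles returns null for non-directories and unreadable directories
  val children = Option(root.listFiles).map(_.toList).getOrElse(Nil)
  val (dirs, files) = children.partition(_.isDirectory)
  files.filter(f => regex.pattern.matcher(f.getName).matches()) ++
    dirs.flatMap(d => listMatching(d, regex))
}
```

With these in place, listMatching(new File("c:\\temp"), globToRegex("a*.foo")) should give the files asked for in the question.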

Solution 2 - Scala

I would prefer a solution with Streams, because Streams are lazily evaluated collections: you can iterate over an arbitrarily large file system without loading the whole tree up front.

import java.io.File

def getFileTree(f: File): Stream[File] =
  f #:: (if (f.isDirectory) f.listFiles().toStream.flatMap(getFileTree)
         else Stream.empty)

Example for searching

getFileTree(new File("c:\\main_dir")).filter(_.getName.endsWith(".scala")).foreach(println)
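Stream is deprecated as of Scala 2.13 in favour of LazyList. A rough equivalent of the above, assuming Scala 2.13 or later, might look like this (fileTree is just a name chosen for the sketch, and the null check is an addition of mine):

```scala
import java.io.File

// Assumes Scala 2.13+, where LazyList replaces the deprecated Stream.
// Lazily yields f itself followed by everything below it.
def fileTree(f: File): LazyList[File] =
  f #:: (Option(f.listFiles) match {       // null for plain files or unreadable directories
    case Some(children) => children.iterator.to(LazyList).flatMap(fileTree)
    case None           => LazyList.empty
  })
```

It is used the same way, for example fileTree(new File("c:\\main_dir")).filter(_.getName.endsWith(".scala")).foreach(println).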

Solution 3 - Scala

As of Java 7 you should be using java.nio.file. It generally performs better than java.io.File for file-system operations and has some useful helpers.

But Java 8 introduces exactly what you are looking for:

import java.nio.file.{FileSystems, Files}
import scala.collection.JavaConverters._
val dir = FileSystems.getDefault.getPath("/some/path/here") 

Files.walk(dir).iterator().asScala.filter(Files.isRegularFile(_)).foreach(println)

You also asked for file matching. Try java.nio.file.Files.find and also java.nio.file.Files.newDirectoryStream

See documentation here: http://docs.oracle.com/javase/tutorial/essential/io/walk.html
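For the pattern-matching part, here is a sketch that combines Files.walk with a glob PathMatcher and closes the underlying stream when done. It assumes Scala 2.13+ for scala.jdk.CollectionConverters (on 2.12, scala.collection.JavaConverters should work the same way), and findByGlob is a name invented for the example:

```scala
import java.nio.file.{FileSystems, Files, Path}
import scala.jdk.CollectionConverters._   // assumes Scala 2.13+

// Hypothetical helper: regular files under dir whose file name matches the glob.
def findByGlob(dir: Path, glob: String): List[Path] = {
  val matcher = FileSystems.getDefault.getPathMatcher("glob:" + glob)
  val stream = Files.walk(dir)             // holds open directory handles until closed
  try {
    stream.iterator().asScala
      .filter(p => Files.isRegularFile(p) && matcher.matches(p.getFileName))
      .toList                              // materialize before the stream is closed
  } finally {
    stream.close()
  }
}
```

The matcher is applied to the file name only, so "a*.foo" matches at any depth below dir.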

Solution 4 - Scala

for (file <- new File("c:\\").listFiles) { processFile(file) }

http://langref.org/scala+java/files

Solution 5 - Scala

Scala is a multi-paradigm language. A good "scala-esque" way of iterating over a directory would be to reuse existing code!

I'd consider using commons-io a perfectly scala-esque way of iterating over a directory. You can use some implicit conversions to make it easier, for example:

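import java.io.File
import org.apache.commons.io.filefilter.IOFileFilter
import scala.language.implicitConversions

// Lets a plain File => Boolean function be passed where commons-io expects an IOFileFilter
implicit def newIOFileFilter(filter: File => Boolean): IOFileFilter = new IOFileFilter {
  def accept(file: File): Boolean = filter(file)
  def accept(dir: File, name: String): Boolean = filter(new File(dir, name))
}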

Solution 6 - Scala

I like yura's stream solution, but it (and the others) recurses into hidden directories. We can also simplify by making use of the fact that listFiles returns null for a non-directory.

def tree(root: File, skipHidden: Boolean = false): Stream[File] = 
  if (!root.exists || (skipHidden && root.isHidden)) Stream.empty 
  else root #:: (
    root.listFiles match {
      case null => Stream.empty
      case files => files.toStream.flatMap(tree(_, skipHidden))
  })

Now we can list files

tree(new File(".")).filter(f => f.isFile && f.getName.endsWith(".html")).foreach(println)

or realise the whole stream for later processing

tree(new File("dir"), true).toArray

Solution 7 - Scala

No one has mentioned better-files yet: https://github.com/pathikrit/better-files

import better.files._

val dir = "src"/"test"
val matches: Iterator[File] = dir.glob("**/*.{java,scala}")
// above code is equivalent to:
dir.listRecursively.filter(f => f.extension == Some(".java") || f.extension == Some(".scala"))

Solution 8 - Scala

Apache Commons IO's FileUtils fits on one line, and is quite readable:

import scala.collection.JavaConverters._ // provides asScala
import java.io.File
import org.apache.commons.io.FileUtils

FileUtils.listFiles(new File("c:\\temp"), Array("foo"), true).asScala.foreach { f =>
  // process f here
}

Solution 9 - Scala

I personally like the elegance and simplicity of @Rex Kerr's proposed solution. But here is what a tail-recursive version might look like:

import java.io.File
import scala.annotation.tailrec

def listFiles(file: File): List[File] = {
  @tailrec
  def listFiles(files: List[File], result: List[File]): List[File] = files match {
    case Nil => result
    case head :: tail if head.isDirectory =>
      listFiles(Option(head.listFiles).map(_.toList ::: tail).getOrElse(tail), result)
    case head :: tail if head.isFile =>
      listFiles(tail, head :: result)
    case _ :: tail => // neither a file nor a directory (e.g. it does not exist): skip it
      listFiles(tail, result)
  }
  listFiles(List(file), Nil)
}
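If you would rather not build the whole list up front, the same work-list idea can be wrapped in a lazy Iterator. This is only a sketch of one way to do it; fileIterator is a made-up name, and it skips anything that is neither a regular file nor a directory:

```scala
import java.io.File

// Hypothetical helper: lazily yields the regular files under root, depth-first, without recursion.
def fileIterator(root: File): Iterator[File] = new Iterator[File] {
  private var pending: List[File] = List(root)   // entries still to visit
  private var ready: Option[File] = None         // next regular file, once one is found

  // Pops entries off the work list until a regular file turns up or the list is empty.
  private def advance(): Unit =
    while (ready.isEmpty && pending.nonEmpty) {
      val head = pending.head
      pending = pending.tail
      if (head.isDirectory)
        pending = Option(head.listFiles).map(_.toList).getOrElse(Nil) ::: pending
      else if (head.isFile)
        ready = Some(head)
    }

  def hasNext: Boolean = { advance(); ready.isDefined }

  def next(): File = {
    if (!hasNext) throw new NoSuchElementException("no more files")
    val f = ready.get
    ready = None
    f
  }
}
```

Because directories are only read when the iterator reaches them, something like fileIterator(dir).take(10) should touch no more of the tree than it needs.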

Solution 10 - Scala

Take a look at scala.tools.nsc.io

There are some very useful utilities there including deep listing functionality on the Directory class.

If I remember correctly, these were highlighted (possibly contributed) by retronym and were seen as a stopgap until io gets a fresh and more complete implementation in the standard library.

Solution 11 - Scala

And here's a mixture of the stream solution from @DuncanMcGregor with the filter from @Rick-777:

  def tree( root: File, descendCheck: File => Boolean = { _ => true } ): Stream[File] = {
    require(root != null)
    def directoryEntries(f: File) = for {
      direntries <- Option(f.list).toStream
      d <- direntries
    } yield new File(f, d)
    val shouldDescend = root.isDirectory && descendCheck(root)
    ( root.exists, shouldDescend ) match {
      case ( false, _) => Stream.Empty
      case ( true, true ) => root #:: ( directoryEntries(root) flatMap { tree( _, descendCheck ) } )
      case ( true, false) => Stream( root )
    }   
  }

  def treeIgnoringHiddenFilesAndDirectories( root: File ) = tree( root, { !_.isHidden } ) filter { !_.isHidden }

This gives you a Stream[File] instead of a (potentially huge and very slow) List[File] while letting you decide which sorts of directories to recurse into with the descendCheck() function.

Solution 12 - Scala

How about

   def allFiles(path:File):List[File]=
   {    
       val parts=path.listFiles.toList.partition(_.isDirectory)
       parts._2 ::: parts._1.flatMap(allFiles)         
   }

Solution 13 - Scala

Scala has the library 'scala.reflect.io', which is considered experimental but does the job:

import scala.reflect.io.Path
Path(path) walkFilter { p => 
  // "a*.foo" is a glob; the equivalent regex is ^a.*\.foo$
  p.isDirectory || """^a.*\.foo$""".r.findFirstIn(p.name).isDefined
}

Solution 14 - Scala

The simplest Scala-only solution (if you don't mind requiring the Scala compiler library):

val path = scala.reflect.io.Path(dir)
scala.tools.nsc.io.Path.onlyFiles(path.walk).foreach(println)

Otherwise, @Renaud's solution is short and sweet (if you don't mind pulling in Apache Commons FileUtils):

import scala.collection.JavaConverters._  // provides asScala
import org.apache.commons.io.FileUtils
FileUtils.listFiles(dir, null, true).asScala.foreach(println)

Where dir is a java.io.File:

new File("path/to/dir")

Solution 15 - Scala

os-lib is the easiest way to recursively list files in Scala.

os.walk(os.pwd/"countries").filter(os.isFile(_))

Here's how to recursively list all the files that match the "a*.foo" pattern specified in the question:

os.walk(os.pwd/"countries").filter(_.segments.toList.last matches "a.*\\.foo")

os-lib is way more elegant and powerful than other alternatives. It returns os objects that you can easily move, rename, whatever. You don't need to suffer with the clunky Java libraries anymore.

Here's a code snippet you can run if you'd like to experiment with this library on your local machine:

os.makeDir(os.pwd/"countries")
os.makeDir(os.pwd/"countries"/"colombia")
os.write(os.pwd/"countries"/"colombia"/"medellin.txt", "q mas pues")
os.write(os.pwd/"countries"/"colombia"/"a_something.foo", "soy un rolo")
os.makeDir(os.pwd/"countries"/"brasil")
os.write(os.pwd/"countries"/"brasil"/"a_whatever.foo", "carnaval")
os.write(os.pwd/"countries"/"brasil"/"a_city.txt", "carnaval")

println(os.walk(os.pwd/"countries").filter(os.isFile(_))) prints this:

ArraySeq(
  /.../countries/brasil/a_whatever.foo, 
  /.../countries/brasil/a_city.txt, 
  /.../countries/colombia/a_something.foo, 
  /.../countries/colombia/medellin.txt)

os.walk(os.pwd/"countries").filter(_.segments.toList.last matches "a.*\\.foo") will return this:

ArraySeq(
  /.../countries/brasil/a_whatever.foo, 
  /.../countries/colombia/a_something.foo)

See the os-lib documentation for more details on how to use the library.

Solution 16 - Scala

Here's a similar solution to Rex Kerr's, but incorporating a file filter:

import java.io.File
def findFiles(fileFilter: (File) => Boolean = (f) => true)(f: File): List[File] = {
  val ss = f.list()
  val list = if (ss == null) {
    Nil
  } else {
    ss.toList.sorted
  }
  val visible = list.filter(_.charAt(0) != '.')
  val these = visible.map(new File(f, _))
  these.filter(fileFilter) ++ these.filter(_.isDirectory).flatMap(findFiles(fileFilter))
}

The method returns a List[File], which is slightly more convenient than Array[File]. It also ignores all hidden files and directories (i.e. those whose names begin with '.').

It's partially applied using a file filter of your choosing, for example:

val srcDir = new File( ... )
val htmlFiles = findFiles( _.getName endsWith ".html" )( srcDir )

Solution 17 - Scala

It seems nobody has mentioned the scala-io library from scala-incubator...

import scalax.file.Path

Path.fromString("c:\\temp") ** "a*.foo"

Or with implicit

import scalax.file.ImplicitConversions.string2path

"c:\\temp" ** "a*.foo"

Or if you want implicit explicitly...

import scalax.file.Path
import scalax.file.ImplicitConversions.string2path

val dir: Path = "c:\\temp"
dir ** "a*.foo"

Documentation is available here: http://jesseeichar.github.io/scala-io-doc/0.4.3/index.html#!/file/glob_based_path_sets

Solution 18 - Scala

The deepFiles method of scala.reflect.io.Directory provides a pretty nice way of recursively getting all the files in a directory:

import scala.reflect.io.Directory
new Directory(f).deepFiles.filter(x => x.name.startsWith("a") && x.name.endsWith(".foo"))

deepFiles returns an iterator, so you can convert it to some other collection type if you don't need or want lazy evaluation.

Solution 19 - Scala

This incantation works for me:

  def findFiles(dir: File, criterion: (File) => Boolean): Seq[File] = {
    if (dir.isFile) Seq()
    else {
      val (files, dirs) = dir.listFiles.partition(_.isFile)
      files.filter(criterion) ++ dirs.toSeq.map(findFiles(_, criterion)).foldLeft(Seq[File]())(_ ++ _)
    }
  }

Solution 20 - Scala

You can use tail recursion for it:

object DirectoryTraversal {
  import java.io._

  def main(args: Array[String]): Unit = {
    val dir = new File("C:/Windows")
    val files = scan(dir)

    val out = new PrintWriter(new File("out.txt"))

    files foreach { file =>
      out.println(file)
    }

    out.flush()
    out.close()
  }

  def scan(file: File): List[File] = {

    @scala.annotation.tailrec
    def sc(acc: List[File], files: List[File]): List[File] = {
      files match {
        case Nil => acc
        case x :: xs => {
          x.isDirectory match {
            case false => sc(x :: acc, xs)
            case true => sc(acc, xs ::: x.listFiles.toList)
          }
        }
      }
    }

    sc(List(), List(file))
  }
}

Solution 21 - Scala

A minor improvement to the accepted answer: by partitioning on _.isDirectory, this function returns a list of files only (directories are excluded).

import java.io.File
def recursiveListFiles(f: File): Array[File] = {
  val (dir, files)  = f.listFiles.partition(_.isDirectory)
  files ++ dir.flatMap(recursiveListFiles)
}

Solution 22 - Scala

Why are you using Java's File instead of Scala's AbstractFile?

With Scala's AbstractFile, the iterator support allows writing a more concise version of James Moore's solution:

import scala.reflect.io.AbstractFile  
def tree(root: AbstractFile, descendCheck: AbstractFile => Boolean = {_=>true}): Stream[AbstractFile] =
  if (root == null || !root.exists) Stream.empty
  else
    (root.exists, root.isDirectory && descendCheck(root)) match {
      case (false, _) => Stream.empty
      case (true, true) => root #:: root.iterator.flatMap { tree(_, descendCheck) }.toStream
      case (true, false) => Stream(root)
    }

Solution 23 - Scala

Get all files under a path, excluding directories:

import java.io.File
import scala.collection.mutable.ArrayBuffer

object pojo2pojo {

    def main(args: Array[String]): Unit = {
        val file = new File("D:\\tmp\\tmp")
        val files = recursiveListFiles(file)
        println(files.toList)
        // List(D:\tmp\tmp\1.txt, D:\tmp\tmp\a\2.txt)
    }

    def recursiveListFiles(f: File):ArrayBuffer[File] = {
        val all = collection.mutable.ArrayBuffer(f.listFiles:_*)
        val files = all.filter(_.isFile)
        val dirs = all.filter(_.isDirectory)
        files ++ dirs.flatMap(recursiveListFiles)
    }

}


Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Nick Fortescue | View Question on Stackoverflow
Solution 1 - Scala | Rex Kerr | View Answer on Stackoverflow
Solution 2 - Scala | yura | View Answer on Stackoverflow
Solution 3 - Scala | monzonj | View Answer on Stackoverflow
Solution 4 - Scala | Phil | View Answer on Stackoverflow
Solution 5 - Scala | ArtemGr | View Answer on Stackoverflow
Solution 6 - Scala | Duncan McGregor | View Answer on Stackoverflow
Solution 7 - Scala | Phil | View Answer on Stackoverflow
Solution 8 - Scala | Renaud | View Answer on Stackoverflow
Solution 9 - Scala | polbotinka | View Answer on Stackoverflow
Solution 10 - Scala | Don Mackenzie | View Answer on Stackoverflow
Solution 11 - Scala | James Moore | View Answer on Stackoverflow
Solution 12 - Scala | Dino Fancellu | View Answer on Stackoverflow
Solution 13 - Scala | roterl | View Answer on Stackoverflow
Solution 14 - Scala | Brent Faust | View Answer on Stackoverflow
Solution 15 - Scala | Powers | View Answer on Stackoverflow
Solution 16 - Scala | Rick-777 | View Answer on Stackoverflow
Solution 17 - Scala | draw | View Answer on Stackoverflow
Solution 18 - Scala | Calvin Kessler | View Answer on Stackoverflow
Solution 19 - Scala | Connor Doyle | View Answer on Stackoverflow
Solution 20 - Scala | Milind | View Answer on Stackoverflow
Solution 21 - Scala | Sakthi Priyan H | View Answer on Stackoverflow
Solution 22 - Scala | Nicolas Rouquette | View Answer on Stackoverflow
Solution 23 - Scala | chinayangyongyong | View Answer on Stackoverflow