When to use ** (double star) in glob syntax within JAVA

Java, Path, Nio, Glob, Matcher

Java Problem Overview


Directly from this Java Oracle tutorial:

> Two asterisks, **, works like * but crosses directory boundaries. This syntax is generally used for matching complete paths.

Could anybody give a real example of this? What do they mean by "crosses directory boundaries"? I imagine it means something like checking the path from the root up to getNameCount()-1. Again, a real example explaining the difference between * and ** in practice would be great.

Java Solutions


Solution 1 - Java

The javadoc for FileSystem#getPathMatcher() has some pretty good examples and explanations:

*.java          Matches a path that represents a file name ending in .java
*.*             Matches file names containing a dot
*.{java,class}  Matches file names ending with .java or .class
foo.?           Matches file names starting with foo. and a single character extension
/home/*/*       Matches /home/gus/data on UNIX platforms
/home/**        Matches /home/gus and /home/gus/data on UNIX platforms
C:\\*           Matches C:\foo and C:\bar on the Windows platform (note that the backslash is escaped; as a string literal in the Java Language the pattern would be "C:\\\\*")

So /home/** would match /home/gus/data, but /home/* wouldn't.

/home/* matches every entry (file or directory) directly inside /home, and nothing deeper.

/home/** matches every entry at any depth below /home, including the ones directly inside it.


Example of * vs **. Assuming your current working directory is /Users/username/workspace/myproject, the following matches only the myproject directory itself (the starting point of the walk, "."), because * does not cross into its children.

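import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class GlobWalk {
	public static void main(String[] args) throws IOException {
		PathMatcher pathMatcher = FileSystems.getDefault().getPathMatcher("glob:/Users/username/workspace/*");
		// Files.walk keeps directories open, so close the stream when done
		try (Stream<Path> paths = Files.walk(Paths.get("."))) {
			paths.forEach(path -> {
				// Make the path absolute so it can be compared with the absolute glob
				Path absolute = path.toAbsolutePath().normalize();
				System.out.println("Path: " + absolute + (pathMatcher.matches(absolute) ? " matched" : ""));
			});
		}
	}
}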

If you change the glob to /Users/username/workspace/**, it also matches every folder and file at any depth beneath that directory.
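The /home/* and /home/** rows from the javadoc table can also be checked without walking a real directory, since a glob PathMatcher only looks at the path text. The following is a minimal sketch that assumes a Unix-like default file system; the class name StarVsDoubleStar and the helper matches are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class StarVsDoubleStar {
    // Returns true if the glob matches the given path string on the default file system.
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        String[] globs = {"/home/*", "/home/**"};
        String[] paths = {"/home/gus", "/home/gus/data"};
        for (String glob : globs) {
            for (String path : paths) {
                // Only "/home/*" against "/home/gus/data" is expected to print false
                System.out.println(glob + " vs " + path + " -> " + matches(glob, path));
            }
        }
    }
}
```

The paths do not need to exist on disk for this to run.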

Solution 2 - Java

Double asterisk (**) matches zero or more characters across multiple nested directories. I will first explain the main concept and then go through the double asterisk and the other useful wildcards step by step, with examples.


Globbing

A glob is a pattern made of literal characters and/or wildcard characters that is used to match file paths. Locating files on a filesystem using one or more globs is called globbing. Globbing is not limited to Java. It is also used to match files in various configuration files, such as the list of ignored files and directories in Git's .gitignore, and to select files and folders on Unix-like operating systems, e.g. ls **/*.java. The exact rules vary slightly between these tools.

The following are some of the most important parts of glob syntax. The double asterisk (**) is one of them:


Separator and Segments (/)

In most glob dialects, the forward slash character (/) acts as the separator, no matter what operating system is being used. A segment is everything that comes between two separators.

Example: tests/HelloWorld.java

Here, tests and HelloWorld.java are the segments and / is the separator.
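In Java, the segments of a path correspond to the name elements of a java.nio.file.Path, which is also what getNameCount() from the question counts. A small sketch of that correspondence (the class name Segments and its helper are hypothetical):

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class Segments {
    // Splits a path string into its name elements (the "segments" described above).
    static List<String> segments(String path) {
        List<String> names = new ArrayList<>();
        for (Path name : Paths.get(path)) { // a Path iterates over its name elements
            names.add(name.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(segments("tests/HelloWorld.java"));                  // [tests, HelloWorld.java]
        System.out.println(Paths.get("tests/HelloWorld.java").getNameCount()); // 2
    }
}
```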


Single Asterisk (*)

Single Asterisk (*) matches zero or more characters within one segment. It is used for globbing the files within one directory.

Example: *.java

This glob will match files such as HelloWorld.java but not files like tests/HelloWorld.java or tests/ui/HelloWorld.java.
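With Java's PathMatcher this can be tried directly. The sketch below (class and method names are made up for illustration) matches *.java once against the whole relative path and once against only the last segment, which is one way to select .java files regardless of the directory they are in.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class SingleStar {
    static final PathMatcher JAVA_FILES =
            FileSystems.getDefault().getPathMatcher("glob:*.java");

    // Matches the glob against the whole path, so any separator makes it fail.
    static boolean matchesWholePath(String path) {
        return JAVA_FILES.matches(Paths.get(path));
    }

    // Matches the glob against the last segment only, ignoring parent directories.
    static boolean matchesFileName(String path) {
        Path fileName = Paths.get(path).getFileName();
        return fileName != null && JAVA_FILES.matches(fileName);
    }

    public static void main(String[] args) {
        System.out.println(matchesWholePath("HelloWorld.java"));         // true
        System.out.println(matchesWholePath("tests/HelloWorld.java"));   // false
        System.out.println(matchesFileName("tests/ui/HelloWorld.java")); // true
    }
}
```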


Double Asterisk (**)

Double Asterisk (**) matches zero or more characters across multiple segments. It is used for globbing files that are in nested directories.

Example: tests/**/*.java

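Here, file selection is restricted to the tests directory. The glob matches files such as tests/ui/HelloWorld.java and tests/ui/feature1/HelloWorld.java. Whether it also matches tests/HelloWorld.java depends on the tool: Git's .gitignore treats /**/ as zero or more directories, so it does, but Java's PathMatcher requires both slashes to be present in the path, so it does not. In Java, tests/**.java covers all three.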
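How a given matcher treats the direct child tests/HelloWorld.java is easy to check. The sketch below runs the three example paths through Java's PathMatcher, where tests/**/*.java needs at least one directory between tests and the file, and compares it with tests/**.java as one possible alternative. The class name DoubleStar and the helper matches are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class DoubleStar {
    // Returns true if the glob matches the given path string on the default file system.
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        String[] paths = {
            "tests/HelloWorld.java",
            "tests/ui/HelloWorld.java",
            "tests/ui/feature1/HelloWorld.java"
        };
        for (String path : paths) {
            // In Java, the first glob is false for the direct child only; the second is true for all three
            System.out.println(path
                    + "  tests/**/*.java=" + matches("tests/**/*.java", path)
                    + "  tests/**.java=" + matches("tests/**.java", path));
        }
    }
}
```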


Question Mark(?)

Question mark(?) matches a single character within one segment. It can be used for matching the files or folders that differ in name by just one character.

Example: tests/?at.java

This will match files such as tests/cat.java, tests/Cat.java, tests/bat.java etc.
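A quick check with Java's PathMatcher shows that ? stands for exactly one character: names that are one character shorter or longer do not match. The class name QuestionMark is made up for this sketch.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class QuestionMark {
    static final PathMatcher MATCHER =
            FileSystems.getDefault().getPathMatcher("glob:tests/?at.java");

    // Returns true if tests/?at.java matches the given path string.
    static boolean matches(String path) {
        return MATCHER.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(matches("tests/cat.java"));  // true
        System.out.println(matches("tests/bat.java"));  // true
        System.out.println(matches("tests/at.java"));   // false, no character for ?
        System.out.println(matches("tests/chat.java")); // false, two characters for ?
    }
}
```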


Square Brackets ([abc])

Square brackets ([...]) match any single character listed inside the brackets.

Example: tests/[CB]at.java

This glob will match files like tests/Cat.java or tests/Bat.java.


Square Brackets Range ([a-z])

A square bracket range ([a-z]) matches any single character within the given range.

Example: tests/feature[1-9]/HelloWorld.java

This glob will match files like tests/feature1/HelloWorld.java, tests/feature2/HelloWorld.java and so on, up to tests/feature9/HelloWorld.java.
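Both bracket forms can be tried with Java's PathMatcher. Note that a bracket expression consumes exactly one character, so [1-9] does not match feature10. This is only a sketch; the class name Brackets and the helper matches are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class Brackets {
    // Returns true if the glob matches the given path string on the default file system.
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // Character set: only C or B is accepted in the first position
        System.out.println(matches("tests/[CB]at.java", "tests/Cat.java")); // true
        System.out.println(matches("tests/[CB]at.java", "tests/Rat.java")); // false

        // Character range: exactly one digit from 1 to 9
        String range = "tests/feature[1-9]/HelloWorld.java";
        System.out.println(matches(range, "tests/feature2/HelloWorld.java"));  // true
        System.out.println(matches(range, "tests/feature0/HelloWorld.java"));  // false
        System.out.println(matches(range, "tests/feature10/HelloWorld.java")); // false
    }
}
```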


Negation (!)

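Negation (!) is used for excluding some files. Placed as the first character inside square brackets, it matches any single character except the ones listed.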

Example: tests/[!C]at.java

This will exclude the file tests/Cat.java and will match files like tests/Bat.java, tests/bat.java, tests/cat.java.
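The same check for the negated bracket, again as a sketch with Java's PathMatcher. It assumes a case-sensitive match, as on a Unix-like default file system, so that cat.java and Cat.java are treated differently; the class name Negation is made up.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class Negation {
    static final PathMatcher MATCHER =
            FileSystems.getDefault().getPathMatcher("glob:tests/[!C]at.java");

    // Returns true if tests/[!C]at.java matches the given path string.
    static boolean matches(String path) {
        return MATCHER.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(matches("tests/Cat.java")); // false, C is excluded
        System.out.println(matches("tests/Bat.java")); // true
        System.out.println(matches("tests/cat.java")); // true when matching is case-sensitive
    }
}
```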


That's it! Hope that helps.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author        Original Content on Stackoverflow
Question           Rollerball             View Question on Stackoverflow
Solution 1 - Java  Sotirios Delimanolis   View Answer on Stackoverflow
Solution 2 - Java  Yogesh Umesh Vaity     View Answer on Stackoverflow