Whitelisting and subdirectories in Git


Git Problem Overview


I have created a white-list for text files only.

*
!*.txt

Now, I have an untracked text file in a sub-directory - sub/dir/file.txt, and this is NOT shown (it is ignored). Text files in the root directory are shown, however.

Why is that, and how do I fix it?
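The behavior can be reproduced in a scratch repository. This is a minimal sketch, assuming git is installed; the temporary directory and the file name root.txt are illustrative and not part of the original question.

```shell
# Create a throwaway repository (location chosen by mktemp, purely illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# The white-list from the question.
printf '%s\n' '*' '!*.txt' > .gitignore

# One text file in the root, one in a nested sub-directory.
mkdir -p sub/dir
touch root.txt sub/dir/file.txt

# List every untracked file that is not ignored.
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
# prints: ?? root.txt
```

Only root.txt is listed; sub/dir/file.txt does not appear.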

Git Solutions


Solution 1 - Git

That approach fails because * also matches directories, and Git does not descend into an ignored directory. The !*.txt rule is therefore never applied to files inside sub/dir.

To fix it, ignore everything that is neither a directory nor one of the file types you want to commit, while leaving the directories themselves un-ignored.

The .gitignore file that will do this:

# First, ignore everything
*
# Now, whitelist anything that's a directory
!*/
# And all the file types you're interested in.
!*.one
!*.two
!*.etc

I tested this in a three-level directory structure, white-listing .txt files in the presence of *.one, *.two and *.three files, with a single .gitignore located in the root directory of the repository, and it works for me. You don't have to add .gitignore files to every directory in your structure.
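A quick way to check this yourself is sketched below, assuming git is installed. The scratch directory and the file names (root.txt, notes.one, data.two) are made up for illustration, and only .txt is white-listed here.

```shell
# Create a throwaway repository (location chosen by mktemp, purely illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Ignore everything, re-include directories, then re-include .txt files.
printf '%s\n' '*' '!*/' '!*.txt' > .gitignore

# A mix of wanted and unwanted files at two levels.
mkdir -p sub/dir
touch root.txt notes.one sub/dir/file.txt sub/dir/data.two

# List every untracked file that is not ignored.
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
# prints:
# ?? root.txt
# ?? sub/dir/file.txt
```

The .one and .two files stay ignored, and so does .gitignore itself because it matches *. If you want to track the .gitignore file, adding a !.gitignore line should re-include it.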

Information I used to figure out the answer came from, amongst other things, this (stackoverflow.com).

Solution 2 - Git

A simpler way of achieving this is:

# Ignore all files...
*.*

# ...except the ones we want
!*.txt

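This works for two reasons. First, *.* only matches names that contain a dot, so directories with ordinary names are not ignored (files without an extension are not ignored either). Second, gitignore applies patterns that contain no / at the beginning or in the middle to every level below the .gitignore file: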

> If there is a separator at the beginning or middle (or both) of the pattern, then the pattern is relative to the directory level of the particular .gitignore file itself. Otherwise the pattern may also match at any level below the .gitignore level.

If you want to apply this only to files inside a particular directory, things get more complex:

# Ignore all files in all directories inside subdir...
/subdir/**/*.*

# ...except the ones we want
!/subdir/**/*.txt

This works because gitignore has special rules for **:

> Two consecutive asterisks ("**") in patterns matched against full pathname may have special meaning:
>
> * A slash followed by two consecutive asterisks then a slash matches zero or more directories. For example, "a/**/b" matches "a/b", "a/x/b", "a/x/y/b" and so on.
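The directory-scoped patterns can be tried out with a sketch like the following, assuming git is installed. The directory names a and other and all file names are hypothetical, chosen only to exercise the "zero or more directories" rule.

```shell
# Create a throwaway repository (location chosen by mktemp, purely illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Restrict the white-list to /subdir only.
printf '%s\n' '/subdir/**/*.*' '!/subdir/**/*.txt' > .gitignore

mkdir -p subdir/a other
touch subdir/top.png subdir/a/file.txt subdir/a/image.png other/image.png

# List every untracked file that is not ignored.
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
# prints:
# ?? .gitignore
# ?? other/image.png
# ?? subdir/a/file.txt
```

subdir/top.png is ignored because /**/ also matches zero directories, while files outside subdir, including .gitignore, are left alone.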

The key piece is to make sure you don't ignore directories, because then every file within that directory is ignored regardless of other rules.
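To see why ignored directories matter, here is a small sketch, assuming git is installed. The directory name v1.0 is a hypothetical example of a directory that happens to contain a dot, and Makefile stands in for any file without an extension.

```shell
# Create a throwaway repository (location chosen by mktemp, purely illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# The simpler white-list: ignore names containing a dot, except .txt files.
printf '%s\n' '*.*' '!*.txt' > .gitignore

mkdir -p sub/dir v1.0
touch Makefile sub/dir/file.txt sub/dir/image.png v1.0/notes.txt

# List every untracked file that is not ignored.
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
# prints:
# ?? Makefile
# ?? sub/dir/file.txt
```

v1.0/notes.txt is missing even though it ends in .txt: the directory v1.0 matches *.* and is ignored, so the !*.txt rule never reaches the file inside it. Makefile is listed because it has no dot in its name and so is never matched by *.*.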

Solution 3 - Git

I searched for a long time:

  1. Assume I have a large folder structure with ~100,000 recursively nested directories. In those folders, there are about 30,000 files of type .txt (in my case: type *.md). Next to these *.md files, there are, let's say, 500GB of (a million+) files that I don't want to track.

  2. I want git to track only .txt (or *.md) files in all folders and subdirs.

The correct answer should be: this is not possible in Git.

What I did instead:

[Edit: this did not work either. I tried to create a folder with symlinks (or hardlinks) and use Git there, but Git doesn't follow symlinks and overwrites hardlinks. Doh!]

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Nate | View Question on Stackoverflow |
| Solution 1 - Git | simont | View Answer on Stackoverflow |
| Solution 2 - Git | Sharadh | View Answer on Stackoverflow |
| Solution 3 - Git | Alex | View Answer on Stackoverflow |