gitignore by file size?

Tags: Git, Configuration, Blob, Gitignore, Large Files

Git Problem Overview


I'm trying to implement Git to manage creative assets (Photoshop, Illustrator, Maya, etc.), and I'd like to exclude files from Git based on file size rather than extension, location, etc.

For example, I don't want to exclude all .avi files, but there are a handful of massive (1 GB+) .avi files in random directories that I don't want to commit.

Any suggestions?

Git Solutions


Solution 1 - Git

I'm new to .gitignore, so there may be better ways to do this, but I've been excluding files by file size using:

find . -size +1G | cat >> .gitignore

You'll need to re-run this command regularly if you generate a lot of large files, since .gitignore only lists the files that existed the last time it ran.
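Re-running the one-liner appends the same paths again, and it also lists anything large inside .git. The following is a minimal sketch of a repeatable variant, not the original author's method: ignore_large_files is a hypothetical helper name, the threshold argument uses find's -size syntax, and the leading ./ is stripped as later answers suggest. It assumes paths contain no newlines or .gitignore metacharacters such as * or [.

```shell
#!/bin/sh
# ignore_large_files is a hypothetical helper, not part of Git.
# $1 is a find(1) -size expression; it defaults to +1G as in the answer above.
ignore_large_files() {
    threshold="${1:-+1G}"
    touch .gitignore
    # -path ./.git -prune keeps find out of Git's own object store.
    find . -path ./.git -prune -o -type f -size "$threshold" -print |
        sed 's|^\./||' |
        while IFS= read -r path; do
            # -x whole line, -F fixed string, -q quiet: append only new entries.
            grep -qxF -- "$path" .gitignore || printf '%s\n' "$path" >> .gitignore
        done
}
```

Run from the repository root, for example as ignore_large_files +1G. Calling it again leaves existing entries alone.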

Solution 2 - Git

@abendine's answer is correct, and with files this large the difference should not matter in practice. Still, according to <https://stackoverflow.com/a/22057427/6466510>,

find * -size +1G | cat >> .gitignore

would be far better. See also <https://stackoverflow.com/questions/46578534/difference-between-find-and-find-in-unix/62292764#62292764>: replacing . with * keeps find from searching the .git directory, because the shell expands * to the non-hidden entries of the current directory only.

Solution 3 - Git

To stay under GitHub's 100 MB per-file limit, run this:

find . -size +100M | cat >> .gitignore

Solution 4 - Git

I wanted to offer a Windows version as well.

forfiles /s /c "cmd /q /c if @fsize GTR 1073741824 echo @relpath" >> .gitignore

Solution 5 - Git

(Update 2020-05)

GitHub (now part of Microsoft) released Git LFS as open source some time ago. It is probably what most people are really looking for:

https://git-lfs.github.com/

Quoted from the project page: "Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com or GitHub Enterprise."

Solution 6 - Git

I want to add to all these answers that you can also just use a git hook to have something more automatic (or less human-error prone) like this:

cat .git/hooks/pre-commit

#!/bin/bash

echo "automatically ignoring large files"
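# Skip .git, match files larger than 5 MiB (+5M), and strip the leading "./".
find . -path ./.git -prune -o -type f -size +5M -print | sed 's|^\./||' >> .gitignore
# Sort and de-duplicate in place; redirecting back into .gitignore would truncate it before it is read.
sort -u -o .gitignore .gitignore

git diff --exit-code .gitignore
exit_status=$?
if [ $exit_status -eq 1 ]
then
    # Untrack each listed path; reading line by line keeps paths with spaces intact.
    while IFS= read -r i
    do
        [ -n "$i" ] || continue
        git rm --cached --ignore-unmatch -- "$i"
    done < .gitignore

    git add .gitignore
    git commit .gitignore --no-verify -m "ignoring large files"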

    echo "ignored new large files"
fi

It is pretty brute force. The downside is that when the hook adds new large files to .gitignore, the original commit fails because the state (hash) has changed, so you need to run another commit to actually commit what you have staged. Consider this a feature telling you that new large files were detected ;-)
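If the extra commit is unwanted, a hook can instead refuse the commit and leave .gitignore untouched. The following is a sketch under assumptions, not the answer's method: LIMIT_BYTES and larger_than are illustrative names, the size checked is that of the working-tree file rather than the staged blob, and paths are assumed to contain no newlines or characters that Git would quote.

```shell
#!/bin/sh
# Sketch of a pre-commit hook that rejects oversized staged files.
# LIMIT_BYTES and larger_than are illustrative names, not Git features.
LIMIT_BYTES="${LIMIT_BYTES:-5242880}"   # 5 MiB, the same threshold as the hook above

# Read newline-separated paths on stdin; print those larger than $1 bytes.
larger_than() {
    while IFS= read -r path; do
        [ -f "$path" ] || continue
        # Arithmetic expansion strips the padding some wc implementations add.
        size=$(( $(wc -c < "$path") ))
        [ "$size" -gt "$1" ] && printf '%s\n' "$path"
    done
    return 0
}

# Added, copied, or modified paths currently in the index.
large=$(git diff --cached --name-only --diff-filter=ACM | larger_than "$LIMIT_BYTES")

if [ -n "$large" ]; then
    echo "Refusing to commit files larger than $LIMIT_BYTES bytes:" >&2
    printf '%s\n' "$large" >&2
    exit 1
fi
```

Saved as .git/hooks/pre-commit and made executable, this stops the commit and prints the offending paths, leaving the choice between ignoring them and moving them to Git LFS to you.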

Solution 7 - Git

Just adding an answer that summarizes the suggestions about "remove the leading ./", "useless use of sed", and "useless use of cat":

find . -size +100M -printf '%P\n' >> .gitignore
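Note that -printf is a GNU find extension, so the command above may not work with the find shipped on macOS or the BSDs. A portable sketch using sed is shown below. It also keeps a leading slash, so each entry is anchored to the directory containing .gitignore and a top-level name cannot match files of the same name deeper in the tree. anchored_large_files is a hypothetical helper name, and skipping .git is an addition that the one-liner above does not make.

```shell
#!/bin/sh
# anchored_large_files is a hypothetical helper, not part of Git or find.
# $1 is a find(1) -size expression; it defaults to +100M as in the answer above.
anchored_large_files() {
    find . -path ./.git -prune -o -type f -size "${1:-+100M}" -print |
        sed 's|^\.||'    # "./a/b.avi" becomes "/a/b.avi"
}
```

Used from the repository root as anchored_large_files +100M >> .gitignore.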

FWIW, I think cat is fine :D

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Warren Benedetto | View Question on Stack Overflow
Solution 1 - Git | abendine | View Answer on Stack Overflow
Solution 2 - Git | andreagalle | View Answer on Stack Overflow
Solution 3 - Git | stevec | View Answer on Stack Overflow
Solution 4 - Git | tisaconundrum | View Answer on Stack Overflow
Solution 5 - Git | earizon | View Answer on Stack Overflow
Solution 6 - Git | KIC | View Answer on Stack Overflow
Solution 7 - Git | physincubus | View Answer on Stack Overflow