What is the advantage of git lfs?
Problem Overview
GitHub has a limit on pushing large files, so if you want to push a large file to your repo, you have to use Git LFS.
I know it's a bad idea to add binary files to a Git repo. But suppose I am using GitLab on my own server, there is no limit on file size in a repo, and I don't care if the repo becomes very large on my server. In that case, what's the advantage of Git LFS? Will git clone or git checkout be faster?
Git Solutions
Solution 1 - Git
One specificity of Git (and other distributed systems) compared to centralized systems is that each repository contains the whole history of the project. Suppose you create a 100 MB file and modify it 100 times in a way that doesn't compress well. You'll end up with a 10 GB repository. This means that each clone downloads 10 GB of data and eats 10 GB of disk space on every machine where you make a clone. What's even more frustrating: you'd still have to download those 10 GB even after you git rm the big file.
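To see why, note the arithmetic: 100 versions × 100 MB of poorly-compressible data ≈ 10 GB of pack data, and deleting the file from the tip of the branch does not remove the old versions from history. A quick sketch of this (the file name is hypothetical):

```
# Deleting the big file from the tip of the branch...
git rm big-asset.bin
git commit -m "Remove big asset"

# ...does not shrink the repository: all 100 historical versions
# are still in the object database and still get cloned.
git count-objects -vH   # size-pack is still ~10 GB
```

Actually purging the old blobs requires rewriting history (e.g. with git filter-repo) and forces every collaborator to re-clone.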
Putting big files in a separate system like Git LFS allows you to store only pointers to each version of the file in the repository, so each clone downloads only a tiny piece of data for each revision. The checkout downloads only the version you are actually using, i.e. 100 MB in the example above. As a result, you use disk space on the server, but save a lot of bandwidth and disk space on the client.
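As a rough sketch, enabling LFS for a set of files looks like this (the pattern, file name, hash, and size below are all illustrative):

```
git lfs install                       # install the LFS filters once per user
git lfs track "*.bin"                 # route files matching this pattern through LFS
git add .gitattributes big-asset.bin  # the tracking rule itself is versioned
git commit -m "Add big asset via LFS"

# What Git itself stores for this path is only a small pointer file:
git show HEAD:big-asset.bin
# version https://git-lfs.github.com/spec/v1
# oid sha256:4d7a21...
# size 104857600
```

The 100 MB content lives on the LFS server and is fetched on demand at checkout; the repository history only ever carries those few lines of pointer text per version.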
In addition to this, the algorithm used by git gc (internally, git repack) does not always work well with big files. Recent versions of Git have made progress in this area and it should work reasonably well, but using a big repository with big files in it may eventually get you in trouble (like not having enough RAM to repack your repository).