Converting git repository to shallow?
Git Problem Overview
How can I convert an already cloned git repository to a shallow repository?
The git repository is downloaded through a script outside of my control so I cannot do a shallow clone.
The reason for doing this is to save disk space. (Yes, I'm really short on disk space so even though a shallow repository doesn't save much, it is needed.)
I already tried
git repack -a -d -f --depth=1
But that actually made the repository larger.
Git Solutions
Solution 1 - Git
This worked for me:
git pull --depth 1
git gc --prune=all
This leaves the tags and the reflog lying around, each of which can reference additional commits that use up space. I would not erase the reflog unless it is really necessary: it contains the local change history used to recover from mistakes.
The comments below list commands for erasing the tags or even the reflog, and link to a similar question with a longer answer.
If a lot of space is still in use, try removing the tags first, and only then the reflog.
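To see whether the conversion actually freed anything, it can help to compare the repository before and after. A minimal sketch, assuming a reasonably recent Git (for --is-shallow-repository); report_repo is a hypothetical helper name, not something from the answer:

```shell
#!/usr/bin/env bash
# Sketch: report whether a repository is shallow, how many commits are
# reachable from HEAD, and how much space the object store uses.
# report_repo is a hypothetical helper name.
report_repo() {
    local repo="${1:-.}"
    # Prints "true" or "false".
    printf 'shallow: %s\n' "$(git -C "$repo" rev-parse --is-shallow-repository)"
    # Number of commits reachable from HEAD.
    printf 'commits: %s\n' "$(git -C "$repo" rev-list --count HEAD)"
    # Loose and packed object counts and sizes, human-readable.
    git -C "$repo" count-objects -vH
}
```

Run report_repo /path/to/repo before and after the two commands above and compare the size-pack lines.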
Solution 2 - Git
You can convert a Git repo to a shallow one in place along these lines:
git rev-parse HEAD > .git/shallow
git reflog expire --expire=now --all
git prune
git prune-packed
Make a backup first, since this is a destructive operation. Also keep in mind that Git versions before 1.9 do not support cloning or fetching from a shallow repo. To really remove all the history, you also need to remove every reference to earlier commits (for example tags and other branches) before pruning.
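One way to take that backup is a bundle, which stores the refs and their history in a single file that can be cloned from later. A sketch, assuming the bundle is written somewhere with enough free space (another disk, for instance); backup_repo and the paths are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: back up a repository into a single bundle file before running
# destructive commands. backup_repo is a hypothetical helper name.
# Pass an absolute path for the bundle, because git -C changes directory.
backup_repo() {
    local repo="$1" bundle="$2"
    # --all packs every ref (branches, tags) and the history they reach.
    git -C "$repo" bundle create "$bundle" --all || return 1
    # Checks that the bundle is well-formed and usable with this repository.
    git -C "$repo" bundle verify "$bundle"
}
```

For example, backup_repo /path/to/repo /mnt/other-disk/repo.bundle; the history can be restored later with git clone /mnt/other-disk/repo.bundle.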
Solution 3 - Git
Create a shallow clone of a local repo:
git clone --depth 1 file:///full/path/to/original/dir destination
Note that the source address must be a file:// URL; that's important, because with a plain local path Git ignores --depth. Also, Git will treat the local file:// address as the remote ("origin"), so you'll need to update the new repository to point at the correct remote.
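The clone and the remote fix-up can be combined. A minimal sketch, assuming the original repository has a remote named origin; shallow_copy is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: make a depth-1 copy of a local repository and point its "origin"
# back at the upstream URL of the original.
# shallow_copy is a hypothetical helper name.
shallow_copy() {
    local src dest upstream
    src="$(cd "$1" && pwd)" || return 1   # file:// needs an absolute path
    dest="$2"
    # Upstream URL of the original; left empty if it has no "origin" remote.
    upstream="$(git -C "$src" remote get-url origin 2>/dev/null)" || upstream=""
    git clone --quiet --depth 1 "file://$src" "$dest" || return 1
    if [ -n "$upstream" ]; then
        git -C "$dest" remote set-url origin "$upstream"
    fi
}
```

For example, shallow_copy /full/path/to/original/dir destination. If the original has no origin remote, the copy keeps the file:// address and you have to set the remote by hand.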
Solution 4 - Git
Convert to shallow since a specific date:
git pull --shallow-since=YYYY-mm-dd
git gc --prune=all
Also works:
git fetch --shallow-since=YYYY-mm-dd
git gc --prune=all
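To confirm where the history now starts, list the commit dates that remain and take the earliest. A small sketch; oldest_commit_date is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: print the earliest committer date (YYYY-MM-DD) among the commits
# still reachable from HEAD. oldest_commit_date is a hypothetical name.
oldest_commit_date() {
    # ISO-style short dates sort correctly as plain text.
    git -C "${1:-.}" log --format=%cd --date=short | sort | head -n 1
}
```

After a --shallow-since conversion, the printed date should be on or after the date you passed.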
Solution 5 - Git
Combining the answer from @fuzzyTew with the comments on that answer:
git pull --depth 1
git tag -d $(git tag -l)
git reflog expire --expire=all --all
git gc --prune=all
Want to save space by running this across your entire disk? Then run this fd command:
fd -HIFt d '.git' -x bash -c 'pushd "$0" && ( git pull --depth 1; git tag -d $(git tag -l); git reflog expire --expire=all --all; git gc --prune=all ) && popd' {//}
Or with just regular find:
find . -type d -name '.git' -exec bash -c 'pushd "${0%/*}" && ( git pull --depth 1; git tag -d $(git tag -l); git reflog expire --expire=all --all; git gc --prune=all ) && popd' {} \;
Solution 6 - Git
Note that a shallow repo (like one created with git clone --depth 1 as a way to convert an existing repo to a shallow one) can fail on git repack.
See commit 5dcfbf5, commit 2588f6e, commit 328a435 (24 Oct 2018) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit ea100b6, 06 Nov 2018)
> repack -ad: prune the list of shallow commits
> git repack can drop unreachable commits without further warning, making the corresponding entries in .git/shallow invalid, which causes serious problems when deepening the branches.
> One scenario where unreachable commits are dropped by git repack is when a git fetch --prune (or even a git fetch when a ref was force-pushed in the meantime) can make a commit unreachable that was reachable before.
> Therefore it is not safe to assume that a git repack -adlf will keep unreachable commits alone (under the assumption that they had not been packed in the first place, which is an assumption at least some of Git's code seems to make).
> This is particularly important to keep in mind when looking at the .git/shallow file: if any commits listed in that file become unreachable, it is not a problem, but if they go missing, it is a problem. One symptom of this problem is that a deepening fetch may now fail with:
> fatal: error in object: unshallow
> To avoid this problem, let's prune the shallow list in git repack when the -d option is passed, unless -A is passed, too (which would force the now-unreachable objects to be turned into loose objects instead of being deleted). Additionally, we also need to take --keep-reachable and --unpack-unreachable=<date> into account.
> Note: an alternative solution discussed during the review of this patch was to teach git fetch to simply ignore entries in .git/shallow if the corresponding commits do not exist locally. A quick test, however, revealed that the .git/shallow file is written during a shallow clone, in which case the commits do not exist, either, but the "shallow" line does need to be sent. Therefore, this approach would be a lot more finicky than the approach presented by this patch.
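The failure mode described in the commit message, entries in .git/shallow whose commits have gone missing, can be checked for directly. A rough sketch, not part of Git itself; check_shallow is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: list entries in .git/shallow whose commit objects no longer exist
# locally. Returns 0 if all entries are present (or the repository is not
# shallow), 1 if any are missing, 2 if the path is not a Git repository.
# check_shallow is a hypothetical helper name.
check_shallow() {
    local repo="${1:-.}" gitdir sha missing=0
    gitdir="$(git -C "$repo" rev-parse --absolute-git-dir)" || return 2
    # No shallow file means the repository is not shallow; nothing to check.
    [ -f "$gitdir/shallow" ] || return 0
    while IFS= read -r sha; do
        # cat-file -e exits non-zero when the object does not exist.
        if ! git -C "$repo" cat-file -e "${sha}^{commit}" 2>/dev/null; then
            printf 'missing: %s\n' "$sha"
            missing=1
        fi
    done < "$gitdir/shallow"
    return "$missing"
}
```

Running check_shallow /path/to/repo after a repack or gc prints one line per missing entry, which is a hint that a later deepening fetch may fail.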