Is it safe to shallow clone with --depth 1, create commits, and pull updates again?

PerformanceGitGit Clone

Performance Problem Overview


The --depth 1 option in git clone:

> Create a shallow clone with a history truncated to the specified number of revisions. A shallow repository has a number of limitations (you cannot clone or fetch from it, nor push from nor into it), but is adequate if you are only interested in the recent history of a large project with a long history, and would want to send in fixes as patches.

But I've successfully done a shallow clone, committed some changes and pushed those changes back to the (bare clone) origin.

It makes sense to me - I mean why not? when the cloned HEAD is identifiable in the origin, and my commit comes on top of this, there seems no reason. But the manual says otherwise.

I like the idea of shallow clone - e.g. of drupal core: there's no way I need to know what went on in drupal 4 when I've started from 7. - but I don't want to shoot myself in the foot.

So is it safe to shallow clone, develop commits in it, pull again to keep up with updates from origin?

Performance Solutions


Solution 1 - Performance

Note that Git 1.9/2.0 (Q1 2014) has removed that limitation.
See commit 82fba2b, from Nguyễn Thái Ngọc Duy (pclouds):

> Now that git supports data transfer from or to a shallow clone, these limitations are not true anymore.

The documentation now reads:

--depth <depth>::

> Create a 'shallow' clone with a history truncated to the specified number of revisions.

That stems from commits like 0d7d285, f2c681c, and c29a7b8 which support clone, send-pack /receive-pack with/from shallow clones.
smart-http now supports shallow fetch/clone too.

All the details are in "shallow.c: the 8 steps to select new commits for .git/shallow".

Update June 2015: Git 2.5 will even allow for fetching a single commit!
(Ultimate shallow case)


Update January 2016: Git 2.8 (Mach 2016) now documents officially the practice of getting a minimal history.
See commit 99487cf, commit 9cfde9e (30 Dec 2015), commit 9cfde9e (30 Dec 2015), commit bac5874 (29 Dec 2015), and commit 1de2e44 (28 Dec 2015) by Stephen P. Smith (``).
(Merged by Junio C Hamano -- gitster -- in commit 7e3e80a, 20 Jan 2016)

This is "Documentation/user-manual.txt"

> A <<def_shallow_clone,shallow clone>> is created by specifying the git-clone --depth switch.
The depth can later be changed with the git-fetch --depth switch, or full history restored with --unshallow.

> Merging inside a <<def_shallow_clone,shallow clone>> will work as long as a merge base is in the recent history.
Otherwise, it will be like merging unrelated histories and may have to result in huge conflicts.
This limitation may make such a repository unsuitable to be used in merge based workflows.

Update 2020:

  • git 2.11.1 introduced option git fetch --shallow-exclude= to prevent fetching all history
  • git 2.11.1 introduced option git fetch --shallow-since= to prevent fetching old commits.

For more on the shallow clone update process, see "How to update a git shallow clone?".


As commented by Richard Michael:

> to backfill history: git pull --unshallow

And Olle Härstedt adds in the comments:

> To backfill part of the history: git fetch --depth=100.

Solution 2 - Performance

See some of the answers to my similar question why-cant-i-push-from-a-shallow-clone and the link to the recent thread on the git list.

Ultimately, the 'depth' measurement isn't consistent between repos, because they measure from their individual HEADs, rather than (a) your Head, or (b) the commit(s) you cloned/fetched, or (c) something else you had in mind.

The hard bit is getting one's Use Case right (i.e. self-consistent), so that distributed, and therefore probably divergent repos will still work happily together.

It does look like the checkout --orphan is the right 'set-up' stage, but still lacks clean (i.e. a simple understandable one line command) guidance on the "clone" step. Rather it looks like you have to init a repo, set up a remote tracking branch (you do want the one branch only?), and then fetch that single branch, which feels long winded with more opportunity for mistakes.

Edit: For the 'clone' step see this answer

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionartfulrobotView Question on Stackoverflow
Solution 1 - PerformanceVonCView Answer on Stackoverflow
Solution 2 - PerformancePhilip OakleyView Answer on Stackoverflow