Where does Git store the SHA1 of the commit for a submodule?


Git Problem Overview


I know that when you add a submodule to a git repository it tracks a particular commit of that submodule referenced by its sha1.

I'm trying to find where this sha1 value is stored.

The .gitmodules and .git/config files only show the paths for the submodule, but not the sha1 of the commit.

The git-submodule(1) reference only speaks of a gitlink entry and the gitmodules(5) reference doesn't say anything about this either.

Git Solutions


Solution 1 - Git

It is stored in Git's object database directly. The tree object for the directory where the submodule lives will have an entry for the submodule's commit (this is the so-called "gitlink").

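Try running `git ls-tree master <path-to-directory-containing-submodule>` (or just `git ls-tree master` if the submodule lives in the top-level directory). The submodule appears as an entry with mode 160000 and type commit, followed by the SHA-1 it is pinned to.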
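To see the gitlink entry without touching a real project, here is a minimal sketch that builds a throwaway superproject in a temporary directory. The names `lib`, `super`, `vendor/lib` and the `demo` identity are made up for illustration, and it reads `HEAD` instead of `master` because the default branch name varies between Git installations.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# A tiny repository that will play the role of the submodule's upstream.
git init -q lib
git -C lib -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
lib_sha=$(git -C lib rev-parse HEAD)

# The superproject. Recent Git versions refuse local-path submodule
# clones unless protocol.file.allow is set, so it is passed explicitly.
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "add submodule"

# List the tree for the directory that contains the submodule.
# The gitlink shows up with mode 160000 and object type "commit".
entry=$(git ls-tree HEAD vendor/)
echo "$entry"

# The same SHA-1 can be read directly from the tree entry.
recorded_sha=$(git rev-parse HEAD:vendor/lib)
echo "recorded: $recorded_sha"
echo "upstream: $lib_sha"
```

The two SHA-1 values printed at the end should match, since the superproject's tree records exactly the commit that was checked out in the submodule when it was committed.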

Solution 2 - Git

The object database ($GIT_DIR/objects/), where the tree object containing the submodule entry is stored, has been evolving recently:

With Git 2.34 (Q4 2021), the code to make "git grep" recurse into submodules has been updated to migrate away from the "add submodule's object store as an alternate object store" mechanism (which is suboptimal).

See commit 18a2f66, commit e3e8bf0, commit 0693806, commit dd45471, commit 78ca584, commit 50d92b5, commit 8d33c3a, commit a35e03d (16 Aug 2021) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 11e5d0a, 20 Sep 2021)

> ## submodule: lazily add submodule ODBs as alternates
> Signed-off-by: Jonathan Tan
> Reviewed-by: Emily Shaffer
> Reviewed-by: Matheus Tavares

> Teach Git to add submodule ODBs as alternates to the object store of the_repository only upon the first access of an object not in the_repository, and not when add_submodule_odb() is called.
>
> This provides a means of gradually migrating from accessing a submodule's object through alternates to accessing a submodule's object by explicitly passing its repository object.
> Any Git command can declare that it might access submodule objects by calling add_submodule_odb() (as they do now), but the submodule ODBs themselves will not be added until needed, so individual commands and/or combinations of arguments can be migrated one by one.
>
> [The advantage of explicit repository-object passing is code clarity (it is clear which repository an object read is from), performance (there is no need to linearly search through all submodule ODBs whenever an object is accessed from any repository, whether superproject or submodule), and the possibility of future features like partial clone submodules (which right now is not possible because if an object is missing, we do not know which repository to lazy-fetch into).]
>
> This commit also introduces an environment variable that a test may set to make the actual registration of alternates fatal, in order to demonstrate that its codepaths do not need this registration.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Abizern | View Question on Stackoverflow |
| Solution 1 - Git | Dan Moulding | View Answer on Stackoverflow |
| Solution 2 - Git | VonC | View Answer on Stackoverflow |