How to make an existing directory within a git repository a git submodule


Git Problem Overview


I'm very confused about git submodules. Basically, my problem is that I can't make git understand that ~/main-project/submodule is a submodule.


I have good experience with git submodules:
in my dotfiles repository I created the .gitmodules file in ~/dotfiles-repo and added the paths and URLs there. Since then, if I make changes to the files within the submodules and run git status, I get something like: .vim/bundle/auto-complete (new commits) # in red

I created the .gitmodules file in ~/main-project but:

  • If I make changes to ~/main-project/submodule and even push the changes, I don't get a similar response like <submodule> (new commits) # in red when running git status in ~/main-project. I just get the changes that were made in those directories
  • When I click the folder links for these directories on GitHub, I'm not taken to the repositories themselves; I stay in the same repository.
  1. Maybe I'm missing the whole point. What are the main features of submodules?
  2. Why does git understand the submodules in the dotfiles repo but not in my other repo?
  3. Is it because I have already told git to add the files inside ~/main-project/submodule to the index?

I've read this question, which led me to this answer, but I'm not sure I need git-subtree. I don't want to do things that might make changes that are hard to revert.

> Edit: This suggested duplicate solution didn't work either. I received an error saying "Updates were rejected because the remote contains work that you do not have locally". It seems that @GabLeRoux practically told me to push <repo-A> to the URL of <repo-B>.

Git Solutions


Solution 1 - Git

Use git submodule absorbgitdirs

This is what the docs state this command does:

> If a git directory of a submodule is inside the submodule, move the git directory of the submodule into its superprojects $GIT_DIR/modules path and then connect the git directory and its working directory by setting the core.worktree and adding a .git file pointing to the git directory embedded in the superprojects git directory.

So instead of starting all over, as suggested in the previous answers by @DomQ and myself, one can just run the following:

  1. Without removing the submodule from the index, add the submodule's URL to .gitmodules and to .git/config with
    git submodule add <url> <path>
  2. Move the submodule's $GIT_DIR directory (.git in regular repositories) to .git/modules/<path> with
    git submodule absorbgitdirs <path>
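
The two steps can be tried safely in a throwaway directory. This is only a minimal sketch under some assumptions: the repository names, the temporary paths and the demo identity are made up, a local bare repository stands in for <url>, and the nested repository is not yet tracked by the parent (if its files were already in the parent's index, git submodule add may refuse with "already exists in the index").

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical stand-in for the submodule's remote URL.
git init -q --bare "$work/sub-remote.git"

# A main project that contains a nested repository in sub/.
git init -q "$work/main-project"
cd "$work/main-project"
git commit -q --allow-empty -m "initial commit"
git init -q sub
cd sub
echo hello > file.txt
git add file.txt
git commit -q -m "initial submodule commit"
cd ..

# Step 1: register the existing nested repo. Nothing is cloned because
# sub/ already exists and is a valid repository.
git submodule add "$work/sub-remote.git" sub

# Step 2: move sub/.git into the parent's .git/modules/sub.
git submodule absorbgitdirs sub

# sub/.git is now a small file and the real git directory lives in the parent.
cat sub/.git
ls -d .git/modules/sub
```

After step 2 the layout should match what a freshly cloned submodule looks like, so the usual git submodule commands can be used from there.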
Original answer - pre v2.12.0

git submodule absorbgitdirs was introduced only in v2.12.0-rc0 (see commit).

The solution is quite simple. It was extracted from here.

  1. git rm -r submodule-dir
    This deletes all the files that git was tracking in submodule-dir (-r is needed when submodule-dir is tracked as a directory of ordinary files).
  2. rm -rf submodule-dir
    This deletes any other files that might have been left in submodule-dir because git ignored them.
  3. Now we have to commit in order to record the removal of the files:
    git commit
    After the commit, submodule-dir is cleared of both the files git tracked and the files it didn't. Now it's time to do:
  4. git submodule add <remote-path-to-submodule>
    This re-adds the submodule, this time as a true submodule.
  5. At this point it's a good idea to check .gitmodules and see whether the submodule has been added successfully. In my case I already had a .gitmodules file, so I had to modify it.
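
Here is a rough end-to-end sketch of those steps in a scratch directory. Everything named here is hypothetical: a local repository stands in for <remote-path-to-submodule>, and because recent git versions block local-path clones for submodules by default, the demo passes -c protocol.file.allow=always (with a real URL that option should not be needed).

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical stand-in for the submodule's remote.
git init -q "$work/sub-remote"
cd "$work/sub-remote"
echo lib > lib.txt
git add lib.txt
git commit -q -m "library code"

# Main project that tracks submodule-dir as ordinary files.
git init -q "$work/main-project"
cd "$work/main-project"
mkdir submodule-dir
echo lib > submodule-dir/lib.txt
echo scratch > submodule-dir/untracked.tmp   # a leftover that git never tracked
git add submodule-dir/lib.txt
git commit -q -m "track submodule-dir as ordinary files"

git rm -r -q submodule-dir                   # 1. drop the tracked files
rm -rf submodule-dir                         # 2. drop whatever is left
git commit -q -m "remove submodule-dir"      # 3. record the removal
# 4. re-add the directory as a real submodule
git -c protocol.file.allow=always submodule add "$work/sub-remote" submodule-dir
cat .gitmodules                              # 5. check the new entry
git commit -q -m "add submodule-dir as a submodule"

# Mode 160000 marks a submodule entry in the index.
git ls-files --stage submodule-dir
```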

Solution 2 - Git

None of these solutions seemed to work for me, so I figured out my own:

  1. Make sure a new git repo already exists that will hold the content of the new submodule. For example, we'll be using "git@github.com:/newemptyrepo"

  2. Navigate to the directory you're modulizing:

cd myproject/submodule-dir
  3. Remove the to-be submodule from the parent's index:
git rm -r --cached .
  4. Init a new git repo inside the to-be submodule:
git init
  5. Set up the origin for the to-be submodule and make your first commit:
git remote add origin git@github.com:/newemptyrepo
git add . && git commit && git push --set-upstream origin master
  6. Now you must navigate to the parent repo's top-level path:
cd .. && cd `git rev-parse --show-toplevel`
  7. Finally, add the submodule as you would normally:
git submodule add git@github.com:/newemptyrepo ./myproject/submodule-dir
  8. Now commit and push the changes the above command makes, and you're all set up!
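
For anyone who wants to rehearse this before touching a real project, the steps can be sketched against local repositories. The assumptions: a local bare repository replaces the GitHub URL, all paths and the demo identity are invented, and the push uses HEAD instead of master so it works whatever the default branch is called.

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Step 1: an empty repo that will hold the submodule (local stand-in for the GitHub URL).
git init -q --bare "$work/newemptyrepo.git"

# A parent repo that currently tracks submodule-dir as ordinary files.
git init -q "$work/myproject"
cd "$work/myproject"
mkdir submodule-dir
echo content > submodule-dir/file.txt
git add .
git commit -q -m "parent tracks submodule-dir directly"

cd submodule-dir                                  # step 2
git rm -r -q --cached .                           # step 3: untrack, keep the files on disk
git init -q                                       # step 4
git remote add origin "$work/newemptyrepo.git"    # step 5
git add .
git commit -q -m "first commit of the new submodule"
git push -q --set-upstream origin HEAD
cd .. && cd "$(git rev-parse --show-toplevel)"    # step 6
git submodule add "$work/newemptyrepo.git" submodule-dir   # step 7
git commit -q -m "convert submodule-dir into a submodule"  # step 8

git ls-files --stage submodule-dir
```

The last command should list a single entry with mode 160000, which is how the parent records a submodule.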

Solution 3 - Git

There is basically no better way than pretending to start over:

  1. ensure that everything is committed everywhere
  2. move your sub-repository out of the way
  3. git submodule add from the sub-repository's remote
  4. cd mysubmodule
  5. git fetch ../wherever/you/stashed/the/sub-repository/in/step-2
  6. git merge FETCH_HEAD
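
As an illustration only, the six steps might look like this with local repositories. The paths, repository names and demo identity are all hypothetical, the nested repository is assumed not to be tracked by the parent, and -c protocol.file.allow=always is only there because the stand-in remote is a local path.

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical stand-in for the sub-repository's remote.
git init -q "$work/sub-remote"
cd "$work/sub-remote"
echo v1 > file.txt
git add file.txt
git commit -q -m "v1"

# Main project with a nested clone that has one commit the remote lacks.
git init -q "$work/main-project"
cd "$work/main-project"
git commit -q --allow-empty -m "initial commit"
git clone -q "$work/sub-remote" mysubmodule
cd mysubmodule
echo v2 > local.txt
git add local.txt
git commit -q -m "local work"                 # 1. everything is committed
cd ..

mv mysubmodule "$work/stashed-sub"            # 2. move the sub-repository away
# 3. add the submodule from its remote
git -c protocol.file.allow=always submodule add "$work/sub-remote" mysubmodule
cd mysubmodule                                # 4.
git fetch -q "$work/stashed-sub"              # 5. fetch the stashed copy's HEAD
git merge -q FETCH_HEAD                       # 6. fast-forward onto the local work
```

The local commit ends up inside the new submodule, so nothing that only existed in the stashed copy is lost.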

To explain why this is so, it seems to me that a deeper understanding of what submodules are is needed than what one can glean from the git-submodule(1) manual page (or even the relevant chapter from the Git book). I found some more in-depth explanations on this blog post, but since that post is a bit lengthy I am taking the liberty of summarizing them here.

At a low level, a git submodule consists of the following elements:

  • A commit object at the top of the submodule tree,
  • (In recent versions of Git) A subdirectory in .git/modules to host the Git objects for the submodule,
  • An entry in the .gitmodules configuration file.

The commit object is contained (or more precisely, referenced by SHA1) in the parent tree object. This is unusual, as things usually happen the other way round, but this explains why you see a directory appear in the main repository's git status after you have performed a commit in the submodule. You can also make some experiments with git ls-tree to observe this commit object in more detail.
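
A small experiment along those lines, as a sketch with made-up paths and a demo identity: in the parent's tree the submodule shows up as a single entry of type commit with mode 160000, rather than as a tree of files.

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical stand-in remote; it is never contacted in this demo.
git init -q --bare "$work/sub-remote.git"

git init -q "$work/main-project"
cd "$work/main-project"
git init -q sub
cd sub
echo hello > file.txt
git add file.txt
git commit -q -m "submodule commit"
cd ..
git submodule add "$work/sub-remote.git" sub
git commit -q -m "add sub as a submodule"

# List the parent's tree: sub appears as "160000 commit <sha>".
git ls-tree HEAD
entry=$(git ls-tree HEAD sub)
sub_head=$(git -C sub rev-parse HEAD)
echo "submodule HEAD: $sub_head"
```

The SHA in the sub entry should be the same as the submodule's own HEAD, which is all the parent records about the submodule's content.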

The subdirectory in .git/modules stands in for a .git subdirectory in the submodule; and in fact, there is a .git file in the submodule that points to the former using a gitdir: line. This is the default behavior since version 1.7.8 of Git. I'm not sure why everything wouldn't Just Work if you just kept on having a separate .git directory, except that, as noted in the release notes, you would probably run into trouble when switching between a branch that has the submodule and another that doesn't.
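
To see that layout for yourself, something like the following should do. It is a sketch with hypothetical local paths and a demo identity; -c protocol.file.allow=always is only needed because the stand-in remote is a local path.

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical stand-in for the submodule's remote.
git init -q "$work/sub-remote"
cd "$work/sub-remote"
echo hello > file.txt
git add file.txt
git commit -q -m "initial commit"

git init -q "$work/main-project"
cd "$work/main-project"
git -c protocol.file.allow=always submodule add "$work/sub-remote" sub

# In the submodule, .git is a one-line file with a "gitdir:" pointer ...
cat sub/.git
# ... and the actual git directory sits under the parent's .git/modules.
ls -d .git/modules/sub
```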

The .gitmodules file provides the URL that git submodule update --remote and friends should pull from; which obviously is distinct from the main repository's set of remotes. Note also that .gitmodules is copied in part into .git/config by the git submodule sync command and other commands that invoke it behind the scenes.
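
The relationship between the two files can be checked with git config. In this sketch the paths, the submodule name sub and the new URL are all invented for the demo, and the new URL does not need to exist because git submodule sync only rewrites configuration.

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical stand-in for the submodule's remote.
git init -q "$work/sub-remote"
cd "$work/sub-remote"
echo hello > file.txt
git add file.txt
git commit -q -m "initial commit"

git init -q "$work/main-project"
cd "$work/main-project"
git -c protocol.file.allow=always submodule add "$work/sub-remote" sub
git commit -q -m "add sub"

# The URL is recorded in .gitmodules and copied into .git/config.
git config -f .gitmodules submodule.sub.url
git config submodule.sub.url

# Change the URL in .gitmodules only, then let sync copy it over.
new_url="$work/moved-remote"
git config -f .gitmodules submodule.sub.url "$new_url"
git submodule sync
git config submodule.sub.url
```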

While it is fairly easy to do the necessary changes by hand for .gitmodules + .git/config, and also for .git/modules + mysubmodule/.git (and in fact, there is even git submodule absorbgitdirs for the latter), there isn't really a porcelain to create only the in-tree commit object. Hence the proposed solution by moving + redoing changes presented above.

Solution 4 - Git

To answer your questions in order:

  1. The purpose of submodules according to GitHub. Feature-wise, a submodule is designed to be thought of as a child repository (which can almost be treated like any other file) that is version-controlled by a parent repository, where the parent tracks the current commit ID of the submodule (child repository) rather than its content.
  2. It's most likely because you've already added the files to the index of the repository. In that case, the solution is to git rm -r --cached submodule-name/, then create an intermediate commit, and then add the folder as a repository: git add submodule-name (notice that there is no trailing slash after submodule-name in the case of submodules).
  3. Yes :)
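
A sketch of point 2 in a scratch directory, assuming submodule-name already holds its own repository with at least one commit (names, paths and the demo identity are made up):

```shell
set -e
# Throwaway identity so commits work in a clean environment (demo assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Parent repo that has the directory's files in its index.
git init -q "$work/main-project"
cd "$work/main-project"
mkdir submodule-name
echo content > submodule-name/file.txt
git add submodule-name/file.txt
git commit -q -m "files tracked directly by the parent"

# The directory is also a repository of its own.
git -C submodule-name init -q
git -C submodule-name add file.txt
git -C submodule-name commit -q -m "child repository commit"

# Untrack the files (they stay on disk), then make the intermediate commit.
git rm -r -q --cached submodule-name/
git commit -q -m "stop tracking submodule-name as plain files"

# Add the folder as a repository: no trailing slash.
git add submodule-name
git commit -q -m "track submodule-name by commit ID"

git ls-files --stage submodule-name
```

Note that git add on its own only records the commit ID (mode 160000) and prints a hint about the embedded repository; it does not write a .gitmodules entry, which is what git submodule add <url> <path> takes care of.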

The answer you mentioned may correct the history of your commits as well:

  1. Advantages:

That folder will be treated as a submodule in all of your commit history, not just in future commits. This avoids complications if you check out a previous version where it was treated like a folder. It is a complication because when you return to the tip of your branch, you may also have to enter your submodule and check out the latest commit to restore all the files (which may have been deleted from your working directory). This could be avoided by doing some kind of recursive checkout of your latest commit.

  2. Disadvantages:

If the commit history is modified, all other contributors would also have to re-clone the project, since they will otherwise get merge conflicts or, worse, reintroduce the problem commits back into the project.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Doron Behar | View Question on Stackoverflow
Solution 1 - Git | Doron Behar | View Answer on Stackoverflow
Solution 2 - Git | Dellowar | View Answer on Stackoverflow
Solution 3 - Git | DomQ | View Answer on Stackoverflow
Solution 4 - Git | reubenjohn | View Answer on Stackoverflow