What's the difference between the index, cached, and staged in git?

Git Problem Overview


Are these the same thing? If so, why are there so many terms?!

Also, I know there is this thing called git stash, which is a place where you can temporarily store changes to your working copy without committing them to the repo. I find this tool really useful, but again, the name is very similar to a bunch of other concepts in git -> this is very confusing!!

Git Solutions


Solution 1 - Git

The index, the stage (or staging area), and the cache are the same thing. As for why there are so many terms, I think 'index' was the original term, but people found it confusing, so the other terms were introduced. I agree that this can make things a bit confusing at first.
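One way to convince yourself that the terms are interchangeable is to compare `git diff --cached` with `git diff --staged`, which are documented as synonyms. The following is a minimal sketch; the temporary directory, file name, and commit identity are placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch: --cached and --staged both compare the index with HEAD.
set -eu

repo=$(mktemp -d)                          # throwaway repository (arbitrary path)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity for the commit
git config user.name "Demo"

echo 'x' > file.txt
git add file.txt
git commit -q -m 'initial'

echo 'y' >> file.txt
git add file.txt                           # the change is now in the index

cached_out=$(git diff --cached)
staged_out=$(git diff --staged)

if [ "$cached_out" = "$staged_out" ]; then
    echo "identical output"
fi
```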

The stash facility of git is a way to store 'in-progress' work that you don't want to commit right now. The work is saved in a commit object that is kept in a separate stash area (referenced by refs/stash) rather than on your branch. The basic stash command stores uncommitted changes to tracked files (both cached/staged and uncached/unstaged changes) and then reverts the working directory to HEAD.
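As a rough illustration of the paragraph above, this sketch stashes one staged and one unstaged change, then checks that the stash is a commit object and that the tracked files are back at HEAD. The repository path, file name, and identity are made up for the example.

```shell
#!/bin/sh
# Sketch: a stash entry is a commit object, and stashing cleans the working directory.
set -eu

repo=$(mktemp -d)                          # throwaway repository (arbitrary path)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

echo 'x' > file.txt
git add file.txt
git commit -q -m 'initial'

echo 'staged change' >> file.txt
git add file.txt                           # this change is staged
echo 'unstaged change' >> file.txt         # this one is only in the working directory

git stash push --quiet                     # store both changes, reset to HEAD

stash_type=$(git cat-file -t refs/stash)   # object type behind the stash ref
status_after=$(git status --porcelain)     # empty when nothing is modified

echo "stash object type: $stash_type"
git stash list
```

Note that untracked files are left alone by the basic command, so this example only modifies a file that is already tracked.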

It's not really related to the index/stage/cache except that it'll store away uncommitted changes that are in the cache.

This lets you quickly save the state of a dirty working directory and index so you can perform different work in a clean environment. Later you can get back the information in the stash object and apply it to your working directory (even if the working directory itself is in a different state).
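A typical round trip might look like the sketch below: stash in-progress work, commit something unrelated in the clean tree, then bring the stashed work back on top of the new HEAD. File names, commit messages, and the identity are hypothetical.

```shell
#!/bin/sh
# Sketch: stash, do other work, then restore the stashed changes.
set -eu

repo=$(mktemp -d)                          # throwaway repository (arbitrary path)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

echo 'base' > notes.txt
git add notes.txt
git commit -q -m 'initial'

echo 'work in progress' >> notes.txt       # dirty working directory
git stash push --quiet                     # set the work aside

echo 'hotfix' > hotfix.txt                 # unrelated work in a clean tree
git add hotfix.txt
git commit -q -m 'hotfix'

git stash pop --quiet                      # reapply on top of the new HEAD

restored=$(tail -n 1 notes.txt)
remaining=$(git stash list)                # empty once the entry has been popped
echo "restored line: $restored"
```

By default the restored changes come back as working directory changes; `git stash pop --index` additionally tries to reinstate what was staged.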

The official git stash manpage has pretty good detail, while remaining understandable. It also has good examples of scenarios of how stash might be used.

Solution 2 - Git

It's very confusing indeed. The 3 terms are used interchangeably. Here's my take on why it's called each of those things. The git index:

  • is a binary file, .git/index, that is an index of all the tracked files
  • is used as a staging area for commits
  • contains cached SHA1 hashes for the files (which speeds things up)
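To look at the hashes recorded in the index, `git ls-files --stage` prints the mode, object hash, and stage number for every entry. The sketch below compares the recorded hash for one file with what `git hash-object` computes from the file on disk; the file names and identity are placeholders, and it assumes no content filters or line-ending conversion are configured.

```shell
#!/bin/sh
# Sketch: the index records an object hash for each tracked path.
set -eu

repo=$(mktemp -d)                          # throwaway repository (arbitrary path)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

echo 'x' > committed.txt
git add committed.txt
git commit -q -m 'initial'

echo 'y' > staged.txt
git add staged.txt

git ls-files --stage                       # one line per index entry

# Second whitespace-separated field of the entry is the object hash.
index_hash=$(git ls-files --stage staged.txt | awk '{print $2}')
file_hash=$(git hash-object staged.txt)    # hash of the file as it is on disk

if [ "$index_hash" = "$file_hash" ]; then
    echo "index entry matches file content"
fi
```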

An important note is that the index/cache/stage contains a list of ALL files under source control, even unchanged ones. Unfortunately, phrases like "add a file to the index" or "file is staged to the index" can misleadingly imply that the index only contains changed files.

Here's a demo that shows that the git index contains a list of ALL files, not only the changed files:

# setup
git init

echo 'x' > committed.txt
git add committed.txt
git commit -m 'initial'

echo 'y' > staged.txt
git add staged.txt

echo 'z' > working.txt

# list HEAD
git ls-tree --name-only -r HEAD
# committed.txt

# list index
git ls-files
# committed.txt
# staged.txt

# raw content of .git/index
strings .git/index
# DIRC
# committed.txt
# staged.txt
# TREE

# list working dir
ls -1
# committed.txt
# staged.txt
# working.txt
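If what you actually want is only the changed entries, the usual approach is to compare the index with HEAD rather than list the index. The sketch below recreates the same three files as the demo above in a temporary directory (the path and identity are placeholders) and contrasts the two listings.

```shell
#!/bin/sh
# Sketch: all index entries versus only the entries that differ from HEAD.
set -eu

repo=$(mktemp -d)                          # throwaway repository (arbitrary path)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

echo 'x' > committed.txt
git add committed.txt
git commit -q -m 'initial'

echo 'y' > staged.txt
git add staged.txt

echo 'z' > working.txt                     # untracked, never added

all_entries=$(git ls-files)                      # everything in the index
staged_only=$(git diff --cached --name-only)     # index entries that differ from HEAD

echo "index entries:"
echo "$all_entries"
# committed.txt
# staged.txt

echo "staged changes:"
echo "$staged_only"
# staged.txt
```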

Additional reading:

https://www.kernel.org/pub/software/scm/git/docs/technical/racy-git.txt

https://stackoverflow.com/questions/4084921/what-does-the-git-index-contain-exactly

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      Original Author  Original Content on Stackoverflow
Question          allyourcode      View Question on Stackoverflow
Solution 1 - Git  Michael Burr     View Answer on Stackoverflow
Solution 2 - Git  wisbucky         View Answer on Stackoverflow