What's the difference between the index, cached, and staged in git?
Git Problem Overview
Are these the same thing? If so, why are there so many terms?!
Also, I know there is this thing called git stash, which is a place where you can temporarily store changes to your working copy without committing them to the repo. I find this tool really useful, but again, the name is very similar to a bunch of other concepts in git, which is very confusing!
Git Solutions
Solution 1 - Git
The index/stage/cache are the same thing - as for why so many terms, I think that index was the 'original' term, but people found it confusing, so the other terms were introduced. And I agree that it makes things a bit confusing sometimes at first.
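One place where the interchangeable naming shows up directly is in command-line options. The sketch below is only an illustration, run in a throwaway repository (the temp directory and the file name a.txt are made up for the example): it stages a file, then compares git diff --cached with git diff --staged, which the git documentation describes as synonyms.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Stage one new file (a.txt is a hypothetical name).
echo 'hello' > a.txt
git add a.txt

# --staged is documented as a synonym of --cached,
# so both commands should describe the same staged change.
cached_out=$(git diff --cached)
staged_out=$(git diff --staged)
[ "$cached_out" = "$staged_out" ] && echo "identical output"

# "cached" appears again here: remove the file from the index only.
git rm -q --cached a.txt
index_listing=$(git ls-files)   # what the index lists now
```

After the last step the file is still present in the working directory; only its index entry is gone.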
The stash facility of git is a way to store 'in-progress' work that you don't want to commit right now (it is kept in a commit object that gets stored in a particular stash directory/database). The basic git stash command will store uncommitted changes made to the working directory (both cached/staged and uncached/unstaged changes) and will then revert the working directory to HEAD.
It's not really related to the index/stage/cache except that it'll store away uncommitted changes that are in the cache.
This lets you quickly save the state of a dirty working directory and index so you can perform different work in a clean environment. Later you can get back the information in the stash object and apply it to your working directory (even if the working directory itself is in a different state).
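A rough sketch of that save-and-restore cycle, assuming a throwaway repository with made-up file names and a placeholder identity (none of these come from the answer itself). It stashes one staged and one unstaged change, checks that the working directory is clean, then brings both back with git stash pop --index:

```shell
# Throwaway repository with a placeholder identity for commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.com'

echo 'one' > tracked.txt
git add tracked.txt
git commit -q -m 'initial'

# One staged change (a new file) and one unstaged change (an edit).
echo 'new' > staged.txt
git add staged.txt
echo 'two' >> tracked.txt

# Save both changes; the working directory goes back to HEAD.
git stash push -q
status_after_stash=$(git status --porcelain)
stash_count=$(git stash list | wc -l | tr -d ' ')

# ...do other work in the clean tree here...

# Restore the changes; --index asks git to also restore what was staged.
git stash pop -q --index
staged_after_pop=$(git diff --cached --name-only)
```

Without --index, git stash pop still brings the changes back, but it does not try to reproduce which of them were staged.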
The official git stash manpage has pretty good detail, while remaining understandable. It also has good examples of scenarios of how stash might be used.
Solution 2 - Git
It's very confusing indeed. The 3 terms are used interchangeably. Here's my take on why it's called each of those things. The git index is:
- a binary file, .git/index, that is an index of all the tracked files
- used as a staging area for commits
- contains cached SHA1 hashes for the files (speeds up performance)
An important note is that the index/cache/stage contains a list of ALL files under source control, even unchanged ones. Unfortunately, phrases like "add a file to the index" or "file is staged to the index" can misleadingly imply that the index only contains changed files.
Here's a demo that shows that the git index contains a list of ALL files, not only the changed files:
# setup
git init
echo 'x' > committed.txt
git add committed.txt
git commit -m 'initial'
echo 'y' > staged.txt
git add staged.txt
echo 'z' > working.txt
# list HEAD
git ls-tree --name-only -r HEAD
# committed.txt
# list index
git ls-files
# committed.txt
# staged.txt
# raw content of .git/index
strings .git/index
# DIRC
# committed.txt
# staged.txt
# TREE
# list working dir
ls -1
# committed.txt
# staged.txt
# working.txt
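To look at the cached hashes themselves, git ls-files has a --stage option that prints the mode, object id, and stage number for every index entry. The sketch below rebuilds a similar throwaway repository (the temp directory and the placeholder identity are assumptions for the example) and checks that the id stored in the index for staged.txt matches what git hash-object computes from the file:

```shell
# Throwaway repository with a placeholder identity for the commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.com'

echo 'x' > committed.txt
git add committed.txt
git commit -q -m 'initial'
echo 'y' > staged.txt
git add staged.txt

# Each line: <mode> <object id> <stage number> <path>
# Both the unchanged committed.txt and the newly staged staged.txt appear.
git ls-files --stage

# The object id cached in the index is the hash of the staged content.
index_id=$(git ls-files --stage staged.txt | awk '{print $2}')
file_id=$(git hash-object staged.txt)
[ "$index_id" = "$file_id" ] && echo "index entry matches file content"
```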
Additional reading:
https://www.kernel.org/pub/software/scm/git/docs/technical/racy-git.txt
https://stackoverflow.com/questions/4084921/what-does-the-git-index-contain-exactly