Terminology used by Git


Git Problem Overview


It seems like I have to learn to use git, which is probably a good thing (TM). However, reading online guides and man pages, I just cannot get my head around the terminology. Every term seems to be defined in terms of itself or of other unexplained terms (do a "man git" and you will see what I mean).

So, is there a more DAG-like structure of definitions of terms, including some of the following (all taken from the git man page(s)!)? Maybe one using a file system as a starting point, and not assuming the reader is well versed in svn (which I am not).

  • repo
  • repository
  • a git
  • "the git"
  • index
  • clone
  • commit
  • branch
  • tree
  • upstream
  • a head
  • HEAD
  • version
  • tag
  • archive
  • patch
  • submission
  • changeset
  • stash
  • object
  • module
  • submodule
  • refspec
  • a history

While I can find explanations for some of them, they are usually given in terms of the others. Some other terms I do know from other contexts (like a UNIX diff). And then there are some I only thought I knew...

I have gathered that there are repositories (similar to gits? and/or trees? upstream?), which you copy (clone? branch?) to get the files physically onto your hard drive. Then there are branches (similar to changesets?), tags and commits (similar to patches?), but the distinction between them is not clear. What modifies which files? What makes my files stay local and what might (heaven forbid) submit my code to teh internets?

What is the recommended way to work when it comes to branches, tags and commits, so that it is easy to swap between versions and to import updates from publicly available gits?

//T, biting his tongue to control his frustration...

Git Solutions


Solution 1 - Git

Here's an attempt to complete your glossary (off the top of my head, trying to use my own words):

  • repo, repository: This is your object database where your history and configuration are stored. May contain several branches. Often it contains a worktree too.

  • a git, "the git": never heard of, sorry. "the git" probably describes the software itself, but I'm not sure

  • index, staging area: This is a 'cache' between your worktree and your repository. You can add changes to the index and build your next commit step by step. When the index content is to your liking you can create a commit from it. Also used to keep information during failed merges (your side, their side and current state)

  • clone: A clone of a repository ("just another repository") or the act of doing so ("to clone a repository (creates a new clone)")

  • commit: A state of your project at a certain time. Contains a pointer to its parent commit (in case of a merge: multiple parents) and a pointer to the directory structure at this point in time.

  • branch: A different line of development. A branch in git is just a "label" which points to a commit. You can get the full history through the parent pointers. A branch by default is only local to your repository.

  • tree: Basically speaking a directory. It's just a list of files (blobs) and subdirectories (trees). (The list may also contain commits in case you use submodules, but that's an advanced topic)

  • upstream: After cloning a repository you often call that "original" repository "upstream". By default, git clone registers it as a remote named origin

  • a head: The top commit of a branch (commit the label points to)

  • HEAD: A symbolic name to describe the currently checked out commit. Often the topmost commit of the current branch

  • version: Might be the same as a commit. Could also mean a released version of your project.

  • tag: A descriptive name given to one of your commits (or trees, or blobs). Can also contain a message (eg. changelog). Tags can be cryptographically signed with GPG.

  • archive: A simple archive (.tar, .zip), nothing special wrt git.

  • patch: A commit exported to text format. Can be sent by email and applied by other users. Contains the original author, commit message and file differences

  • submission: no idea. Submitting a patch to a project maybe?

  • changeset: Synonym for "commit"

  • stash: Git allows you to "stash away" changes. This gives you a clean working tree without any changes. Later they can be "popped" to be brought back. This can be a life saver if you need to temporarily work on an unrelated change (eg. time critical bug fix)

  • object: One of commit, tree, blob, or tag. Each object is referenced by its SHA1 hash (the commit with id deadbeaf, the tree decaf). The hash is identical across all repositories that share the same object. It also guarantees the integrity of a repository: you cannot change past commits without changing the hashes of all child commits.

  • (module,) submodule: A repository included in another repository (eg. external library). Advanced stuff.

  • revspec: A revspec (or revparse expression) describes a certain git object or a set of commits through what is called the extended SHA1 syntax (eg. HEAD, master~4^2, origin/master..HEAD, deadbeaf^!, …)

  • refspec: A refspec is a pattern describing the mapping to be done between remote and local references during fetch or push operations

  • history: Describes all ancestor commits prior to a commit going back to the first commit.
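To see how a few of these terms hang together (commit, parent, tree, blob, HEAD), here is a minimal sketch in a throwaway repository. The temporary directory, the file name greeting.txt and the user identity are made up for the demo; only the git commands themselves are real.

```shell
set -e
# Throwaway repository in a temporary directory (the location is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity for the demo commits only (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "first commit"
echo "hello again" >> greeting.txt
git commit -q -am "second commit"

# HEAD is a symbolic name; here it resolves to the newest commit.
head_commit=$(git rev-parse HEAD)
# A commit points at its parent commit and at a tree (the directory snapshot).
parent_commit=$(git rev-parse HEAD~1)
tree=$(git rev-parse 'HEAD^{tree}')
# The tree lists blobs (file contents) and sub-trees.
blob=$(git rev-parse HEAD:greeting.txt)

git cat-file -t "$head_commit"   # prints: commit
git cat-file -t "$tree"          # prints: tree
git cat-file -t "$blob"          # prints: blob
git cat-file -p "$head_commit"   # raw commit: "tree" and "parent" lines, author, message
```

HEAD~1 and HEAD^{tree} are small examples of the revspec syntax mentioned above.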


Things you didn't mention, but are probably good to know:

Everything you do is local to your repository (either created by git init or git clone git://url.com/another/repo.git). There are only a few commands in git that interact with other repositories (a.k.a. teh interwebz), including clone, fetch, pull, push.
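A rough way to convince yourself of this: a repository made with git init knows about no other repository, while a clone remembers where it came from. The sketch below clones from a local directory instead of a URL so that it runs without network access; the directory names original and copy and the identity are assumptions for the demo.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A repository created with "git init" has no remotes configured.
git init -q original
remotes_after_init=$(git -C original remote)

# Identity for the demo commit only (hypothetical values).
git -C original config user.name "Example User"
git -C original config user.email "user@example.com"
echo "data" > original/file.txt
git -C original add file.txt
git -C original commit -q -m "initial"

# Cloning copies the history and records the source under the name "origin".
git clone -q original copy
remotes_after_clone=$(git -C copy remote)

echo "after init:  '$remotes_after_init'"
echo "after clone: '$remotes_after_clone'"
```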

Push & pull are used to synchronize repositories. Pull fetches objects from another repository and merges them with your current branch. Push is used to take your changes and push them to another repository. You cannot push single commits or changes; you can only push a commit including its complete history.
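Here is a small sketch of that round trip, assuming a bare repository on the local disk stands in for the remote one. The names shared.git, alice and bob and the identity are invented for the example.

```shell
set -e
# Identity for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# A bare repository (no worktree) plays the role of the shared repository.
git init -q --bare shared.git

# First clone; git warns that the repository is empty, which is expected here.
git clone -q shared.git alice 2>/dev/null
(
  cd alice
  echo "one" > notes.txt
  git add notes.txt
  git commit -q -m "first"
  git push -q origin HEAD      # push the current branch, with its history
)

# The second clone starts out with the first commit.
git clone -q shared.git bob

# A new commit in alice stays local until it is pushed...
(
  cd alice
  echo "two" >> notes.txt
  git commit -q -am "second"
  git push -q origin HEAD
)

# ...and reaches bob only when bob pulls (fetch, then merge into the current branch).
(cd bob && git pull -q)

alice_head=$(git -C alice rev-parse HEAD)
bob_head=$(git -C bob rev-parse HEAD)
echo "alice: $alice_head"
echo "bob:   $bob_head"
```

After the pull both clones point at the same commit, so the two printed hashes match.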

A single repository can contain multiple branches but does not need to. The default branch in git is traditionally called master (newer setups are often configured to use main instead). You can create as many branches as you want; merging is a piece of cake with git. Branches are local until you run git push origin <branch>.
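The "local until pushed" part can be checked with git ls-remote, which asks the other repository which references it has. This is only a sketch: the bare repository shared.git, the clone mine, the branch name feature and the identity are all made up.

```shell
set -e
# Identity for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare shared.git
git clone -q shared.git mine 2>/dev/null   # empty-repository warning is expected
cd mine

echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git push -q origin HEAD

# A new branch is just a new label on the current commit.
git checkout -q -b feature
echo "change" >> file.txt
git commit -q -am "work on feature"

# The shared repository has not heard of the branch yet...
before=$(git ls-remote origin refs/heads/feature)
# ...until it is pushed explicitly.
git push -q origin feature
after=$(git ls-remote origin refs/heads/feature)

echo "before push: '$before'"
echo "after push:  '$after'"
```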

A commit describes a complete state of the project. Those states can be compared to one another, which produces a "diff" (git diff origin/master master = see differences between origin/master and master)
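The same comparison works between any two commits, not just branch heads. A minimal sketch, assuming a scratch repository with a single file (file.txt and the identity are hypothetical):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity for the demo commits only (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

printf 'one\n' > file.txt
git add file.txt
git commit -q -m "first"

printf 'one\ntwo\n' > file.txt
git commit -q -am "second"

# Compare the previous commit (HEAD~1) with the current one (HEAD).
git --no-pager diff HEAD~1 HEAD

# Only the names of the files that differ between the two states.
changed=$(git --no-pager diff --name-only HEAD~1 HEAD)
echo "changed: $changed"
```

The first diff is in the usual unified format, with a "+two" line for the line that was added.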

Git is pretty powerful when it comes to preparing your commits. The key ingredient here is the "index" (or "staging area"). You can add single changes to the index (using git add) until you think the index looks good. git commit fires up your text editor and you need to provide a commit message (why and how did you make that change); after entering your commit message git will create a new commit – containing the contents of the index – on top of the previous commit (the parent pointer is the SHA1 of the previous commit).
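As a small illustration of the index, the sketch below creates two files but stages only one of them, so only that one ends up in the commit. File names and identity are invented, and -m is used to pass the commit message directly instead of opening an editor.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity for the demo commit only (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > a.txt
echo "b" > b.txt

# Only a.txt is added to the index; b.txt stays a plain untracked file.
git add a.txt
staged=$(git diff --cached --name-only)

# The commit is built from the index, not from everything in the worktree.
git commit -q -m "add a.txt only"
committed=$(git ls-tree --name-only HEAD)
untracked=$(git ls-files --others --exclude-standard)

echo "staged:    $staged"
echo "committed: $committed"
echo "untracked: $untracked"
```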

Solution 2 - Git

Git comes with documentation for exactly what you are looking for.

$ git help glossary

Solution 3 - Git

I found this (free) book very useful when learning how to use git: http://progit.org/. The book exists in printed form as well.

I think the quickest way to learn git is probably to pick up a book or tutorial which teaches you the basic concepts and terms.

Solution 4 - Git

Another good resource for learning Git is Edgecase's Git Immersion. Trying to learn Git through the man pages is probably very difficult: there is a short, steep learning curve that has to be overcome first. You need to be introduced to the concept of a DVCS (Distributed Version Control System) first.

Progit as recommended by @fulhack is also very good.

I can also strongly recommend Think Like A Git. The explanation of rebase here is worth its weight in gold.

Solution 5 - Git

The best I have found for understanding git is The Git Parable:

> Imagine that you have a computer that has nothing on it but a text editor and a few file system commands. Now imagine that you have decided to write a large software program on this system. Because you’re a responsible software developer, you decide that you need to invent some sort of method for keeping track of versions of your software so that you can retrieve code that you previously changed or deleted. What follows is a story about how you might design one such version control system (VCS) and the reasoning behind those design choices...

Solution 6 - Git

I think you might like this article: Git for Computer Scientists

And another important aspect to understand when using git is the workflow. Read this wonderful blog post: Git branching model

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | The Apa | View Question on Stackoverflow
Solution 1 - Git | knittl | View Answer on Stackoverflow
Solution 2 - Git | Benjamin Bannier | View Answer on Stackoverflow
Solution 3 - Git | Jonatan | View Answer on Stackoverflow
Solution 4 - Git | Daniel Lee | View Answer on Stackoverflow
Solution 5 - Git | Benjol | View Answer on Stackoverflow
Solution 6 - Git | yasouser | View Answer on Stackoverflow