Why is Git not considered a "block chain"?

Git, Hash, Blockchain

Git Problem Overview


Git's internal data structure is a tree of data objects in which each object points only to its predecessor. Each data block is hashed. A modification of an intermediate block (by bit error or attack) is noticed because the saved hash and the actual hash deviate.

How is this concept different from block chain?
Git is not listed as an example of a blockchain, but at least in summaries, both data structure descriptions look alike: data blocks, single-direction reverse linking, hashes, and so on.

So where is the difference that keeps Git from being called a blockchain?

Git Solutions


Solution 1 - Git

Git and blockchains appear similar because both use Merkle trees as their underlying data structure. A Merkle tree is a tree in which each node is labeled with the cryptographic hash of its contents, which include the labels of its children.
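
To make the definition concrete, here is a minimal sketch of Merkle-style labeling. The `Node` class and its `label` method are made up for illustration, and SHA-256 is an arbitrary choice; no claim is made that Git or any blockchain serializes nodes this way.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: a payload plus child nodes."""
    data: bytes
    children: List["Node"] = field(default_factory=list)

    def label(self) -> str:
        # The label hashes the node's own data together with the labels of
        # all its children, so a change anywhere below changes this label.
        h = hashlib.sha256()
        h.update(self.data)
        for child in self.children:
            h.update(child.label().encode())
        return h.hexdigest()


leaf_a = Node(b"a")
leaf_b = Node(b"b")
root = Node(b"root", [leaf_a, leaf_b])

label_before = root.label()
leaf_b.data = b"B"            # tamper with a leaf
label_after = root.label()

print(label_before != label_after)  # the root label no longer matches
```

Anyone who remembers only the root label can detect a change in any node below it, which is the integrity property both Git and blockchains rely on.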

Git’s directed acyclic graph is exactly that: a Merkle tree in which each node (tag, commit, tree, or blob object) is labeled with the hash of its content, and that content includes the labels of its “children”. Note that for commits, the term “child” conflicts a bit with Git’s notion of parents: parent commits are the children of commits in this tree. You just need to look at the graph as a tree that keeps growing by being re-rooted.
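
As a rough illustration of how Git labels its objects, the sketch below recomputes an object id in Python. It assumes a repository using Git's default SHA-1 object format, where the id is the hash of a `<type> <size>\0` header followed by the raw content. `git_object_id` and `toy_commit_id` are helper names invented here, and the toy commit leaves out the author and committer lines that a real commit contains.

```python
import hashlib
from typing import List


def git_object_id(obj_type: str, content: bytes) -> str:
    # Git hashes the header "<type> <size>\0" followed by the raw content.
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_id(tree_id: str, parent_ids: List[str], message: str) -> str:
    # Simplified commit body: it names its tree and its parent commits by id,
    # so the commit id depends on everything those ids stand for.
    lines = [f"tree {tree_id}"] + [f"parent {p}" for p in parent_ids]
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode())


# Compare with: echo "hello world" | git hash-object --stdin
blob_id = git_object_id("blob", b"hello world\n")
print(blob_id)

empty_tree = git_object_id("tree", b"")
first = toy_commit_id(empty_tree, [], "initial commit")
second = toy_commit_id(empty_tree, [first], "second commit")

# Rewriting the first commit yields a different id, and therefore a
# different id for every commit built on top of it.
first_rewritten = toy_commit_id(empty_tree, [], "initial commit (edited)")
second_rewritten = toy_commit_id(empty_tree, [first_rewritten], "second commit")
print(second != second_rewritten)
```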

Blockchains are very similar: they also keep growing that way, and they also use the Merkle tree property to ensure data integrity. But blockchains are usually understood as much more than just Merkle trees, and this is where they part ways with the “stupid content tracker” Git. For example, a blockchain usually also implies a system that is highly decentralized at the block level (not all blocks need to be in the same place).

Understanding blockchains is kind of difficult (personally, I’m still far from understanding everything about them), but I consider understanding Git internals a good way to understand Merkle trees, which definitely helps with a fundamental part of blockchains.

Solution 2 - Git

The question reads: Why is Git not considered a “block chain”? So this is asserting that there is a wide-spread opinion that Git is not a blockchain (an assertion that is illustrated and corroborated by the answers preceding mine on this page) and asking for the reason of the prevalence of this opinion. This is a good question.

Taking the question literally, the answer could be that the blockchain term and concept gained popularity as part of the digital currency operation called “Bitcoin”, and hence came to be associated with how Bitcoin does things: which is by using a lot of computing power to calculate a specific hash including a nonce to meet certain arbitrary requirements, which is by allegedly having no central authority, which is by being “independent”, maybe even “democratic”, and the rest of the kool aid; and as these things are not seen in Git, well, Git cannot be a blockchain, right? And so the question would be answered literally.

Hidden behind this prima facie question is another question: What is a blockchain? Now you could look up a definition somewhere and copy it over here, but I didn't do that. I made up my mind years ago, when listening to a podcast about Bitcoin that strove to explain the new concept of a blockchain, that a blockchain works like Git, and I don't intend to let my precious understanding be misled by random claims on the internet.

So what is a blockchain? What's in the word?

Nothing in the term “blockchain” presupposes the requirement to include a nonce in the content so as to come up with a hash of so and so many leading zeros. (This requirement is only there to be able to control the blockchain by computing power and so, ultimately, by money.)

Nothing in the term “blockchain” presupposes the existence of a network, let alone a decentralized one.

Nothing in the term “blockchain” presupposes any “independence” from “central authority”.

The term “block chain” only presupposes blocks (of data) chained together. Now what is a chain? Is it just a link? No, it is a strong link designed to hold things together by force.

A simple linked list doesn't qualify as a blockchain because the contents of the chunks of data in the list could be altered while the list would continue to link back and forth just fine. This is not how a chain works.

To make a link of blocks of data into a chain of blocks of data, the contents of the blocks need to be checksummed (digested) in one way or another and this checksum (digest) must be part of the link, making it a strong link protecting the content, preventing it from being altered. This is a blockchain.

And this is what Git does, and hence Git is a blockchain, or works as one, if you prefer.
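
The distinction drawn above between a plain linked list and a chain can be sketched in a few lines. This is only an illustration under assumptions of my own: blocks are Python dicts, they are serialized with JSON and digested with SHA-256, and `block_hash`, `append`, and `verify` are made-up helper names.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    # Digest of the whole block, including its link to the previous block.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append(chain: list, data: str) -> None:
    # The link is not a mere pointer: it is the digest of the previous block.
    prev = block_hash(chain[-1]) if chain else None
    chain.append({"data": data, "prev": prev})


def verify(chain: list) -> bool:
    # Each stored link must equal the recomputed digest of the predecessor.
    return all(
        chain[i]["prev"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
for item in ["first", "second", "third"]:
    append(chain, item)

ok_before = verify(chain)
chain[1]["data"] = "tampered"   # alter an intermediate block
ok_after = verify(chain)

print(ok_before, ok_after)
```

Note that the last block is protected only if someone keeps its digest elsewhere, which is the role a branch tip or a published hash plays.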

To close the circle, let's ask again: Why is Git not considered a “block chain”? It could be because many people, perhaps even a large majority, do not focus on the essence of a concept but on blinking accidents.

Solution 3 - Git

Blockchain is not just any chain of any blocks.

A chain of blocks becomes a blockchain when there is a way of determining the main chain after two or more chains have diverged, and when no central authority is needed for that determination.
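
One possible shape of such a rule, as a sketch only: every participant applies the same deterministic function to the candidate chains it knows about, so all of them arrive at the same main chain without asking anyone. The `Block` type, the `work` field, and the tie-break by tip id are assumptions made for this example; real networks define their own rules.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    block_id: str
    work: int  # hypothetical measure of the effort spent on this block


def total_work(chain: List[Block]) -> int:
    return sum(block.work for block in chain)


def choose_main_chain(candidates: List[List[Block]]) -> List[Block]:
    # Deterministic rule: most total work wins; ties are broken by the id of
    # the tip, so every node picks the same chain regardless of input order.
    return max(candidates, key=lambda c: (total_work(c), c[-1].block_id))


shared = [Block("genesis", 1), Block("a", 1)]
fork_x = shared + [Block("x1", 1)]
fork_y = shared + [Block("y1", 1), Block("y2", 1)]

main = choose_main_chain([fork_x, fork_y])
print([block.block_id for block in main])
```

Git has no counterpart to `choose_main_chain`: when two branches diverge, a person or a hosting policy decides which one counts.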

Solution 4 - Git

Cyber currencies like Bitcoin use a distributed-consensus cryptographic chain of blocks (Merkle tree). Common usage has shortened this to 'blockchain'.

While Git uses a chain of blocks (Merkle tree), it lacks the distributed-consensus cryptographic components that common usage of the term 'blockchain' implies.

Solution 5 - Git

Unlike cryptocurrency blockchains, Git doesn't have a P2P trustless consensus mechanism.

Solution 6 - Git

There is no reason not to consider Git a blockchain. Git is focused on a very particular (and important) set of assets: source code. The consensus in this case is manual, and we can consider a transaction (commit) accepted when it is merged into the release branch. Actually, considering the number of transactions (commits), Git is by far the most successful blockchain.

Extracted from: https://arxiv.org/pdf/1803.00892.pdf "... We define “blockchain” and “blockchain network”, and then discuss two very different, well known classes of blockchain networks: cryptocurrencies and Git repositories..."

See also the following paper, which explains why Google uses a single monorepo as a single source of truth (basically, as a blockchain): https://research.google/pubs/pub45424/

Solution 7 - Git

As poke said:

Git and blockchains appear similar because both use Merkle trees to store ordered, timestamped transactions. A Merkle tree is a tree data structure in which each node is labeled with the cryptographic hash of its contents, which include the labels of its children.

The first difference is the cost of creating a block: a blockchain makes finding an acceptable hash deliberately expensive, so that each block has to be mined, whereas a Git "block" can be created with a simple commit.

The purpose of Bitcoin is to add trust to the order of transactions. The focus is on the longest chain, since that is most expensive to compute and thus most likely to be the truth.

Bitcoin accomplishes this by requiring that the hash meets certain parameters (begins with a specific number of 0s), by incrementing a value ("nonce") in the message until a satisfactory hash is found. This takes effort to find, but only 1 calculation to verify for a nonce; and if multiple nonces produce a satisfactory hash, then one will be lower and taken as the truth. Other authentication schemes make the hash trustworthy by centralizing the issuing of the hash to an authority, perhaps voted by network agreement, or some other method.
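
The nonce search described above can be sketched as follows. This is a deliberately simplified toy, not Bitcoin's actual scheme: it assumes a single SHA-256 over the message plus a decimal nonce and counts leading zero hex digits, and `mine` and `check` are names invented for the example.

```python
import hashlib
from typing import Tuple


def digest_for(message: bytes, nonce: int) -> str:
    return hashlib.sha256(message + str(nonce).encode()).hexdigest()


def mine(message: bytes, zero_hex_digits: int) -> Tuple[int, str]:
    # Finding a nonce takes many attempts: try 0, 1, 2, ... until the
    # digest starts with the required number of zeros.
    prefix = "0" * zero_hex_digits
    nonce = 0
    while True:
        digest = digest_for(message, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def check(message: bytes, nonce: int, zero_hex_digits: int) -> bool:
    # Verifying a claimed nonce takes a single hash computation.
    return digest_for(message, nonce).startswith("0" * zero_hex_digits)


nonce, digest = mine(b"block payload", 4)
print(nonce, digest)
print(check(b"block payload", nonce, 4))
```

With four zero hex digits the search needs about 65,000 attempts on average, while the check needs one; raising the number of required zeros makes mining exponentially more expensive and leaves verification unchanged.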

Blockchain data is limited to transactions, which must conform to validation rules: a transaction must be valid to be included in the next block. A Bitcoin transaction corresponds to something important in the real world that justifies using an expensive block to record it, like an exchange of monetary value. We don't actually care about the final ledger; it's a metaphor for something in the real world.

By contrast, Git blocks are arbitrary, as a commit can contain any amount of data. The value lies in the changes of data being organized into the Git tree, because we care about the final product; it is validated by the existence of the Git repository.

The purpose of Git is to allow cheap "ledgers" to track multiple product alternatives. The "ledger" in Git is what we care about; it's our final product, and the transaction data just records how the product was built. We want to make it very cheap to make multiple versions of final products, with just enough overhead to require the creator to record how they built the product. No explicit validation is done on the data: you maintain the end product if it looks good, and that existence makes it useful to have the chain of this product's creation. If the end product is bad or the order of commits is invalid, this "ledger" is abandoned and eventually deleted during garbage collection.

The second difference is that blockchain transactions must come from a prior valid source. In Git, we don't care what data you use to extend the tree. In that sense, Git tracks the extension of our environment, whereas a blockchain tracks the exchange of value within a closed environment.

Solution 8 - Git

To sum it up (for me):

While Git offers you complete freedom of choice, blockchains are a highly political system in which you are forced to trust others:

  • Git is a Merkle Tree without a predefined consensus algorithm.

  • Blockchains are Merkle Trees with a predefined consensus algorithm.

Hence if you are all alone, there is no difference between Git and a Blockchain. As you trust Git and yourself, you already have that predefined consensus.

But things start to become different, when you are in a Network.


Notes:

  • For Blockchains there is absolutely no requirement for the hash to be difficult to calculate, to define something like "Mining", or to have some specific software which ensures you take part in a certain Network.
    This all might be a requirement for something like Bitcoin (which usually is referred to as a Cryptocurrency, which I cannot fully agree with), but neither does Bitcoin define what a Blockchain is, nor does a Blockchain need to be something like Bitcoin.

  • The consensus algorithm need not be based on some cryptographic protocol. For example, it would be enough to publish your TIP in a local newspaper each day to (ab)use Git as some Blockchain.

Git readily offers multiple possible consensus algorithms you can choose from:

  • Publishing the SHA in a Newspaper or similar (something which is distributed and hard to fake)

  • If you are in the rare situation that you are already part of some GnuPG Web Of Trust, you readily can use Signed Commits (or Signed Tags) to agree to the consensus.

  • The "Signed-off-by:" variant does not offer cryptographically secure consensus, but in combination with something like Gerrit and Fast-Forward-Only pushes it is a pretty well-defined consensus algorithm.

Hence to make Git a Blockchain, all you need is to add some air.


Some different view:

Git is no Blockchain by itself. It is at once far less than a Blockchain (lacking the predefined consensus algorithm) and much more than a Blockchain (it allows a plethora of consensus algorithms to choose from, is meant as an SCM, etc.).


Some other observations:

  • Git branches are the same as Blockchain splits. While Blockchain splits happen rarely, most Git repositories have fewer branches (master+HEAD) than Bitcoin had splits.

  • Git always has an explicit consensus made by you, that is, the TIP you push to. However, this only applies to you and nobody else.
    Pushing the Git repository to some shared Git Service can also be seen as a consensus. There is no requirement for such a consensus to be based on Democratic principles.


Very personal thoughts:

While Blockchain is some overhyped buzzword, something you can happily live without, Git is an indispensable fundamental tool for getting your work done, one of the basic must-haves you cannot live without, something as important as air and water. This is probably why people like me do not refer to Git as a Blockchain.

YMMV

Solution 9 - Git

The goals of blockchain and Git are different, although both use Merkle trees as their data structure.

A blockchain is typically managed by a peer-to-peer network adhering to a protocol for inter-node communication and validating new blocks. Once recorded, the data in any given block cannot be altered retroactively without alteration of all subsequent blocks, which requires consensus of the network majority.

According to the Bitcoin whitepaper:

> A purely peer-to-peer version of electronic cash would allow online
> payments to be sent directly from one party to another without going
> through a financial institution. Digital signatures provide part of
> the solution, but the main benefits are lost if a trusted third party
> is still required to prevent double-spending. We propose a solution to
> the double-spending problem using a peer-to-peer network. The network
> timestamps transactions by hashing them into an ongoing chain of
> hash-based proof-of-work, forming a record that cannot be changed
> without redoing the proof-of-work. The longest chain not only serves
> as proof of the sequence of events witnessed, but proof that it came
> from the largest pool of CPU power. As long as a majority of CPU power
> is controlled by nodes that are not cooperating to attack the network,
> they'll generate the longest chain and outpace attackers. The network
> itself requires minimal structure. Messages are broadcast on a best
> effort basis, and nodes can leave and rejoin the network at will,
> accepting the longest proof-of-work chain as proof of what happened
> while they were gone.

Git, by contrast, is a distributed version-control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Its goals include speed, data integrity, and support for distributed, non-linear workflows.

According to Linus Torvalds:

> In many ways you can just see git as a filesystem – it's
> content-addressable, and it has a notion of versioning, but I really
> designed it coming at the problem from the viewpoint of a filesystem
> person (hey, kernels is what I do), and I actually have absolutely
> zero interest in creating a traditional SCM system.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Paebbels | View Question on Stackoverflow
Solution 1 - Git | poke | View Answer on Stackoverflow
Solution 2 - Git | Lumi | View Answer on Stackoverflow
Solution 3 - Git | Daniel Vartanov | View Answer on Stackoverflow
Solution 4 - Git | Brian Sullivan | View Answer on Stackoverflow
Solution 5 - Git | Miguel Mota | View Answer on Stackoverflow
Solution 6 - Git | earizon | View Answer on Stackoverflow
Solution 7 - Git | Alex F | View Answer on Stackoverflow
Solution 8 - Git | Tino | View Answer on Stackoverflow
Solution 9 - Git | asing177 | View Answer on Stackoverflow