Is git good with binary files?

Git Problem Overview


If I have a lot of uncompressed files that are modified often, and many compressed files that are never (or almost never) modified, would git handle it well? For example, if I insert or remove data in the middle of a file and insert data near the end, will git notice exactly what changed, as it does with text?

If git isn't good with binary files, what tool might I consider?

Git Solutions


Solution 1 - Git

Out of the box, git can easily add binary files to its index and store them efficiently, unless you frequently update large incompressible files, since each new revision of such a file tends to add close to its full size to the repository.
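To make the compressibility point concrete: git deflates the objects it stores with zlib, and zlib cannot shrink data that is already high-entropy. Below is a minimal Python sketch of that effect, not git's actual code. The sample data and the helper names (pseudo_compressed, deflate_ratio) are made up for illustration, and chained SHA-256 output is only a stand-in for an already-compressed file such as a JPEG or ZIP.

```python
import hashlib
import zlib


def pseudo_compressed(n_blocks):
    """Hypothetical stand-in for an already-compressed file:
    chained SHA-256 digests look like random bytes to zlib."""
    out, block = [], b"seed"
    for _ in range(n_blocks):
        block = hashlib.sha256(block).digest()
        out.append(block)
    return b"".join(out)


def deflate_ratio(data):
    """Size after zlib compression divided by the original size."""
    return len(zlib.compress(data)) / len(data)


# Uncompressed, repetitive data (e.g. a plain-text log) shrinks a lot.
plain = b"".join(b"record %d: status=OK\n" % i for i in range(5000))
# High-entropy data barely shrinks, if at all.
dense = pseudo_compressed(2000)

if __name__ == "__main__":
    print(f"plain ratio: {deflate_ratio(plain):.2f}")
    print(f"dense ratio: {deflate_ratio(dense):.2f}")
```

A ratio near 1.0 for the second sample is the situation the answer warns about: every committed revision of such a file costs roughly its full size.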

The problems begin when git needs to generate diffs and merges: git cannot generate meaningful diffs for binary files, or merge them in any way that could make sense. So any merge, rebase or cherry-pick in which both sides changed the same binary file will require you to resolve the conflict on that file manually, typically by picking one version.

You need to decide whether changes to binary files are rare enough that you can live with the extra manual work they cause in the normal git workflow of merges, rebases and cherry-picks.

Solution 2 - Git

In addition to the other answers:

  • You can send a diff of a binary file using the so-called binary diff format (what git diff --binary produces). It is not human-readable, and it can only be applied if you have the exact preimage in your repository, i.e. without any fuzz.
    An example:

      diff --git a/gitweb/git-favicon.png b/gitweb/git-favicon.png
      index de637c0608090162a6ce6b51d5f9bfe512cf8bcf..aae35a70e70351fe6dcb3e905e2e388cf0cb0ac3 100
      GIT binary patch
      delta 85
      zcmZ3&SUf?+pEJNG#Pt9J149GD|NsBH{?u>)*{Yr{jv*Y^lOtGJcy4sCvGS>LGzvuT
      nGSco!%*slUXkjQ0+{(x>@rZKt$^5c~Kn)C@u6{1-oD!M<s|Fj6
     
      delta 135
      zcmXS3!Z<;to+rR3#Pt9J149GDe=s<ftM(tr<t*@sEM{Qf76xHPhFNnYfP!|OE{-7;
      zjI0MY3OYE5upapO?DR{I1pyyR7cx(jY7y^{FfMCvb5IaiQM`NJfeQjFwttKJyJNq@
      hveI=@x=fAo=hV3$-MIWu9%vGSr>mdKI;RB2CICA_GnfDX
    
  • You can use the textconv gitattribute to have git diff show a human-readable diff for binary files, or for parts of binary files. For example, for *.jpg files it can be the difference in EXIF information; for PDF files it can be the difference between their text representations (produced by pdftotext or something like that).
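As a sketch of the textconv idea: git runs the configured command with the path of the file as its argument and diffs whatever text the command prints. The Python script below is one hypothetical converter that prints a hex dump. The script name hexconv.py, the driver name "hex" and the *.bin pattern are assumptions made up for this example, not anything from the answer.

```python
# Hypothetical wiring (names are illustrative):
#   .gitattributes:   *.bin diff=hex
#   git config diff.hex.textconv "python3 /path/to/hexconv.py"
import sys


def to_text(data, width=16):
    """Render bytes as one hex line per `width` bytes, so a line-based
    diff can show which offsets changed."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}")
    return "\n".join(lines) + ("\n" if lines else "")


def main(argv):
    # git appends the path of the file to convert as the last argument.
    if len(argv) != 2:
        print("usage: hexconv.py FILE", file=sys.stderr)
        return 2
    with open(argv[1], "rb") as fh:
        sys.stdout.write(to_text(fh.read()))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Note that with a fixed-width dump like this, inserting or removing bytes in the middle of a file shifts every later line, so the diff is only easy to read for in-place changes. A format-aware converter (EXIF fields, extracted PDF text) usually gives a more useful diff.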

HTH.

Solution 3 - Git

If you've got really large binary files, you can use git-annex to store the data outside of the repository. Check out: http://git-annex.branchable.com/

Solution 4 - Git

Well, git is good with binaries, but it won't handle them like text files. Merging binary files doesn't really make sense: a diff on a JPEG will never tell you anything useful. Git works very well with text files, and is probably about as bad as every other solution with binary files!
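As a rough illustration of how git decides a file is not text: to my knowledge, its built-in heuristic calls content binary when a NUL byte shows up in about the first 8000 bytes (settings in .gitattributes can override this). The Python sketch below mimics that check. It is not git's code, and the function name looks_binary is made up here.

```python
FIRST_FEW_BYTES = 8000  # size of the prefix inspected (assumed to match git)


def looks_binary(data):
    """Return True if a NUL byte appears in the inspected prefix."""
    return b"\0" in data[:FIRST_FEW_BYTES]


if __name__ == "__main__":
    # A PNG header contains NUL bytes early on; plain text does not.
    print(looks_binary(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
    print(looks_binary(b"just some text\n"))
```

Once content is classified as binary, git diff only reports that the files differ instead of showing line changes, which is the behaviour described above.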

Solution 5 - Git

If you want a solution for versioning, you might want to consider git-lfs, which keeps only a lightweight pointer to your file in the repository.

This means that when you clone your repo, it doesn't download all the versions, only the one that is checked out.
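To show what such a pointer looks like: in the Git LFS pointer format, the repository holds a small text file with a spec version line, the SHA-256 of the real content and its size, while the content itself lives on the LFS server. Here is a minimal Python sketch that builds a pointer for some bytes. The function name lfs_pointer is made up for illustration, and in practice git-lfs generates these files for you.

```python
import hashlib


def lfs_pointer(content):
    """Build the text of a Git LFS pointer file for `content` (bytes)."""
    digest = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{digest}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Ten megabytes of data still yield a pointer of about 130 bytes.
    big = b"\0" * (10 * 1024 * 1024)
    pointer = lfs_pointer(big)
    print(pointer, end="")
    print(len(pointer), "bytes in the pointer")
```

Because history only contains these small pointers, a clone fetches the large content just for the revision being checked out.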

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author        Original Content on Stackoverflow
Question           user34537              View Question on Stackoverflow
Solution 1 - Git   ndim                   View Answer on Stackoverflow
Solution 2 - Git   Jakub Narębski         View Answer on Stackoverflow
Solution 3 - Git   John Gibb              View Answer on Stackoverflow
Solution 4 - Git   Loïc Faure-Lacroix     View Answer on Stackoverflow
Solution 5 - Git   danfromisrael          View Answer on Stackoverflow