Why does Git treat this text file as a binary file?

Git Problem Overview


I wonder why git tells me this?

$ git diff MyFile.txt
diff --git a/MyFile.txt b/MyFile.txt
index d41a4f3..15dcfa2 100644
Binary files a/MyFile.txt and b/MyFile.txt differ

Aren't they text files?

I have checked .gitattributes and it is empty. Why am I getting this message? I can no longer get diffs the way I used to.

Added:

I've noticed there is an @ in the file permissions. What is this? Could it be the reason?

$ ls -all
drwxr-xr-x   5 nacho4d  staff    170 28 Jul 17:07 .
drwxr-xr-x  16 nacho4d  staff    544 28 Jul 16:39 ..
-rw-r--r--@  1 nacho4d  staff   6148 28 Jul 16:15 .DS_Store
-rw-r--r--@  1 nacho4d  staff    746 28 Jul 17:07 MyFile.txt
-rw-r--r--   1 nacho4d  staff  22538  5 Apr 16:18 OtherFile.txt

Git Solutions


Solution 1 - Git

It simply means that Git decides by inspecting the actual content of the file. It doesn't know that any given extension is not a binary file; you can use the attributes file if you want to tell it explicitly (see the man pages).

Having inspected the file's contents, it has seen bytes that aren't basic ASCII characters. If the file is UTF-16, I expect it will contain 'funny' characters (NUL bytes, for one), so Git thinks it's binary.

There are ways of telling Git that a file uses internationalisation (i18n) or extended character formats. I'm not sufficiently up on the exact method for setting that - you may need to RT[Full]M ;-)

Edit: a quick search of SO found can-i-make-git-recognize-a-utf-16-file-as-text which should give you a few clues.
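
One way to act on those clues, sketched under assumptions rather than as the definitive fix: a textconv diff driver that pipes the file through iconv before diffing. The driver name utf16 is arbitrary, the throwaway repository and demo identity exist only to make the sketch self-contained, and it assumes iconv is installed and the file starts with a UTF-16 byte-order mark.

```shell
# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Commit a small UTF-16 file, then modify it.
printf 'hello\n' | iconv -f UTF-8 -t UTF-16 > MyFile.txt
git add MyFile.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Add UTF-16 file"
printf 'hello world\n' | iconv -f UTF-8 -t UTF-16 > MyFile.txt

# UTF-16 stores ASCII letters with NUL bytes, so the plain diff reports a binary file.
plain_diff=$(git diff MyFile.txt)

# Hypothetical driver name "utf16": convert to UTF-8 for diffing only.
echo 'MyFile.txt diff=utf16' > .gitattributes
git config diff.utf16.textconv 'iconv -f UTF-16 -t UTF-8'
text_diff=$(git diff MyFile.txt)

printf '%s\n' "$text_diff"
```

The file on disk stays UTF-16; only the diff view is converted.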

Solution 2 - Git

If you have not set the type of a file, Git tries to determine it automatically, and a file with really long lines and maybe some wide characters (e.g. Unicode) may be treated as binary. With the .gitattributes file you can define how Git interprets a file. Setting the diff attribute manually makes Git interpret the file content as text and produce a regular diff.

Just add a .gitattributes file to your repository root folder and set the diff attribute for the relevant paths or files. Here's an example:

src/Acme/DemoBundle/Resources/public/js/i18n/* diff
doc/Help/NothingToSay.yml                      diff
*.css                                          diff

If you want to check if there are attributes set on a file, you can do that with the help of git check-attr

git check-attr --all -- src/my_file.txt

Another good reference on Git attributes is the gitattributes documentation (git help attributes).
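
To see the effect of the diff attribute end to end, here is a minimal sketch in a throwaway repository. The file name notes.txt, its content and the demo identity are made up for illustration. It uses git diff --numstat, which prints dashes instead of line counts for files Git considers binary.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# A file containing a NUL byte, which Git's content check reports as binary.
printf 'first\000\n' > notes.txt
git add notes.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Add notes"
printf 'second\000\n' > notes.txt

# Binary files show "-" in place of added/deleted line counts.
before=$(git diff --numstat notes.txt)

# Set the diff attribute for all .txt files.
echo '*.txt diff' > .gitattributes
attr=$(git check-attr diff -- notes.txt)
after=$(git diff --numstat notes.txt)

printf '%s\n%s\n%s\n' "$before" "$attr" "$after"
```

After the attribute is set, check-attr should report "notes.txt: diff: set" and numstat should show real line counts.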

Solution 3 - Git

I was having this issue where Git GUI and SourceTree were treating Java/JS files as binary and thus wouldn't show a diff.

Creating a file named attributes in .git/info with the following content solved the problem:

*.java diff
*.js diff
*.pl diff
*.txt diff
*.ts diff
*.html diff
*.sh diff
*.xml diff

If you would like this to apply to all repositories, put the same content in the file $HOME/.config/git/attributes.
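
A small sketch of the per-repository variant, run in a throwaway repository so nothing real is modified. The paths src/Main.java and README.md are hypothetical and do not need to exist for git check-attr to answer.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ask Git where the per-repository attributes file lives (normally .git/info/attributes).
attr_file=$(git rev-parse --git-path info/attributes)
mkdir -p "$(dirname "$attr_file")"

# Patterns here apply to this clone only and are never committed.
printf '%s\n' '*.java diff' '*.js diff' >> "$attr_file"

# Check which paths now carry the diff attribute.
result=$(git check-attr diff -- src/Main.java README.md)
printf '%s\n' "$result"
```

Only the .java path should be reported as set; README.md stays unspecified because no pattern matches it.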

Solution 4 - Git

A text file can even be reported as binary if it contains one super-long line, at least in SmartGit. I broke up a long String, turning it into several source code lines, and suddenly the file went from being 'binary' to a text file that I could see in SmartGit.

So don't keep typing too far to the right without hitting 'Enter' in your editor, otherwise the file may later be shown as binary.
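
To check whether a file has such a line, a short helper like the one below can report the longest line length. The function name longest_line and the sample file are made up for this sketch; what counts as "too long" depends on the tool, so no threshold is assumed here.

```shell
# Print the length of the longest line in a file (0 for an empty file).
longest_line() {
  awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$1"
}

# Hypothetical sample with one short and one longer line.
long_sample=$(mktemp)
printf 'ab\nabcdef\n' > "$long_sample"
longest_line "$long_sample"   # prints 6
```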

Solution 5 - Git

I had this same problem after editing one of my files in a new editor. It turned out the new editor saved in a different encoding (labelled "Unicode", which in many Windows editors means UTF-16) than my old editor (UTF-8). I told the new editor to save my files as UTF-8, and Git showed my changes properly again instead of treating the file as binary.

I think the problem was simply that Git doesn't know how to compare files with different encodings, so the encoding you use shouldn't matter as long as it stays consistent.

I didn't test it, but I suspect that if I had just committed the file with the new encoding, later changes would have diffed properly, since Git would then have been comparing two files in the same encoding rather than a UTF-8 file against a "Unicode" one. That guess may not hold for UTF-16, though, because its NUL bytes are what usually trigger Git's binary detection.

You can use an app like Notepad++ to see and change the encoding of a text file: open the file in Notepad++ and use the Encoding menu.
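
Without a GUI editor, the same conversion can be sketched with iconv. This assumes iconv is available and that "Unicode" really meant UTF-16 with a byte-order mark; the temporary files and their content are placeholders.

```shell
# Stand-in for a file the new editor saved as UTF-16 (hypothetical content).
utf16_file=$(mktemp)
utf8_file=$(mktemp)
printf 'some text\n' | iconv -f UTF-8 -t UTF-16 > "$utf16_file"

# Convert to UTF-8. Write to a separate file: redirecting onto the input would truncate it.
iconv -f UTF-16 -t UTF-8 "$utf16_file" > "$utf8_file"

cat "$utf8_file"
```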

Solution 6 - Git

This is also caused (on Windows at least) by text files saved as UTF-8 with BOM. Changing the encoding to plain UTF-8 (without BOM) immediately made Git see the file as type=text.
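
A minimal sketch for detecting and stripping a UTF-8 BOM (the bytes EF BB BF) from the command line. The helper names has_utf8_bom and strip_utf8_bom and the sample file are invented for this example.

```shell
# Succeeds if the file starts with the UTF-8 byte-order mark EF BB BF.
has_utf8_bom() {
  [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Rewrite the file without its first three bytes, but only if they are a BOM.
strip_utf8_bom() {
  if has_utf8_bom "$1"; then
    tail -c +4 "$1" > "$1.nobom" && mv "$1.nobom" "$1"
  fi
}

# Hypothetical sample: BOM followed by "hi".
bom_sample=$(mktemp)
printf '\357\273\277hi\n' > "$bom_sample"
strip_utf8_bom "$bom_sample"
cat "$bom_sample"
```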

Solution 7 - Git

I had the same problem. I found this thread while searching for a solution, but none of the answers gave me a clue. After some digging I think I found the reason; the example below shows it.

    echo "new text" > new.txt
    git add new.txt
    git commit -m "dummy"

At this point, new.txt is considered a text file.

    echo -e "newer text\000" > new.txt
    git diff

You will get this result:

    diff --git a/new.txt b/new.txt
    index fa49b07..410428c 100644
    Binary files a/new.txt and b/new.txt differ

The NUL byte written by \000 is what makes Git consider the file binary. To force a text diff anyway, use -a (short for --text):

    git diff -a

You will get the following:

    diff --git a/new.txt b/new.txt
    index fa49b07..9664e3f 100644
    --- a/new.txt
    +++ b/new.txt
    @@ -1 +1 @@
    -new text
    +newer text^@
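
To check a file for this before Git complains, the sketch below counts NUL bytes near the start of a file. It assumes, from my reading of Git's sources, that the built-in check only looks at roughly the first 8000 bytes; the helper name count_leading_nuls and the sample file are made up.

```shell
# Count NUL bytes in the first 8000 bytes of a file (assumed to be the window Git inspects).
count_leading_nuls() {
  head -c 8000 "$1" | tr -cd '\000' | wc -c | tr -d ' '
}

# Same content as the new.txt example: text followed by one NUL byte.
nul_sample=$(mktemp)
printf 'newer text\000\n' > "$nul_sample"
count_leading_nuls "$nul_sample"   # prints 1
```

A result of 0 suggests the binary classification comes from somewhere else, such as an attributes file.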

Solution 8 - Git

We had a case where an .html file was seen as binary whenever we tried to change it. Very uncool to not see diffs. To be honest, I didn't check all the solutions here, but what worked for us was the following:

  1. Removed the file (actually moved it to my Desktop) and committed the deletion. Git says: Deleted file with mode 100644 (Regular) Binary file differs
  2. Re-added the file (actually moved it from my Desktop back into the project). Git says: New file with mode 100644 (Regular) 1 chunk, 135 insertions, 0 deletions. The file is now added as a regular text file.

From now on, any change I make to the file is shown as a regular text diff. You could also squash these commits (1, 2, and 3, with 3 being the actual change you make), but I prefer to be able to see in the future what I did. Squashing only 1 and 2 will show a binary change.

Solution 9 - Git

Try using the file command to view the encoding details:

cd directory/of/interest
file *

It produces useful output like this:

$ file *
CR6Series_stats resaved.dat: ASCII text, with very long lines, with CRLF line terminators
CR6Series_stats utf8.dat:    UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators
CR6Series_stats.dat:         ASCII text, with very long lines, with CRLF line terminators
readme.md:                   ASCII text, with CRLF line terminators

Solution 10 - Git

I had an instance where .gitignore contained a double \r (carriage return) sequence on purpose.

That file was identified as binary by git. Adding a .gitattributes file helped.

# .gitattributes file
.gitignore diff

Solution 11 - Git

If git check-attr --all -- src/my_file.txt indicates that your file is flagged as binary, and you haven't set it as binary in .gitattributes, check for it in .git/info/attributes as well.

Solution 12 - Git

I just spent several hours going through everything on this list trying to work out why one of the test projects in my solution wasn't adding any tests to the explorer.

It turned out in my case that somehow (probably due to a poor git merge somewhere) VS had lost its reference to the project altogether. It was still building, but I noticed that it only built the dependencies.

I then noticed that the project wasn't showing up in the dependencies list itself, so I removed and re-added the test project, and all my tests finally showed up.

Solution 13 - Git

Rename Aux.js to something else, such as Sig.js (possibly needed because AUX is a reserved device name on Windows).

SourceTree still shows it as a binary file, but you can stage (add) and commit it.

Solution 14 - Git

I had a similar issue after pasting some text from a binary Kafka message, which inserted non-visible characters and caused Git to think the file was binary.

I found the offending characters by searching the file with the regex [^ -~\n\r\t]+.

  • [ match characters in this set
  • ^ match characters not in this set
  • ' -~' (space, hyphen, tilde) matches all characters from ' ' (space) to '~'
  • \n newline
  • \r carriage return
  • \t tab
  • ] close set
  • + match one or more of these characters
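
The same search can be sketched with grep. This variant is an approximation, not the exact regex above: grep bracket expressions do not understand \n, \r or \t, so it uses the POSIX classes [:print:] and [:space:] in the C locale, which also lets vertical tab and form feed through. The -a flag (treat the file as text) is assumed to be available, as it is in GNU and BSD grep; the helper name and sample content are made up.

```shell
# Print the line numbers that contain bytes outside printable ASCII and whitespace.
find_suspect_lines() {
  LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1" | cut -d: -f1
}

# Hypothetical sample: line 2 hides a control character (octal 001).
kafka_sample=$(mktemp)
printf 'ok line\nbad\001byte\ntab\tand CR\r\n' > "$kafka_sample"
find_suspect_lines "$kafka_sample"   # prints 2
```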

Solution 15 - Git

The reason my file was showing as binary (and I was getting no diff using git diff or SourceTree) was that the file in question had been added as a Git LFS file.

Git (and SourceTree) do not seem to be able to diff text files added to LFS. However, after a bit of hunting I was able to fix this by running:

git config --global diff.lfs.textconv cat

This came from the suggestion at https://github.com/git-lfs/git-lfs/issues/440#issuecomment-501007460

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author     Original Content on Stackoverflow
Question            nacho4d             View Question on Stackoverflow
Solution 1 - Git    Philip Oakley       View Answer on Stackoverflow
Solution 2 - Git    naitsirch           View Answer on Stackoverflow
Solution 3 - Git    Hemant              View Answer on Stackoverflow
Solution 4 - Git    Chris Murphy        View Answer on Stackoverflow
Solution 5 - Git    deadlydog           View Answer on Stackoverflow
Solution 6 - Git    Robba               View Answer on Stackoverflow
Solution 7 - Git    howard              View Answer on Stackoverflow
Solution 8 - Git    StuFF mc            View Answer on Stackoverflow
Solution 9 - Git    patricktokeeffe     View Answer on Stackoverflow
Solution 10 - Git   Erik Zivkovic       View Answer on Stackoverflow
Solution 11 - Git   coberlin            View Answer on Stackoverflow
Solution 12 - Git   cirrus              View Answer on Stackoverflow
Solution 13 - Git   oscarz              View Answer on Stackoverflow
Solution 14 - Git   Martyn Davis        View Answer on Stackoverflow
Solution 15 - Git   Oliver Pearmain     View Answer on Stackoverflow