How to remove a file that is too large from a commit when my branch is ahead of master by 5 commits

Git Problem Overview


I've been stuck all day on this issue, looking for an answer here :( ...

Context

I'm working alone on a project, and until now I have used GitHub to keep a copy of my work somewhere other than my computer. Unfortunately, I added a very large file to the local repository: 300 MB (which exceeds GitHub's limit).

What I did

Here is a history of what I did:

  1. I (dumbly) added everything to the index:

     git add *

  2. I committed the changes:

     git commit -m "Blablabla"

  3. I tried to push to origin master:

     git push origin master
    

It took a while, so I just pressed CTRL+C and repeated steps 2 and 3 four times, until I realised that a file was too large to be pushed to GitHub.

  4. I made the terrible mistake of deleting my large file (I don't remember whether I used git rm or a plain rm).

  5. I followed the instructions at https://help.github.com/articles/remove-sensitive-data

  6. When I try to run git filter-branch, I get the following error: "Cannot rewrite branches: You have unstaged changes."

Thanks in advance!

Git Solutions


Solution 1 - Git

A simple solution I used:

  1. Run git reset HEAD^ once for each commit you want to undo (or git reset HEAD~N to undo N commits in one go). This keeps your changes and the current state of your files in the working tree; only the commits themselves are discarded.

  2. Once the commits are undone, you can think about how to re-commit your files in a better way, e.g. by removing or ignoring the huge files, then adding what you want and committing again. Alternatively, use Git LFS to track those huge files.
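
As a rough sketch of the two steps above, the following script builds a throwaway repository, undoes three commits with a mixed reset, and recommits without the large file. The file names (big.bin, notes.txt), the number of commits, and the choice of .gitignore to keep the file out are all assumptions made for illustration; none of them come from the question.

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# One good base commit, then three commits; the first of them adds the large file.
echo "base" > notes.txt
git add notes.txt && git commit -q -m "base"
echo "pretend this is 300 MB" > big.bin    # hypothetical large file
git add big.bin && git commit -q -m "oops"
echo "more" >> notes.txt && git commit -q -am "work 1"
echo "even more" >> notes.txt && git commit -q -am "work 2"

# Step 1: undo the last three commits; the files on disk stay as they are.
git reset -q HEAD~3

# Step 2: ignore the large file, then recommit everything else.
echo "big.bin" > .gitignore
git add .
git commit -q -m "work, without the large file"

git log --oneline
```

After this, big.bin is still on disk but is no longer part of any commit on the branch, so a push would not include it.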


Edit: this answer also applies if, for instance, your commits needed identification (e.g. username and email) and you need to add the proper credentials after having committed. You can undo things the same way.

Question: does someone have a way to pick out just the bad commit and change it directly? I'm asking especially for the case of someone who only needs to re-identify their commits, as described here, where the files themselves do not need to change. Only the commits need fixing.

Solution 2 - Git

Deleting your file counts as a change, and that is the unstaged change Git is complaining about. If you run git status, you should see the file listed as deleted. To undo this change, run git checkout -- <filename>. The file will be back and your branch should be clean. You can also run git reset --hard, which brings your working tree back to the state of your last commit.

I am assuming that it is the last commit that contains the very large file you want to remove. In that case you can run git reset HEAD~ and then redo the commit without adding the large file. After that you should be able to git push without a problem.

If the file is not in the last commit, you can carry out the final steps without a problem. You just need to get your changes either committed or removed first.

http://git-scm.com/book/en/Git-Tools-Rewriting-History

Solution 3 - Git

The GitHub solution is pretty neat. I had made a few commits before pushing, so the file was harder to undo. GitHub's solution is:

Removing the file added in an older commit

If the large file was added in an earlier commit, you will need to remove it from your repository history. The quickest way to do this is with The BFG (a faster, simpler alternative to git-filter-branch):

bfg --strip-blobs-bigger-than 50M
# Git history will be cleaned - files in your latest commit will *not* be touched

https://help.github.com/articles/working-with-large-files/

https://rtyley.github.io/bfg-repo-cleaner/
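
Before rewriting history, it can help to see which blobs are actually large and under which path they were committed. The sketch below uses only standard Git plumbing commands; the helper name largest_blobs, the file names, and the sizes are made up for the demonstration, and the scratch repository exists only so the script can be run as-is.

```shell
#!/bin/sh
set -e

# List the N largest blobs reachable from any ref, as "<bytes> <path>".
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        grep '^blob ' |
        cut -d' ' -f3- |
        sort -rn |
        head -n "${1:-10}"
}

# Scratch repository to try it on.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "tiny" > small.txt
head -c 2000 /dev/zero > big.bin           # hypothetical "large" file, 2000 bytes
git add . && git commit -q -m "add files"
git rm -q big.bin && git commit -q -m "delete large file"

# big.bin is gone from the latest commit but is still listed,
# because the blob remains in the history.
largest_blobs 1
```

In a real repository, running largest_blobs from the repository root shows whether a file deleted in a later commit is still being carried along in earlier ones.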

Solution 4 - Git

This is in reference to the BFG answer above. I would comment directly, but as a new user with low reputation I have no idea how to do so.

You may want to run git gc first to repack the repository.

I had issues getting BFG to work until I did so. This appears to be a common issue if you have only been working in a local repo and are preparing to put it on a remote for the first time.

The relevant Google hit that tipped me off: https://github.com/rtyley/bfg-repo-cleaner/issues/65
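
To see what repacking changes, git count-objects -v reports how many objects are loose (count) and how many are in pack files (in-pack). The following is a small sketch in a scratch repository; the file name and commit messages are invented for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > a.txt
git add a.txt && git commit -q -m "first"
echo "two" >> a.txt && git commit -q -am "second"

# A repo that has only ever been used locally typically has loose objects only.
git count-objects -v

# Repack everything into a pack file.
git gc -q

loose_after=$(git count-objects -v | awk '/^count:/ { print $2 }')
packed_after=$(git count-objects -v | awk '/^in-pack:/ { print $2 }')
echo "loose: $loose_after, packed: $packed_after"
```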

Solution 5 - Git

It seems your only problem is having unstaged changes. You didn't give any detail about what was actually out of sync, so this is a shot in the dark, but assuming you deleted the file with a plain rm in step 4, you can bring it back from the index with:

git checkout large_file

If not, you're on your own. Your goal is to make sure that your index and your working tree are in the same state. You will know you are there when git status reports "nothing to commit, working directory clean".

The nuclear option to ensure a clean tree is git reset --hard. If you want to try that, back up your working tree and repository beforehand.

Once your working copy is clean, you can proceed with your steps 5 and 6.
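
One way to check for a clean state from a script, rather than by reading git status, is sketched below. The helper name is_clean and the file name big.bin are hypothetical; the two git diff invocations are standard, and the scratch repository only exists to make the example runnable.

```shell
#!/bin/sh
set -e

# Succeeds only if the working tree matches the index and the index matches HEAD.
is_clean() {
    git diff --quiet || return 1            # unstaged changes present
    git diff --cached --quiet || return 1   # staged but uncommitted changes present
    return 0
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "pretend this is 300 MB" > big.bin     # hypothetical large file
git add big.bin && git commit -q -m "add large file"

# Delete the file with a plain rm, as assumed in the answer above.
rm big.bin
if is_clean; then before=clean; else before=dirty; fi

# Bring it back from the index.
git checkout big.bin
if is_clean; then after=clean; else after=dirty; fi

echo "before restore: $before, after restore: $after"
```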

Solution 6 - Git

I keep running into this problem, and I don't seem to learn not to do it. The solutions offered here have worked for me before, but for some reason not this time. Here is what did work (from https://medium.com/analytics-vidhya/tutorial-removing-large-files-from-git-78dbf4cf83a):

To remove the large file from the index (the file itself stays on disk):

git rm --cached <filename>

Then, to rewrite the most recent commit without it:

git commit --amend -C HEAD

Then you can push your amended commit with:

git push
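
A quick way to convince yourself of what these commands do is to try them in a scratch repository. In the sketch below the file names are invented, and the large file is assumed to have been added in the most recent commit, which is the case that amending can fix.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > notes.txt
git add notes.txt && git commit -q -m "base"

# The most recent commit accidentally includes the large file.
echo "more" >> notes.txt
echo "pretend this is 300 MB" > big.bin     # hypothetical large file
git add . && git commit -q -m "work"

# Drop the file from the index only, then rewrite the last commit, reusing its message.
git rm -q --cached big.bin
git commit -q --amend -C HEAD

# The amended commit no longer contains big.bin, but the file is still on disk.
git ls-tree -r --name-only HEAD
ls
```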

Solution 7 - Git

Here is what worked for me:

  1. Download and install BFG Repo-Cleaner (BFG), which is available at https://rtyley.github.io/bfg-repo-cleaner/ (my download was bfg-1.13.0.jar).
  2. A potentially helpful place to put the downloaded jar file is ${JAVA_HOME}/lib. That is what I did, because I want Java-specific libraries like this one in a somewhat sensible location, since they are not installed like ordinary Windows programs. You may wish to rename the jar file to bfg.jar to keep things simple. Below, where I write bfg.jar, I actually mean bfg-1.13.0.jar in my case.
  3. Run java -jar ${JAVA_HOME}/lib/bfg.jar --delete-files <file_name> --no-blob-protection . where the trailing . points BFG at the repository in the current directory. Replace the whole of <file_name> with the name of the file that is causing the issue. Note that only the file name is needed, not the path to the file.
  4. Run git reflog expire --expire=now --all && git gc --prune=now --aggressive to complete the BFG cleaning job.
  5. Finally, run git push origin main --force to push any outstanding local commits.
  6. If everything up to this point succeeded, your problem is solved.
  7. Going forward, always check that you are not inadvertently adding very large files to Git if you want to avoid this problem recurring.
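
Step 4 matters because an object that no branch references any more is still kept alive by the reflog until it expires. The sketch below illustrates this with plain Git in a scratch repository, without BFG; the branch name tmp and the file names are made up, and --aggressive is left out only to keep the demo fast.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > notes.txt
git add notes.txt && git commit -q -m "base"

# Commit a "large" file on a temporary branch, then throw the branch away.
git checkout -q -b tmp
echo "pretend this is 300 MB" > big.bin     # hypothetical large file
git add big.bin && git commit -q -m "add large file"
blob=$(git rev-parse HEAD:big.bin)
git checkout -q -                           # back to the original branch
git branch -q -D tmp

# No branch references the blob, but the HEAD reflog still does.
if git cat-file -e "$blob" 2>/dev/null; then before=present; else before=gone; fi

# Expire the reflog and prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=gone; fi
echo "before cleanup: $before, after cleanup: $after"
```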

Solution 8 - Git

Copy newest Repo state

cp -r original_repo repo_tmp

Reset the original repo to the state before the large file was committed

cd original_repo && git reset --hard {commit_before_large_file}

Remove .git from repo_tmp, so we only get the contents

cd .. && rm -rf repo_tmp/.git

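Copy the contents of repo_tmp (the newest repo state) over the original_repo folder, replacing existing files. The trailing /. copies the contents rather than nesting repo_tmp inside original_repo.

cp -r repo_tmp/. original_repo/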

Make sure the large file is no longer among the copied files, then add, commit, and push, and you are good to go

git add . && git commit -m "be gone large file" && git push

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Shlag Stag | View Question on Stack Overflow
Solution 1 - Git | Guillaume Chevalier | View Answer on Stack Overflow
Solution 2 - Git | Schleis | View Answer on Stack Overflow
Solution 3 - Git | JohnWolf | View Answer on Stack Overflow
Solution 4 - Git | Joseph Weaver | View Answer on Stack Overflow
Solution 5 - Git | JB. | View Answer on Stack Overflow
Solution 6 - Git | Dylan_Gomes | View Answer on Stack Overflow
Solution 7 - Git | chewitty | View Answer on Stack Overflow
Solution 8 - Git | Andrey Bulezyuk | View Answer on Stack Overflow