How to remove a file from Git history?

Git, GitHub

Git Problem Overview


Some time ago I added information (files) that must stay private. Removing the files from the project is not a problem, but I also need to remove them from the Git history.

I use Git and GitHub (private account).

Note: A similar case is covered in this thread, but here the file is an old one that was added to a feature branch; that branch was merged into a development branch and finally into master, and many changes have been made since. So the situation is not the same: what is needed is to rewrite the history and hide those files for privacy.

Git Solutions


Solution 1 - Git

I have found this answer and it helped:

git filter-branch --index-filter \
    'git rm -rf --cached --ignore-unmatch path_to_file' HEAD

Found it here https://myopswork.com/how-remove-files-completely-from-git-repository-history-47ed3e0c4c35
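To see what the command does before touching a real project, it can be rehearsed on a throwaway repository. The sketch below is only an illustration: the file names (README.txt, secret.txt) and the demo identity are hypothetical, and the check looks at HEAD only, because filter-branch keeps a backup of the old history under refs/original.

```shell
# Throwaway repository so the real project is never touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "public"  > README.txt
echo "hunter2" > secret.txt
git add README.txt secret.txt
git commit -q -m "Add README and secret"

echo "more" >> README.txt
git commit -q -am "Update README"

# Same command as above, with path_to_file replaced by secret.txt.
# FILTER_BRANCH_SQUELCH_WARNING=1 skips the pause that newer Git versions
# add while printing their filter-branch warning.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --index-filter \
    'git rm -rf --cached --ignore-unmatch secret.txt' HEAD

# Commits reachable from HEAD that still touch secret.txt.
remaining=$(git log --oneline HEAD -- secret.txt)
echo "commits still touching secret.txt: ${remaining:-none}"
```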

Solution 2 - Git

If you committed that file recently, or if it was changed in only one or two commits, then I'd suggest using rebase and cherry-pick to remove those particular commits.
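One possible way to drop a single recent commit is git rebase --onto, shown here as a sketch on a throwaway repository. The file names and the demo identity are hypothetical, and the example assumes the later commits do not touch the private file, so the rebase finishes without conflicts.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

echo "hunter2" > secret.txt
git add secret.txt
git commit -q -m "Add secret by mistake"
bad=$(git rev-parse HEAD)        # the commit to get rid of

echo "feature" >> app.txt
git commit -q -am "Real work"

# Replay everything after $bad onto the parent of $bad,
# which leaves $bad itself out of the new history.
git rebase -q --onto "$bad^" "$bad"

git log --oneline
```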

Otherwise, you'd have to rewrite the entire history.

git filter-branch --tree-filter 'rm -f <path_to_file>' HEAD

When you are satisfied with the changes and have made sure that everything looks fine, you need to update all remote branches:

git push origin --force --all

Note: This is a complex operation, and you must be aware of what you are doing. First try it on a demo repository to see how it works. You also need to let other developers know about it, so that they don't make any changes in the meantime.
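A rehearsal setup could look like the sketch below: one mirror clone kept untouched as a backup, and one ordinary clone used for trying out the rewrite. The directory names are hypothetical, and the small repository created at the top only stands in for the real project.

```shell
# Stand-in for the real repository (hypothetical content).
work=$(mktemp -d)
git init -q "$work/project"
git -C "$work/project" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"

# Full backup of every ref, kept untouched in case the rewrite goes wrong.
git clone -q --mirror "$work/project" "$work/project-backup.git"

# Separate scratch copy for rehearsing the history rewrite.
git clone -q "$work/project" "$work/project-rehearsal"

ls "$work"
```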

Solution 3 - Git

git-filter-repo

Git itself recommends the third-party add-on git-filter-repo (the recommendation is printed when the git filter-branch command is executed). There is a long list of reasons why it is better than the alternatives (https://github.com/newren/git-filter-repo#why-filter-repo-instead-of-other-alternatives); in my experience it is very simple and very fast.

This command removes the file from all commits in all branches:

git filter-repo --path <path to the file or directory> --invert-paths

Multiple paths can be specified by using multiple --path parameters. You can find detailed documentation here: https://www.mankier.com/1/git-filter-repo
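As a small sketch of the multi-path form, the helper below passes one --path per argument. The function name purge_paths, the DRY_RUN switch, and the example paths are all hypothetical; with DRY_RUN set, the command is only printed, so it can be inspected before anything is rewritten.

```shell
# purge_paths PATH...: remove every given path from all commits.
# With DRY_RUN set, the git filter-repo command is printed instead of run.
purge_paths() {
    n=$#
    # Turn "a b" into "--path a --path b" while keeping each argument intact.
    while [ "$n" -gt 0 ]; do
        set -- "$@" --path "$1"
        shift
        n=$((n - 1))
    done
    ${DRY_RUN:+echo} git filter-repo "$@" --invert-paths
}

# Hypothetical paths; only prints the command because DRY_RUN is set.
DRY_RUN=1 purge_paths config/secrets.yml keys/
```

Running the same call without DRY_RUN would perform the actual rewrite, so it is best tried on a fresh clone first.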

Solution 4 - Git

Remove the file and rewrite history starting from the commit that added it (this creates new commit hashes for that commit and every commit after it):

There are two ways:

  1. Using git-filter-branch:

git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch <path to the file or directory>' --prune-empty --tag-name-filter cat -- --all

  2. Using git-filter-repo:

pip3 install git-filter-repo
git filter-repo --path <path to the file or directory> --invert-paths

Now force push the repo with git push origin --force --all and tell your collaborators to rebase.

Solution 5 - Git

I read this GitHub article, which led me to the following command (similar to the accepted answer, but a bit more robust):

git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA" --prune-empty --tag-name-filter cat -- --all
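After a filter-branch run, the old objects are still stored in the local repository, because the previous history is kept under refs/original and in the reflog. The sketch below rehearses the command on a throwaway repository and then purges those leftovers, following the cleanup steps described in the git-filter-branch documentation. The file names and the demo identity are hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "public"  > README.txt
echo "hunter2" > secret.txt
git add README.txt secret.txt
git commit -q -m "Add files"
blob=$(git rev-parse HEAD:secret.txt)   # object id of the private content

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --index-filter \
    "git rm --cached --ignore-unmatch secret.txt" \
    --prune-empty --tag-name-filter cat -- --all

# Drop the backup refs written by filter-branch, empty the reflog,
# then garbage-collect everything that is no longer reachable.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    echo "secret blob still present"
else
    echo "secret blob purged"
fi
```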

Solution 6 - Git

Using the BFG Repo-Cleaner is another viable alternative to git-filter-branch. Its authors describe it as faster, too.

Solution 7 - Git

  • First of all, add it to your .gitignore file and don't forget to commit the file :-)

  • You can use this site: http://gitignore.io to generate the .gitignore for you and add the required path to your binary files/folder(s)

  • Once you have added the file to .gitignore, you can remove the "old" binary file with BFG.
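The .gitignore step from the list above could look like the sketch below, run on a throwaway repository. The file name big.bin and the demo identity are hypothetical, and the git rm --cached call is an assumption on my part: it stops tracking the file going forward, while the old copy stays in the history until a tool such as BFG removes it.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Identity used for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "binary payload" > big.bin
git add big.bin
git commit -q -m "Add binary by mistake"

# Ignore the file, stop tracking it, and commit both changes.
echo "big.bin" >> .gitignore
git rm -q --cached big.bin
git add .gitignore
git commit -q -m "Ignore big.bin and stop tracking it"

git check-ignore -q big.bin && echo "big.bin is now ignored"
```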


How to remove big files from the repository

You can use git filter-branch or BFG. https://rtyley.github.io/bfg-repo-cleaner/

> BFG Repo-Cleaner
> an alternative to git-filter-branch.

> The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

> * Removing Crazy Big Files
> * Removing Passwords, Credentials & other Private data
Examples (from the official site)

> In all these examples bfg is an alias for java -jar bfg.jar.

# Delete all files named 'id_rsa' or 'id_dsa' :
bfg --delete-files id_{dsa,rsa}  my-repo.git


Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Marcos R. Guevara | View Question on Stackoverflow |
| Solution 1 - Git | Petro Franko | View Answer on Stackoverflow |
| Solution 2 - Git | hspandher | View Answer on Stackoverflow |
| Solution 3 - Git | Tibor Takács | View Answer on Stackoverflow |
| Solution 4 - Git | suhailvs | View Answer on Stackoverflow |
| Solution 5 - Git | vancy-pants | View Answer on Stackoverflow |
| Solution 6 - Git | c1au61o_HH | View Answer on Stackoverflow |
| Solution 7 - Git | CodeWizard | View Answer on Stackoverflow |