How to remove file from Git history?
Git Problem Overview
Some time ago I added information (files) that must be kept private. Removing it from the project is not a problem, but I also need to remove it from the Git history.
I use Git and GitHub (private account).
Note: A similar case is shown in another thread, but here an old file was added to a feature branch, that branch was merged into a development branch and finally into master, and many changes have been made since then. So it is not the same situation: what is needed is to rewrite the history and hide those files for privacy.
Git Solutions
Solution 1 - Git
I have found this answer and it helped:
git filter-branch --index-filter \
'git rm -rf --cached --ignore-unmatch path_to_file' HEAD
Found it here https://myopswork.com/how-remove-files-completely-from-git-repository-history-47ed3e0c4c35
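To see what the command above does, here is a minimal sketch that runs it in a throwaway repository. The file name secret.txt, the commit identity, and the temporary directory are assumptions made for illustration only. Setting FILTER_BRANCH_SQUELCH_WARNING=1 just suppresses the warning (and pause) that recent Git versions print before filter-branch starts.

```shell
#!/bin/sh
set -eu

# Throwaway repository; secret.txt and the identity below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "password=hunter2" > secret.txt
echo "hello" > README
git add secret.txt README
git commit -q -m "Add README and, by mistake, secret.txt"
echo "more" >> README
git commit -q -am "Update README"

# Rewrite every commit reachable from HEAD, dropping secret.txt from the index.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --index-filter \
  'git rm -rf --cached --ignore-unmatch secret.txt' HEAD >/dev/null 2>&1

# Count the commits reachable from HEAD that still touch the file.
remaining=$(git log --oneline HEAD -- secret.txt | wc -l | tr -d ' ')
echo "commits still touching secret.txt: $remaining"
```

The check deliberately looks at HEAD only: filter-branch keeps a backup of the old history under refs/original/, so a search with --all would still find the file until those backup refs are deleted.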
Solution 2 - Git
If you committed the file recently, or if it was changed in only one or two commits, then I'd suggest using rebase and cherry-pick to remove those particular commits.
Otherwise, you'd have to rewrite the entire history.
git filter-branch --tree-filter 'rm -f <path_to_file>' HEAD
When you are satisfied with the changes and have made sure that everything looks fine, you need to update all remote branches:
git push origin --force --all
Note: This is a complex operation, and you must be aware of what you are doing. Try it on a demo repository first to see how it works. You also need to let the other developers know about it, so that they don't make any changes in the meantime.
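For the first case in this answer (the file was added in a single, recent commit), one possible way to drop that commit is git rebase --onto. The sketch below is an illustration in a throwaway repository, not the answerer's exact procedure. It assumes the bad commit does nothing except add the hypothetical secret.txt, and that no later commit touches that file.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

echo "password=hunter2" > secret.txt
git add secret.txt
git commit -q -m "Add secret.txt by mistake"
bad=$(git rev-parse HEAD)

echo "v2" > app.txt
git commit -q -am "Update app.txt"

# Replay everything after $bad onto the parent of $bad, which drops $bad itself.
git rebase -q --onto "$bad^" "$bad"

count=$(git rev-list --count HEAD)
touching=$(git log --oneline HEAD -- secret.txt | wc -l | tr -d ' ')
echo "commits on branch: $count, commits touching secret.txt: $touching"
```

If the bad commit also contains changes worth keeping, or later commits modify the file, this simple form is not enough and one of the history-rewriting commands is the safer route.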
Solution 3 - Git
git-filter-repo
Git itself recommends the third-party add-on git-filter-repo (the recommendation is printed when the git filter-branch command is executed). There is a long list of reasons why it is better than the alternatives (https://github.com/newren/git-filter-repo#why-filter-repo-instead-of-other-alternatives); in my experience it is very simple and very fast.
This command removes the file from all commits in all branches:
git filter-repo --path <path to the file or directory> --invert-paths
Multiple paths can be specified by using multiple --path parameters. You can find detailed documentation here: https://www.mankier.com/1/git-filter-repo
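As an illustration of passing several --path parameters, here is a minimal sketch in a throwaway repository. The names secret.txt and keys/ are hypothetical. The script skips itself when git-filter-repo is not installed, and it passes --force because git-filter-repo refuses by default to run on anything that does not look like a fresh clone.

```shell
#!/bin/sh
set -eu

touching=0
if ! command -v git-filter-repo >/dev/null 2>&1; then
  echo "git-filter-repo is not installed; skipping the demo"
  status=skipped
else
  # Throwaway repository; file names and identity are hypothetical.
  repo=$(mktemp -d)
  cd "$repo"
  git init -q
  git config user.email "demo@example.com"
  git config user.name "Demo"

  mkdir keys
  echo "private key material" > keys/id_rsa
  echo "password=hunter2" > secret.txt
  echo "hello" > README
  git add README secret.txt keys
  git commit -q -m "Add README plus files that should never have been committed"

  # Remove one file and one directory from every commit on every branch.
  git filter-repo --force --path secret.txt --path keys --invert-paths >/dev/null 2>&1

  touching=$(git log --all --oneline -- secret.txt keys | wc -l | tr -d ' ')
  echo "commits still touching the removed paths: $touching"
  status=filtered
fi
```

On a real project the usual advice is to run this on a fresh clone rather than relying on --force, so that a mistake cannot destroy the only copy of the history.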
Solution 4 - Git
Remove the file and rewrite the history starting from the commit that added it (every rewritten commit gets a new hash).
There are two ways:
- Using git-filter-branch:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch <path to the file or directory>' --prune-empty --tag-name-filter cat -- --all
- Using git-filter-repo:
pip3 install git-filter-repo
git filter-repo --path <path to the file or directory> --invert-paths
Now force-push the repository: git push origin --force --all
and tell your collaborators to rebase.
Solution 5 - Git
I read this GitHub article, which led me to the following command (similar to the accepted answer, but a bit more robust):
git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA" --prune-empty --tag-name-filter cat -- --all
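One thing the command does not do by itself: in the local repository the old commits stay reachable through the backup refs that filter-branch writes under refs/original/ and through the reflog, so the sensitive blob is still on disk. Below is a minimal sketch of a local cleanup, run in a throwaway repository with a hypothetical secret.txt. It follows the steps listed in the filter-branch documentation for shrinking a repository; treat it as an illustration and try it on a copy first.

```shell
#!/bin/sh
set -eu

# Throwaway repository; secret.txt and the identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "password=hunter2" > secret.txt
echo "hello" > README
git add secret.txt README
git commit -q -m "Add README and, by mistake, secret.txt"
blob=$(git rev-parse HEAD:secret.txt)

export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch secret.txt" \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# 1. Delete the backup refs that filter-branch left under refs/original/.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
# 2. Expire all reflog entries so they no longer keep the old commits alive.
git reflog expire --expire=now --all
# 3. Garbage-collect and prune the now unreachable objects immediately.
git gc --quiet --prune=now

leftover_refs=$(git for-each-ref refs/original | wc -l | tr -d ' ')
touching=$(git log --all --oneline -- secret.txt | wc -l | tr -d ' ')
if git cat-file -e "$blob" 2>/dev/null; then blob_state=present; else blob_state=gone; fi
echo "backup refs: $leftover_refs, commits touching secret.txt: $touching, blob: $blob_state"
```

This only cleans the local clone. Copies on the server and in other people's clones are unaffected until the rewritten history is force-pushed and collaborators re-clone or rebase.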
Solution 6 - Git
Using the BFG Repo-Cleaner package is another viable alternative to git-filter-branch. Apparently, it is also faster...
Solution 7 - Git
- First of all, add it to your .gitignore file and don't forget to commit the file :-)
- You can use this site: http://gitignore.io to generate the .gitignore for you and add the required path to your binary files/folder(s)
- Once you have added the file to .gitignore you can remove the "old" binary file with BFG.
How to remove big files from the repository
You can use git filter-branch or BFG.
https://rtyley.github.io/bfg-repo-cleaner/
> BFG Repo-Cleaner
> an alternative to git-filter-branch.
> The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:
> * Removing Crazy Big Files
> * Removing Passwords, Credentials & other Private data
Examples (from the official site)
> In all these examples bfg is an alias for java -jar bfg.jar.
# Delete all files named 'id_rsa' or 'id_dsa' :
bfg --delete-files id_{dsa,rsa} my-repo.git