Is there a way to remove the history for a single file in Mercurial?
Version Control Problem Overview
I think I already know the answer to this but thought I would ask anyway:
We have a file that got added to a Mercurial repository with sensitive information in it. Is there any way to remove that file along with its change history without removing the whole repo?
Version Control Solutions
Solution 1 - Version Control
You are right that you cannot easily remove a particular file from Mercurial: doing so changes the changeset IDs of every changeset from the point where the file was added onward. Once the changeset IDs change, everybody has to re-clone the repository. See the Wiki page about editing history for information about the consequences of modifying the history in Mercurial.
If that is acceptable for you (for example, an internal repository in a company), then take a look at the convert extension. It can do hg → hg conversions and has a --filemap argument which can be used to exclude files, among other things.
Solution 2 - Version Control
No, you can't. Read the "Changes that should never have been" section of the Mercurial book (the "red book"), and in particular the subsection "What about sensitive changes that escape?", which contains this paragraph:
> Mercurial also does not provide a way to make a file or changeset completely disappear from history, because there is no way to enforce its disappearance; someone could easily modify their copy of Mercurial to ignore such directives. In addition, even if Mercurial provided such a capability, someone who simply hadn't pulled a “make this file disappear” changeset wouldn't be affected by it, nor would web crawlers visiting at the wrong time, disk backups, or other mechanisms. Indeed, no distributed revision control system can make data reliably vanish. Providing the illusion of such control could easily give a false sense of security, and be worse than not providing it at all.
The usual way to revert committed changes in Mercurial is the backout command (again, see the Mercurial book: "Dealing with committed changes"), but the information does not disappear from the repository. Since you never know exactly who has cloned your repository, treating a backout as a removal would give a false sense of security, as explained above.
Solution 3 - Version Control
It is possible locally, but not globally, and it changes the ID of every commit from the point at which the file was added onward. For the change to stick, you need access to every single copy of the repository, particularly the ones that people pull from or push to.
That said, I have followed the Editing History sequence described on the Mercurial wiki to remove a file from one of my repositories. This sequence assumes that revision 1301:5200a5a10d8b added the file path/to/badfile.cfg, which was not changed in any subsequent revision:
- Enable the MQ extension in your .hgrc:
    [extensions]
    mq =
- Pull recent changes from upstream:
    hg pull
- Import everything from the file addition onward into MQ:
    hg qimport -r 1301:tip
    hg qpop -a
- Remove the file from the commit that added it:
    hg qpush 1301.diff
    hg forget path/to/badfile.cfg
    hg qrefresh
- Convert the patches into new Mercurial revisions:
    hg qpush -a
    hg qfinish -a
- Push the new revisions upstream:
    hg push -f
- On the upstream repository and every single other copy, remove the old revisions:
    hg strip 5200a5a10d8b
Warning: The strip step may destroy work unless you are careful. If anybody has committed anything since the last time you pulled from upstream, you will have to rebase that work before stripping. Unfortunately, the rebase extension is not helpful here; you will have to use MQ again, converting the new commits into patches that you apply onto the new tip.
Good luck.
Solution 4 - Version Control
It can be done in under 10 minutes in a single repository, although there are consequences.
How: use hg convert as described in this excellent guide. Basically, you "convert" an Hg repo into a new Hg repo, but you get to specify a list of files to exclude during conversion. This is an excerpt of the key steps:
- Make sure all your teammates have pushed their local changes (if any) to the central repo.
- Back up your repository.
- Create a "map.txt" file:
    # this filemap is used to exclude specific files
    exclude "subdir/filename1.ext"
    exclude "subdir/filename2.ext"
    exclude "subdir2"
- Run this command:
    hg convert --filemap map.txt c:/oldrepo c:/newrepo
  NOTE: You have to use forward slashes in paths, even on Windows.
- Wait and be patient.
- You now have a new repo at c:\newrepo that does not contain the excluded files.
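For a longer exclusion list, the filemap can be generated rather than typed by hand. The sketch below assumes a hypothetical helper, make_filemap, which writes one exclude line per path and converts backslashes to forward slashes, since the filemap needs forward slashes even on Windows:

```shell
# make_filemap OUTFILE PATH...
# Hypothetical helper: write one 'exclude "PATH"' line per argument,
# converting backslashes to forward slashes on the way.
make_filemap() {
    out="$1"
    shift
    printf '%s\n' '# this filemap is used to exclude specific files' > "$out"
    for p in "$@"; do
        # tr turns every backslash in the path into a forward slash
        printf 'exclude "%s"\n' "$(printf '%s' "$p" | tr '\\' '/')" >> "$out"
    done
}

make_filemap map.txt 'subdir\filename1.ext' 'subdir/filename2.ext' 'subdir2'
cat map.txt
```

The resulting map.txt has the same content as the hand-written example and can be passed to hg convert --filemap in the same way. If the convert extension is not enabled in your configuration, running hg --config extensions.convert= convert ... enables it for a single invocation.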
As for the consequences...
- all changeset IDs after the files you want to exclude were added will be different
- the new "clean" main repository will have to be manually put in place of the existing one
- all team members will have to make new clones of the main repo
- any other services which integrate with Hg may require attention (e.g. the issue tracker, a code review system etc.)
Solution 5 - Version Control
hg transplant, then hg strip