Is version control (e.g. Subversion) applicable to document tracking?

Version Control, Tracking

Version Control Problem Overview


I am in charge of 100+ documents (Word documents, not source code) that need revision by different people in my department. Currently all the documents are in a shared folder, from which people retrieve them, revise them and save them back.

What I am doing now is looking up the "date modified" in the shared folder, opening the recently modified documents and using the "Track Changes" function in MS Word to apply the changes. I find this a bit tedious.

So would it be better and easier if I committed these documents to a version control database?

Basically I want to keep different versions of a file.


What I have learned from the answers:

  • Use Time Machine to save different versions (or Shadow Copy in Vista)

  • There is a difference between text and binary documents when you use a version control app. (I didn't know that)

  • Diff won't work on binary files

  • A notification system (e.g. email) for revisions is great

  • Google Docs revision feature.

Update:

I played around with Google Docs revision feature and feel that it is almost right for me. Just a bit annoyed with the too frequent versioning (autosaving).

But what feels right for me doesn't mean it feels right for my dept. Will they be okay with saving all these documents with Google?

Version Control Solutions


Solution 1 - Version Control

I've worked with Word documents in SVN. With TortoiseSVN, you can easily diff Word documents (between working copy and repository, or between two repository revisions). It's really slick and definitely recommended.

The other thing to do if you're using Word documents in SVN is to add the svn:needs-lock property to the Word documents. This will prevent two people from trying to edit the same document at the same time, since unfortunately there's no good way to merge Word documents.
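As a rough sketch of the svn:needs-lock step, the following Python helper walks a working copy and builds one `svn propset svn:needs-lock` command per Word document. The function name `needs_lock_commands` and the choice of extensions are assumptions made for illustration; the files must already be under version control, and the commands are only built here, not executed.

```python
from pathlib import Path

# Assumption: these extensions identify the Word documents in the working copy.
WORD_SUFFIXES = {".doc", ".docx"}


def needs_lock_commands(working_copy):
    """Return one `svn propset svn:needs-lock` command (as an argument list)
    for every Word document found below `working_copy`."""
    commands = []
    for path in sorted(Path(working_copy).rglob("*")):
        # Skip Subversion's own administrative directory.
        if ".svn" in path.parts:
            continue
        if path.is_file() and path.suffix.lower() in WORD_SUFFIXES:
            # Subversion ignores the property value; "*" is the usual convention.
            commands.append(["svn", "propset", "svn:needs-lock", "*", str(path)])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine with the svn client installed, followed by a commit so that the property reaches everyone else's working copy.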

With the above two things, handling revision controlled Word documents is at least tolerable. It certainly beats the alternative of using a shared folder and track-changes.

Solution 2 - Version Control

What on Earth are you all Word-is-binary-so-no-diff people talking about? TortoiseSVN, for example, integrates right out of the box with Word and enables you to use Word's built-in diff and merge functionality. It works just fine.

I have worked on projects that store documents in version control. It has worked out pretty well, although if people are unfamiliar with version control, they are probably going to have conceptual difficulties with things like "working copy" and "merge" and "conflict". Don't overestimate the users' capabilities when you plan your document management system.

I believe there exist big and powerful commercial solutions for all of this, as well. I'm sure if you have enough kilodollars, you can get something that fits your needs perfectly. Document management systems are a big business for big enterprise.

Solution 3 - Version Control

I guess one thing that nobody seems to have asked is whether you have a legal requirement to store the history of changes to the docs.

Whether you do or don't is going to have an impact on what solutions you can consider.

A notification mechanism for out-of-date copies is also a bundle of fun. If engineer A has a copy of a document and engineer B then edits it and commits the changes, you want engineer A to be notified that his copy is out of date.

Document control can become a real can of worms quite easily.

Maybe keep the docs under CVS or SVN and set it up so that emails are sent to whoever has checked out a copy when updates to the same doc are checked in to the repository?
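One way such a notification could be sketched for Subversion is a post-commit hook that runs `svnlook changed` and mails the list of touched documents. The Python below only covers the parsing and message-building part. The function names, the sender address and the recipient list are all hypothetical; in particular, Subversion does not record who holds a working copy, so this sketch assumes a fixed list of interested people.

```python
from email.message import EmailMessage


def changed_paths(svnlook_output):
    """Extract paths from the text printed by `svnlook changed`.

    Each line starts with a status field (for example "U   ", "A   " or
    "_U  ") that fills the first four columns, followed by the path.
    """
    return [line[4:] for line in svnlook_output.splitlines() if len(line) > 4]


def build_notification(revision, author, paths, recipients,
                       doc_suffixes=(".doc", ".docx")):
    """Build an email about changed documents, or return None if the
    commit touched no documents."""
    docs = [p for p in paths if p.lower().endswith(doc_suffixes)]
    if not docs:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Documents updated in r{revision}"
    msg["From"] = "svn@example.com"  # hypothetical sender address
    msg["To"] = ", ".join(recipients)
    body = [f"{author} committed changes to these documents:"]
    body += [f"  {p}" for p in docs]
    body.append("Your local copy may be out of date; please update.")
    msg.set_content("\n".join(body))
    return msg
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(msg)` from the hook script; the mail server details depend on the installation.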

Edit: I forgot to add: don't forget to use the binary switch, e.g. -kb for CVS, when adding a new doc. Otherwise, any byte sequences in the file that happen to match the ASCII for keyword strings will have the relevant configuration management data appended, corrupting your doc data.
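To see why keyword expansion is harmful for binary files, here is a small Python simulation. It is not CVS's own code, just a simplified stand-in that rewrites the `$Revision$` keyword the way a keyword-expanding checkout would; the function name `expand_keywords` is made up for the example.

```python
import re

# Simplified stand-in for keyword expansion: matches "$Revision$" as well as
# an already expanded "$Revision: 1.1 $".
KEYWORD = re.compile(rb"\$Revision(?::[^$\n]*)?\$")


def expand_keywords(data, revision):
    """Return `data` with every Revision keyword expanded to `revision`."""
    replacement = b"$Revision: " + revision.encode("ascii") + b" $"
    # A function is used as the replacement so that no escape processing
    # is applied to the replacement bytes.
    return KEYWORD.sub(lambda match: replacement, data)


# A fake "binary document" that happens to contain the keyword bytes.
original = b"\x00\x01$Revision$\x02\x03"
expanded = expand_keywords(original, "1.2")

# The payload has been modified and is now longer, which would break any
# format that relies on fixed offsets or lengths.
print(len(original), len(expanded))
print(expanded)
```

Adding the file as binary (the -kb switch mentioned above) tells CVS to store and return the bytes unchanged, so no substitution of this kind takes place.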

Solution 4 - Version Control

Thinking out of the box, would migrating to a Wiki be out of the question?

Since you consider it feasible to force your users into Subversion (or something similar), a larger change seems acceptable.

Another migration target could be to use some kind of structured XML document format (DocBook comes to mind). This would enable you to indeed use diffs and source control, while getting all sorts of document formats for free.

Solution 5 - Version Control

SharePoint also does a good (OK, decent) job of versioning MS-specific documents.

Solution 6 - Version Control

How about trying Git? It seems Git can support Word .doc and OpenDocument .odf files if you configure it in the .gitattributes file.

Here is a reference; scroll down to "Diffing binary files".
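Git's mechanism for this is a "textconv" filter: a `.gitattributes` line such as `*.odt diff=odt` assigns a diff driver to the files, and `git config diff.odt.textconv <command>` names a program that Git calls with the file path and whose standard output it diffs. Below is a minimal sketch of such a converter for OpenDocument text files, assuming the `.odt` extension and a hypothetical script name like `odt2txt.py`. It only pulls paragraph and heading text out of `content.xml` and ignores everything else in the document.

```python
import sys
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used for text elements in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
BLOCK_TAGS = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}


def odt_to_text(path):
    """Return the paragraph and heading text of an .odt file, one per line."""
    # An .odt file is a zip archive; the document body lives in content.xml.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    lines = ["".join(element.itertext())
             for element in root.iter()
             if element.tag in BLOCK_TAGS]
    return "\n".join(lines) + "\n"


def main(argv):
    """Entry point for use as a textconv command: argv[1] is the file path."""
    sys.stdout.write(odt_to_text(argv[1]))
    return 0
```

Saved as a script that calls `main(sys.argv)`, this could be registered with `git config diff.odt.textconv "python odt2txt.py"`, after which `git diff` would show text changes instead of only reporting that the binary files differ. Old-style binary .doc files need a different converter.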

Solution 7 - Version Control

For what it's worth, there is also Google Docs. I guess it's not a perfect fit, but its versioning is very convenient.

Solution 8 - Version Control

ClearCase integrates with Word for revision tracking. I believe Telelogic DOORS does as well.

Solution 9 - Version Control

I use Mercurial with the TortoiseHg overlay. I can right-click a changeset, choose "Visual Diff", then choose the "docdiff" tool (comes bundled), which launches the document in Word with the Track Changes.

Solution 10 - Version Control

You can, but you will always compare the document versions with Word itself.

I haven't heard of a version control database that can track changes in Word documents.

However, there are some tools that can compare Word documents, so if you set up your version control client to use these tools for comparison, you can have some fun.

Solution 11 - Version Control

Not necessarily. It depends on how often the new files are committed to the repo. If the files are edited several times before a commit, then you're precisely where you are now. The biggest benefit is if the file becomes corrupted.

You can version any file; this is how Time Machine in Mac OS X Leopard works, for example, and there is an interesting article by someone who committed his entire computing environment into CVS and then just maintained working copies on his home and work machines.
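To illustrate that "versioning any file" needs nothing format-specific, here is a minimal Python sketch that keeps timestamped copies of a document whenever its content changes. It is not how Time Machine or any version control system is implemented; the function name `snapshot` and the naming scheme for copies are assumptions made for the example.

```python
import hashlib
import shutil
import time
from pathlib import Path


def snapshot(document, history_dir):
    """Copy `document` into `history_dir` unless an identical copy exists.

    Returns the path of the new copy, or None if this exact content was
    already stored (including content that was reverted to an old state).
    """
    source = Path(document)
    history = Path(history_dir)
    history.mkdir(parents=True, exist_ok=True)

    # A content hash works for any file, text or binary.
    digest = hashlib.sha256(source.read_bytes()).hexdigest()[:12]
    ending = f"-{digest}{source.suffix}"
    for stored in history.iterdir():
        if stored.name.startswith(source.stem + "-") and stored.name.endswith(ending):
            return None

    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = history / f"{source.stem}-{stamp}{ending}"
    shutil.copy2(source, target)  # copy2 also preserves the modification time
    return target
```

Calling `snapshot("report.doc", "history")` after each save would build up a folder of dated copies. What it cannot do is say what changed between two copies, which is the part the other answers address.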

But "better" and "easier" are specific to your situation, and I'm not sure I completely understand your problem as things stand.

Solution 12 - Version Control

Subversion, CVS and all other source control systems are not good for Word documents and other office files (such as Excel spreadsheets), since the files themselves are stored in a binary format. That means that you can never go back and annotate (or blame, or whatever you want to call it), or do diffs between documents.

There are revision control systems for Word documents out there; unfortunately I do not know any good ones. We use such control systems for Excel at my work, and unfortunately they all cost money.

The good thing is that they make life a lot easier, especially if you ever have to do an audit or due diligence.

Solution 13 - Version Control

If you use WinMerge it has added support for merging Word and Excel binary files.

Solution 14 - Version Control

Have a look at Sharepoint. If cost is an issue, Sharepoint portal services can also work for you. Read this for more info

Solution 15 - Version Control

Just wanted to clarify an answer someone gave but I don't have enough points yet.

diff will work on binary files but it is only going to say something not really useful like "toto1 and toto2 binary files differ".
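The point can be reproduced in a few lines of Python: a byte-level comparison can tell you that two files differ and where the first difference is, but nothing about what changed in the document. The function name `first_difference` is made up for this sketch.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if the two
    files are byte-for-byte identical."""
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                for index, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        return offset + index
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended without any difference
            offset += len(chunk_a)


def describe(path_a, path_b):
    """Mimic the kind of one-line message a diff tool prints for binaries."""
    if first_difference(path_a, path_b) is None:
        return f"{path_a} and {path_b} are identical"
    return f"Binary files {path_a} and {path_b} differ"
```

An offset into a Word file is of little use to a human reader, which is why the other answers rely on Word's own comparison feature or on converting the document to text first.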

Solution 16 - Version Control

You could use something like the Revisionator, which is like google docs but with built in revision control including diffs, forks, and 3 way merges. http://revisionator.com

UPDATE: It also fixes the problem of too frequent autosaving that you mention with Google Docs. It'll still autosave to prevent data loss, but it will only create a new version in the revision history and share with other users when you explicitly "release" your changes.

Solution 17 - Version Control

You could do that, but if the files are binary you should always put a lock on them before editing. That way you won't get a conflict (which would be unresolvable).

Solution 18 - Version Control

Many of the new version control projects are better suited to entire directories, and not so much for single files.

Convincing someone that they need to get an entire project, when they only want to update an individual file can be a "fun" way to spend an afternoon.

Solution 19 - Version Control

Another option you have is a piece of software and cloud computing magic called Dropbox. Or, you could ditch the Word documents and make a locally shared MediaWiki instead.

DropBox: getdropbox DOT com

MediaWiki: mediawiki DOT org

Solution 20 - Version Control

YES, it's applicable! I totally agree that the SVN + TortoiseSVN combo is well suited to tracking MS Office documents. You can lock a document for editing, write-protect all unlocked files to avoid conflicts (i.e. parallel modifications), diff two versions of the same file, see the history of all the modifications and of course roll back to an older revision.
I tried to describe all of those tips in a dedicated blog post. (disclaimer: I'm the blog owner)

All of this could even be accessible from the web with a SVN web client! (might need some software development)

But if you're not accustomed to version control systems in another context, this may not be the obvious choice. The work needed for a good integration with docs gives dedicated tools an advantage: "electronic document management" systems are made just for that. A VCS like SVN may remain a good alternative for cost reasons :-)

Did you test the online service Simul? It looks promising; I personally like the GitHub-like orientation. Note that I'm not affiliated with Simul!

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | qwertyuu | View Question on Stackoverflow
Solution 1 - Version Control | Greg Hewgill | View Answer on Stackoverflow
Solution 2 - Version Control | Sander | View Answer on Stackoverflow
Solution 3 - Version Control | Rob Wells | View Answer on Stackoverflow
Solution 4 - Version Control | Henrik Paul | View Answer on Stackoverflow
Solution 5 - Version Control | hometoast | View Answer on Stackoverflow
Solution 6 - Version Control | Gautam | View Answer on Stackoverflow
Solution 7 - Version Control | grapefrukt | View Answer on Stackoverflow
Solution 8 - Version Control | Paul Nathan | View Answer on Stackoverflow
Solution 9 - Version Control | JohnZaj | View Answer on Stackoverflow
Solution 10 - Version Control | Biri | View Answer on Stackoverflow
Solution 11 - Version Control | Polsonby | View Answer on Stackoverflow
Solution 12 - Version Control | Mats Fredriksson | View Answer on Stackoverflow
Solution 13 - Version Control | Keith | View Answer on Stackoverflow
Solution 14 - Version Control | Rad | View Answer on Stackoverflow
Solution 15 - Version Control | Rob Wells | View Answer on Stackoverflow
Solution 16 - Version Control | jpalmucci | View Answer on Stackoverflow
Solution 17 - Version Control | rafek | View Answer on Stackoverflow
Solution 18 - Version Control | Brad Bruce | View Answer on Stackoverflow
Solution 19 - Version Control | Nathan Lawrence | View Answer on Stackoverflow
Solution 20 - Version Control | justinmassiot | View Answer on Stackoverflow