Version control of Mathematica notebooks

Git, Version Control, Wolfram Mathematica

Git Problem Overview


Mathematica notebooks are, of course, plaintext files -- it seems reasonable to expect that they should play nice with a version-control system (git in my case, although I doubt the specific system matters). But the fact is that any .nb file is full of cache information, timestamps, and other assorted metadata. Scads of it.

Which means that limited version control is possible -- commits and rollbacks work fine. Merging, though, is a disaster. Mathematica won't open a file with merge markers in it, and a text editor is no way to go through a .nb file.

Has anyone had any luck putting a notebook under version control? How?

Git Solutions


Solution 1 - Git

It's recommended to disable the file outline cache, which is the metadata you're referring to when you look at the notebook with a text editor. As you discovered, it can cause merge conflicts if multiple parties are editing the same notebook.

This is easily disabled with the Option Inspector. In the Mathematica menu, go to Format > Option Inspector..., set the scope dropdown in the top-left to Selected Notebook, and search for FileOutlineCache in the search field. Set the option to False and save your notebook, and you should be all set.

Note that this can make opening notebooks a little slower, but unless the notebook is rather large, you probably won't notice the difference.

Solution 2 - Git

There is a nice set of recommendations for how to use Git to do version control with Mathematica on Mathematica Stack Exchange. In short, the philosophy is to minimize use of .nb notebooks, and try to do most of the version control with .m packages (similar to what xuhdev and MMA user suggest in their answers here). This seems quite sensible given the way notebooks are managed.

Solution 3 - Git

Not a solution to your merging problem exactly, but this is how we handle notebooks and source control in my team. Basically, we treat Mathematica notebooks the way we'd treat binary files. They're checked-in, but:

  • we always keep a PDF copy alongside the .nb (a backup for restoring the information in case we lose, for some reason, the ability to read .nb files. Still a proprietary format, but a bit more widespread, and chances are Adobe and Wolfram won't both disappear at the same time)
  • we do not allow merges
  • we code-review only the final product (the rendered notebook) instead of the .nb file.

We mostly use Mathematica for small proofs, explorations and sidetracks, so the above procedure works fine for us (our main documentation is in LaTeX, which produces friendlier documentation for non-mathematicians/non-programmers)
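A minimal sketch of how a "no merges for notebooks" rule could be enforced by git itself, assuming that unsetting the merge attribute for *.nb is acceptable for your workflow. This is an illustration, not necessarily how the team above has things configured. The scratch repository, the file name proof.nb, and the branch name side are hypothetical.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

# Unsetting "merge" tells git never to attempt a textual merge of notebooks.
printf '*.nb -merge\n' > .gitattributes
printf 'Notebook[{Cell["base"]}]\n' > proof.nb
git add .gitattributes proof.nb
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# Edit the same notebook on two branches to provoke a conflict.
git checkout -q -b side
printf 'Notebook[{Cell["side"]}]\n' > proof.nb
git commit -q -am "side edit"
git checkout -q "$main_branch"
printf 'Notebook[{Cell["main"]}]\n' > proof.nb
git commit -q -am "main edit"

# The merge still reports a conflict, but proof.nb keeps the current
# branch's version and no conflict markers are written into it.
git merge side >/dev/null 2>&1 || echo "conflict reported for proof.nb"
markers=$(grep -c '^<<<<<<<' proof.nb || true)
echo "conflict-marker lines in proof.nb: $markers"
```

Because the file never contains merge markers, Mathematica can still open it. To resolve the conflict you pick one whole version (for example with git checkout --theirs proof.nb followed by git add proof.nb) rather than editing the notebook in a text editor.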

Solution 4 - Git

A new possibility is to use mathematica-notebook-filter which parses Mathematica notebooks and strips all output cells and metadata so that these are not committed into the version control system.

In the specific case of git, it is quite easy to integrate mathematica-notebook-filter through gitattributes filters, so that git automatically strips the output and metadata whenever a notebook is staged or diffed. You will need to have mathematica-notebook-filter installed and on your PATH (or adapt the configuration below to point to the binary) and add the following line to your ~/.gitattributes file (git only reads that path if core.attributesFile points to it; by default the global attributes file is ~/.config/git/attributes):

*.nb    filter=dropoutput_nb

This instructs git to parse all files matching *.nb with the dropoutput_nb filter which is defined in your ~/.gitconfig as:

[filter "dropoutput_nb"]
    clean = mathematica-notebook-filter
    smudge = cat

If, for some reason, you want to have a specific Mathematica notebook committed with all output and metadata, you can disable the filter in the project's .gitattributes file by adding:

notebook_file.nb    !filter
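To check that the attribute lines and the filter definition are picked up the way you expect, something like the following can be run. It is only a sketch: it uses a throwaway repository and repository-local settings instead of the global files described above, the path analysis.nb is hypothetical, and it does not need mathematica-notebook-filter to be installed because it only inspects configuration.

```shell
# Scratch repository so the commands do not touch your real configuration.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Same attribute lines as above, written to the repository-level file.
printf '*.nb filter=dropoutput_nb\n' > .gitattributes
printf 'notebook_file.nb !filter\n' >> .gitattributes

# Same filter definition as above, stored in this repository's .git/config.
git config filter.dropoutput_nb.clean mathematica-notebook-filter
git config filter.dropoutput_nb.smudge cat

# Ask git which filter applies to each path and what the clean step runs.
filtered=$(git check-attr filter -- analysis.nb)
exempt=$(git check-attr filter -- notebook_file.nb)
clean_cmd=$(git config --get filter.dropoutput_nb.clean)
echo "$filtered"
echo "$exempt"
echo "clean command: $clean_cmd"
```

Paths that report filter: dropoutput_nb go through the clean command when staged. A path that reports filter: unspecified is committed as-is.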

Disclaimer: I am the author of this tool. It is open source and feedback (both good and bad) is appreciated. Contributions are welcome on GitHub.

Solution 5 - Git

Along the lines of what Simon and Kena were saying, when I have had Mathematica .nb's under version control, I often create a plain-text version of only the input code and save it with the same name but a .txt extension. While this doesn't directly solve the merging problem, it does make diff-ing work in a reasonable way and makes manual merging more obvious when I go back to edit the .nb's later. There are still some idiosyncrasies in this format, but it is MUCH easier to read than the raw .nb format.

To generate the text file, I just copy the notebook into a new blank notebook (with shortcuts, Ctrl-A,C,N,V), select the menu Cell->Delete All Output, copy the result (Ctrl-A,C), and paste the result into a plain text editor to save it. It takes surprisingly little time once you get the hang of it.

Solution 6 - Git

Well, my solution is to track plain text files instead of notebooks (real plain text, not the text-based "Notebook" format).

Whenever you have a notebook, you can use the "Save As..." menu to save the current file as a plain text file. When you need to load it, simply open it with Mathematica. Tracking this file is much nicer than tracking a notebook file. I'm unsure which features you may lose by using the plain text format rather than the Mathematica notebook format, but I haven't found any defects so far.

Reference: http://www.topbug.net/blog/2013/05/02/track-mathematica-source-files-with-version-control-systems/

Solution 7 - Git

You should only get merge markers if the source control system detects changes to a single line by multiple users.

The source control system adds markers to make it very clear where the conflicts are, and to force you to manually remove them (as you resolve each conflict). There is no way for a source control system to know how to do it automatically for you.

If the file is text, but is designed to be read by a program only, it may have no end of line characters at all (or very long lines). Therefore if multiple people are working on such a file you'll get many merge conflicts.

I'm not familiar with the nb file format, but in general the solution to this problem is to ensure that only one person works on a file at a time (i.e., use an exclusive check-out mode for .nb files).

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author   Original Content on Stack Overflow
Question           Etaoin            View Question on Stack Overflow
Solution 1 - Git   Michael Pilat     View Answer on Stack Overflow
Solution 2 - Git   Apo               View Answer on Stack Overflow
Solution 3 - Git   Kena              View Answer on Stack Overflow
Solution 4 - Git   JP-Ellis          View Answer on Stack Overflow
Solution 5 - Git   MMA user          View Answer on Stack Overflow
Solution 6 - Git   xuhdev            View Answer on Stack Overflow
Solution 7 - Git   Ash               View Answer on Stack Overflow