Git Commit Messages: 50/72 Formatting

Git

Git Problem Overview


Tim Pope argues for a particular Git commit message style in his blog post: <http://www.tpope.net/node/106>;.

Here is a quick summary of what he recommends:

  • First line is 50 characters or less.
  • Then a blank line.
  • Remaining text should be wrapped at 72 characters.

His blog post gives the rationale for these recommendations (which I will call “50/72 formatting” for brevity):

  • In practice, some tools treat the first line as a subject line and the second paragraph as a body (similar to email).
  • git log does not handle wrapping, so it is hard to read if lines are too long.
  • git format-patch --stdout converts commits to email — so to play nice it helps if your commits are already wrapped nicely.

A point I would like to add that I think Tim would agree with:

  • The act of summarizing your commit is a good practice inherently in any version control system. It helps others (or a later you) find relevant commits more quickly.

So, I have a couple of angles to my question:

  • What chunk (roughly) of the “thought leaders” or “experienced users” of Git embrace the 50/72 formatting style? I ask this because sometime newer users don’t know or don’t care about community practices.
  • For those that don’t use this formatting, is there a principled reason for using a different formatting style? (Please note that I’m looking for an argument on the merits, not “I’ve never heard of it” or “I don’t care.”)
  • Empirically speaking, what percentage of Git repositories embrace this style? (In case someone wants to do an analysis on GitHub repositories… hint, hint.)

My point here is not to recommend the 50/72 style or shoot down other styles. (To be open about it, I do prefer it, but I am open to other ideas.) I just want to get the rationale for why people like or oppose various Git commit message styles. (Feel free to bring up points that haven’t been mentioned, too.)

Git Solutions


Solution 1 - Git

Regarding the “summary” line (the 50 in your formula), the Linux kernel documentation has this to say:

For these reasons, the "summary" must be no more than 70-75
characters, and it must describe both what the patch changes, as well
as why the patch might be necessary.  It is challenging to be both
succinct and descriptive, but that is what a well-written summary
should do.

That said, it seems like kernel maintainers do indeed try to keep things around 50. Here’s a histogram of the lengths of the summary lines in the git log for the kernel:

Lengths of Git summary lines (view full-sized)

There is a smattering of commits that have summary lines that are longer (some much longer) than this plot can hold without making the interesting part look like one single line. (There’s probably some fancy statistical technique for incorporating that data here but oh well… :-)

If you want to see the raw lengths:

cd /path/to/repo
git shortlog  | grep -e '^      ' | sed 's/[[:space:]]\+\(.*\)$/\1/' | awk '{print length($0)}'

or a text-based histogram:

cd /path/to/repo
git shortlog  | grep -e '^      ' | sed 's/[[:space:]]\+\(.*\)$/\1/' | awk '{lens[length($0)]++;} END {for (len in lens) print len, lens[len] }' | sort -n

Solution 2 - Git

Regarding “thought leaders”: Linus emphatically advocates line wrapping for the full commit message:

> […] we use 72-character columns for word-wrapping, except for quoted > material that has a specific line format.

The exceptions refers mainly to “non-prose” text, that is, text that was not typed by a human for the commit — for example, compiler error messages.

Solution 3 - Git

Separation of presentation and data drives my commit messages here.

Your commit message should not be hard-wrapped at any character count and instead line breaks should be used to separate thoughts, paragraphs, etc. as part of the data, not the presentation. In this case, the "data" is the message you are trying to get across and the "presentation" is how the user sees that.

I use a single summary line at the top and I try to keep it short but I don't limit myself to an arbitrary number. It would be far better if Git actually provided a way to store summary messages as a separate entity from the message but since it doesn't I have to hack one in and I use the first line break as the delimiter (luckily, many tools support this means of breaking apart the data).

For the message itself newlines indicate something meaningful in the data. A single newline indicates a start/break in a list and a double newline indicates a new thought/idea.

This is a summary line, try to keep it short and end with a line break.
This is a thought, perhaps an explanation of what I have done in human readable format.  It may be complex and long consisting of several sentences that describe my work in essay format.  It is not up to me to decide now (at author time) how the user is going to consume this data.

Two line breaks separate these two thoughts.  The user may be reading this on a phone or a wide screen monitor.  Have you ever tried to read 72 character wrapped text on a device that only displays 60 characters across?  It is a truly painful experience.  Also, the opening sentence of this paragraph (assuming essay style format) should be an intro into the paragraph so if a tool chooses it may want to not auto-wrap and let you just see the start of each paragraph.  Again, it is up to the presentation tool not me (a random author at some point in history) to try to force my particular formatting down everyone else's throat.

Just as an example, here is a list of points:
* Point 1.
* Point 2.
* Point 3.

Here's what it looks like in a viewer that soft wraps the text.

> This is a summary line, try to keep it short and end with a line break. > > This is a thought, perhaps an explanation of what I have done in human readable format. It may be complex and long consisting of several sentences that describe my work in essay format. It is not up to me to decide now (at author time) how the user is going to consume this data. > > Two line breaks separate these two thoughts. The user may be reading this on a phone or a wide screen monitor. Have you ever tried to read 72 character wrapped text on a device that only displays 60 characters across? It is a truly painful experience. Also, the opening sentence of this paragraph (assuming essay style format) should be an intro into the paragraph so if a tool chooses it may want to not auto-wrap and let you just see the start of each paragraph. Again, it is up to the presentation tool not me (a random author at some point in history) to try to force my particular formatting down everyone else's throat. > > Just as an example, here is a list of points:
> * Point 1.
> * Point 2.
> * Point 3.
>

My suspicion is that the author of Git commit message recommendation you linked has never written software that will be consumed by a wide array of end-users on different devices before (i.e., a website) since at this point in the evolution of software/computing it is well known that storing your data with hard-coded presentation information is a bad idea as far as user experience goes.

Solution 4 - Git

Is the maximum recommended title length really 50?

I have believed this for years, but as I just noticed the documentation of "git commit" actually states

$ git help commit | grep -C 1 50
      Though not required, it’s a good idea to begin the commit message with
      a single short (less than 50 character) line summarizing the change,
      followed by a blank line and then a more thorough description. The text

$  git version
git version 2.11.0

One could argue that "less then 50" can only mean "no longer than 49".

Solution 5 - Git

I'd agree it is interesting to propose a particular style of working. However, unless I have the chance to set the style, I usually follow what's been done for consistency.

Taking a look at the Linux Kernel Commits, the project that started git if you like, http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=bca476139d2ded86be146dae09b06e22548b67f3, they don't follow the 50/72 rule. The first line is 54 characters.

I would say consistency matters. Set up proper means of identifying users who've made commits (user.name, user.email - especially on internal networks. User@OFFICE-1-PC-10293982811111 isn't a useful contact address). Depending on the project, make the appropriate detail available in the commit. It's hard to say what that should be; it might be tasks completed in a development process, then details of what's changed.

I don't believe users should use git one way because certain interfaces to git treat the commits in certain ways.

I should also note there are other ways to find commits. For a start, git diff will tell you what's changed. You can also do things like git log --pretty=format:'%T %cN %ce' to format the options of git log.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionDavid J.View Question on Stackoverflow
Solution 1 - GitmgalgsView Answer on Stackoverflow
Solution 2 - GitleonbloyView Answer on Stackoverflow
Solution 3 - GitMicah ZoltuView Answer on Stackoverflow
Solution 4 - GitGuenther BrunthalerView Answer on Stackoverflow
Solution 5 - Gituser257111View Answer on Stackoverflow