What makes a good commit message?
Extreme Programming (XP) embraces five values: Communication, Simplicity, Feedback, Courage and Respect.
Communication is arguably the most important of those. With the rise of distributed teams it is extremely important to maintain good habits of communication in all forms. Given the differences in timezones, written communication becomes even more important.
Commit messages are important means of communication between team members and for the lifecycle of the team: its past and its future.
What matters most in team software development is communication. (Extreme Programming Explained: Embrace Change)
During the research to write this post I’ve stumbled over mostly 2 types of messages:
- Countless uninformative commit messages
- Messages written in a format that virtually blurred what was the important change by pointing outwards of the repository. Let’s call them look-elsewhere messages.
Let’s talk about each one of these problems, how we can improve the message to convey more meaning and information and, finally, give some examples of good commit messages.
Uninformative messages
Uninformative messages are those that do not convey any information when you read them. For example:
add cli new
fixes
fix code review comments
no message
description
wip
hackz
Imagine the following situation: You arrive at the office and there’s a major bug, your company is losing money. You start investigating by using git log --grep
and some keywords. You find nothing. Then you use git log -- file/path
in a file you think it may be related. You find a commit that look suspicious, but again, the commit message does not tell you why the code is the way it is. You look the author’s name with hope the person is still around, with no luck. Wouldn’t be great if the commit messages helped you? None of this is possible with these messages. As we’ve said, they do not convey any information to the reader. We should avoid them.
Look-elsewhere messages
In this category are messages that include Github issue numbers or Jira identifications in their description. They tend to not include any more information. For example:
Fix issue #123
[JIRA-1234] Fix a problem with the route system when...
These messages favour information that is outside of the codebase. The identifications also take valuable space from the commit message. Let’s talk about these 2 points.
It’s important that information live the closest to the codebase as possible since the codebase is the single source of truth. Jira or Github or any separated tool from git are secondary tools. In case they burn on a fire or we switch providers, we still have the git repository. But if all information lives somewhere else that’s not the git repository then we would be in a serious situation. We should then strive to write as much information in commit messages as we can. Those extra minutes of thought to explain to whoever comes next why the changes were made (not the what, since git show
will tell you that) will be worth.
Another thing about secondary tools is that we eventually skip the parts related to issue tracking of the message. Try going to Github project where all messages start with a bug tracking number and see if you read any of them after looking for something for a while. I’ll wait. Now that you’re back let’s continue. Did you see how your eyes simply skipped those parts? Your eyes are telling you that they are not the most important part.
A Jira ticket identification with 10 characters take 20% of the 50 characters of the subject line. We end up with way less space to write about what was really important. When looking at the git history tree one wants to know how the code evolved, why the changes were made the way they were. We’re hardly looking for a identification since we cannot yet know that JIRA-1234
is related to cached role while investigating a cached role problem.
That’s not to say that Jira identifications are not useful but that they shouldn’t take space away from most important information. Simply, they should not come first but should appear elsewhere. A suggestion would be to write them as the last part of the message, after the body of the message was written. One can even use a ---
separator. One can still glimpse them when doing git log
and one can always find them with git log --oneline --grep=JIRA-1234
. This way all the information stays in the git repository and then point outward.
Good commit messages
Following are examples of what I consider good commit messages given the arguments I’ve made and guidelines put forward by other people (check the links section). They were taken from Open Source projects (with some adaptations). I’ve ommited authors and project names to focus in the messages themselves.
Make the cached role coherrent with the actual oneWhen observing the leader running a master role, set the cached role
stored in the state_handler to master as well. Failure to do so
resulted in the manually promoted node to continue running with a
cached 'replica' role. This led to the failure to create replication
slots for the new replicas.We could do it conditionally, but both reading and writing the role
require the same lock, and the unconditional approach makes the unit tests simpler.
Fix rewind behavior when paused* Check if we can rewind when deciding to call Postgresql.follow during pause.Without this the following sequence of events occurs:1. Manual failover from node, demotion sets need_rewind flag. Rewind is not done because can_rewind is False.
2. Cluster is put into paused mode. Rewind is not attempted because recovery.conf matches.
3. PostgreSQL goes down (pg_ctl stop).
4. Because need_rewind is set PostgreSQL is restarted.
5. Sysadmin is unhappy because Patroni is doing stuff behind his back during pause.* Allow superuser user to be missing from config for rewind.In other places a missing superuser authentication section is allowed and we default to libpq's use default OS user with no password. This makes rewind work with a missing superuser configuration.---Issue: [#365](https://link/to/issue/365)
Make Source.indexOf(ByteString) significantly fasterPreviously the algorithm that did this was extremely inefficient, and had worst case runtime of O(N * S * S) for N is size of the bytestring and S is the number of segments.The new code runs in O(N * S). It accomplishes this by not starting each search at the first segment, which could occur many times when called by RealBufferedSource.
Convert specs to RSpec 3.1.7 syntax with TranspecThis conversion is done by Transpec 2.3.8 with the following command:
transpec* 46 conversions
from: it { should ... }
to: it { is_expected.to ... }* 7 conversions
from: it { should_not ... }
to: it { is_expected.not_to ... }* 2 conversions
from: it { should have(n).items }
to: it 'has n dependencies' do expect(subject.size).to eq(n) end* 2 conversions
from: obj.should
to: expect(obj).to* 2 conversions
from: obj.stub(:message)
to: allow(obj).to receive(:message)* 1 conversion
from: == expected
to: eq(expected)* 1 conversion
from: it { should have(n).items }
to: it 'has n feature' do expect(subject.size).to eq(n) end* 1 conversion
from: it { should have(n).items }
to: it 'has n features' do expect(subject.size).to eq(n) end* 1 conversion
from: it { should have(n).items }
to: it 'has n files' do expect(subject.size).to eq(n) endFor more details: https://github.com/yujinakayama/transpec#supported-conversions
Beside that is always nice to read something well writen, that someone took care with proper formatting and spacing. Following are 3 extra wishes:
- Please start your sentences with an uppercase letter
- Please don’t edit the generated git messages
- Please don’t write “Revert” or “Merge” when you’re not using
git revert
norgit merge