To comment or not comment
Un article de Agile-Swiss.
(by Didier)
Back in 1991 I joined the Smalltalk forum that Digitalk had created on Compuserve. I had a little over 6 years of isolated Smalltalk programming behind me and meeting with other Smalltalkers was a great experience. Shortly after I joined that forum, I came across a message from a guy named Kent Beck—whom I did not know at the time—that ticked me off:
Wow! Before using Smalltalk, I had programmed for many years in FORTRAN, C and assembly languages of various sorts. As a result I was, at this time, still peppering my Smalltalk code with comments. After reading the message above, I immediately joined the heated discussion provoked by Kent’s message because I considered such a statement as totally ludicrous, on the verge of blasphemy.
For those of you who do not know Kent, let me just say that he won the argument easily and, as a result, I stopped writing comments in my Smalltalk code ever after and continue to do so with my Java code. Surprisingly enough, the topic of whether or not one should comment code is not dead yet. A friend of mine directed me toward a discussion thread under the title "What is well-commented code?" on slashdot.org started in May 2002. Here is the message that opens the discussion:
"What exactly is well-commented code anyway? Can anyone suggest resources with insight into writing better comments and making code more readable? After about six years in the software development industry I've seen my share of other people's code. I seem to spend a lot of time wishing the code had better (sometimes _any_) comments. The comments can be frustrating to me for different reasons: too vague, too specific, incoherent, pointing out the obvious while leaving the non-obvious to my imagination, or just plain incorrect. Poorly or mysteriously named variables and methods can be just as confusing.?"
Browsing through the contributions, I found the same arguments that Kent and I and others exchanged more than ten years earlier. Indeed, old habits die hard!
Sommaire |
The purpose of comments
Before asking what is well-commented code, it should be wise to ask: what is the purpose of commenting the code? Why do we need to write comments in the first place? Everybody knows the answer: comments help understanding the code.
Or so they think! Let us get start immediately with an example of Java code from a guy who clearly must have seen assembler code in a former life.
Listing 1‑1.1, Over-commented code
aCarQuote.setToday( today); // today aCarQuote.setKanton( kanton); // kanton aCarQuote.setEurotax( eurotax); // eurotax aCarQuote.setVnNation( vnNation); // vnNation aCarQuote.setVnBeruf( vnBeruf); // vnBeruf aCarQuote.setSfKH( sfKH); // sfKH aCarQuote.setSfVK( sfVK); // sfVK
All readers should have guessed what the first statement of Listing 1-1.1 is supposed to do. Those of the readers who can read German will be able to understand what lines 2 to 6 are supposed to do. Although lines 5 and 6 might give you some doubts on whether the meaning the "vn" prefix can be relevant or not. As for the last two lines of this code, readers able to understand them shall win my personal esteem.
Indeed, poorly or mysteriously named variables and methods can be just as confusing? as was stated in the quote earlier. I might add that repeating the same mysterious names in the comment is no help at all. In fact, it does not take much programming experience to see that the comments of Listing 1-1.1 do not bring any information in addition to what the code states. Thus, you are none the wiser with such comments.
Some readers might think that the code of Listing 1-1.1 is a fake, or at least that it has been produced by a greenhorn. Actually, the sad truth is that this code is real, and still in production at the time this book is written. A sadder truth is that it was written by a person teaching Java at a technical college. Hopefully, none of you will be exposed to programmers coming out of that school.
To come back to our main topic, putting comments into the code is no guaranty that these comments are of better quality than the code they are supposed to enlighten. It actually gets even worse with the availability of plug-ins for Java IDEs where comments can be generated at will. With the advent of such add-on tools, useless comments can be generated by the buckets. Here is a typical automatic comment.
Listing 1‑1.2, Automatic comment
/**
* Gets the currency
* @return Returns a String
*/
public String getCurrency() {
return currency;
}
A glimpse at the method’s declaration—or signature if you locate that method in a JAR file without the code—gives you more (That method is public. This is not said in the comment) information than the generated comment. This is no surprise: if a computer program, however smart, is able to generate the comment, the information must be obvious.
To conclude this section, we should agree that there are cases where comments are really not needed. But nevertheless some reader might think along the line of this quote taken from the slashdot discussion quoted earlier.
“Some people say code shouldn't need commenting, and the code itself should be enough. In a perfect world of no bugs and only populated by wizard programmers, this is fine, but not in the world I live in. You write some code and someone else (maybe yourself) will have to debug it at some point - maybe 3-4 years down the line. Even with a "neat" language like Java, working out how things work is much more time consuming without comments. “
Never mind, let us assume we shall comment some parts of our code for the time being.
Thanks for the tip!
At this point, we still agree that comments, when we need them, should tip us as to what the code next or below it should be doing. So let us look at the little gem below.
Listing 1‑1.3, Expressive comment
// Set currency code to Swiss Francs order.setCurrencyCode( 6);
Now the comment makes sense, doesn’t it? All Java programmers would conclude that "6" is the code used to flag the Swiss Franc currency. More experienced Java programmers would comment that it is not a good practice to have "magic" numbers such as this "6". Surely, the currency codes are used at some other places and it would be better to define all these codes at one place. Good point!
However, it turns out that the code for Swiss Francs in that particular application is "1". The code of Listing 1-1.3 had been modified and the new business requirements for that particular bank item had to be carried in Euro whose currency code was "6".
I leave it the reader to imagine how many days a maintenance programmer would spend trying to find out why on earth the currency assignment was not made in Swiss Francs when the order was processed.
On a more complex piece of code, you can spend days of sifting through the code—most of the time looking at the wrong place, because the place where the bug lies contains a comment stating a correct behavior/procedure—, questioning you mental sanity, and even accusing the compiler from making mistakes. In my past programming years, I have seen all the above and more!… and more than once!
So, we see that writing good comments can quickly be undone by a hasty change of code. At this point I can hear some of the readers saying something like this, found on the slashdot discussion quoted earlier.
"Changes have to be done twice? That's right, when they change the code, they must change the comment. I'll repeat that: they MUST change the comment. And it must make complete sense when they're done or they'll be out of a job! Why is this important? When you change the comment, you must think about the comment. You must think about the change you've done and how it fits in with the rest of the code, and what the rest of the code is trying to do. If a comment isn't up to date or doesn't make sense, that's a bug in the code, as bad as any other, and it needs to be fixed. It's not difficult to spot when the comments don't line up, so they're fairly easy to fix. While you're there fixing the comments you need to check the code, 'cos whoever the idiot was that wrote it, they obviously haven't checked it properly. Go and hit them with a Very Big Stick."
Oh boy! Would you like to work with this guy? Actually, I would like to cite a part of the earlier quote to the rescue: "In a perfect world of no bugs and only populated by wizard programmers, this is fine, but not in the world I live in." Indeed, how do you expect people to make changes at two places, when the only place where that change matters is the code and the code only? How do you enforce such policy? Certainly not with a Very Big Stick. And talking about "so they're fairly easy to fix", I just refer you to the example shown at the beginning of this section. This one was certainly not easily spotted.
Let it be read!
So what is the practical solution to the dilemma? As Kent stated it to me in 1991, the solution is to make the code readable. Then, and only then, you shall not have any more problems. I know this option has been highly criticized in the second quote from the slashdot discussion, but let us look nevertheless at the practical side of things. We shall come back to the objections of all readers later on.
Fortunately, a modern computer language like Java no longer limits the developers to 6 or 8 letters variable names. On top of that, most development environments have a facility to automatically expand names after typing a few letters. Therefore there are no excuses for not writing readable code.
For example the last line of the code in Listing 1-1.1 should be written as
Listing 1‑1.4, Code without comment
aCarQuote.setSwissFrancsValueKeep(swissFrancsValueKeep);
or something of the sort since I still do not know what on earth "SfVK" means! Similarly the code of Listing 1-1.3 should be written as:
Listing 1‑1.5,Another code without comment
order.setCurrencyCode( CurrencyCodes.Euro);where currency codes are defined as constants in a class
CurrencyCodesfollowing good practices of any experienced Java programmers.
The beauty of making the code readable is that what your read is what the computer—the compiler actually—does read. As a consequence, if your code compiles right, this is what gets executed in the end. This is all what matters.
In fact, when it comes to readability, I personally think that comments actually are in the way of the code. At best it forces you to read things twice. In practice, however, it lowers your acuteness at spotting bizarre things in the code because you tend to read only the comments and ignore the code.
More than one, I had to get rid of the comments when debugging a foreign piece of program in order to concentrate on the real subject, the code that gets executed by the computer.
Breaking up code
The discussion about code readability was punctuated by an interesting remark on the slahdot discussion. The remark is reproduced below and the code accompanying it is shown in Listing 1-1.6.
"I've seen too much code written in the manner you suggest, that makes the reader bounce around from function to function to function for no reason other than "otherwise that function would be more than 30 lines"
Listing 1‑1.6, Silly method break-up
void foo()
{
foo_part_1();
foo_part_2();
foo_part_3();
}
Seeing that code, one can but agree with the remark.
However, whoever generated the method names of Listing 1-1.6 had a very limited imagination. Personally, I do apply to myself a rule that a method should not exceed a certain number of lines. How many is not the issue here (my limit is much much lower than 30). There are practical limitations like the size of the text window or physiological like the number of lines one can read within a glimpse of the eye. I usually pick up the latest.
The point I want to make here is that one must use meaningful names when breaking up a method. It actually goes a long way toward making your code readable. In the end I find myself breaking a method into several small ones not because I impose on myself an arbitrary maximum number of lines, but because I want to make my code readable. Because I want to have meaningful method names, I shall break the code into meaningful stand-alone entities, thus ensuring the readability of each method.
During the development of a framework for generating WEB pages, the original version of the methodinitializeNewSessionin charge of initialising a newly established session was 24 lines long. In my eyes this was too long. So I went inside and identified 17 lines that were devoted to selecting the type of navigator requesting a session and 5 lines devoted to selecting the default locale. The result of the break-up is shown in Listing 1-1.7.
Listing 1‑1.7, Meaningful method break-up
private void initializeNewSession(AbstractHtmlPage page, HttpSession session, HttpServletRequest request) throws HtmlException
{
assignNavigator( session, request);
assignDefaultLocale( session, request);
page.setSession( session);
page.initializeNewSession( request);
}
The first two lines of the method shown in Listing 1-1.7 expose very explicitly what is being done and do not require additional comments. As one can see from the parameters passed to the methods, the code of these could have well been placed within the calling method. Moreover, these two methods are not called anywhere else. Thus, the break-up appears totally arbitrary and indeed makes the reader bounce around from method to method… if the reader does not trust the method name, that is!
However, the rational behind the choice is clear: by using carefully chosen names, I ensure that the code can be read by anyone coming after me.
Conclusion
Let us now address all the concerns raised in the quotations made in this chapter.
For one, programmers who are forced to produce commented code tend to be very sloppy with their comments. A former colleague of mine once won an honorable mention from the management because all his code was commented. It turned out that he had produced most of the comments using a plug-in generating automatic comments. Enforcing comment writing—with or without a very big stick—will only produce stupid, incomplete and useless comments.
- Rule 1: Do not enforce comment writing. Instead of writing comments, I ask my programmers to spend time thinking about meaningful names for variables and methods. They feel an immediate incentive for it and usually succeed in making the transition between writing comments and writing readable code.
- Rule 2: Enforce code readability.Actually, the decision to follow Kent’s advice of not commenting my code had an immediate effect on me. Since I could no longer rely on the comment to read what my code my doing, I was forced to write readable code: there was no other choice. Thus, I want to go beyond rules 1 and 2, and state that comments should be forbidden within the code. This is a good way to force a programmer to write readable code. This line of thought brings us to our last and final rule.
- Rule 3: Discourage people from writing comments in order to have them concentrate on writing readable code.
Please, notice that I used the word "discourage" and not "impose". Personally, I hate totalitarian regimes. My experience is that making strict rules is the best way of making people more stupid than they are. I bet you that the guy with the Very Big Stick is having a lot of problems with his team.
Mollified remarks
Yes, Virginia, I do not write any comment within my code. While writing the code, I concentrate on making it as readable as can be for someone knowledgeable with the business domain of the application. I never use abbreviations and my colleagues, who must use classes with a name taking up to half a line of this book at times, curse me for that at the beginning. After a while, however, they stop complaining and emulate me.
And yet, I do write a few comments before each method to generate the Javadoc. Javadoc is another great invention of the designers of Java. It allows a natural way of creating documentation for someone who has to use code without the source. And that is the main and only purpose of it. I also must admit that I am too lazy at times to remove the automatically generated comments such as the one in Listing 1-1.1. A weak excuse is that it allows people looking at the code with a text editor to quickly locate method boundaries. Setter and getter methods are so small that they can easily be overlooked. However, I agree that this is a very poor excuse.
The Javadoc program should have a flag allowing automatic generation of the entries corresponding to getter and setter methods when they do not have any comment. It would have more sense than having these silly plug-ins that generate automated comments.
Finally, one should note that Javadoc comments are not written for the person maintaining the code, or to explain how actions performed within the method are carried out. It is not the business of the person using a method to know how the method works. What that person wants to know is what are the input of the method, if any, what are the preconditions for using the method if any, what is the effect of the method on the object it is sent to, and what is the object or value returned by that method if any. There is no word on how the method does its work. After all, this is what hiding the implementation is all about.
