Literate Programming - Source Code Comments

Jeffrey Kotula. "Source Code Documentation: An Engineering Deliverable" in Proceedings of the Technology of Object-Oriented Languages and Systems, 2000.

Source code documentation is a fundamental engineering practice critical to efficient software development. Regardless of the intent of its author, all source code is eventually reused, either directly, or just through the basic need to understand it. In either case, the source code documentation acts as a specification of behavior for other engineers. Without documentation, they are forced to get the information they need by making dangerous assumptions, scrutinizing the implementation, or interrogating the author. These alternatives are unacceptable. Although some developers believe that source code "self-documents", there is a great deal of information about code behavior that simply cannot be expressed in source code, but requires the power and flexibility of natural language to state. Consequently, source code documentation is an irreplaceable necessity, as well as an important discipline to increase development efficiency and quality.

Edward Yourdon. "Flashes on Maintenance From Techniques of Program Structure and Design" in Techniques of Program and System Maintenance. QED Information Sciences, 1988, pg. 73.

In my opinion, there is nothing in the programming field more despicable than an uncommented program. A programmer can be forgiven many sins and flights of fancy, including those listed in the sections below; however no programmer, no matter how wise, no matter how experienced, no matter how hard-pressed for time, no matter how well-intentioned, should be forgiven an uncommented and undocumented program.

Of course, it is important to point out that comments are not an end unto themselves. As Kernighan and Plauger point out in their excellent book, The Elements of Programming Style, good comments cannot substitute for bad code. However, it is not clear that good code can substitute for comments. That is, I do not agree that it is unnecessary for comments to accompany "good" code. The code obviously tells us what the program is doing, but the comments are often necessary for us to understand why the programmer has used those particular instructions.
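The distinction between what the code does and why it does it can be shown with a small sketch. The function name and the invoice rule it mentions are hypothetical, chosen only for illustration; the one fact relied on is that Python's built-in round() rounds ties to the nearest even integer.

```python
import math


def round_half_up(value):
    """Round value to the nearest integer, with ties going upward."""
    # WHAT this does is plain from the code: add 0.5, then floor.
    # WHY it is written this way is not: the built-in round() sends ties
    # to the nearest even integer (round(2.5) == 2), and the hypothetical
    # invoice rules assumed here require ties to round up instead.
    return math.floor(value + 0.5)
```

Without the comment, a later maintainer could reasonably "simplify" the body to round(value) and silently change the results for tie values.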

Robert Dunn. Software Defect Removal. McGraw-Hill, 1984, pg. 308.

Common sense also leads us to the recognition of the characteristics of programs that make the programs maintainable. Above all, we look for programs that exhibit logical simplicity -- failing that, at least clarity. The earmarks of simplicity and clarity include modularity (true functional modularity, not arbitrary segmentation) and a hierarchical control structure, restrictions on each module's access to data, structured data forms, the use of structured control forms, and generous and accurate annotation.

Much has been said of the technical members of this set in earlier pages. Of good annotation, there are several features that must be included. First, the header information of each procedure should provide a concise statement of the procedure's external specifications, including a description of input and output data. Each section of the procedure should be introduced by comments identifying the section's relation to the external characteristics. Finally, comments within each section should relate groups of statements to the program's documented description. This last is automatically achieved by using design language statements as source code comments.
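The three features of good annotation described above might look roughly like this. It is a sketch only: the function, its parameters, and the "design step" wording are assumptions made up for the example, not taken from the book.

```python
def average_passing_score(scores, passing_mark=50):
    """Return the mean of the scores at or above passing_mark.

    Input:  scores       - iterable of numbers (may be empty)
            passing_mark - lowest score that counts as a pass
    Output: the mean of the passing scores as a float,
            or None when no score passes
    """
    # Section 1: select the inputs that count toward the result
    # (the "at or above passing_mark" rule in the header).
    # Design step: keep each score that is not below the passing mark.
    passing = [s for s in scores if s >= passing_mark]

    # Section 2: produce the documented output (mean, or None).
    # Design step: if nothing passed, report None;
    # otherwise divide the sum of the passing scores by their count.
    if not passing:
        return None
    return sum(passing) / len(passing)
```

The header states the external specification, each section comment ties the section back to that specification, and the "design step" lines stand in for design-language statements kept as comments.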

David Zokaities. "Writing Understandable Code". Software Development, January 2002, pg. 48-49.

Software must be understandable to two different types of entities for two different purposes. First, compilers or interpreters must be able to translate source code into machine instructions. Second, people need to understand the software so they can further develop, maintain and utilize the application. The average developer overemphasizes capability and function while undervaluing the human understanding that effects improved development and continued utilization. There should be a description in clear view within the programming medium.

As I gradually improved my in-code documentation, I realized that English is a natural language, but computer languages, regardless of how well we use them, are still "code." Communication via natural language is a relatively quick and efficient process. Not so with computer languages: They must be "decoded" for efficient human understanding.

People who read my code - wait a moment - did I say "read my code?" Now that's a remarkable way to approach software - not to debug, analyze, program, or develop, but simply to read. The act of reading allows me to approach my code as a work of software art: I strive to make the overall design, algorithm, structure, documentation and style as simple, elegant, thorough and effective as practical. Yes, this takes time, but when I'm rushed, I usually dash off the wrong implementation of the wrong design, and the darn project takes twice as long as it would have had I done it right in the first place. A disciplined, focused approach clarifies my thinking and improves my implementation. In keeping with a reasonable attempt for excellence, I proofread my applications.

My goal is to find a balance, describing all salient program features comprehensively but concisely. I explain each software component's purpose, the algorithm used, arguments, inputs, outputs - even the reason for #include-ing a particular header file. I document each section of each function so that the overall program flow is readily understandable.

David Zokaities. "Feedback". Software Development, March 2002, pg. 14.

My article seems to have generated quite a bit of controversy. The article's text received only praise. This implies that the goal of "understandable code" is well-nigh universal. How best to achieve it seems to be a matter of highly polarized opinion. Even for the wealth of comments that I customarily provide, some readers chided me for not having enough! Other readers believe that all comments are superfluous and cause trouble by their very existence. I've seen horrendous maintenance problems incurred with this approach. Some readers believe that Design by Contract, coupled with lengthy function and variable names, provides all the necessary documentation.

My experience is that well-developed modular designs, coupled with good system documentation, descriptive identifier names and a natural-language narrative, result in code that's a pleasure to work with and efficient to maintain.

Christopher Seiwald. "Pillars of Pretty Code". Software Development, January 2005, pg. 49-51.

The essence of pretty code? One can infer much about its structure from a glance, without completely reading it. I call this visual parsing: discerning the flow and relative importance of code from its shape.

Blend In: Code changes should blend in with the original style.
Bookish: Keep columns narrow.
Disentangle Code Blocks: Break code into logical blocks within functions, and disentangle the purpose of separate blocks, so that each does a single thing or single kind of thing. A reader can avoid a total reading if a cursory inspection can reveal the whole block's nature.
Comment Code Blocks: Set off code blocks with white space and comments that describe each block. Sometimes large code blocks (with multiline comments) may embed small blocks (with single line comments). Comments should rephrase what happens in the code block, rather than be a literal translation into English. That way, even if your code is inscrutable and your comments gibberish, the reader can at least attempt to triangulate on the actual purpose. Big comments are needed for subtle or problematic code blocks, not necessarily big code blocks.
Declutter: Reduce, reduce, reduce. Remove anything that will distract the reader.
Make Alike Look Alike: Two or more pieces of code that do the same or similar thing should be made to look the same. Nothing speeds the reader along better than seeing a pattern.
Overcome Indentation: The left edge of the code defines its structure, while the right side holds the detail. You must fight indentation to safeguard this property. Code that moves too quickly from left to right (and back again) mixes major control flow with minor detail.
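Several of the pillars above (disentangled blocks, commented blocks, alike code looking alike, and a steady left edge) can be sketched in one short function. The function name and the port-validation task are hypothetical, picked only because they are small.

```python
def parse_port(text):
    """Return the TCP port number in text, or None if it is not a valid port."""

    # Anything other than a plain run of ASCII digits cannot be a port.
    if not (text.isascii() and text.isdigit()):
        return None

    # A number is only a port if it falls in the range 1 to 65535.
    port = int(text)
    if port < 1 or port > 65535:
        return None

    return port
```

Each block does one kind of thing, is set off by white space, and carries a comment that rephrases its purpose rather than translating the condition word for word. The two rejection blocks share the same shape, and the early returns keep the main flow on the left edge instead of nesting it.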

Richard Gunderman. "A Glimpse into Program Maintenance" in Techniques of Program and System Maintenance. QED Information Sciences, 1988, pg. 59.

Program documentation has been propelled into importance by sheer necessity. However, it still suffers from glowing tributes but inept implementations. One of the basic elements of good program documentation is an effective program listing.

Henry Ledgard. Professional Software Volume II: Programming Practice, Addison-Wesley, 1987, pg. 65.

A program is in some sense a permanent object in that it can have a long lifetime. For the future reader, comments in a program should be truly substantive. They should say something. They should assist the program reader.

The professional thinks of a comment as a way to proceed from one point (a given state of knowledge) to another (understanding what is written in the program). The comment is a bridge. The professional assumes something about the reader of the program--the reader being, of course, someone else. It is fair to assume that the reader knows the language in which the program is written. The reader's difficulty is to modify the program at hand.

These observations lead to some specific recommendations. First, regarding the idea of comments as a bridge--Extensive introductory program comments are entirely in order. These comments set the stage for reading the program. They may contain an outline of the solution adopted by the programmer, summarize its input and output, give a directory of key variable names, or describe an algorithm that may not be known to the reader. Such comments provide a direct bridge from the problem to the program. They do not intrude on the reading of the program itself because they appear at the beginning of the program and can be read or not as the reader desires.

A second recommendation has to do with procedures and other major units of the program--Introductory module comments are also in order. Comments following a procedure header explaining the general nature of the procedure are not only in order but may be necessary. Keeping in mind the bridge aspect, we need not describe the calling environment. The professional assumes that the reader has read the program to the degree that the procedure calls are understood--but maybe not the procedure itself. As such, the procedure header comments should be short and help the reader understand the next level of detail in the program.

Third, the professional should spend the most energy on the code itself. This means--Avoid embedded (in-line) comments within the body of the module itself. It is my view that such comments can readily intrude upon the meaning of a program. Ideally, the code should speak for itself and require few supporting comments.
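One possible reading of these three recommendations is sketched below: an extensive introductory comment, short procedure header comments, and no in-line comments in the bodies. The program, its names, and its behavior are all hypothetical.

```python
"""Word-frequency report (hypothetical example program).

Problem:   report the most frequent words in a piece of text.
Solution:  reduce the text to lowercase words, count them, then sort
           by descending count, breaking ties alphabetically.
Input:     a string of text.
Output:    a list of (word, count) pairs, most frequent first.

Key names:
    words   - the lowercase words extracted from the text
    counts  - mapping from each word to its number of occurrences
"""
import re
from collections import Counter


def extract_words(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, limit=3):
    """Return up to limit (word, count) pairs, most frequent first."""
    words = extract_words(text)
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:limit]
```

The opening comment carries the outline, the input and output summary, and the directory of key names, so the procedure bodies can stay free of embedded comments and rely on naming.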

Dennis Smith. Designing Maintainable Software. Springer-Verlag, 1999, pg. 103.

Documentation that is structured and contained within the program is able to immediately satisfy the changing information demands of the maintainer. Those needs are determined in part by the subtask on which he is currently working. For solving nontrivial error correction or modification problems, the maintainer must have a detailed understanding of the program. To locate a section of code, knowledge of the program's structure is required. Knowing how an instruction sequence relates to other parts of the program is important for altering and testing software. The documentor can inform the unknowledgeable programmer in each subtask demand by varying the message content and effectively using the visual space.

Information may be conveyed to the maintainer in several ways. One is an abstract summary of the module at the beginning of the routine. Another is through the titles and headings of processing sections positioned in the instruction sequence. The third is in phrases and short sentences to the right of the code. They describe the processing steps and relate them to other parts of the program. The descriptions are organized into an outline that reflects the processing divisions of the routine.

The size and complexity of the module determine whether the information will be used. Small routines may need only comments to the right of the code. A more complete description is required for large programs.
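The three channels described above (an abstract summary at the top, headings over processing sections, and short phrases to the right of the code) might be combined as in this sketch. The function and all of its rates are hypothetical values invented for the example.

```python
def shipping_cost(weight_kg, express=False):
    # Summary: price a parcel from its weight, then apply the express
    # surcharge. All rates are hypothetical and exist only for this example.

    # --- 1. Base price by weight band ---
    if weight_kg <= 1:
        cost = 5.0                            # light parcel: flat rate
    else:
        cost = 5.0 + 2.0 * (weight_kg - 1)    # flat rate plus per-kg charge

    # --- 2. Express surcharge ---
    if express:
        cost *= 1.5                           # scales the total from step 1

    return cost
```

For a routine this small the right-hand phrases alone would probably suffice; the summary and headings are included here only to show all three forms side by side.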

The type of documentation that has just been read has a bearing on the processing of code. Documentation formats act as advance organizers of thought. Each type primes the maintainer for a different response to the instructions encountered. Messages that are consistent with the structure of the program aid recognition and recall. ... Programs are documented to enhance the maintainer's performance.

Penny Grubb and Armstrong Takang. Software Maintenance: Concepts and Practice, 2003, pg. 7, 120-121.

Program comments within and between modules and procedures usually convey information about the program, such as the functionality, design decisions, assumptions, declarations, algorithms, nature of input and output data, and reminder notes. Considering that the program source code may be the only way of obtaining information about a program, it is important that the programmers should accurately record useful information about these facets of the program and update them as the system changes. Common types of comments used are prologue comments and in-line comments. Prologue comments precede a program or module and describe goals. In-line comments, within the program code, describe how these goals are achieved.

The comments provide information that the understander can use to build a mental representation of the target program. For example, in Brooks' top-down model, comments - which act as beacons - help the programmer not only form hypotheses, but refine them to closer representations of the program. Thus, theoretically there is a strong case for commenting programs. The importance of comments is further strengthened by evidence that the lack of good comments in programs constitutes one of the main problems that programmers encounter when maintaining programs. It has to be pointed out that comments in programs can be useful only if they provide additional information. In other words, it is the quality of the comment that is important, not its presence or absence.

David Marin. "What Motivates Programmers to Comment?"

Though programmers are often encouraged to comment their source code more thoroughly, there has been very little scientific investigation into what kinds of situations actually cause programmers to do so. I conducted a statistical study of the CVS repositories of nine Open Source projects, and made four major findings. First, the rate at which programmers comment varies widely from project to project and programmer to programmer; even the same programmer will comment at different rates on different projects. Second, programmers tend to comment larger modifications to source code more thoroughly. Third, more programmers modifying the same file does not, in general, mean more commenting. Finally, programmers tend to comment more when they are modifying code that is thoroughly commented to begin with. I then determined through an experiment with programmers that there is a causal link behind my last finding; that is, the more thoroughly a source code file is commented, the more thoroughly programmers will comment when they make modifications to it.
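The abstract does not say how a commenting rate was measured. Purely as an illustration, one simple stand-in is the fraction of non-blank lines that are comment lines. The sketch below assumes that definition and assumes Python-style "#" comments; both are assumptions for this example, not the study's method.

```python
def comment_rate(source):
    """Return the fraction of non-blank lines that are '#' comment lines.

    Assumed metric for illustration only: a line counts as a comment
    when its first non-space character is '#'. Trailing comments on
    code lines are not counted.
    """
    lines = [line.strip() for line in source.splitlines()]
    non_blank = [line for line in lines if line]
    if not non_blank:
        return 0.0
    comments = [line for line in non_blank if line.startswith("#")]
    return len(comments) / len(non_blank)
```

Applied to two revisions of the same file, a measure like this would give a rough before-and-after comparison of how heavily a change was commented.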

Randall Hyde. Write Great Code: Understanding the Machine, No Starch Press, 2004, pg. 6.

Here are some attributes of great code:

Uses the CPU efficiently (which means the code is fast)
Uses memory efficiently (which means the code is small)
Uses system resources efficiently
Is easy to read and maintain
Follows a consistent set of style guidelines
Uses an explicit design that follows software engineering conventions
Is easy to enhance
Is well-tested and robust (meaning that it works)
Is well-documented