Literate Programming
One of the original intentions of the object-oriented programming paradigm was to produce code that is not only well designed, well organized, and manageable, but also easy to read and understand. Somewhere along the way, this point was lost on great hordes of developers in the computing industry who favored cryptic code resembling assembly language and offered numerous justifications for it. A common fallacy holds that cryptic code is faster to develop because it is quicker to type and throw into place, but code that is not easy to read and understand imposes a real cost on a project. The time it takes to write source code is only a small percentage of the time developers spend reading it over the life cycle of a project that makes it to deployment and maintenance, a life cycle that involves code reviews, debugging, unit testing, integration testing, quality assurance, elaboration, and maintenance. In fact, statistics suggest that during the life cycle of a released product, source code is typically read by on the order of ten developers other than the originator in the course of these tasks. This is what makes the readability, clarity, and understandability of the code so important. If these ten other readers must expend time and effort decoding cryptic code, the project pays for it. If the code is easy to read and understand, the time and effort that other developers must expend to work with it is reduced.
The coding style adopted by a project typically outlines the measures taken to ensure good code readability. More specifically, the naming conventions it sets out shape how readable the code will be. Some of the measures that can be taken to facilitate good code readability include:
Standardized class naming conventions, such as prepending specialization to class names in a class hierarchy
Standardized method naming conventions, such as prepending Get & Set for accessors and mutators, as well as making operations have names that are active verbs rather than nouns
Standardized variable naming conventions, such as mapping object names to class names from the UML model without cryptic abbreviations
Standardized variable prefixing notations, such as Hungarian notation or similar prefixing schemes
Standardized variable scoping prefixing notations that indicate the scope of objects and their intent in that scope
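As a minimal sketch of how several of these conventions might look together, consider the small class hierarchy below. The classes Account and SavingsAccount, and all of their members, are hypothetical names invented for this illustration; they do not come from any particular project.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class in a small class hierarchy.
class Account
{
public:
    explicit Account( const std::string& inOwnerName )
        : m_OwnerName( inOwnerName ), m_Balance( 0.0 ) {}
    virtual ~Account() {}

    // Accessors and mutators carry the Get / Set prefixes.
    const std::string& GetOwnerName() const { return m_OwnerName; }
    void SetOwnerName( const std::string& inOwnerName ) { m_OwnerName = inOwnerName; }
    double GetBalance() const { return m_Balance; }

    // Operations are named with active verbs rather than nouns.
    void Deposit( double inAmount )
    {
        assert( inAmount >= 0.0 );
        m_Balance += inAmount;
    }

private:
    std::string m_OwnerName;   // m_ marks a class member
    double m_Balance;
};

// The specialization ("Savings") is prepended to the base class name.
class SavingsAccount : public Account
{
public:
    SavingsAccount( const std::string& inOwnerName, double inInterestRate )
        : Account( inOwnerName ), m_InterestRate( inInterestRate ) {}

    double GetInterestRate() const { return m_InterestRate; }

    // Active verb: adds one period of interest to the balance.
    void ApplyInterest() { Deposit( GetBalance() * m_InterestRate ); }

private:
    double m_InterestRate;
};
```

A caller can then read almost like a sentence: construct a SavingsAccount, call Deposit( 100.0 ), and call ApplyInterest(). None of these names needs a comment to explain what it does.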
The following example illustrates some source code that is cryptic, difficult to follow, and suffers from generally poor readability:
| // What in the world does this code do? |
| for ( int i = 0; i < a.x; i++ ) { |
| p = a[i]; |
| if ( p->flgs & BTMSK_OPN ) |
| p->x++; |
| p->y->z->clcdtd( p->x ); |
| } |
Had this code been written with other developers in mind, it might look something like this:
| // Is it possible that this code is equivalent? |
| Iterator< LineItem >& rLineItemIterator = inPurchaseOrder.GetIterator(); |
| for ( rLineItemIterator.MoveFirst(); rLineItemIterator.HasMore(); rLineItemIterator.MoveNext() ) |
| { |
| LineItem& rLineItem = rLineItemIterator.GetMember(); |
| if ( rLineItem.IsOpen() ) |
| { |
| rLineItem.IncrementDaysLate(); |
| // Accessor that encapsulates the internal structure of the LineItem object... |
| InventoryItem& rInventoryItem = rLineItem.GetInventoryItem(); |
| rInventoryItem.CalculateDaysLeftToDeliver( rLineItem.GetDaysLate() ); |
| } |
| // The original code was so poorly written that it didn't even structure the logic correctly! |
| } |
Some variable prefixing schemes to consider, which have proven to work well together:
| Prefix | Example | Description |
| m_ | m_ChildElement | Scoping descriptor, class member |
| in | inFileName | Scoping descriptor, input argument |
| out | outResultData | Scoping descriptor, output argument |
| inout | inoutDataSet | Scoping descriptor, input / output argument |
| lcl, stk, tmp | stkDataBuffer | Scoping descriptor, local stack object |
| b | bFileOpenedFlag | Boolean value |
| p | pHeapObject | Pointer value |
| r | rSharedObject | Reference value |
| cr | crBorrowedObject | Constant reference value |
| s | sFileName | String object |
| I | ILanguageService | Middleware interface, i.e. COM, CORBA |
| A | AThread, ACriticalSection | Native interface, abstract class |
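The prefixes in the table might be applied as in the following sketch. The class FileNameList and its methods are hypothetical, made up only to show the prefixes in context. The sketch also assumes that each name carries a single prefix, since the table does not say how scoping and type prefixes should be combined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class, used only to illustrate the prefixes from the table.
class FileNameList
{
public:
    // in: input argument.
    void AddFileName( const std::string& inFileName )
    {
        assert( !inFileName.empty() );
        m_FileNames.push_back( inFileName );
    }

    // in: input argument, out: output argument.
    // Returns true and fills outFileName with the first stored name that is
    // longer than inLength; returns false and leaves outFileName untouched
    // when no stored name is long enough.
    bool FindFirstFileNameLongerThan( std::size_t inLength,
                                      std::string& outFileName ) const
    {
        // cr: constant reference to an object owned by someone else.
        for ( const std::string& crFileName : m_FileNames )
        {
            if ( crFileName.size() > inLength )
            {
                outFileName = crFileName;
                return true;
            }
        }
        return false;
    }

    // inout: the argument is both read and modified.
    static void AppendExtension( std::string& inoutFileName,
                                 const std::string& inExtension )
    {
        inoutFileName += inExtension;
    }

private:
    std::vector< std::string > m_FileNames;   // m_: class member
};
```

At the call site, the scope and role of each object stay visible without a look at the declaration: a local such as lclFileNameList is clearly stack-based, and a Boolean such as bFoundFlag = lclFileNameList.FindFirstFileNameLongerThan( 5, sLongFileName ) is clearly a flag.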
It is less important to establish the best variable prefixing convention than it is to establish a convention at all. Many different conventions might work, some better than others. But the absence of any convention results in inconsistent code that is difficult to understand, whereas the presence of a convention, and adherence to it, produces consistent code that is easier to understand.
In addition to variable prefixing standards, it is equally important to give variables meaningful names. Cryptic abbreviations such as "i", "it", or "n", and similarly non-descriptive names, do not help the readability of the source code. As the source code becomes more involved and complex, the effect of each poor variable name snowballs as more and more of them are mixed together in complicated algorithms.
Conversely, variable names that map the source code to the analysis and design models are very helpful in making the source code readable and understandable. Naming a variable after the class it is an instance of greatly helps readability, because the objects' classes remain easy to distinguish when several of them are used in the same segment of source code.
The variable prefixing conventions and class-based variable naming work together: they make it easy to boiler-plate source code, because only a single substitution is required to change an object's class type. Naming conventions such as the Java naming standard, which use the class name inconsistently (capitalized for the type, lower case for the variable), make substituting the class type more difficult in boiler-plating and code generation. For this reason, the mixed notation is not the best practice: the best practice makes substitutions easy and does not penalize the developer or code generator for making them.
| // Mixed notation naming convention, requires multiple substitutions |
| Transaction transaction; |
| transaction.Begin(); |
| // Consistent notation naming convention, requires a single substitution |
| Transaction lclTransaction; |
| lclTransaction.Begin(); |
It is easy to see that substituting a single, case-sensitive value is going to be less problematic than substituting more than one value. In practice, the single substitution also proves less susceptible to errors when substituting variable names. Consider the following example:
| Order order; |
| LineItem lineItem; |
| Border border; |
| BuildLineItem( inLineItemId, lineItem ); |
| order.AddLineItem( lineItem ); |
| ConfigureBorder( border ); |
| order.SetBorder( border ); |
Consider what happens when a substitution is made for "order" in the source code above: the text also occurs inside "border" and "Border", so those names are changed as well. The problems that occur in the substitution are completely avoidable with better naming practices. Clearly this naming practice is not compatible with an approach based on boiler-plating and templates. Either way, articulate names are easier to understand than the cryptic naming style. It is just a matter of determining which style really constitutes the best practice, based on an examination of the consequences encountered in the different contexts in which the naming practices are used.
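The substitution problem can be reproduced with a few lines of code. The sketch below assumes that the boiler-plating tool performs a naive, case-sensitive search and replace. ReplaceAll is a hypothetical helper written for this illustration, not part of any real code generator.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces every case-sensitive occurrence of inFrom
// in inText with inTo, the way a simple boiler-plating tool might.
std::string ReplaceAll( const std::string& inText,
                        const std::string& inFrom,
                        const std::string& inTo )
{
    assert( !inFrom.empty() );
    std::string sResult = inText;
    std::string::size_type lclPosition = 0;
    while ( ( lclPosition = sResult.find( inFrom, lclPosition ) ) != std::string::npos )
    {
        sResult.replace( lclPosition, inFrom.size(), inTo );
        // Skip past the inserted text so it is never matched again.
        lclPosition += inTo.size();
    }
    return sResult;
}
```

With the mixed notation, ReplaceAll( "Order order; Border border;", "order", "invoice" ) yields "Order invoice; Binvoice binvoice;". The type name still needs a second substitution, and both border names are corrupted. With the consistent notation, ReplaceAll( "Order lclOrder; Border lclBorder;", "Order", "Invoice" ) yields "Invoice lclInvoice; Border lclBorder;" in a single pass. A class name such as PurchaseOrder would still be caught by the second substitution, so the consistent notation reduces collisions rather than ruling them out.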