An interesting paper on untangling skill and luck in sports and business made the interesting observation that if, for some activity, it is easy to lose on purpose then skill dominates over luck, while if it is difficult to lose on purpose luck dominates over skill.
This got me thinking about skill in software development and how it might be used to ‘loose on purpose’ (I will leave the question of luck in software development to another post).
What does winning and loosing mean in software development? Winning might mean producing a program quickly, cheaply, that is the most efficient or is the most maintainable. I’m not sure that there is much skill involved in being the slowest or most expensive developer to write code, or to create a very inefficient program.
Everybody has ideas about how to write maintainable code (there is generally little or no experimental evidence to back up these ideas) and from time to time joke articles about writing unmaintainable code are written.
There are practical reasons for wanting to create non-maintainable or difficult to comprehend code, for instance to make it difficult for a third party (who might be a customer or black hat) to figure out how a program operates. So called obfuscating transformations are an active area of research and tools are available to transform source and binary; the effectiveness of some techniques for making source difficult to maintain is debatable and obfuscating source may have little impact on the generated binary.
These tools primarily operate by removing information, e.g., rewriting identifiers to meaningless sequences of characters, mapping structured loops to sequences of
goto and folding constant expressions. Removing information does not require any skill relating to software maintainability.
I would claim that being able to add negative information to a program is evidence of skill for software maintenance.
Some example of the kinds of negative information I would add include:
- Invented ‘magic’ numbers, i.e., numeric literals that look as if they are derived from an application domain requirement. For instance, a reader of the expression
x*12is likely to assume that
12is a constant specific to some aspect of the application; this is an instance where novice developers are trained to invent a symbolic name (e.g.,
max_eggs_in_box) to denote the given value. So if the original code contained the expression
x*6I would modify it to
x*12and then figure out how to later divide the result by two. Code modified to contain many
12s, or other such ‘invented’ constants, is likely to cause readers to waste lots of time head scratching.
- Replace identifiers with names that have no semantic connection with the information they contain. Well chosen identifier names can significantly reduce the effort needed to comprehend source and I’m sure that suitable chosen names would help sow lots of confusion. An experiment I ran a few years ago found that developers use identifier names to make decisions about binary operator precedence; lots of scope for creating confusion here!
- I don’t have enough experience to know whether restructuring class hierarchies will have a worthwhile impact on maintainability, developers have learned to handle poorly designed class hierarchies and I am not sure I can make things much worse than they have already encountered.
Adding redundant code is a commonly used technique. To be effective this code would have to modify variables used in the original program, which means that it would have to occur in a condition that is never executed; a constant conditional expression would fool nobody and the condition would have to rely on a variable being tested against a value that could never occur. This requires application knowledge not code maintenance skills.
Compression and encryption (of strings in source and as much code as possible in executables) are other existing techniques that do not involve maintenance related skills.