
Is unreachable code primarily created by a fault?

August 2, 2010

Many coding guidelines recommend against the presence of unreachable code in source; the argument being that, while the code itself is harmless, its presence suggests something has gone wrong somewhere else. As a member of the MISRA C++ coding guidelines committee I unsuccessfully argued against this recommendation being included, on the grounds that there was no evidence that unreachable code was a sufficiently strong indicator of a fault, and because of the complications caused by wanting to allow certain kinds of what appeared to be unreachable code (e.g., that used by defensive programming practices).
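
To make the defensive programming case concrete, here is a minimal, hypothetical C++ fragment (my illustration, not taken from any MISRA material): the default branch cannot be reached while callers pass one of the three named enumerators, yet defensive practice keeps it so that a bad value fails loudly at run time.

#include <stdexcept>

enum class Mode { Read, Write, Append };

// The default branch is apparently unreachable: every named enumerator is
// handled above. Defensive programming keeps it so that a corrupted or
// out-of-range value is caught rather than silently falling through.
int open_flags(Mode m)
{
    switch (m) {
    case Mode::Read:   return 0;
    case Mode::Write:  return 1;
    case Mode::Append: return 2;
    default:
        throw std::logic_error("invalid Mode value");
    }
}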

I was recently reading the PhD thesis of Yichen Xie and found some very interesting analysis of the correlation between various kinds of dead code (redundant code being one of the kinds) and faults.

Yichen Xie took a coarse-grained approach to finding a correlation between redundant code and faults. He simply counted the number of files that did/did not contain redundant code and the number of corresponding files known to contain what he called a hard bug. The counts were as follows (statistical analysis using the contingency table method gives a probability well below 0.01 of these values being generated under the null hypothesis):

               Hard Bugs
Dead Code     Yes     No   Totals
  Yes         133     135    268
   No         418    1369   1787
Totals        551    1504   2055

These counts show that if one of the 2,055 files is picked at random the expectation of it containing a Hard Bug is 27%; for a file containing unreachable code the expectation is 50%, and for a file not containing unreachable code it is 23%. So picking a file containing unreachable code almost doubles the expectation of a Hard Bug being found in that file.
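
As a cross-check on these figures, the following sketch (my own, not taken from the thesis) recomputes the conditional percentages and a chi-squared test of independence from the four counts, assuming that the contingency table method referred to is the usual chi-squared test.

#include <cstdio>

int main()
{
    // Observed counts: rows are dead code yes/no, columns are hard bug yes/no.
    const double obs[2][2] = { { 133.0, 135.0 }, { 418.0, 1369.0 } };

    const double row[2] = { obs[0][0] + obs[0][1], obs[1][0] + obs[1][1] };
    const double col[2] = { obs[0][0] + obs[1][0], obs[0][1] + obs[1][1] };
    const double total  = row[0] + row[1];

    // Expectation of a hard bug, overall and conditional on dead-code status.
    std::printf("P(hard bug)                = %.2f\n", col[0] / total);
    std::printf("P(hard bug | dead code)    = %.2f\n", obs[0][0] / row[0]);
    std::printf("P(hard bug | no dead code) = %.2f\n", obs[1][0] / row[1]);

    // Chi-squared statistic for independence (1 degree of freedom); values
    // above 6.63 correspond to a probability below 0.01 under the null hypothesis.
    double chi2 = 0.0;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            const double expected = row[i] * col[j] / total;
            const double diff = obs[i][j] - expected;
            chi2 += diff * diff / expected;
        }
    std::printf("chi-squared = %.1f\n", chi2);
}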

These measurements do not attempt to make a connection between the unreachable code contained in a file and any fault contained within that file. How would the counts change if they were based on there being some form of logical connection between the redundant code and the fault? The counts suggest at least a 50% false positive rate, but then faults that generate unreachable code may be the result of high-level semantic issues that are less likely to be detected by a casual reading of the source or by static analysis tools.

The hoped-for consequence of a coding guideline requiring the removal of unreachable code is that developers analyze the code to understand why it is unreachable, and that in many cases this will result in a fault being uncovered and fixed; the worst-case scenario is that they simply delete the unreachable code.

This research has caused me to upgrade the significance I give to unreachable code, but I remain unconvinced that the false positive rate is sufficiently low for it to be a worthwhile coding guideline.

  1. TemporalBeing
    August 3, 2010 16:43 | #1

    I probably put a lot of “unreachable” code in projects for two reasons: (i) I follow typical defensive programming practices, and (ii) debug-ability.

    I find that having logic that checks parameters catches a lot of other errors early in the process, making those errors apparent while debugging. Further, this code does not get removed for released products so the user has the same reliability for the software that I do under the debugger – one of the things I really hate about the use of assert().

    This does, however, quickly remove all the easy bugs, and the not-so-easy-but-not-hard bugs too. Hard bugs become a bit more apparent, a little easier to debug, but mostly the only ones that remain.

    Just 2 cents.

  2. asterix
    December 3, 2010 11:39 | #2

    I would recommend that you investigate the DO-178B standard, which defines software design, development and verification processes. It is a well-known standard in the avionics industry, in the area of safety-critical software. Dead code and deactivated code issues are also addressed in this standard.

    It basically requires that all source code be derived from system-level requirements. No dead code or debug-related code should be present in the release code (because it can affect the real-time behavior).

  3. December 3, 2010 16:23 | #3

    @asterix
    I am familiar with DO-178B (perhaps we should say 178C now). As I understand it, the rationale for recommending against the use of dead code in this standard mirrors the rationale used by other coding standards groups (e.g., suspicious usage, or untested code that may be executed if control flow is disrupted by some hardware event, or if a variable becomes corrupted and holds an out-of-design-range value, etc.).
