C code is 90% unspecified behavior: more uninformed scare mongering

Another C coding guidelines document, another clueless blanket ban on use of code containing unspecified behavior (no link so its visibility is not increased; the 90% is a back of the envelope calculation, knock yourself out here).

The C Standard defines unspecified behavior as “… provides two or more possibilities and imposes no further requirements on which is chosen in any instance.” Given this one item of information, a ban on using constructs that contain unspecified behavior appears to be a good idea (writing code where the compiler gets to pick among several possible behaviors does not sound like a recipe for consistent program behavior).

What most people lack when thinking about unspecified behavior is an understanding of the design aims for the production of the C Standard; the aim was to be concise. An example of this conciseness is the wording for the order of evaluation of subexpressions “… the order in which side effects take place are both unspecified.”

Consider the subexpression x+y; should the compiler evaluate x first (putting its value in a register) and then y (putting its value in another register), or should it evaluate y followed by x? In most situations the final result does not depend on the choice of evaluation order, and the Standard gives the compiler the freedom to choose the order that produces the best quality code.

A coding guideline that bans the use of code containing unspecified behavior bans the use of any binary operator (assignment is a binary operator in C, ruling out use of the statement z=0;). The only executable statements that could be written, following this guideline, would be calls to functions taking zero or one argument (the order of argument evaluation is unspecified, which rules out calls with two or more arguments) or global variables appearing on their own in an expression statement.

One case where operand evaluation order matters is printf("Hello")+printf("World"), which can result in either HelloWorld or WorldHello being printed (printf returns the number of characters written). This is an example of the kind of usage that the authors of coding guidelines want to ban.
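The printf case can be made observable from inside a program. The following is a minimal sketch, not taken from any guideline document: the names eval_log, note and sum_with_side_effects are hypothetical, and note stands in for printf by recording a tag instead of writing to stdout.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical log of the order in which operands were evaluated. */
static char eval_log[8];

/* Stand-in for printf: appends a tag to eval_log (the side effect)
   and returns a value (as printf returns a character count). */
static int note(char tag, int value)
{
    size_t n = strlen(eval_log);
    if (n + 1 < sizeof eval_log) {
        eval_log[n] = tag;
        eval_log[n + 1] = '\0';
    }
    return value;
}

/* The returned sum is 10 whichever operand is evaluated first,
   but eval_log may end up as "HW" or "WH": the order in which the
   operands of + are evaluated is unspecified. */
static int sum_with_side_effects(void)
{
    eval_log[0] = '\0';
    return note('H', 5) + note('W', 5);
}
```

Calling sum_with_side_effects() and then printing eval_log shows which order a particular compiler happened to pick; the value of the sum itself does not depend on that choice.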

Coming up with guideline wording that delineates the undesirable unspecified behaviors from the harmless ones is hard. Requiring that the external behavior of code does not depend on the compiler’s choice of unspecified behavior is one possibility (now that power consumption can be an external behavior of note, this framing could be too narrow). The wording used by MISRA C is “No reliance shall be placed on … unspecified behavior”; this raises the flag that it is possible to rely on unspecified behavior and leaves it up to others to fill in the details.
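One way to read "external behavior does not depend on the compiler's choice" is that any side effects are placed in separate statements, so that their order is fixed by the language rather than by the compiler. This is only a sketch of that reading; the names trace, record and sum_in_fixed_order are hypothetical, and record again stands in for printf.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical log of the order in which side effects occurred. */
static char trace[8];

/* Stand-in for printf: appends a tag to trace and returns a value. */
static int record(char tag, int value)
{
    size_t n = strlen(trace);
    if (n + 1 < sizeof trace) {
        trace[n] = tag;
        trace[n + 1] = '\0';
    }
    return value;
}

/* Each call is its own full expression, so 'H' is always recorded
   before 'W'. The final left + right still has unspecified operand
   evaluation order, but nothing observable depends on it. */
static int sum_in_fixed_order(void)
{
    int left, right;
    trace[0] = '\0';
    left = record('H', 5);
    right = record('W', 5);
    return left + right;
}
```

The addition at the end is still "code containing unspecified behavior" in the blanket sense, which is the point: the behavior is present, but the program's output no longer relies on it.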

  1. April 8, 2015 00:06 | #1

    I agree, a blanket ban on the use of code containing unspecified behavior is silly. I am not aware of a coding standard that imposes a blanket ban on “unspecified behavior”, but then again it must not be popular ;-). However, I have heard these silly things said before; just last week I saw someone say (in a discussion on the difference between unspecified and undefined behavior): “I find this interesting. I have only been using C for 35 years, and never worry about the distinction. Why worry about it? Neither is something you want to do. Avoid both, and why worry about it?”. Yeah, right.

    On the other hand, being unaware of code that depends on unspecified behavior is silly too. I am not a fan of the word “reliance” as used in the MISRA-C:2004 rule you referenced. Then again, I am not a fan of how easily MISRA-C:2004 can be (and has been) misinterpreted either. The latest MISRA (MISRA-C:2012) has been cleaned up a bit, and includes improved rationale and clarity on when/where/how exceptions to the guidelines can be made and the process activities required.

    Other than 10 “mandatory” guidelines (those no one can imagine any rational exceptions for), MISRA does not “ban” anything. In fact, it is considered non-compliant development to blindly follow rules (which really ends up being an effort to pacify a tool), just as it is to blindly ignore violations. In the end, evidence of awareness through manual review and/or documentation is all that is required.

    Some important unspecified (and undefined) behaviours are dealt with by specific rules; many are advisory, where it is recommended that developers document their rationale for breaking the rule, but this is not required for MISRA compliance.
    The rest are “required” rules, and if it is necessary not to follow them, documentation and review are necessary. That could simply mean referencing the compiler’s documentation, or commenting in the code what your rationale/intent is; as long as you are following your own procedures/process and can show evidence of it, that’s compliant development.

    You bring up a good point: there is a difference between “critical” and “non-critical” unspecified behavior. To that end, and for clarity, the rule you refer to in your last paragraph has now been reworded as Rule 1.3 of MISRA-C:2012: “There shall be no occurrence of undefined or critical unspecified behavior”. There is also an appendix listing (for C90 and C99) these items specifically.

  1. March 21st, 2015 at 23:16 | #1