Undefined behavior can travel back in time

The committee that produced the C Standard tried to keep things simple and sometimes made very short general statements that relied on compiler writers interpreting them in a ‘reasonable’ way. One example of this reliance on ‘reasonable’ behavior is the definition of undefined behavior: “… erroneous program construct or of erroneous data, for which this International Standard imposes no requirements”. The wording in the Standard permits a compiler to process the following program:

int main(int argc, char **argv)
{
// lots of code that prints out useful information
1 / 0;  // divide by zero, undefined behavior
}

to produce an executable that prints out “yah boo sucks”. Such behavior would probably be surprising to the developer who expected the code printing the useful information to be executed before the divide by zero was encountered. The phrase quality of implementation is heard a lot in committee discussions of this kind of topic, but this phrase does not appear in any official document.

A modern compiler is essentially a sophisticated domain-specific data miner that happens to produce machine code as output, and compiler writers are constantly looking for ways to use the information extracted to minimise the code they generate (minimal number of instructions or minimal amount of runtime). The following code is from PostgreSQL (ereport and PG_RETURN_INT32 are PostgreSQL macros), and its authors were surprised to find that the “division by zero” message did not appear when arg2 was 0; in fact the entire if-statement did not appear in the generated code. Based on my earlier example you can probably guess what the compiler has done:

if (arg2 == 0)
   ereport(ERROR, (errcode(ERRCODE_DIVISION_BY_ZERO),
                                             errmsg("division by zero")));
/* No overflow is possible */
PG_RETURN_INT32((int32)arg1 / arg2);

Yes, it figured out that when arg2 == 0 the divide in the call to PG_RETURN_INT32 results in undefined behavior, and decided that the actual undefined behavior in this instance would not include making the call to ereport, which in turn made the if-statement redundant (smaller+faster code, way to go!).
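The shape of the problem can be sketched in a few lines. This is a minimal illustration, not the original source: report_error is a hypothetical stand-in for ereport and div_unmarked is a made-up name. Whether a given compiler actually drops the check depends on the compiler, its version and the optimisation level; the point is only that the Standard permits it.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for ereport(): nothing tells the compiler that
   this function fails to return, and here it really does return. */
void report_error(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

/* Hypothetical name; mirrors the check-then-divide pattern above. */
int32_t div_unmarked(int32_t arg1, int32_t arg2)
{
    if (arg2 == 0)
        report_error("division by zero");
    /* Control reaches this divide even when arg2 == 0.  Dividing by zero
       is undefined behavior, so an optimiser is permitted to assume
       arg2 != 0 here and treat the test above as always false. */
    return arg1 / arg2;
}
```

Calling div_unmarked(6, 3) is well defined and returns 2; calling div_unmarked(6, 0) is the case where the Standard imposes no requirements, including on whether the message is printed.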

There is/was a bug in PostgreSQL because of this compiler behavior. The finger of blame could be pointed at:

  • the developers for not specifying that the function ereport does not return (this would enable the compiler to deduce that there is no undefined behavior because the divide is never executed when arg2 == 0),
  • the C Standard committee for not specifying a timeline for undefined behavior, e.g., program behavior does not become undefined until the statement containing the offending construct is encountered during program execution,
  • the compiler writers for not being ‘reasonable’.
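The first of these options might look something like the following sketch, which assumes a C11 compiler and uses the _Noreturn function specifier. The names fatal_error and div_noreturn are hypothetical, and exiting the process is just one way of not returning; it is not a claim about how ereport is actually implemented.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error reporter.  _Noreturn (C11) tells the compiler that
   control never comes back to the caller. */
_Noreturn void fatal_error(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);
}

/* Hypothetical name; same pattern as before, with the reporter marked. */
int32_t div_noreturn(int32_t arg1, int32_t arg2)
{
    if (arg2 == 0)
        fatal_error("division by zero");
    /* The divide is now unreachable when arg2 == 0, so the compiler has
       no undefined behavior to reason from and must keep the check. */
    return arg1 / arg2;
}
```

With the annotation in place div_noreturn(6, 0) prints the message and terminates the program, while div_noreturn(6, 3) returns 2 as before. Putting the divide in an else branch has the same effect without relying on the annotation.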

In the coming years more and more developers are likely to encounter this kind of unexpected behavior in their programs as compilers do more and more data mining and are pushed to improve performance. Other examples of this kind of behavior are given in the paper Undefined Behavior: Who Moved My Code?

What might be done to reduce the economic cost of the fallout from this developer ignorance/standard wording/compiler behavior interaction? Possibilities include:

  • developer education: few developers are aware that a statement containing undefined behavior can have an impact on the execution of code that occurs before that statement is executed,
  • change the wording in the Standard: for many cases there is no reason why the undefined behavior should be allowed to reach back in time to before the statement causing it is executed; this does not mean that any program output is guaranteed to occur, e.g., the host OS might delete any pending output when a divide by zero exception occurs,
  • paying gcc/llvm developers to do front end stuff: nearly all gcc funding is to do code generation work (I don’t know anything about llvm funding) and if the US Department of Homeland Security is interested in software security it should fund related front end work in gcc and llvm (e.g., providing developers with information about suspicious usage in the code being compiled; the existing -Wall is a start).

  1. July 12th, 2012 at 16:23 | #1

    Neat find! But the error is entirely on the programmers in this case, that PG_RETURN_INT32 call should’ve been wrapped in an else block.

    Then again, I’d think by now a compiler could figure out when most error functions are non-returning. Unless ereport is overloaded… notice its first argument is ERROR, maybe if its first argument is something else (WARNING?) the function does return; if that’s the case then it’s entirely reasonable for the compiler not to even try very hard to detect whether the function is non-returning (Halting Problem etc) and again it’s the programmers’ fault for poor coding.

  2. Magnus
    July 16th, 2012 at 08:20 | #2

    I’m very much in favour of the current behaviour in C and C++. The philosophy after all is that it is up to the programmer to know about undefined behaviour, and to be able to reason about her programme (in a non-Turing way) to conclude that undefined behaviour “will never happen” based on all possible inputs and behaviour of the programme.

    Having said that, it’s unreasonable to expect every programmer to know every undefined behaviour. Never mind to be able to write bug-free code that never triggers ub (or worse, a ub timebomb). I think there should be a distinction between ‘release’ (optimised) and ‘debug’ code in the standard itself, where undefined behaviour is caught and reported in debug builds (as far as it can be — e.g. uninitialised memory allocated on the heap would be hard to check for).

    Short version: first point in the first three options, third point in the second three options.

  3. Robert
    October 18th, 2014 at 11:44 | #3

    Except, the intention of the ‘undefined behaviour’ parts of the C standard is to allow the compiler writer to use the behaviour DEFINED by the host machine. Two’s complement on a machine that does two’s complement. Trap on divide by zero or set the result to INT_MAX, INT_MIN or ZERO. Or something new that the standards committee haven’t thought of.

    It is not supposed to be a license to delete the entire program; this is what the GCC/clang etc writers are trying to make you believe.

  4. October 18th, 2014 at 22:00 | #4

    Your point of view was certainly the intention of some members of WG14, but not all (mostly the implementers, of which I was one). While I have not been a regular attendee of committee meetings for the last 10 years, I doubt that things have changed.

    Compiler writers have customers and these invariably pay money because they want faster code. Users (i.e., non money paying) are road kill.
