June 18, 2017 Derek Jones 3 comments

One of the unwritten design aims of the C Standard is that it should be possible to fully implement the C Standard library in conforming C. It turned out that this was not possible in C90; the problem was implementing the memcpy function when the object being copied was an object having a struct type containing one or more padding bytes. The memcpy library function copies the bytes in one object to another object. The padding bytes might be uninitialized (they have an indeterminate value), which means accessing them is undefined behavior (in C90), i.e., use of memcpy for copying structs containing padding results in a non-conforming program.

struct {
        char c; // Occupies 1 byte
        // Possible padding bytes here
        int i;  // A 2/4-byte int sometimes has to be aligned on a 2/4-byte storage boundary
       };

Padding bytes could be set to a known value by, for instance, using memcpy to zero the storage; requiring this usage was thought to be excessive, and a hefty chunk of new words was added in C99 (some of the issues raised by this problem also cropped up elsewhere, which contributed to the will to do this).

One consequence of the new wording is that objects having type unsigned char are special in that while their uninitialized value is still indeterminate, the possible set of values excludes a trap representation, they have an unspecified value making accesses unspecified behavior (which conforming programs can contain). The uninitialized value of objects having other types can be a trap representation; it’s the possibility of a value being a trap representation that makes accessing such uninitialized objects undefined behavior.

All well and good, memcpy can now be implemented in conforming C(99) by copying unsigned chars.

Having made it possible for a conforming program to access an uninitialized object (having type unsigned char), questions about it actual value can be asked. Its value is indeterminate you say, the clue is in the term indeterminate value. Ok, what does the following value function return?

unsigned char random(void)
{
unsigned char x;
 
return x ^ x;
}

Exclusiving-oring a value with itself always produces zero. An unsigned char taking, say, values 0 to 255, pick one and you always get zero; case closed. But where does it say that an indeterminate value is always the same value? There is no wording preventing an indeterminate value being different every time it is accessed. The sound of people not breathing could be heard when this was pointed out to WG14 (the C Standard’s committee), followed by furious argument on one side or the other.

The following illustrates one situation where the value of padding bytes could change with every access. That volatile qualifier specifies that the value of c could change between two accesses (e.g., it represents the storage layout of some memory mapped I/O device). Perhaps any padding bytes following it are also effectively volatile-qualified.

struct {
        volatile char c; // A changeable 1 byte
        // Possible padding bytes may be volatile
        int i;  // No volatility here
       };

The local object x, above, is not associated with a volatile-qualified object. But, so what? Another unwritten design aim of the C Standard is to keep the wording simple, so edge cases are not called out and the behavior intended to handle padding bytes gets applied to local unsigned chars.

A compiler could decide that calls to random always return zero, based on the assumption that while indeterminate values may not be known, they are not time varying.

Categories: Uncategorized Tags: C Standard, padding, uninitialized, unsigned char, unspecified

The Shape of Code

Archive

How indeterminate is an indeterminate value?

Recent Posts

Recent Comments

Archives

Meta