
Posts Tagged ‘embedded systems’

Type compatibility the hard way

January 14th, 2012

When writing in assembly language it is possible to operate on a sequence of bits as if it were an unsigned integer one moment and a floating-point number the next; it is the developer’s responsibility to ensure that a given sequence of bits is operated on in a consistent manner. The concept of type was initially introduced into computer languages to provide information to compilers, enabling them to generate the appropriate instructions for values having the specified type and, where necessary, to convert values from the representation used by one type to the representation used by another. At this early stage language designers tended to keep things simple and to think in terms of what made sense at the machine representation level when deciding which type conversions to permit (PL/1 was a notable exception and the convolutions that occurred to perform some type conversions are legendary).

It took around 10 years for high-level languages to evolve to the point where developers had the ability to create their own named types; Pascal being an early, very well known and standout example. Once developers could create their own types it became necessary to come up with general rules specifying when a compiler must treat two different types as compatible (i.e., be required to generate code to support some set of operations between variables having these two different types).

Most language designers chose the simple option: a type is compatible with another type if it has the same name (scoping/namespace/lookup rules meant that “same name” was effectively the same as “same definition”). This simple option generally included various exceptions for the arithmetic types; developers did not like having to insert explicit casts for what they considered to be obvious conversions (languages such as Ada/CHILL provided a mechanism for developers to specify that a newly defined arithmetic type really was a completely new type that was not compatible with any other arithmetic type; an explicit cast could change this).
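
C is a convenient illustration of this name-based approach (the type and variable names below are invented for the example): the two structure types have identical layouts, but within a translation unit they are distinct types, and the initialization of temp_c is a constraint violation:

struct fahrenheit { int degrees; };
struct celsius    { int degrees; };   /* same layout, different name */

void convert(void)
{
   struct fahrenheit temp_f = { 70 };
   struct celsius    temp_c = temp_f;   /* constraint violation: the types are not compatible */
}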

One of the few languages that took a non-simple approach to type compatibility was CHILL, a language for which I once spent over a year writing the semantics phase of a compiler. CHILL uses what is known as structural compatibility, i.e., essentially two types are compatible if they have the same layout in memory (the language definition actually uses the terms similar and equivalent rather than compatible, and uses mode rather than type; here I will follow modern general terminology). This has obvious advantages when there is a need to overlay types used in different parts of a program onto the same location in storage (note, there is no requirement that the field names be the same). CHILL definitions look like a mixture of C and Pascal; unless you know PL/1 they can look odd to the uninitiated (I think I’ve got them right; my CHILL is very rusty). The types T_1 and T_2 below are compatible:

T_1 = struct (               T_2 = struct (
      f1 :int;                     f3 :int;
      f2 :int;                     f4 :int;
      );                           );

Structural compatibility enables the creation of some rather unusual compatible types, such as the following three types, which are all pair-wise compatible (the keyword ref is used to specify pointer types):

T_3 = struct (               T_4 = struct (             T_5 = struct (
      f1 :int;                     f4 :int;                   f7 :int;
      f2 :ref T_3;                 f5 :ref T_4;               f8 :ref T_5;
      f3 :ref T_4;                 f6 :ref T_3;               f9 :ref T_5;
      );                           );                         );

Because types can be recursive it is possible for the compatibility checking code in the compiler to end up having to check the type it is currently checking. The solution adopted by many CHILL compilers (not that there were ever many) was to associate an is_currently_being_checked flag with every type’s symbol table entry: if, during compatibility checking, this flag has the value TRUE for both types, then they are treated as compatible; otherwise the flag is set to TRUE for both types and checking continues (all flags are reset to FALSE at the end of compatibility checking).

To check T_3 and T_4 in the above code, set the is_currently_being_checked flag on both types to TRUE and iterate over the fields in each record. The first field pair have the same type; the second field pair are pointers to types that are already being checked, and are therefore treated as compatible, as are the third field pair; so the two types are compatible. Checking T_3 and T_5 requires a second iteration through T_5 because of the pointer to T_4, which does not yet have its is_currently_being_checked flag set.
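
The following is a minimal sketch, in C, of the flag-based approach just described; the Type representation, field names and function are invented for illustration, they are not taken from any actual CHILL front end:

#include <stdbool.h>

typedef enum { TY_INT, TY_REF, TY_STRUCT } TypeKind;

typedef struct Type {
   TypeKind kind;
   bool is_currently_being_checked;
   struct Type *referenced;   /* pointed-to type, when kind == TY_REF      */
   int num_fields;            /* number of fields, when kind == TY_STRUCT  */
   struct Type **fields;      /* field types, when kind == TY_STRUCT       */
   } Type;

/* Flag-based structural compatibility check; the caller resets all
   is_currently_being_checked flags to false once the top-level check
   has completed.                                                     */
static bool compatible(Type *t1, Type *t2)
{
   if (t1->kind != t2->kind)
      return false;
   if (t1->is_currently_being_checked && t2->is_currently_being_checked)
      return true;            /* assume compatible to stop the recursion */
   t1->is_currently_being_checked = true;
   t2->is_currently_being_checked = true;

   bool result = true;
   switch (t1->kind)
      {
      case TY_INT:
         break;               /* all ints treated as the same type here */
      case TY_REF:
         result = compatible(t1->referenced, t2->referenced);
         break;
      case TY_STRUCT:
         if (t1->num_fields != t2->num_fields)
            result = false;
         else
            for (int i = 0; result && i < t1->num_fields; i++)
               result = compatible(t1->fields[i], t2->fields[i]);
         break;
      }
   return result;
}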

Yours truly discovered that one flag was not sufficient to do fully correct compatibility checking. It is necessary to maintain a stack of the locations (e.g., the structure field or procedure parameter where compatibility checking has to recurse to check a user defined type) in the two types being compared, in order to detect that some types are not compatible. In the following example (involving pointer to procedure types; it is longer than I remember the actual instance being, but I had to recreate it from vague memories and my CHILL expertise has faded; suggestions welcome) types A and B would be considered compatible using the is_currently_being_checked flag approach, because by the time the last parameter is checked both symbol table flags have been set. You can see by inspection that types X and Y are not compatible (they have a different number of parameters, to start with). Looking at the stack of previous compatibility checks for A/B would show that no X/Y compatibility check had yet been made, and one would be needed for the third parameter (which would fail):

A = proc(X, Y,            X);
B = proc(C, proc(A, int), Y);
 
C = proc(E);
D = proc(A);
E = proc(proc(X, proc(A, int), X));
 
X = proc(D);
Y = proc(A, int);
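
A sketch of the fix, continuing the invented Type representation from the earlier sketch (procedure parameters would be handled in the same way as the struct fields shown here): rather than a per-type flag, keep a stack of the pairs of types whose comparison is currently in progress, and only cut the recursion short when the same pair turns up again:

typedef struct CheckedPair {
   Type *t1, *t2;
   struct CheckedPair *previous;   /* the rest of the stack */
   } CheckedPair;

static bool already_being_checked(CheckedPair *stack, Type *t1, Type *t2)
{
   for (CheckedPair *p = stack; p != NULL; p = p->previous)
      if ((p->t1 == t1 && p->t2 == t2) || (p->t1 == t2 && p->t2 == t1))
         return true;
   return false;
}

/* Pair-based structural compatibility; pass NULL as stack at the top level. */
static bool compatible_2(Type *t1, Type *t2, CheckedPair *stack)
{
   if (t1->kind != t2->kind)
      return false;
   if (already_being_checked(stack, t1, t2))
      return true;                 /* this particular pair is already in progress */

   CheckedPair this_pair = { t1, t2, stack };   /* push the pair onto the stack */

   switch (t1->kind)
      {
      case TY_INT:
         return true;
      case TY_REF:
         return compatible_2(t1->referenced, t2->referenced, &this_pair);
      case TY_STRUCT:
         if (t1->num_fields != t2->num_fields)
            return false;
         for (int i = 0; i < t1->num_fields; i++)
            if (!compatible_2(t1->fields[i], t2->fields[i], &this_pair))
               return false;
         return true;
      }
   return false;
}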

The potential for complexity created by the use of structural compatibility is one reason why its use is rare. While it is possible to rationalize that CHILL was targeted at embedded telecommunication systems containing lots of code where memory costs can be significant, I suspect that those involved had a hardware mentality and a poor grasp of practical software engineering issues.

Incidentally, the design of the llvm type checking system relies on using an equality test to check for type equality. While this decision will increase the difficulty of integrating languages that use structural type compatibility into llvm, these languages are probably sufficiently rare that it is much more cost effective to make it simple to implement the more common languages.

Where did type compatibility go next? Well, over the last 20 years the juggernaut of object oriented design has pretty much excluded sophisticated non-OO type systems from mainstream languages (e.g., C++ and Java), but that is a topic for another article.

Compiler writing in the next decade

December 22nd, 2009

What will be the big issues in compiler writing in the next decade? Compilers sit between languages and hardware, with the hardware side usually providing the economic incentive.

Before we set off to follow the money, what about the side that developers prefer to talk about? The last decade has not seen any convergence to a very small number of commonly used languages; if anything there seems to have been a divergence, with more languages in widespread use. I will not attempt to predict whether there will be a new (in the sense of previously limited to a few research projects) concept that becomes widely integrated and used in many languages (e.g., the way object oriented features were integrated into languages in the 90s).

Where is hardware going?

  • Moore’s law stops being followed. Moore’s law is an economic one that has a number of technical consequences (e.g., less power consumed and, until recently, increasing clock rates). Will the evolution of the x86 architecture dramatically slow down once processor manufacturers are no longer able to cram more transistors onto the same amount of chip real estate? Perhaps processor designers will start looking to compiler writers to suggest functionality that compilers could make use of to generate better code. To date my experience of processor designers is that they look to Moore’s law to get a ‘free’ performance boost.

    There are a number of things a compiler could tell the processor, such as when a read or write to a cache line is the last one that will occur for a long time (enabling that line to be moved to the top of the reuse list); see the example after this list.

  • Not plugged into the mains. When I made a living writing optimizers the only two optimization choices were code size and performance. There are a surprising number of functional areas in which a compiler, given processor support, can potentially generate code that consumes less power. More on this issue in a later post.
  • More than one processor. Figuring out how to distribute a program across multiple, loosely coupled, processors remains a difficult research topic. If anybody ever comes up with a solution to this problem it might make more commercial sense for them to keep it secret, selling a compiling service rather than selling compilers.
  • Application Specific Instruction-set Processors. Most processors in embedded systems only ever run a single program. The idea of each program being executed on a processor optimized to its requirements sounds seductive. At the moment the economics are such that it is cheaper to take an existing, very low cost, processor and shoe-horn the application onto it. If the economics change the compiler used for each processor is likely to be automatically generated.
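
To give a flavor of the cache hints mentioned in the first item: x86 already provides non-temporal store instructions that a compiler (or developer) can use to say that written data will not be reused any time soon. The following made-up function uses the SSE2 intrinsics:

#include <emmintrin.h>   /* SSE2 intrinsics */

/* Copy a buffer whose contents will not be read again for a long time;
   the non-temporal stores tell the processor not to keep the written
   cache lines around for reuse.                                        */
void copy_no_reuse(int *dest, const int *src, int n)
{
   for (int i = 0; i < n; i++)
      _mm_stream_si32(&dest[i], src[i]);
   _mm_sfence();   /* make the streaming stores visible to other processors */
}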

Enough of the hardware, this site is supposed to be about code:

  • New implementation techniques. These include GLR parsing and genetic algorithms to improve the quality of the generated code. The general availability of development machines containing more than 4G of memory will make it worthwhile for compiler writers to implement more whole program optimizations (which are currently hemmed in by storage limits).
  • gcc will continue its rise to world domination. The main force at work here is not the quality of gcc but the disappearance of the competition. Compiler writing is not a big bucks business and compiler companies are regularly bought up by much larger hardware outfits looking to gain some edge. A few years go by, plans change, the compiler group is not making enough profit to warrant the time being spent on it by upper management, and it is closed down. One less compiler vendor, and a bunch of developers are forced to migrate to another compiler, which may or may not be gcc.
  • Figuring out what the developer meant to write, based on what they actually wrote and some mental model of software developers, is my own research interest. This is somewhat leading edge stuff; in other words, nothing major has been achieved so far. Knowledge of developer intent looks like it will open the door to whole classes of new optimization techniques.

volatile handling sometimes overly volatile

March 2nd, 2009

The contents of some storage locations used by a program might be modified outside the control of that program, e.g., a real-time clock or the input port of a communications device. In some cases writing to particular storage locations has an external effect, e.g., a sequence of bits is sent down a communications channel. This kind of behavior commonly occurs in embedded systems.

C and C++ support the existence of variables that have been mapped to such storage locations through the use of the volatile type qualifier.

volatile long flag;
volatile time_t timer;
 
struct {
                    int f1 : 2;
          volatile int f2 : 2;
                    int f3 : 3;
         } x;

When a variable is declared using volatile, compilers must assume that its value can change in ways unknown to the compiler and that storing values into such a variable can have external effects. Consequently almost all optimizations involving such variables are off limits. Idioms such as timer = timer; are used to reset or refresh timers and are not dead code.
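
A small made-up example of why this matters, reusing the flag and timer declarations above: without the volatile qualifiers a compiler would be entitled to load flag once, spin forever on a register copy, and remove timer = timer; as dead code:

#include <time.h>

extern volatile long flag;     /* set by an interrupt handler or device */
extern volatile time_t timer;

void wait_for_device(void)
{
   while (flag == 0)   /* volatile forces a fresh load on every iteration */
      ;                /* busy wait until the device sets the flag        */
   timer = timer;      /* read then write back; not dead code for a volatile object */
}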

Volatile is a bit of an inconvenience for writers of code optimizers, requiring them to add checks to make sure the expression or statement they want to optimize does not contain a volatile-qualified variable. In many environments the semantics of volatile are not applicable, which means that optimizer bugs involving volatile have a much smaller chance of being detected than faults involving other language constructs.

There have been a number of research projects investigating the use of the const qualifier, but as far as I know only one that has investigated volatile. The ‘volatile’ project found that all of the compilers tested generated incorrect code for some of the tests. License agreements prevented the researchers from giving details for some compilers. One interesting observation for gcc was that the number of volatile-related faults increased with successive compiler releases; it looks like additional optimizations were being implemented and these were not checking for variables being volatile-qualified.

One practical output from the project was a compiler stress tester targeting volatile-qualified variables. The code generated by stress testers often causes compilers to crash and hang, and the researchers reported the same experience with this tool.

There is one sentence in the C Standard whose overly broad wording is sometimes a source of uncertainty: “What constitutes an access to an object that has volatile-qualified type is implementation-defined.” This sentence applies in two cases: types occupying multiple bytes, and bit-fields.

To save space/money/time/power some processors access storage a byte, or half-word, at a time. This means that, for instance, an access to a 32-bit storage location may occur as two separate 16-bit operations; from the volatile perspective does that constitute two accesses?

Most processors do not contain instructions capable of loading/storing a specified sequence of bits from a byte. This means that accesses to bit-fields involve one or more bytes being loaded and the appropriate bits extracted. In the definition of x above, an access to field f1 is likely to result in fields f2 (and f3) also being accessed; from the volatile perspective does that constitute an access to the volatile-qualified f2?
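
The assignment below (made up for illustration; x is the structure declared earlier) shows the problem: writing a bit-field is a read-modify-write of the containing storage unit, so the non-volatile f1 cannot be updated without also loading and storing the bits of the volatile-qualified f2:

x.f1 = 1;   /* on most processors this compiles to something like:
                  load the storage unit containing f1, f2 and f3
                  clear the two bits used by f1
                  or in the new value
                  store the whole unit back, rewriting f2's bits as well */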

Unfortunately it is very difficult to obtain large quantities of source code for programs aimed at the embedded systems market. This means that it is not possible to obtain reliable information on common usage patterns of volatile variables. If anybody knows of any such code base, please let me know.