January 29, 2012 Derek Jones No comments

What is the connection between a software developer’s risk attitude and the faults they insert in code they write or fail to detect in code they review? This is a very complicated question and in an experiment performed at the 2011 ACCU conference I investigated one particular instance; the connection between risk attitude and recall of previously seen information.

The experiment consisted of a series of problems having the same format (the identifiers used varied between problems). Each problem involved remembering information on four assignment statements of the form:

p = 6 ;
b = 4 ;
r = 9 ;
k = 8 ;

performing some other unrelated task for a short time (hopefully long enough for them to forget some of the information they had previously seen) and then having to recognize the variables they had previously seen within a list containing five identifiers and recall the numeric value assigned to each variable.

When reading code developers have the option of referring back to previously read code and this option was provided to subject. Next to each identifier listed in the recall part of the problem was space to write the numeric value previously seen and a “would refer back” box. Subjects were told to tick the “would refer back” box if, in real life” they would refer back to the previously seen assignment statements rather than rely on their memory.

As originally conceived this experimental format is investigating the impact of human short term memory on recall of previously seen code. Every time I ran this kind of experiment there was a small number of subjects who gave a much higher percentage of “would refer back” answers than the other subjects. One explanation was that these subjects had a smaller short term memory capacity than other subjects (STM capacity does vary between people), another explanation is that these subjects are much more risk averse than the other subjects.

The 2011 ACCU experiment was designed to test the hypothesis that there was a correlation between a subject’s risk attitude and the percentage of “would refer back” answers they gave. The Domain-Specific Risk-Taking (DOSPERT) questionnaire was used to measure subject’s risk attitude. This questionnaire and the experimental findings behind it have been published and are freely available for others to use. DOSPERT measures risk attitude in six domains: social, recreation, gambling, investing health and ethical.

The following scatter plot shows each (of 30) subject’s risk attitude in the six domains (x-axis) plotted against percentage of “would refer back” answers (y-axis).
Risk attitude plotted against percentage
A Spearman rank correlation test confirms what is visibly apparent, there is no correlation between the two quantities. Scatter plots using percentage of correct answers and total number of questions answers show a similar lack of correlation.

The results suggest that risk attitude (at least as measured by DOSPERT) is not a measurable factor in subject recall performance. Perhaps the subjects that originally caught my attention (there were three in 2011) really do have a smaller STM capacity compared to other subjects. The organization of the experiment (one hour during a one lunchtime of the conference) does not allow for a more extensive testing of subject cognitive characteristics.

Categories: Uncategorized Tags: cognitive psychology, correlation, experiment, faults, risk, short term memory

Relative spacing of operands affects perception of operator precedence

January 22, 2012 Derek Jones 1 comment

What I found most intriguing about Google Code Search (shutdown Nov 2011) was how quickly searches involving regular expressions returned matches. A few days ago Russ Cox, the implementor of Code Search not only explained how it worked but also released the source and some precompiled binaries. Google’s database of source code did not include the source of R, so I decided to install CodeSearch on my local machine and run some of my previous searches against the latest (v2.14.1) R source.

In 2007 I ran an experiment that showed developers made use of variable names when making binary operator precedence decisions. At about the same time two cognitive psychologists, David Landy and Robert Goldstone, were investigating the impact of spacing on operator precedence decisions (they found that readers showed a tendency to pair together the operands that were visibly closer to each other, e.g., a with b in a+b * c rather than b with c).

As somebody very interested in finding faults in code the psychologists research findings on spacing immediately suggested to me the possibility that ‘incorrectly’ spaced expressions were a sign of failure to write code that had the intended behavior. Feeding some rather complicated regular expressions into Google’s CodeSearch threw up a number of ‘incorrectly’ spaced expressions. However, this finding went no further than an interesting email exchange with Landy and Goldstone.

Time to find out whether there are any ‘incorrectly’ spaced expressions in the R source. cindex (the tool that builds the database used by csearch) took 3 seconds on a not very fast machine to process all of the R source (56M byte) and build the search database (10M byte; the Linux database is a factor of 5.5 smaller than the sources).

The search:

csearch "w(+|-)w +(*|/) +w"

returned a few interesting matches:

...
modules/internet/nanohttp.c:       used += tv_save.tv_sec + 1e-6 * tv_save.tv_usec;
modules/lapack/dlapack0.f:     $          ( T*( ONE+SQRT( ONE+S / T ) ) ) )
modules/lapack/dlapack2.f:               S = Z( 3 )*( Z( 2 ) / ( T*( ONE+SQRT( ONE+S / T ) ) ) )
modules/lapack/dlapack4.f:     $          ( T*( ONE+SQRT( ONE+S / T ) ) ) )

There were around 15 matches of code like 1e-6 * var (because the pattern w is for alphanumeric sequences and that is not a superset of the syntax of floating-point literals).

The subexpression ONE+S / T is just the sort of thing I was looking for. The three instances all involved code that processed tridiagonal matrices in various special cases. Google search combined with my knowledge of numerical analysis was not up to the task of figuring out whether the intended usage was (ONE+S)/T or ONE+(S/T).

Searches based on various other combination of operator pairs failed to match anything that looked suspicious.

There was an order of magnitude performance difference for csearch vs. grep -R -e (real 0m0.167s vs. real 0m2.208s). A very worthwhile improvement when searching much larger code bases with more complicated patterns.

Categories: Uncategorized Tags: binary operator, Google, precedence, psychology, R whitespace, regular expression, searching code

Type compatibility the hard way

January 14, 2012 Derek Jones No comments

When writing in assembly language it is possible to operate on a sequence of bits as if it were an unsigned integer one moment and a floating-point number the next; it is the developer’s responsibility to ensure that a given sequence of bits is operated on in a consistent manner. The concept of type was initially introduced into computer languages to provide information to compilers, enabling them to generate the appropriate instructions for values having the specified type and where necessary to convert values from the representation used by one type to the representation used by a different type. At this early stage language designers tended to keep things simple and to think in terms of what made sense at the machine representation level when deciding which type conversions to permit (PL/1 was a notable exception and the convolutions that occurred to perform some type conversions are legendary).

It took around 10 years for high level languages to evolve to the point where developers had the ability to create their own named types; Pascal being an early, very well known and stand out example. Once developers could create their own types it became necessary to come up with general rules specifying when a compiler must treat two different types as compatible (i.e., be required to generate code to support some set of operations between variables having these two different types).

Most language designers chose the simple option; a type is compatible with another type if it has the same name (scoping/namespace/lookup rules effectively meant that “same name” was effectively the same as “same definition”). This simple option generally included various exceptions for the arithmetic types; developers did not like having to insert explicit casts for what they considered to be obvious conversions (languages such as Ada/CHILL provided a mechanism for developers to specify that a newly defined arithmetic type really was a completely new type that was not compatible with any other arithmetic type, an explicit cast could change this).

One of the few languages which took a non-simple approach to type compatibility was CHILL, a language for which I once spent over a year writing the semantics phase of a compiler. CHILL uses what is known as structural compatibility, i.e., essentially two types are compatible if they have the same layout in memory (the language definition actually uses the terms similar and equivalent rather than compatible and uses mode rather than type, here I will follow modern general terminology). This has obvious advantages when there is a need to overlay types used in different parts of a program onto the same location in storage (note, no requirements on the fields being the same). CHILL definitions look like a mixture of C and Pascal, unless you know PL/1 they can look odd to the uninitiated (I think I’ve got them right, my CHILL is very rusty), T_1 and T_2 are compatible:

T_1 = struct (               T_2 = struct (
      f1 :int;                     f3 :int;
      f2 :int;                     f4 :int;
      );                           );

Structural compatibility enables the creation some rather unusual compatible types, such as the following three types all being pair-wise compatible (the keyword ref is use to specify pointer types):

T_3 = struct (               T_4 = struct (             T_5 = struct (
      f1 :int;                     f4 :int;                   f7 :int;
      f2 :ref T_3;                 f5 :ref T_4;               f8 :ref T_5;
      f3 :ref T_4;                 f6 :ref T_3;               f9 :ref T_5;
      );                           );                         );

Because types can be recursive it is possible for the compatibility checking code in the compiler to end up having to type check the type it is currently checking. The solution adopted by many CHILL compilers (not that there were ever many) was to associate an is_currently_being_checked flag with every type’s symbol table entry, if during compatibility checking this flag has value TRUE for both types then they are both compatible otherwise the flag is set to TRUE for both types and checking continues (all flags are set to FALSE at the end of compatibility checking).

To check T_3 and T_4 In the above code set the is_currently_being_checked flag to TRUE and iterate over the fields in each record. The first field pair have the same type, the second field pair are pointers to types we are already checking and therefore compatible, as are the third field pair, so the types are compatible. Checking T_3 and T_5 requires a second iteration through T_5 because of the pointer to T_4 which does not yet have its is_currently_being_checked flag set.

Yours truely discovered that one flag was not sufficient to do fully correct compatibility checking. It is necessary to maintain a stack of locations (e.g., the structure field or procedure parameter where compatibility checking has to recurse to check a user defined type) in the two types being compared in order to detect that some types were not compatible. In the following example (involving pointer to procedure types; which is longer than I remember the actual instance I first discovered being, but I had to create it again from vague memories and my CHILL expertise has faded; suggestions welcome) types A and B would be considered compatible using the is_currently_being_checked flag approach because by the time the last parameter is checked both symbol table flags have been set. You can see by inspection that types X and Y are not compatible (they have a different number of parameters to start with). Looking at the stack of previous compatibility checks for A/B would show that no X/Y compatibility check had yet been made and one would be needed for the third parameter (which would fail):

A = proc(X, Y,            X);
B = proc(C, proc(A, int), Y);
 
C = proc(E);
D = proc(A);
E = proc(proc(X, proc(A, int), X));
 
X = proc(D);
Y = proc(A, int);

The potential for complexity created by the use of structural compatibility is one reason why its use is rare. While it is possible to rationalize that CHILL was targeted at embedded telecommunication systems containing lots of code where memory costs can be significant, I suspect that those involved had a hardware mentality and a poor grasp of practical software engineering issues.

Incidentally, the design of the llvm type checking system relies on using an equality test to check for type equality. While this decision will increase the difficulty of integrating languages that use structural type compatibility into llvm, these languages are probably sufficiently rare that it is much more cost effective to make it simple to implement the more common languages.

Where did type compatibility go next? Well, over the last 20 years the juggernaut of object oriented design has pretty much excluded sophisticated non-OO type systems from mainstream languages (e.g., C++ and Java), but that is a topic for another article.

Categories: Uncategorized Tags: CHILL, compatible type, compiler, embedded systems, llvm, mode checking, non-OO, type checking, type system

The Shape of Code

Archive

Correlation between risk attitude and willingness to refer back

Relative spacing of operands affects perception of operator precedence

Type compatibility the hard way

Recent Posts

Recent Comments

Archives

Meta