Archive

Archive for September, 2015

Forces driving patterns found in call graphs of human written code

September 30th, 2015 No comments

It has been a while since I posted anything about generating source that mimics the characteristics of human written code.

Generating function definitions is the easy bit, although variable selection is fiddly and naming needs to be handled.

The hardest part of mimicking human written code is linking functions together, via calls, in apparently meaningful ways. Within one source file it is probably possible to get away with individual calls to other local functions (most function definitions are called once) and a couple of calls to third party libraries. At the complete program level the code needs to tell a story, and doing this is very hard.

I think function calls can be divided into three categories, all based on the relationship between the developer who writes the code than does the call and the developer(s) who wrote the called function:

  • Caller developer is the same person as the callee developer. One person gets to decide how things are done,
  • Caller developer works on the same team as the callee developer. Here an interface needs to be negotiated with one or more other people,
  • Caller developer is making use of a third-party library. The interface is pre-decided, take it or leave it (the same principle applies to library updates, the caller has to decide whether to stay with the existing version or make the changes needed to upgrade to the new version; Android is the (in)famous example of a frequently changed library that developers are under strong pressure to continually adapt to).

Calls to functions in third-party libraries tend to follow stylized sequences, e.g., open_* occurs before read_*/write_* and close_* appears last. Most of the time a generator could get away not calling functions from third-party libraries, because the number of such calls is often small, but when they occur they had better look correct.

Needing to call a function written by another developer on the team opens up all sorts of possibilities: there is always the option of them modifying the function to make life easier for the new call, but the cost of modifying all the existing call sites (plus any associated call chains) may be too high, perhaps a new global variable can be used to communicate the desired information, or perhaps the proposed usage is a small cog in a large wheel and has to make do (or perhaps they don’t like the person asking for the change and reasons are invented for staying as-is).

Then there is correlation in time and space (this has a big impact on patterns of evolution, at least I think so; models of forest fires, i.e., growth, death, fires of various sizes creating space for new growth, is the obvious parallel):

  • The longer a function exists the more likely it is to accumulate callers. A function definition can remain unchanged and yet over time become more and more difficult to change.
  • It can be very expensive to make changes to an existing function definition when there are lots of calls to it.

What do measurements of code have to say? Almost nothing; existing studies mostly do weird things like treating a system’s call graph as-if it were a social network (using a currently trendy metaphor is good for getting papers published, even if the mapping is all wrong), plotting power law-like graphs and sprouting portentous nonsense.

What is really needed is measurements of forked systems; comparing systems derived from a common past will tell us how much natural variation exists due to individual choice.

FreeBSD, NetBSD and OpenBSD are the obvious poster children of forked, common heritage, systems written in C; I cannot think of any such systems written in Java or C++.

Workshop on analyzing software engineering data

September 25th, 2015 No comments

I am teaching a workshop, analyzing software engineering data, on 16 January 2016. If you meet the assumed level of know-how (basic understanding of maths to GCSE level, fluent in at least one programming language {i.e., written 10k+ lines of code} and will turn up with a laptop that has R installed), then you are welcome to sign up, its free. The event is being organized by ACCU London.

The focus is on extracting information that is useful to practicing software developers for creating software systems; statistics is used as a tool to find patterns in the data (R is used for this and the programs have the form: read_data(); format_data(); appropriate_statistical_function(); plot_results() and are usually contained in 10-30 lines).

The maths/programming requirements are there because the focus is on the software engineering ideas implied by the data; people need to implicitly understand how an equation fits together (not because there will be lots of algebra, there isn’t) and to be able to pick up and use a new language quickly.

The material is based on a book I am working on.

Its a hands-on workshop, with me talking for an hour or so and then everybody analyzing data for an hour, repeating until end-of-day. I have plenty of data for you to work on, but if you do have some software engineering data that you are willing to share with everybody, please bring it along.

The workshop is something of an experiment; as far as I know there are no books or courses aimed at software developers interested in analyzing software engineering data (there are a few books containing an assortment of academic papers). If the material is too easy I can speed up, if it is too hard then I will slow down; if the material is of no practical use we can all leave early.

The plan is to start at the beginning and cover all the important topics in software engineering. Obviously this requires more than a one day workshop. If there is enough interest there will be more workshops covering different topics (assuming I have time to organize the material and an available venue permitting).

Tags: , ,

The Ockham Sound Analysis Criteria

September 16th, 2015 No comments

Yesterday a headline on a tool vendor blog caught my eye “… meets NIST high assurance standards”. What is this high assurance standard that I had not heard about before?

It turns out to be the Ockham Sound Analysis Criteria (which is not a standard, but is sort of connected to assurance at some level). I have been following the NIST group (now known as SAMATE) behind this work since their first meeting (the only one I have attended); their Static Analysis Tool Exposition (SATE) work is a great idea, but I imagine it has been an uphill battle convincing tool vendors to publicly expose the strength and weaknesses of their tools.

Passing the Ockham Sound Analysis Criteria requires that a static analysis tool detect “…a minimum of 60% of buggy sites OR of non-buggy sites.”, with no false-positives (which I take to mean that no incorrect warnings generated, i.e., the tool cannot incorrectly say “there is a fault here” or “there are no faults here”).

The obvious low cost tool implementation is to pattern detect the known problems in the known test suite (called the Juliet test suite) and output warnings about them. The only way for SAMATE to stop companies doing this (or at least tuning their tools to pass the suite) is to regularly change the test cases used.

I think I understand the rationale being the no false-positive requirement (SAMATE are using the marketing term “sound”). NIST want static analysis tools to be usable by people who don’t know anything about software; a strange idea I know, but the Nuclear Regulatory Commission have wanted to do this in the past.

Not being willing to accept false-positives kills innovation. New analysis techniques invariably start out being unreliable and improve over time.

I suspect that few vendors will have any interest in claiming to meet the Ockham Sound Analysis Criteria (apart from the ones that pattern match on the tests to satisfy some contract requirement). There is too much downside (new tests could put a vendor in the position of having to make a big investment just to continue to meet the criteria) for almost no upside (does anybody make purchase decisions based on this criteria?)

I think that the tool vendor (TrustinSoft) found they could make the claim and being relatively new in the tools market thought it might mean something to customers (I doubt it will, as their sales people are probably finding out). Of course what customers really want tool vendors to tell them is that their code does not contain any problems.

Recent formal methods and C papers (Sep 2015)

September 14th, 2015 2 comments

I have been catching up on my reading of papers from this year’s Programming Language Design and Implementation conference (whose organizers have not yet figured out that linking to pdfs of the papers might be useful).

Needless to say there are a few papers on formal methods and C:

  • “A Formal C Memory Model Supporting Integer-Pointer Casts” is a truly awful paper. It starts out: “The ISO C standard famously does not give semantics to a significant subset of syntactically valid C programs.” and goes down hill from there. As far as I know only one language, Algol 68, defines semantic requirements using syntax, all other languages specify a syntax which is a very large superset of the set of semantically valid programs. The paper goes on to define a C-like language that is also Java-like, C#-like and *-like most languages created in the last 20 years. I have no idea why this paper got accepted, is PLDI now a third tier conference?
  • Defining the undefinedness of C from the C-semantics guys. I could only find a version from 2012 online. Come on guys, you’re letting down one of your cheer-leaders. Update: pdf now available.
  • A Formal C Memory Model for Separation Logic (not at PLDI, but popped up on arXiv today). This is one of those annoying papers that could have been great, but shoots itself in the foot. The first 20 pages shows that the author is aware of some of the complications involved in modeling C’s behavior. This is followed by pages and pages of definitions, a scattering of lemmas and Facts; at page 51 The Theorems start, blah, blah, blah. Then we are almost done, there is a discussion of related work.

    Where is it shown that any of this stuff is connected to the requirements contained in the C Standard? The source of the implementation is provided, lets look at that; hmmm, no cross references to the C Standard here (in fact it is almost comment free). What about testing, processing source code to see what happens. The only mention of testing appears while discussing what the competition do (C-semantics; those pesky Americans again, not only not using Coq but testing their formal tools, don’t they know that anything written using mathematics must be correct).

    The author’s draft PhD thesis says something about testing; but I get the feeling that he only says something about it because the competition does, even mostly using their+others tests rather than coming up with lots of his own.

    While this work (part of the CH2O project) has clearly created a system that handles a chunk of real C, I don’t think it is anywhere close to being a very accurate model of C semantics. The author appears to be so much more interested in doing interesting mathematical stuff and finds it rather tiresome that the realities of C semantics disrupt the idealism.

Showing that they have clearly not learned how things are done in the formal semantics community, those pesky Americans have gone and produced a formal semantics for Javascript and tested it against the ECMAScript 5.1 conformance test suite (passing all 2,782 core language tests, Chrome V8 is the only other implementations that does this).

The compiler/interpreter distinction

September 8th, 2015 No comments

What is the difference between compiled and interpreted programs?

In the good-old-days the distinction was easy to make: compiled code is executed by hardware while interpreted code is executed by software.

These days it can be very difficult to decide whether a program will be executed by hardware or software, and in some cases both may occur. More complicated cpus implement some of their instructions in micro-code (software control of very low level hardware resources) and the virtual machines specified for software execution can be implemented in hardware (an interesting project for a group of talented students in their summer holidays wanting to learn about ASICs).

Some people make a distinction based on the abstraction level of the cpu specification, e.g., very high level abstraction means the code must be interpreted. In practice the implementation of a cpu specification in hardware or software is an economic decision (software may be slow, but its a lot cheaper to implement).

I think there is a compiler/interpreter distinction, but the difference is not about how code is executed (the hardware/software distinction is a convenient difference that is easy to explain).

The compiler/interpreter distinction is a difference of responsibility. Compilers treat programs like the Spartans treated their children, they are bundled into a file of the appropriate format and left for the Operating system to load into memory and point the cpu at the first instruction (a cpu’s one interest is executing the sequences of instructions pointed to by the program counter). Interpreters are more like dotting nannies, organizing the provision of memory and on call to provide access to the desired resources.

Sometimes a language is classified as an interpreted language. There is no such thing as an interpreted language, only languages which are much more easily implemented using an interpreter than a compiler.

The performance of the Spartan approach may be very desirable, but the cost of achieving it can be very high.

Actively maintained production compilers for middle-age languages

September 1st, 2015 10 comments

The owners of the Borland C++ compiler have stopped maintaining it. So we are now down to, by my counting, three four different production quality C++ compilers still being actively maintained (Visual C++ {the command line c1.exe, not the interactive IDE compiler}, GCC, LLVM and EDG); lots of companies repackage EDG and don’t talk about it.

How many production compilers for other middle-age languages are still being actively maintained?

Ada I think is now down to one (GNAT; I’m not sure of the status of what was the Intermetrics compiler).

Cobol has two+ (I’m not sure ow many internal compilers IBM has, some of which are really Microfocus) that I know of (Microfocus and Fujitsu {was ACUCobol}).

Fortran probably needs more than one hand to count its compilers. Nothing like having large engineering applications using the languages features supported by your compiler to keep the maintenance fees rolling in.

C still has lots of compilers (a C validation suite vendor told me many years ago that they had over 150 customers). Embedded processors can be a very tough target for the general purpose algorithms used in GCC and LLVM, so vendors with hand crafted compilers can still eek out a living.

Perl has one (which I find surprising).

R has one, but like Cobol it is not a fashionable language in compiler writing circles. Over the last couple of years there have been a few ‘play’ implementations and rumors of people creating a new production quality implementation.

Lisp has one or millions, depending on how you view dialects or there could be a million people with a different view on the identity of the 1.

Snobol-4 still has one (yes, I am a fan of this language).

There are lots of languages which have not yet reached middle-age, so its too soon to start counting how many actively supported compilers they still have in production use.