Explaining the decline of German comments in LibreOffice

April 28th, 2016 3 comments

The LibreOffice suite of office programs traces it origins back to StarWriter, first launched in 1985 by a German company. Given its German parentage it’s no surprise that the source code contains comments written in German.

There has been a push by the LibreOffice folks to convert the German comments to English comments. The plot below shows the number of German comments in the source of LibreOffice over time (release version numbers at the top and red line is the least squares fit of an exponential; code and data).

Number of German comments in LibreOffice over time

I am not only surprised to see such a regular decline in the number of German comments, but also that the decline is exponential.

The pattern of behavior may be driven by those doing the work:

  • people may be motivated by the number of remaining German comments; as the number decreases, people may be less likely to think it is worthwhile converting what is left,
  • perhaps those doing the conversion say to themselves: “I will do x% of the comments and then stop”. Having decided on this approach, there would have to be some form of signaling to other involved parties, otherwise the rate of decline would not be so smooth.

Perhaps the issue is the skill required to convert the comments:

  • perhaps many comments are easy to convert, with the conversion process getting progressively harder, e.g., exponentially so with those doing the conversion have roughly the same conversion skill level,
  • alternatively the skill required to convert the comments is roughly the same, but the number of people, of those doing the work, with a given skill level is an exponential.

I find it hard to believe any of these mechanisms. Suggestions for easier to believe mechanisms welcome.

Tags:

The dance of the dead cat

April 17th, 2016 2 comments

The flagging of unspecified behavior in C source by static analysis tools (i.e., where the standard specifies two or more possible behaviors for translators to choose from when generating code) is starting to rather get sophisticated, some tools are working out and reporting how many different program outputs are possible given the unspecified behavior in the source (traditionally tools simply point at code containing an instance of unspecified behavior).

A while ago I created a function where the number of possible different values returned, driven by one unspecified behavior in the code, was a Fibonacci number (the value of the function parameter selected the n’th Fibonacci number).

Weird ways of generating Fibonacci numbers are great for entertaining fellow developers, but developers are unlikely to be amused by a tool that flags their code as generating Fibonacci possible values (rather than the one they are expecting).

The possibility of Fibonacci different values relies on the generated code cycling through all possible behaviors, allowed by the standard, at runtime. This can only happen if program execution uses a Just-in-Time compiler; the compile once before execution implementations used by most developers get to generate far fewer possible permitted values (i.e, two for this particular example). Who would be crazy enough to write a JIT compiler for C you ask? Those crazy people who translate C to JavaScript for one.

To be useful for developers who are not using a JIT compiler to execute their C code, static analysis tools that work out the number of possible values that could be produced are going to have to support a code-compiled-before-execution option. Tools that don’t support this option will have what they report dismissed as weird possibilities that cannot happen in the developer’s world.

In the code below, does setting the code-compiled-before-execution option guarantee that the function Schrödinger always returns 1?

int x;
 
int set_x(void)
{
x=1;
return 1;
}
 
int two_values(void)
{
x=0;
return x + set_x(); // different evaluation order -> different value
}
 
bool Schrödinger(void)
{
return two_values() == two_values();
}

A compiler is free to generate code that returns a value that depends on whether two_values appears as the left or right operand of a binary operator. Not the kind of behavior an implementor would choose on purpose, but inlining both calls could result in different evaluation orders for the two instances of the code contained in what was the function body.

Our developer friendly sophisticated static analysis tool vendor is going to have to support a different_calls_to_same_function_have_same_behavior option.

How can a developer find out whether the compiler they were using always generated code having the same external behavior for different instances of calls, in the source, to the same function? Possibilities include reading the compiler source and asking the compiler vendor (perhaps future releases of gcc and llvm will support a different_calls_to_same_function_have_same_behavior option).

As far as I know (not having spent a lot of time looking) none of the tools doing the sophisticated unspecified behavior stuff support either of the developer friendly options I am suggesting need to be supported. Tools I know about include: c-semantics (including its go faster commercial incarnation) and ch2o (which the authors tell me executes the Fibonacci code a lot faster than c-semantics; last time I looked ch2o needed more installation time for the support tools it uses than I was willing to spend, so I have not tried it); the Cerberus project has not released the source of their system yet (I’m assuming they will).

C Standard meeting, April 2016

April 15th, 2016 No comments

I was at the ISO C Standard’s meeting in London this week; it has been five years since I last attended a WG14 meeting, when it was last in London (my jet setting standard’s meeting days are long gone). Around 20 people attended, of which slightly more than half I knew from previous meetings. Given how unchanging the membership was for so long, this is a large change and its great to see so many new people being interested in C (including and open source vendor, RedHat). There is also a change of convener since my last meeting; David Keaton is a long standing member and as meeting chair he kept things motoring along.

The format of the each day, after the first morning, was to spend an hour at the start of each morning and afternoon working on Defect Reports, break and then work through documents in the pre-meeting mailing.

The topic of note on Monday afternoon was a proposal to add support for the type short float in C2X. There is a lot of hardware support for 16 bit floating-point operations (e.g., SSE instructions) and C is behind the curve on this. There was consensus to move forward on this proposal.

Tuesday was taken up by discussing proposals under the general heading of clarifying the C memory object model; various papers by a formal methods group at Cambridge University that I have written about before. I had misunderstood the intent behind the papers; the Prof running the project wanted to fix the programming world by changing the C Standard (I thought he just wanted clarification of what the standard said). While fixing the programming world is a commendable goal, messy reality and very strong interests for not changing existing behavior are likely to maintain the status quo. Talking to the post grad working on the project, they seem to be doing all the right things, so we could be seeing some very interesting results (a major threat to success is the sheer volume of material that has to be covered).

Wednesday covered the charter for revising C, various proposals for new features in C2X (mostly lots of thread based stuff), conversion of the document to LaTeX (currently in nroff/groff; there was no sentiment to follow C++ and put the draft on a public Github repo). When C89 became an ANSI standard, before C90 became an ISO standard, Rex Jaeschke handed out a floppy of the C89 nroff sources to those attending one of the meetings (I forget which). Unless you happen to have an AT&T 3b2 and know which options to give nroff, you are very unlikely to be able to generate something that looks like C89.

Thursday covered another C2X proposal, closures using syntax and semantics supported by C on Apple (Borland got there first by supporting the __closure qualifier on pointers). In the afternoon we had a presentation of the latest C binding to the guidance on avoiding vulnerabilities in programming languages work going on in WG23. WG23 wanted WG14 to endorse this document and take ownership of it; lots of push back on this and all they got was a request to WG14 members to send any suggested improvements to WG23.

The next WG14 meeting is during October in Pittsburgh and I have no idea when the next meeting will be held in the UK (unlikely to be within three years).

Tags: ,

Tools that help handle floating-point dragons

April 7th, 2016 No comments

There be dragons is a common refrain in any discussion involving code containing floating-point. The dragons are not likely to disappear anytime soon, but there has been a lot of progress since my 2011 post and practical tools for handling them are starting to become available to developers who are not numerical analysts.

All the techniques contain an element of brute force, very many possibilities are examined (cloud computing is starting to have a big impact on how problems are attacked). All the cloud computing on the planet would not make a noticeable dent in any problem unless some clever stuff was done to drastically prune the search space.

My current favorite tool is Herbie, if only because of the blog posts describing some of the techniques used (it’s currently limited to code without loops; if you need loop support check out Rosa).

It’s all very well having the performance of your floating-point code optimized, but who is to say nasty problems are not lurking in unexplored ranges of the underlying formula. Without an Oracle capable of generating the correct answer (whatever that might be; Precimonious has to be provided with training inputs), the analysis can only flag what is considered to be suspicious behavior. Craft attempts to detect cancellation errors, S3FP searches for input values that produce results containing large relative error and Rangelab simply provides bounds on the output values calculated from whatever input it is fed.

Being interested in getting very accurate results is a niche market. Surprisingly inaccurate results are good enough for many people and perhaps we should be using a language designed for this market.

Perhaps the problem of efficiently and accurately printing floating-point numbers might finally have just been solved.

Ada updated to support undefined behavior

April 1st, 2016 No comments

The Ada language has now almost completely disappeared from developer consciousness and the Ada Standards’ committee has decided this desperate situation calls for desperate actions.

Undefined behavior in programming languages has been getting a lot of publicity over the last few years. It might not be good publicity, but the Ada committee has taken to heart the old adage ‘The only thing worse than being talked about is not being talked about.’

While the Ada standard does not use the phrase undefined behavior, it does contain a very close equivalent. The Ada community has bitten the bullet and decreed that constructs previously denoted by the term bounded error shall henceforth be referred to as undefined behavior. Technically, constructs generating a bounded error have effectively always been undefined behavior, but the original positioning of Ada as a sophisticated language suitable for critical applications required something less in-your-face unsafe sounding.

Will this minor change of wording (ISO rules are very strict on what changes can be made to published standards without going through long winded voting procedures) make any difference? I guess it will enable Ada users to jump on the ‘undefined behavior’ bandwagon (their claims that Ada was better because it did not contain undefined behavior always has the effect of annoying people, rather than generating any interest in changing languages).

I think there is a bigger public perception problem than the terminology used to denote undefined behavior. Ada Lovelace was born 200 years ago and gets lots of publicity as the first programmer; somehow this association with Lovelace has transferred to the language named after her, generating a perception of an old, fuddy-duddy language (being strongly typed has not helped, but this feature does appear to be coming back into fashion).

The following are some numbers from a while ago (both standards have been revised since these measurements were made).

In the Ada 2005 Standard there are:

36 occurrences of bounded error.
53 occurrences of the subsection header Erroneous execution and 116 occurrences of erroneous.
343 occurrences of implementation-defined.
373 occurrences of may, some of which describe optional behavior.
22 occurrences of must some of which that read as-if shall was intended.
38 occurrences of optional.
zero occurrences of processor dependent and processor-dependent.
1,018 occurrences of shall of which 131 have the form shall not.
452 occurrences of should.
8 occurrences of undefined, one referencing to an undefined range, three having the form undefined
range and the rest occurring in annexes (also see bounded error).
• 89 occurrences of unspecified.

In the C99 Standard there are:

zero occurrences of bounded error.
3 occurrences of erroneous.
163 occurrences of implementation-defined.
862 occurrences of may.
1 occurrence of must.
34 occurrences of optional.
zero occurrences of processor dependent and processor-dependent.
596 occurrences of shall of which 71 have the form shall not.
63 occurrences of should.
183 occurrences of undefined.
98 occurrences of unspecified.

In the C++ 2003 Standard (which now contains many more pages) there are:

zero occurrences of bounded error.
5 occurrences of erroneous.
236 occurrences of implementation-defined.
371 occurrences of may.
111 occurrences of must.
30 occurrences of optional.
zero occurrences of processor dependent and processor-dependent.
779 occurrences of shall of which 211 have the form shall not.
38 occurrences of should.
195 occurrences of undefined.
113 occurrences of unspecified.

Habits are the peripheral vision of the mind

March 24th, 2016 3 comments

Achieving a basic proficiency in a new skill requires an investment of conscious cognitive effort, i.e., thinking a lot. Students are constantly in the process of achieving basic proficiency in new skills and conclude that thinking is required for all intellectual activities (an incorrect assumption also held by many teachers).

To get past the conscious thinking stage lots of time has to be spent performing the skill. Repetition provides the opportunity for performance via conscious thought to migrate to subconscious performance (driving being a common example).

Real-time performance requires fluency, that is, being able to handle technical details without having to think about them. Thinking (i.e., conscious thought) is slow and requires lots of effort. It is best held in reserve for the important stuff.

To paraphrase Alfred Whitehead: “Software development advances by extending the number of important operations which we can perform without thinking about them.”

Somebody who has spent 100 hours or so (an hour or two a week for a year) learning to code has the same level of fluency as I have in communicating in a foreign language using a phrase book, or Google translate.

After a 1,000 hours of programming a person should be a very fluent coder.

It is said that becoming an expert requires 10,000 hours of practice. The kind of practice involved is deliberate practice, not unconscious use of what is already known. Becoming an expert requires learning lots of new things, not constantly applying what is already known. Old habits have to be broken and new ones acquired.

Programming is not Zen, although it contains elements that are. Why would a developer want to create a program without conscious thought (that is what scripts are for)?

I used to run ‘advanced’ programming courses for professional developers with 2+ years in industry. In many ways the material was a rerun of what they had learned at the start of their programming career. The difference was that this time around they could ignore the mechanics of writing code, now an ingrained habit, and concentrate on the higher level stuff. The course had to have advanced in its title because experienced developers would never sign up for an introductory course. Most of my one-on-one tutoring effort went on talking people out of bad habits they had picked up over time.

Perhaps live coding can be done with a Zen mind, probably why I don’t regard it as real programming (which I think requires some conscious thought).

Talking about details and high level material in the same breath is what beginners do because they have not yet learned to tell the two apart and be able to ignore one of them.

Like life, programs are mostly built from sequences of commonly occurring patterns. Our minds have evolved to subconsciously detect and take advantage of patterns. Programmers don’t know what the common source code patterns are any more than a native speaker can specify the syntax rules of the language they speak.

cpu+FPGA: applications can soon have bespoke instructions

March 21st, 2016 2 comments

Compiler writers are always frustrated that the cpu they are currently targeting does not contain the one instruction that would enable them to generate really efficient code. If only it were possible to add new instructions to the cpu. Well, it looks like this will soon be possible; Intel have added an on chip FPGA to their Broadwell processor (available circa 2017).

Having custom instructions on a FPGA (they would be loaded at program startup) is not the same as having the instructions on the cpu itself, there will be communication overhead when the data operated on by the custom instruction get transferred back and forth between cpu/FPGA (being on-chip means this will be low). To make the exercise worthwhile the custom instruction has to do something that takes very many cycles on the cpu and either speeds it up or reduces the power consumed (the Catapult project at Microsoft has a rack of FPGA enhanced machines speeding up/reducing the power of matching search engine queries to documents).

A CPU+FPGA is like CPU+GPU, except that FPGAs are programmed at a much lower level, i.e., there is little in the way of abstraction between what the hardware does and what the coder sees.

Does the world need a FPGA attached to their cpu? Most don’t but there are probably a few customers who do, e.g., data centers with systems performing dedicated tasks and anybody into serious bit twiddling. Other considerations include Intel needing to add new bells and whistles to its product so that customers who have been trained over the years to buy the very latest product (which has the largest margins) stay on the buying treadmill. The FPGA is also a differentiator, not that Intel would ever think of AMD as a serious competitor.

Initially the obvious use case is libraries performing commonly occurring functionality. No, not matrix multiple and inverse, FPGA are predominantly integer operation units (there are approaches using non-standard floating-point formats that can be used if your FPGA unit does not have floating-point support).

From the compiler perspective the use case is spotting cpu intensive loops, where all the data can be held on the FPGA until processing is complete. Will there be enough of these loops to make it a worthwhile implementation target? I suspect not. But then I can see many PhDs being written on this topic and one of them could produce a viable implementation that bootstraps itself into one of the popular open source compilers.

Interpreters have to do a lot of housekeeping work. Perhaps programs written in Java or R could be executed on the FPGA that uses the cpu as a slave processor. It is claimed that most R programs spend their time in library functions that have been implemented in C and Fortran, but I’m seeing more and more code that appears to be all R. For some programs an R-machine implemented in hardware could produce orders of magnitude speed improvements.

The next generation of cryptocurrency proof-of-work algorithms are being designed to be memory intensive, so they cannot be efficiently implemented using ASIC-proof (this prevents mining being concentrated in a few groups who have built bespoke mining operations). The analysis I have seen is based on ‘conventional’ cpu and ASIC designs. A cpu+FPGA is a very different kind of beast and one that might require another round of cryptocurrency design.

These cpu+FPGA processors have the potential to dramatically upend existing approaches to structuring programs. Very interesting times ahead!

Tags: , , ,

A survey of opinions on the behavior of various C constructs

March 10th, 2016 No comments

The Cerberus project, researching C semantics, has written up the results of their survey of ‘expert’ C users (short version and long detailed version). I took part in the original survey and at times found myself having to second guess what the questioner was asking; the people involved were/are still learning how C works. Anyway, many of the replies provide interesting insights into current developer interpretation of the behavior of various C constructs (while many of the respondents were compiler writers, it looks like some of them were not C compiler writers).

Some of those working on the Cerberus project are proposing changes to the C standard based on issues they encountered while writing a formal specification for parts of C and are bolstering their argument, in part, using the results of their survey. In many ways the content of the C Standard was derived from a survey of those attending WG14 meetings (or rather x3j11 meetings back in the day).

I think there is zero probability that any of these proposed changes will make it into a revised C standard; none of the reasons are technical and include:

  • If it isn’t broken, don’t fix it. Lots of people have successfully implemented compilers based on the text of the standard, which is the purpose of the document. Where is the cost/benefit of changing the wording to enable a formal specification using one particular mathematical notation?
  • WG14 receives lots of requests for changes to the C Standard and has an implicit filtering process. If the person making the request thinks the change is important, they will:
    • put the effort into wording the proposal in the stylized form used for language change proposals (i.e., not intersperse changes in a long document discussing another matter),
    • be regular attendees of WG14 meetings, working with committee members on committee business and helping to navigate their proposals through the process (turning up to part of a meeting will see your proposal disappear as soon as you leave the building; the next WG14 meeting is in London during April).

It could be argued that having to attend many meetings around the world favors those working for large companies. In practice only a few large companies see any benefit in sending an employee to a standard’s meeting for a week to work on something that may be of long term benefit them (sometimes a hardware company who wants to make sure that C can be compiled efficiently to their processors).

The standard’s creation process is about stability (don’t break existing code; many years ago a company voted against a revision to the Cobol standard because they had lost the source code to one of their products and could not check whether the proposed updates would break this code) and broad appeal (not narrow interests).

Update: Herb Sutter’s C++ trip report gives an interesting overview of the process adopted by WG21.

Tags: , ,

p-values for programmers

February 24th, 2016 No comments

Data analysis always contains a degree of uncertainty, in statistics this uncertainty is often expressed through the concept of p-value and possible interpretations of the data tested using the ideas behind null hypothesis testing. What is the best way of explaining the concept of p-value and null hypothesis testing to software developers?

For the empirical software engineering workshops I have been running, p-values get both a broad brush description and a use case in the form of null hypothesis testing (expressed as code). The broad brush approach to p-values worked and the null hypothesis as-code approach flopped miserably.

The broad brush approach defines p-values as a measure of uncertainty and leaves it at that, no mention of what exactly is uncertain. I quickly move on to discuss the more important point, from the real-world point of view, of selecting a p-value. Choice of p-value is one side of a cost/benefit business decision; would you sit in a lecture theater that had a 1 in 20 chance of collapsing? It all depends on what benefits accrued from sitting there.

Developers understand code and expressing the operations around the use of null hypothesis testing, as code, will obviously make everything clear. Perhaps my attempt to fit everything on one slide concentrated things a bit too much, or perhaps being firehosed with stuff to learn had left people yearning for the lunch that was fast approaching. The following function, accompanied by me babbling away was met with many blank stares.

The following function is my stab at explaining the operation of the null hypothesis and the role played by p-values. I hope that things are a bit clearer for readers who have not had me teaching them R in 15 minutes, followed by having a general what you need to know about probability and statistics in 30 slides.

void null_hypothesis_test(void *result_data, float p_value)
{
// H set by reality, its value is only accessible by running the appropriate experiments
if ((probability_of_seeing_data_when_H_true(result_data) < p_value) ||
     !H)
   printf("Willing to assume that H is false\n");
else
   printf("H might be true\n");
}
 
null_hypothesis_test(run_experiment(), 0.05);

Workshop on survival and time series analysis in empirical SE

February 12th, 2016 No comments

In January the material in my book on Empirical software engineering using R had its first exposure to professional software developers at a one day workshop (there was a rerun last week; slides here). The sessions were both fully booked, but as often happens on half turned up, around 15 at each workshop. A couple of people turned up expecting to be taught R and found themselves in a software engineering workshop that assumed the attendees could learn R in 10 minutes (because professional developers are experienced enough to learn the basics of programming in any language in that amount of time).

The main feedback was that people wanted to see more code, something to give them a starting point for the hands-on sessions (I hate seeing reams and reams of code on slides and had gone too far in the minimalist direction).

The approach of minimizing what developers have to learn/remember, even if it means using more computer resources (e.g., always using glm), went down very well.

The probability & statistics material, in the session before lunch, had a mixed reception; this is partly because I have not come up with a clear message and I’m trying to get over several disparate ideas, i.e., sampling issues, life is complicated and p-values. People were a bit unhappy about the life is complicated, data is messy and you have to think about what you are doing message.

I think the regression model building material was a hit. People saw the potential in fitting curves to data and found it easy to do once the parameters to plot, lines, glm and predict were sorted out. There was lots of interesting discussion around interpreting the fitted models, with me continually hamming on the point that p-value selection was a business risk factor decision and what constituted a good model depended on what it was going to be used for.

Those attending wanted to learn more, which is great since the main aim was to show people what useful things could be done to motivate them to go off and learn more (relatively little may be known about empirical software engineering, but there is more than can be fitted into a one day workshop)

There is a part-2 workshop in March and the plan is to cover survival analysis, time series analysis and if there is time something else. It will be assumed that people have the skill level of those who attended the first workshop, e.g., can write basic R and fit a simple regression model.

Thanks to The Rise for sponsoring the venue for all three workshops.

Tags: , ,