June 30, 2009 Derek Jones No comments

I have been doing some research on the variety of ways that different developers write code to implement the same specification and have been lucky enough to obtain the source code of approximately 6,000 implementations of a problem based on the 3n+1 algorithm. At some point this algorithm requires multiplying a value by three and adding one, e.g., n=3*n+1;.

While I expected some variation in the coding of many parts of the algorithm I did not expect to see much variation in the 3n+1 part, perhaps somebody might write n=n*3+1;. I was in for a surprise, the following are some of the different implementations I have seen so far:

n = n + n + n + 1 ;
n += n + n + 1;
n = (n << 1) + n + 1; n += (n << 1) + 1; n *= 3; n++; t = (n << 1) ; n = t + n + 1; n = (n << 2) - n + 1; I was already manually annotating the source and it was easy for me to locate the line implementing 3n+1 to annotate it. But what if I wanted to automate the search for the line of code containing this calculation, what tool could I use? Would I have to write down every possible ways in which 3n+1 could be implemented, with/without parenthesis and all possible orderings of operands? I am not aware of any automatic tool that could be told to locate expressions that calculated 3n+1. What is needed is abstract interpretation over short sequences of statements.

I mentioned this search problem over drinks after a talk I gave at the Oxford branch of the ACCU last week and somebody (Huw ???) suggested that perhaps the code generated by gcc would be the same no matter how 3n+1 was implemented. I could see lots of reasons why this would not be the case, but the idea was interesting and worth investigation.

At the default optimization level the generated x86 code is different for different implemenetations, but optimizing at the “-O 3” level results in all but one of the above expressions generating the same evaluation code:

   leal 1(%rax,%rax,2), %eax

The exception is (n << 2) - n + 1 which results in shift/subtract/add. Perhaps I should report this as a bug in gcc :-)

I was surprised that gcc exhibited this characteristic and I plan to carry out more tests to trace out the envelope of this apparent "same generated code for equivalent expressions" behavior of gcc.

Categories: Uncategorized Tags: 3n+1, gcc, optimize, variation

The Shape of Code

Archive

Searching for the source line implementing 3n+1

Recent Posts

Recent Comments

Archives

Meta