I have been doing some research on the variety of ways that different developers write code to implement the same specification and have been lucky enough to obtain the source code of approximately 6,000 implementations of a problem based on the 3n+1 algorithm. At some point this algorithm requires multiplying a value by three and adding one, e.g.,
While I expected some variation in the coding of many parts of the algorithm I did not expect to see much variation in the
n=n*3+1;. I was in for a surprise, the following are some of the different implementations I have seen so far:
n = n + n + n + 1 ;
n += n + n + 1;
n = (n << 1) + n + 1; n += (n << 1) + 1; n *= 3; n++; t = (n << 1) ; n = t + n + 1; n = (n << 2) - n + 1; I was already manually annotating the source and it was easy for me to locate the line implementing
I mentioned this search problem over drinks after a talk I gave at the Oxford branch of the ACCU last week and somebody (Huw ???) suggested that perhaps the code generated by gcc would be the same no matter how 3n+1 was implemented. I could see lots of reasons why this would not be the case, but the idea was interesting and worth investigation.
At the default optimization level the generated x86 code is different for different implemenetations, but optimizing at the “-O 3″ level results in all but one of the above expressions generating the same evaluation code:
leal 1(%rax,%rax,2), %eax
The exception is
(n << 2) - n + 1 which results in shift/subtract/add. Perhaps I should report this as a bug in gcc :-)
I was surprised that gcc exhibited this characteristic and I plan to carry out more tests to trace out the envelope of this apparent "same generated code for equivalent expressions" behavior of gcc.