<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Shape of Code &#187; 3n+1</title>
	<atom:link href="http://shape-of-code.coding-guidelines.com/tag/3n1/feed/" rel="self" type="application/rss+xml" />
	<link>http://shape-of-code.coding-guidelines.com</link>
	<description></description>
	<lastBuildDate>Sun, 29 Jan 2012 23:49:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Compiler benchmarking for the 21st century</title>
		<link>http://shape-of-code.coding-guidelines.com/2011/07/24/compiler-benchmarking-for-the-21st-century/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2011/07/24/compiler-benchmarking-for-the-21st-century/#comments</comments>
		<pubDate>Sun, 24 Jul 2011 22:07:27 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[3n+1]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[code generation]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[gcc]]></category>
		<category><![CDATA[llvm]]></category>
		<category><![CDATA[optimizing]]></category>
		<category><![CDATA[Pi]]></category>
		<category><![CDATA[quality]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=498</guid>
		<description><![CDATA[I would like to propose a new way of measuring the quality of a compiler&#8217;s code generator: The highest quality compiler is one that generates identical code for all programs that produce the same output, e.g., a compiler might spot programs that calculate pi and always generate code that uses the most rapidly converging method [...]]]></description>
			<content:encoded><![CDATA[<p>I would like to propose a new way of measuring the quality of a compiler&#8217;s code generator: The highest quality compiler is one that generates identical code for all programs that produce the same output, e.g., a compiler might spot programs that <a href="http://mathforum.org/library/drmath/view/65244.html">calculate pi</a> and always generate code that uses the <a href="http://www.johndcook.com/blog/2011/03/14/algorithm-record-pi-calculation/">most rapidly converging method known</a>.  This is a very different approach to the traditional methods based on using (mostly) execution time or size (usually code but sometimes data) as a measure of quality.</p>
<p>Why is a new measurement method needed and why choose this one?  It is relatively easy for compiler vendors to tune their products to the <a href="http://en.wikipedia.org/wiki/Standard_Performance_Evaluation_Corporation">commonly used benchmarks</a>, which seem to have lost their role as drivers for new optimization techniques.  Different developers have different writing habits and companies should not have to waste time and money changing developer habits just to get the best quality code out of a compiler; compilers should handle differences in developer coding habits and not let them affect the quality of generated code.  There are major savings to be had by optimizing the effect that developers are trying to achieve rather than what they have actually written (these days new optimizations targeted at what developers have written show very low percentage improvements).</p>
<p>Deducing that a function calculates pi requires a level of sophistication in whole program analysis that is unlikely to be available in production compilers for some years to come (OK, detecting <code>4*atan(1.0)</code> is possible today).  What is needed is a collection of compilable files containing source code that aims to achieve an outcome in lots of different ways.  To get the ball rolling the &#8220;3n times 2&#8221; problem is presented as the first of this new breed of benchmarks.</p>
<p>The &#8220;3n times 2&#8221; problem is a variant on the <a href="http://shape-of-code.coding-guidelines.com/2009/06/30/searching-for-the-source-line-implementing-3n1/">3n+1 problem</a> that has been tweaked to create more optimization opportunities.  One implementation of the &#8220;3n times 2&#8221; problem is:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>is_odd<span style="color: #009900;">&#40;</span>n<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
   n <span style="color: #339933;">=</span> <span style="color: #0000dd;">3</span><span style="color: #339933;">*</span>n<span style="color: #339933;">+</span><span style="color: #0000dd;">1</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">else</span>
   n <span style="color: #339933;">=</span> <span style="color: #0000dd;">2</span><span style="color: #339933;">*</span>n<span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// this is n = n / 2; in the 3n+1 problem</span></pre></div></div>

<p>There are lots of ways of writing code that has the same effect.  Some of the statements I have seen for calculating <code>n=3*n+1</code> include:  <code>n = n + n + n + 1</code>, <code>n = (n << 1) + n + 1</code> and <code>n *= 3; n++</code>, while some of the ways of checking if <code>n</code> is odd include: <code>n &#038; 1</code>, <code>(n / 2)*2 != n</code> and <code>n % 2</code>.</p>
<p>I have created a list of different ways in which <code>3*n+1</code> might be calculated and <code>is_odd(n)</code> might be tested and written a script to generate a function containing all possible permutations (to reduce the number of combinations no variants were created for the least interesting case of <code>n=2*n</code>, which was always generated in this form).  The following is a snippet of the <a href="http://www.coding-guidelines.com/images/3n_times2.c" type="text/plain">generated code</a> (<a href="http://www.coding-guidelines.com/images/3n_times2.tgz">download everything</a>):</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>n <span style="color: #339933;">&amp;</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span> n<span style="color: #339933;">=</span><span style="color: #009900;">&#40;</span>n <span style="color: #339933;">&lt;&lt;</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> n <span style="color: #339933;">+</span><span style="color: #0000dd;">1</span><span style="color: #339933;">;</span> <span style="color: #b1b100;">else</span> n<span style="color: #339933;">*=</span><span style="color: #0000dd;">2</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>n <span style="color: #339933;">&amp;</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span> n<span style="color: #339933;">=</span><span style="color: #0000dd;">3</span><span style="color: #339933;">*</span>n<span style="color: #339933;">+</span><span style="color: #0000dd;">1</span><span style="color: #339933;">;</span> <span style="color: #b1b100;">else</span> n<span style="color: #339933;">*=</span><span style="color: #0000dd;">2</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>n <span style="color: #339933;">&amp;</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span> n <span style="color: #339933;">+=</span> <span style="color: #0000dd;">2</span><span style="color: #339933;">*</span>n <span style="color: #339933;">+</span><span style="color: #0000dd;">1</span><span style="color: #339933;">;</span> <span style="color: #b1b100;">else</span> n<span style="color: #339933;">*=</span><span style="color: #0000dd;">2</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>n <span style="color: #339933;">/</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">*</span><span style="color: #0000dd;">2</span> <span style="color: #339933;">!=</span> n<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> t<span style="color: #339933;">=</span><span style="color: #009900;">&#40;</span>n <span style="color: #339933;">&lt;&lt;</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> n<span style="color: #339933;">=</span>t<span style="color: #339933;">+</span>n<span style="color: #339933;">+</span><span style="color: #0000dd;">1</span><span style="color: #339933;">;</span> <span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> n<span style="color: #339933;">*=</span><span style="color: #0000dd;">2</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>n <span style="color: #339933;">/</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">*</span><span style="color: #0000dd;">2</span> <span style="color: #339933;">!=</span> n<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> n<span style="color: #339933;">*=</span><span style="color: #0000dd;">3</span><span style="color: #339933;">;</span> n<span style="color: #339933;">++;</span> <span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> n<span style="color: #339933;">*=</span><span style="color: #0000dd;">2</span><span style="color: #339933;">;</span></pre></div></div>
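
<p>The generating script itself is only in the download, so the following is a rough sketch of the idea and not the script that was used.  The two lists are assumptions put together from the forms mentioned in these posts, and <code>emit_statements</code> is a hypothetical name.  It pairs every odd test with every way of writing 3n+1 and always writes the else arm as <code>n*=2</code>:</p>

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical lists, assembled from forms mentioned in the text;
   the lists used by the original script may differ. */
static const char *odd_tests[] = {
    "n & 1", "n % 2", "(n / 2)*2 != n"
};
static const char *forms_3n1[] = {
    "n=3*n+1;", "n=n*3+1;", "n=n+n+n+1;", "n += n+n+1;",
    "n=(n << 1)+n+1;", "n += (n << 1)+1;", "n*=3; n++;",
    "t=(n << 1); n=t+n+1;", "n=(n << 2)-n+1;"
};

/* Write one if/else statement per (test, form) pair to out;
   return the number of statements written. */
int emit_statements(FILE *out)
{
    size_t n_tests = sizeof odd_tests / sizeof odd_tests[0];
    size_t n_forms = sizeof forms_3n1 / sizeof forms_3n1[0];
    int count = 0;

    assert(out != NULL);
    for (size_t i = 0; i < n_tests; i++)
        for (size_t j = 0; j < n_forms; j++) {
            /* The else arm is never varied. */
            fprintf(out, "if (%s) { %s } else n*=2;\n",
                    odd_tests[i], forms_3n1[j]);
            count++;
        }
    return count;
}
```

<p>Calling <code>emit_statements(stdout)</code> prints one statement per line.  With these particular lists the count is 3 times 9; whether they match the lists in the download is not checked here.</p>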

<p>Benchmarks need a means of summarizing the results and here I make a stab at doing that for gcc 4.6.1 and llvm 2.9, when run with the <em>-O3</em> option (output <a href="http://www.coding-guidelines.com/images/3n-gcc-4.6.1.s" type="text/plain">here</a> and <a href="http://www.coding-guidelines.com/images/3n-llvm-2.9.s" type="text/plain">here</a>).  Both compilers generated a total of four different sequences for the 27 'different' statements (I'm not sure what to do about the <code>inline</code> function tests and have ignored them here) with none of the sequences being shared between compilers.  The following lists the number of occurrences of each sequence, e.g., gcc generated one sequence 16 times, another 8 times and so on:</p>
<pre>
gcc    16   8   2   1
llvm   12   6   6   3
</pre>
<p>How might we turn these counts into a single number that enables compiler performance to be compared?  One possibility is to award 1 point for each occurrence of the most common sequence, 1/2 point for each occurrence of the second most common, 1/4 for the third and so on.  Using this scheme gcc gets 20.625 and llvm gets 16.875.  So gcc has greater consistency (I am loath to use the much overused phrase higher quality).</p>
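
<p>The scoring scheme is simple enough to write down directly.  A minimal sketch, where <code>consistency_score</code> is a hypothetical name and the counts are assumed to be sorted from most to least common:</p>

```c
#include <assert.h>
#include <stddef.h>

/* Weighted sum of occurrence counts: weight 1 for the most common
   sequence, 1/2 for the next, 1/4 for the one after, and so on.
   counts must be sorted from most to least common. */
double consistency_score(const int *counts, size_t len)
{
    double score = 0.0;
    double weight = 1.0;

    for (size_t i = 0; i < len; i++) {
        assert(i == 0 || counts[i] <= counts[i - 1]);
        score += weight * counts[i];
        weight /= 2.0;
    }
    return score;
}
```

<p>For the gcc counts this is 16 + 8/2 + 2/4 + 1/8 = 20.625, and for llvm 12 + 6/2 + 6/4 + 3/8 = 16.875.</p>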
<p>Now for a closer look at the code generated.</p>
<p>Both compilers always generated code to test the least significant bit for the conditional expressions <code>n &#038; 1</code> and <code>n % 2</code>.  For the test <code>(n / 2)*2 != n</code> gcc generated the not very clever right-shift/left-shift/compare while llvm and'ed out the bottom bit and then compared; so both compilers failed to handle what is a surprisingly common check for a number being odd.</p>
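
<p>The three tests really are interchangeable as truth values, negative values included, which is why a single bit test is a valid translation of all of them.  A small check, using hypothetical function names and assuming a two's complement <code>int</code> (needed for <code>n &amp; 1</code> on negative values):</p>

```c
#include <assert.h>

/* The three ways of asking "is n odd?" mentioned in the text. */
int odd_mask(int n) { return (n & 1) != 0; }
int odd_rem(int n)  { return (n % 2) != 0; }      /* -3 % 2 is -1, still nonzero */
int odd_div(int n)  { return (n / 2) * 2 != n; }  /* division truncates toward zero */

/* Assert that all three tests agree for every n in [lo, hi].
   hi is assumed to be below INT_MAX so that n++ cannot overflow. */
void check_odd_tests(int lo, int hi)
{
    for (int n = lo; n <= hi; n++) {
        assert(odd_mask(n) == odd_rem(n));
        assert(odd_mask(n) == odd_div(n));
    }
}
```

<p>For example, <code>check_odd_tests(-1000, 1000)</code> runs without tripping an assertion.</p>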
<p>The optimal code for <code>n=3*n+1</code> on a modern x86 processor is (lots of register combinations are possible, let's assume <code>rdx</code> contains <code>n</code>) <a href="http://stackoverflow.com/questions/1658294/x86-asm-whats-the-purpose-of-the-lea-instruction">leal	1(%rdx,%rdx,2), %edx</a> and this is what both compilers generated a lot of the time.  This locally optimal code is not always generated because:</p>
<ul>
<li>gcc fails to detect that <code>(n << 2)-n+1</code> is equivalent to <code>(n << 1)+n+1</code> and generates the sequence <code>leal	0(,%rax,4), %edx ; subl %eax, %edx ; addl $1, %edx</code> (I pointed this out to a gcc maintainer some time ago and he suggested reporting it as a bug).  This 'bug' occurs three times in total.</li>
<li>For some forms of the calculation llvm generates globally better code by taking the else arm into consideration.  For instance, when the calculation is written as <code>n += (n << 1) +1</code> llvm deduces that <code>(n << 1)</code> and the <code>2*n</code> in the <code>else</code> are equivalent, evaluates this value into a register before performing the conditional test thus removing the need for an unconditional jump around the 'else' code:

<div class="wp_syntax"><div class="code"><pre class="asm" style="font-family:monospace;">  leal	<span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span>rax<span style="color: #339933;">,%</span>rax<span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">ecx</span>
  testb	$<span style="color: #0000ff;">1</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">al</span>
  <span style="color: #00007f; font-weight: bold;">je</span>	<span style="color: #339933;">.</span>LBB0_8
# BB#<span style="color: #0000ff;">7</span><span style="color: #339933;">:</span>
  orl	$<span style="color: #0000ff;">1</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">ecx</span>     # deduced <span style="color: #00007f;">ecx</span> is <span style="color: #000000; font-weight: bold;">even</span><span style="color: #339933;">,</span> arithmetic unit <span style="color: #00007f; font-weight: bold;">not</span> needed!
  addl	<span style="color: #339933;">%</span><span style="color: #00007f;">eax</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">ecx</span>
<span style="color: #339933;">.</span>LBB0_8<span style="color: #339933;">:</span></pre></div></div>

<p>This more efficient sequence occurs nine times in total.</p></li>
</ul>
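
<p>The <code>orl $1</code> in the llvm sequence works because doubling a value always leaves its bottom bit clear, so or-ing in a 1 gives the same result as adding 1.  A sketch that mirrors the three instructions in C; the function names are hypothetical and <code>unsigned</code> is used so that wraparound is well defined:</p>

```c
#include <assert.h>

/* The calculation as written in the source. */
unsigned as_written(unsigned n)
{
    n += (n << 1) + 1u;
    return n;
}

/* The same calculation, following the generated instructions. */
unsigned as_generated(unsigned n)
{
    unsigned twice = n + n;  /* leal (%rax,%rax), %ecx */
    twice |= 1u;             /* orl $1, %ecx : bit 0 of 2*n is always clear */
    return twice + n;        /* addl %eax, %ecx */
}

/* Assert that both versions agree for every n below limit. */
void check_or_trick(unsigned limit)
{
    for (unsigned n = 0; n < limit; n++)
        assert(as_written(n) == as_generated(n));
}
```

<p>For instance, with <code>n</code> equal to 5 both functions return 16.</p>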
<p>The best sequence was generated by gcc:</p>

<div class="wp_syntax"><div class="code"><pre class="asm" style="font-family:monospace;">	testb	$<span style="color: #0000ff;">1</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">dl</span>
	leal	<span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span>rdx<span style="color: #339933;">,%</span>rdx<span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">eax</span>
	<span style="color: #00007f; font-weight: bold;">je</span>	<span style="color: #339933;">.</span>L6
	leal	<span style="color: #0000ff;">1</span><span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span>rdx<span style="color: #339933;">,%</span>rdx<span style="color: #339933;">,</span><span style="color: #0000ff;">2</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">eax</span>
<span style="color: #339933;">.</span>L6<span style="color: #339933;">:</span></pre></div></div>

<p>with llvm and pre 4.6 versions of gcc generating the more traditional form (above, gcc 4.6.1 assumes that the 'then' arm is the most likely to be executed and trades off a <code>leal</code> against a very slow <code>jmp</code>):</p>

<div class="wp_syntax"><div class="code"><pre class="asm" style="font-family:monospace;">	testb	$<span style="color: #0000ff;">1</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">al</span>
	<span style="color: #00007f; font-weight: bold;">je</span>	<span style="color: #339933;">.</span>LBB0_5
# BB#<span style="color: #0000ff;">4</span><span style="color: #339933;">:</span>
	leal	<span style="color: #0000ff;">1</span><span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span>rax<span style="color: #339933;">,%</span>rax<span style="color: #339933;">,</span><span style="color: #0000ff;">2</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">eax</span>
	<span style="color: #00007f; font-weight: bold;">jmp</span>	<span style="color: #339933;">.</span>LBB0_6
<span style="color: #339933;">.</span>LBB0_5<span style="color: #339933;">:</span>
	addl	<span style="color: #339933;">%</span><span style="color: #00007f;">eax</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">eax</span>
<span style="color: #339933;">.</span>LBB0_6<span style="color: #339933;">:</span></pre></div></div>

<p>There is still room for improvement, perhaps by using the <a href="http://www.rcollins.org/p6/opcodes/CMOV.html">conditional move</a> instruction (which gcc actually generates within the not-very-clever code sequence for <code>(n / 2)*2 != n</code>) or by using the fact that <code>eax</code> already holds <code>2*n</code> (the potential saving would come through a reduction in complexity of the internal resources needed to execute the instruction).</p>
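
<p>One way to picture the second suggestion at the source level: compute <code>2*n</code> once, then add <code>n+1</code> only when <code>n</code> is odd.  This is an illustrative sketch with hypothetical names, not code produced by either compiler, and no claim is made about what instructions a compiler would generate for it:</p>

```c
#include <assert.h>

/* Reference version, as written in the benchmark
   (unsigned so that wraparound is well defined). */
unsigned step_branch(unsigned n)
{
    if (n & 1u)
        return 3u * n + 1u;
    return 2u * n;
}

/* Branch-free version: start from 2*n and add n+1 only when n is odd. */
unsigned step_branchless(unsigned n)
{
    unsigned twice = 2u * n;        /* needed on both arms */
    unsigned mask = 0u - (n & 1u);  /* all ones if n is odd, else zero */
    return twice + ((n + 1u) & mask);
}

/* Assert that both versions agree for every n below limit. */
void check_steps(unsigned limit)
{
    for (unsigned n = 0; n < limit; n++)
        assert(step_branch(n) == step_branchless(n));
}
```

<p>For odd <code>n</code> the result is 2n + (n+1) = 3n+1, for even <code>n</code> the mask is zero and the result is 2n.</p>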
<p>llvm insists on storing the calculated value back into <code>n</code> at the end of every statement.  I'm not sure if this is a bug or a feature designed to make runtime debugging easier (if so it ought to be switched off by default).</p>
<p>Missed optimization opportunities (not intended to be part of this benchmark and if encountered would require a restructuring of the test source) include noticing that if <img src="http://shape-of-code.coding-guidelines.com/wp-content/plugins/wpmathpub/phpmathpublisher/img/math_993.5_a45192846640be85b0edaca33c2a3d3b.png" style="vertical-align:-6.5px; display: inline-block ;" alt="n" title="n"/> is odd then <img src="http://shape-of-code.coding-guidelines.com/wp-content/plugins/wpmathpub/phpmathpublisher/img/math_993.5_bab0fea78028b0c90d5567e828204642.png" style="vertical-align:-6.5px; display: inline-block ;" alt="3n+1" title="3n+1"/> is always even, creating the opportunity to perform the following multiply by 2 without an if test.</p>
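
<p>To spell that out: if n is odd then 3n is odd, so 3n+1 is even and the next step must be the multiply by 2, giving 2(3n+1) = 6n+2 with no second test.  A sketch of the fused step, with hypothetical names and <code>unsigned</code> arithmetic (parity is preserved under wraparound modulo a power of two):</p>

```c
#include <assert.h>

/* One step of the "3n times 2" problem as written in the benchmark. */
unsigned step(unsigned n)
{
    return (n & 1u) ? 3u * n + 1u : 2u * n;
}

/* Two consecutive steps starting from an odd n, with the second
   if test removed: 2*(3n+1) = 6n+2. */
unsigned two_steps_from_odd(unsigned n)
{
    assert(n & 1u);  /* only valid when n is odd */
    return 6u * n + 2u;
}
```

<p>For example, starting from 3 the two steps give 10 and then 20, and 6*3+2 is also 20.</p>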
<p>Perhaps one day compilers will figure out when a program is calculating pi and generate code that uses the best known algorithm.  In the meantime I am interested in hearing suggestions for additional different-algorithm-same-code benchmarks.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2011/07/24/compiler-benchmarking-for-the-21st-century/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Searching for the source line implementing 3n+1</title>
		<link>http://shape-of-code.coding-guidelines.com/2009/06/30/searching-for-the-source-line-implementing-3n1/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2009/06/30/searching-for-the-source-line-implementing-3n1/#comments</comments>
		<pubDate>Tue, 30 Jun 2009 00:44:03 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[3n+1]]></category>
		<category><![CDATA[gcc]]></category>
		<category><![CDATA[optimize]]></category>
		<category><![CDATA[variation]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=104</guid>
		<description><![CDATA[I have been doing some research on the variety of ways that different developers write code to implement the same specification and have been lucky enough to obtain the source code of approximately 6,000 implementations of a problem based on the 3n+1 algorithm. At some point this algorithm requires multiplying a value by three and [...]]]></description>
			<content:encoded><![CDATA[<p>I have been doing some research on the variety of ways that different developers write code to implement the same specification and have been lucky enough to obtain the source code of approximately 6,000 implementations of a problem based on the <a href="http://en.wikipedia.org/wiki/Collatz_conjecture">3n+1 algorithm</a>.  At some point this algorithm requires multiplying a value by three and adding one, e.g., <code>n=3*n+1;</code>.</p>
<p>While I expected some variation in the coding of many parts of the algorithm I did not expect to see much variation in the <equ>3n+1</equ> part, although perhaps somebody might write <code>n=n*3+1;</code>.  I was in for a surprise; the following are some of the different implementations I have seen so far:</p>
<p>   n = n + n + n + 1 ;<br />
   n += n + n + 1;<br />
   n = (n << 1) + n + 1;<br />
   n += (n << 1) + 1;<br />
   n *= 3; n++;<br />
   t = (n << 1) ; n = t + n + 1;<br />
   n = (n << 2) - n + 1;</p>
<p>I was already manually annotating the source and it was easy for me to locate the line implementing <equ>3n+1</equ> to annotate it.  But what if I wanted to automate the search for the line of code containing this calculation, what tool could I use?  Would I have to write down every possible way in which <equ>3n+1</equ> could be implemented, with/without parentheses and all possible orderings of operands?  I am not aware of any automatic tool that could be told to locate expressions that calculated <equ>3n+1</equ>.  What is needed is abstract interpretation over short sequences of statements.</p>
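
<p>A much cruder stand-in for abstract interpretation is to run each candidate on sample values and compare the result against 3n+1.  The sketch below does this for the seven forms listed above, wrapped in hypothetical functions; it is testing, not proof, so a match on the samples only suggests equivalence.  <code>unsigned</code> is used so that shifts and wraparound are well defined:</p>

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned (*form_fn)(unsigned);

/* The seven forms from the list, one per function. */
unsigned f_add3(unsigned n)        { return n + n + n + 1; }
unsigned f_pluseq(unsigned n)      { n += n + n + 1; return n; }
unsigned f_shl1(unsigned n)        { return (n << 1) + n + 1; }
unsigned f_shl1_pluseq(unsigned n) { n += (n << 1) + 1; return n; }
unsigned f_mul_inc(unsigned n)     { n *= 3; n++; return n; }
unsigned f_temp(unsigned n)        { unsigned t = (n << 1); n = t + n + 1; return n; }
unsigned f_shl2(unsigned n)        { return (n << 2) - n + 1; }

/* Negative control: agrees with 3n+1 at n == 0 only. */
unsigned g_other(unsigned n)       { return 2 * n + 1; }

/* Return 1 if f(n) == 3*n+1 for every sample n below limit, else 0. */
int computes_3n_plus_1(form_fn f, unsigned limit)
{
    assert(f != NULL);
    for (unsigned n = 0; n < limit; n++)
        if (f(n) != 3 * n + 1)
            return 0;
    return 1;
}

/* Count how many of the seven listed forms pass the sample test. */
int count_matches(unsigned limit)
{
    form_fn forms[] = { f_add3, f_pluseq, f_shl1, f_shl1_pluseq,
                        f_mul_inc, f_temp, f_shl2 };
    int matches = 0;

    for (size_t i = 0; i < sizeof forms / sizeof forms[0]; i++)
        matches += computes_3n_plus_1(forms[i], limit);
    return matches;
}
```

<p>The negative control shows why a single sample is not enough: <code>2*n+1</code> agrees with 3n+1 when <code>n</code> is 0 and is only rejected once <code>n</code> reaches 1.  This approach also says nothing about how to find the candidate statements in the source in the first place.</p>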
<p>I mentioned this search problem over drinks after a talk I gave at the <a href="http://www.lunch.org.uk/wiki/accuoxford">Oxford branch</a> of the <a href="http://www.accu.org">ACCU</a> last week and somebody (Huw ???) suggested that perhaps the code generated by gcc would be the same no matter how 3n+1 was implemented.  I could see lots of reasons why this would not be the case, but the idea was interesting and worth investigation.</p>
<p>At the default optimization level the generated x86 code is different for different implementations, but optimizing at the &#8220;-O3&#8221; level results in all but one of the above expressions generating the same evaluation code:</p>

<div class="wp_syntax"><div class="code"><pre class="asm" style="font-family:monospace;">   leal <span style="color: #0000ff;">1</span><span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span>rax<span style="color: #339933;">,%</span>rax<span style="color: #339933;">,</span><span style="color: #0000ff;">2</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #00007f;">eax</span></pre></div></div>

<p>The exception is <code>(n << 2) - n + 1</code> which results in shift/subtract/add.  Perhaps I should report this as a bug in gcc <img src='http://shape-of-code.coding-guidelines.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I was surprised that gcc exhibited this characteristic and I plan to carry out more tests to trace out the envelope of this apparent "same generated code for equivalent expressions" behavior of gcc.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2009/06/30/searching-for-the-source-line-implementing-3n1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

