<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Shape of Code &#187; precedence</title>
	<atom:link href="http://shape-of-code.coding-guidelines.com/tag/precedence/feed/" rel="self" type="application/rss+xml" />
	<link>http://shape-of-code.coding-guidelines.com</link>
	<description></description>
	<lastBuildDate>Sun, 12 Feb 2012 20:42:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Relative spacing of operands affects perception of operator precedence</title>
		<link>http://shape-of-code.coding-guidelines.com/2012/01/22/relative-spacing-of-operands-affects-perception-of-operator-precedence/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2012/01/22/relative-spacing-of-operands-affects-perception-of-operator-precedence/#comments</comments>
		<pubDate>Sun, 22 Jan 2012 22:56:20 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[binary operator]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[precedence]]></category>
		<category><![CDATA[psychology]]></category>
		<category><![CDATA[R whitespace]]></category>
		<category><![CDATA[regular expression]]></category>
		<category><![CDATA[searching code]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=650</guid>
		<description><![CDATA[What I found most intriguing about Google Code Search (shut down Nov 2011) was how quickly searches involving regular expressions returned matches. A few days ago Russ Cox, the implementor of Code Search, not only explained how it worked but also released the source and some precompiled binaries. Google&#8217;s database of source code did not include [...]]]></description>
			<content:encoded><![CDATA[<p>What I found most intriguing about Google Code Search (<a href="http://googleblog.blogspot.com/2011/10/fall-sweep.html">shut down</a> Nov 2011) was how quickly searches involving regular expressions returned matches.  A few days ago <a href="http://swtch.com/~rsc/">Russ Cox</a>, the implementor of Code Search, not only <a href="http://swtch.com/~rsc/regexp/regexp4.html">explained how it worked</a> but also released the <a href="https://code.google.com/p/codesearch/">source and some precompiled binaries</a>.  Google&#8217;s database of source code did not include the source of R, so I decided to install CodeSearch on my local machine and run some of my previous searches against the latest (v2.14.1) R source.</p>
<p>In <a href="http://www.knosof.co.uk/dev-experiment/accu07.html">2007 I ran an experiment</a> that showed developers made use of variable names when making  binary operator precedence decisions.  At about the same time two cognitive psychologists, <a href="https://facultystaff.richmond.edu/~dlandy/">David Landy</a> and <a href="http://cognitrn.psych.indiana.edu/people.html">Robert Goldstone</a>, were investigating the <a href="https://facultystaff.richmond.edu/~dlandy/Publications/Landy%20Goldstone%202007E.pdf">impact of spacing on operator precedence</a> decisions (they found that readers showed a tendency to pair together the operands that were visibly closer to each other, e.g., <code>a</code> with <code>b</code> in <code>a+b * c</code> rather than <code>b</code> with <code>c</code>).</p>
<p>As somebody very interested in finding faults in code, the psychologists&#8217; research findings on spacing immediately suggested to me the possibility that &#8216;incorrectly&#8217; spaced expressions were a sign of failure to write code that had the intended behavior.  Feeding some rather complicated regular expressions into Google&#8217;s CodeSearch threw up a number of &#8216;incorrectly&#8217; spaced expressions.  However, this finding went no further than an interesting email exchange with Landy and Goldstone.</p>
<p>Time to find out whether there are any &#8216;incorrectly&#8217; spaced expressions in the R source.  <code>cindex</code> (the tool that builds the database used by <code>csearch</code>) took 3 seconds on a not very fast machine to process all of the R source (56M byte) and build the search database (10M byte; the Linux database is a factor of 5.5 smaller than the sources).</p>
<p>The search:</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;">csearch <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\w</span>(<span style="color: #000099; font-weight: bold;">\+</span>|<span style="color: #000099; font-weight: bold;">\-</span>)<span style="color: #000099; font-weight: bold;">\w</span> +(<span style="color: #000099; font-weight: bold;">\*</span>|<span style="color: #000099; font-weight: bold;">\/</span>) +<span style="color: #000099; font-weight: bold;">\w</span>&quot;</span></pre></div></div>

<p>returned a few interesting matches:</p>

<div class="wp_syntax"><div class="code"><pre class="fortran" style="font-family:monospace;">...
<span style="color: #202020;">modules</span><span style="color: #339933;">/</span>internet<span style="color: #339933;">/</span>nanohttp.<span style="color: #202020;">c</span><span style="color: #339933;">:</span>       used <span style="color: #339933;">+=</span> tv_save.<span style="color: #202020;">tv_sec</span> <span style="color: #339933;">+</span> 1e<span style="color: #339933;">-</span>6 <span style="color: #339933;">*</span> tv_save.<span style="color: #202020;">tv_usec</span>;
modules<span style="color: #339933;">/</span>lapack<span style="color: #339933;">/</span>dlapack0.<span style="color: #202020;">f</span><span style="color: #339933;">:</span>     $          <span style="color: #009900;">&#40;</span> T<span style="color: #339933;">*</span><span style="color: #009900;">&#40;</span> ONE<span style="color: #339933;">+</span><span style="color: #993333;">SQRT</span><span style="color: #009900;">&#40;</span> ONE<span style="color: #339933;">+</span>S <span style="color: #339933;">/</span> T <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span>
modules<span style="color: #339933;">/</span>lapack<span style="color: #339933;">/</span>dlapack2.<span style="color: #202020;">f</span><span style="color: #339933;">:</span>               S <span style="color: #339933;">=</span> Z<span style="color: #009900;">&#40;</span> <span style="color: #cc66cc;">3</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">*</span><span style="color: #009900;">&#40;</span> Z<span style="color: #009900;">&#40;</span> <span style="color: #cc66cc;">2</span> <span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> <span style="color: #009900;">&#40;</span> T<span style="color: #339933;">*</span><span style="color: #009900;">&#40;</span> ONE<span style="color: #339933;">+</span><span style="color: #993333;">SQRT</span><span style="color: #009900;">&#40;</span> ONE<span style="color: #339933;">+</span>S <span style="color: #339933;">/</span> T <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span>
modules<span style="color: #339933;">/</span>lapack<span style="color: #339933;">/</span>dlapack4.<span style="color: #202020;">f</span><span style="color: #339933;">:</span>     $          <span style="color: #009900;">&#40;</span> T<span style="color: #339933;">*</span><span style="color: #009900;">&#40;</span> ONE<span style="color: #339933;">+</span><span style="color: #993333;">SQRT</span><span style="color: #009900;">&#40;</span> ONE<span style="color: #339933;">+</span>S <span style="color: #339933;">/</span> T <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span></pre></div></div>

<p>There were around 15 matches of code like <code>1e-6 * var</code> (the pattern <code>\w</code> matches any single alphanumeric character, so the <code>e-6</code> inside a floating-point literal looks like two operands separated by a binary minus).</p>
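<p>To see what the pattern picks out, here is a minimal sketch that applies the same regular expression using Python&#8217;s <code>re</code> module rather than <code>csearch</code> (an assumption made for illustration; the two regular expression engines are not identical).  The first two sample lines are taken from the matches above, the third is made up, and the helper name <code>tight_additive_match</code> is hypothetical.</p>

```python
import re

# The csearch pattern from the post: an additive operator with no surrounding
# spaces, followed by a multiplicative operator with spaces on both sides.
TIGHT_ADDITIVE = re.compile(r"\w(\+|\-)\w +(\*|\/) +\w")


def tight_additive_match(line):
    """Return the text matched by the pattern, or None if there is no match."""
    found = TIGHT_ADDITIVE.search(line)
    return found.group(0) if found else None


samples = [
    "( T*( ONE+SQRT( ONE+S / T ) ) )",                   # from dlapack0.f
    "used += tv_save.tv_sec + 1e-6 * tv_save.tv_usec;",  # from nanohttp.c
    "x = a + b * c;",                                    # evenly spaced, made up
]

for line in samples:
    print(repr(tight_additive_match(line)), "in", line)
```

<p>The second sample matches only because of the exponent in <code>1e-6</code>, which is the kind of false positive described above; the evenly spaced third sample is not reported.</p>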
<p>The subexpression <code>ONE+S / T</code> is just the sort of thing I was looking for.  The three instances all involved code that processed <a href="http://en.wikipedia.org/wiki/Tridiagonal_matrix">tridiagonal matrices</a> in various special cases.  Google search combined with my knowledge of numerical analysis was not up to the task of figuring out whether the intended usage was <code>(ONE+S)/T</code> or <code>ONE+(S/T)</code>.</p>
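<p>Whatever the intent, the compiler&#8217;s reading is not in doubt: Fortran gives <code>/</code> higher precedence than binary <code>+</code>, so the code as written computes <code>ONE+(S/T)</code>.  Python has the same relative precedence for these two operators, so a small sketch can show how far apart the two readings are.  The values below are made up purely so that the results differ; they have nothing to do with the LAPACK routines.</p>

```python
# Made-up values, chosen only so that the two readings give different results.
ONE, S, T = 1.0, 3.0, 2.0

as_written = ONE+S / T        # spacing copied from the Fortran source
division_first = ONE+(S/T)    # the reading the precedence rules give
addition_first = (ONE+S)/T    # the reading the spacing suggests

print(as_written, division_first, addition_first)
```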
<p>Searches based on various other combinations of operator pairs failed to match anything that looked suspicious.</p>
<p>There was an order of magnitude performance difference for <code>csearch</code> vs. <code>grep -R -e</code> (real 0m0.167s vs. real 0m2.208s).  A very worthwhile improvement when searching much larger code bases with more complicated patterns.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2012/01/22/relative-spacing-of-operands-affects-perception-of-operator-precedence/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Information content of expressions</title>
		<link>http://shape-of-code.coding-guidelines.com/2009/12/11/information-content-of-expressions/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2009/12/11/information-content-of-expressions/#comments</comments>
		<pubDate>Fri, 11 Dec 2009 02:32:56 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[expressions]]></category>
		<category><![CDATA[faults]]></category>
		<category><![CDATA[information content]]></category>
		<category><![CDATA[measuring]]></category>
		<category><![CDATA[precedence]]></category>
		<category><![CDATA[readability]]></category>
		<category><![CDATA[source code]]></category>
		<category><![CDATA[whitespace]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=138</guid>
		<description><![CDATA[Software developers read source code to obtain information. How might the information content of source code be quantified? Both of the following functions assign the same value to x and if that is the only information a reader of that code is interested in, then the information content of both assignment statements could be said [...]]]></description>
			<content:encoded><![CDATA[<p>Software developers read source code to obtain information.  How might the information content of source code be quantified?</p>
<p>Both of the following functions assign the same value to <code>x</code> and if that is the only information a reader of that code is interested in, then the information content of both assignment statements could be said to be the same.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">int</span> foo<span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
x <span style="color: #339933;">=</span> <span style="color: #0000dd;">5</span><span style="color: #339933;">;</span>
...
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #993333;">int</span> bar<span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
x <span style="color: #339933;">=</span> <span style="color: #0000dd;">2</span> <span style="color: #339933;">+</span> <span style="color: #0000dd;">3</span><span style="color: #339933;">;</span>
...
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>A reader seeking deeper understanding of the above code would ask why the value <code>5</code> is built from two values in <code>bar</code>.  One reason might be that the author of the function wanted to explicitly call out background information about how the value <code>5</code> was derived (this is often done using symbolic names, but the use of literals themselves is sometimes encountered).  Perhaps the author of <code>foo</code> did not see the need to expose this information or perhaps the shared value is purely coincidental.</p>
<p>If the two representations denote the same quantity, doesn&#8217;t the second have a greater information content for a reader seeking deeper understanding?</p>
<p>In the following example:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #202020;">x</span> <span style="color: #339933;">+</span> y <span style="color: #339933;">&amp;</span> z ...
&nbsp;
...
&nbsp;
... <span style="color: #202020;">num_red</span> <span style="color: #339933;">+</span> num_white <span style="color: #339933;">&amp;</span> lower_bits ...</pre></div></div>

<p>an experienced developer with a knowledge of English is likely to interpret the second expression as adding two counts and using bit-wise AND to extract the lower bits of the result.  For some readers the second expression has a higher information content.  Would use of a longer name, such as <code>number_of_red</code>, further increase the information content?</p>
<p>In the following example the first expression has not added any information that was not already present in the first expression above (except perhaps that the author was not certain of the precedence or perhaps did not expect subsequent readers to be certain).</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #009900;">&#40;</span> x <span style="color: #339933;">+</span> y <span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> z ...
&nbsp;
...
&nbsp;
... <span style="color: #202020;">x</span> <span style="color: #339933;">+</span> <span style="color: #009900;">&#40;</span> y <span style="color: #339933;">&amp;</span> z <span style="color: #009900;">&#41;</span> ...</pre></div></div>

<p>The second expression uses parentheses to achieve an operand/operator binding that is different from the default.  Has this changed the information content of the expression?</p>
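<p>The difference between the default binding and the alternative is easy to check with concrete values.  The sketch below uses Python, where binary <code>+</code> binds more tightly than <code>&amp;</code>, just as it does in C; the variable names come from the earlier example and the values are made up.</p>

```python
# Illustrative values only; the variable names come from the earlier example.
num_red, num_white, lower_bits = 5, 6, 0x3

default_binding = num_red + num_white & lower_bits
add_first = (num_red + num_white) & lower_bits
and_first = num_red + (num_white & lower_bits)

print(default_binding, add_first, and_first)
```

<p>With these values the unparenthesized form agrees with <code>add_first</code>, while <code>and_first</code> gives a different result.</p>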
<p>There is <a href="http://www.knosof.co.uk/cbook/accu07.html">experimental evidence</a> that developers extract information from the names of variables to help them make decisions about operator precedence.  To me the name <code>all_32_bits_one</code> suggests a sequence of bits and I would expect such a representation to be associated with the bit-wise AND operator, not binary plus.  With no knowledge of the relative precedence of the two operators in the following expression the name of the middle operand would cause me to misinterpret the code.  Does this change the information content of the expression?  Does knowledge of the experimental evidence and the correct operator precedence change the information content (i.e., there is a potential fault in the code because the author may have assumed the incorrect precedence)?</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #202020;">num_red</span> <span style="color: #339933;">+</span> all_32_bits_one <span style="color: #339933;">&amp;</span> sign_bit ...</pre></div></div>

<p>There is <a href="http://cognitrn.psych.indiana.edu/rgoldsto/pdfs/landy08.pdf">experimental evidence</a> that people use the amount of whitespace appearing between operands and their operators to visually highlight operator precedence.</p>
<p>The relative quantities of whitespace used in the following two expressions appear to tell very different stories.  Do the two expressions have a different information content?</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #202020;">x</span>  <span style="color: #339933;">+</span>  y <span style="color: #339933;">&amp;</span> z ...
&nbsp;
...
&nbsp;
... <span style="color: #202020;">x</span> <span style="color: #339933;">+</span> y  <span style="color: #339933;">&amp;</span>  z ...</pre></div></div>
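<p>Whatever story the spacing tells a human reader, a parser sees no difference.  As a rough sketch, Python&#8217;s <code>ast</code> module can show this (Python is used here only because <code>+</code> and <code>&amp;</code> have the same relative precedence as in C; the helper name <code>parse_shape</code> is hypothetical).</p>

```python
import ast


def parse_shape(expression):
    """Dump the syntax tree of an expression, ignoring source positions."""
    return ast.dump(ast.parse(expression, mode="eval"))


wide_plus = parse_shape("x  +  y & z")
wide_and = parse_shape("x + y  &  z")
explicit = parse_shape("(x + y) & z")
other_binding = parse_shape("x + (y & z)")

# Spacing changes nothing; only the parentheses in other_binding do.
print(wide_plus == wide_and == explicit, wide_plus == other_binding)
```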

<p>The idea of measuring the information content of source code is very enticing.  However, an accurate measure requires knowledge of the kind of information a reader is trying to obtain and of information that already exists in their brain.</p>
<p>Another question is the ease with which information can be extracted from code.  Something that might be labeled as readability, except that readability has connotations of there being an abundant supply of information to extract.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2009/12/11/information-content-of-expressions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

