<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Shape of Code &#187; readability</title>
	<atom:link href="http://shape-of-code.coding-guidelines.com/tag/readability/feed/" rel="self" type="application/rss+xml" />
	<link>http://shape-of-code.coding-guidelines.com</link>
	<description></description>
	<lastBuildDate>Sun, 12 Feb 2012 20:42:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Readability: we know nothing</title>
		<link>http://shape-of-code.coding-guidelines.com/2011/06/30/readability-we-know-nothing/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2011/06/30/readability-we-know-nothing/#comments</comments>
		<pubDate>Thu, 30 Jun 2011 18:17:27 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cognitive psychology]]></category>
		<category><![CDATA[cost/benefit]]></category>
		<category><![CDATA[eye tracking]]></category>
		<category><![CDATA[readability]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=483</guid>
		<description><![CDATA[Readability is one of those terms that developers use and expect other developers to understand while at the same time being unable to define what it is or how it might be measured. I think all developers would agree that their own code is very readable; if only different developers stopped writing code in different [...]]]></description>
			<content:encoded><![CDATA[<p>Readability is one of those terms that developers use and expect other developers to understand while at the same time being unable to define what it is or how it might be measured.  I think all developers would agree that their own code is very readable; if only different developers stopped writing code in different ways the issue would go away <img src='http://shape-of-code.coding-guidelines.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Because I have written <a href="http://www.knosof.co.uk/cbook">a book</a> containing lots of material on cognitive psychology and how it might apply to programming, developers who have advanced beyond &#8220;Write code like me and it will be readable&#8221; sometimes ask for my perceived expert view on the subject.  Unfortunately my expertise has only advanced to the stage of: 1) having a good idea of what research questions need to be addressed, and 2) being able to point at experimental results showing that most claimed good readability tips are at best worthless and at worst may increase <a href="http://en.wikipedia.org/wiki/Cognitive_load">cognitive load</a> during reading.</p>
<p>To a good approximation we know nothing about code readability.  What questions need to be answered to change this situation?</p>
<p>The first and most important readability question is: what is the purpose of looking at the code?  Is the code being read to gain understanding (likely to involve &#8216;slow&#8217; and deliberate behavior) or is the reader searching for some construct (likely to involve skimming)?  Slow and deliberate reading is more accurate, but people make cost/benefit decisions when deciding which strategies to use; the factors involved in reader strategy selection are another important question.</p>
<p>Next we need to ask what characteristics of developer performance are expected to change with different code organization/layouts.  Are we interested in minimizing error, minimizing the time taken to achieve the reader&#8217;s purpose or something else?</p>
<p>What source code attributes play a significant role in readability?  Possibilities include the order in which various constructs appear (e.g., should variable definitions appear at the start of a function or close to where they are first used), <a href="http://www.knosof.co.uk/cbook/sent792.pdf">variable names</a> and the position of tokens relative to each other when viewed by the reader.</p>
<p>Questions involving the relative position of tokens probably generate the greatest volume of discussion among developers.  To what extent does visual organization of source code affect reader performance?  <a href="http://shape-of-code.coding-guidelines.com/2010/12/19/christmas-book-for-2010/">Fluent reading</a> requires a significant amount of practice; perhaps readable code is whatever developers have spent lots of time reading.</p>
<p>If there is some characteristic of the human visual system that generates a worthwhile benefit to splitting long lines so that a binary operator appears at the {end of the last}/{start of the next} line, will it apply the same way to all developers?  We could end up with developers having to configure their editor so it displays code in a form that matches the characteristics of their visual system.</p>
<p>How might these &#8216;visual&#8217; questions be answered?  I think that <a href="http://en.wikipedia.org/wiki/Eye_tracking">eye tracking</a> will play a large role (&#8220;Eyetracking Web Usability&#8221; by Jakob Nielsen and Kara Pernice is a good read).  At the moment there are technical/usability issues that make this kind of research very difficult.  Eye trackers capable of continuously supporting enough resolution to know which character on the screen a developer is looking at (e.g., <a href="http://www.sr-research.com/EL_1000.html">EyeLink 1000</a>) require that the head be held in a fixed position, while those allowing completely free head movement (e.g., <a href="http://mirametrix.com/products/eye-tracker/">S2 Eye Tracker</a>) don&#8217;t yet continuously support the required resolution.</p>
<p>Of course any theory derived from eye tracking experiments will still have to be validated by measuring <a href="http://shape-of-code.coding-guidelines.com/2009/01/20/readability-an-experimental-view/">developer performance on various code snippets</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2011/06/30/readability-we-know-nothing/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Christmas book for 2010</title>
		<link>http://shape-of-code.coding-guidelines.com/2010/12/19/christmas-book-for-2010/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2010/12/19/christmas-book-for-2010/#comments</comments>
		<pubDate>Sun, 19 Dec 2010 00:38:16 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[brain]]></category>
		<category><![CDATA[readability]]></category>
		<category><![CDATA[source code]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=334</guid>
		<description><![CDATA[I&#8217;m rather late with my list of Christmas books for 2010. While I do have a huge stack of books waiting to be read I don&#8217;t seem to have read many books this year (I have been reading lots of rather technical blogs this year, i.e., time/thought consuming ones) and there is only one book [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m rather late with my list of Christmas books for 2010.  While I do have a huge stack of books waiting to be read I don&#8217;t seem to have read many books this year (I have been reading lots of rather technical blogs this year, i.e., time/thought consuming ones) and there is only one book I would strongly recommend.</p>
<p>Anybody with even the slightest interest in code readability needs to read <a href="http://www.readinginthebrain.com/">Reading in the Brain</a> by <a href="http://www.unicog.org/main/pages.php?page=Stanislas_Dehaene">Stanislas Dehaene</a> (the guy who wrote The Number Sense, another highly recommended book).  The style of the book is halfway between being populist and being an undergraduate text.</p>
<p>Most of the discussion centers around the hardware/software processing that takes place in what Dehaene refers to as the <em>letterbox area</em> of the brain (in the <a href="http://readinginthebrain.pagesperso-orange.fr/img/small/Diapositive8.jpg">left occipito-temporal cortex</a>).  The hardware being neurons in the human brain and software being the connections between them (part genetically hardwired and part selectively learned as the brain&#8217;s owner goes through childhood; Dehaene is not a software developer and does not use this hardware/software metaphor).</p>
<p>As any engineer knows, knowledge of the functional characteristics of a system is essential when designing other systems to work with it.  Reading this book will help people separate out the plausible from the functionally implausible in discussions about code readability.</p>
<p>Time and again the reading process has co-opted brain functionality that appears to have been <a href="http://readinginthebrain.pagesperso-orange.fr/img/small/Diapositive35.jpg">designed to perform other activities</a>.  During the evolution of writing there also seems to have been some <a href="http://readinginthebrain.pagesperso-orange.fr/img/small/Diapositive39.jpg">adaptation to existing processes in the brain</a>; a lesson here for people designing code visualization tools.</p>
<p>In my C book I tried to provide an <a href="http://www.coding-guidelines.com/cbook/sent770.pdf">overview of the reading process</a> but skipped discussing what went on in the brain, partly through ignorance on my part and also a belief that we were a long way from having an accurate model.  Dehaene&#8217;s book clearly shows that a good model of what goes on in the brain during reading is now available.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2010/12/19/christmas-book-for-2010/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Information content of expressions</title>
		<link>http://shape-of-code.coding-guidelines.com/2009/12/11/information-content-of-expressions/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2009/12/11/information-content-of-expressions/#comments</comments>
		<pubDate>Fri, 11 Dec 2009 02:32:56 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[expressions]]></category>
		<category><![CDATA[faults]]></category>
		<category><![CDATA[information content]]></category>
		<category><![CDATA[measuring]]></category>
		<category><![CDATA[precedence]]></category>
		<category><![CDATA[readability]]></category>
		<category><![CDATA[source code]]></category>
		<category><![CDATA[whitespace]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=138</guid>
		<description><![CDATA[Software developers read source code to obtain information. How might the information content of source code be quantified? Both of the following functions assign the same value to x and if that is the only information a reader of that code is interested in, then the information content of both assignment statements could be said [...]]]></description>
			<content:encoded><![CDATA[<p>Software developers read source code to obtain information.  How might the information content of source code be quantified?</p>
<p>Both of the following functions assign the same value to <code>x</code> and if that is the only information a reader of that code is interested in, then the information content of both assignment statements could be said to be the same.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">int</span> foo<span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
x <span style="color: #339933;">=</span> <span style="color: #0000dd;">5</span><span style="color: #339933;">;</span>
...
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #993333;">int</span> bar<span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
x <span style="color: #339933;">=</span> <span style="color: #0000dd;">2</span> <span style="color: #339933;">+</span> <span style="color: #0000dd;">3</span><span style="color: #339933;">;</span>
...
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>A reader seeking deeper understanding of the above code would ask why the value <code>5</code> is built from two values in <code>bar</code>.  One reason might be that the author of the function wanted to explicitly call out background information about how the value <code>5</code> was derived (this is often done using symbolic names, but the use of literals themselves is sometimes encountered).  Perhaps the author of <code>foo</code> did not see the need to expose this information or perhaps the shared value is purely coincidental.</p>
<p>If the two representations denote the same quantity doesn&#8217;t the second have a greater information content for a reader seeking deeper understanding?</p>
<p>In the following example:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #202020;">x</span> <span style="color: #339933;">+</span> y <span style="color: #339933;">&amp;</span> z ...
&nbsp;
...
&nbsp;
... <span style="color: #202020;">num_red</span> <span style="color: #339933;">+</span> num_white <span style="color: #339933;">&amp;</span> lower_bits ...</pre></div></div>

<p>an experienced developer with a knowledge of English is likely to interpret the expression as adding the number of occurrences of two quantities and using bit-wise AND to extract the lower bits.  For some readers the second expression has a higher information content.  Would use of a longer name such as <code>number_of_red</code> further increase the information content?</p>
<p>In the following example the first expression has not added any information that was not already present in the first expression above (except perhaps that the author was not certain of the precedence or perhaps did not expect subsequent readers to be certain).</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #009900;">&#40;</span> x <span style="color: #339933;">+</span> y <span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> z ...
&nbsp;
...
&nbsp;
... <span style="color: #202020;">x</span> <span style="color: #339933;">+</span> <span style="color: #009900;">&#40;</span> y <span style="color: #339933;">&amp;</span> z <span style="color: #009900;">&#41;</span> ...</pre></div></div>

<p>The second expression uses parentheses to achieve an operand/operator binding that is different from the default.  Has this changed the information content of the expression?</p>
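<p>The default binding can be checked directly.  What follows is a minimal sketch, not something taken from the experiments discussed in this post; the function names and the sample values are illustrative assumptions.  In C binary <code>+</code> binds more tightly than bit-wise <code>&amp;</code>, so the unparenthesized form should behave like the first parenthesized expression and differently from the second.</p>

```c
#include <assert.h>

/* The expression exactly as written in the earlier example: no parentheses. */
static unsigned unparenthesized(unsigned x, unsigned y, unsigned z)
{
    return x + y & z;
}

/* Parentheses that restate the default binding. */
static unsigned add_then_and(unsigned x, unsigned y, unsigned z)
{
    return (x + y) & z;
}

/* Parentheses that override the default binding. */
static unsigned and_then_add(unsigned x, unsigned y, unsigned z)
{
    return x + (y & z);
}

/* Self-check using arbitrary sample values (assumed for illustration). */
static void check_default_binding(void)
{
    unsigned x = 6, y = 3, z = 1;

    /* Redundant parentheses do not change the value computed... */
    assert(unparenthesized(x, y, z) == add_then_and(x, y, z));
    /* ...while parentheses around y & z do, for these values. */
    assert(add_then_and(x, y, z) != and_then_add(x, y, z));
}
```

<p>Calling <code>check_default_binding()</code> from a test driver exercises both claims.  Some compilers can be asked to warn about the unparenthesized form (GCC documents a <code>-Wparentheses</code> option covering operators whose precedence people often confuse), which is itself a hint that readers are not expected to be certain of the default.</p>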
<p>There is <a href="http://www.knosof.co.uk/cbook/accu07.html">experimental evidence</a> that developers extract information from the names of variables to help them make decisions about operator precedence.  To me the name <code>all_32_bits_one</code> suggests a sequence of bits and I would expect such a representation to be associated with the bit-wise AND operator, not binary plus.  With no knowledge of the relative precedence of the two operators in the following expression the name of the middle operand would cause me to misinterpret the code.  Does this change the information content of the expression?  Does knowledge of the experimental evidence and the correct operator precedence change the information content (i.e., there is a potential fault in the code because the author may have assumed the incorrect precedence)?</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #202020;">num_red</span> <span style="color: #339933;">+</span> all_32_bits_one <span style="color: #339933;">&amp;</span> sign_bit ...</pre></div></div>

<p>There is <a href="http://cognitrn.psych.indiana.edu/rgoldsto/pdfs/landy08.pdf">experimental evidence</a> that people use the amount of whitespace appearing between operands and their operators to visually highlight operator precedence.</p>
<p>The relative quantities of whitespace used in the following two expressions appear to tell very different stories.  Do the two expressions have a different information content?</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">... <span style="color: #202020;">x</span>  <span style="color: #339933;">+</span>  y <span style="color: #339933;">&amp;</span> z ...
&nbsp;
...
&nbsp;
... <span style="color: #202020;">x</span> <span style="color: #339933;">+</span> y  <span style="color: #339933;">&amp;</span>  z ...</pre></div></div>
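<p>Whatever story the spacing tells a human reader, the compiler sees the same token sequence in both cases.  The sketch below illustrates this under assumed names and sample values (none of them come from the cited experiment): both spacings compute <code>(x + y) &amp; z</code>.</p>

```c
#include <assert.h>

/* Wide spacing around +, which visually groups y with z. */
static unsigned spaced_around_plus(unsigned x, unsigned y, unsigned z)
{
    return x  +  y & z;
}

/* Wide spacing around &, which visually groups x with y. */
static unsigned spaced_around_and(unsigned x, unsigned y, unsigned z)
{
    return x + y  &  z;
}

/* The compiler discards the spacing: both forms compute (x + y) & z. */
static void check_spacing_is_ignored(void)
{
    unsigned x = 6, y = 3, z = 1;

    assert(spaced_around_plus(x, y, z) == spaced_around_and(x, y, z));
    assert(spaced_around_plus(x, y, z) == ((x + y) & z));
}
```

<p>Any difference in information content between the two layouts therefore exists only for the human reader, which is what makes the first layout a potential source of misreading.</p>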

<p>The idea of measuring the information content of source code is very enticing.  However, an accurate measure requires knowledge of the kind of information a reader is trying to obtain and of information that already exists in their brain.</p>
<p>Another question is the ease with which information can be extracted from code.  This might be labeled as readability, except that readability has connotations of there being an abundant supply of information to extract.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2009/12/11/information-content-of-expressions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Readability, an experimental view</title>
		<link>http://shape-of-code.coding-guidelines.com/2009/01/20/readability-an-experimental-view/</link>
		<comments>http://shape-of-code.coding-guidelines.com/2009/01/20/readability-an-experimental-view/#comments</comments>
		<pubDate>Tue, 20 Jan 2009 01:54:50 +0000</pubDate>
		<dc:creator>Derek-Jones</dc:creator>
				<category><![CDATA[empirical]]></category>
		<category><![CDATA[cluster analysis]]></category>
		<category><![CDATA[experimental]]></category>
		<category><![CDATA[random]]></category>
		<category><![CDATA[readability]]></category>
		<category><![CDATA[students]]></category>

		<guid isPermaLink="false">http://shape-of-code.coding-guidelines.com/?p=61</guid>
		<description><![CDATA[Readability is an attribute that source code is often claimed to have, but what is it? While people are happy to use the term they have great difficulty in defining exactly what it is (I will eventually get around to discussing my own views in a post). Ray Buse took a very simple approach to answering [...]]]></description>
			<content:encoded><![CDATA[<p>Readability is an attribute that source code is often claimed to have, but what is it?  While people are happy to use the term they have great difficulty in defining exactly what it is (I will eventually get around to discussing my own views in a post).  <a href="http://www.arrestedcomputing.com">Ray Buse</a> took a very simple approach to answering this question: he asked lots of people (to be exact 120 students) to rate short snippets of code and analysed the results.  Like all good researchers he made his <a href="http://www.arrestedcomputing.com/readability/">data available to others</a>.  This posting discusses my thoughts on the expected results and some analysis of the results.</p>
<p>The subjects were first, second, third year undergraduates and postgraduates.  I would not expect first year students to know anything, and would expect their results to be essentially random.  Over the years, as they gain more experience, I would expect individual views on what constitutes readability to stabilize.  The input from friends, teachers, books and web pages might be expected to create some degree of agreement between different students&#8217; views of what constitutes readability.  I&#8217;m not saying that this common view is correct or bears any relationship to views held by other groups of people, only that there might be some degree of convergence within a group of people studying together.</p>
<p>Readability is not something that students can be expected to have explicitly studied (I&#8217;m assuming that it plays an insignificant part in any course marks), so their knowledge of it is implicit.  Some students will enjoy writing code and spend lots of time doing it while (many) others will not.</p>
<p>Separating out the data by year, the results for first year students look like a normal distribution with a slight bulge on one side (created using <code>plot(density(rating_data_1_to_5))</code> in <a href="http://www.r-project.org">R</a>).</p>
<p><img src="http://www.coding-guidelines.com/images/cs101density.jpg" alt="First year" /></p>
<p>Year by year this bulge turns (second year):</p>
<p><img src="http://www.coding-guidelines.com/images/cs216density.jpg" alt="Second year" /></p>
<p>into a hillock (final year):</p>
<p><img src="http://www.coding-guidelines.com/images/cs415density.jpg" alt="Final year" /></p>
<p>It is tempting to interpret these results as the majority of students assigning an essentially random rating, with a slight positive bias, for the readability of each snippet, with a growing number of more experienced students assigning less than average rating to some snippets.</p>
<p>Do the students&#8217; views converge to a common opinion on readability?  The answer appears to be no.  An analysis of the final year data using <a href="http://en.wikipedia.org/wiki/Fleiss'_kappa.">Fleiss&#8217;s Kappa</a> shows that there is virtually no agreement between students&#8217; ratings.  In fact every <a href="http://cran.r-project.org/web/packages/irr/index.html">Interrater Reliability and Agreement</a> function I tried said the same thing.  Some <a href="http://en.wikipedia.org/wiki/Data_clustering">cluster analysis</a> might enable me to locate students holding similar views.</p>
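<p>For readers unfamiliar with the statistic, Fleiss&#8217; kappa compares the observed proportion of agreeing rater pairs with the proportion expected if ratings were handed out by chance; values near 1 indicate strong agreement and values near or below 0 indicate none.  The sketch below is an independent illustration written in C, not the R functions used for the analysis above; the function name, the layout of the <code>counts</code> table and the ratings in the self-check are all made up for the example.</p>

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define NUM_CATEGORIES 5   /* ratings run from 1 to 5 */

/*
 * Fleiss' kappa for num_subjects snippets, each rated by num_raters people
 * (num_raters must be at least 2).
 * counts[i][j] is the number of raters who gave snippet i the rating j+1.
 * Returns NAN when kappa is undefined (every rating in a single category).
 */
static double fleiss_kappa(size_t num_subjects, unsigned num_raters,
                           const unsigned counts[][NUM_CATEGORIES])
{
    double category_total[NUM_CATEGORIES] = {0.0};
    double mean_agreement = 0.0;   /* observed agreement, averaged over snippets */
    double chance_agreement = 0.0; /* agreement expected by chance */

    for (size_t i = 0; i < num_subjects; i++) {
        double sum_squares = 0.0;
        for (size_t j = 0; j < NUM_CATEGORIES; j++) {
            double n_ij = counts[i][j];
            category_total[j] += n_ij;
            sum_squares += n_ij * n_ij;
        }
        /* Proportion of rater pairs that agree on snippet i. */
        mean_agreement += (sum_squares - num_raters)
                          / ((double)num_raters * (num_raters - 1));
    }
    mean_agreement /= (double)num_subjects;

    for (size_t j = 0; j < NUM_CATEGORIES; j++) {
        /* Overall proportion of ratings that fell in category j. */
        double p_j = category_total[j] / ((double)num_subjects * num_raters);
        chance_agreement += p_j * p_j;
    }

    if (chance_agreement >= 1.0)
        return NAN;
    return (mean_agreement - chance_agreement) / (1.0 - chance_agreement);
}

/* Made-up ratings (not Buse's data): three snippets, four raters in full agreement. */
static void check_perfect_agreement(void)
{
    const unsigned counts[3][NUM_CATEGORIES] = {
        {4, 0, 0, 0, 0},
        {0, 0, 4, 0, 0},
        {0, 0, 0, 0, 4},
    };
    double kappa = fleiss_kappa(3, 4, counts);

    assert(kappa > 0.999999 && kappa < 1.000001);
}
```

<p>Feeding the function a table in which every snippet receives one rating in each category gives a negative value, i.e., less agreement than chance would produce, which is the kind of result that indicates no shared view of readability.</p>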
<p>In an email exchange with Ray Buse I learned that the postgraduate students had a relatively wide range of computing expertise, so I did not analyse their results.</p>
<p>I wish I had thought of this approach to measuring readability.  Its simplicity makes it amenable to use in a wide range of experimental situations.  The one change I would make is that I would explicitly create the snippets to have certain properties, rather than randomly extracting them from existing source.</p>
]]></content:encoded>
			<wfw:commentRss>http://shape-of-code.coding-guidelines.com/2009/01/20/readability-an-experimental-view/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

