NWIP for Monochrome inkjet yield

February 23rd, 2017

As a member of IST/5, the British Standards’ programming language committee, I receive a daily notification of relevant documents that have arrived at BSI. The email arrives just before midnight and contains a generous helping of acronyms, such as: N13344 SC 28 ISO-IECJTC1-SC28 N2051 NWIP for Monochrome inkjet yield.

The line break on the above line resulted in “Monochrome inkjet yield” appearing at the start of a line and it caught my attention, so I downloaded the document.

SC28 is the ISO committee for office equipment and this NWIP (New Work Item Proposal) is for WG2 (the Working Group responsible for consumables) to create a new ISO Standard with the title: “Method for the Determination of Ink Cartridge Yield for Monochrome Inkjet Printers and Multifunction Devices that Contain Printer Components”. Voting, on whether or not work should start on this proposal, closes on July 12.

Why was information about inkjet yield sent to a programming language list? Are SC28/WG2 having a membership drive, having been tipped off that our workload is declining? More importantly, are they following the C++ model of having regular meetings in Hawaii? The paperwork does not say. The standard for color inkjet printers appeared in 2009; was the production of that document such a traumatic event that it decimated committee membership, and it has taken eight years to put together a skeleton group?

Attached to the proposal is a 20-page draft document; somebody has been busy.

So how is it proposed that monochrome inkjet yield be calculated? You need at least nine inkjet cartridges, three printers and a room at a temperature of 23 degrees (plus/minus 2 degrees, with readings taken every 15 minutes and an hourly running average calculated; “… temperature can have a profound effect on test results.”). Load “… a common medium weight paper and must conform to the printer’s list of approved papers.” into the three printers, which have been “… temperature acclimated to the test room environment.”, and count the number of pages printed by each printer (using at least three cartridges in each printer) before “…an end of life judgement.” Divide the total number of pages printed by the total number of cartridges used and there you go.
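Stripped of the environmental ceremony, the calculation itself is a one-liner; here is a minimal sketch in Python (the page counts are made up, and the layout of the test records is my own invention, not the draft’s):

# Hypothetical per-cartridge page counts, grouped by printer; at least three
# printers and at least three cartridges per printer are required.
pages_printed = {
    "printer_1": [215, 203, 221],
    "printer_2": [198, 210, 207],
    "printer_3": [225, 199, 212],
}

total_pages = sum(sum(counts) for counts in pages_printed.values())
total_cartridges = sum(len(counts) for counts in pages_printed.values())

declared_yield = total_pages / total_cartridges
print(f"Declared yield: {declared_yield:.1f} pages per cartridge")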

End of life? “The cartridge yield is determined by an end of life judgement, or signalled with either of two phenomena: fade, caused by depletion of ink in the cartridge or automatic printing stop caused by an Ink Out detection function.”

What is fade?
“3.1 Fade
A phenomenon where a significant reduction in uniformity occurs due to ink depletion.
NOTE In this test, fade is defined as a noticeably lighter, 3 mm or greater, gap located in the text, in the bar chart, or in the boxes around the periphery of the test page. The determination of the change in lightness is to be made referenced to the 25th page printed for each cartridge in testing. For examples of fade, please consult Annex A.”

And Annex A?
“Examples of Fade <future edit: add picture>”

Formulas for calculating the standard deviation and a 90% confidence interval are given (the 90% confidence interval formula assumes a Normal distribution; I would have thought that the distribution of pages printed by a cartridge might be skewed, and that a bootstrap procedure would be more reliable).
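For what it is worth, a bootstrap interval needs only a few lines of Python; a minimal sketch, using the same made-up page counts (resample the per-cartridge counts with replacement and take percentiles of the resampled means, no Normality assumption required):

import random

page_counts = [215, 203, 221, 198, 210, 207, 225, 199, 212]  # hypothetical data

def bootstrap_ci(data, num_resamples=10000, level=0.90):
    # Resample with replacement, collect the mean of each resample,
    # and return the percentile interval of those means.
    means = []
    for _ in range(num_resamples):
        resample = [random.choice(data) for _ in data]
        means.append(sum(resample) / len(resample))
    means.sort()
    lower = means[int((1 - level) / 2 * num_resamples)]
    upper = means[int((1 + level) / 2 * num_resamples)]
    return lower, upper

print(bootstrap_ci(page_counts))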

It is daylight now and my interest in inkjet yield is satiated. But if you, dear reader, have a longing for more, then Ms. Michelle Pangborn (Hewlett-Packard), USA or Mr. Nobuaki Hamada (Epson), Japan are the people to contact.

Some printer test pages to add to your link collection.

DACS: Software Life Cycle Empirical/Experience Database

February 19th, 2017

Economic data relating to software development is very, very hard to find. Companies just don’t want to reveal how much they spent on, or charged for, writing a software system. This kind of data is invariably confidential.

I’m currently working on the Economics chapter of my book on Empirical Software Engineering and the data is somewhat thin.

I’m hoping one of my readers can help out with a copy of the “DACS data”.

DACS (The Data & Analysis Center for Software), a US DoD information analysis center, used to sell copies of their Software Life Cycle Empirical/Experience Database for $50. The most interesting data set was the DACS Productivity Dataset, containing effort and schedule data on over 500 software projects.

DACS was merged into CSIAC (Cyber Security & information systems Information Analysis Center; not sure if I capitalized the appropriate information) and the data availability is no more.

If you have a copy of this data, or know somebody who does, please send me a copy.

The person who put the data together, Richard Nelson, no longer works for the government, has a consulting firm registered in Orlando, and is an officer of the NASA Alumni League Florida Chapter. All the obvious searches for an email address fail, and I suspect that a retirement is being enjoyed.

Of course I am always happy to hear about any software engineering data that you think I don’t have.

Fault density: so costly to calculate that few values are reliable

February 10th, 2017

Fault density (i.e., number of faults per thousand lines of code) often appears in claims relating to software quality.

Fault density sounds like a very useful value to know; unfortunately most quoted values are meaningless, because obtaining reliable data is very costly.

The starting point for calculating fault density is the number of reported faults (I will leave the complexity of what constitutes a line of code for a future post). Most faults don’t get reported.

If there are no reported faults, fault density is zero. The more often software is executed the more likely a fault will be experienced (i.e., the larger the range of input values thrown at a program, the more likely it is to go down a path containing a fault). Comparing like-with-like requires knowing how many different kinds of input a program processed to experience a given number of faults; we don’t want to fall into the trap of claiming heavily used code is less fault prone than lightly used code.

What counts as a fault? One study found that 46% of reported faults in Open Source bug tracking systems were misclassified (e.g., a fault report was actually a request for enhancement). Again, comparing like-with-like requires agreement on what constitutes a fault.

How should faults in code that is no longer shipped be counted? If the current version of a program contains 100K lines and previous versions contained 50K lines that have been deleted, should the faults in those 50K lines contribute to the fault density of the current program? I would say not, which means somebody has to figure out which reported faults apply to code in the current version of the program.
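The arithmetic is trivial; it is the bookkeeping that is costly. A sketch of what has to be decided before any division can happen (the field names and values are hypothetical):

# Hypothetical fault reports; deciding 'is_real_fault' and 'in_current_version'
# is the expensive, manual part of the calculation.
fault_reports = [
    {"id": 1, "is_real_fault": True,  "in_current_version": True},
    {"id": 2, "is_real_fault": False, "in_current_version": True},   # enhancement request
    {"id": 3, "is_real_fault": True,  "in_current_version": False},  # code since deleted
]

current_kloc = 100  # thousand lines of code in the shipped version

countable = [r for r in fault_reports
             if r["is_real_fault"] and r["in_current_version"]]
fault_density = len(countable) / current_kloc
print(f"Fault density: {fault_density:.3f} faults per KLOC")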

I am aware of fewer than half a dozen fault density values that I would consider reliable (most calculated during the Rome period). Everything else is little better than reading tea-leaves.

I have been reading your interesting paper

February 2nd, 2017

In the last six years or so I have sent around 420 emails whose first line started: “I have been reading your interesting paper”, followed a few lines later by: “Would it be possible to obtain a copy of the data?”, and then some background and links to blog posts and my previous book.

The response breakdown is roughly as follows:

Received data                       136  32%
No reply                            132  32%
Pending (received a positive reply)  49  12%
Confidential                         42  10%
No longer have the data              20   5%
Best known address bounces           11   3%

Thanks to those 136 researchers who took the time to collect together their data and send me a copy.

The “No reply” responses get a second email 6-9 months after the first. I’m hoping that the availability of a draft of the book will generate some positive publicity that reminds researchers they have had an email from me and are missing out.

The “Confidential” case is relatively low because it is often obvious that the data is confidential and I don’t bother asking for a copy (I only use data that can be made public).

A common reason behind “No longer have the data” is a change of laptop, and sometimes a change of job. If the paper is more than five years old, I tend not to ask unless the data looks very interesting. My experience, and that of others, shows that research data has a relatively short half-life.

I try quite hard to find a workable address, sometimes emailing supervisors and going via LinkedIn.


Empirical Software Engineering using R: first draft available for download

January 29th, 2017

A draft of my book Empirical Software Engineering using R is now available for download.

The book essentially comes in two parts:

  • statistical techniques that are useful for analyzing software engineering data. This draft release contains most of the techniques I plan to cover. I am interested in hearing about any techniques you think ought to be covered, but I only cover techniques when real data is available to use in an example,
  • six chapters covering what I consider to be the primary aspects of software engineering. This draft release includes the Human Cognitive Characteristics chapter and I am hoping to release one each of the remaining chapters every few months (Economics is next).

There is a page for making suggestions and problem reports.

All the code+data is available and I am claiming to have a copy of all the important, publicly available, software engineering data. If you know of any I don’t have, please let me know.

I am looking for a publisher. The only publisher I have had serious discussions with decided not to go ahead because of my insistence on releasing a free copy of the pdf. Self-publishing is a last resort.


Full Fact checking of number words

January 22nd, 2017

I was at the Full Fact hackathon last Friday (yes, a weekday hackathon; it looked interesting and interesting hackathons have been very thin on the ground in the last six months). Full Fact is an independent fact checking charity; the event was hosted by Facebook.

Full Fact are aiming to check facts in real-time, for instance tweeting information about inaccurate statements made during live political debates on TV. Real-time response sounds ambitious, but they are willing to go with what is available, e.g., previously checked facts built up from intensive checking activities after programs have been aired.

The existing infrastructure is very basic; it is still early days.

Being a numbers person I volunteered to help out analyzing numbers. Transcriptions of what people say often contain numbers written as words rather than numeric literals, e.g., eleven rather than 11. Converting number words to numeric literals would enable searches to be made over a range of values. There is an existing database of checked facts, and Solr is the search engine used in-house; it supports numeric range searches over numeric literals.

Converting number words to numeric literals sounds like a common problem and I expected to be able to choose from a range of fancy Python packages (the in-house development language).

Much to my surprise, the best existing code I could find was rudimentary (e.g., no support for fractions or ranking words such as first, second).

spaCy was used to tokenize sentences and decide whether a token was numeric, and text2num converted the token to a numeric literal (nltk has not kept up with advances in NLP).
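For plain cardinals the conversion logic is short enough to sketch from scratch; the following is an illustration of the idea, not the hackathon code (which leaned on spaCy and text2num), and handles none of the awkward cases discussed below:

# Minimal word-to-number conversion for simple cardinals; no fractions,
# ranking words, or vague phrases such as 'several thousand'.
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}
SCALES = {"thousand": 1000, "million": 1000000}

def words_to_number(text):
    current, total = 0, 0
    for word in text.lower().replace("-", " ").replace(" and ", " ").split():
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            total += current * SCALES[word]
            current = 0
        else:
            raise ValueError("not a number word: " + word)
    return total + current

print(words_to_number("three hundred and twenty-seven thousand eleven"))  # 327011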

I quickly encountered a bug in spaCy, which failed to categorize eighteen as a number word; an update was available on github a few hours after I reported the problem+fix :-). The fact that such an obvious problem had not been reported before suggests that few people are using this functionality.

Jenna, the other team member writing code, used beautifulsoup to extract sentences from the test data (formatted in XML).

Number words do not always have clear cut values, e.g., several thousand, thousands, high percentage, and character sequences that could be dates. Then there are fraction words (e.g., half, quarter) and ranking words (e.g., first, second), all everyday usages that will need to be handled. It is also important to be able to distinguish between dates, percentages and ‘raw’ numbers.

The UK is not the only country with independent fact checking organizations. A member of Chequeado, in Argentina, was at the hack. Obviously number-word handling will have to cope with the conventions of other languages.

Full Fact are looking to run more hackathons in the UK. Keep your eyes open for hackathon announcements. In the meantime, if you know of a good Python library for handling word to number conversion, please let me know.


The future evolutionary cycle of application software?

January 13th, 2017

At some time in the future (or perhaps it has already happened) all the features needed (by users) in a widely used application will have been implemented in that application. Once this point is reached, do the software developers involved go off and do something else (leaving a few behind to fix lingering faults)? This is not good news for software developers; perhaps they should continue adding features and hope that users don’t notice.

When the application is a commercial product there is every incentive for new releases to be driven by income from upgrades rather than user needs. When users stop paying for upgrades it is time to shift to renting the application in the cloud rather than selling licenses.

With an Open Source application most of the development may be funded commercially, or may be funded by the enjoyment that the primary developers obtain from what they do. For renting to be a viable option, a major service component needs to be included, e.g., Github offers hosting along with the use of Git.

Halting development on commercial products is easy; it happens automatically when income from paying customers drops below the cost of development. Work on Open Source is not so easily halted. The enjoyment from writing software does not rely on external funding, it is internally generated (having other people use your software is always a buzz and is a kind of external funding).

If the original core developers of an Open Source project move onto something else and nobody takes over, the code stops changing. However, this might only be the death of one branch, not the end of the road for development of what the application does. Eventually another developer decides it would be fun to reimplement the application in their favorite language. An example of this is Asciidoc (a document formatter), where the core developer decided to terminate personal involvement at the end of 2013 (a few people are making local updates to their own copies of the source, at least according to the Github fork timeline). Another developer appeared on the scene and decided to reimplement the functionality in Ruby, as Asciidoctor.

Reimplementation of a tool in another language is a surprisingly common activity. There is a breed of developers who think that programs written in the language currently occupying their thoughts are magically better than the same program written in another language. At the moment Rust is an easy entanglement for those needing a language to love.

Over time, it will become harder and harder to install and run Asciidoc, because the ecosystem of libraries it depends on has evolved away from the behavior it relies on. Asciidoctor will become the default choice because it works on the available platforms. Eventually the core developer of Asciidoctor will terminate his personal involvement; and then? Perhaps somebody will step forward to maintain the Ruby version, or perhaps somebody will decide to reimplement in another language and around we go again.

The evolutionary cycle of application software in the future is starting to look like it will be:

  1. developer(s) with enthusiasm and time on their hands reimplement an application (which is itself version n-1 of that application) in the language they love,
  2. time passes and users accumulate, while the developer(s) actively support application_n,
  3. those involved terminate involvement in supporting application_n,
  4. more time passes, during which the software ecosystem that application_n depends on changes,
  5. successfully installing and running application_n is now so difficult that most users have migrated to application_(n+1).

Of course users will complain, but they don’t count in the world of Open Source (the role of users in Open Source is to provide adulation from which the core developers can extract sustenance).

Understanding where one academic paper fits in the plot line

January 1st, 2017

Reading an academic paper is rather like watching an episode of a soap opera: unless you have been watching for a while, you will have little idea of the roles played by the actors or the background to what is happening. A book is like a film, in that it has a beginning-middle-end and what you need to know is explained.

Sitting next to somebody who has been watching for a while is a good way of quickly getting up to speed, but what do you do if no such person is available or ignores your questions?

How do you find out whether you are watching a humdrum episode or a major pivotal moment? Typing the paper’s title into Google, in quotes, can provide a useful clue; the third line of the first result returned will contain something like ‘Cited by 219’ (probably a much lower number, with no ‘Cited by’ meaning no citations). The number is a count of the other papers that cite the one searched on. Over 50% of papers are never cited, very recently published papers are too new to have any citations, and a very few old papers accumulate thousands of citations.

Clicking on the ‘Cited by’ link will take you to Google Scholar and the list of later episodes involving the one you are interested in. Who are the authors of these later episodes (the names appear in the search results)? Have they all been written by the author of the original paper, i.e., somebody wandering down the street mumbling to himself? What are the citation counts of these papers? Perhaps the mumbler did something important in a later episode that attracted lots of attention, but for some reason you are looking at an earlier episode leading up to the pivotal moment.

Don’t be put off by a low citation count. Useful work is not always fashionable and authors tend to cite what everybody else cites.

How do you find out about the back story? Papers are supposed to contain a summary of the back story of the work leading up to the current work, along with a summary of all related work. Page length restrictions (conferences invariably place a limit on the maximum length of a paper, e.g., 8 or 10 pages) mean that these summaries tend to be somewhat brief. The back story+related work summaries will cite earlier episodes, which you will then have to watch to find out a bit more about what is going on; yes, you guessed it, there is a rinse-and-repeat cycle tracing episodes further and further back. If you are lucky you will find a survey article, which summarizes what is known based on everything published up to a given point in time (in active fields surveys are published around every 10 years, with longer gaps in less active fields), or you will find the author’s PhD thesis (this is likely to happen for papers published a few years after the PhD; a thesis is supposed to have a film-like quality to it and some do get published as books).

A couple of points about those citations you are tracing. Some contain typos (Google failing to return any matches for a quoted title is a big clue), some cite the wrong paper (invariably a cut-and-paste error by the author), some citations are only there to keep a referee happy (the anonymous people chosen to review a paper, to decide whether it is worth publishing, have been known to suggest that their own work, or that of a friend, be cited), and some citations are only listed because everybody else cites them, while the cited work says the opposite of what everybody claims it says (don’t assume that just because somebody cites a paper they have actually read it; the waterfall paper is the classic example of this).

After a week or two you should be up to speed on what is happening on the soap you are following.


Failed projects + the Cloud = Software reuse

December 26th, 2016

Code reuse is one of those things that sounds like a winning idea to those outside of software development; those who write software for a living are happy to reuse other people’s code, but don’t want the hassle involved with others reusing their own code. From the management point of view, where is the benefit in having your developers help others get their product out the door when they should be working towards getting your product out the door?

Lots of projects get cancelled after significant chunks of software have been produced, some of it working. It would be great to get some return on this investment, but the likely income from selling software components is rarely large enough to make it worthwhile investing the necessary resources. The attractions of the next project soon appear more enticing than hanging around baby-sitting software from a cancelled project.

Cloud services, e.g., AWS and Azure to name two, look like they will have a big impact on code reuse. The components of a failed project, i.e., those bits that work tolerably well, can be packaged up as a service and sold/licensed to other companies using the same cloud provider. Companies are already offering a wide variety of third-party cloud services; presumably the new software got written because no equivalent service was available on the provider’s cloud; well, perhaps others are looking for just this service.

The upfront cost of sales is minimal; the services provided by your re-purposed software get listed in various service directories. The software can just sit there waiting for customers to come along, or you could put some effort into drumming up customers. If sales pick up, it may become worthwhile offering support and even making enhancements.

What about the software built for non-failed projects? Software is a force multiplier and anybody working on a non-failed project wants to use this multiplier for their own benefit, not spend time making it available for others (I’m not talking about creating third-party APIs).

Is sorting a list of names racial discrimination?

December 21st, 2016

Governments are starting to notice the large, and growing, role that algorithms have in the everyday life of millions of people. There is now an EU regulation, EU 2016/679, covering “… the protection of natural persons with regard to the processing of personal data…”

The wording in Article 22 has generated some waves: “The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her”

But I think something much bigger is tucked away in a subsection of Article 14 paragraph 2 “…the controller shall provide the data subject with the following information…”, subsection (g) “…meaningful information about the logic involved…” Explaining the program logic involved to managers who are supposed to have some basic ability for rational thought is hard enough, but the general public?

It is not necessary for the general public to acquire a basic understanding of the logic behind some of the decisions made by computers; rabble-rousing by sections of the press and social media can have a big impact.

A few years ago I was very happy to see a noticeable reduction in my car insurance. This reduction was not the result of anything I had done, but because insurance companies were no longer permitted to discriminate on the basis of gender; men had previously paid higher car insurance premiums because the data showed they were a higher risk than women (who used to pay lower premiums). At last, some of the crazy stuff done in the name of gender equality benefited men.

Sorting would appear to be discrimination free, but ask any taxi driver about appearing first in a list of taxi phone numbers. Taxi companies are not called A1, AA, AAA because the owners are illiterate, they know all too well the power of appearing at the front of a list.

If you are in the market for a compiler writer whose surname starts with J (I have seen people make choices with less rationale than this), the following is obviously the most desirable expert listing (I don’t know any compiler writers called Kurt or Adalene):

Jones, Derek
Jönes, Kurt
Jônes, Adalene

Now Kurt might object, pointing out that in German the letter ö is sorted as if it had been written oe, which means that Jönes gets to be sorted before Jones (in Estonian, Hungarian and Swedish, Jones appears first).

What about Adalene? French does not contain the letter ö, so who is to say she should be sorted after Kurt? Unicode specifies a collation algorithm, but we are in the realm of public opinion here, not having a techy debate.
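For the curious, a few lines of Python show how locale-dependent the ordering is; the locale names below are system dependent (the setlocale call fails if they are not installed), and whether a given de_DE locale applies dictionary or phone-book rules to ö is not guaranteed:

import locale

names = ["Jones, Derek", "Jönes, Kurt", "Jônes, Adalene"]

# Naive code point comparison.
print(sorted(names))

# Swedish collation: ö sorts after z, so Jones appears first.
locale.setlocale(locale.LC_COLLATE, "sv_SE.UTF-8")
print(sorted(names, key=locale.strxfrm))

# German collation: how ö is handled depends on dictionary vs phone-book rules.
locale.setlocale(locale.LC_COLLATE, "de_DE.UTF-8")
print(sorted(names, key=locale.strxfrm))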

This issue could be resolved in the UK by creating a brexit locale specifying that good old English letters always sort before Jonny foreigner letters.

Would use of such a brexit locale be permitted under EU 2016/679 (assuming the UK keeps this regulation), or would it be treated as racial discrimination?

I certainly would not want to be the person having to explain to the public the logic behind collation sequences and sort locales.
