Archive

Posts Tagged ‘cost’

Time-to-fix when mistake discovered in a later project phase

August 23, 2020 No comments

Traditionally the management of software development projects divides them into phases, e.g., requirements, design, coding and testing. A mistake introduced in one phase may not be detected until a later phase. There is long-standing folklore that earlier mistakes detected in later phases are much much more costly to fix persists, despite the original source of this folklore being resoundingly debunked. Fixing a mistake later is likely to a bit more costly, but how much more costly? A lack of data prevents reliable analysis; this question also suffers from different projects having different cost-to-fix profiles.

This post addresses the time-to-fix question (cost involves all the resources needed to perform the fix). Does it take longer to correct mistakes when they are detected in phases that come after the one in which they were made?

The data comes from the paper: Composing Effective Software Security Assurance Workflows. The 35,367 (yes, thirty-five thousand) logged fixes, from 39 projects drawn from three organizations, contains information on: phases in which the mistake was made and fixed, time taken, person ID, project ID, date/time, plus other stuff 🙂

Every project has its own characteristics that affect time-to-fix. Project 615, avionics software developed by organization A, has the most fixes (7,503) and is analysed here.

Avionics software is safety critical, and each major phase included its own review and inspection. The major phases include: requirements gathering, requirements analysis, high level design, design, coding, and testing. When counting the number of phases between introduction/fix, should review and inspection each count as a phase?

The primary reason for doing a review and inspection is to check the correctness (i.e., lack of mistakes) in the corresponding phase. If there is a time-to-fix penalty for mistakes found in these symbiotic-phases, I suspect it will be different from the time-to-fix penalty between major phases (which for simplicity, I’m assuming is major-phase independent).

The time-to-fix has a resolution of 1-minute, and some fix times are listed as taking a minute; 72% of fixes are recorded as taking less than 10-minutes. What kind of mistakes require less than 10-minutes to fix? Typos and other minutiae.

The plot below shows time-to-fix for mistakes having a given ‘distance’ between introduction/fix phase, for fixes taking at least 1, 5 and 10-minutes (code+data):

Time-to-fix for mistakes having a given number of phases between introduction and fix.

There is a huge variation in time-to-fix, and the regression lines (which have the form: fixTime approx e^{sqrt{phaseSep}}) explains just 6% of the variance in the data, i.e., there is a small increase with phase separation, but it is almost down in the noise.

All but one of the 38 people who worked on the project made multiple fixes (30 made more than 20 fixes), and may have got faster with practice. Adding the number of previous fixes by people making more than 20 fixes to the model gives: fixTime approx e^{sqrt{phaseSep}}/fixNum^{0.03}, and improves the model by less than 1-percent.

Fixing mistakes is a human activity, and individual performance often has a big impact on fitted models. Adding person ID to the model as a multiplication factor: i.e., fixTime approx personID*{e^{sqrt{phaseSep}}/fixNum^{0.03}}, improves the variance explained to 14% (better than a poke in the eye, just). The fitted value of personID varies between 0.66 and 1.4 (factor of two, human variation).

The answer to the time-to-fix question posed earlier (for project 615), is that it does take slightly longer to fix a mistake detected in phases occurring after the one in which the mistake was introduced. The phase difference is tiny, with differences in human performance having a bigger impact.

Average maintenance/development cost ratio is less than one

April 20, 2017 3 comments

Part of the (incorrect) folklore of software engineering is that more money is spent on maintaining an application than was spent on the original development.

Bossavit’s The Leprechauns of Software Engineering does an excellent job of showing that the probably source of this folklore did not base their analysis on any cost data (I’m not going to add to an already unwarranted number of citations by listing the source).

I have some data, actually two data sets, each measuring a different part of the problem, i.e., 1) system lifetime and 2) maintenance/development costs. Both sets of measurements apply to IBM mainframe software, so a degree of relatedness can be claimed.

Analyzing this data suggests that the average maintenance/development cost ratio, for a IBM applications, is around 0.81 (code+data). The data also provides a possible explanation for the existing folklore in terms of survivorship bias, i.e., most applications do not survive very long (and have a maintenance/development cost ratio much less than one), while a few survive a long time (and have a maintenance/development cost ratio much greater than one).

At any moment around 79% of applications currently being maintained will have a maintenance/development cost ratio greater than one, 68% a ratio greater than two and 51% a ratio greater than five.

Another possible cause of incorrect analysis is the fact we are dealing with ratios; the harmonic mean has to be used, not the arithmetic mean.

Existing industry practice of not investing in creating maintainable software probably has a better cost/benefit than the alternative because most software is not maintained for very long.