My observation that academics treat Entropy as the go-to topic, when they have no idea what else to talk about, has ruffled a few feathers. David Clark, one of the organizers of a workshop on Information Theory and Software Testing, has invited me to give a talk on Entropy (the title is currently *Entropy for the uncertain*, but this state might change :-).

Complaining about the many ways entropy is currently misused in software engineering would be like shooting fish in a barrel, and equally pointless. I want to encourage people to use entropy in a meaningful way, and to stop using Shannon entropy just because it is the premium brand of entropy.

Shannon’s derivation of the iconic formula depends on various assumptions being true. While these conditions look like they might hold for some software engineering problems, they clearly don’t hold for others. It may be possible to use other forms of entropy for some of these other problems; Shannon became the premium brand of entropy because it was first to market, the other entropy products have not had anyone championing their use, and academics follow each other like sheep (it’s much easier to get a paper published by using the well-known brands).

Shannon’s entropy has been generalized, with the two best-known generalizations being (in the limit $q \to 1$, both converge to Shannon entropy):

Rényi entropy in 1961: $H_q = \frac{1}{1-q}\log\left(\sum_i p_i^q\right)$

Tsallis entropy in 1988: $S_q = \frac{1}{q-1}\left(1-\sum_i p_i^q\right)$
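For readers who want to experiment, here is a minimal R sketch of the textbook definitions of the three entropies (natural logarithms throughout). The function names and the example distribution are my own choices for illustration, not taken from any particular paper. The loop shows both generalizations approaching the Shannon value as `q` gets close to 1.

```r
# Shannon entropy: -sum(p * log(p)); zero probabilities contribute nothing.
shannon_entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Renyi entropy of order q (q != 1): log(sum(p^q)) / (1 - q)
renyi_entropy <- function(p, q) {
  p <- p[p > 0]
  log(sum(p^q)) / (1 - q)
}

# Tsallis entropy with parameter q (q != 1): (1 - sum(p^q)) / (q - 1)
tsallis_entropy <- function(p, q) {
  p <- p[p > 0]
  (1 - sum(p^q)) / (q - 1)
}

# Hypothetical distribution, chosen only for illustration.
p <- c(0.5, 0.25, 0.125, 0.125)

for (q in c(0.5, 0.9, 0.999, 1.001, 2)) {
  cat(sprintf("q=%5.3f  Renyi=%.4f  Tsallis=%.4f\n",
              q, renyi_entropy(p, q), tsallis_entropy(p, q)))
}
cat(sprintf("Shannon = %.4f\n", shannon_entropy(p)))
```

Neither generalization is defined at exactly `q = 1` (division by zero), which is why the sketch only evaluates values either side of it.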

All of these formulas reduce a list of probabilities to a single value. A weighting is applied to each probability, and this weighted value is summed to produce a single value that is further manipulated. The probability weighting functions are plotted below:
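A rough way to see what the weighting does is to look at how much each event contributes to the sum. The sketch below assumes the usual weightings (`-p*log(p)` for Shannon, `p^q` for Rényi and Tsallis) and uses a made-up three-event distribution; `share` is a helper I have introduced for illustration.

```r
# Hypothetical distribution: one common, one uncommon, and one rare event.
p <- c(common = 0.9, uncommon = 0.09, rare = 0.01)

# Fraction of the weighted sum contributed by each event.
share <- function(w) w / sum(w)

contributions <- rbind(
  probability = p,
  shannon     = share(-p * log(p)),
  q_0.5       = share(p^0.5),
  q_2         = share(p^2)
)
print(round(contributions, 4))
```

With `q < 1` the rare event contributes more to the sum than its probability alone would suggest, while with `q > 1` the sum is dominated by the common event.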

Under what conditions might one of these two forms of entropy be used (there are other forms)? I have been rummaging around looking for example uses, and could not find many.

There are some interesting papers about possible interpretations of the parameter $q$ in Tsallis entropy: the most interesting paper I have found shows a connection with the correlation between states, e.g., preferential attachment in networks. This implies that Tsallis entropy is the natural first candidate to consider for systems exhibiting power law characteristics. Another paper suggests that $q$ derives from variation in the parameter of an exponential equation.

Some computer applications: a discussion of Tsallis entropy and the concept of non-extensive entropy, along with an analysis of statistical properties of hard disc workloads, and the same idea applied to computer memory.
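What "non-extensive" means can be checked numerically. For two independent systems, Shannon entropy is additive, while Tsallis entropy satisfies the standard pseudo-additivity relation S(A,B) = S(A) + S(B) + (1-q)·S(A)·S(B). The following is a small sketch, with two made-up distributions and `q = 2` picked arbitrarily:

```r
shannon_entropy <- function(p) -sum(p * log(p))
tsallis_entropy <- function(p, q) (1 - sum(p^q)) / (q - 1)

# Two hypothetical independent systems.
pa <- c(0.7, 0.3)
pb <- c(0.5, 0.3, 0.2)

# Joint distribution of independent systems: product of the marginals.
pab <- as.vector(outer(pa, pb))

q <- 2
s_a  <- tsallis_entropy(pa, q)
s_b  <- tsallis_entropy(pb, q)
s_ab <- tsallis_entropy(pab, q)

# Shannon entropy of the joint system is the sum of the parts.
print(all.equal(shannon_entropy(pab),
                shannon_entropy(pa) + shannon_entropy(pb)))

# Tsallis entropy is not additive; the cross term (1-q)*S(A)*S(B) is needed.
print(c(joint = s_ab, plain_sum = s_a + s_b,
        pseudo_additive = s_a + s_b + (1 - q) * s_a * s_b))
```

The size of the cross term depends on `q`, which fits the interpretation of `q` as saying something about interaction between states.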

Some PhD theses: Rényi entropy (with a fixed value of its parameter) for error propagation in software architectures, comparing various measures of entropy as a metric for the similarity of program execution traces, plus using Rényi entropy in cryptography.

As you can see, I don’t have much to talk about. I’m hoping my knowledgeable readers can point me at some uses of entropy in software engineering where the author has put some thought into which entropy to use (which may have resulted in Shannon entropy being chosen; I’m only against this choice when it is made for brand name reasons).

Registration for the workshop is open, so turn up and cheer me on.

Roll your own weighting plot:

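    # Probabilities in (0, 1]; start just above 0 because 0*log(0) evaluates to NaN
    p_vals=seq(0.001, 1, by=0.001)
    # Shannon weighting: -p*log(p)
    plot(p_vals, -p_vals*log(p_vals), type="l", col="red",
         ylim=c(0, 1),
         xaxs="i", yaxs="i",
         xlab="Probability", ylab="Weight")
    # Rényi/Tsallis weighting: p^q
    q=0.5
    lines(p_vals, p_vals^q, type="l", col="blue")
    q=2
    lines(p_vals, p_vals^q, type="l", col="green")
    legend("topleft", legend=c("-p*log(p)", "p^0.5", "p^2"),
           col=c("red", "blue", "green"), lty=1, bty="n")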
