Archive

Posts Tagged ‘probability’

Christmas books for 2009

December 7th, 2009 Derek-Jones No comments

I thought it would be useful to list the books that gripped me one way or another this year (and maybe last year, since I don’t usually track such things closely); perhaps they will give you some ideas to add to your Christmas present wish list (please make your own suggestions in the Comments). Most of the books were published a few years ago; I maintain piles of books ordered by when I plan to read them, and books migrate between piles until eventually read. Looking at the list, I don’t seem to have read many good books this year; perhaps I am spending too much time reading blogs.

These books contain plenty of facts backed up by numbers and an analytic approach and are ordered by physical size.

The New Science of Strong Materials by J. E. Gordon. Ideal for train journeys since it is a small book that can be read in small chunks and is not too taxing. Offers lots of insight into those properties of various materials that are needed to build things (‘new’ here means postwar).

Europe at War 1939-1945 by Norman Davies. A fascinating analysis of the war from a numbers perspective. It is hard to escape the conclusion that in the grand scheme of things us plucky Brits made a rather small contribution, although subsequent Hollywood output has suggested otherwise. Also a contender for a train book.

Japanese English: Language and Culture Contact by James Stanlaw. If you are into Japanese culture you will love this, otherwise avoid.

Evolutionary Dynamics by Martin A. Nowak. For the more mathematical folk, with plenty of thought power needed. Some very powerful general results from simple processes.

Analytic Combinatorics by Philippe Flajolet and Robert Sedgewick. Probably the toughest mathematical book I have kept at (yet to get close to the end) in a few years. If number sequences fascinate you then give it a try (a pdf is available).

Probability and Computing by Michael Mitzenmacher and Eli Upfal. For the more mathematical folk, with plenty of thought power needed. Don’t let the density of Theorems put you off; the approach is broad brush. Plenty of interesting results with applications to solving problems using algorithms containing a randomizing component.

Network Algorithmics by George Varghese. A real hacker’s book. Not so much a book about algorithms used to solve networking problems but a book about making engineering trade-offs and using every ounce of computing functionality to solve problems having severe resource and real-time constraints.

Virtual Machines by James E. Smith and Ravi Nair. Everything you ever wanted to know about virtual machines and more.

Biological Psychology by James W. Kalat. This might be a coffee table book for scientists. Great illustrations, concise explanations, the nuts and bolts of how our bodies run at the protein/DNA level.

Estimating variance when measuring source

October 8th, 2009 Derek-Jones No comments

Yesterday I finally delivered a paper on if/switch usage measurements to the ACCU magazine editor, and today I read about a switch statement usage that, if common, would invalidate a chunk of my results. Does anything jump out at you in the following snippet?

switch (x)
   {
   case 1:
             {
             z++;
             ...
             break;
             }
...

Yes, those { } delimiting the case-labeled statement sequence. A quick check of my C source benchmarks showed this usage occurring in around 1% of case-labels. Panic over.

What is the statistical significance, i.e., variance, of that 1%? Have I simply measured an unrepresentative sample? What would be a representative sample, and what would be the expected variance within a representative sample?

I am interested in commercial software development and so I have selected half a dozen or so largish code bases as my source benchmark, preferably written in a commercial environment even if currently available as open source. I would prefer this benchmark to be an order of magnitude larger and perhaps I will get around to adding more programs soon.

My if/switch measurements were aimed at finding usage characteristics that varied between the two kinds of selection statements. One characteristic measured was the number of equality tests in the associated controlling expression. For instance, in:

if (x == 1 || x == 2)
   z--;
else if (x == 3)
   z++;

the first controlling expression contains two equality tests and the second one equality test.
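As a rough illustration of this measurement, the sketch below counts `==` operators in the text of a controlling expression. It is not the tool used for the measurements; the function name is hypothetical, and it assumes that only `==` (not `!=`) counts as an equality test and that the expression contains no string or character literals holding `==`.

```python
import re

# Assumption: an "equality test" is an occurrence of the == operator.
# A real measurement tool would work on parsed source, not raw text.
EQUALITY_OPERATOR = re.compile(r"==")

def count_equality_tests(controlling_expression):
    """Count == operators in the text of one controlling expression."""
    return len(EQUALITY_OPERATOR.findall(controlling_expression))

# The two controlling expressions from the example above.
for expression in ("x == 1 || x == 2", "x == 3"):
    print(expression, "->", count_equality_tests(expression))
```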

Plotting the percentage of equality tests that occur in the controlling expressions of if-if/if-else-if sequences and switch statements we get the following:

[Plot: Number of equality tests in controlling expression]

Do these results indicate that if-if/if-else-if sequences and switch statements differ in the number of equality tests contained in their controlling expressions? If I measured a completely different set of source code, would the results be very different?

To answer this question a probability model is needed. Take as an example the controlling expressions present in an if-if sequence. If each controlling expression is independent of the others, then the probability of two equality tests, for instance, occurring in any one of these expressions is constant, and thus, given a large sample, the number of expressions in the source containing two equality tests has a binomial distribution. The same argument can be applied to other numbers of equality tests and other kinds of sequence.

[Plot: Number of equality tests in controlling expression, with error bars]

For each measurement point in the above plot the associated error bars span the square-root of the variance of that point (assuming a binomial distribution; for a normal distribution the length of this span is known as the standard deviation). The error bars overlap, suggesting that the apparent difference in percentage of equality tests in each kind of sequence is not statistically significant.

The existence of some dependency between controlling expression equality tests would invalidate this simple analysis, or at least reduce its reliability. I did notice that in a sequence containing a controlling expression with two equality tests, that expression tended to appear later in the sequence (the reverse of the example given above). Did I notice this because I tend to write this way? A question for another day.
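The error-bar calculation can be sketched as follows. The counts are made up for illustration and are not the measurements behind the plot; the function names are hypothetical, and the sketch assumes each bar extends one square-root-of-variance either side of the measured percentage.

```python
import math

def percent_with_error(count, total):
    """Return (percentage, error bar half-length in percentage points).

    Under a binomial model with probability p = count/total, the variance
    of the measured proportion is p*(1-p)/total.
    """
    p = count / total
    sd = math.sqrt(p * (1 - p) / total)
    return 100 * p, 100 * sd

def bars_overlap(a, b):
    """True if the intervals (percentage +/- error) of a and b overlap."""
    (pa, ea), (pb, eb) = a, b
    return abs(pa - pb) <= ea + eb

# Hypothetical counts of expressions containing two equality tests.
if_sequence = percent_with_error(30, 400)   # 30 out of 400 expressions
switch_stmt = percent_with_error(18, 300)   # 18 out of 300 expressions

print("if-if:  %.2f%% +/- %.2f" % if_sequence)
print("switch: %.2f%% +/- %.2f" % switch_stmt)
print("error bars overlap:", bars_overlap(if_sequence, switch_stmt))
```

Overlapping bars are only a quick visual check; a formal test of the difference between two proportions would be needed to put a number on the significance.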

What I changed my mind about in 2008

January 4th, 2009 Derek-Jones No comments

A few years ago The Edge asked people to write about what important issue(s) they had recently changed their mind about. This is an interesting question and something people ought to ask themselves every now and again. So what did I change my mind about in 2008?

1. Formal verification of nontrivial C programs is a very long way off. A whole host of interesting projects (e.g., Caduceus, CompCert and Frama-C) going on in France has finally convinced me that things are a lot closer than I once thought. This does not mean that I think developers/managers will be willing to use them, only that they exist.

2. Automatically extracting useful information from source code identifier names is still a long way off. Yes, I am a great believer in the significance of information contained in identifier names. Perhaps because I have studied the issues in detail I know too much about the problems and have been put off attacking them. A number of researchers (e.g., Emily Hill, David Shepherd, Adrian Marcus, Lin Tan and a previously blogged about project) have simply gone ahead and managed to extract (with varying amounts of human intervention) surprising amounts of useful information from identifier names.

3. Theoretical analysis of non-trivial floating-point oriented programs is still a long way off. Daumas and Lester used the Doob-Kolmogorov inequality (I had to look it up) to deduce the probability that the rounding error in some number of floating-point operations, within a program, will exceed some bound. They also integrated the ideas into NASA’s PVS system.

You can probably spot the pattern here: I thought something would not happen for years and somebody went off and did it (or at least made an impressive first step along the road). Perhaps 2008 was not a good year for really major changes of mind, or perhaps an earlier-in-the-year change of mind has so ingrained itself in my mind that I can no longer recall thinking otherwise.

Distribution of numeric values (additive)

December 16th, 2008 Derek-Jones No comments

Developers and testers rarely put any thought into working out the likely distribution of numeric values (final or intermediate) computed during the execution of the code they write.

The likely value of a variable is useful to know in a number of situations, including optimizing code (should it prove to be necessary) for the common case and testing (what distribution of input values is needed to be confident that all paths through a program are exercised?).

The answer for the ‘simple’ distributions is actually more complicated to work with than that for the more ‘complicated’ distributions. For instance, the sum of two independent values having normal distributions has a normal distribution, and the sum of two independent values having Poisson distributions also has a Poisson distribution.

What if the values are uniformly distributed? If two independent, randomly chosen, uniformly distributed variables are added, what is the distribution of the result? For instance, if the values of X and Y are independent of each other and take on any integer value between 0 and 9, with equal likelihood, what is the most (and least) likely value of X+Y?
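Written out, these two closure properties are the standard results below, where X and Y are assumed independent and the parameter names (means, variances and rates) are introduced here for illustration:

```latex
\[
X \sim \mathcal{N}(\mu_1, \sigma_1^2),\quad Y \sim \mathcal{N}(\mu_2, \sigma_2^2)
\;\Longrightarrow\;
X + Y \sim \mathcal{N}(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2)
\]
\[
X \sim \mathrm{Poisson}(\lambda_1),\quad Y \sim \mathrm{Poisson}(\lambda_2)
\;\Longrightarrow\;
X + Y \sim \mathrm{Poisson}(\lambda_1 + \lambda_2)
\]
```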

Warning: Information spoilers follow.

You are probably thinking that the result will also be uniformly distributed and indeed it would be if the range of values taken by X and Y did not overlap. When the possible ranges of values overlap exactly the answer is the triangular distribution, with the most likely result being 9 and the least likely results being 0 and 18.

The variance of the result distribution (16.5) is roughly half that of a uniform distribution over the same range of 0 to 18 (30), meaning that the common cases occupy a narrower value range. This value range ‘narrowing’ goes some way towards helping to explain the surprising discovery that during program execution a small set of (integer and floating) values often occurs with such regularity that it might be worth CPU arithmetic units remembering previous operands and their results (i.e., to save time by returning the result rather than recalculating it).
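The distribution of X+Y for the 0 to 9 example can be checked by brute-force enumeration. The sketch below is one way of doing so, assuming X and Y take integer values; the function names are hypothetical. It also prints the variance of the sum next to that of a uniform distribution over 0 to 18, so the narrowing can be checked numerically.

```python
from collections import Counter
from itertools import product

def sum_distribution(xs, ys):
    """Probability of each value of x+y, with x and y independent and
    each uniformly distributed over the given values."""
    counts = Counter(x + y for x, y in product(xs, ys))
    total = len(xs) * len(ys)
    return {value: n / total for value, n in sorted(counts.items())}

def variance(dist):
    """Variance of a distribution given as {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum(p * (v - mean) ** 2 for v, p in dist.items())

digits = range(10)                        # 0..9, as in the example
dist = sum_distribution(digits, digits)   # distribution of X+Y
uniform_0_18 = {v: 1 / 19 for v in range(19)}

# Text histogram: one '#' per percentage point of probability.
for value, p in dist.items():
    print("%2d %s" % (value, "#" * round(100 * p)))

print("variance of X+Y:           ", variance(dist))
print("variance of uniform 0..18: ", variance(uniform_0_18))
```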
