Archive

Posts Tagged ‘Fortran’

Variations in the literal representation of Pi

March 12th, 2010 Derek-Jones No comments

The numbers system I am developing attempts to match numeric literals contained in a file against a database of interesting numbers. One of the things I did to quickly build a reasonably sized database of reliable values was to extract numeric literals from a few well known programs that I thought I could trust.

R is a widely used statistical package and Maxima is a computer algebra system with a long history. Both contain a great deal of functionality and are actively maintained.

To my surprise the source code of both packages contains a large variety of different literal values for pi, or to be exact the number of digits contained in the literals varied by more than I expected. In the following table the value to the left of the pi representation is the number of occurrences; values are listed in increasing literal order:

     Maxima                              R
   2 3.14159
                                      14 3.141592
   1 3.1415926
   1 3.14159265                        2 3.14159265
   3 3.1415926535
   4 3.14159265358979
  14 3.141592653589793
   3 3.1415926535897932385             3 3.1415926535897932385
   9 3.14159265358979324
                                       1 3.14159265359
                                       1 3.1415927
                                       1 3.141593

The comments in the Maxima source led me to believe that some thought had gone into ensuring that the numerical routines were robust. Over 3/4 of the literal representations of pi have a precision comparable to at least that of 64-bit floating-point (I’m assuming an IEEE 754 representation in this post).

In the R source approximately 2/3 of the literal representations of pi have a precision comparable to that of 32-bit floating-point.

Closer examination of the source suggests one reason for this difference. Both packages make heavy use of existing code (translated from Fortran to Lisp for Maxima and from Fortran to C for R); using existing code makes good sense and because of its use in scientific and engineering applications many numerical libraries have been written in Fortran. Maxima has adapted the slatec library, whereas the R developers have used a variety of different libraries (e.g., specfun).

How important is variation in the representation of Pi?

  • A calculation based on a literal that is only accurate to 32-bit precision is likely to be limited to that level of accuracy (unless errors cancel out somewhere).
  • Inconsistencies in the value used to represent Pi are a source of error. These inconsistencies may be implicit; for instance, literals used to denote a value derived from pi, such as pi^0.5, often seem to be based on more precise values of Pi than appear in the code.
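To get a feel for the first point, the relative error of a few of the literals from the table can be measured against the double precision value of pi. This is only a rough sketch: the function name and the choice of literals are mine, and Python's math.pi is used as the reference value.

```python
import math

# A few of the literals that appear in the table above.
LITERALS = [
    "3.141592",               # roughly 32-bit precision
    "3.14159265358979",       # one digit short of 64-bit precision
    "3.141592653589793",      # 64-bit precision
    "3.1415926535897932385",  # more digits than a 64-bit value can hold
]

FLOAT32_EPSILON = 2.0 ** -23  # spacing of 32-bit values just above 1.0
FLOAT64_EPSILON = 2.0 ** -52  # spacing of 64-bit values just above 1.0


def relative_error(literal):
    """Relative difference between a literal and the 64-bit value of pi."""
    return abs(float(literal) - math.pi) / math.pi


for literal in LITERALS:
    print(f"{literal:<24} {relative_error(literal):.3e}")
```

The seven digit literal is out by a couple of parts in ten million, which is about the resolution of a 32-bit value, while the last two literals convert to exactly the same 64-bit value (the extra digits are discarded on conversion).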

The obvious solution to this representation issue, creating a file containing definitions of all of the frequently used literal values, has possible drawbacks. For instance, numerical accuracy is a strange beast and increasing the precision of one literal without doing the same for other literals appearing in a calculation can sometimes reduce the accuracy of the final result.

Pulling together existing libraries to build a package is often very cost effective, but numerical accuracy is a slippery beast and this inconsistent usage of literals suggests that developers from these two communities have not addressed the system level consequences of software reuse.

Parsing Fortran 95

December 20th, 2009 Derek-Jones No comments

I have been looking at doing some dimensional analysis of the Climategate code and so needed a Fortran parser.

The last time I used Fortran in anger the modern compilers were claiming conformance to the 1977 standard and since then we have had Fortran 90 (with a minor revision in 95) and Fortran 03. I decided to take the opportunity to learn something about the new features by writing a Fortran parser that did not require a symbol table.

The Eli project had a Fortran 90 grammar that was close to having a form acceptable to bison and a few hours editing and debugging got me a grammar containing 6 shift/reduce conflicts and 1 reduce/reduce conflict. These conflicts looked like they could all be handled using glr parsing. The grammar contained 922 productions, somewhat large but I was only interested in actively making use of parts of it.

For my lexer I planned to cut and paste an existing C/C++/Java lexer I have used for many projects. Now this sounds like a fundamental mistake: these languages treat whitespace as being significant while Fortran does not. This important difference is illustrated by the well known situation where a Fortran lexer needs to look ahead in the character stream to decide whether the next token is the keyword do or the identifier do5i (if the 1 is followed by a comma it must be the keyword):

      do 5 i = 1 , 10
      do 5 i = 1 . 10        ! assign 1.10 to do5i
5     continue
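The lookahead can be sketched in a few lines. This is a minimal illustration rather than a real lexer: the function name is hypothetical, it only handles the comma test described above, and it assumes there are no character literals containing ! or commas on the line.

```python
def is_do_keyword(statement):
    """Guess whether a fixed-form statement starting with 'do' is a DO loop.

    Only the comma test is implemented: a comma outside parentheses,
    to the right of the first '=', means the statement is a loop header.
    """
    text = statement.split("!", 1)[0]      # drop a trailing comment
    text = "".join(text.split()).lower()   # blanks are not significant
    if not text.startswith("do") or "=" not in text:
        return False
    depth = 0
    for ch in text[text.index("=") + 1:]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True                     # loop bounds: keyword do
    return False                            # plain assignment to do5i etc.


print(is_do_keyword("do 5 i = 1 , 10"))    # loop header
print(is_do_keyword("do 5 i = 1 . 10"))    # assignment
```

The parenthesis counting is there so that an assignment such as do5i = f(1, 2) is not mistaken for a loop header.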

In my experience developers don’t break up literals or identifier names with whitespace and so I planned to mostly ignore the whitespace issue (it would simplify things if some adjacent keywords were merged to create a single keyword).

In Fortran the I/O is specified in the language syntax while in C-like languages it is a runtime library call involving a string whose contents are interpreted at runtime. I decided to ignore I/O statements by skipping to the end of line (Fortran is line oriented).

Then the number of keywords hit me, around 190. Even with the simplifications I had made, writing a Fortran lexer looked like it would be a lot of work; some of the keywords only had this status when followed by an = and I kept uncovering new issues. Cutting and pasting somebody else’s lexer would probably also involve a lot of work.

I went back and looked at some of the Fortran front ends I had found on the Internet. The GNU Fortran front-end was a huge beast and would need serious cutting back to be of use. moware was written in Fortran and used the traditional six character abbreviated names seen in ‘old-style’ Fortran source and not a lot of commenting. The Eli project seemed a lot more interested in the formalism side of things and Fortran was just one of the languages they claimed to support.

The Open Fortran Parser looked very interesting. It was designed to be used as a parsing skeleton that could be used to produce tools that processed source and already contained hooks that produced diagnostic output when each language production was reduced during a parse. Tests showed that it did a good job of parsing the source I had, although there was one vendor extension used quite often (and not documented in their manual). The tool source, in Java, looked straightforward to follow and it was obvious where my code needed to be added. This tool was exactly what I needed :-)

Does the Climategate code produce reliable output?

November 30th, 2009 Derek-Jones 2 comments

The source of several rather important commercial programs has been made public recently, or to be more exact programs whose output is important (i.e., the Sequoia voting system and code and data from the Climate Research Unit at the University of East Anglia, the so-called ‘Climategate’ leak). While many technical commentators have expressed amazement at how amateurish the programming appears to be, apparently written with little knowledge of good software engineering practices or knowledge of the programming language being used, those who work on commercial projects know that low levels of software engineering/programming competence are the norm.

The emails included in the Climategate leak provide another vivid example, if one were needed, of why scientific data should be made publicly available; scientists are human and are sometimes willing to hide data that does not fit their pet theory or even fails to validate their theory at all.

The Climategate source has only recently become available and existing technical commentary has been derived from embarrassing comments and the usual complaint about not using the right programming language (Fortran is actually a good choice of language for this problem, it is widely used by climatology researchers and a non-professional programmer probably makes best use of their time by using the one language they know tolerably well rather than attempting to use a new language that nobody else in the research group knows).

An important quality indicator of the leaked software was what was not there, test cases (at least I could not find any). How do we know that a program’s output is correct? One way to gain some confidence in a program’s correctness is to process data for which the correct output is known. This blindness to the importance of program level correctness testing is something that I often encounter in people who are subject area experts rather than professional programmers; they believe that if the output has the form they are expecting it must be correct and will sometimes add ‘faults’ to ‘fix’ output that deviates from what they are expecting.
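As a small illustration of the difference between output that has the expected form and output that is correct, here is a sketch using a made-up running mean function (nothing to do with the leaked code). The second version contains an off-by-one fault of my own invention; its output has the right length and looks plausible, and only a test against values worked out by hand shows it to be wrong.

```python
def running_mean(values, window):
    """Mean of each consecutive group of `window` values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def faulty_running_mean(values, window):
    """Same shape of output, but the slice is one element short."""
    return [
        sum(values[i:i + window - 1]) / window
        for i in range(len(values) - window + 1)
    ]


data = [1, 2, 3, 4, 5]
expected = [2.0, 3.0, 4.0]   # worked out by hand

print(running_mean(data, 3) == expected)
print(faulty_running_mean(data, 3) == expected)
```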

A quick visual scan through the source showed a tale of two worlds, one of single letter identifier names and liberal use of goto, and the other of what looks like meaningful names, structured code and a non-trivial number of comments. The individuals who have contributed to the code base obviously have very different levels of coding ability. Not having written any Fortran in anger for over 15 years my ability to estimate the impact of more subtle coding practices has atrophied.

What kind of faults might a code review look for in these programs? Common coding errors such as using uninitialized variables and incorrect argument passing are obvious choices and there are tools available to check for these kinds of error. A much more insidious kind of error, which requires people with the mathematical expertise to spot, is created by the approximate nature of floating-point arithmetic.
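Two standard illustrations of this approximate nature, assuming IEEE 754 double precision (which is what Python uses on common platforms): the order in which values are added changes the result, and a naive running total drifts away from the correctly rounded sum. The variable names are mine.

```python
import math

# Addition is not associative: the grouping changes the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# A naive running total accumulates rounding error;
# math.fsum tracks the lost low-order bits and rounds once at the end.
naive_total = sum([0.1] * 10)
careful_total = math.fsum([0.1] * 10)

print(left_to_right == right_to_left)
print(naive_total, careful_total)
```

Neither of these is a bug in the usual sense, which is why this kind of problem is not found by tools that look for common coding errors.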

The source is not huge, but not small either, consisting of around 64,000 lines of Fortran and 16,000 lines of IDL (a language designed for interactive data analysis which to my untrained eye looks a lot like MATLAB). There was no obvious support for building the source included within the leaked files (e.g., no makefiles) and my attempt to manually compile using the GNU Fortran compiler failed miserably. So I cannot say anything reliable about the compiler output warnings.

To me the complete lack of test cases implies that the Climategate code does not produce reliable output. Comments in the code such as ***** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE********* suggest that the authors were willing to patch the code to produce output that matched their expectations; this is the mentality of somebody for whom code correctness is not an important issue and if they don’t believe their code is correct then I don’t either.

Source code in itself is rarely that important, although it might have been expensive to create. The really important information in the leaked files is the climate data. Now that this is available others can apply their analysis skills to provide an interpretation of what, if anything statistically reliable, it is telling us.

Dimensional analysis of source code

May 28th, 2009 Derek-Jones 1 comment

The idea of restricting the operations that can be performed on a variable based on attributes appearing in its declaration is actually hundreds of years old and is more widely known as dimensional analysis. Readers are probably familiar with the concept of type checking where, for instance, a value having a floating-point type is not allowed to be added to a value having a pointer type. Unfortunately many of those computer languages that support the functionality I am talking about (e.g., Ada) also refer to it as type checking and differentiate it from the more common usage by calling it strong typing. The concept would be much easier for people to understand if a different term were used, e.g., unit checking or even dimension checking.

Dimensional analysis, as used in engineering and the physical sciences, relies on the fact that quantities are often expressed in terms of a small number of basic attributes, e.g., mass, length and time; velocity is calculated by dividing a length by a time, LT^{-1} and area is calculated by multiplying two lengths, L^{2}. Adding a length quantity to a velocity has no physical meaning and suggests that something is wrong with the calculation, while dividing velocity by time, LT^{-2}, can be interpreted as acceleration. Dividing two quantities that have the same units results in what is known as a dimensionless number.
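The arithmetic of dimensions is simple enough to sketch in code: a dimension is a set of exponents for mass, length and time; multiplying quantities adds the exponents, dividing subtracts them, and addition is only permitted when the exponents match. The class and constant names below are my own, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dim:
    """Exponents of the basic attributes: mass, length, time."""
    mass: int = 0
    length: int = 0
    time: int = 0

    def __mul__(self, other):
        return Dim(self.mass + other.mass,
                   self.length + other.length,
                   self.time + other.time)

    def __truediv__(self, other):
        return Dim(self.mass - other.mass,
                   self.length - other.length,
                   self.time - other.time)

    def __add__(self, other):
        # Adding quantities with different dimensions has no physical meaning.
        if self != other:
            raise TypeError(f"cannot add {self} and {other}")
        return self


LENGTH = Dim(length=1)
TIME = Dim(time=1)
DIMENSIONLESS = Dim()

velocity = LENGTH / TIME         # L T^-1
area = LENGTH * LENGTH           # L^2
acceleration = velocity / TIME   # L T^-2
ratio = LENGTH / LENGTH          # dimensionless
```

Evaluating LENGTH + velocity raises a TypeError, which is the kind of report a dimension checker would be expected to produce.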

Dimensional analysis can be used to check a calculation involving physical quantities for internal consistency and as a method for trying to deduce the combinations of quantities that an unknown equation might contain based on the physical units the result is known to be represented in.

The frink language has units of measure checking built into it.

How might dimensional analysis be used to check source code for internal consistency? Consider the following code:

x = a / b;
c = a;
y = c / b;
if (x + y ...
...
z = x + b;

c is assigned a’s value and is therefore assumed to have the same units of measurement. The value assigned to y is calculated by dividing c by b and the train of reasoning leading to the assumption that it has the same units of measurement as x is easy to follow. Based on this analysis there is nothing suspicious about adding x and y, but adding x and b looks wrong (it would be perfectly ok if all of the variables in this code were dimensionless).
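The train of reasoning can be mechanised without knowing what the units actually are. In this sketch a and b are given unrelated symbolic units (the names A and B are placeholders of my own), units are propagated through the assignments by hand, and an addition is accepted only when both operands end up with the same unit. It is not how any of the tools mentioned below work, just the smallest thing that reproduces the reasoning.

```python
from collections import Counter


def unit_div(u, v):
    """Unit of u / v; a unit maps a symbolic base unit to its exponent."""
    result = Counter(u)
    result.subtract(v)
    return {name: exp for name, exp in result.items() if exp != 0}


# Assumption: a and b carry unknown, unrelated units.
units = {"a": {"A": 1}, "b": {"B": 1}}

units["x"] = unit_div(units["a"], units["b"])   # x = a / b;
units["c"] = units["a"]                         # c = a;
units["y"] = unit_div(units["c"], units["b"])   # y = c / b;


def can_add(left, right):
    """Addition is consistent only if both operands have the same unit."""
    return units[left] == units[right]


print(can_add("x", "y"))   # x + y
print(can_add("x", "b"))   # x + b
```

If a and b both start out with the empty unit (i.e., dimensionless), every variable ends up with the empty unit and both additions are accepted.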

A number of tools have been written to check source code expressions for internal consistency e.g., Fortran (Automated computation and consistency checking of physical dimensions and units in scientific programs), C++ (Applied Template Metaprogramming in SI units) and C (Annotation-less Unit Type Inference for C), but so far only one PhD.

Providing a mechanism for developers to add unit information to variable declarations would enable compilers to perform consistency checks and reduce the likelihood of false positives being reported (because dimensionless values can generally be combined together in any way). It is too late in the day for such a major feature to be added to the next revision of the C++ standard; the C standard is also being revised but the committee is currently being very conservative and insists that any proposed new constructs already be implemented in at least one compiler.

Why is code so fault tolerant?

December 22nd, 2008 Derek-Jones No comments

All professional developers eventually encounter a program containing a fault that appears to be so devastating that the program could not possibly perform its intended task, yet the program has functioned, and continues to function, more or less as expected. In my case the program was a cpu instruction set emulator (for a Z80 written in Fortran) that I had written and the fault was a copy-and-paste editing mistake that resulted in one of the subtract instructions behaving like the equivalent addition instruction. The emulator was used to execute CP/M and various applications (on a minicomputer that did not have any desktop office applications). I was astounded that CP/M booted and appeared to work correctly, along with various applications (apart from the one exhibiting behavior differences that resulted in me tracking down this fault).

My own continuing experience with apparently fatal faults, in my own and other people’s code, has led me to the conclusion that researchers should be putting most of their effort into trying to figure out why so much software does such a good job of behaving in an acceptable manner while containing so many faults (of various apparent seriousness). Proving software correctness is an expensive and time consuming dead-end for all but a few specialist applications.

One way for developers to vividly see how robust most software is to random faults is to use a mutation tool on the source. Such tools introduce faults into code with the aim of checking the thoroughness of a set of test cases. It is a sobering experience to see how many mutations fail to have any noticeable effect on a program’s external behavior.

One group of researchers took this mutation idea to an extreme by changing all less-than operators in for-loops into less-than-or-equals operators. They found that only a handful of the changes prevented the recompiled programs being at all useful to users. While some of the changes produced output that was obviously incorrect, it was still possible to use much of the original functionality.
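A toy version of a mutation run is easy to put together. Everything in this sketch is made up for illustration (the function being mutated, the three mutations and the test cases); real mutation tools work on the syntax tree rather than doing textual replacement.

```python
SOURCE = '''
def largest(values):
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best
'''

# (original text, replacement) pairs: a tiny, hypothetical mutation set.
MUTATIONS = [(">", ">="), (">", "<"), ("[1:]", "[2:]")]

TEST_CASES = [([3, 1, 2], 3), ([1, 2, 3], 3), ([5], 5), ([2, 2], 2)]


def load(source):
    """Compile the source text and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["largest"]


def passes_tests(func):
    try:
        return all(func(list(args)) == want for args, want in TEST_CASES)
    except Exception:
        return False


def run_mutants():
    """Map each mutation to True if the mutant still passes every test."""
    return {
        (old, new): passes_tests(load(SOURCE.replace(old, new, 1)))
        for old, new in MUTATIONS
    }


for mutation, survived in run_mutants().items():
    print(mutation, "survived" if survived else "killed")
```

Two of the three mutants survive. Changing > to >= makes no difference to the result at all, while skipping the second element is a genuine fault that these particular test cases happen not to exercise (adding the case [1, 3, 2] would catch it).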

What is it about the shape of most code that allows it to continue to function in the presence of faults? It is time faults were acknowledged as a fact of life in all actively developed systems and that we should concentrate on developing techniques to help ensure that software containing them continues to behave as intended, rather than the unsophisticated zero-tolerance approach that has held sway for so long.
