Archive

Posts Tagged ‘ambiguity’

Parsing Fortran 95

December 20th, 2009 Derek-Jones No comments

I have been looking at doing some dimensional analysis of the Climategate code and so needed a Fortran parser.

The last time I used Fortran in anger the modern compilers were claiming conformance to the 1977 standard and since then we have had Fortran 90 (with a minor revision in 95) and Fortran 03. I decided to take the opportunity to learn something about the new features by writing a Fortran parser that did not require a symbol table.

The Eli project had a Fortran 90 grammar that was close to a form acceptable to bison, and a few hours of editing and debugging got me a grammar containing 6 shift/reduce conflicts and 1 reduce/reduce conflict. These conflicts looked like they could all be handled using GLR parsing. The grammar contained 922 productions, which is somewhat large, but I was only interested in actively making use of parts of it.

For my lexer I planned to cut and paste an existing C/C++/Java lexer that I have used for many projects. This sounds like a fundamental mistake: these languages treat whitespace as significant while Fortran does not. This important difference is illustrated by the well-known situation where a Fortran lexer needs to look ahead in the character stream to decide whether the next token is the keyword do or the identifier do5i (if the 1 is followed by a comma, do must be a keyword):

      do 5 i = 1 , 10
      do 5 i = 1 . 10        ! assign 1.10 to do5i
5     continue
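
The lookahead described above can be sketched in a few lines of Python. This is only an illustration, not the lexer I planned to use: the function name classify_do is made up, and the sketch assumes that a ! always starts a comment (i.e., it never appears inside a string literal). It removes all blanks, as fixed-form Fortran does, and then looks for a comma outside parentheses on the right of the = sign.

```python
def classify_do(line: str) -> str:
    """Classify a fixed-form Fortran line as a DO loop, an assignment or other.

    Hypothetical helper for illustration; it assumes '!' only starts comments.
    """
    # Drop any trailing comment, then remove all whitespace, since
    # fixed-form Fortran does not treat blanks as significant.
    text = "".join(line.split("!", 1)[0].split()).lower()
    if not text.startswith("do") or "=" not in text:
        return "other"
    rhs = text.split("=", 1)[1]
    depth = 0
    for ch in rhs:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # A comma outside parentheses can only be the loop-bound separator.
            return "do-loop"
    # No top-level comma: the whole of "do5i" is an identifier being assigned to.
    return "assignment"
```

Note that the decision cannot be made until the whole of the right-hand side has been scanned, which is why a lexer written for C-like languages does not fit Fortran without modification.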

In my experience developers don’t break up literals or identifier names with whitespace and so I planned to mostly ignore the whitespace issue (it would simplify things if some adjacent keywords were merged to create a single keyword).

In Fortran the I/O is specified in the language syntax, while in C-like languages it is a runtime library call involving a string whose contents are interpreted at runtime. I decided to ignore I/O statements by skipping to the end of line (Fortran is line oriented).

Then the number of keywords hit me: around 190. Even with the simplifications I had made, writing a Fortran lexer looked like it would be a lot of work; some of the keywords only had this status when followed by an = and I kept uncovering new issues. Cutting and pasting somebody else’s lexer would probably also involve a lot of work.

I went back and looked at some of the Fortran front ends I had found on the Internet. The GNU Fortran front-end was a huge beast and would need serious cutting back to be of use. moware was written in Fortran and used the traditional six character abbreviated names seen in ‘old-style’ Fortran source and not a lot of commenting. The Eli project seemed a lot more interested in the formalism side of things and Fortran was just one of the languages they claimed to support.

The Open Fortran Parser looked very interesting. It was designed as a parsing skeleton for building tools that process source, and already contained hooks that printed diagnostic output when each language production was reduced during a parse. Tests showed that it did a good job of parsing the source I had, although there was one vendor extension used quite often (and not documented in their manual). The tool source, in Java, looked straightforward to follow and it was obvious where my code needed to be added. This tool was exactly what I needed :-)

GLR parsing is the future

August 27th, 2009 Derek-Jones No comments

Traditionally parser generators have required that their input grammar be LALR(1) or some close variant (I would include LL(1) in this set). Back when 64k was an unimaginably large amount of memory, being able to squeeze parser tables into a few kilobytes was very important; people received PhDs on parser table compression.

There is still a market for compact, fast parsers. Formal language grammars abound in communication protocols and vendors of communications hardware are very interested in keeping down costs by minimizing the storage needed by their devices.

The trouble with LALR(1) is that value 1. It means that the parser only looks ahead one token in the input stream. This often means that a grammar is flagged as being ambiguous (i.e., it contains shift/reduce or reduce/reduce conflicts) when it is actually just locally ambiguous, i.e., reading tokens further ahead in the input stream would provide sufficient context to unambiguously specify the appropriate grammar production.

Restructuring a grammar to make it LALR(1) requires a lot of thought and skill and inexperienced users often give up. I once spent a month trying to remove the conflicts in the SQL/2 grammar specified by the SQL ISO standard; I managed to get the number down from over 1,000 to a small number that I decided I could live with.

It has taken a long time for parser generators to break out of the 64k mentality, but over the last few years it has started to happen. There have been two main approaches: 1) LR(n), which provides a mechanism to look further ahead than one token, i.e., n tokens, and 2) GLR parsing.

I think that GLR parsing is the future for two reasons:

  • It is supported by the most widely used parser generator, bison.
  • It enables working parsers to be created with much less thought and effort than an LALR(1) parser (I don’t know how it compares against LR(n)).

GLR parsers handle conflicts in the grammar by effectively delaying decisions until runtime, in the hope that reading enough tokens will resolve local ambiguities. If an ambiguity in the token stream cannot be resolved, a runtime error occurs. This is the one big downside of a GLR parser: an LALR(1) parser generator may produce lots of build-time warnings, but the parser it generates never reports an ambiguity error when it is executed.

One example of a truly ambiguous construct (discussed here a while ago) is:

x * y;

which in C/C++ could be a declaration of y to be a pointer to x, or an expression that multiplies x and y.
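
To make the ambiguity concrete, here is a small Python sketch that lists the readings of a statement of the form A * B ; and shows how knowing which names are types would leave just one. It is not a GLR parser and not how bison works internally; the function name parses and the typedef_names parameter are invented for the illustration.

```python
def parses(tokens, typedef_names=None):
    """Return every reading of `A * B ;` consistent with what is known.

    Hypothetical illustration: typedef_names stands in for a symbol table,
    and None means that no declaration information is available.
    """
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise ValueError("this sketch only handles the form `A * B ;`")
    a, _, b, _ = tokens
    candidates = [
        ("declaration", f"{b} is a pointer to {a}"),
        ("expression", f"multiply {a} by {b}"),
    ]
    if typedef_names is None:
        # No symbol table: reading more tokens cannot help, both readings survive.
        return candidates
    wanted = "declaration" if a in typedef_names else "expression"
    return [c for c in candidates if c[0] == wanted]
```

With no declaration information parses(["x", "*", "y", ";"]) returns both readings, which is the situation in which a GLR parser has to report an unresolved ambiguity unless it has been told which reading to prefer.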

Tools that can detect these global ambiguities in a grammar are starting to appear, e.g., DTWA is a bison extension.

I reviewed an early draft of the new O’Reilly book “flex & bison” and tried to get the author to be more upbeat on GLR support in bison; I think I got him to be a bit less cautious.

Parsing without a symbol table

December 19th, 2008 Derek-Jones No comments

When processing C/C++ source for the first time through a compiler or static analysis tool there are invariably errors caused by missing header files (often because the search path has not been set) or incorrectly defined, or not defined, macro names. One solution to this configuration problem is to be able to process source without handling preprocessing directives (e.g., skipping them, such as not reading the contents of header files or working out which arm of a conditional directive is applicable). Developers can do it, why not machines?

A few years ago GLR support was added to Bison, enabling it to process ambiguous grammars, and I decided to create a C parser that simply skipped all preprocessing directives. I knew that at least one reasonably common usage would generate a syntax error:

func_call(a,
#if SOME_FLAG
b_1);
#else
b_2);
#endif

c);

and wanted to minimize its consequences (i.e., cascading syntax errors to the end of the file). The solution chosen was to parse the source a single statement or declaration at a time, so any syntax error would be localized to a single statement or declaration.
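
A rough Python sketch of the idea follows. It is not the code in my parser: the names split_units and balanced are made up, splitting on every ; is a crude stand-in for finding statement and declaration boundaries (it ignores for-loop headers, string literals, comments and compound statements), and the parenthesis check is a stand-in for actually parsing each unit.

```python
def split_units(source: str):
    """Split C source into statement-sized units, skipping preprocessing directives.

    Hypothetical sketch: every ';' is treated as the end of a unit.
    """
    units, current = [], []
    for line in source.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip the directive, keep the text of both arms
        for ch in line:
            current.append(ch)
            if ch == ";":
                units.append("".join(current).strip())
                current = []
        current.append(" ")  # a line break acts as whitespace
    tail = "".join(current).strip()
    if tail:
        units.append(tail)
    return units


def balanced(unit: str) -> bool:
    """Stand-in for a parse of one unit: are its parentheses balanced?"""
    depth = 0
    for ch in unit:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

Applied to the func_call example followed by a statement such as x = 1; the units are func_call(a, b_1); then b_2); then c); and finally x = 1; so the two malformed units fail on their own and the statement after them is unaffected.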

Systems for parsing ambiguous grammars work on the basis that while the input may be locally ambiguous, once enough tokens have been seen the number of possible parses will be reduced to one. In C (and even more so in C++) there are some situations where it is impossible to resolve which of several possible parses applies without declaration information on one or more of the identifiers involved (a traditional parser would maintain a symbol table where this information could be obtained when needed). For instance, x * y; could be a declaration of the identifier y as a pointer to type x, or an expression statement that multiplies x and y. My parser did not have a symbol table, and even if it did, the lack of header file processing meant that it would contain only a partial set of the declared identifiers. The ambiguity resolution strategy I adopted was to pick the most likely case, which in the example is the declaration parse.

Other constructs where the common case was used to resolve an ambiguity deadlock are listed below (the common case was chosen by me; I have yet to get around to verifying it by measurement):

f(p);          // Very common,
               // confidently picked function call as the common case
(m)*p;         // Not rare,
               // confidently picked multiplication as the common case
(s) - t;       // Quite rare,
               // picked binary operator as the common case
(r) + (s) - t; // Very rare,
               // an iteration on the case above
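
One way to picture this kind of resolution is as a fixed ranking of readings, with the highest ranked surviving candidate winning. The Python sketch below is an assumption about how such a ranking could be written down, not a description of my parser; the PREFERENCE table and the resolve function are invented, and the ranking simply encodes the choices listed above (function call over declaration, declaration over multiplication, binary operator over cast).

```python
# Hypothetical ranking: a lower number means the reading is assumed more common.
PREFERENCE = {
    "function call": 0,
    "declaration": 1,
    "multiplication": 2,
    "binary operator": 2,
    "cast": 3,
}


def resolve(candidates):
    """Pick the reading assumed to be most common among the surviving parses."""
    return min(candidates, key=lambda kind: PREFERENCE[kind])


# The competing readings of each construct when no symbol table is available.
AMBIGUOUS = {
    "x * y;": ["declaration", "multiplication"],
    "f(p);": ["function call", "declaration"],   # T(p); declares p if T is a type
    "(m)*p;": ["multiplication", "cast"],        # cast of *p to type m
    "(s) - t;": ["binary operator", "cast"],     # cast of -t to type s
}
```

For example, resolve(AMBIGUOUS["f(p);"]) gives "function call" and resolve(AMBIGUOUS["x * y;"]) gives "declaration".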

At the moment I am using the parser to measure language usage, so less than 100% correctness can be tolerated. Some of the constructs that cause a syntax error to be generated every few hundred statements/declarations include:

offsetof(struct tag, field_name)  // Declarators cannot be
                                  // function arguments
int f(p, q)
int p;     // Tries to reduce this as a declaration without handling
char q;    // it as part of an old style function definition
{

MACRO(+);  // Preprocessing expands to something meaningful

Some of these can be handled by extensions to the grammar, while others could be handled by an error recovery mechanism that recognized likely macro usage and inserted something appropriate (e.g., a dummy expression in the MACRO(x) case).
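
As an illustration of the error recovery idea, the Python sketch below recognizes a statement that looks like a macro invocation and substitutes a dummy expression. It relies on the assumption that macro names are written entirely in upper case, which is a common convention but not a rule of the language; the names MACRO_CALL and recover are made up, and none of this exists in my parser yet.

```python
import re

# Assumption: an all upper-case name followed by a parenthesized argument
# list and a semicolon is probably a macro invocation.
MACRO_CALL = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s*\((.*)\)\s*;\s*$")


def recover(statement: str):
    """Return a dummy replacement for a likely macro invocation, else None."""
    m = MACRO_CALL.match(statement)
    if m is None:
        return None
    # Insert a dummy expression so that parsing can continue.
    return "0;  /* dummy expression standing in for %s(...) */" % m.group(1)
```

Such a check would only be tried after a statement has failed to parse, so well-formed calls like MAX(a, b); never reach it.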
