February 7, 2021 Derek Jones 6 comments

Programming languages are like pop groups in that they have followers, fans and supporters; new ones are constantly being created and some eventually become widely popular, while those that were once popular slowly fade away or mutate into something else.

Creating a language is a relatively popular activity. Science fiction and fantasy authors have been doing it since before computers existed, e.g., the Elf language Quenya devised by Tolkien, and in the computer age Star Trek’s Klingon. Some very good how-to books have been written on the subject.

As soon as computers became available, people started inventing programming languages.

What have been the major factors influencing the growth to widespread use of a new programming languages (I’m ignoring languages that become widespread within application niches)?

Cobol and Fortran became widely used because there was widespread implementation support for them across computer manufacturers, and they did not have to compete with any existing widely used languages. Various niches had one or more languages that were widely used in that niche, e.g., Algol 60 in academia.

To become widely used during the mainframe/minicomputer age, a new language first had to be ported to the major computers of the day, whose products sometimes supported multiple, incompatible operating systems. No new languages became widely used, in the sense of across computer vendors. Some new languages were widely used by developers, because they were available on IBM computers; for several decades a large percentage of developers used IBM computers. Based on job adverts, RPG was widely used, but PL/1 not so. The use of RPG declined with the decline of IBM.

The introduction of microcomputers (originally 8-bit, then 16, then 32, and finally 64-bit) opened up an opportunity for new languages to become widely used in that niche (which would eventually grow to be the primary computing platform of its day). This opportunity occurred because compiler vendors for the major languages of the day did not want to cannibalize their existing market (i.e., selling compilers for a lot more than the price of a microcomputer) by selling a much lower priced product on microcomputers.

BASIC became available on practically all microcomputers, or rather some dialect of BASIC that was incompatible with all the other dialects. The availability of BASIC on a vendor’s computer promoted sales of the hardware, and it was not worthwhile for the major vendors to create a version of BASIC that reduced portability costs; the profit was in games.

The dominance of the Microsoft/Intel partnership removed the high cost of porting to lots of platforms (by driving them out of business), but created a major new obstacle to the wide adoption of new languages: Developer choice. There had always been lots of new languages floating around, but people only got to see the subset that were available on the particular hardware they targeted. Once the cpu/OS (essentially) became a monoculture most new languages had to compete for developer attention in one ecosystem.

Pascal was in widespread use for a few years on micros (in the form of Turbo Pascal) and university computers (the source of Wirth’s ETH compiler was freely available for porting), but eventually C won developer mindshare and became the most widely used language. In the early 1990s C++ compiler sales took off, but many developers were writing C with a few C++ constructs scattered about the code (e.g., use of new, rather than malloc/free).

Next, the Internet took off, and opened up an opportunity for new languages to become dominant. This opportunity occurred because Internet related software was being made freely available, and established compiler vendors were not interested in making their products freely available.

There were people willing to invest in creating a good-enough implementation of the language they had invented, and giving it away for free. Luck, plus being in the right place at the right time resulted in PHP and Javascript becoming widely used. Network effects prevent any other language becoming widely used. Compatible dialects of PHP and Javascript may migrate widespread usage to quite different languages over time, e.g., Facebook’s Hack.

Java rode to popularity on the coat-tails of the Internet, and when it looked like security issues would reduce it to niche status, it became the vendor supported language for one of the major smart-phone OSs.

Next, smart-phones took off, but the availability of Open Source compilers closed the opportunity window for new languages to become dominant through lack of interest from existing compiler vendors. Smart-phone vendors wanted to quickly attract developers, which meant throwing their weight behind a language that many developers were already familiar with; Apple went with Objective-C (which evolved to Swift), Google with Java (which evolved to Kotlin, because of the Oracle lawsuit).

Where does Python fit in this grand scheme? I don’t yet have an answer, or is my world-view wrong to treat Python usage as being as widespread as C/C++/Java?

New programming languages continue to be implemented; I don’t see this ever stopping. Most don’t attract more users than their implementer, but a few become fashionable amongst the young, who are always looking to attach themselves to something new and shiny.

Will a new programming language ever again become widely used?

Like human languages, programming languages experience strong networking effects. Widely used languages continue to be widely used because many companies depend on code written in it, and many developers who can use it can obtain jobs; what company wants to risk using a new language only to find they cannot hire staff who know it, and there are not many people willing to invest in becoming fluent in a language with no immediate job prospects.

Today’s widely used programmings languages succeeded in a niche that eventually grew larger than all the other computing ecosystems. The Internet and smart-phones are used by everybody on the planet, there are no bigger ecosystems to provide new languages with a possible route to widespread use. To be widely used a language first has to become fashionable, but from now on, new programming languages that don’t evolve from (i.e., be compatible with) current widely used languages are very unlikely to migrate from fashionable to widely used.

It has always been possible for a proficient developer to dedicate a year+ of effort to create a new language implementation. Adding the polish need to make it production ready used to take much longer, but these days tool chains such as LLVM supply a lot of the heavy lifting. The problem for almost all language creators/implementers is community building; they are terrible at dealing with other developers.

It’s no surprise that nearly all the new languages that become fashionable originate with language creators who work for a company that happens to feel a need for a new language. Examples include:

Go created by Google for internal use, and attracted an outside fan base. Company languages are not new, with IBM’s PL/1 being the poster child (or is there a more modern poster child). At the moment Go is a trendy language, and this feeds a supply of young developers willing to invest in learning it. Once the trendiness wears off, Google will start to have problems recruiting developers, the reason: Being labelled as a Go developer limits job prospects when few other companies use the language. Talk to a manager who has tried to recruit developers to work on applications written in Fortran, Pascal and other once-widely used languages (and even wannabe widely used languages, such as Ada),
Rust a vanity project from Mozilla, which they have now ~~abandoned~~ cast adrift. Did Rust become fashionable because it arrived at the right time to become the not-Google language? I await a PhD thesis on the topic of the rise and fall of Rust,
Microsoft’s C# ceased being trendy some years ago. These days I don’t have much contact with developers working in the Microsoft ecosystem, so I don’t know anything about the state of the C# job market.

Every now and again a language creator has the social skills needed to start an active community. Zig caught my attention when I read that its creator, Andrew Kelley, had quit his job to work full-time on Zig. Two and a-half years later Zig has its own track at FOSEM’21.

Will Zig become the next fashionable language, as Rust/Go popularity fades? I’m rooting for Zig because of its name, there are relatively few languages whose name starts with Z; the start of the alphabet is over-represented with language names. It would be foolish to root for a language because of a belief that it has magical properties (e.g., powerful, readable, maintainable), but the young are foolish.

Categories: Uncategorized Tags: C, Cobol, evolution, fashion, Fortran, Go, history, Java, JavaScript, mainframe, PHP, PL/1, programming language, Rust, Zig

Does the UK need the PL/1 Standard?

September 25, 2011 Derek Jones No comments

Like everything else language standards are born and eventually die. IST/5, the UK programming language committee, is considering whether the British Standard for PL/1 should be withdrawn (there are two standards, ISO 6160:1979 which has been reconfirmed multiple times since 1979, most recently in 2008, and a standardized subset ISO 6522:1992, also last confirmed in 2008).

A language standard is born through the efforts of a group of enthusiastic people. A language standard dies because there is no enthusiast (a group of one is often sufficient) to sing its praises (or at least be willing to be a name on a list that is willing to say, every five years, that the existing document should be reconfirmed).

It is 20 years since IST/5 last had a member responsible for PL/1, but who is to say that nobody in the UK is interested in maintaining the PL/1 standard? Unlike many other programming language ISO Standards there was never an ISO SC22 committee responsible for PL/1. All of the work was done by members of the US committee responsible for programming language PL 22 (up until a few years ago this was ANSI committee X3). A UK person could have paid his dues and been involved in the US based work; I don’t have access to a list of committee meeting attendees and so cannot say for sure that there was no UK involvement.

A member of IST/5, David Muxworthy, has been trying to find somebody in the UK with an interest in maintaining the PL/1 standard. A post to the newsgroup comp.lang.pl1 eventually drew a response from a PL/1 developer who said he would not be affected if the British Standard was withdrawn.

GNU compiler development is often a useful source of information. In this case the PL/1 web page is dated 2007.

In 2008 John Klensin, the ISO PL/1 project editor, wrote: “No activities or requests for additions or clarifications during the last year or, indeed, the last decade. Both ISO 6160 and the underlying US national document, ANS X3.53-1976 (now ANSI/INCITS-53/1976), have been reaffirmed multiple times. The US Standard has been stabilized and the corresponding technical activity was eliminated earlier this year”.

It looks like the British Standard for PL/1 is not going to live past the date of its next formal review in 2013. Thirty four years would then be the time span, from publication of last standard containing new material to formal withdrawal of all standards, to outlive. I wonder if any current member of either of the C or C++ committees will live to see this happen to their work?

Categories: Uncategorized Tags: INCITS, ISO Standard, IST/5, PL/1, SC22

Parsing ambiguous grammars (part 1)

March 4, 2009 Derek Jones No comments

Parsing a language is often much harder than people think, perhaps because they have only seen examples that use a simple language that has been designed to make explanation easy. Most languages in everyday use contain a variety of constructs that make the life of a parser writer difficult. Yes, there are parser generators, tools like bison, that automate the process of turning a grammar into a parser and a language’s grammar is often found in the back of its reference manual. However, these grammars are often written to make the life of the programmer easier, not the life of the parse writer.

People may have spotted technical term like LL(1), LR(1) and LALR(1); what they all have in common is a 1 in brackets, because they all operate by looking one token ahead in the input stream. There is a big advantage to limiting the lookahead to one token, the generated tables are much smaller (back in the days when these tools were first created 64K was considered to be an awful lot of memory and today simple programs in embedded processors, with limited memory, often use simple grammars to parse communication’s traffic). Most existing parser generators operate within this limit and rely on compiler writers to sweat over, and contort, grammars to make them fit.

A simple example is provided by PL/1 (most real life examples tend to be more complicated) which did not have keywords, or to be exact did not restrict the spelling of identifiers that could be used to denote a variable, label or procedure. This meant that in the following code:

IF x THEN y = z; ELSE = w;

when the ELSE was encountered the compiler did not know whether it was the start of the alternative arm of the previously seen if-statement or an assignment statement. The token appearing after the ELSE needed to be examined to settle the question.

In days gone-by the person responsible for parsing PL/1 would have gotten up to some jiggery-pokery, such as having the lexer spot that an ELSE had been encountered and process the next token before reporting back what it had found to the syntax analysis.

A few years ago bison was upgraded to support GLR parsing. Rather than lookahead at more tokens a GLR parser detects that there is more than one way to parse the current input and promptly starts parsing each possibility (it is usually implemented by making copies of the appropriate data structures and updating each copy according to the particular parse being followed). The hope is that eventually all but one of these multiple parsers will reach a point where they cannot successfully parse the input tokens and can be killed off, leaving the one true parse (the case where multiple parses continue to exist was discussed a while ago; actually in another context).

Categories: Uncategorized Tags: ambiguous grammar, embedded processor, language grammar, lexer, limited memory, parsing, PL/1, syntax, tools

The Shape of Code

Archive

Widely used programming languages: past, present, and future

Does the UK need the PL/1 Standard?

Parsing ambiguous grammars (part 1)

Recent Posts

Recent Comments

Archives

Meta