
C compilers on PR1ME computers

At the start of this week I knew almost nothing about the C compilers running on PR1ME computers (every now and again I check bitsavers for newly scanned C compiler manuals).

On Tuesday I spotted the source of the Georgia Tech C Compiler for Prime Computers (which was believed lost until two 35-year-old mag tapes were found); the implementation language is Ratfor.

I posted the information to the comp.compilers newsgroup, and in the ensuing posts I learned about Dennis Boone’s collection of scanned PR1ME manuals, including a C compiler user guide. This C user’s guide is for the Conboy/Pacer software compiler, not the one whose sources I linked to above.

The PR1ME C compiler has the usual assortment of vendor extensions and missing features of the kind that might be found in any C compiler from the 1980s; however, there is a larger than usual collection of infrequently seen characteristics. I spotted the following during a quick read through the manual:

  • addresses point to 16-bit words (not 8-bit bytes). The implementers could have chosen to define a char as 16-bit (i.e., defining the standard macro CHAR_BIT as 16), but they went with 8-bit chars. Handling 8-bit chars on a word addressed processor means that pointers to char have to include a bit specifying which half of the 16-bits they are pointing to. Some Cray computers had the same issue, except their words contained 64 bits, so more offset bits had to be stored in pointers.

    An object defined with type char occupies 8 bits of a word, and no other object is allocated in the other 8 bits; arrays of char are packed, two chars to a word.

  • ASCII characters have the top-bit set (the manual phrases this as “The basic character set … ASCII 7-bit set (called ASCII-7), with the eighth bit turned on.”). The C standard requires that the ten digit characters have contiguous values, and that members of the basic character set be representable in a char.
  • a pointer type may contain more bits than any supported integer type, e.g., 48-bit pointers and 32-bit integers. PR1ME cpus supported a dizzying array of different modes and instruction sets; some later instruction sets/modes support 48-bit pointers.
  • “On some other machines, you may write code in which a function is called with more or fewer parameters than the function actually expects. Such code may work correctly on the 50 Series, but only if the missing or extra parameter is never referenced.”
  • “On some other machines, programs run correctly if function return value data types are left undeclared. … If this function is not explicitly defined as returning a pointer, the default return value is type int. Such a program may run correctly on some machines, but not on a 50 Series machine.”

These characteristics combine to make the PR1ME a very unfriendly environment for C programs.

C programmers have a culture, the C way of doing things (these cultures exist for all languages), and the characteristics of this (and perhaps other) PR1ME C compilers run counter to this culture in many ways.

I’m not saying that C culture is good or bad, just that it exists and PR1ME is a very poor fit.

What elements of C culture clash with the PR1ME implementation?

There is an expectation that when two objects having char type are defined sequentially (e.g., char a, b;), it will be possible to access b by adding 1 to a pointer to a (as if the definition had been written as: char a[2];). Yes, this practice is now frowned upon, but it was once considered ok (at least in some circles). On PR1ME the two definitions are not equivalent, from the pointer arithmetic point of view.

Very many developers assume that C characters use ASCII. Most of the time they do, but they are not required to; EBCDIC being perhaps the most well known alternative. At least in the PR1ME encoding the alphabetic characters have contiguous values, which they don’t in EBCDIC. But setting the top bit, hmmm….

The assumption that sizeof(int) == sizeof(pointer_type) is endemic in 1980s code, and much code in later decades; many (not so young) C programmers will tell you the story of the first time they had to switch mind-sets to: sizeof(long) == sizeof(pointer_type). Not having a 48-bit integer type is a bit of a killer for C on PR1ME, as we will find out.

PR1MOS, the vendor operating system, uses a function call stack layout that assumes a function definition specifies exactly how the function will be called, e.g., the number and type of parameters, and the return type.

In the early decades of C, programmers were very lax about specifying exactly what arguments a function expected, and a function declared without a return type implicitly returned an int (and since everybody knew that sizeof(int) == sizeof(pointer_type), it was not thought necessary to specify pointer return types).

During development, having the program raise an exception (or whatever the behavior was) when a function call did not match the defined type of the function is useful; it improves program reliability by catching cases that might work as expected a lot of the time, but not all the time.

A lot of existing code was created on systems that were forgiving of function call/definition mismatches. Such C source is unlikely to just compile and run under PR1MOS, i.e., porting other people's programs is likely to be a time-consuming process.

  1. John Kelly
    October 15, 2019 16:45 | #1

    Derek,
    just saw this thru a crosspost on hacker news. As a former Prime employee (sysadmin) and Prime programmer (writing custom code for Prime clients) I used their C compiler and many of the other Primos compilers and system utilities. I ported Perl to Primos and experienced first hand the issues you mention. Although I’m rusty after so long away from Primos, please reach out if you need more insights into Prime computer. As you say it’s not the best environment to run C, but there were other languages, FTN, F77, Modula, Pl/p (subset of PL/1) which could be used to offset C’s issues.

    cheers,

  2. October 15, 2019 17:08 | #2

    @John Kelly
    I used Prime (mostly 550s) for two years, mostly in Fortran and sometimes Pascal.

    Thanks for your insight offer. What would you say was the most common problem, or the one that took the most time to sort out?

    From time to time the C Standard committee goes through a phase of discussing weird things that can happen with pointers. Obviously, if you attempt to dereference a pointer to non-existent storage stuff happens, and fiddling with the bits of a pointer value is not a good idea. Did you experience any (what you thought) surprising pointer behavior?

  3. John Kelly
    October 17, 2019 06:24 | #3

    The largest issue was the segmented architecture. The original prime 300 series hardware in fact supported the Honeywell series 16 instruction set (called R-mode). I saw only bits and pieces of that when I first started. As Prime matured they kept fiddling with the addressing and it was only with V-mode, which came in with the 500 series I think, that programs could address multiple memory segments. If you were on a 550-II then you were pretty much out of luck running C. You needed IX mode which wasn’t supported until the 9900 series. IX-mode provided general register relative addressing and even added extra instructions to make C run more efficiently. Somewhere around Rev 17 or 18 Primos started to use a dynamic versus a static linker/loader. This produced something called an EPF. The Executable Program File was a demand paged binary that was mapped into segments for execution. While it was much more efficient, there were little gotchas with EPFs, such as, if you didn’t take the time to fully zero out your variables, when the EPF crashed it would leave crap stuck in the segments until you forcefully cleared out the memory. And yeah, playing pointer games in C running an EPF was a sure way to get a big headache.

    It’s true, writing and running C code on a Prime 50 series was challenging and certainly nowhere near today’s coding standards but it could be done. I routinely pulled unix snippets (from SunOS typically) and was able to get them to run, though the .h files were pretty complicated with OS weirdness. I even wrote a file transfer program connecting Primos to SunOS, calling it ‘icp’ for intelligent copy. It used tcp/ip networking and byte-converted the files as they transferred. (The Stevens Networking book was brand new back then, and a godsend.) Living on the 50-series meant you had to be comfortable with octal, high-bit ascii and all sorts of other quirks (use ‘LD’ not ‘ls’ to list files in a directory), but they were decent performing systems and extremely stable – I knew of one server that ran for several years without a reboot.

    Finally, consider that Prime Computer and Oracle worked together and ported the Oracle database to the 50 series. I used it successfully at several Prime clients. I’m pretty certain the Oracle database was the most complicated C code base ever run on the 50 series. Oracle version 4.5 was written almost entirely in C and ran well, most of the time – but that’s another story.

    cheers,
