The Shape of Code

A trip down memory lane with Microsoft Word 1.1

March 31, 2014 Derek Jones

Microsoft has donated the source code of Word for Windows 1.1a to the Computer History Museum, and I have been rummaging around in this C code from the late 1980s.

What immediately struck me is how un-Microsoft-Windows-like the code looks; in some ways it looks more like Unix code. There are:

very few near/far/huge pointer declarations. The Intel x86 architecture is based on segmented addressing of memory, a fact that most developers are blissfully unaware of because 32-bit segments are big enough for it to be ignored. Back in the day we had 16-bit segments: near pointers could only point within a segment, far pointers could point to the contents of other segments (there were alignment issues associated with these), and huge pointers behaved essentially like flat 32-bit pointers. Many developers obsessed over saving time/space and did their utmost to use only near pointers (who is ever going to need a data structure larger than 64K?), and Windows code was often littered with different kinds of pointer usage,
very few #if/#endif. Why do developers use #if/#endif? They want to port their code to different versions of MS-DOS, to use different vendor compilers, and to use different third-party libraries. Microsoft were well known for not being backwards compatible with themselves (they have improved somewhat on that score) and obviously had no interest in using other vendors' compilers or third-party libraries. Unix code, then and now, is of course packed with #if/#endif.
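To make the near/far/huge distinction concrete, here is a minimal sketch in standard C that models real-mode 8086 segment:offset addressing. The near/far/huge keywords were compiler extensions that modern compilers do not accept, so the struct `seg_ptr` and the functions `linear_address`, `far_add` and `huge_add` are hypothetical names invented for this illustration; none of them come from the Word source. The sketch assumes real mode, where the linear address is segment * 16 + offset, and it ignores wraparound at the 1MB boundary.

```c
#include <stdint.h>

/* Hypothetical model of a real-mode segment:offset pointer. */
struct seg_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Real-mode 8086 (assumed): linear address = segment * 16 + offset. */
static uint32_t linear_address(struct seg_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* far-style arithmetic: only the 16-bit offset changes, so the
 * pointer wraps around inside its 64K segment. */
static struct seg_ptr far_add(struct seg_ptr p, uint16_t n)
{
    p.offset = (uint16_t)(p.offset + n);
    return p;
}

/* huge-style arithmetic: work on the linear address, then
 * renormalize so that the offset is always below 16. */
static struct seg_ptr huge_add(struct seg_ptr p, uint32_t n)
{
    uint32_t lin = linear_address(p) + n;
    struct seg_ptr r;

    r.segment = (uint16_t)(lin >> 4);
    r.offset = (uint16_t)(lin & 0xF);
    return r;
}
```

Starting from 1000:FFFF, `far_add(p, 1)` wraps around to 1000:0000 (back at the start of the same segment), while `huge_add(p, 1)` lands on 2000:0000, the next byte in memory. The extra work in `huge_add` is the reason huge pointers were the slow option.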

For any developer who has only used C since 2000 the most obvious code ‘quirk’ is probably the use of K&R style function definitions. This used to be the only way of defining functions until the first C Standard, in 1989, introduced function prototypes and the mass migration of the 1990s occurred. The K&R style function definition looks like no big deal:

f(a, b)
int a;
struct T b;
{
/* blah blah */
}

instead of:

f(int a, struct T b)
{
/* blah blah */
}

until you find out that in the first case, when f is called, there is no checking of the arguments against the parameters appearing in the definition. This lack of checking was/is a constant source of annoying faults that could/can take hours to track down; once a developer has experienced the benefits of automatic argument/parameter checking they soon convert to the prototype form.
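A small sketch of the difference in practice, written in the prototype form so that it compiles with current compilers (K&R-style definitions were removed from the language in C23). The functions `half` and `call_with_int` are hypothetical examples, not something taken from the Word source.

```c
/* Prototype-style definition: callers know the parameter is a double. */
static double half(double x)
{
    return x / 2.0;
}

/* Because a prototype is in scope, the int argument 3 is converted
 * to the double 3.0 before the call is made. */
static double call_with_int(void)
{
    return half(3);
}

/*
 * The K&R-style equivalent would be written:
 *
 *     double half(x)
 *     double x;
 *     { return x / 2.0; }
 *
 * With only that definition visible, half(3) passes an int where a
 * double is expected. The compiler is not required to diagnose this,
 * and the behavior is undefined.
 */
```

With the prototype the call quietly does the right thing; in the K&R form the same call compiles and then misbehaves at run time.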

The makefile included with the source contains compiler options that won’t be familiar to Windows developers. This is because a special purpose C compiler was used, one that generated P-code (the P here is for Packed not Portable). In the late 1980s, when this code was written, 256K was considered a lot of memory for your average person’s computer, and at 180K+ lines of code Word 1.1 was enormous. Word processing is on the whole not a CPU-intensive task, and trading a factor of 10 in execution speed (I’m guessing, probably more) for a memory footprint saving of a factor of 2.5-4 (not so much of a guess) was obviously considered worthwhile (but then Microsoft apps have never been known for being nimble).
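The space/time trade-off can be illustrated with a toy stack-machine interpreter. This is only a sketch of the general idea of compiling to a compact byte code that an interpreter then executes; the opcodes, the encoding and the names `run` and `demo_program` are invented for illustration and are not Microsoft's P-code format.

```c
#include <stdint.h>

/* Hypothetical one-byte opcodes; OP_PUSH is followed by a one-byte operand. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };
enum { STACK_MAX = 16 };

/* (2 + 3) * 4 encoded in nine bytes. */
static const uint8_t demo_program[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};

/* Interpret the byte code; returns the top of the stack at OP_HALT,
 * or -1 on a malformed program. */
static int run(const uint8_t *code)
{
    int stack[STACK_MAX];
    int sp = 0;

    for (;;) {
        uint8_t op = *code++;          /* fetch */
        switch (op) {                  /* decode and execute */
        case OP_PUSH:
            if (sp == STACK_MAX) return -1;
            stack[sp++] = *code++;
            break;
        case OP_ADD:
        case OP_MUL:
            if (sp < 2) return -1;
            sp--;
            if (op == OP_ADD) stack[sp - 1] += stack[sp];
            else stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : -1;
        default:
            return -1;
        }
    }
}
```

Here `run(demo_program)` returns 20. The program itself is tiny, and the price is the fetch/decode loop that runs for every single operation.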

The extensive use of bit-fields is the other clue that memory is tight. Do these ever get used outside of comms software and embedded systems these days?
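For readers who have not met them, a minimal sketch of what bit-fields buy. The struct and field names below (`char_props_packed`, `bold`, `font_size` and so on) and the helper `make_props` are hypothetical, not taken from the Word source, and the exact layout and size of a bit-field struct are implementation-defined.

```c
/* One full unsigned int per property. */
struct char_props_plain {
    unsigned int bold;
    unsigned int italic;
    unsigned int underline;
    unsigned int font_size;
};

/* The same properties packed into 10 bits. */
struct char_props_packed {
    unsigned int bold      : 1;
    unsigned int italic    : 1;
    unsigned int underline : 1;
    unsigned int font_size : 7;   /* holds 0..127 */
};

/* Bit-fields are read and written like ordinary members. */
static struct char_props_packed make_props(void)
{
    struct char_props_packed p = {0};

    p.bold = 1;
    p.font_size = 12;
    return p;
}
```

On typical compilers with a 32-bit `unsigned int`, the plain struct takes 16 bytes and the packed one 4. The flip side is that `font_size` can never exceed 127 without changing the layout.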

I was surprised at how clean and presentable this source code is. I should not have judged Microsoft developers by the horrors perpetrated by many developers targeting their platform.

Categories: Uncategorized Tags: K&R, Microsoft Word, pointer, segmented architecture

Comments (3)

Gareth Rees

March 31, 2014 21:40 | #1

Bitfields are alive and well. CPU speed has increased faster than memory bandwidth, so packing data makes more sense than ever (for applications which are time-critical). The more of your data you can squeeze into the level 1 cache, the faster your program goes.

Derek Jones

March 31, 2014 21:55 | #2

@Gareth Rees
Don’t forget that bitfield access is likely to require more instructions, which could fill up more cache than is saved. If the struct is part of a tree then the cost/benefit analysis is complicated by issues such as how the tree is walked and how many structs can be fitted into a single cache line.

Perhaps a more important issue is whether developers are willing to give up those bits that they might need to ‘expand’ the value into as requirements grow. Nicely packed structs are great until one of the fields needs a bit or two to expand into.

Andrey Karpov

April 2, 2014 16:08 | #3

Just for fun: checking Microsoft Word 1.1a source code with PVS-Studio static code analyzer – http://www.viva64.com/en/b/0245/
