Posts Tagged ‘memory management’

Least Recently Used: The experiment that made its reputation

June 20th, 2016 No comments

Today we all know that least recently used is the best page replacement algorithm for virtual memory systems (actually paging is complicated in today’s intertwined computing world).

Virtual memory was invented in 1959 and researchers spent the 60’s trying to figure out the best page replacement algorithm.

Programs were believed to spend most of their time in loops and this formed the basis for the rationale for why FIFO, First In First Out, was the best page replacement algorithm (widely used at the time).

Least recently used, LRU, was on people’s radar as a possible technique and was mathematically analysed, along with various other techniques. The optimal technique was known and given the name OPT; a beautifully simple technique with one implementation drawback, it required knowledge of future memory usage behavior (needless to say some researchers set to work trying to predict future memory usage so this algorithm could be used).

An experiment by Tsao, Comeau and Margolin, published in 1972, showed that LRU outperformed FIFO and random replacement. The rest, as they say, is history; in this case almost completely forgotten history.

The paper “A multi-factor paging experiment: I. The experiment and conclusions” was published as one of a collection of papers in “Statistical Computer Performance Evaluation” edited by Freiberger. A second paper by two of the authors in the same book outlines the statistical methodology. Appearing in a book means this paper can be very hard to track down. I recently bought a copy of the book on Amazon for one penny (the postage was £2.80).

The paper contains a copy of the experimental results and below are the page swap numbers:

loadseq      group group group freq freq  freq alpha alpha alpha
  Pages         24    20    16   24   20    16    24    20    16
   LRU     S    32    48   538   52  244   998    59   536  1348
   LRU     M    53    81  1901  112  776  3621   121  1879  4639
   LRU     L   142   197  5689  262 2625 10012   980  5698 12880
   FIFO    S    49    67   789   79  390  1373    85   814  1693
   FIFO    M   100   134  3152  164 1255  4912   206  3394  5838
   FIFO    L   233   350  9100  458 3688 13531  1633 10022 17117
   RAND    S    62   100  1103  111  480  1782   111   839  2190
   RAND    M    96   245  3807  237 1502  6007   286  3092  7654
   RAND    L   265  2012 12429  517 4870 18602  1728  8834 23134

Three Fortran programs were used, Small (55 statements), Medium (215 statements) and Large (595 statements). These programs were loaded by group (sequences of frequently called subroutines grouped together), freq (subrotines causing the most page swaps were grouped together), alpha (subroutines were grouped alphabetically).

The system was configured with either 24, 20 or 16 pages of 4,096 bytes; it had a total of 256K of memory (a lot of memory back then). The experiment consumed 60 cpu hours.

Looking at the table it is easy to see that LRU has the best performance. In places random replacement beats FIFO. A regression model (code and data) puts numbers on the performance advantage.

The paper says that the only interaction was between memory size (i.e., number of pages) and how the programs were loaded. I found pairwise interaction between all variables, but then I am using a technique that was being invented as this paper was being published (see code for details).

Number of page swaps was one of three techniques used for measuring performed. The other two were activity count (average number of pages in main memory referenced at least once between page swaps) and inactivity count (average time, measured in page swaps, of a page in secondary storage after it had been swapped out). See data for details.

I vividly remember dropping in on a randomly chosen lecture in computer science in the mid-70s (I studied physics and electronics), the subject was page selection algorithms and there were probably only a dozen people in the room (physics and electronics sometimes had close to 100). The lecturer regaled those present with how surprising it was that LRU was the best and somebody had actually done an experiment showing this. Having a physics/electronics background the experimental approach to settling questions was second nature to me.

Software memory management

November 23rd, 2011 No comments

I recently wrote about how computer memory capacity limits were a serious constraint for compiler writers. One technique used to increase the amount of memory available to a compiler (back in the days when pointers usually contained 16-bits) is software based paged memory management. Yes, compiler writers were generally willing to take the runtime performance hit to increase effectively accessible memory by around a factor of 10-25 (e.g., a 2 byte number used as an index into an array of 20 to 50 byte records).

The code to iterate over a data structure stored under the control of a software memory manager looks like the following (taken from a C to Pascal translator):

	Flds    : Sw_Ptr;  (* in practice an integer *)
        T_Node  : Sw_Node; (* in practice a pointer to a record *)
While Flds <> Sw_Nil Do  (* Sw_Nil is the memory managers Nil value *)
  Sw_Node_Ref(Cpswfile, Flds, T_Node, Mm_Readonly);
    If T_Node.Pn^.Node_Is<>N_Is_Field Then
      Verify_Error(Ve_Cputils, Ve_Scan_Fld);
  Flds := T_Node.Pn^.Field_Node.Next;

Where Sw_Node_Ref is a procedure in the memory management package that ensures the record denoted by Flds (whose value was obtained by a previous called to Sw_New_Node) is available in memory and returned in T_Node. Had Mm_ReadWrite rather than Mm_Readonly been specified the memory manager would assume that the record had been modified and when the page containing the record was swapped out of main memory it would write the contents of the page containing it to the swapfile (specified by the first argument, Cpswfile).

A call to Sw_Node_Ref only guarantees that the record is at the returned location until the next memory management procedure is called. This takes advantage of common usage which is: read a record, do something with its contents and then move on to the next one. The procedure Sw_Node_Ptr is available for when a record needs to be held in main memory across multiple Sw_ calls; this procedure locks a record in the limited capacity memory pool until explicitly freed (a Pascal style Mark/Release system was also available).

Multiple records were overlayed on a page (invariably 512 bytes) of storage. Some implementations used a fancy tool to create the overlay while others did it manually. The size of the pool in main memory used to hold swapped-in pages was specified when the memory manager was initialized; the maximum number of records that could be created by a call to Sw_New_Node was only limited by the maximum value of an integer and available disk space.

I learned about this implementation technique while on secondment at Intermetrics in the early 1980s, and they told me it came from the PQCC project of the mid 1970s. There is a paper in the Proceedings of the 1982 SIGPLAN symposium describing the system/library used by Intermetrics, which rambles on about nothing in particular and fails to say anything about software memory management (it is too useful an idea for a commercial company to tell anybody else); I don’t know of any other published description. Everybody I know who left Intermetrics to work on other compilers created their own implementation of a software memory management package.