Heap

From redlog.eu
Jump to: navigation, search

The term heap refers to the computer memory dynamically allocated by REDUCE or, more precisely, by the underlying Lisp system. A heap in this sense is not related to the heap data structure.

Heap Organization

In CSL a considerable amount of memory is allocated from the operating system at startup, which is establishes the heap. The heap is organized into pages. The current page size on 32 bit architectures is around 4 MB. Each page contains data of one of the following kinds:

  1. Lisp stack
  2. cons data
  3. vectors (including strings and bignums)
  4. byte-compiled code.

In PSL also allocates memory at startup. This memory is divided into three parts, one of which is the heap:

  1. The PSL code (bpsl). Depending on the architecture this is about 0.5-9 MB.
  2. The binary program storage (BPS) for compiled code (8 MB).
  3. The heap for Lisp items. On some architectures there are 2 heaps, which are used mutually. This depends on the garbage collection model.

Cells

When referring to the heap it is natural to refer to sizes in terms of Lisp cells rather than bytes.

On 32 bit systems the size of Lisp object (cell) is 4 bytes and that of a consed pair (double cell) is 8 bytes.

On 64 bit systems these sizes double, i.e. 4 bytes for a cell and 8 bytes for a double cell. The numbers given for "occupied" and "free" in the PSL garbage collection messages with the switch on gc refer to Lisp cells.

Restrictions with PSL

On 32 bit systems, PSL can address at most 128 MB corresponding to 32·10⁶ cells.

The heap can be enlarged only on systems that provide a variant of realloc that does not relocate the memory segment. Systems not admitting heap enlargement include Fedora Linux, MacOS X, and generally 64 bit systems.

Initial Heap Sizes

CSL

By default CSL chooses a suitable initial heap size itself. There is an option -k bytes for explicitly choosing the initial heap size. The use of this option is not recommended, and is usually not necessary since CSL automatically expands its heap if it finds that memory is getting full. On a typical 32-bit machine it will generally be able to expand up to somewhat over a gigabyte of heap, while on a 64-bit system there may be some current minor internal limit that stops expansion at around 10 Gbytes: at the time of writing that is more than most people can gain access to.

On 32 bit systems the default is around 56 MB, of which 12 MB are occupied after startup. So there remain 56 MB – 12 MB = 44 MB ≈ 11·10⁶ cells. On 64 bit systems the initial default allocation is around 116 MB, of which 12 MB are occupied after startup. So there remain 116 MB – 12 MB = 104 MB ≈ 13·10⁶ cells. These initial default memory allocations should be a matter of supreme indifference to anybody apart from somebody trying to port CST to a new platform that has severely limited amounts of real memory.

PSL

In PSL the binary executable bpsl knows an argument -td [mega]bytes. Suitable values for this are coded in the calling scripts, e.g. redpsl, and need not be given by the user. When referring to MB in connection with -td, we make the convention that 1 MB = 10⁶ B in contrast to 1024² B.

On 32 bit systems one common default choice for -td is 16 MB. That is, after startup there remain 16 MB – 8 MB for BPS – size of bpsl ≤ 8 MB ≈ 2·10⁶ cells. On 32 bit systems, where the heap cannot be enlarged, the -td is ignored and there are is the maximum of 128 MB allocated, which amounts to approximately 32·10⁶ cells.

On 64 bit systems, if the argument of -td is less than 10⁶, then is multiplied by 10⁶ thus describing MB in the sense above. The default choice is 1000 (MB). There are generally 2 heaps used so that after startup the available heap amounts to (1000 MB – 8 MB for BPS – size of bpsl) ÷ 2 ≤ 496 MB ≈ 62·10⁶ cells.

Heap Enlargement

In both CSL and PSL the heap size is never reduced. Well in CSL when you checkpoint a heap image and reload it later on the version in the new copy of Reduce will not necessarily use the same amount of space that it did in the run that created it, and in particular it could run in less memory than the run that created it used.

CSL

When after garbage collection there are at least 75% of the heap occupied the current size is enlarged by a factor of 100/75. That is there are always at most 75% occupied. The details of this behaviour are not considered by the CSL maintainers as part of the public specification of the system or something that should either concern anybody or that they should rely on. The procedure descibed here is intended to be a heuristic that keeps total garbage collection overhead a modest proportion of total CPU time while keeping a tolerably polite memory footprint. Every year or two any such trade-off will need to be revisited and so what is described here may change without notice.

It is not possible to explicitly enlarge the heap via a Lisp call. Also the absolute size of the heap is not available within REDUCE. Since memory is expanded automatically there should be no need for the user to feel a need to expand it under program control, and on modern computers with virtual memory and an uncertain number of other tasks running alongside Reduce the concept of a fixed maximum limit of available memory does not really seem meaningful.

PSL

As discussed above, the heap cannot be enlarged on all systems, in particular not on 64 bit systems.

On 32 bit systems that admit to do so, the heap is enlarged by 100 kB ≈ 25000 cells whenever it is essentially completely occupied after garbage collection.

There is a PSL-specific function set_heap_size cells for choosing a new heap size, which must be larger than the current one. The current heap size (in cells) is obtained by set_heap_size nil.