All programs, other than the most trivial, need to store data in memory. Intermediate values, sets of data to operate upon, storage for input and output of instructions are examples of what is stored in memory. Typically, a programmer will designate certain memory locations to hold values required by the program. In a PAL-11 program, this can be done using labels and a few assembler directives:
.= 1000 MOV A,C ; C = A ADD B,C ; C = A + B HALT ; END OF PROGRAM ; STATIC MEMORY A: .WORD 2000 B: .WORD 3000 C: .WORD 0 .END
This gets assembled to
001000 .= 1000 001000 016767 MOV A,C ; C = A 000012 000014 001006 066767 ADD B,C ; C = A + B 000006 000006 001014 000000 HALT ; END OF PROGRAM ; STATIC MEMORY 001016 002000 A: .WORD 2000 001020 003000 B: .WORD 3000 001022 000000 C: .WORD 0 000001 .END A 001016 B 001020 C 001022 . = 001024
Here, you can see that the assembler is using the relative addressing mode to read and write values to the designated location. A program can also use directives to "reserve" a larger block of memory:
.= 1000 MOV A,C ; C = A ADD B,C ; C = A + B MOV C,BLOCK ; BLOCK = C ; STATIC MEMORY A: .WORD 2000 B: .WORD 3000 BLOCK: .=.+4000 C: .WORD 0 .END
001000 .= 1000 001000 016767 MOV A,C ; C = A 000016 004020 001006 066767 ADD B,C ; C = A + B 000012 004012 001014 016767 MOV C,BLOCK ; BLOCK = C 004006 000004 ; STATIC MEMORY 001022 002000 A: .WORD 2000 001024 003000 B: .WORD 3000 005026 BLOCK: .=.+4000 005026 000000 C: .WORD 0 000001 .END A 001022 B 001024 BLOCK 001026 C 005026 . = 005030
These memory locations are burned into the program. In the example above,
BLOCK label always points to location 001026. If, for some reason,
the program needs to store more than 4000 bytes of data, the program will
need to be changed to increase the size.
(Note: all values in this article are in octal.)
In many cases, however, it's not clear at the time the program is written how much memory will be needed. For example, a program may need to hold a list of names from a tape, but the number of names may change each time the program is run. A program may only need to hold these names in memory for only a portion of the its operations. In many cases, it is helpful for a program to request a chunk of memory, use that memory to perform some operation, then release it for use by some other part of the program.
Instead of having a programmer try to figure out how to lay out all of
those memory chunks ahead of time, they can instead write some routines
to manage all of that instead. This technique is called "dynamic memory
allocation". One routine (
MALLOC) allocates, or reserves, a chunk of memory.
It takes the number of bytes as an argument and returns an address that has
been reserved that is at least as large and the number of bytes requested.
Another routine (
FREE) deallocates, or frees, chunks of memory can take
that address and remove the reservation, making the memory available for
the next reservation request.
The area of memory from which dynamic blocks are carved is commonly known
as "the heap". Note, this is unrelated to other uses of the term "heap" in
computer science, particluarly the tree-based data structure. This is memory
that does not contain the program instructions or the "static" memory
locations defined within the program itself, like the
BLOCK locations above.
In the examples here, our PDP-11 memory (diagrammed here in the traditional way, with the highest memory location on top) will be laid out as
The "blank" space below 160000 in the memory map represents the lack of memory hardware. In a fully maxed out twenty-eight kiloword PDP-11, there is no blank space, and the "xx7777" is 157777. The smallest memory configuration is four kilowords, which means "xx7777" is "017777". Attempting to access these locations leads to a Unibus bus error trap being raised.
The heap itself can be statically defined in the program. For example
.=1000 MOV R1,-(SP) ; STORE R1 ON THE STACK ; etc. HALT ; END OF PROGRAM ; STATIC MEMORY BALANC: .WORD 0 ; THE BALANCE PRICE: .WORD 0 ; THE PRICE NAME: .ASCII /THE NAME/ ; THE NAME ; DYNAMIC MEMORY HEAP: .=.+100000 ; THE HEAP, 32 KILOBYTES .END
With this, we have a thirty-two kilobyte chunk of space that can be carved up and provided to the program when it needs it. A clever program can make use of the Unibus bus error trap to determine the topmost memory. In any case, there has to be some bookkeeping so the system knows which parts of memory have been allocated, and which parts are available to be allocated.
The heap will be arranged as a series of blocks. A block has a header, and space for data. The block header is one word: the size of the data portion of the block. In an allocated block, the data portion is free-form: the program can use the memory space in any way the programmer desires. A free block uses the first word of the data portion as a pointer to the next free block. The remaining space in the data portion is undefined, and may be "polluted" with data from previous allocations.
When a heap is initialized, it consists of a free block with a size of two bytes, followed by a free block with a size of the rest of memory used by the heap. That first free block is special: it cannot be deallocated. The last free block in the heap has zero as its next free block, since there are no free blocks after it.
MALLOC routine follows a linked list of free blocks until it finds a
block that is large enough to satisfy the requested size. It keeps track
of the location of the current free block, which starts out as the "first
free block" value, and the previous free block pointer location. The routine
checks four conditions:
If the current free block value is zero, that means the routine could not find a free block large enough to satisfy the request, so an error condition (zero) is returned.
If the current free block is smaller than the requested size, we set the current free block pointer to the value of the next free block pointer, and try again.
If the current free block has a size that is exactly equal to the requested size, the previous free block is set to the current free block's next free block value, and the current free block location is returned.
If the current free block has a size that is larger than the requeted size, the current block is split into two: the current block size is set to the requested size, and the a new free block is created out of the remainder, its size is the original size of the block, minus the size of the request, and its next free block pointer is set that of the original block. The previous free block pointer is updated to point to this new free block.
For example, if a request for a block of 1000 bytes comes in, and while walking down the block list, a free block of size 1600 is found at location 13320, we start with
As an example of block splitting, say we start with a free block of 1600 bytes at location 13320:
11566: 500 ; block size is 500. This block is free. 11600: 13320 ; next free block is at location 13320 ... 13320: 1600 ; block size is 1600 bytes. This block is free. 13322: 21242 ; next free block is at location 21242 ... 15120: ; last word in free block ... ... other allocated block(s) here ... ... 21242: 700 ; block size is 700. This block is free. 21244: xxxxx ; next free block is at location xxxxx ...
MALLOC is called with a request to reserve a 1000 byte block, it is split
off from the 1600 bytes block, leaving the rest as a free block:
11566: 500 ; block size is 500. This block is free. 11600: 14322 ; next free block is at location 14322 ... 13320: 1000 ; block size is 1000 bytes 13322: ; first word of data in allocated block 13324: ; second word of data in allocated block ... 14320: ; last word of data in allocated block 14322: 576 ; block size is 576 bytes 14324: 21242 ; next free block is at location 21242 ... 15120: ; last word in free block ... ... other allocated block(s) here ... ... 21242: 700 ; block size is 700. This block is free. 21244: xxxxx ; next free block is at location xxxxx ...
The remainder free block has a size of 576 (not 600) becuase two bytes are needed for the block size. Also, the block that had 13320 as the next free block needs to now point to 14324, the remainder block, as its next free block. The remainder's next free block points to the origninal free block's next free block: in this case, at 21242.
MALLOC routine returns the address of the data section of the block-
the calling program should not need to skip over the header to get at the
memory that has been allocated for it. And it should definitely not
attempt to change the value of the data in the header, namely its size.
Of course, this generation of PDP-11 has no memory management hardware,
so there's nothing stopping a program from accessing other parts of memory.
Later PDP-11 models do have memory management that can be used to protect
against a program poking around in memory that does not belong to it.
When a program no longer needs the memory it previously allocated, it calls
FREE routine to release the memory to be used by other allocation
requests. This routine needs to figure out what the previous free block is,
and set this block's next free block it point to the previous free block's
next free block pointer. Then it needs to point the previous free block's
next free block pointer to this block. Basically, this inserts the freed
block into the list of free blocks.
One problem that arises from this is that as blocks are allocated and deallocated, we will end up with many small free blocks. In many cases these blocks will also be consecutive: A free block of size 300 may be followed by a free block of size 200. If an allocation request comes in for a block of size 400, the allocation routine will skip over these blocks becuase they are not big enough to satisfy the request. As time goes on, there may be plenty of free memory, but it may be broken down into many small blocks.
One way to address this is to have the
FREE routine merge adjoining free
blocks. Say there is a free block of size 300, followed by an allocated block
of size 200, then a free block of size 100. If the size 200 block is
deallocated, it should be merged with the previous and subsequent blocks,
resulting in a single free block of size 600 where the free block of size
11566: 300 ; Block size 300. This block is free. 11570: 12272 ; Next free block is at 12272 ... 12066: ; last word in free block 12070: 200 ; Block size 200. 12072: ; first word of data in allocated block. ... 12270: ; last word of data in allocated block. 12272: 100 ; Block size is 100. This block is free. 12274: xxxxx ; Next free block is at xxxxx
FREE is called with the data location of the second block (12072, which
MALLOC returned to the program when it was allocated), it will merge
the blocks and we will end up with
11566: 604 ; Block size 604. This block is free. 11570: xxxxx ; Next free block is at xxxx
The size has four extra bytes, which come from the headers of the original 200 and 100 blocks.
Here is an implementation of the
MALLOC routine. It does a few argument
checks at the beginning: it makes sure the block size is even so it falls
on a word boundary. It also make sure that a request for zero bytes actually
allocates two bytes, so there's a place for the "next block" pointer when
the block is deallocated. It also assumes the label
HEAP points to the
special first free block, which is initialized to point to the following
"real" free block that spans the rest of the heap.
; ; MALLOC(SIZE) ; ; ALLOCATE A CONTIGUOUS BLOCK OF MEMORY WITH AT LEAST THE GIVEN SIZE ; IN BYTES. RETURN 0 IF THIS CANNOT BE DONE ; MALLOC: MOV R1,-(SP) ; SAVE R1 MOV R2,-(SP) ; SAVE R2 MOV R3,-(SP) ; SAVE R3 MOV R4,-(SP) ; SAVE R4 ; WORD-ALIGN THE SIZE INC 12(SP) ; INCREMENT SIZE BIC #1,12(SP) ; CLEAR LOW BIT BNE MALLO0 ; SKIP AHEAD IF NOT ZERO MOV #2,12(SP) ; SMALLEST BLOCK SIZE IS 2 ; INITIALIZE FREE BLOCK POINTER MALLO0: MOV #HEAP,R1 ; R1 <- PTR TO FIRST FREE BLOCK MALLO1: ; FIND A FREE BLOCK WITH ENOUGH SIZE MOV R1,R2 ; R2 IS THE PREVIOUS FREE BLOCK TST 2(R1) ; CHECK THE PTR BEQ MALLO8 ; IF NO NEXT FREE BLOCK, RETURN NULL MOV 2(R1),R1 ; OTHERWISE, POINT R1 TO THE NEXT FREE MOV (R1),R3 ; FIND THE DIFFERENCE BETWEEN THE SUB 12(SP),R3 ; CURR BLOCK SIZE AND REQUESTED TST R3 ; CHECK TO SEE IF THEY'RE EQUAL BEQ MALLO2 ; IF SO, ALLOCATE WITHOUT SPLIT CMP R3,#4 ; IS THE CURRENT BLOCK SIZE BIG ENOUGH BGT MALLO3 ; IF SO, ALLOCATE BY SPLITTING BR MALLO1 ; OTHERWISE, TRY THE NEXT FREE BLOCK MALLO2: ; ALLOC FREE BLOCK THAT IS EXACTLY THE SAME SIZE AS REQUESTED MOV 2(R1),2(R2) ; POINT THE PREV NEXT FREE TO NEXT FREE MOV R1,R0 ; SET THE RESULT TO THE CURRENT BLOCK ADD #2,R0 ; POINT R0 TO THE DATA ELEMENT OF BLOCK BR MALLO9 ; WE'RE DONE MALLO3: ; SPLIT FREE BLOCK MOV R1,R0 ; MOVE THE BLOCK HEADER TO R0 ADD #2,R0 ; POINT R0 TO DATA ELEMENT OF BLOCK MOV (R1),R3 ; STASH ORIGINAL BLOCK SIZE IN R3 MOV 2(R1),R4 ; STASH ORIGINAL NEXT FREE IN R4 MOV 12(SP),(R1) ; WRITE NEW SIZE ADD 12(SP),R1 ; POINT TO START OF EMPTY PART ADD #2,R1 ; ADD SPACE FOR HEADER SUB 12(SP),R3 ; SUBTRACT OUT NEW SIZE SUB #2,R3 ; SUBCTRACT OUT HEADER SPACE MOV R3,(R1) ; SET SIZE OF EMPTY PART MOV R4,2(R1) ; SET NEXT FREE BLOCK MOV R1,2(R2) ; POINT THE PREV NEXT FREE TO EMPTY B BR MALLO9 ; RETURN MALLO8: CLR R0 ; R0 <- NULL MALLO9: MOV (SP)+,R4 ; RESTORE R4 MOV (SP)+,R3 ; RESTORE R3 MOV (SP)+,R2 ; RESTORE R2 MOV (SP)+,R1 ; RESTORE R1 RTS PC ; RETURN TO CALLER
FREE deallocation routine:
; ; FREE(PTR) ; ; FREE MEMORY ALLOCATED BY MALLOC ; FREE: MOV R1,-(SP) ; SAVE R1 MOV R2,-(SP) ; SAVE R2 MOV R3,-(SP) ; SAVE R3 MOV #HEAP,R1 ; START AT THE BEGINNING MOV 10(SP),R2 ; PUT DATA PTR IN R2 SUB #2,R2 ; POINT TO BLOCK FREE0: TST 2(R1) ; IS NEXT FREE NULL? BEQ FREE9 ; IF SO, WE'RE DONE CMP 2(R1),R2 ; COMPARE NEXT FREE WITH PTR BGT FREE1 ; IF IT'S GREATER, WE FOUND IT MOV 2(R1),R1 ; OTHERWISE, BR FREE0 ; NEXT FREE1: ; MERGE WITH NEXT BLOCK IF IT IS FREE MOV R2,R3 ; R3 <- PTR TO BLOCK TO FREE ADD #2,R3 ; SKIP PAST HEADER ADD (R2),R3 ; R3 <- PTR TO BLOCK AFTER THIS CMP 2(R1),R3 ; IS THE BLOCK AFTER THIS ALSO THE ; NEXT FREE BLOCK? IF SO MERGE BNE FREE2 ; IF NOT, SKIP THIS MERGE ADD (R3),(R2) ; ADD THE NEXT BLOCK'S SIZE ADD #2,(R2) ; ADD TWO TO ACCOUNT FOR HEADER MOV 2(R3),2(R2) ; COPY THE NEXT BLOCK'S NEXT FREE MOV R2,2(R1) ; SET PREV BLOCK NEXT FREE TO THIS FREE2: ; MERGE WITH PREVIOUS BLOCK IF IT IS FREE MOV R1,R3 ; R3 <- PTR TO PREV FREE BLOCK ADD #2,R3 ; SKIP PAST HEADER ADD (R1),R3 ; R3 <- PTR TO BLOCK AFTER PREV CMP R2,R3 ; DOES THE PREV FREE BLOCK ABUT US? BNE FREE3 ; IF NOT, SKIP THIS MERGE ADD (R2),(R1) ; ADD THE SIZE OF THIS BLOCK TO THE ; PREVIOUS BLOCK ADD #2,(R1) ; ADD TWO TO ACCOUNT FOR HEADER BR FREE9 ; WE'RE DONE FREE3: ; THIS BLOCK IS BETWEEN TWO ALLOCATED BLOCKS MOV 2(R1),2(R2) ; COPY NEXT PTR TO FREED BLOCK NEXT MOV R2,2(R1) ; COPY PTR TO NEXT PTR FREE9: MOV (SP)+,R3 ; RESTORE R3 MOV (SP)+,R2 ; RESTORE R2 MOV (SP)+,R1 ; RESTORE R1 RTS PC ; RETURN TO CALLER
The algorithm used here is similar to the approach taken by Prof. Donald Knuth in the first volume of The Art of Computer Programming.