Every non-trivial program needs a way to reuse fragments of code that are used in more than one place in the program. Subroutines allow the program to jump to another part of the program, execute some instructions, then jump back to where the subroutine was called from. This allows the program to have only one copy of the code, which is important for machines with small amounts of memory. It also allows a programmer to write a series of instructions once and debug it in one place, instead of having to fix the same instruction sequence throughout the program. Subroutines also let a programmer break down a complex problem into a set of smaller problems.
The PDP-11 instruction set includes the JSR (Jump to SubRoutine) instruction. This instruction takes two operands: the first is a register, and the second is the address of the subroutine. When the processor executes the JSR instruction, it pushes the value of the register operand to the stack, and stores the address of the instruction following the JSR instruction into the register. It then sets the program counter (PC) to the address given by the destination operand, so the next instruction to be run is the first instruction of the subroutine.
The RTS (Return from SubRoutine) instruction does the opposite of this. It takes a register operand. When the processor executes the RTS instruction, it sets the PC to the value of the register operand, and pops the value from the top of the stack into the register operand.
In early PDP-11 software, it was customary to use R5 as the register operand, which was also known as the "link register". As a trivial example, say we have a subroutine that multiplies the R0 register by two, by shifting its bits to the left. A program that wants to use this subroutine may look like the following:
; PROGRAM CODE HERE
JSR R5,TIMES2 ; CALL TIMES2 WITH R5 LINKAGE
; MORE PROGRAM CODE HERE
TIMES2: ASL R0 ; SHIFT LEFT ONE BIT
RTS R5 ; RETURN TO CALLER USING R5 LINKAGE
When the program reaches the JSR instruction, R5 is pushed to the stack, R5 is set to the address after the JSR instruction, and PC is set to the address of the ASL instruction. When the program reaches the RTS instruction, it sets the PC to the value stored in R5 (the address after the JSR instruction), and pops the value from the top of the stack into R5.
Subroutines are more useful if they can take parameters. For example, some parts of the program may want to multiply the value of a certain memory location by two, and other parts of the program may want to multiply the value of a different memory location by two. Instead of writing two subroutines that do largely the same thing, a programmer can write one subroutine that takes one or more parameters.
There needs to be some agreement between the caller of the subroutine and the author of the subroutine about how these parameters are passed from the caller to the subroutine, and how results of the routine are returned to the caller. These agreements are called "calling conventions".
A simple calling convention would be "R0 is used for the first input parameter, and upon return, R1 will contain the result. Use R5 as the linkage register". For example:
MOV #2000,R0 ; WE WANT TO MULTIPLY THE VALUE AT 2000
JSR R5,TIMES2 ; CALL THE SUBROUTINE
; MULTIPLY THE VALUE AT THE ADDRESS IN R0 BY TWO
; RETURN THE ORIGINAL VALUE IN R1
TIMES2: MOV (R0),R1 ; COPY THE ORIGINAL VALUE TO R1
ASL (R0) ; SHIFT THE VALUE AT THE ADDRESS IN R0 LEFT
RTS R5 ; RETURN TO CALLER AT THE ADDRESS IN R5
The problem with this is that you may want to use R0 or R1 for something else; if you want to use those values later, you need to save them somewhere and restore them afterwards. (R5 is occupied for the duration of the call, but JSR saves its old value on the stack and RTS restores it.) As a caller, you would need to know which registers the subroutine modifies, which may be different for each subroutine. Also, if there are more parameters than registers, this convention won't work.
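As a rough sketch of what that saving and restoring looks like, assume the caller is holding something it still needs in R1 when it calls the TIMES2 routine above. Using R2 to keep the returned value is only an illustrative choice, not part of the convention.

```
MOV R1,-(SP) ; SAVE OUR OWN R1, WHICH TIMES2 WILL OVERWRITE
MOV #2000,R0 ; WE WANT TO MULTIPLY THE VALUE AT 2000
JSR R5,TIMES2 ; CALL THE SUBROUTINE
MOV R1,R2 ; KEEP THE RETURNED ORIGINAL VALUE IN R2 (ASSUMED FREE)
MOV (SP)+,R1 ; RESTORE OUR OWN R1
```

Every call site that has live values in the affected registers needs this kind of bracketing, which is the bookkeeping burden described above.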
A different calling convention used by early PDP-11 systems uses a bit of trickery: it puts the parameters at the address directly after the JSR instruction, and expects the subroutine to use the linkage register to find the parameter values, incrementing it along the way, so by the time RTS is executed, the linkage register holds the address of the instruction after the last parameter. For example:
JSR R5,TIMES2 ; CALL THE SUBROUTINE
.WORD 2000 ; FIRST PARAMETER
; MULTIPLY THE VALUE AT THE ADDRESS GIVEN BY THE PARAMETER BY TWO
; RETURN THE ORIGINAL VALUE IN R1
TIMES2: MOV @0(R5),R1 ; R5 POINTS AT THE PARAMETER WORD AFTER JSR. COPY THE VALUE AT THE ADDRESS IT HOLDS TO R1
ASL @(R5)+ ; SHIFT THE VALUE AT THE ADDRESS HELD IN THE PARAMETER WORD,
; AND INCREMENT R5 TO POINT TO THE ADDRESS
; AFTER THE .WORD DIRECTIVE
RTS R5 ; RETURN TO CALLER AT THE ADDRESS IN R5
One of the downsides to this approach is that the address in this case is "burned in" to the program. If you wanted this part of the code to use a different address, the program would have to modify the value after the JSR instruction, and if the program were in read-only memory, this would not be possible. Also, later PDP-11 machines do not allow the memory that holds instructions to be altered by other instructions.
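The same idea extends to more than one in-line parameter. The following is a minimal sketch, assuming a hypothetical routine ADDTO (not taken from any particular system) that adds the value at one address to the value at another. It uses R0 and R1 as scratch registers, so under this sketch the caller must not expect them to survive the call.

```
JSR R5,ADDTO ; CALL THE SUBROUTINE
.WORD 2000 ; FIRST PARAMETER: SOURCE ADDRESS
.WORD 2002 ; SECOND PARAMETER: DESTINATION ADDRESS
; EXECUTION RESUMES HERE AFTER RTS
; ADD THE VALUE AT THE FIRST ADDRESS TO THE VALUE AT THE SECOND
ADDTO: MOV (R5)+,R0 ; R0 = SOURCE ADDRESS, R5 NOW POINTS AT SECOND PARAMETER
MOV (R5)+,R1 ; R1 = DESTINATION ADDRESS, R5 NOW POINTS PAST THE PARAMETERS
ADD (R0),(R1) ; ADD THE SOURCE VALUE TO THE DESTINATION VALUE
RTS R5 ; RETURN TO THE INSTRUCTION AFTER THE LAST .WORD
```

Each (R5)+ both fetches a parameter and steps the linkage register over it, which is why the subroutine must consume exactly as many words as the caller placed after the JSR.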
Any register can be used as the "linkage" register, even PC. The first example could also be written as:
; PROGRAM CODE HERE
JSR PC,TIMES2 ; CALL TIMES2 WITH PC LINKAGE
; MORE PROGRAM CODE HERE
TIMES2: ASL R0 ; SHIFT LEFT ONE BIT
RTS PC ; RETURN TO CALLER USING PC LINKAGE
In this case, JSR pushes the current value of PC (the address of the instruction after JSR) to the stack, then sets the PC to the address of the ASL instruction. At RTS, it sets the PC to the value stored in PC (which does nothing), and pops the value from the top of the stack into PC.
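Because each JSR pushes its own return address, subroutines can call other subroutines and the return addresses simply stack up. As an illustrative sketch, a hypothetical TIMES4 routine (a name made up for this example) could be built on top of TIMES2:

```
JSR PC,TIMES4 ; CALL TIMES4 WITH PC LINKAGE
; MORE PROGRAM CODE HERE
TIMES4: JSR PC,TIMES2 ; DOUBLE R0 ONCE
JSR PC,TIMES2 ; DOUBLE R0 AGAIN
RTS PC ; RETURN TO THE ORIGINAL CALLER
TIMES2: ASL R0 ; SHIFT LEFT ONE BIT
RTS PC ; RETURN TO WHOEVER CALLED TIMES2
```

While TIMES2 is running, the top of the stack holds the return address inside TIMES4, and the word underneath it holds the return address in the main program.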
Another common calling convention was introduced in the UNIX operating system, and its associated C programming language developed on the PDP-11. Also known as the "C" calling convention, this uses the stack to pass parameters into the subroutine, using PC as the linkage register, and using R0 and R1 to return values. The caller is responsible for putting parameters on the stack and cleaning up the stack after the call. Parameters are pushed "right to left", in reverse order: if a subroutine needs three parameters, the caller pushes the third parameter, then the second, then the first. Our example here only has one parameter.
MOV #2000,-(SP) ; PUSH 2000 TO STACK
JSR PC,TIMES2 ; CALL TIMES2 WITH PC LINKAGE
ADD #2,SP ; CLEAN UP STACK
; MULTIPLY THE VALUE AT THE ADDRESS IN THE FIRST PARAMETER BY TWO
; RETURN THE ORIGINAL VALUE IN R0
TIMES2: MOV @2(SP),R0 ; SAVE ORIGINAL VALUE IN R0
ASL @2(SP) ; SHIFT LEFT ONE BIT
RTS PC ; RETURN TO CALLER USING PC LINKAGE
Remember that @n(Rx) is the index deferred addressing mode: the processor adds n to the value of Rx, reads the word at that location, and uses that word as the address of the operand. In this case, the location is 2 plus SP. At the beginning of a subroutine, the top of the stack (i.e., the address given by SP) contains the address where the program should return when RTS is executed. Since the stack grows "downward", 2(SP) is the word "underneath" the top of the stack, in this case the value 2000 that was put there by the MOV instruction. @2(SP) means the operand is the value at the address held in 2(SP), and here that address is 2000. So, MOV @2(SP),R0 moves the value at 2000 into R0. Likewise, ASL @2(SP) tells the processor to shift the value at 2000 left. After returning, the caller should "pop" the parameters from the stack. Adding 2 to the stack pointer effectively restores the top of the stack to the value it had before the call.
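With more than one parameter, the right-to-left push order means the first parameter always ends up closest to the return address. Here is a minimal sketch, assuming a hypothetical routine ADDAMT (invented for this illustration) whose first parameter is an address and whose second parameter is an amount to add to the value at that address:

```
MOV #5,-(SP) ; PUSH SECOND PARAMETER (AMOUNT) FIRST
MOV #2000,-(SP) ; PUSH FIRST PARAMETER (ADDRESS) LAST
JSR PC,ADDAMT ; CALL ADDAMT WITH PC LINKAGE
ADD #4,SP ; CLEAN UP STACK: TWO WORDS ARE FOUR BYTES
; ADD THE SECOND PARAMETER TO THE VALUE AT THE ADDRESS IN THE FIRST
; RETURN THE ORIGINAL VALUE IN R0
ADDAMT: MOV @2(SP),R0 ; SAVE ORIGINAL VALUE IN R0
ADD 4(SP),@2(SP) ; ADD THE AMOUNT TO THE VALUE AT THE ADDRESS
RTS PC ; RETURN TO CALLER USING PC LINKAGE
```

Inside ADDAMT, (SP) is the return address, 2(SP) is the first parameter and 4(SP) is the second, so the subroutine finds its parameters at fixed offsets regardless of how many were pushed after them.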
Having to remember to clean up the stack may seem like additional overhead for the caller. After all, a subroutine could be designed to manipulate the stack in a way that it cleans up after itself. But sometimes, the programmer may want to use the same parameters for a series of subroutine calls:
MOV #2000,-(SP) ; PUSH 2000 AS FIRST PARAMETER
JSR PC,SUB1 ; CALL SUB1 WITH 2000 PARAMETER
JSR PC,SUB2 ; CALL SUB2 WITH 2000 PARAMETER
JSR PC,SUB3 ; CALL SUB3 WITH 2000 PARAMETER
ADD #2,SP ; RESTORE THE STACK
If the subroutines were responsible for cleaning up the stack themselves, then the programmer would have to push 2000 to the stack before each call, because the subroutine would "helpfully" clean the parameter from the stack at the end of the call. Having the caller be responsible for this can speed up the program and provide the caller with more flexibility, at the expense of having to clean up the stack themselves.
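For comparison, a subroutine that cleans up its own single parameter might look like the following sketch. TIMES2C is a hypothetical name used only here, and this is one possible way to do it rather than an established convention.

```
MOV #2000,-(SP) ; PUSH 2000 TO STACK
JSR PC,TIMES2C ; CALL TIMES2C, NO CLEANUP NEEDED AFTERWARDS
; MULTIPLY THE VALUE AT THE ADDRESS IN THE FIRST PARAMETER BY TWO
; RETURN THE ORIGINAL VALUE IN R0, AND REMOVE THE PARAMETER
TIMES2C: MOV @2(SP),R0 ; SAVE ORIGINAL VALUE IN R0
ASL @2(SP) ; SHIFT LEFT ONE BIT
MOV (SP)+,(SP) ; POP RETURN ADDRESS AND STORE IT OVER THE PARAMETER
RTS PC ; RETURN, LEAVING THE STACK AS IT WAS BEFORE THE PUSH
```

The MOV (SP)+,(SP) instruction reads the return address, advances SP by one word, and writes the return address into the slot that held the parameter, so the RTS that follows pops both in effect. The caller saves an instruction per call, but can no longer reuse the pushed parameter for a second call.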
A well-behaved subroutine should leave the other registers as they were before the call. A common idiom is to push registers modified in the subroutine to the stack at the beginning of the subroutine, and restore them from the stack at the end. For example:
TIMES2: MOV R3,-(SP) ; STASH R3 IN STACK
MOV R4,-(SP) ; STASH R4 IN STACK
CLR R3 ; MODIFY R3
CLR R4 ; MODIFY R4
MOV @6(SP),R0 ; SAVE ORIGINAL VALUE IN R0
ASL @6(SP) ; SHIFT VALUE LEFT
MOV (SP)+,R4 ; RESTORE R4 FROM STACK
MOV (SP)+,R3 ; RESTORE R3 FROM STACK
RTS PC ; TOP OF STACK IS RETURN ADDRESS
; RESULT IS IN R0
Notice the value is no longer at 2(SP), but at 6(SP). This is because at the MOV @6(SP),R0 instruction, the top of the stack (SP) contains the original value of R4, 2(SP) contains the original value of R3, 4(SP) contains the return address put there by the JSR instruction, and 6(SP) contains the value put there by the MOV value,-(SP) instruction before the JSR. Thus, it is very important that just before the RTS instruction, the top of the stack contains the return address set by the JSR instruction. Popping R4 and R3 from the stack "undoes" the pushes of R3 and R4 at the beginning of the subroutine.
This calling convention is also used in later architectures: it is seen in C code compiled for many 32-bit architectures. Today's 64-bit architectures have many registers, and use calling conventions that pass subroutine arguments in registers. Many C compilers allow a programmer to specify which calling convention to use; some default to this "C" convention (modified to suit the processor and OS architecture). An example of a variant "fast call" convention would put the first two parameters into R0 and R1, and the remainder of the parameters on the stack like "C". Since many subroutines need two or fewer parameters, this would save the time needed to push and pop parameters to and from the stack.
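To make the "fast call" idea concrete, here is a sketch of how such a convention could look on the PDP-11. It is an illustration of the variant described above, not a documented PDP-11 convention, and SUM3 is a hypothetical routine that adds its three parameters and returns the total in R0.

```
MOV #3,-(SP) ; THIRD PARAMETER GOES ON THE STACK
MOV #2,R1 ; SECOND PARAMETER IN R1
MOV #1,R0 ; FIRST PARAMETER IN R0
JSR PC,SUM3 ; CALL SUM3 WITH PC LINKAGE
ADD #2,SP ; CALLER REMOVES THE ONE STACKED PARAMETER
; ADD THE THREE PARAMETERS TOGETHER
; RETURN THE TOTAL IN R0
SUM3: ADD R1,R0 ; R0 = FIRST + SECOND
ADD 2(SP),R0 ; ADD THE THIRD PARAMETER FROM THE STACK
RTS PC ; RETURN TO CALLER USING PC LINKAGE
```

A routine with only one or two parameters would need no pushes and no stack cleanup at all under this scheme.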
In the end, whichever convention is chosen, it is important that the conventions used within a program are consistent and documented.