Like many other computers of the time, the base PDP-11/20 model has not have a "multiply" instruction. Any program that needs to multiply numbers needs to provide a routine that can use the existing instructions to perform the multiplication operation. One technique is to use a series of shifts and adds, similar to the way multiplication is done by hand.

Start with a zero result value. For each bit of the multiplier, shift the result and multiplier left one bit. If the multiplier's high bit was set, then add the muptlicand to the result. Repeat this for all bits of the multiplier. As an example, multiply 13 (1101 in binary) by 11 (1011).

```
Result: 0
Multiplier: 1101
Multiplicand: 1011
```

Shift the Result left, and the Multiplier to the left. Since the high bit of the multiplier is a 1, the multiplicand is added to the result.

```
Result: 1011
Multiplier: 1010
Multiplicand: 1011
```

Next, shift result left, and the multiplier. Since the high bit of the multiplier was a 1, the multiplicand is added to the result:

```
Result: 100001 (10110 + 1011)
Multiplier: 0100
Multiplicand: 1011
```

Shift the result and multplier left. Since the high bit of the multiplier
was a 0, the multiplicand is *not* added to the result:

```
Result: 1000010
Multiplier: 1000
Multiplicand: 1011
```

Shift the result and multiplier left. Since the high bit of the multiplier was a 1, the multiplicand is added to the result:

```
Result: 10001111 (10000100 + 1011)
Multiplier: 0000
Multipliand: 1011
```

That was all the bits of the multiplier. The result is 10001111 in binary, which is 143 in decimal, which is in fact the product of 13 and 11.

Any N bit number multiplied by another N bit number can result in a 2N bit number. Specifically, a 16-bit number multiplied by another 16-bit number can result in a 32-bit number. Another way of saying this is that one word multiplied by another word may end up with a two word value.

Our multiply routine will take two words, a multplicand and a multiplier, and return two words in the result, the high and low words of the product. This example will use the "C" calling convention, where the inputs are pushed onto the stack, and the result will return in the R0 and R1 registers. The routine will be called MULWU: Multiply Word Unsigned. We'll create a routine that can multiply signed values later, and may create a Double-word (32 bit inputs) in the future.

This uses the "C" calling convention, so the multiplier and multiplicand are pushed to the stack before calling the routine. The result is the double word value held in R0 and R1, with R0 holding the high order bits, and R1 holding the low order bits.

Inline documentation is always handy, which indicates the inputs and outputs of the routine.

```
; MULWU(MULTIPLIER, MULTIPLICAND): R0, R1
;
; MULTIPLY WORDS UNSIGNED
;
; R0: RESULT HI
; R1: RESULT LO
MULWU:
```

This routine will require a couple of registers to hold intermediate results and a counter. The values of these registers need to be stashed so they can be restored before the routine returns control to the caller.

```
MOV R2,-(SP) ; SAVE R2
MOV R3,-(SP) ; SAVE R3
MOV R4,-(SP) ; SAVE R4
```

Next, the registers are initialized. For this routine, the low word of the result is cleared, the multiplier is copied from the stack into R0, the multiplicand copied from the stack into R2, and R3 is set to 16 (decimal), which is used as a counter, as the algorithm performs 16 shifts.

At this point, the stack pointer SP is pointing to the value of R4. The next item on the stack, at location SP + 2 is the value of R3. SP + 4 holds the value of R2. SP + 6 holds the return address for this subroutine call. SP + 10 holds the value that was pushed before JSR was called, which in this case is the multiplicand. SP + 12 holds the value that was pushed before that, in this case the multiplier.

```
CLR R0 ; CLEAR RESULT HI
MOV 10(SP),R4 ; MOVE MULTIPLIER TO R4
MOV 12(SP),R2 ; MOVE MULTIPLICAND TO R2
MOV #20,R3 ; SET R3 COUNTER TO 16(10)
CLR R1 ; CLEAR RESULT LO
```

This is the shift-and-possibly-add sequence. The first two lines implement a two word (32 bit) left rotate. ASL (Arithmatic Shift Left) shifts the bits of the word left by one bit, filling in the low bit with a zero, and sets the carry bit to the previous value of the it that just "fell off" the left side of the word (ie, the previous value of the high bit). ROL (Rotate Left) is similar, but fills in the low bit with the previous value of the carry flag. Used together, these effectively become a multi-word left shift.

```
MULWU0: ASL R1 ; SHIFT LEFT RESULT LO
ROL R0 ; ROTATE LEFT RESULT HI
```

Then, the multiplier is shifted left.

```
ASL R4 ; SHIFT LEFT MULTIPLIER
```

If a zero "fell off" the left side of the word (ie, the previous value of the high bit was zero), the add step can be skipped.

```
BCC MULWU1 ; CARRY CLEAR: NO NEED TO ADD
```

Now, the multiplicand can be added to the result. Since the result is held in two words, the multiplicand is added to the low word, and if that caused the carry flag to be set, add it to the high word.

```
ADD R2,R1 ; ADD MULTIPLICAND TO RESULT LO
ADC R0 ; ADD CARRY (IF ANY) TO RESULT HI
```

Decrement the bit counter. If the counter hasn't reached zero, there are are more bits in the multipler.

```
MULWU1: DEC R3 ; DECREMENT COUNT
BNE MULWU0 ; IF NOT DONE, NEXT DIGIT
```

If there are no more bits to consider, we are done. Restore the registers to their original values, and return back to the caller.

```
MOV (SP)+,R4 ; RESTORE R4
MOV (SP)+,R3 ; RESTORE R3
MOV (SP)+,R2 ; RESTORE R2
RTS PC ; RETURN TO CALLER
```

To use the routine, push the multiplier and multiplicand to the stack, call the routine, and upon return, the result will appear in the R0 and R1 registers.

```
MOV #2002,-(SP) ; PUSH 1026 DECIMAL
MOV #3003,-(SP) ; PUSH 1539 DECIMAL
JSR PC,MULWU ; CALL MULWU(MULTIPLICAND, MULTIPLIER)
ADD #4,SP ; CLEAN UP STACK
```

Here, R0 is 000030 and R1 is 014006. In binary, these are

```
0 000 000 000 011 000 0 001 100 000 000 110
```

Taken as a 32 bit number, this is 1579014 decimal, which is the product of 1026 and 1539.

There is an optimization that can be used to eliminate one of the ASL instructions; notice that the high word of the result and the multiplier are shifted left in back-to-back instructions. If we use the high word of the result to hold the multiplier, the ASL instruction serves double-duty: it shifts the multiplier bits to the left, and the high bits of the results are shifted from the low word to the high, in the same instruction. This also means we can use one less register.

```
; MULWU(MULTIPLIER, MULTIPLICAND): R0, R1
;
; MULTIPLY WORDS UNSIGNED
;
; R0: RESULT HI
; R1: RESULT LO
MULWU: MOV R2,-(SP) ; SAVE R2
MOV R3,-(SP) ; SAVE R3
MOV 6(SP),R0 ; MOVE MULTIPLIER TO R0
MOV 10(SP),R2 ; MOVE MULTIPLICAND TO R2
MOV #20,R3 ; SET R3 COUNTER TO 16
CLR R1 ; CLEAR RESULT LO
MULWU0: ASL R1 ; SHIFT LEFT RESULT LO
ROL R0 ; ROTATE LEFT RESULT HI
BCC MULWU1 ; CARRY CLEAR: NO NEED TO ADD
ADD R2,R1 ; ADD MULTIPLICAND TO RESULT LO
ADC R0 ; ADD CARRY (IF ANY) TO RESULT HI
MULWU1: DEC R3 ; DECREMENT COUNT
BNE MULWU0 ; IF NOT DONE, NEXT DIGIT
MOV (SP)+,R3 ; RESTORE R3
MOV (SP)+,R2 ; RESTORE R2
RTS PC ; RETURN TO CALLER
```

Creating a version of this routine that handles signed multiplication is straightforward: for each argument, if the value is negative, invert a flag, and compute its twos' compliment, to make it positive. Call the unsigned multiply routine. If the flag indicates that only one of the original arguments was negative, then compute the (32-bit!) twos' compliment of the result.

This can be implemented by modifying the multiplication routine by adding an alternate entry point that handles negation and sign management, and by modifying the end of the original routine to negate the result if the inputs had opposite signs. The orignal routine will be modified first, to set aside a register to hold the negation flag, and to check the value of that flag at the end to see if the result needs to be negated:

```
; MULWU(MULTIPLIER, MULTIPLICAND): R0, R1
;
; MULTIPLY WORDS UNSIGNED
;
; R0: RESULT HI
; R1: RESULT LO
MULWU: MOV R2,-(SP) ; SAVE R2
MOV R3,-(SP) ; SAVE R3
MOV R4,-(SP) ; SAVE R4
CLR R4 ; CLEAR RESULT SIGN BIT
MOV 10(SP),R0 ; MOVE MULTIPLIER TO R0
MOV 12(SP),R2 ; MOVE MULTIPLICAND TO R2
MULW0: MOV #20,R3 ; SET R3 COUNTER TO 16
CLR R1 ; CLEAR RESULT LO
MULWU0: ASL R1 ; SHIFT LEFT RESULT LO
ROL R0 ; ROTATE LEFT RESULT HI
BCC MULWU1 ; CARRY CLEAR: NO NEED TO ADD
ADD R2,R1 ; ADD MULTIPLICAND TO RESULT LO
ADC R0 ; ADD CARRY (IF ANY) TO RESULT HI
MULWU1: DEC R3 ; DECREMENT COUNT
BNE MULWU0 ; IF NOT DONE, NEXT DIGIT
TST R4 ; CHECK THE RESULT SIGN BIT
BPL MULW9 ; IF NOT SET, SKIP TO THE END
COM R0 ; COMPLIMENT RESULT HI
NEG R1 ; TWO'S COMPLIMENT RESULT LO
MULW9:
MOV (SP)+,R4 ; RESTORE R4
MOV (SP)+,R3 ; RESTORE R3
MOV (SP)+,R2 ; RESTORE R2
RTS PC
```

This adds a couple of unneeded stack operations and a test/branch for the unsigned multiplication operation, but it's a small price to pay to make the signed operation simpler, and the combined routine smaller.

The signed routine checks to see if the mutiplier and multiplicand are negative, and for each, negate them if necessary and flip the result sign bit. It then jumps into the unsigned multiplication routine for the shift-and-add phase. This skips the setup part of the usnigned routine. The signed routine could have been implemented by calling the unsigned routine as a subroutine, but that would require additional stack manipulation. The way used here basically merges the unsigned and signed routines into one routine that has two entry points.

```
; MULWS(MULTIPLIER, MULTIPLICAND): R0, R1
;
; MULTIPLY WORDS SIGNED
;
; R0: RESULT HI
; R1: RESULT LO
MULWS: MOV R2,-(SP) ; SAVE R2
MOV R3,-(SP) ; SAVE R3
MOV R4,-(SP) ; SAVE R4
CLR R4 ; CLEAR RESULT SIGN BIT
MOV 10(SP),R0 ; MOVE MULTIPLIER TO R0
MOV 12(SP),R2 ; MOVE MULTIPLICAND TO R2
TST R0 ; TEST THE MULTIPLIER
BPL MULWS0 ; IF NOT NEGATIVE, SKIP TO MULTIPLICAND
NEG R0 ; MAKE THE MULTIPLIER POSITIVE
COM R4 ; FLIP THE RESULT SIGN BIT
MULWS0: TST R2 ; TEST THE MULTIPLICAND
BPL MULW0 ; IF NOT NEGATIVE, SKIP TO SHIFT-AND-ADD
NEG R2 ; MAKE THE MULTIPLICAND POSITIVE
COM R4 ; FLIP THE RESULT SIGN BIT
BR MULW0 ; BRANCH TO THE SHIFT-AND-ADD SECTION
```

If the multiplicand and multiplier are both positive or negative, the R4 register will be zero. If only one of them is negative, the R4 register will be -1 (all bits set), which signals the unsigned routine that the result needs to be negated. Calling this routine will exercise this negation section:

```
MOV #2002,-(SP) ; PUSH 1026 DECIMAL
MOV #174775,-(SP) ; PUSH -1539 DECIMAL
JSR PC,MULWS ; CALL MULWS(MULTIPLICAND, MULTIPLIER)
ADD #4,SP ; CLEAN UP STACK
```

R0 and R1 will be 177747 and 163772, respectively. In binary:

```
1 111 111 111 100 111 1 110 011 111 111 010
```

This 32-bit value using two's compliment, is -1579014 decimal, the product of 1026 and -1539.

The Paper Tape Software set included a tape that had math routines that included a multiply routine. Digital offered a coprocessor module (Extended Arithmetic Element) that could be used by programs to perform a handful of operations including multiplications. Later PDP-11 models included a processor option called the Extended Instruction Set (EIS) that includes a MUL instruction, which was later made standard.