Monday, September 11, 2006

The Mark II OISC Self-Interpreter

It's been a while since my first post about a self-interpreter for a "subtract-and-branch-unless-positive" based OISC, and my early attempts to work out the corresponding matrix for that interpreter.

Back then I had manually generated a 2 X 2 matrix for that self-interpreter, although I knew this was only an approximation. For one thing I'd ignored the fact that the number of instructions executed by the interpreter when interpreting a "halt" (i.e. when jumping to a negative address) was different from the other cases.

When I finished my last post I thought that an 8 x 8 version matrix instead would produce "exact" results. So I tried to find that matrix, going through my usual pattern of taking shortcuts, making mistakes, repeating those mistakes, confusing myself, and generally doing everything the hard way, but still slowly making progress! Along the way I also realised that there were some straightforward improvements that could be made to that first version interpreter, and so I made those changes. Therefore this post is based on the improved version and its corresponding matrix rather than the original. After all, this whole thing started with me wondering about the lowest possible eigenratios, and as the updated OISC self-interpreter turns out to have an eigenratio of 34.159614... rather than 39.06.... it seemed more appropriate to use that. Hmmm... for just a moment there I thought the new eigenratio was exactly 10 times pi!

Here's my source code for the improved self-interpreter:


# Self interpreter for OISC ("One Instruction Set Computer") using "subleq".
#
# This uses "subleq" as the "one instruction" where A, B and C are memory
# addresses, and in the explanations and comments [A] means the integer value
# stored at address A. Memory is a zero based array of integers of some
# suitable "width". Signed 32 bit values are perfectly adequate.
#
# A single "subleq" instruction written like this
#
# subleq A B C
#
# and is stored in memory as three addresses A B and C in consecutive
# locations. The following pseudocode describes what happens when one
# instruction is executed.
#
# [B] = [B] - [A]
# if [B] <= 0 goto C
#
# Note that value of C is taken to be the value before the subtraction.
#
# Omitting the C operand (i.e. writing just "subleq A B") is understood to be
# shorthand for specifying C as the address of the next instruction. When the
# program starts running, the program counter is 0. It halts when a branch
# to a negative address happens.
#
# This self-interpreter expects the program that it should run to appear
# directly after its own code/data in memory.
#
# Written by Clive Gifford, 29/30 August 2006 based on the Wikipedia
# description of a OISC using the "subleq" instruction.
# 3 Sep 2006: did some "constant folding", removing a few instructions.
# 8 Sep 2006, a few more changes resulting in smaller memory footprint.

start:

# [A1] = [PC]

subleq A1 A1
subleq PC ZERO
subleq ZERO A1
subleq ZERO ZERO

# Code below (after modification from above) is equivalent to [A] = [[PC]]

subleq A A
A1: data 0
data ZERO
data A2
A2: subleq ZERO A
subleq ZERO ZERO

# [A] = [A] + [LEN]

subleq NEGL A

# [PC] = [PC] + 1

subleq NEG1 PC

# [B1] = [PC]

subleq B1 B1
subleq PC ZERO
subleq ZERO B1
subleq ZERO ZERO


# Code below (after modification from above) is equivalent to [B] = [[PC] + 1]

subleq B B
B1: data 0
data ZERO
data B2
B2: subleq ZERO B
subleq ZERO ZERO

# [B] = [B] + [LEN]

subleq NEGL B

# We have to copy [C] now, just in case the emulated subtraction modifies [[PC] + 2].

# [PC] = [PC] + 1

subleq NEG1 PC

# [C1] = [PC]

subleq C1 C1
subleq PC ZERO
subleq ZERO C1
subleq ZERO ZERO

# Code below (after modification from above) is equivalent to [C] = [[PC] + 2]

subleq C C
C1: data 0
data ZERO
data C2
C2: subleq ZERO C
subleq ZERO ZERO

# [[B]] = [[B]] - [[A]];
# if [[B]] <= 0 goto LEQZ

# Earlier code referring to addresses A and B has modified the next
# couple of words to create the equivalent of "subleq [A] [B] LEQZ"
# This is where we're "emulating" the subtraction...

A: data 0
B: data 0
data LEQZ

# [PC] += 1

subleq NEG1 PC
subleq ZERO ZERO START

# We come to address LEQZ when the emulated subleq is going to take
# the branch to the emulated address C.

LEQZ:

# First save the "raw" new PC
# [PC] = [C]

subleq PC PC
subleq C ZERO
subleq ZERO PC
subleq ZERO ZERO

# Check if [C] is less than zero. Halt if it is.
# We don't care about changing [C] as we've already copied the value to [PC] above.

subleq NEG1 C -1

# [PC] = [PC] + [LEN]

subleq NEGL PC

# Go back to the start of the loop ready for the next instruction

subleq ZERO ZERO START

NEGL: data -119 # make this == -PROG (my assembler is too primitive!)
NEG1: data -1
ZERO: data 0

C: data 0
PC: data PROG

# Code for the program to be interpreted must start at the PROG address...
PROG:

That is "assembled" into the following values in memory:


15 15 3
118 116 6
116 15 9
116 116 12
84 84 15
0 116 18
116 84 21
116 116 24
114 84 27
115 118 30
45 45 33
118 116 36
116 45 39
116 116 42
85 85 45
0 116 48
116 85 51
116 116 54
114 85 57
115 118 60
75 75 63
118 116 66
116 75 69
116 116 72
117 117 75
0 116 78
116 117 81
116 116 84
0 0 93
115 118 90
116 116 0
118 118 96
117 116 99
116 118 102
116 116 105
115 117 -1
114 118 111
116 116 0
-119
-1
0
0
119

Here's the full matrix for this self-interpreter:

| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 8 10 9 8 10 9 10 11 11 9 11 10 10 11 11 9 11 10 11 12 12 10 12 11 |
| 1 1 0 1 1 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 1 1 0 |
| 1 1 0 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 |
| 26 24 24 26 24 24 20 19 19 25 23 23 20 19 19 25 23 23 19 18 18 24 22 22 |
| 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 |

This is a 24 x 24 matrix. The first 18 rows are all zeroes because those rows correspond to "variations" of the subleq "instruction" that aren't used at all in the interpreter itself. Therefore you can get the correct eigneratio by considering only the lower right 6 x 6 sub-matrix. In fact this smaller matrix was the first one that I found that also had a dominant eigenvalue matching the observed eigenration for the interpreter. It was only afterwards that I realised further variations had to be included in order to be able to use the matrix to calculate the exact number of instructions executed for arbitrarily tall towers of this self-interpreter.

Consider a single "subleq" instruction, "subleq A B C", where A B and C are addresses, and [A] represents the value stored in the memory at address A, etc. Then the rows and columns in the above matrix correspond to combinations of the following variations:
  1. A == 0 versus A > 0
  2. B == 0 versus B > 0
  3. [A] < [B] versus [A] >= [B]
  4. C == 0 versus C > 0 versus C < 0
The first 12 rows/columns for for variations where 'A == 0'. The second 12 are for when 'A > 0'. Within each of these blocks the first six are for when 'B == 0' and the second six are for when 'B > 0'. Within each of the six by six blocks, the first three rows/columns cover cases where '[A] < [B]' (meaning the branch is not taken) and the second three cover cases where '[A] >= [B]' (the branch will be taken). And finally, within each three by three block that have drilled down to, the first is for when 'C == 0', the second for when 'C > 0', and the third for when 'C < 0' (execution will stop if the branch is taken).

Within each pass through the main loop of the interpreter, each instruction executed can itself branch (even if only to the next instruction) or not (just fall through). To fill in the matrix we need to be able to predict which "variation" applies to each instruction in the interpreter. Doing this manually is tedious and error prone (as I discovered!) so I eventually ended up writing some Python tools to help me. These first helped identify which "variations" in the "input" instruction - the one being emulated - actually affected whether each individual instruction in the interpreter would branch or not, and then once those had been identified I used another tool to generate the matrix by emulating a single pass through the self-interpreter's main loop (it is just one big loop) for each such "variation" that could be emulated, counting the number of times each "variation" was executed, and thus populating the matrix.

Okay. I think I've just about had enough of OISC machines for a while. But while playing with OISC I've had some thoughts about how these "variations" can be generalised and how we might also be able to apply the idea of eigeratios to high level languages. Might try to describe that next time. Although I'd also like to see what kind of eigenratio a Universal Turing Machine might have...