Tuesday, August 29, 2006

How to build the full matrix, and why we don't really need to!

In "What's inside um.um" I described how I couldn't (at that time) see an easy way to handle some conditional branching and loops inside the um.um code responsible for emulating certain instructions. This meant that the matrix I eventually built had some "simplifications" (even though I also hoped/suspected they weren't significant in terms of calculating the correct eigenratio for um.um).

But after revisiting the idea of splitting these more complex instructions into multiple columns in the matrix, I've looked at um.um again (and in more detail), and splitting opcodes 1 and 2 into separate columns in the matrix turns out to be quite easy after all. In fact, the values in the columns of the expanded matrix are essentially the same for all four cases (two opcodes, and two variations for each) although those instructions are executed in different orders and with different operands as you'd expect.

For example, if we look at opcode 1, there are two execution paths in um.um, one for when the emulated register B is zero, and the other for when it is not. The sections of code in um.um to emulate opcode 1 for those two cases are also quite similar to each other but one ends up executing opcode 1 with register B equal to zero, and the other executes opcode 1 with register B equal to the value from the emulated register B. In other words, um.um is not using anything other than array 0 for its "own purposes" directly, but if the emulated program allocates new arrays and accesses them, then the um.um code will eventually "mirror" the appropriate instructions, essentially delegating the final handling of the specific task to the next layer down.

The other "tricky" case was emulating opcode 12 when register B is not zero (which means you are loading a new program to replace the current one in array 0). Um.um is essentially also using its own array 0 as storage for the emulated array 0. Therefore it emulates opcode 12 (with non-zero register B) by creating a new array big enough for a copy of its own code and data (a total of 256 platters) plus a copy of the new program that the emulated program is loading. It copies itself into the first 256 "platters" and then copies the array being "loaded" in the emulated program into memory following that. The main point is that a variable amount of data has to be copied which means that some sections of code in um.um are executed a variable number of times in a loop. Actually, there are two separate loops, one to copy um.um itself (fixed size of 256 platters) and another to copy the array that is being "loaded" in the "next level up". These loops are controlled by a count, and we can quite easily determine the correct values for the number of times each type of instructions will be executed in our current level, for any particular size of array that is being "loaded" by the next level up. So, by using one column for each possible size of the array being loaded we can now describe exactly how to construct the full version, even though this will generate a very large matrix. And once we have the full matrix then we can then determine the dominant eigenvalue (i.e., eigenratio) as before.

Note that there might still be some cases where this kind of approach isn't possible. However um.um turns out to be another case of a self-interpreter that can be also described precisely as a matrix, even though also requiring a more complicated approach than is possible with simpler instruction sets and self-interpreters. It seems reasonable to me to expect that most self-interpreters written in a "machine language" should be able to be handled in a similar way.

Anyway, having outlined how we can build a full matrix for um.um, I'm now going to to attempt to describe some row and column properties we might find in a matrix corresponding to a specific self-interpreter. When these patterns occur, I believe the associated rows and columns don't affect the dominant eigenvalue, and therefore don't affect the eigenratio of that self-interpreter. In other words, in particular circumstances we can eliminate some rows and columns without affecting the dominant eigenvalue. Hopefully, that will help explain why I am confident the simplified matrix for um.um (presented previously) gives the same eigenratio as we would get from the much larger "full matrix" that we could construct by splitting the troublesome instructions into multiple variations each with their own rows and columns. (If we wanted to go to all that trouble, which I don't!)

If one row of our matrix is all zeros, then that row and the corresponding column can be removed without affecting the eigenratio. This would mean our interpreter is not using the corresponding instruction/variation in any way at all. This situation is possible (perhaps even likely) if the instruction in question can be recognised and emulated only using instructions from a completely different subset of the language.

A column full of zeroes doesn't really make any sense because even if the instruction set includes some kind of NOP, we still have to recognise it, and that must require something be executed before we "emulate it" - presumably by basically doing nothing at all! But in any case, a column filled with zeroes and its corresponding row can also be removed without changing the value of the eigenratio.

One way to see why rows or columns that are filled with zeroes can't contribute to the dominant eigenvalue for the matrix (at least for the specific kinds of matrices that we are dealing with) is to recognise that when we apply the Power Method we are multiplying the matrix by itself over and over again (in effect) and in that case these zero filled rowed or columns never change because of the way matrix multiplication works - they'll always stay full of zeroes.

A variation on this is a row that is all zeroes except for a non zero value in the leading diagonal position. That would correspond to an instruction that isn't used anywhere except in the code that is responsible for emulating its own behaviour - and presumably the common situation in that case would be that the interpreter is delegating the emulation to the next level down (i.e. the machine CPU or another interpreter). In that case there would be a 1 in the leading diagonal and we'd have some other non zero values elsewhere in that column (because we still have to "recognise" the instruction). But once again, if the value in the leading diagonal is 1 and we have zeroes everywhere else across that row (or column), then when the matrix is multiplied by itself some number of times (corresponding to a tower of interpreters of that height) the values in that row are preserved. In a diagonal matrix the values along the diagonal are in fact also the eigenvalues. However our dominant eigenvalue must be greater than one and a row/column filled with zeroes except for a one in the leading diagonal won't change the value of the dominant eigenvalue which must come from some other part of the matrix in that case.

This happens in the "full" um.um matrix for all cases of opcode 12 with register B not equal to zero. In all those cases, um.um handles the emulation using various other instructions plus one execution of opcode 12 with register B non-zero. And opcode 12 (with a non -zero value in register B) isn't executed anywhere else in the um.um code. A similar situation applies for opcodes 1 and 2 when register B (or A as approporiate) is non-zero.

Removing a row (and corresponding column) that is all zeroes except (possibly) for a 1 in the leading diagonal also won't reduce the maximum eigenvalue for that matrix if it is greater than 1 in terms of the remaining rows and columns. That is true for the matrices we are using and so the bottom line is that although it seems we have to add a very large number of rows and columns to construct the "full" matrix for um.um, in the end all these additions won't change the magnitude of the largest eigenvalue for that matrix. In other words, the correct eigenratio for um.um can be obtained from the smaller simplified matrix.

Next project: write a self-interpreter for a OISC (using the "subleq" instruction), obtain its eigenratio from a matrix with one column per "variation" that the "subleq" has, and then finally check that the calculated eigenratio matches real-world "observations"!