A bytecode interpreter is basically a loop and a lookup table. The interpreter starts at the beginning of an array of bytes. Each byte in this array is an index into the lookup table. Each entry in the table is a piece of code that is executed when its byte is encountered. Code for that may look like this:
typedef unsigned char byte;
// the simplest program
byte program[] = {
    0, // bytecode 0: looked up as lookup_table[0]
};
byte* ip = program; // instruction pointer
while ((ip = (lookup_table[*ip])(ip)) != 0) {
    // does nothing else, but could
}
The expression in the while loop is a bit complex, but we can divide it into parts. First there is ip. This is the instruction pointer. It points to the instruction that needs to be executed.
The lookup_table is an array with the functions that correspond to the bytecodes. The typedef for that looks like this:
typedef byte* (*bytecode_function)(byte* ip);
byte* func_end(byte* ip) {
    return 0;
}
bytecode_function lookup_table[] = {
    func_end,
};
The typedef and the array together define the lookup_table. The table is meant to have 256 entries, one for each value a byte can hold; the listing above fills in only the first. A bytecode_function returns the next ip. This way a function can change the ip and jump to other places. If the function returns a NULL pointer, the loop ends.
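As a minimal sketch of a table with all 256 entries filled in, the code below gives every byte value a handler, so an unknown bytecode stops the loop instead of calling through an empty entry. The names op_nop, op_invalid, init_table and run, and the opcode numbers, are assumptions made for this illustration and are not part of the original listing.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;
typedef byte* (*bytecode_function)(byte* ip);

/* Hypothetical opcode numbering, chosen only for this sketch. */
enum { OP_END = 0, OP_NOP = 1 };

static byte* op_end(byte* ip)     { (void)ip; return NULL; } /* stop the loop */
static byte* op_nop(byte* ip)     { return ip + 1; }         /* continue with the next byte */
static byte* op_invalid(byte* ip) { (void)ip; return NULL; } /* unknown bytecode: stop */

/* One entry for every value a byte can hold. */
static bytecode_function lookup_table[256];

static void init_table(void) {
    for (int i = 0; i < 256; i++) {
        lookup_table[i] = op_invalid;
    }
    lookup_table[OP_END] = op_end;
    lookup_table[OP_NOP] = op_nop;
}

/* Runs a program and returns the number of instructions dispatched. */
static int run(byte* program) {
    int steps = 0;
    assert(program != NULL);
    init_table();
    for (byte* ip = program; ip != NULL; ip = lookup_table[*ip](ip)) {
        steps++;
    }
    return steps;
}
```

Calling run on a program made of two OP_NOP bytes followed by OP_END dispatches three instructions. Filling unused slots with a stopping handler is only one possible choice; a real interpreter might report an error there instead.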
The instruction pointer ip is dereferenced to give the current byte at that place in the program. This byte is the bytecode that is used to look up the function that needs to be executed.
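To make the parts easier to see, the sketch below spells out one pass through the loop as separate statements: read the byte, look up the function, call it. The helper names step and run are assumptions introduced here; the behaviour is intended to match the compact while expression, using the one-entry table from the listing.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;
typedef byte* (*bytecode_function)(byte* ip);

static byte* func_end(byte* ip) { (void)ip; return NULL; }

/* Only bytecode 0 is defined in this sketch. */
static bytecode_function lookup_table[] = { func_end };

/* One iteration of the interpreter loop, written out step by step. */
static byte* step(byte* ip) {
    byte opcode = *ip;                                /* current byte of the program */
    bytecode_function handler = lookup_table[opcode]; /* function for that byte */
    return handler(ip);                               /* next ip, or NULL to stop */
}

/* Repeats step() until a function returns NULL; returns the step count. */
static int run(byte* program) {
    int count = 0;
    assert(program != NULL);
    for (byte* ip = program; ip != NULL; ip = step(ip)) {
        count++;
    }
    return count;
}
```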
The last part is the ip argument to the function. This argument lets the function look at the bytes around the current instruction. The bytes that follow the bytecode serve as the arguments to the function.
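A small sketch of functions that read their arguments through ip could look like the following. The opcodes OP_ADD and OP_JUMP, the accumulator variable, and the convention that a jump target is an offset from the start of the program are all assumptions made for this example, not something the original code defines.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;
typedef byte* (*bytecode_function)(byte* ip);

/* Hypothetical opcodes for this sketch. */
enum { OP_END = 0, OP_ADD = 1, OP_JUMP = 2 };

static int accumulator;     /* state modified by OP_ADD */
static byte* program_start; /* base address used by OP_JUMP */

static byte* op_end(byte* ip) { (void)ip; return NULL; }

/* ADD n: the byte after the opcode is the value to add. */
static byte* op_add(byte* ip) {
    accumulator += ip[1];
    return ip + 2; /* skip the opcode and its one argument byte */
}

/* JUMP t: the byte after the opcode is an offset from the program start. */
static byte* op_jump(byte* ip) {
    return program_start + ip[1];
}

/* Only bytecodes 0..2 are valid here; there is no bounds checking. */
static bytecode_function lookup_table[] = { op_end, op_add, op_jump };

/* Runs the program and returns the final accumulator value. */
static int run(byte* program) {
    byte* ip = program;
    assert(program != NULL);
    program_start = program;
    accumulator = 0;
    while ((ip = lookup_table[*ip](ip)) != NULL) {
        /* nothing else to do */
    }
    return accumulator;
}
```

For the program { OP_ADD, 5, OP_JUMP, 6, OP_ADD, 100, OP_ADD, 7, OP_END }, the jump skips the ADD 100 instruction at offset 4, so run returns 12.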