The Lua Virtual Machine [Part 1]

The implementation of the Lua virtual machine. This is an advanced topic. Due to text constraints, this tutorial is split into multiple parts.

by MemoryAddress0

A Lua Learning Special

ESTIMATED READ TIME: 15 minutes

Introduction


We'll be talking about the Lua 5.1 VM. Roblox's Luau is derived from Lua 5.1, and its VM is very similar in design.

Lua is a powerful and lightweight programming language that is often used as an embeddable scripting engine in applications. It is known for its simplicity, flexibility, and ease of use.

One of the key components of the Lua programming language is its virtual machine (VM). The Lua VM is responsible for executing Lua code, managing memory, and handling interactions with the host application. In this tutorial, we will dive into the inner workings of the Lua 5.1 VM, including its opcodes, data types, and execution model.

The Lua 5.1 VM is register-based. Each running function gets a block of slots on the stack, called registers, which hold its local variables and temporary values during execution. The VM also keeps a program counter (PC), which tracks where execution currently is in the bytecode.
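To make the register and program counter model concrete, here is a minimal sketch of a fetch-and-execute loop. It is written in Python for brevity, not taken from the real VM (which is written in C), and it assumes instructions are plain tuples instead of packed 32-bit words. The function name `run` and the one-value RETURN are simplifications made for this example.

```python
# Hypothetical, simplified model: instructions are (name, A, B) tuples.
def run(code, constants, num_registers):
    registers = [None] * num_registers  # this function's slots on the stack
    pc = 0                              # index of the next instruction
    while pc < len(code):
        op, a, b = code[pc]
        pc += 1                         # the PC advances before the instruction runs
        if op == "LOADK":               # R(A) := constant at index B
            registers[a] = constants[b]
        elif op == "MOVE":              # R(A) := R(B)
            registers[a] = registers[b]
        elif op == "RETURN":            # simplified: return the single value in R(A)
            return registers[a]
        else:
            raise ValueError(f"unknown opcode {op}")
    return None

result = run([("LOADK", 0, 0), ("MOVE", 1, 0), ("RETURN", 1, 0)], [42], 2)
```

Here `result` ends up holding the constant 42 after it has passed through two registers.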

Instructions in the Lua 5.1 Virtual Machine are normally produced by a compiler, which converts Lua source code into the bytecode format used by the virtual machine. This tutorial does not cover the compiler, as it is not necessary for understanding how the virtual machine works. Knowing what each instruction does will help you understand how the virtual machine executes your code and how to write code that runs efficiently on it.
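Each Lua 5.1 instruction is a 32-bit word that packs an opcode together with the operand fields (A, B, C, Bx, sBx) used throughout the opcode descriptions in this tutorial. The sketch below, in Python, decodes those fields using the standard Lua 5.1 layout: 6 bits of opcode, 8 bits for A, 9 bits for C, and 9 bits for B, with Bx covering the B and C bits together. The helper names `decode` and `encode_abc` are made up for this example.

```python
# Field layout of a Lua 5.1 instruction, starting from the least significant bit:
# opcode (6 bits), A (8 bits), C (9 bits), B (9 bits). Bx spans C and B (18 bits).
MAXARG_SBX = (1 << 17) - 1  # 131071, the bias used to store the signed sBx field

def decode(instruction):
    bx = (instruction >> 14) & 0x3FFFF
    return {
        "opcode": instruction & 0x3F,
        "A": (instruction >> 6) & 0xFF,
        "C": (instruction >> 14) & 0x1FF,
        "B": (instruction >> 23) & 0x1FF,
        "Bx": bx,
        "sBx": bx - MAXARG_SBX,
    }

def encode_abc(opcode, a, b, c):
    # Hypothetical helper that packs an A/B/C style instruction.
    return opcode | (a << 6) | (c << 14) | (b << 23)

word = encode_abc(0, 1, 2, 0)  # opcode 0 is MOVE in Lua 5.1, so this is "MOVE 1 2"
fields = decode(word)
```

An instruction uses either the separate B and C fields or the combined Bx/sBx field, never both, so only some entries in `fields` are meaningful for a given opcode.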

Opcodes


[Image from Lua 5.1 VM Instructions by Kein-Hong Man]

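The Lua 5.1 VM has a set of opcodes that are used to perform various operations. In the descriptions below, A, B, C, Bx, and sBx are operand fields stored inside the instruction (the text calls them registers for short), and a "location on the stack" is a register of the current function. These opcodes include: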

MOVE This opcode is used to move a value from one location on the stack to another. The A register specifies the destination location, and the B register specifies the source location. For example, "MOVE A B" would mean "move the value at location B on the stack to location A on the stack."

LOADK This opcode is used to load a constant value onto the stack. The A register specifies the destination location on the stack, and the Bx register specifies the index of the constant value to load. For example, "LOADK A Bx" would mean "load the constant value at index Bx onto the stack at location A."

LOADBOOL This opcode is used to load a boolean value onto the stack. The A register specifies the destination location on the stack, the B register specifies the boolean value (1 for true, 0 for false), and the C register specifies whether or not to skip the next instruction. For example, "LOADBOOL A B C" would mean "load the boolean value B onto the stack at location A, and if C is 1, skip the next instruction."

LOADNIL This opcode is used to load nil values onto a range of locations on the stack. The A register specifies the first location to be set to nil, and the B register specifies the last location to be set to nil. For example, "LOADNIL A B" would mean "set the values at locations A through B on the stack to nil."
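As a rough illustration of LOADBOOL's optional skip and LOADNIL's inclusive range, here is a small Python sketch. The list `regs` stands in for the function's registers, `None` stands in for nil, and `pc` is assumed to already point at the following instruction, as it does in the real VM when an instruction executes.

```python
def loadbool(registers, pc, a, b, c):
    """R(A) := (bool) B; if C is nonzero, skip the next instruction."""
    registers[a] = (b != 0)
    return pc + 1 if c != 0 else pc

def loadnil(registers, a, b):
    """R(A) through R(B), inclusive, are set to nil (None here)."""
    for i in range(a, b + 1):
        registers[i] = None

regs = [10, 20, 30, 40]
next_pc = loadbool(regs, 5, 0, 1, 1)  # R0 = true, and skip one instruction
loadnil(regs, 1, 2)                   # R1 = R2 = nil
```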

GETUPVAL This opcode is used to retrieve the value of an upvalue (a variable that is defined in an enclosing function) and push it onto the stack. The A register specifies the destination location on the stack, and the B register specifies the index of the upvalue to retrieve. For example, "GETUPVAL A B" would mean "push the value of the upvalue at index B onto the stack at location A."

GETGLOBAL This opcode is used to retrieve the value of a global variable and store it on the stack. The A register specifies the destination location on the stack, and the Bx register specifies the index in the constant table of the global variable's name. For example, "GETGLOBAL A Bx" would mean "look up the global variable whose name is the constant at index Bx and store its value on the stack at location A."

GETTABLE This opcode is used to retrieve the value of a table element and store it on the stack. The A register specifies the destination location on the stack, the B register specifies the location of the table on the stack, and the C register specifies the key, which is either a location on the stack or an index into the constant table. For example, "GETTABLE A B C" would mean "store the value of the element with key C in the table at location B onto the stack at location A."

SETGLOBAL This opcode is used to set the value of a global variable. The A register specifies the location on the stack containing the value to be set, and the Bx register specifies the index in the constant table of the global variable's name. For example, "SETGLOBAL A Bx" would mean "set the global variable whose name is the constant at index Bx to the value at location A on the stack."

SETUPVAL This opcode is used to set the value of an upvalue. The A register specifies the location on the stack containing the value to be set, and the B register specifies the index of the upvalue. For example, "SETUPVAL A B" would mean "set the value of the upvalue at index B to the value at location A on the stack."
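The following Python sketch shows why upvalues are accessed through an index instead of a stack location: several closures can share one captured variable. The `UpValue` class and the `increment` function are hypothetical helpers for this example and greatly simplify how the real VM stores upvalues.

```python
class UpValue:
    """A shared cell holding one captured variable (hypothetical helper)."""
    def __init__(self, value):
        self.value = value

def getupval(registers, upvalues, a, b):
    registers[a] = upvalues[b].value      # R(A) := UpValue[B]

def setupval(registers, upvalues, a, b):
    upvalues[b].value = registers[a]      # UpValue[B] := R(A)

def increment(upvalues):
    # Roughly what `count = count + 1` does when count is an upvalue.
    registers = [None]
    getupval(registers, upvalues, 0, 0)
    registers[0] += 1
    setupval(registers, upvalues, 0, 0)

shared = UpValue(0)
closure_one = [shared]   # two closures that captured the same variable
closure_two = [shared]
increment(closure_one)
increment(closure_two)
```

Both closures update the same cell, so `shared.value` is 2 afterwards.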

SETTABLE This opcode is used to set the value of a table element. The A register specifies the location of the table on the stack, the B register specifies the key of the element to set, and the C register specifies the value to set the element to. Both B and C can refer to either a location on the stack or an index into the constant table. For example, "SETTABLE A B C" would mean "set the element with key B in the table at location A on the stack to the value C."
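In Lua 5.1, some operands (such as the key in GETTABLE and the key and value in SETTABLE) can name either a register or a constant; operand values of 256 and above select a constant. The Python sketch below models that rule along with GETGLOBAL, whose Bx operand points at the constant holding the global's name. Dictionaries stand in for Lua tables, and the helper name `rk` is chosen for this example.

```python
BITRK = 256  # in Lua 5.1, operand values >= 256 refer to constants

def rk(registers, constants, x):
    """Resolve an operand that is a register index, or a constant index plus 256."""
    return constants[x - BITRK] if x >= BITRK else registers[x]

def getglobal(registers, constants, globals_table, a, bx):
    # The constant at Bx holds the global's name; the lookup uses that name.
    registers[a] = globals_table.get(constants[bx])

def gettable(registers, constants, a, b, c):
    registers[a] = registers[b].get(rk(registers, constants, c))

def settable(registers, constants, a, b, c):
    registers[a][rk(registers, constants, b)] = rk(registers, constants, c)

constants = ["config", "speed", 16]
globals_table = {"config": {}}
registers = [None, None]

getglobal(registers, constants, globals_table, 0, 0)     # R0 = config
settable(registers, constants, 0, BITRK + 1, BITRK + 2)  # R0["speed"] = 16
gettable(registers, constants, 1, 0, BITRK + 1)          # R1 = R0["speed"]
```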

NEWTABLE This opcode is used to create a new table and push it onto the stack. The A register specifies the destination location on the stack, the B register specifies the size of the array part of the table, and the C register specifies the size of the hash part of the table. For example, "NEWTABLE A B C" would mean "create a new table with an array part of size B and a hash part of size C, and push it onto the stack at location A."

SELF This opcode is used to prepare a method call such as obj:method(). Like GETTABLE, it retrieves the value of a table element, but it also copies the table itself so that it can be passed as the first argument. The A register specifies the destination location on the stack for the element value, the B register specifies the location of the table on the stack, and the C register specifies the key of the element to retrieve (a location on the stack or a constant). For example, "SELF A B C" would mean "copy the table at location B to location A+1, then store the value of the element with key C in that table onto the stack at location A."
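Here is a small Python sketch of SELF using the Lua 5.1 rule R(A+1) := R(B); R(A) := R(B)[key]. For brevity the key is passed directly instead of being resolved from a register or constant, and the names `op_self`, `greet`, and `player` are invented for this example.

```python
def op_self(registers, a, b, key):
    """R(A+1) := R(B); R(A) := R(B)[key]."""
    table = registers[b]
    registers[a + 1] = table    # the object becomes the implicit first argument
    registers[a] = table[key]   # the method sits just below it, ready to be called

def greet(obj):
    return "hello from " + obj["name"]

player = {"name": "Ada", "greet": greet}
registers = [player, None, None]
op_self(registers, 1, 0, "greet")     # prepares the call player:greet()
message = registers[1](registers[2])  # what a following CALL would do
```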

ADD This opcode is used to add two values and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B and C registers specify the values to be added, each of which is either a location on the stack or a constant. For example, "ADD A B C" would mean "add the values B and C and store the result on the stack at location A."

SUB This opcode is used to subtract one value from another and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B and C registers specify the values to be subtracted, each of which is either a location on the stack or a constant. For example, "SUB A B C" would mean "subtract the value C from the value B and store the result on the stack at location A."

MUL This opcode is used to multiply two values and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B and C registers specify the values to be multiplied, each of which is either a location on the stack or a constant. For example, "MUL A B C" would mean "multiply the values B and C and store the result on the stack at location A."

DIV This opcode is used to divide one value by another and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B and C registers specify the values to be divided, each of which is either a location on the stack or a constant. For example, "DIV A B C" would mean "divide the value B by the value C and store the result on the stack at location A."

POW This opcode is used to raise one value to the power of another and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B and C registers specify the values to be used in the power operation, each of which is either a location on the stack or a constant. For example, "POW A B C" would mean "raise the value B to the power of the value C and store the result on the stack at location A."

UNM This opcode is used to negate a number and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B register specifies the location of the value to be negated. For example, "UNM A B" would mean "negate the value at location B on the stack and store the result on the stack at location A."

NOT This opcode is used to compute the logical negation of a value and store the result on the stack. The value does not need to be a boolean: nil and false produce true, and every other value produces false. The A register specifies the destination location on the stack for the result, and the B register specifies the location of the value to be negated. For example, "NOT A B" would mean "logically negate the value at location B on the stack and store the resulting boolean on the stack at location A."
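The arithmetic and unary opcodes all follow the same three-address pattern: read the operands, apply one operation, write the result to location A. The Python sketch below models that pattern, assuming the Lua 5.1 convention that the B and C operands of the binary opcodes name a constant when their value is 256 or more. Metamethods and error handling are left out, and all helper names are made up for this example.

```python
import operator

ARITH = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,  # Lua 5.1 numbers are doubles by default
    "POW": operator.pow,
}

def arith(registers, constants, op, a, b, c):
    """R(A) := RK(B) <op> RK(C)"""
    def rk(x):
        return constants[x - 256] if x >= 256 else registers[x]
    registers[a] = ARITH[op](rk(b), rk(c))

def unm(registers, a, b):
    registers[a] = -registers[b]  # R(A) := -R(B)

def op_not(registers, a, b):
    # Only nil (None here) and false are falsy in Lua; 0 and "" are truthy.
    registers[a] = registers[b] is None or registers[b] is False

# local x = 7; local y = (x - 2) * 3
registers = [7.0, None]
constants = [2.0, 3.0]
arith(registers, constants, "SUB", 1, 0, 256)  # R1 = R0 - K0
arith(registers, constants, "MUL", 1, 1, 257)  # R1 = R1 * K1

flags = [0, None]
op_not(flags, 1, 0)  # `not 0` is false in Lua
```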

CONCAT This opcode is used to concatenate a range of values on the stack and store the result on the stack. The A register specifies the destination location on the stack for the result, and the B and C registers specify the first and last locations of the range of values to be concatenated. For example, "CONCAT A B C" would mean "concatenate the values at locations B through C on the stack and store the result on the stack at location A."

JMP This opcode is used to jump to a specific location in the bytecode. The sBx register specifies the number of instructions to skip, with a positive value indicating a forward jump and a negative value indicating a backward jump. For example, "JMP sBx" would mean "jump sBx instructions forward or backward in the bytecode."

EQ This opcode is used to compare two values for equality and conditionally skip the next instruction. It does not store a result on the stack. The A register specifies the expected outcome (0 for false, 1 for true), and the B and C registers specify the values to be compared, each of which is either a location on the stack or a constant. For example, "EQ A B C" would mean "compare the values B and C, and if the result of the equality test differs from A, skip the next instruction." The next instruction is normally a JMP.

LT This opcode is used to test whether one value is less than another and conditionally skip the next instruction. It does not store a result on the stack. The A register specifies the expected outcome (0 for false, 1 for true), and the B and C registers specify the values to be compared, each of which is either a location on the stack or a constant. For example, "LT A B C" would mean "test whether the value B is less than the value C, and if the result differs from A, skip the next instruction."

LE This opcode is used to test whether one value is less than or equal to another and conditionally skip the next instruction. It does not store a result on the stack. The A register specifies the expected outcome (0 for false, 1 for true), and the B and C registers specify the values to be compared, each of which is either a location on the stack or a constant. For example, "LE A B C" would mean "test whether the value B is less than or equal to the value C, and if the result differs from A, skip the next instruction."
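In Lua 5.1 the comparison opcodes follow the rule "if ((RK(B) op RK(C)) ~= A) then pc++": they produce no value and instead decide whether the instruction after them (normally a JMP) runs. The Python sketch below models only that skip decision. The operand values are passed in directly, `pc` is assumed to already point at the following JMP, and the function name `compare` is invented for this example.

```python
import operator

COMPARE = {"EQ": operator.eq, "LT": operator.lt, "LE": operator.le}

def compare(op, pc, a, lhs, rhs):
    """if ((lhs <op> rhs) ~= A) then pc++; returns the updated pc."""
    result = COMPARE[op](lhs, rhs)
    return pc + 1 if result != bool(a) else pc

# `if x < y then <body> end` is typically compiled along these lines:
#   LT  0 x y        ; when x < y holds, skip the JMP and fall into the body
#   JMP <past body>
#   <body>
pc_at_jmp = 1
enters_body = compare("LT", pc_at_jmp, 0, 3, 5)  # 3 < 5 is true, differs from A=0
takes_jump = compare("LT", pc_at_jmp, 0, 5, 3)   # 5 < 3 is false, matches A=0
```

With these inputs `enters_body` is 2 (the JMP is skipped) and `takes_jump` is 1 (the JMP runs and the body is bypassed).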

TEST This opcode is used to test the truthiness of a value and conditionally skip the next instruction. In Lua 5.1 it takes two operands: the A register specifies the location on the stack of the value to be tested, and the C register specifies the expected truthiness (0 for false, 1 for true). For example, "TEST A C" would mean "if the truthiness of the value at location A on the stack differs from C, skip the next instruction." The related TESTSET opcode also copies the value: "TESTSET A B C" would mean "if the truthiness of the value at location B matches C, copy that value to location A. Otherwise, skip the next instruction."

CALL This opcode is used to call a function and store its results on the stack. The A register specifies the location of the function on the stack, with its arguments in the locations that follow. The B register specifies the number of arguments plus one, and the C register specifies the number of results plus one. A value of 0 for B means "all values up to the top of the stack," and a value of 0 for C means "keep however many results the function returns." For example, "CALL A B C" would mean "call the function at location A on the stack with the B-1 arguments at locations A+1 through A+B-1, and store C-1 results on the stack starting at location A."

TAILCALL This opcode is used to call a function and return its results in place of the current function. The A register specifies the location on the stack of the function to be called, and the B register specifies the number of arguments plus one, as with CALL. The C register is always 0, because all of the results are returned. For example, "TAILCALL A B C" would mean "call the function at location A on the stack with B-1 arguments and return all of its results in place of the current function."

RETURN This opcode is used to return from a function and pass its results back to the caller. The A register specifies the location on the stack of the first result to be returned, and the B register specifies the number of results plus one. A value of 0 for B means "all values from location A up to the top of the stack." For example, "RETURN A B" would mean "return the B-1 values starting at location A on the stack."
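The operand counts for calls are easy to get wrong, because Lua 5.1 stores them offset by one: "CALL A B C" passes B-1 arguments and keeps C-1 results, written as R(A), ..., R(A+C-2) := R(A)(R(A+1), ..., R(A+B-1)). The Python sketch below models the fixed-count case only; the variable-count cases (B or C equal to 0) are not handled, and `divmod_fn` is a made-up function for this example.

```python
def call(registers, a, b, c):
    """R(A), ..., R(A+C-2) := R(A)(R(A+1), ..., R(A+B-1)), for B >= 1 and C >= 1."""
    func = registers[a]
    args = registers[a + 1 : a + b]                  # B-1 arguments
    results = list(func(*args))
    wanted = c - 1                                   # C-1 results
    results = (results + [None] * wanted)[:wanted]   # pad with nil or truncate
    registers[a : a + wanted] = results              # results overwrite the function slot

def divmod_fn(x, y):
    return (x // y, x % y)

registers = [divmod_fn, 17, 5, None]
call(registers, 0, 3, 3)  # B=3: two arguments, C=3: two results
```

After the call, the quotient and remainder occupy the first two registers, replacing the function and its first argument.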

FORLOOP This opcode is used to perform the looping portion of a numeric for loop. The A register specifies the location on the stack of the internal loop counter, and the sBx register specifies the jump to take if the loop should continue (normally a negative value that leads back to the start of the loop body). The values at locations A+1 and A+2 on the stack hold the loop limit and the step, and location A+3 holds the loop variable that the script sees. For example, "FORLOOP A sBx" would mean "add the step at location A+2 to the loop counter at location A on the stack, and if the counter has not passed the limit at location A+1, copy it to location A+3 and jump sBx instructions in the bytecode."
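In Lua 5.1 a numeric for loop uses four consecutive registers: the internal counter at A, the limit at A+1, the step at A+2, and the visible loop variable at A+3. The Python sketch below models FORLOOP together with its companion FORPREP, an opcode not described in the list above, which subtracts the step once before the first iteration. Jumps are replaced by a Python `while` loop, and the function names are chosen for this example.

```python
def forprep(registers, a):
    registers[a] -= registers[a + 2]     # R(A) -= R(A+2), then jump to FORLOOP

def forloop(registers, a):
    """Returns True when the loop body should run (again)."""
    registers[a] += registers[a + 2]     # R(A) += R(A+2)
    step = registers[a + 2]
    limit = registers[a + 1]
    in_range = registers[a] <= limit if step > 0 else registers[a] >= limit
    if in_range:
        registers[a + 3] = registers[a]  # copy to the visible loop variable
    return in_range

# for i = 1, 10, 3 do ... end
registers = [1, 10, 3, None]  # counter, limit, step, loop variable
seen = []
forprep(registers, 0)
while forloop(registers, 0):
    seen.append(registers[3])
```

The loop body sees the values 1, 4, 7, and 10.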

TFORLOOP This opcode is used to perform the looping portion of a generic for loop (for ... in). The A register specifies the location on the stack of the iterator function, the values at locations A+1 and A+2 on the stack hold the iterator's state and control value, and the C register specifies the number of loop variables. For example, "TFORLOOP A C" would mean "call the iterator function at location A with the values at locations A+1 and A+2 on the stack, and store C results in the loop variables at locations A+3 through A+2+C. If the first result is not nil, copy it to location A+2 and continue with the next instruction, which jumps back to the start of the loop body. Otherwise, skip that instruction and leave the loop."
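The Lua 5.1 rule for the generic for loop is R(A+3), ..., R(A+2+C) := R(A)(R(A+1), R(A+2)), followed by a nil check on R(A+3). The Python sketch below models it with the iterator function at A, its state at A+1, and its control value at A+2. `ipairs_iter` is a hypothetical stand-in for the iterator that ipairs returns, and the jump back to the loop body is replaced by a Python `while` loop.

```python
def tforloop(registers, a, c):
    """Call the iterator, store C results at A+3 onward; returns True to keep looping."""
    results = list(registers[a](registers[a + 1], registers[a + 2]))
    results = (results + [None] * c)[:c]     # pad with nil or truncate to C values
    registers[a + 3 : a + 3 + c] = results
    if registers[a + 3] is not None:
        registers[a + 2] = registers[a + 3]  # first result becomes the new control value
        return True
    return False

def ipairs_iter(seq, i):
    # Hypothetical stand-in for the ipairs iterator (1-based index).
    i += 1
    return (i, seq[i - 1]) if i <= len(seq) else (None,)

# for index, value in ipairs({"a", "b"}) do ... end
registers = [ipairs_iter, ["a", "b"], 0, None, None]
pairs_seen = []
while tforloop(registers, 0, 2):
    pairs_seen.append((registers[3], registers[4]))
```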

TFORPREP This opcode was used by Lua 5.0 to prepare a generic for loop. It is not part of the Lua 5.1 instruction set. The A register specifies the location on the stack of the value being iterated, and the sBx register specifies the number of instructions to jump to reach the loop's TFORLOOP instruction. In Lua 5.0, "TFORPREP A sBx" would mean "if the value at location A on the stack is a table, set up next as the iterator function with that table as its state, then jump sBx instructions in the bytecode." In Lua 5.1, the compiler emits a plain JMP to the TFORLOOP instruction instead.

SETLIST This opcode is used to set a batch of array elements in a table on the stack. In Lua 5.1 it takes three operands: the A register specifies the location of the table on the stack, the B register specifies the number of values to store, and the C register specifies the batch number. The values are taken from locations A+1 through A+B on the stack, and each batch holds up to FPF elements (fields per flush, which is 50 in Lua 5.1). For example, "SETLIST A B C" would mean "store the B values at locations A+1 through A+B on the stack into the table at location A, starting at element index (C-1)*FPF+1."
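Lua 5.1 encodes this instruction as "SETLIST A B C" with the rule R(A)[(C-1)*FPF + i] := R(A+i) for 1 <= i <= B, where FPF is 50. The Python sketch below models the common case in which B and C are both at least 1; the special encodings that use 0 for B or C are not handled. A dictionary with integer keys stands in for the Lua table.

```python
FPF = 50  # fields per flush (LFIELDS_PER_FLUSH in the Lua 5.1 source)

def setlist(registers, a, b, c):
    """R(A)[(C-1)*FPF + i] := R(A+i), for 1 <= i <= B."""
    table = registers[a]
    for i in range(1, b + 1):
        table[(c - 1) * FPF + i] = registers[a + i]

# local t = {"x", "y", "z"}
registers = [{}, "x", "y", "z"]
setlist(registers, 0, 3, 1)   # first batch: indices 1 through 3

# A second batch (C = 2) continues at index 51.
later = [{}, "first of batch two"]
setlist(later, 0, 1, 2)
```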

CLOSE This opcode is used to close all open upvalues in a function's stack frame. The A register specifies the location on the stack of the first open upvalue to be closed. For example, "CLOSE A" would mean "close all open upvalues in the current stack frame starting at location A on the stack."

Up Next: Instructions

→ Continue to Part 2
