Bytecode format
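The layout below is the one written by Lua 5.2. Lua 5.3 keeps the same instruction encoding and adds a few instructions, but the reference implementation's 5.3 header and its string/constant serialization differ in places from the listing here.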
file:
1b 4C 75 61 | Lua bytecode signature
[u8 version] | Version number (0x52 for Lua 5.2, etc)
[u8 impl] | Implementation (0 for reference impl)
[u8 endian] | Endianness flag (1 for little-endian, 0 for big-endian)
[u8 intsize] | Size of integers (usually 4)
[u8 size_t] | Size of size_t (usually 4 or 8, same as pointer size)
[u8 instsize] | Size of instructions (always 4)
[u8 numsize] | Size of Lua numbers (usually 8)
[u8 use_int] | Use integers instead of floats (usually for embedded)
19 93 0D 0A 1A 0A | Lua magic (used to detect presence of EOL conversion)
[func main]
string:
[size_t size]
... data
00
func:
[int line_start] | debug info
[int line_end] | debug info
[u8 nparams]
[u8 varargflags]
[u8 nregisters]
[int ninstructions]
... instructions:
[instsize instruction]
[int nconsts]
... consts:
[u8 type]
type 0: | nil
type 1: | bool
[u8 value]
type 3: | number
[numsize value]
type 4: | string
[string value]
[int nprotos]
... protos: | nested function prototypes
[func proto]
[int nupvals]
... upvals:
[u8 stack]
[u8 register]
[string source] | debug info
[int nlines]
... lines:
[int line]
[int nlocals]
... locals:
[string name] | debug info
[int startpc]
[int endpc]
[int nupvalnames]
... upvalnames:
[string name] | debug info
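As a rough illustration of the header layout above, here is a minimal Python sketch that checks the signature and the trailing magic and pulls out the eight one-byte fields. The names `Header` and `parse_header` are made up for this example; it only covers the 18-byte header as listed, not the function body that follows.

```python
import struct
from typing import NamedTuple

SIGNATURE = b"\x1bLua"              # 1b 4C 75 61
MAGIC = b"\x19\x93\r\n\x1a\n"       # 19 93 0D 0A 1A 0A
HEADER_SIZE = len(SIGNATURE) + 8 + len(MAGIC)  # 18 bytes


class Header(NamedTuple):
    """Hypothetical container mirroring the field names in the listing."""
    version: int
    impl: int
    endian: int
    intsize: int
    size_t: int
    instsize: int
    numsize: int
    use_int: int


def parse_header(data: bytes) -> Header:
    """Validate the fixed parts of the header and return the u8 fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short for a bytecode header")
    if data[:4] != SIGNATURE:
        raise ValueError("missing Lua bytecode signature")
    # Eight consecutive unsigned bytes follow the signature.
    fields = struct.unpack_from("8B", data, 4)
    if data[12:18] != MAGIC:
        # A text-mode transfer would have rewritten 0D 0A or 1A.
        raise ValueError("magic bytes damaged, possibly by EOL conversion")
    return Header(*fields)
```

The remaining sizes (`intsize`, `size_t`, `numsize`) then tell a reader how wide each `[int ...]`, `[size_t ...]` and `[numsize ...]` field in the rest of the file is.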
Instruction type | 31..23 (9 bits) | 22..14 (9 bits) | 13..6 (8 bits) | 5..0 (6 bits) |
---|---|---|---|---|
iABC | B | C | A | opcode |
iABx | Bx (18 bits, spans 31..14) | | A | opcode |
iAsBx | sBx (signed, 18 bits, spans 31..14) | | A | opcode |
iAx | Ax (26 bits, spans 31..6) | | | opcode |
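The field layout can be sketched as a handful of shifts and masks. This is a minimal example, assuming the instruction has already been read as a 32-bit unsigned integer; `decode` is a hypothetical helper name. The signed form is assumed to use the reference implementation's bias of 131071 (half of the 18-bit range) rather than two's complement.

```python
SBX_BIAS = (1 << 17) - 1  # 131071, assumed bias for the signed Bx field


def decode(instr: int) -> dict:
    """Split a 32-bit instruction into every field it could contain.

    Which fields are meaningful depends on the instruction type
    (iABC, iABx, iAsBx or iAx) of the opcode.
    """
    bx = (instr >> 14) & 0x3FFFF          # bits 31..14
    return {
        "opcode": instr & 0x3F,           # bits 5..0
        "a": (instr >> 6) & 0xFF,         # bits 13..6
        "c": (instr >> 14) & 0x1FF,       # bits 22..14
        "b": (instr >> 23) & 0x1FF,       # bits 31..23
        "bx": bx,
        "sbx": bx - SBX_BIAS,
        "ax": (instr >> 6) & 0x3FFFFFF,   # bits 31..6
    }
```

Decoding everything up front and letting the opcode handler pick the fields it needs keeps the decoder independent of the opcode tables.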
Opcodes
Lua 5.2:
Id | Instruction | Parameters |
---|---|---|
0 | MOVE | iABC |
1 | LOADK | iABx |
2 | LOADKX | iABx |
3 | LOADBOOL | iABC |
4 | LOADNIL | iABC |
5 | GETUPVAL | iABC |
6 | GETTABUP | iABC |
7 | GETTABLE | iABC |
8 | SETTABUP | iABC |
9 | SETUPVAL | iABC |
10 | SETTABLE | iABC |
11 | NEWTABLE | iABC |
12 | SELF | iABC |
13 | ADD | iABC |
14 | SUB | iABC |
15 | MUL | iABC |
16 | DIV | iABC |
17 | MOD | iABC |
18 | POW | iABC |
19 | UNM | iABC |
20 | NOT | iABC |
21 | LEN | iABC |
22 | CONCAT | iABC |
23 | JMP | iAsBx |
24 | EQ | iABC |
25 | LT | iABC |
26 | LE | iABC |
27 | TEST | iABC |
28 | TESTSET | iABC |
29 | CALL | iABC |
30 | TAILCALL | iABC |
31 | RETURN | iABC |
32 | FORLOOP | iAsBx |
33 | FORPREP | iAsBx |
34 | TFORCALL | iABC |
35 | TFORLOOP | iAsBx |
36 | SETLIST | iABC |
37 | CLOSURE | iABx |
38 | VARARG | iABC |
39 | EXTRAARG | iAx |
Lua 5.3:
Id | Instruction | Parameters |
---|---|---|
0 | MOVE | iABC |
1 | LOADK | iABx |
2 | LOADKX | iABx |
3 | LOADBOOL | iABC |
4 | LOADNIL | iABC |
5 | GETUPVAL | iABC |
6 | GETTABUP | iABC |
7 | GETTABLE | iABC |
8 | SETTABUP | iABC |
9 | SETUPVAL | iABC |
10 | SETTABLE | iABC |
11 | NEWTABLE | iABC |
12 | SELF | iABC |
13 | ADD | iABC |
14 | SUB | iABC |
15 | MUL | iABC |
16 | MOD | iABC |
17 | POW | iABC |
18 | DIV | iABC |
19 | IDIV | iABC |
20 | BAND | iABC |
21 | BOR | iABC |
22 | BXOR | iABC |
23 | SHL | iABC |
24 | SHR | iABC |
25 | UNM | iABC |
26 | BNOT | iABC |
27 | NOT | iABC |
28 | LEN | iABC |
29 | CONCAT | iABC |
30 | JMP | iAsBx |
31 | EQ | iABC |
32 | LT | iABC |
33 | LE | iABC |
34 | TEST | iABC |
35 | TESTSET | iABC |
36 | CALL | iABC |
37 | TAILCALL | iABC |
38 | RETURN | iABC |
39 | FORLOOP | iAsBx |
40 | FORPREP | iAsBx |
41 | TFORCALL | iABC |
42 | TFORLOOP | iAsBx |
43 | SETLIST | iABC |
44 | CLOSURE | iABx |
45 | VARARG | iABC |
46 | EXTRAARG | iAx |
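Because the ids shift between versions, a decoder needs one table per version. A small sketch of that lookup, keyed by the header's version byte; the names `OPCODES_52`, `OPCODES_53` and `opcode_name` are invented for this example and the lists are copied from the two tables above.

```python
OPCODES_52 = (
    "MOVE LOADK LOADKX LOADBOOL LOADNIL GETUPVAL GETTABUP GETTABLE "
    "SETTABUP SETUPVAL SETTABLE NEWTABLE SELF ADD SUB MUL DIV MOD POW "
    "UNM NOT LEN CONCAT JMP EQ LT LE TEST TESTSET CALL TAILCALL RETURN "
    "FORLOOP FORPREP TFORCALL TFORLOOP SETLIST CLOSURE VARARG EXTRAARG"
).split()

OPCODES_53 = (
    "MOVE LOADK LOADKX LOADBOOL LOADNIL GETUPVAL GETTABUP GETTABLE "
    "SETTABUP SETUPVAL SETTABLE NEWTABLE SELF ADD SUB MUL MOD POW DIV "
    "IDIV BAND BOR BXOR SHL SHR UNM BNOT NOT LEN CONCAT JMP EQ LT LE "
    "TEST TESTSET CALL TAILCALL RETURN FORLOOP FORPREP TFORCALL "
    "TFORLOOP SETLIST CLOSURE VARARG EXTRAARG"
).split()

# Keyed by the version byte from the file header.
OPCODE_TABLES = {0x52: OPCODES_52, 0x53: OPCODES_53}


def opcode_name(version: int, opcode: int) -> str:
    """Map a numeric opcode to its name for the given bytecode version."""
    return OPCODE_TABLES[version][opcode]
```

For instance, id 23 is JMP in the 5.2 table but SHL in the 5.3 table, so the version byte has to be checked before any instruction is interpreted.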
Locals
Locals point to registers on a closure's stack. In the bytecode they do not explicitly state a register; instead each one defines the range of instructions it is accessible from. Starting at register 0, you can infer the register from the order in which the locals become accessible.
... locals:
[string name] | optional, for debugging
[int startpc]
[int endpc]
While iterating over the instructions, check whether a local starts at the current one and, if so, point it at the top of the stack; when the local ends, pop it off the stack.
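The same inference can be sketched without an explicit stack: a local's register is the number of earlier locals that are still live when it starts. This minimal example assumes the locals are listed in declaration order and that a local is live for `startpc <= pc < endpc`; `assign_registers` is a hypothetical helper, not part of any Lua tooling.

```python
from typing import List, Tuple

Local = Tuple[str, int, int]  # (name, startpc, endpc) as stored in the file


def assign_registers(locals_: List[Local]) -> List[Tuple[str, int]]:
    """Return (name, register) for every local, in file order."""
    result = []
    for i, (name, startpc, _endpc) in enumerate(locals_):
        # Every earlier local that is still in scope occupies one register
        # below this one, which is what the push/pop walk would produce.
        live_before = sum(
            1 for (_n, s, e) in locals_[:i] if s <= startpc < e
        )
        result.append((name, live_before))
    return result


# Two parameters, one function-level local, then two locals in
# consecutive inner blocks that end up sharing a register.
example = [("a", 0, 10), ("b", 0, 10), ("x", 2, 10), ("y", 4, 7), ("z", 8, 10)]
print(assign_registers(example))
```

Names are returned as a list rather than a dictionary because shadowed locals can share a name while living in different registers.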
Upvalues
Upvalues are locals of an enclosing function that a nested function accesses, for example:
local potato
local function walrus()
potato = 1
end
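Here walrus would call SETUPVAL on potato. Upvalues are statically defined in each prototype by two variables:
stack is 1 when the variable is a local on the stack of the immediately enclosing function (as in this case) and 0 when it is itself an upvalue of that function.
register is the register on that stack which the upvalue references; when stack is 0 it is instead an index into the enclosing function's upvalue list.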
In Lua 5.2 and 5.3 the first upvalue of the main function corresponds to _ENV.
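To illustrate how the two fields chain together, here is a small sketch that follows an upvalue up through enclosing prototypes until it reaches a real register. It assumes `stack` behaves as a flag the way the reference implementation's `instack` does (1 means "local of the enclosing function", 0 means "upvalue of the enclosing function"); the `Proto` class and `resolve_upvalue` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Proto:
    """Hypothetical, stripped-down prototype: only what upvalues need."""
    upvals: List[Tuple[int, int]]          # (stack, register) pairs
    parent: Optional["Proto"] = None       # enclosing prototype, if any


def resolve_upvalue(proto: Proto, index: int) -> Tuple[int, int]:
    """Return (levels_up, register) for upvalue `index` of `proto`."""
    levels = 0
    while True:
        if proto.parent is None:
            # The main function's upvalues (such as _ENV) come from outside.
            raise LookupError("upvalue is supplied by the host, not a register")
        stack, register = proto.upvals[index]
        levels += 1
        if stack:
            return levels, register
        # Otherwise `register` names an upvalue of the enclosing function.
        proto, index = proto.parent, register


main = Proto(upvals=[(1, 0)])                    # _ENV
walrus = Proto(upvals=[(1, 0)], parent=main)     # potato lives in main's R0
inner = Proto(upvals=[(0, 0)], parent=walrus)    # reaches potato via walrus
print(resolve_upvalue(walrus, 0), resolve_upvalue(inner, 0))
```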
Instructions
Name | Description |
---|---|
R | Register list |
K | Constant list |
U | Upvalue list |
pc | Program counter aka instruction pointer |
Kproto | Function prototype list |
Typename | Description |
---|---|
reg | Register |
const | Constant |
value | Register, but when negative it's a constant |
upvalue | Upvalue |
closure | Closure |
int | Literal integer |
jump | Relative jump position |
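For the value type, a decoder has to decide between the register list and the constant list. In the raw 9-bit B and C fields the reference implementation marks a constant by setting bit 8; the sketch below assumes that encoding, and `decode_value` is a name made up for this example.

```python
BITRK = 1 << 8  # assumed marker bit: set means "index into the constant list"


def decode_value(field: int):
    """Classify a raw 9-bit B or C field as a register or a constant."""
    if field & BITRK:
        return ("const", field - BITRK)
    return ("reg", field)


print(decode_value(3), decode_value(260))
```

A disassembler is free to present the constant case however it likes afterwards, for example as a negative number.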
MOVE(reg a, reg b)
a = b;
LOADK(reg a, const bx)
a = bx;
LOADKX(reg a) EXTRAARG(const ax)
a = ax;
LOADBOOL(reg a, int b, int c)
a = (bool)b;
if (c) pc++;
LOADNIL(reg a, int b)
a..a+b = nil;
GETUPVAL(reg a, upvalue b)
a = b;
GETTABUP(reg a, upvalue b, value c)
a = b[c];
GETTABLE(reg a, reg b, value c)
a = b[c];
SETTABUP(upvalue a, value b, value c)
a[b] = c;
SETUPVAL(reg a, upvalue b)
b = a;
SETTABLE(reg a, value b, value c)
a[b] = c;
NEWTABLE(reg a, int b, int c)
a = {} // preallocate b array elements and c hash elements
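One detail worth knowing when reading NEWTABLE: in the reference implementation the two size hints are not plain integers but a compact "floating point byte" (3 mantissa bits, 5 exponent bits), so that large sizes fit in the 9-bit fields. A minimal sketch of the decoding, assuming that encoding; the function name follows the reference implementation's `luaO_fb2int` but this is a Python rewrite for illustration.

```python
def fb2int(x: int) -> int:
    """Expand a 'floating point byte' (eeeeexxx) into the size it stands for."""
    exponent = (x >> 3) & 0x1F
    if exponent == 0:
        return x                        # small values are stored verbatim
    return ((x & 7) + 8) << (exponent - 1)


print([fb2int(n) for n in (0, 7, 8, 16, 17)])
```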
SELF(reg a, reg b, value c)
R[a + 1] = b;
a = b[c];
operator(reg a, value b, value c)
Where operator is one of the following:
ADD | + | SUB | - | MUL | * | MOD | % |
POW | ^ | DIV | / | IDIV | // | BAND | & |
BOR | \| | BXOR | ~ | SHL | << | SHR | >> |
a = b <op> c;
unary operator(reg a, reg b)
Where unary operator is one of the following:
UNM | - | BNOT | ~ | NOT | not | LEN | # |
a = <op> b;
CONCAT(reg a, reg b, reg c)
a = b .. ... .. c; // concatenates every register from b through c
JMP(int a, jump sbx)
pc += sbx;
if (a) /* close upvalues >= R[a - 1] */;
EQ(int a, value b, value c)
if ((b == c) ~= a) pc++;
LT(int a, value b, value c)
if ((b < c) ~= a) pc++;
LE(int a, value b, value c)
if ((b <= c) ~= a) pc++;
TEST(reg a, int c)
if ((bool)a ~= c) pc++;
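To see how the skip-next-instruction convention of the comparisons combines with JMP, here is a toy interpreter for a handful of the instructions above. It is only a sketch: instructions are given as already-decoded tuples rather than 32-bit words, value operands of 256 or more are assumed to index the constant list, and `run` is a hypothetical name.

```python
def run(code, consts, nregisters):
    """Execute a tiny subset of instructions and return the register list."""
    R = [None] * nregisters

    def value(x):
        # Assumed convention: 256 and up selects a constant, else a register.
        return consts[x - 256] if x >= 256 else R[x]

    pc = 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1  # pc already points at the next instruction during execution
        if op == "MOVE":
            a, b = args
            R[a] = R[b]
        elif op == "LOADK":
            a, bx = args
            R[a] = consts[bx]
        elif op == "LOADBOOL":
            a, b, c = args
            R[a] = bool(b)
            if c:
                pc += 1
        elif op == "ADD":
            a, b, c = args
            R[a] = value(b) + value(c)
        elif op == "LT":
            a, b, c = args
            if (value(b) < value(c)) != bool(a):
                pc += 1
        elif op == "JMP":
            _a, sbx = args
            pc += sbx
        else:
            raise NotImplementedError(op)
    return R


# i = 0; total = 0; while i < 5 do i = i + 1; total = total + i end
program = [
    ("LOADK", 0, 0),
    ("LOADK", 1, 0),
    ("LT", 0, 0, 258),   # skips the exit jump while i < 5
    ("JMP", 0, 3),       # exit the loop
    ("ADD", 0, 0, 257),
    ("ADD", 1, 1, 0),
    ("JMP", 0, -5),      # back to the LT
]
print(run(program, consts=[0, 1, 5], nregisters=2))
```

Note the pattern: the comparison never jumps by itself, it only decides whether the JMP that follows it is executed.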