ðĨïļ Bytecode Virtual Machine
A stack-based VM with compiler, garbage-safe execution, and tracing
Overview
This module implements a complete bytecode virtual machine from scratch: a high-level expression language compiles down to a flat array of opcodes, which a stack-based VM executes. It covers the same ground as the internals of languages like Lua, Python, and early Java â making abstract "how interpreters work" concepts concrete and runnable.
Architecture
âââââââââââââââ ââââââââââââ ââââââââââââââ
â Source AST â âââķ â Compiler â âââķ â Bytecode â
â (expr) â â â â (opcode â
â â â â â array) â
âââââââââââââââ ââââââââââââ ââââââââââââââ
â
âž
ââââââââââââââ
â VM â
â (stack + â
â frames) â
ââââââââââââââ
Instruction Set
| Opcode | Stack Effect | Description |
|---|---|---|
CONST v | â v | Push a constant value |
ADD | a b â (a+b) | Pop two, push sum |
SUB / MUL / DIV / MOD | a b â result | Arithmetic operations |
NEG | a â (âa) | Negate top of stack |
EQ / LT / GT / LE / GE | a b â bool | Comparison operators |
NOT | a â (ÂŽa) | Boolean negation |
AND / OR | a b â bool | Logical operators |
GET_LOCAL i | â v | Push local variable at slot i |
SET_LOCAL i | v â | Store top into local slot i |
JMP offset | â | Unconditional jump |
JMP_IF_FALSE offset | cond â | Conditional branch |
CALL func_idx | args â result | Push call frame, enter function |
RETURN | result â | Pop call frame, return value |
PRINT | v â | Output a value |
HALT | â | Stop execution |
Value Types
type value =
| VFloat of float (* all numbers are double-precision *)
| VBool of bool (* boolean true/false *)
| VNil (* null/unit value *)
| VString of string (* string literals *)
| VFunc of int (* function reference by index *)
(* Truthiness: VBool false and VNil are falsy; everything else is truthy *)
let value_is_truthy = function
| VBool false | VNil -> false
| _ -> true
VM State
type call_frame = {
func_idx : int; (* which function we're in *)
return_ip : int; (* instruction pointer to return to *)
base_slot : int; (* stack base for local variables *)
}
type vm = {
mutable stack : value array; (* operand stack *)
mutable sp : int; (* stack pointer *)
mutable ip : int; (* instruction pointer *)
mutable frames : call_frame list; (* call stack *)
mutable steps : int; (* execution counter *)
mutable output : string list; (* captured print output *)
trace : bool; (* instruction tracing enabled *)
}
Safety Features
- Bounded execution â configurable
max_execution_steps(default 1M) prevents infinite loops - Call depth limit â
max_call_depth(default 512) catches runaway recursion - Stack bounds checking â push/pop verify the stack pointer, raising
VM_erroron overflow/underflow - Tracing mode â when enabled, prints every instruction with stack state for debugging
The Compiler
(* High-level expressions compiled to flat bytecode *)
type expr =
| EConst of float | EBool of bool | ENil
| EVar of string | EBinOp of op * expr * expr
| EUnaryOp of unary_op * expr
| EIf of expr * expr * expr
| ELet of string * expr * expr
| EFunc of string list * expr (* params, body *)
| ECall of expr * expr list
| EBlock of expr list | EPrint of expr
(* compile_expr recursively emits opcodes *)
(* Local variables are assigned stack slots at compile time *)
(* Functions are compiled to separate code arrays referenced by index *)
Disassembler
(* Built-in disassembler for inspecting generated bytecode *)
let disassemble_op i = function
| CONST v -> Printf.sprintf "%04d CONST %s" i (value_to_string v)
| ADD -> Printf.sprintf "%04d ADD" i
| CALL idx -> Printf.sprintf "%04d CALL func_%d" i idx
| JMP off -> Printf.sprintf "%04d JMP %04d" i (i + off)
| RETURN -> Printf.sprintf "%04d RETURN" i
| ...
(* Example output:
0000 CONST 3.0
0001 CONST 4.0
0002 ADD
0003 PRINT
0004 HALT *)
OCaml Concepts Demonstrated
- Mutable arrays for performance â stack and bytecode use OCaml arrays for O(1) random access
- Pattern matching on opcodes â the main
run_codeloop dispatches via a large match expression - Variant types as instruction encoding â opcodes carry their operands directly in the constructor
- Compile-time environment â local variable resolution at compile time, slot-based at runtime
- Ref cells for mutable counters â step counter and call depth tracked mutably
- Recursive descent compilation â
compile_exprmirrors the AST structure
Running It
# Compile and run
ocamlfind ocamlopt bytecode_vm.ml -o bytecode_vm
./bytecode_vm
# Runs built-in examples demonstrating arithmetic, conditionals,
# function calls, recursion, and the disassembler