# 🧮 Automatic Differentiation

Exact derivatives via dual numbers and computation graphs

## Overview

A complete automatic differentiation library supporting both forward mode (dual numbers) and reverse mode (computation graphs). Unlike symbolic differentiation (which manipulates expressions) or numerical differentiation (which approximates with finite differences), AD computes exact derivatives at machine precision by applying the chain rule to elementary operations.

This is the core technique behind modern deep learning frameworks like PyTorch and TensorFlow, implemented from scratch in ~1000 lines of OCaml.
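
As a quick standalone illustration of the contrast (this snippet is not part of the library): a finite-difference quotient only approximates the derivative, with truncation and round-off error in the trailing digits, while the exact derivative of `sin` is `cos`.

```ocaml
(* Standalone illustration: finite differences approximate f'(x) with an
   O(h) truncation error, while AD produces the exact value. *)
let f x = sin x
let finite_diff ~h x = (f (x +. h) -. f x) /. h

let () =
  let x = 1.0 in
  Printf.printf "finite difference: %.12f\n" (finite_diff ~h:1e-6 x);
  Printf.printf "exact (cos x):     %.12f\n" (cos x)
```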

## Concepts Demonstrated

### Forward Mode (Dual Numbers)

Each value carries its derivative alongside it. One forward pass computes the derivative with respect to one input variable. Best for functions with few inputs, many outputs (cost: O(n) passes for n inputs).

```ocaml
(* Dual number: value + tangent *)
type t = { v : float; d : float }

(* Derivative of x² + sin(x) at x = π/4 *)
let f x = Forward.(x * x + sin x)
let deriv = Forward.diff f (Float.pi /. 4.0)
(* deriv ≈ 2·(π/4) + cos(π/4) ≈ 2.278 *)
```
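
Below is a minimal sketch of how a forward-mode module like this can implement the product and chain rules over dual numbers. The module name and internals are illustrative assumptions, not the library's actual source.

```ocaml
(* Illustrative sketch (assumed internals, not the library's source):
   each operator combines values and tangents per the usual calculus rules. *)
module Forward_sketch = struct
  type t = { v : float; d : float }

  let var x = { v = x; d = 1.0 }    (* seed: dx/dx = 1 *)
  let const x = { v = x; d = 0.0 }  (* constants have zero tangent *)

  let ( + ) a b = { v = a.v +. b.v; d = a.d +. b.d }
  (* product rule: (ab)' = a'b + ab' *)
  let ( * ) a b = { v = a.v *. b.v; d = (a.d *. b.v) +. (a.v *. b.d) }
  (* chain rule: (sin a)' = a' · cos a *)
  let sin a = { v = sin a.v; d = a.d *. cos a.v }

  let diff f x = (f (var x)).d
end

(* The same example as above, run against the sketch: *)
let _ = Forward_sketch.(diff (fun x -> x * x + sin x) (Float.pi /. 4.0))
```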

### Reverse Mode (Backpropagation)

Builds a computation graph during the forward pass, then propagates gradients backward. One reverse pass computes derivatives with respect to all inputs. Best for functions with many inputs and few outputs, which is exactly the case for loss functions in ML.

```ocaml
(* Build computation graph *)
let x = Reverse.var 2.0
let y = Reverse.var 3.0
let z = Reverse.(x * y + sin x)

(* Backpropagate from z *)
let () = Reverse.backward z
let dz_dx = Reverse.grad x   (* ∂z/∂x = y + cos(x) *)
let dz_dy = Reverse.grad y   (* ∂z/∂y = x *)
```
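
A minimal tape-based sketch of what such a reverse-mode module can look like inside; the representation below is an assumption for illustration, not the library's actual implementation.

```ocaml
(* Illustrative tape-based sketch (assumed internals, not the library's
   source). Each operation records a closure that pushes gradient to its
   inputs; backward runs the tape in reverse order. Single-use tape, kept
   global for brevity. *)
module Reverse_sketch = struct
  type t = { value : float; mutable grad : float }

  let tape : (unit -> unit) list ref = ref []   (* head = most recent op *)
  let record k = tape := k :: !tape
  let var v = { value = v; grad = 0.0 }

  let ( + ) a b =
    let n = var (a.value +. b.value) in
    record (fun () ->
      a.grad <- a.grad +. n.grad;
      b.grad <- b.grad +. n.grad);
    n

  let ( * ) a b =
    let n = var (a.value *. b.value) in
    record (fun () ->
      a.grad <- a.grad +. n.grad *. b.value;   (* d(ab)/da = b *)
      b.grad <- b.grad +. n.grad *. a.value);  (* d(ab)/db = a *)
    n

  let sin a =
    let n = var (sin a.value) in
    record (fun () -> a.grad <- a.grad +. n.grad *. cos a.value);
    n

  let backward z =
    z.grad <- 1.0;
    List.iter (fun k -> k ()) !tape   (* newest-first = reverse order *)

  let grad x = x.grad
end
```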

## Key Functions

### Forward Module

| Function | Description |
|----------|-------------|
| `var x` | Create independent variable (derivative = 1) |
| `const x` | Create constant (derivative = 0) |
| `diff f x` | Compute f'(x) via one forward pass |
| `nth_diff n f x` | n-th derivative (AD + finite differences) |
| `gradient f x` | Gradient ∇f at point x (array) |
| `jacobian f x` | Jacobian matrix of f : ℝⁿ → ℝᵐ |
| `hessian f x` | Hessian matrix of f : ℝⁿ → ℝ |
| `directional_deriv f x v` | Directional derivative along v |
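
A hedged usage sketch for the multivariate helpers; the exact argument types are assumptions (a function over an array of dual values plus a float-array point).

```ocaml
(* Hedged usage sketch: assumes gradient takes f : t array -> t and a
   float array point. For f(x, y) = x² + x·y, ∇f = (2x + y, x). *)
let f xs = Forward.(xs.(0) * xs.(0) + xs.(0) * xs.(1))
let g = Forward.gradient f [| 1.0; 2.0 |]
(* expected: [| 4.0; 1.0 |] at (x, y) = (1, 2) *)
```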

### Reverse Module

| Function | Description |
|----------|-------------|
| `var x` | Create input node in computation graph |
| `backward z` | Backpropagate gradients from output z |
| `grad x` | Read accumulated gradient ∂z/∂x |
| `gradient f x` | Gradient via a single reverse pass |
| `grad_descent f x0 ~lr ~steps` | Minimize f by gradient descent |
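
A hedged usage sketch for `grad_descent`; the exact signature, and the availability of a `const` constructor in `Reverse` (analogous to Forward's), are assumptions based on the tables in this README.

```ocaml
(* Hedged sketch: assumes grad_descent minimizes f : t array -> t from a
   float array start, and that Reverse exposes const for literals.
   Minimizes (x - 3)² + (y + 1)², whose minimum is at (3, -1). *)
let f xs =
  Reverse.(
    (xs.(0) - const 3.0) * (xs.(0) - const 3.0)
    + (xs.(1) + const 1.0) * (xs.(1) + const 1.0))

let x_min = Reverse.grad_descent f [| 0.0; 0.0 |] ~lr:0.1 ~steps:200
(* x_min should approach [| 3.0; -1.0 |] *)
```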

## Supported Operations

| Category | Operations |
|----------|------------|
| Arithmetic | `+` `-` `*` `/` `neg` `abs` `pow` |
| Trigonometric | `sin` `cos` `tan` `asin` `acos` `atan` `atan2` |
| Hyperbolic | `sinh` `cosh` `tanh` |
| Exponential | `exp` `log` `sqrt` |
| Activations | `sigmoid` `relu` `softplus` |
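
Each operation propagates derivatives through both modes. A small hedged check using the listed `sigmoid` and `relu` (assuming they are exposed as `t -> t` functions in `Forward`): since σ'(x) = σ(x)(1 − σ(x)), the derivative at 0 is 0.25.

```ocaml
(* Hedged check of activation derivatives via Forward.diff. *)
let d_sigmoid = Forward.diff Forward.sigmoid 0.0      (* ≈ 0.25 *)
let d_relu = Forward.diff Forward.relu (-2.0)         (* 0.0: relu is flat for x < 0 *)
```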

## Forward vs Reverse Mode

| | Forward Mode | Reverse Mode |
|---|--------------|--------------|
| Mechanism | Dual numbers (value + tangent) | Computation graph + backprop |
| Cost per pass | One ∂f/∂xᵢ | All ∂f/∂xᵢ |
| Best for | f : ℝ → ℝᵐ (few inputs) | f : ℝⁿ → ℝ (few outputs) |
| Memory | O(1) extra | O(ops): stores the graph |
| Use case | Jacobian columns, sensitivities | Loss function gradients (ML) |
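
The practical consequence, sketched below under the same signature assumptions as the earlier examples: both modes agree on the gradient, but forward mode pays one pass per input while reverse mode pays one pass total.

```ocaml
(* Hedged comparison for f(x, y) = x² + y², so ∇f = (2x, 2y) = (2, 4) at (1, 2). *)
let point = [| 1.0; 2.0 |]

let g_fwd =   (* n forward passes, one per input *)
  Forward.gradient (fun xs -> Forward.(xs.(0) * xs.(0) + xs.(1) * xs.(1))) point

let g_rev =   (* a single reverse pass *)
  Reverse.gradient (fun xs -> Reverse.(xs.(0) * xs.(0) + xs.(1) * xs.(1))) point
```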

## When to Use