JSON Parser
A complete JSON parser built with parser combinators
Overview
This module implements a full RFC 8259 JSON parser using the parser combinator pattern. It demonstrates how small, composable parsing functions can be combined to handle a real-world data format — including strings with Unicode escapes, floating-point numbers with scientific notation, and arbitrarily nested structures.
Concepts Demonstrated
- Recursive descent parsing: mutual recursion between JSON value types
- Monadic composition: sequencing parsers with bind (>>=)
- Algebraic data types: the json type models all JSON values
- Higher-order functions: parsers as first-class values
- Unicode handling: \uXXXX escapes with surrogate pair support
- Pretty printing: compact and indented output
- Functional transforms: map, filter, fold over JSON structures
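To make the monadic composition item concrete, here is a minimal sketch of bind-based sequencing. It is an illustration under assumptions, not the code in json.ml: the parser type and the names return, char_p, pair_p, and ab are hypothetical.

```ocaml
(* Hypothetical parser type: read input starting at pos and return a
   value plus the new position, or None on failure. *)
type 'a parser = string -> int -> ('a * int) option

(* Succeed without consuming any input *)
let return (x : 'a) : 'a parser = fun _ pos -> Some (x, pos)

(* bind: run p, then pass its result to f to pick the next parser *)
let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser =
  fun input pos ->
    match p input pos with
    | None -> None
    | Some (x, pos') -> f x input pos'

(* Match one specific character *)
let char_p (c : char) : char parser =
  fun input pos ->
    if pos < String.length input && input.[pos] = c then Some (c, pos + 1)
    else None

(* Sequence two parsers and pair their results *)
let pair_p p q = p >>= fun a -> q >>= fun b -> return (a, b)

(* Parser for the two characters 'a' then 'b' *)
let ab = pair_p (char_p 'a') (char_p 'b')
```

With these definitions, ab "abc" 0 evaluates to Some (('a', 'b'), 2), and ab "ax" 0 evaluates to None because the second parser fails.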
JSON Data Type
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list
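This algebraic data type has one constructor for each kind of JSON value. Pattern matching makes it easy to destructure and transform JSON data. Note that Number carries an OCaml float, so all numbers, integers included, are stored as double-precision values.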
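As a small illustration of pattern matching over this type, the sketch below computes how deeply a value is nested. The depth function and the sample value are hypothetical and are not part of the module's listed API.

```ocaml
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* Nesting depth: scalars are 0, each container adds one level *)
let rec depth (j : json) : int =
  match j with
  | Null | Bool _ | Number _ | String _ -> 0
  | Array items ->
    1 + List.fold_left (fun acc v -> max acc (depth v)) 0 items
  | Object fields ->
    1 + List.fold_left (fun acc (_, v) -> max acc (depth v)) 0 fields

(* {"user": {"scores": [95]}} built directly from constructors *)
let sample = Object [ ("user", Object [ ("scores", Array [ Number 95.0 ]) ]) ]
```

Here depth sample is 3: two nested objects plus one array.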
Key Functions
| Function | Description |
|---|---|
| parse | Parse a string into a json value |
| to_string_compact | Serialize to minimal JSON string |
| to_string_pretty | Serialize with indentation |
| query | Dot-notation path access ("user.name") |
| equal | Deep structural equality (order-independent for objects) |
| map_array | Transform array elements |
| filter_array | Filter array elements |
| fold_array | Reduce array to a single value |
| merge | Merge two objects (right-biased) |
| type_name | Get the type as a string |
| keys / values | Extract object keys or values |
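Two of these descriptions, right-biased merge and order-independent equality, are easier to follow with code. The sketch below shows one way they could behave. The argument order, the handling of non-object arguments in merge, and the assumption that keys are unique within an object are guesses for illustration, not the module's confirmed behavior.

```ocaml
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* Right-biased merge: on duplicate keys the right object's value wins.
   Assumption: if either argument is not an object, return the right one. *)
let merge (a : json) (b : json) : json =
  match a, b with
  | Object xs, Object ys ->
    let kept = List.filter (fun (k, _) -> not (List.mem_assoc k ys)) xs in
    Object (kept @ ys)
  | _, right -> right

(* Deep equality that ignores the order of object fields.
   Assumption: keys are unique within each object. *)
let rec equal (a : json) (b : json) : bool =
  match a, b with
  | Null, Null -> true
  | Bool x, Bool y -> x = y
  | Number x, Number y -> x = y
  | String x, String y -> x = y
  | Array xs, Array ys ->
    (* The length check guards for_all2 against mismatched lists *)
    List.length xs = List.length ys && List.for_all2 equal xs ys
  | Object xs, Object ys ->
    List.length xs = List.length ys
    && List.for_all
         (fun (k, v) ->
            match List.assoc_opt k ys with
            | Some v' -> equal v v'
            | None -> false)
         xs
  | _ -> false
```

Arrays stay order-sensitive in this sketch; only object fields are compared by key.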
Parser Architecture
The parser is built from small combinators, each handling one piece of the JSON grammar:
(* The grammar, expressed as combinators *)
json_value = json_null | json_bool | json_number
           | json_string | json_array | json_object
(* Arrays: [ value, value, ... ] *)
json_array = between '[' ']' (sep_by json_value ',')
(* Objects: { "key": value, ... } *)
json_object = between '{' '}' (sep_by (key ':' value) ',')
Mutual recursion is handled via a forward reference (json_parser_ref),
allowing arrays and objects to contain any JSON value, including themselves.
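The forward-reference technique can be sketched on a cut-down grammar that has only null and arrays. Apart from json_parser_ref, every name here (the parser type, literal, sep_by, json_value_fwd) is a hypothetical stand-in, and the sketch skips whitespace handling, so it should not be read as the module's implementation.

```ocaml
(* Trimmed-down type for the sketch: only null and arrays *)
type json = Null | Array of json list

type 'a parser = string -> int -> ('a * int) option

(* Forward reference: starts as a parser that always fails and is
   patched once json_value has been defined. *)
let json_parser_ref : json parser ref = ref (fun _ _ -> None)

(* Indirection through the ref, resolved at parse time *)
let json_value_fwd : json parser = fun input pos -> !json_parser_ref input pos

(* Match the exact string s and return v *)
let literal (s : string) (v : 'a) : 'a parser =
  fun input pos ->
    let n = String.length s in
    if pos + n <= String.length input && String.sub input pos n = s
    then Some (v, pos + n)
    else None

(* Zero or more p, separated by the character sep *)
let sep_by (p : 'a parser) (sep : char) : 'a list parser =
  fun input pos ->
    match p input pos with
    | None -> Some ([], pos)
    | Some (first, pos1) ->
      let rec loop acc pos =
        if pos < String.length input && input.[pos] = sep then
          (match p input (pos + 1) with
           | Some (x, pos') -> loop (x :: acc) pos'
           | None -> None (* a trailing separator is an error *))
        else Some (List.rev acc, pos)
      in
      loop [ first ] pos1

(* '[' values ']' where each value goes through the forward reference *)
let json_array : json parser =
  fun input pos ->
    match literal "[" () input pos with
    | None -> None
    | Some ((), pos1) ->
      (match sep_by json_value_fwd ',' input pos1 with
       | None -> None
       | Some (items, pos2) ->
         (match literal "]" () input pos2 with
          | None -> None
          | Some ((), pos3) -> Some (Array items, pos3)))

let json_value : json parser =
  fun input pos ->
    match literal "null" Null input pos with
    | Some _ as r -> r
    | None -> json_array input pos

(* Tie the knot: arrays can now contain any value, including arrays *)
let () = json_parser_ref := json_value
```

Because json_array only ever calls json_value_fwd, it can be defined before json_value exists. The final assignment closes the loop.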
Query Examples
(* Dot-notation access into nested structures *)
let json = parse "{\"user\": {\"scores\": [95, 87, 92]}}"
let _ = query "user.scores.0" json   (* => Some (Number 95.0) *)
let _ = query "user.scores.-1" json  (* => Some (Number 92.0), negative index *)
let _ = query "missing.path" json    (* => None *)
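One way such a path lookup could work is to fold over the dot-separated segments. This is a sketch under assumptions, not the code in json.ml: it builds the document from constructors rather than calling parse, and its treatment of out-of-range indices (returning None) is a guess.

```ocaml
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* Follow a dot-separated path. Segments are looked up as keys in
   objects and as integer indices in arrays; a negative index counts
   from the end of the array. *)
let query (path : string) (j : json) : json option =
  let step current segment =
    match current with
    | None -> None
    | Some (Object fields) -> List.assoc_opt segment fields
    | Some (Array items) ->
      (match int_of_string_opt segment with
       | None -> None
       | Some i ->
         let n = List.length items in
         let i = if i < 0 then n + i else i in
         if i >= 0 && i < n then Some (List.nth items i) else None)
    | Some _ -> None (* scalars have no children *)
  in
  List.fold_left step (Some j) (String.split_on_char '.' path)

(* Same shape as the document used in the examples above *)
let doc =
  Object
    [ ("user",
       Object [ ("scores", Array [ Number 95.0; Number 87.0; Number 92.0 ]) ]) ]
```

Once a step returns None, every later segment also yields None, so a missing key anywhere along the path produces None for the whole query.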
Running
ocamlfind ocamlopt json.ml -o json && ./json
# Or simply:
make json && ./json
Source
json.ml — ~600 lines, zero dependencies