Parsing Algorithm
CCL Parsing Algorithm
Section titled “CCL Parsing Algorithm”CCL is parsed through recursive descent to a fixed point. The algorithm is simple:
- Parse text into key-value entries
- Build nested objects from indented entries
- Recursively parse values that contain more CCL
- Stop when no more CCL syntax remains (fixed point)
Core Algorithm
Section titled “Core Algorithm”Input Format
Section titled “Input Format”CCL is line-based text with key-value pairs:
key = valueanother = value with spacesnested = child = nested value sibling = another nestedStage 1: Parse Entries
Section titled “Stage 1: Parse Entries”Split lines on first = character:
"key = value" → Entry {key: "key", value: "value"}"a = b = c" → Entry {key: "a", value: "b = c"}Whitespace rules:
- Trim whitespace from keys:
" key "→"key" - Preserve value whitespace except leading:
" value "→"value " - Tab characters and spaces both count as indentation
Special keys:
- Empty key
= value→ list item - Comment entry
/ = text→ key is/, value istext
Stage 2: Build Hierarchy
Section titled “Stage 2: Build Hierarchy”Group entries by indentation level:
parent = child = nested sibling = anotherBecomes:
Entry {key: "parent", value: "child = nested\nsibling = another"}Indentation logic:
- Count leading whitespace characters
- Lines with MORE indentation than previous = part of previous value
- Lines with SAME/LESS indentation = new entry at that level
Stage 3: Recursive Parsing (Fixed Point)
Section titled “Stage 3: Recursive Parsing (Fixed Point)”Parse values that contain CCL syntax:
value: "child = nested\nsibling = another"→ Contains '=' → Parse recursively→ Results in: {child: "nested", sibling: "another"}Fixed-point termination:
- Parse value as CCL
- If parsing yields structure → recurse on nested values
- If value has no ’=’ → stop (fixed point reached)
- Prevents infinite recursion: plain strings have no structure to parse
Complete Example
Section titled “Complete Example”Input:
database = host = localhost port = 5432
users = = alice = bobStep 1 - Parse entries:
Entry {key: "database", value: "host = localhost\nport = 5432"}Entry {key: "users", value: "= alice\n= bob"}Step 2 - Recursive parsing:
database.value contains '=' → parse recursively: Entry {key: "host", value: "localhost"} Entry {key: "port", value: "5432"}
users.value contains '=' → parse recursively: Entry {key: "", value: "alice"} Entry {key: "", value: "bob"}Step 3 - Build objects:
{ "database": { "host": "localhost", "port": "5432" }, "users": ["alice", "bob"]}Fixed point reached: “localhost”, “5432”, “alice”, “bob” contain no ’=’ → stop.
Implementation Pattern
Section titled “Implementation Pattern”Pseudocode for recursive parser:
def parse_ccl(text): entries = parse_entries(text) # Stage 1: split on '=' hierarchy = build_hierarchy(entries) # Stage 2: group by indentation return recursively_parse(hierarchy) # Stage 3: fixed point
def recursively_parse(entries): result = {} for entry in entries: value = entry.value
if contains_ccl_syntax(value): # Has '=' character # Recursively parse the value parsed = parse_ccl(value) result[entry.key] = parsed else: # Fixed point: plain string result[entry.key] = value
return resultError Handling Essentials
Section titled “Error Handling Essentials”Malformed input:
- Line with no ’=’ → skip or error (implementation choice)
- Inconsistent indentation → use explicit indentation counting
- Empty lines → ignore
Edge cases:
- Keys with ’=’ in them → impossible, first ’=’ is split point
- Values with ’=’ → fine, parse recursively
- Unicode in keys/values → valid, CCL is UTF-8
- CRLF vs LF → CCL treats only LF as a newline, so CRs present are preserved as-is
Why This Is Core CCL
Section titled “Why This Is Core CCL”From the blog post:
“The simplest possible config language is just key-value pairs. That’s it.”
The recursive fixed-point algorithm is how those key-value pairs create nested structure. The OCaml type definition makes this explicit:
type t = Fix of t Map.Make(String).tThis says: “A CCL value is a fixed point of a map from strings to CCL values.”
The recursion is the structure. The fixed point is the termination.
What’s NOT in This Algorithm
Section titled “What’s NOT in This Algorithm”These are library conveniences, not core CCL:
- Type conversion: “5432” → integer (user’s job)
- Validation: checking required keys (user’s job)
- Dotted key expansion:
foo.bar→ nested object (optional) - Pretty printing: formatting output (implementation detail)
Core CCL is: parse key-value pairs recursively until fixed point.
Implementation Notes
Section titled “Implementation Notes”Different languages can adapt this algorithm:
Functional approach: Use recursive descent with pattern matching OOP approach: Entry/Parser classes with builder pattern Dynamic approach: Dictionaries/hashmaps with recursive loops
The algorithm is the same; the implementation style varies by language.