Parsing Algorithm
CCL Parsing Algorithm
Section titled “CCL Parsing Algorithm”CCL is parsed through recursive descent to a fixed point. The algorithm is simple:
- Parse text into key-value entries
- Build nested objects from indented entries
- Recursively parse values that contain more CCL
- Stop when no more CCL syntax remains (fixed point)
Core Algorithm
Section titled “Core Algorithm”Input Format
Section titled “Input Format”CCL consists of key-value pairs separated by =:
key = valueanother = value with spacesnested = child = nested value sibling = another nestedParse Entries
Section titled “Parse Entries”Find the first = character and split:
"key = value" → Entry {key: "key", value: "value"}"a = b = c" → Entry {key: "a", value: "b = c"}Key whitespace rules:
- Trim all whitespace from keys (including newlines):
" key "→"key" - Keys can span multiple lines if
=appears on a subsequent line
Value whitespace rules:
- Trim leading whitespace on first line:
key = value→"value" - Trim trailing whitespace on final line
- Preserve internal structure (newlines + indentation for continuation lines)
Indentation tracking:
- Determine the baseline indentation (N) for the current parsing context
- For each subsequent line, compare its indentation to N:
indent > N→ continuation line (part of value)indent ≤ N→ new entry starts
- Which characters count as whitespace depends on parser behavior:
tabs_as_whitespace: spaces and tabs are whitespacetabs_as_content: only spaces are whitespace; tabs are content
Special keys:
- Empty key
= value→ list item - Comment entry
/ = text→ key is/, value istext
Build Hierarchy
Section titled “Build Hierarchy”Indentation determines structure. Example:
parent = child = nested sibling = anotherThe parser records parent’s indentation level (0). Lines child and sibling have greater indentation (2), so they become part of parent’s value:
Entry {key: "parent", value: "child = nested\nsibling = another"}Recursive Parsing (Fixed Point)
Section titled “Recursive Parsing (Fixed Point)”Parse values that contain CCL syntax:
value: "child = nested\nsibling = another"→ Contains '=' → Parse recursively→ Results in: {child: "nested", sibling: "another"}Fixed-point termination:
- Parse value as CCL
- If parsing yields structure → recurse on nested values
- If value has no ’=’ → stop (fixed point reached)
- Prevents infinite recursion: plain strings have no structure to parse
Complete Example
Section titled “Complete Example”Input:
database = host = localhost port = 5432
users = = alice = bobParse entries:
Entry {key: "database", value: "host = localhost\nport = 5432"}Entry {key: "users", value: "= alice\n= bob"}Recursive parsing:
database.value contains '=' → parse recursively: Entry {key: "host", value: "localhost"} Entry {key: "port", value: "5432"}
users.value contains '=' → parse recursively: Entry {key: "", value: "alice"} Entry {key: "", value: "bob"}Build objects:
{ "database": { "host": "localhost", "port": "5432" }, "users": ["alice", "bob"]}Fixed point reached: “localhost”, “5432”, “alice”, “bob” contain no ’=’ → stop.
Implementation Pattern
Section titled “Implementation Pattern”Pseudocode for recursive parser:
def parse_ccl(text): entries = parse_entries(text) # split on '=' hierarchy = build_hierarchy(entries) # group by indentation return recursively_parse(hierarchy) # fixed point
def recursively_parse(entries): result = {} for entry in entries: value = entry.value
if contains_ccl_syntax(value): # Has '=' character # Recursively parse the value parsed = parse_ccl(value) result[entry.key] = parsed else: # Fixed point: plain string result[entry.key] = value
return resultError Handling Essentials
Section titled “Error Handling Essentials”Malformed input:
- Line with no ’=’ → skip or error (implementation choice)
- Inconsistent indentation → use explicit indentation counting
- Empty lines → ignore
Edge cases:
- Keys with ’=’ in them → impossible, first ’=’ is split point
- Values with ’=’ → fine, parse recursively
- Unicode in keys/values → valid, CCL is UTF-8
- CRLF vs LF → CCL treats only LF as a newline, so CRs present are preserved as-is
Why This Is Core CCL
Section titled “Why This Is Core CCL”From the blog post:
“The simplest possible config language is just key-value pairs. That’s it.”
The recursive fixed-point algorithm is how those key-value pairs create nested structure. The OCaml type definition makes this explicit:
type t = Fix of t Map.Make(String).tThis says: “A CCL value is a fixed point of a map from strings to CCL values.”
The recursion is the structure. The fixed point is the termination.
What’s NOT in This Algorithm
Section titled “What’s NOT in This Algorithm”These are library conveniences, not core CCL:
- Type conversion: “5432” → integer (user’s job)
- Validation: checking required keys (user’s job)
- Dotted key expansion:
foo.bar→ nested object (optional) - Pretty printing: formatting output (implementation detail)
Core CCL is: parse key-value pairs recursively until fixed point.
Implementation Notes
Section titled “Implementation Notes”Different languages can adapt this algorithm:
Functional approach: Use recursive descent with pattern matching OOP approach: Entry/Parser classes with builder pattern Dynamic approach: Dictionaries/hashmaps with recursive loops
The algorithm is the same; the implementation style varies by language.