Architecture

This page describes how the Quale system is organized internally - the pipeline from .quale source files to evolved neural networks, the bytecode virtual machine, the 7-step tick loop, and how the same bytecode runs on both CPU and GPU.

For the DSL syntax itself, see the Language Reference. For CLI usage, see the CLI Reference.

System Pipeline

A Quale experiment flows through a pipeline from source text to evolved connectome:

.quale files
    |
    v
[ Parser ]  -------->  Project (AST)
    |
    v
[ Validator ]  ------>  Diagnostics (errors / warnings)
    |
    v
[ V03 Compiler ]  --->  CompiledProject + bytecode programs
    |                        |
    |    +-------------------+------------------+
    |    |                                      |
    v    v                                      v
[ GenericDomain.Configure() ]          [ EvolutionEngine ]
    |                                      |
    |    +---------------------------------+
    |    |
    v    v
[ RunGeneration loop ]
    |
    v
Evolved connectome (checkpoint + best brain)

Stage 1: Parsing

The parser (parser/project.go) reads one or more .quale files and produces a Project AST. When given a directory, it scans all .quale files alphabetically and merges their definitions into a single project. There are no import statements - every file in the directory is part of the project.

The result is a collection of typed definition lists: BodyDefs, ItemDefs, WorldDefs, DynamicsDefs, FitnessDefs, EvolveDefs, PerceptionDefs, and ActionDefs.

Stage 2: Validation

The validator checks the AST for semantic correctness - things the parser cannot verify syntactically. This includes:

All references resolve (for example, body: Driver in an evolve block must name an existing body definition)
Sensor names declared in the body match those assigned in the perception block
Entity properties referenced in handlers exist on the entity type
World topology and entity declarations are consistent
No duplicate definition names across files

Validation produces a DiagnosticList with source-located error messages. If any errors are present, compilation does not proceed.
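Two of the checks listed above (duplicate names and unresolved references) can be sketched as follows. The Diagnostic and Def types and both function names are illustrative assumptions; the validator's real types are not shown on this page.

```go
package main

import "fmt"

// Diagnostic is a hypothetical source-located message.
type Diagnostic struct {
	File    string
	Line    int
	Message string
}

// Def is a hypothetical named definition with its source position.
type Def struct {
	Name string
	File string
	Line int
}

// checkDuplicates reports each definition whose name was already
// declared earlier, possibly in another file.
func checkDuplicates(defs []Def) []Diagnostic {
	seen := make(map[string]Def)
	var diags []Diagnostic
	for _, d := range defs {
		if first, ok := seen[d.Name]; ok {
			msg := fmt.Sprintf("duplicate definition %q (first declared at %s:%d)",
				d.Name, first.File, first.Line)
			diags = append(diags, Diagnostic{d.File, d.Line, msg})
			continue
		}
		seen[d.Name] = d
	}
	return diags
}

// checkReference reports an error when ref.Name matches no known definition.
func checkReference(ref Def, known []Def) []Diagnostic {
	for _, k := range known {
		if k.Name == ref.Name {
			return nil
		}
	}
	return []Diagnostic{{ref.File, ref.Line, fmt.Sprintf("unknown body %q", ref.Name)}}
}

func main() {
	bodies := []Def{{"Driver", "body.quale", 1}, {"Driver", "extra.quale", 3}}
	diags := checkDuplicates(bodies)
	diags = append(diags, checkReference(Def{"Pilot", "evolve.quale", 2}, bodies)...)
	for _, d := range diags {
		fmt.Printf("%s:%d: error: %s\n", d.File, d.Line, d.Message)
	}
}
```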

Stage 3: Bytecode Compilation

The V03Compiler (vm/compiler_v03.go) transforms validated AST nodes into bytecode programs - flat instruction streams executable by both the Go interpreter and the C interpreter. Key compilation targets:

Perception program - reads agent state, world state, and spatial queries; writes sensor values for brain input
Action program - reads brain actuator outputs; applies physics, state updates, and guards; detects entity crossings
Entity handler programs - on_cross, on_enter, on_pass handlers compiled per entity type
Machine programs - state bodies, transition conditions, on_enter/on_exit hooks
Fitness programs - gate conditions, metric expressions, terminate-when conditions, per-tick accumulator expressions

The compiler builds index maps that translate named identifiers (e.g. agent.speed, sensor signal_aspect) into numeric array indices used by the bytecode. This allows the VM to operate entirely on indexed arrays without string lookups at runtime.
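The idea behind index maps can be shown with a small name-to-slot table. This is a minimal sketch: the IndexMap type and its Intern method are hypothetical names, and "position" is an invented property used only for illustration.

```go
package main

import "fmt"

// IndexMap assigns a stable array slot to each named identifier.
// Hypothetical type; the compiler's real index maps may look different.
type IndexMap struct {
	slots map[string]int
	names []string
}

func NewIndexMap() *IndexMap {
	return &IndexMap{slots: make(map[string]int)}
}

// Intern returns the slot for name, allocating the next free slot the
// first time the name is seen.
func (m *IndexMap) Intern(name string) int {
	if i, ok := m.slots[name]; ok {
		return i
	}
	i := len(m.names)
	m.slots[name] = i
	m.names = append(m.names, name)
	return i
}

func main() {
	agent := NewIndexMap()
	speed := agent.Intern("speed")       // first name gets slot 0
	position := agent.Intern("position") // second name gets slot 1

	// At runtime only the numeric slots are used, never the strings.
	state := make([]float64, len(agent.names))
	state[speed] = 12.5
	fmt.Println(speed, position, agent.Intern("speed"), state[speed])
}
```

Strings are resolved once at compile time, so each bytecode instruction can carry a plain integer operand.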

Stage 4: Domain Configuration

The GenericDomain (runtime/domain.go) receives the compiled project and wires together all bytecode programs into a runnable simulation. Configuration includes:

Building the world topology (route or grid) and populating entity stores from imported CSV data or spawn declarations
Creating perception and action runners that bind bytecode programs to the VM interpreter
Compiling state machines and creating instance factories
Setting up spatial query dispatch tables
Preparing the fitness evaluator with gates, metrics, and verbs

All domain logic comes from the .quale file. The GenericDomain is the only domain implementation - it runs any compiled .quale project without domain-specific Go code.

Stage 5: Evolution

The EvolutionEngine runs the NEAT-style evolution loop: initialize a population of minimal genomes, evaluate each genome through the GenericDomain, speciate, reproduce, and repeat until convergence or the generation limit.

Engine Packages

Package	Purpose
`parser/`	Lexer, parser, validator, and project compiler for `.quale` DSL files
`vm/`	Bytecode VM - opcodes, compiler, interpreter, and program builder
`runtime/`	Tick loop, perception/action runners, machine executor, fitness computation, scenario runner, record store
`world/`	Topology types (route, grid), entity types and stores, spatial queries, CSV import
`core/`	Brain network, nodes, connections, signal propagation, activation functions
`evolution/`	NEAT algorithm - genomes, mutation, crossover, speciation, checkpointing
`gpu/`	C compute library for GPU-accelerated evaluation - bytecode VM, tick loop, machines, fitness
`server/`	HTTP API for checkpoint/brain data (used by browser-based viewers)

Bytecode VM

The VM (vm/) uses a stack-machine architecture where programs are sequences of opcodes that manipulate a float64 value stack. The same bytecode format runs on both the Go interpreter (vm/interpreter.go) and the C interpreter (gpu/core_vm.h).

The opcode set covers:

Category	Opcodes	Description
Arithmetic	`PUSH_CONST`, `PUSH_STATE`, `ADD`, `SUB`, `MUL`, `DIV`, `NEG`	Stack manipulation and basic math
State access	`LOAD_AGENT`, `STORE_AGENT`, `LOAD_WORLD`, `STORE_WORLD`	Read/write agent and world state arrays
Brain I/O	`STORE_SENSOR`, `READ_ACTUATOR`	Bridge between perception/action bytecode and the neural network
Spatial queries	`SPATIAL_QUERY`, `LOAD_ENTITY_PROP`, `LOAD_ENTITY_INDEX`	Execute world queries and access entity properties
Control flow	`JUMP`, `JUMP_IF_FALSE`, `JUMP_IF_TRUE`	Conditional branching for `when` guards and `match` expressions
Boolean/comparison	`NOT`, `AND`, `OR`, `CMP_GT`, `CMP_LT`, `CMP_GE`, `CMP_LE`, `CMP_EQ`, `CMP_NE`	Logic and comparison operators
Math built-ins	`CALL_MIN`, `CALL_MAX`, `CALL_CLAMP`, `CALL_ABS`, `CALL_SQRT`	Standard library functions
State machines	`ENTER_STATE`, `CHECK_ELAPSED`, `LOAD_TIMER`, `STORE_TIMER`, `RESET_TIMER`	Machine state transitions and timing
Records	`RECORD_EMIT`, `CONSUME`	Emit structured event records and remove entities from the world
Miscellaneous	`TERNARY`, `HALT`	Ternary selection and program termination

Opcode values are fixed integers shared between Go and C. Adding a new opcode requires updating both vm/opcode.go and gpu/core_vm.h with matching values.
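To make the stack-machine model concrete, here is a toy interpreter for a handful of opcodes. It is a sketch under stated assumptions: the Opcode constants, their numeric values, and the Instr encoding are invented for illustration and do not match vm/opcode.go or the real bytecode format.

```go
package main

import "fmt"

// Opcode values here are illustrative only.
type Opcode uint8

const (
	OpPushConst   Opcode = iota // push Arg
	OpLoadAgent                 // push agent[int(Arg)]
	OpMul                       // pop b, pop a, push a*b
	OpCmpGT                     // pop b, pop a, push 1 if a > b else 0
	OpJumpIfFalse               // pop; jump to instruction int(Arg) if zero
	OpStoreSensor               // pop into sensors[int(Arg)]
	OpHalt                      // stop execution
)

// Instr is a hypothetical instruction encoding: one opcode, one operand.
type Instr struct {
	Op  Opcode
	Arg float64
}

// run executes prog against indexed agent and sensor arrays.
func run(prog []Instr, agent, sensors []float64) {
	var stack []float64
	pop := func() float64 {
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v
	}
	for pc := 0; pc < len(prog); pc++ {
		in := prog[pc]
		switch in.Op {
		case OpPushConst:
			stack = append(stack, in.Arg)
		case OpLoadAgent:
			stack = append(stack, agent[int(in.Arg)])
		case OpMul:
			b, a := pop(), pop()
			stack = append(stack, a*b)
		case OpCmpGT:
			b, a := pop(), pop()
			v := 0.0
			if a > b {
				v = 1.0
			}
			stack = append(stack, v)
		case OpJumpIfFalse:
			if pop() == 0 {
				pc = int(in.Arg) - 1 // the loop increment lands on the target
			}
		case OpStoreSensor:
			sensors[int(in.Arg)] = pop()
		case OpHalt:
			return
		}
	}
}

func main() {
	// sensors[0] = agent[0] * 0.5; if agent[0] > 10 then sensors[1] = 1
	prog := []Instr{
		{OpLoadAgent, 0}, {OpPushConst, 0.5}, {OpMul, 0}, {OpStoreSensor, 0},
		{OpLoadAgent, 0}, {OpPushConst, 10}, {OpCmpGT, 0}, {OpJumpIfFalse, 10},
		{OpPushConst, 1}, {OpStoreSensor, 1},
		{OpHalt, 0},
	}
	sensors := make([]float64, 2)
	run(prog, []float64{12}, sensors)
	fmt.Println(sensors) // prints "[6 1]"
}
```

The interpreter is a single switch over integer opcodes, so a C version can mirror it case by case. This is why the opcode values must match on both sides.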

Core Components

SignalEngine (`core/signal.go`)

The SignalEngine processes one forward pass through a brain’s connectome each tick. It pre-computes a topological ordering of non-input nodes using Kahn’s algorithm during construction. The propagation loop is efficient but not fully allocation-free - readActuatorOutputs allocates a map each tick.

A single tick:

Apply sensor inputs - write sensor values into mapped input nodes
Propagate - iterate non-input nodes in topological order, accumulating weighted inputs and applying each node’s activation function (plus bias)
Apply plasticity - if the brain has a PlasticityConfig, run Hebbian learning, weight decay, and homeostatic regulation
Read actuator outputs - collect output values from actuator-mapped nodes

The SignalEngine is the innermost loop of the system: it runs once per tick, per scenario, per genome, per generation. Pre-computing the topological ordering and adjacency lists at construction keeps that per-tick work to a minimum.

EvolutionEngine (`evolution/population.go`)

The EvolutionEngine orchestrates the full NEAT-style evolution loop:

Initialize population - create minimal genomes with input/output nodes (and region hidden nodes, if regions are defined)
Evaluate - run each genome through the GenericDomain’s Evaluate() method in parallel (one goroutine per genome, bounded by a semaphore equal to CPU count)
Speciate - group genomes into species based on compatibility distance
Reproduce - select parents via tournament selection, produce offspring through crossover and mutation, with elitism for species with 5+ members

Key types:

EvalResult - fitness score plus a generic Metrics map (behavioral measurements)
GenerationStats - per-generation aggregates (best/avg/worst fitness, species count, best/avg metrics)

For details on all evolution parameters, see the Evolution Configuration Reference.

MutationEngine (`evolution/mutation.go`)

The MutationEngine applies eight NEAT-style mutation operators to genomes, each gated by a per-operator probability:

Operator	Effect
`weight_shift`	Perturb or randomize connection weights
`bias_shift`	Perturb node biases
`add_node`	Split a connection, inserting a new hidden node
`remove_node`	Remove a hidden node and create bypass connections
`add_connection`	Add a connection between two unconnected nodes
`remove_connection`	Disable a connection (biased toward weak connections)
`rewire`	Move one endpoint of a connection to a different node
`change_activation`	Switch a hidden node’s activation function

When regions are defined, structural mutations are region-aware: new nodes inherit region assignments and new connections preferentially stay within the same region.
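As an example of a structural operator, add_node can be sketched using the convention from classic NEAT: disable the split connection, give the new incoming link weight 1.0, and give the new outgoing link the old weight. Whether Quale follows this exact convention is an assumption, and the types below are hypothetical.

```go
package main

import "fmt"

// ConnGene and Genome are hypothetical minimal genome types.
type ConnGene struct {
	From, To int
	Weight   float64
	Enabled  bool
}

type Genome struct {
	NextNode int // next unused node ID
	Conns    []ConnGene
}

// addNode splits connection idx by inserting a new hidden node.
// Classic NEAT convention (assumed here): the old connection is
// disabled, From->new gets weight 1.0, and new->To keeps the old
// weight, so the mutation initially changes behavior very little.
func addNode(g *Genome, idx int) int {
	old := g.Conns[idx]
	g.Conns[idx].Enabled = false
	n := g.NextNode
	g.NextNode++
	g.Conns = append(g.Conns,
		ConnGene{From: old.From, To: n, Weight: 1.0, Enabled: true},
		ConnGene{From: n, To: old.To, Weight: old.Weight, Enabled: true},
	)
	return n
}

func main() {
	g := &Genome{NextNode: 3, Conns: []ConnGene{{From: 0, To: 2, Weight: 0.7, Enabled: true}}}
	hidden := addNode(g, 0)
	fmt.Println("new hidden node:", hidden)
	for _, c := range g.Conns {
		fmt.Printf("%d -> %d  w=%.1f enabled=%v\n", c.From, c.To, c.Weight, c.Enabled)
	}
}
```

A region-aware version would additionally copy a region label from the split connection's endpoints onto the new node.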

Tick Loop

Each simulation tick follows a fixed 7-step pipeline. The tick loop (runtime/tick.go) drives this sequence, coordinating bytecode execution, brain propagation, and dynamics:

1. World machines      Execute world-scope state machines (e.g. preceding
                       train movement, block signal cascade)

2. Perception          Run the compiled perception program - reads agent
                       state, world state, and spatial queries; writes
                       sensor values for the brain

3. Brain fires         SignalEngine.Tick() propagates signals through
                       the connectome (injected as a callback)

4. Action + entities   Run the compiled action program - reads actuator
                       outputs, applies physics, updates agent state.
                       Detects entity crossings and fires on_cross,
                       on_enter, on_pass handler programs

5. Agent machines      Execute agent-scope state machines (e.g. AWS
                       alert/acknowledge, vigilance monitoring)

6. Dynamics cascade    DynamicsEngine.Tick() applies per-tick and
                       conditional state rules, checks death conditions

7. Fitness accumulators   Update per-tick metric accumulators and
                          evaluate terminate-when conditions

Steps 1-7 repeat for each tick in a scenario. Multiple scenarios are run per genome evaluation, with results averaged. The tick loop is entirely driven by bytecode - the only external callback is the brain’s forward pass at step 3.
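The seven steps above can be pictured as a fixed sequence of calls with the brain injected as a callback. The Hooks struct and runTick function are hypothetical names for this sketch. It also assumes all seven steps run on every tick, including the tick on which the agent dies; runtime/tick.go may handle that case differently.

```go
package main

import "fmt"

// Hooks holds one function per pipeline step. In the real system most
// of these would execute compiled bytecode; here they are plain closures.
type Hooks struct {
	WorldMachines func()
	Perception    func() []float64                  // returns sensor values
	Brain         func(sensors []float64) []float64 // injected callback
	Action        func(actuators []float64)
	AgentMachines func()
	Dynamics      func() (alive bool)
	Fitness       func() (terminate bool)
}

// runTick executes the seven steps in order and reports whether the
// scenario should continue.
func runTick(h Hooks) bool {
	h.WorldMachines()               // 1. world machines
	sensors := h.Perception()       // 2. perception
	actuators := h.Brain(sensors)   // 3. brain fires
	h.Action(actuators)             // 4. action + entities
	h.AgentMachines()               // 5. agent machines
	alive := h.Dynamics()           // 6. dynamics cascade
	terminate := h.Fitness()        // 7. fitness accumulators
	return alive && !terminate
}

func main() {
	position := 0.0
	h := Hooks{
		WorldMachines: func() {},
		Perception:    func() []float64 { return []float64{position} },
		// Stand-in for the brain's forward pass: always drive forward.
		Brain:         func(s []float64) []float64 { return []float64{1.0} },
		Action:        func(a []float64) { position += a[0] },
		AgentMachines: func() {},
		Dynamics:      func() bool { return true },
		Fitness:       func() bool { return position >= 3 },
	}
	ticks := 0
	for {
		ticks++
		if !runTick(h) {
			break
		}
	}
	fmt.Println(ticks, position) // prints "3 3"
}
```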

Scenario Runner

The scenario runner (runtime/scenario.go) bridges the evolution engine and the tick loop. For each scenario it:

Creates a SignalEngine from the genome’s brain
Wraps it as the tick loop’s brain callback
Runs the tick loop until the agent dies, a terminate condition fires, or the tick budget runs out
Computes fitness from final state, records, and accumulators using ComputeFitness()
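The three stop conditions and the per-genome averaging of scenario scores can be sketched as follows. TickResult, runScenario, and averageFitness are hypothetical names; the real runner in runtime/scenario.go also constructs the SignalEngine and calls ComputeFitness(), which this sketch leaves out.

```go
package main

import "fmt"

// TickResult reports what a single tick observed. Hypothetical type.
type TickResult struct {
	Dead      bool
	Terminate bool
}

// runScenario calls tick until the agent dies, a terminate condition
// fires, or the tick budget is used up. It returns the number of ticks
// executed and the reason the scenario stopped.
func runScenario(budget int, tick func(t int) TickResult) (int, string) {
	for t := 0; t < budget; t++ {
		r := tick(t)
		if r.Dead {
			return t + 1, "agent died"
		}
		if r.Terminate {
			return t + 1, "terminate condition"
		}
	}
	return budget, "tick budget exhausted"
}

// averageFitness combines per-scenario scores into one genome score.
func averageFitness(scores []float64) float64 {
	if len(scores) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range scores {
		sum += s
	}
	return sum / float64(len(scores))
}

func main() {
	ticks, reason := runScenario(100, func(t int) TickResult {
		return TickResult{Dead: t == 41} // the agent dies on its 42nd tick
	})
	fmt.Println(ticks, reason)
	fmt.Println(averageFitness([]float64{0.5, 1.0, 1.5}))
}
```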

Fitness Computation

Fitness is evaluated at the end of each scenario by the fitness evaluator (runtime/fitness.go). The computation follows a fixed order:

Gates - hard constraints that can zero out the score (e.g. gate alive)
Metrics - named numeric values computed from agent state, records, or tick accumulators
Verbs - weighted combination directives (maximize, penalize, reward) that produce the final scalar score
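The gate, metric, verb order can be illustrated with a small scoring function. How verbs actually combine is not specified on this page. The sketch assumes that maximize and reward add weight × metric, that penalize subtracts it, and that any failed gate yields a score of zero. The Verb type and the metric names are invented for illustration.

```go
package main

import "fmt"

// Verb is a hypothetical weighted directive referring to a named metric.
type Verb struct {
	Kind   string // "maximize", "reward", or "penalize"
	Metric string
	Weight float64
}

// score applies gates first, then folds metrics into a scalar via verbs.
// Assumed semantics: a failed gate zeroes the score; maximize/reward add
// Weight*metric; penalize subtracts Weight*metric.
func score(gates []bool, metrics map[string]float64, verbs []Verb) float64 {
	for _, passed := range gates {
		if !passed {
			return 0
		}
	}
	total := 0.0
	for _, v := range verbs {
		m := metrics[v.Metric]
		switch v.Kind {
		case "maximize", "reward":
			total += v.Weight * m
		case "penalize":
			total -= v.Weight * m
		}
	}
	return total
}

func main() {
	metrics := map[string]float64{"distance": 100, "overspeed_ticks": 4}
	verbs := []Verb{
		{"maximize", "distance", 1.0},
		{"penalize", "overspeed_ticks", 2.5},
	}
	fmt.Println(score([]bool{true}, metrics, verbs))  // prints "90"
	fmt.Println(score([]bool{false}, metrics, verbs)) // prints "0"
}
```

Evaluating gates first means a genome that violates a hard constraint cannot make up for it with high metric values.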

Metrics come in four types:

Simple - a single expression against final agent state
Per-record - iterates over emitted records with avg/sum/min/max aggregation
Per-tick - reads from a running accumulator updated during the tick loop
Direct aggregate - references other metric names or raw counters

Parallel Evaluation

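Genome evaluation is the system’s primary bottleneck. Each generation evaluates `population_size` genomes, and each genome runs `scenarios` independent scenarios of up to `ticks` ticks each, so the cost of one generation scales as `population_size` × `scenarios` × `ticks`.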

The engine parallelizes this by evaluating genomes concurrently:

One goroutine per genome
A semaphore limits concurrency to runtime.NumCPU() goroutines
Each goroutine receives its own deterministic RNG (seeded from the engine’s RNG before the parallel section begins, ensuring reproducibility regardless of scheduling order)
Each goroutine builds its own Brain from the genome, creates its own world state, and runs independently - no shared mutable state

The Domain.Evaluate() method must be goroutine-safe. Since each call receives its own Brain, rng, and creates its own world, this is satisfied naturally without locks.

GPU Architecture

The gpu/ directory contains a C compute library that runs the same bytecode programs on GPU hardware. Each C header mirrors one Go source file:

C Header	Go Equivalent	Purpose
`core_vm.h`	`vm/interpreter.go`	Bytecode interpreter (stack machine)
`core_tick.h`	`runtime/tick.go`	7-step tick loop
`core_machine.h`	`runtime/machine.go`	State machine executor
`core_fitness.h`	`runtime/fitness.go`	Fitness computation (gates, metrics, verbs)
`core_brain.h`	`core/signal.go`	Brain signal propagation
`core_dynamics.h`	`evolution/dynamics.go`	Dynamics cascade

All C functions are static inline in header files for direct inclusion in GPU kernel files. The opcode enum values in core_vm.h match vm/opcode.go exactly - both interpreters execute the same compiled bytecode format.

The GPU evaluation path runs all scenarios for a genome in parallel on the device, using the same 7-step tick pipeline. The Go side marshals genome and bytecode data to the C library, launches evaluation, and reads back fitness results.

To enable GPU evaluation, build the C library with CMake and pass the --gpu flag to quale evolve. See the CLI Reference for details.

File Format Summary

Extension	Purpose	Format
`.quale`	Source specification files	Text (Quale DSL)
`.quale-ckpt`	Evolution checkpoints (full population state)	Binary (Go gob encoding)
`.quale-brain`	Evolved brain snapshots	Binary (Go gob encoding)
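For readers unfamiliar with gob, the round trip below shows how a Go struct is serialized with the standard encoding/gob package. The BrainSnapshot type and its fields are invented for this sketch; the actual layout of .quale-ckpt and .quale-brain files is not documented on this page.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// BrainSnapshot is a hypothetical payload, not the real file schema.
type BrainSnapshot struct {
	Generation int
	Fitness    float64
	Weights    []float64
}

// encode serializes a snapshot to gob bytes.
func encode(s BrainSnapshot) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode reads a snapshot back from gob bytes.
func decode(data []byte) (BrainSnapshot, error) {
	var s BrainSnapshot
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&s)
	return s, err
}

func main() {
	data, err := encode(BrainSnapshot{Generation: 120, Fitness: 0.83, Weights: []float64{0.1, -0.4}})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	back, err := decode(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(back.Generation, back.Fitness, back.Weights)
}
```

Gob is specific to Go, so tools written in other languages cannot read these files directly. For browser-based viewers, the HTTP API in server/ exposes checkpoint and brain data.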