
Architecture

This page describes how the Quale system is organized internally - the pipeline from .quale source files to evolved neural networks, the bytecode virtual machine, the 7-step tick loop, and how the same bytecode runs on both CPU and GPU.

For the DSL syntax itself, see the Language Reference. For CLI usage, see the CLI Reference.


A Quale experiment flows through a pipeline from source text to evolved connectome:

    .quale files
         |
         v
    [ Parser ] --------> Project (AST)
         |
         v
    [ Validator ] ------> Diagnostics (errors / warnings)
         |
         v
    [ V03 Compiler ] ---> CompiledProject + bytecode programs
         |                          |
         |      +-------------------+------------------+
         |      |                                      |
         v      v                                      v
    [ GenericDomain.Configure() ]            [ EvolutionEngine ]
         |                                             |
         |           +---------------------------------+
         |           |
         v           v
    [ RunGeneration loop ]
         |
         v
    Evolved connectome (checkpoint + best brain)

The parser (parser/project.go) reads one or more .quale files and produces a Project AST. When given a directory, it scans all .quale files alphabetically and merges their definitions into a single project. There are no import statements - every file in the directory is part of the project.
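The directory scan described above can be sketched roughly as follows. This is an illustration only, not the code in parser/project.go: the function names FilterQuale and ListQualeFiles are hypothetical, and the sketch assumes the scan is non-recursive, which the text does not state.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// FilterQuale keeps only names ending in .quale and sorts them alphabetically.
// Hypothetical helper for illustration.
func FilterQuale(names []string) []string {
	var out []string
	for _, n := range names {
		if filepath.Ext(n) == ".quale" {
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}

// ListQualeFiles returns the .quale files directly inside dir in load order.
// Assumption: subdirectories are skipped.
func ListQualeFiles(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if !e.IsDir() {
			names = append(names, e.Name())
		}
	}
	return FilterQuale(names), nil
}

func main() {
	files, err := ListQualeFiles(".")
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Println(files)
}
```

Because the load order is alphabetical rather than declared, renaming a file can change the order in which definitions are merged, although every file still ends up in the same project.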

The result is a collection of typed definition lists: BodyDefs, ItemDefs, WorldDefs, DynamicsDefs, FitnessDefs, EvolveDefs, PerceptionDefs, and ActionDefs.

The validator checks the AST for semantic correctness - things the parser cannot verify syntactically. This includes:

  • All references resolve (for example, the evolve block’s body: Driver must point to a real body definition)
  • Sensor names declared in the body match those assigned in the perception block
  • Entity properties referenced in handlers exist on the entity type
  • World topology and entity declarations are consistent
  • No duplicate definition names across files

Validation produces a DiagnosticList with source-located error messages. If any errors are present, compilation does not proceed.

The V03Compiler (vm/compiler_v03.go) transforms validated AST nodes into bytecode programs - flat instruction streams executable by both the Go interpreter and the C interpreter. Key compilation targets:

  • Perception program - reads agent state, world state, and spatial queries; writes sensor values for brain input
  • Action program - reads brain actuator outputs; applies physics, state updates, and guards; detects entity crossings
  • Entity handler programs - on_cross, on_enter, on_pass handlers compiled per entity type
  • Machine programs - state bodies, transition conditions, on_enter/on_exit hooks
  • Fitness programs - gate conditions, metric expressions, terminate-when conditions, per-tick accumulator expressions

The compiler builds index maps that translate named identifiers (e.g. agent.speed, sensor signal_aspect) into numeric array indices used by the bytecode. This allows the VM to operate entirely on indexed arrays without string lookups at runtime.
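A minimal sketch of the idea behind these index maps, assuming a simple "first use gets the next slot" policy. The IndexMap type and the fuel property are hypothetical; the real compiler's data structures may look quite different.

```go
package main

import "fmt"

// IndexMap assigns each named identifier a stable slot in a flat array.
// Hypothetical type for illustration only.
type IndexMap struct {
	index map[string]int
	names []string
}

func NewIndexMap() *IndexMap {
	return &IndexMap{index: make(map[string]int)}
}

// Intern returns the slot for name, allocating the next free slot on first use.
func (m *IndexMap) Intern(name string) int {
	if i, ok := m.index[name]; ok {
		return i
	}
	i := len(m.names)
	m.index[name] = i
	m.names = append(m.names, name)
	return i
}

// Len reports how many slots have been allocated.
func (m *IndexMap) Len() int { return len(m.names) }

func main() {
	agent := NewIndexMap()
	speed := agent.Intern("speed") // compile time: name -> index
	fuel := agent.Intern("fuel")   // hypothetical second property

	// Runtime: the VM sees only the array and the numeric slots.
	state := make([]float64, agent.Len())
	state[speed] = 12.5
	state[fuel] = 80
	fmt.Println(speed, fuel, state)
}
```

All string handling happens once, at compile time; the emitted bytecode would carry only the integers returned by Intern.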

The GenericDomain (runtime/domain.go) receives the compiled project and wires together all bytecode programs into a runnable simulation. Configuration includes:

  • Building the world topology (route or grid) and populating entity stores from imported CSV data or spawn declarations
  • Creating perception and action runners that bind bytecode programs to the VM interpreter
  • Compiling state machines and creating instance factories
  • Setting up spatial query dispatch tables
  • Preparing the fitness evaluator with gates, metrics, and verbs

All domain logic comes from the .quale file. The GenericDomain is the only domain implementation - it reads any .quale file without domain-specific Go code.

The EvolutionEngine runs the NEAT-style evolution loop: initialize a population of minimal genomes, evaluate each genome through the GenericDomain, speciate, reproduce, and repeat until convergence or the generation limit.


| Package | Purpose |
|---|---|
| parser/ | Lexer, parser, validator, and project compiler for .quale DSL files |
| vm/ | Bytecode VM - opcodes, compiler, interpreter, and program builder |
| runtime/ | Tick loop, perception/action runners, machine executor, fitness computation, scenario runner, record store |
| world/ | Topology types (route, grid), entity types and stores, spatial queries, CSV import |
| core/ | Brain network, nodes, connections, signal propagation, activation functions |
| evolution/ | NEAT algorithm - genomes, mutation, crossover, speciation, checkpointing |
| gpu/ | C compute library for GPU-accelerated evaluation - bytecode VM, tick loop, machines, fitness |
| server/ | HTTP API for checkpoint/brain data (used by browser-based viewers) |

The VM (vm/) uses a stack-machine architecture where programs are sequences of opcodes that manipulate a float64 value stack. The same bytecode format runs on both the Go interpreter (vm/interpreter.go) and the C interpreter (gpu/core_vm.h).

The opcode set covers:

| Category | Opcodes | Description |
|---|---|---|
| Arithmetic | PUSH_CONST, PUSH_STATE, ADD, SUB, MUL, DIV, NEG | Stack manipulation and basic math |
| State access | LOAD_AGENT, STORE_AGENT, LOAD_WORLD, STORE_WORLD | Read/write agent and world state arrays |
| Brain I/O | STORE_SENSOR, READ_ACTUATOR | Bridge between perception/action bytecode and the neural network |
| Spatial queries | SPATIAL_QUERY, LOAD_ENTITY_PROP, LOAD_ENTITY_INDEX | Execute world queries and access entity properties |
| Control flow | JUMP, JUMP_IF_FALSE, JUMP_IF_TRUE | Conditional branching for when guards and match expressions |
| Boolean/comparison | NOT, AND, OR, CMP_GT, CMP_LT, CMP_GE, CMP_LE, CMP_EQ, CMP_NE | Logic and comparison operators |
| Math built-ins | CALL_MIN, CALL_MAX, CALL_CLAMP, CALL_ABS, CALL_SQRT | Standard library functions |
| State machines | ENTER_STATE, CHECK_ELAPSED, LOAD_TIMER, STORE_TIMER, RESET_TIMER | Machine state transitions and timing |
| Records | RECORD_EMIT, CONSUME | Emit structured event records and remove entities from the world |
| Control | TERNARY, HALT | Ternary selection and program termination |

Opcode values are fixed integers shared between Go and C. Adding a new opcode requires updating both vm/opcode.go and gpu/core_vm.h with matching values.
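To make the stack-machine model concrete, here is a toy interpreter covering a handful of the opcodes listed above. It is a sketch under stated assumptions: the numeric opcode values, the Instr layout, and the Run function are invented for illustration and do not reflect vm/opcode.go or vm/interpreter.go. It also skips stack bounds checks.

```go
package main

import "fmt"

// Op values here are illustrative; the real values are fixed in vm/opcode.go.
type Op int

const (
	OpPushConst Op = iota
	OpLoadAgent
	OpAdd
	OpMul
	OpHalt
)

// Instr is a hypothetical instruction layout: an opcode plus one operand.
type Instr struct {
	Op  Op
	Arg float64 // constant for OpPushConst, array index for OpLoadAgent
}

// Run executes prog against the agent state array and returns the value
// on top of the float64 stack when OpHalt is reached.
func Run(prog []Instr, agent []float64) float64 {
	stack := make([]float64, 0, 16)
	pop := func() float64 {
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v
	}
	for _, in := range prog {
		switch in.Op {
		case OpPushConst:
			stack = append(stack, in.Arg)
		case OpLoadAgent:
			stack = append(stack, agent[int(in.Arg)])
		case OpAdd:
			b, a := pop(), pop()
			stack = append(stack, a+b)
		case OpMul:
			b, a := pop(), pop()
			stack = append(stack, a*b)
		case OpHalt:
			return pop()
		}
	}
	return 0
}

func main() {
	// Computes agent[0]*2 + 1.
	prog := []Instr{
		{Op: OpLoadAgent, Arg: 0},
		{Op: OpPushConst, Arg: 2},
		{Op: OpMul},
		{Op: OpPushConst, Arg: 1},
		{Op: OpAdd},
		{Op: OpHalt},
	}
	fmt.Println(Run(prog, []float64{3})) // prints 7
}
```

A flat instruction slice like this has no pointers or strings, which is what makes it practical to hand the same program to a second interpreter written in C.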


The SignalEngine processes one forward pass through a brain’s connectome each tick. It pre-computes a topological ordering of non-input nodes using Kahn’s algorithm during construction. The propagation loop is efficient but not fully allocation-free - readActuatorOutputs allocates a map each tick.

A single tick:

  1. Apply sensor inputs - write sensor values into mapped input nodes
  2. Propagate - iterate non-input nodes in topological order, accumulating weighted inputs and applying each node’s activation function (plus bias)
  3. Apply plasticity - if the brain has a PlasticityConfig, run Hebbian learning, weight decay, and homeostatic regulation
  4. Read actuator outputs - collect output values from actuator-mapped nodes

The SignalEngine is the innermost loop of the system. It runs once per tick, per scenario, per genome, per generation. Topological ordering and adjacency lists are pre-computed during construction to minimize per-tick work.
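The construction-time ordering and the per-tick forward pass can be sketched as below. This is not the code in core/: the Engine type and its fields are hypothetical, tanh is an assumed activation function, plasticity is omitted, and the sketch simply rejects cyclic graphs, which may not match how the real engine treats recurrent connections.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

type Conn struct {
	From, To int
	Weight   float64
}

// Engine holds everything pre-computed at construction time.
type Engine struct {
	Order    []int    // non-input nodes in topological order
	incoming [][]Conn // adjacency list: connections arriving at each node
	bias     []float64
	Values   []float64
}

// NewEngine orders the nodes with Kahn's algorithm.
func NewEngine(n int, isInput []bool, bias []float64, conns []Conn) (*Engine, error) {
	indeg := make([]int, n)
	outgoing := make([][]int, n)
	incoming := make([][]Conn, n)
	for _, c := range conns {
		indeg[c.To]++
		outgoing[c.From] = append(outgoing[c.From], c.To)
		incoming[c.To] = append(incoming[c.To], c)
	}
	var queue, order []int
	for i := 0; i < n; i++ {
		if indeg[i] == 0 {
			queue = append(queue, i)
		}
	}
	visited := 0
	for len(queue) > 0 {
		u := queue[0]
		queue = queue[1:]
		visited++
		if !isInput[u] {
			order = append(order, u)
		}
		for _, v := range outgoing[u] {
			indeg[v]--
			if indeg[v] == 0 {
				queue = append(queue, v)
			}
		}
	}
	if visited != n {
		return nil, errors.New("connectome contains a cycle")
	}
	return &Engine{Order: order, incoming: incoming, bias: bias, Values: make([]float64, n)}, nil
}

// Tick writes sensor inputs, then propagates once in topological order.
func (e *Engine) Tick(inputs map[int]float64) {
	for node, v := range inputs {
		e.Values[node] = v
	}
	for _, node := range e.Order {
		sum := e.bias[node]
		for _, c := range e.incoming[node] {
			sum += e.Values[c.From] * c.Weight
		}
		e.Values[node] = math.Tanh(sum)
	}
}

func main() {
	// Node 0 is an input, 1 is hidden, 2 is an output.
	conns := []Conn{{0, 1, 1}, {1, 2, 1}, {0, 2, 0.5}}
	e, err := NewEngine(3, []bool{true, false, false}, []float64{0, 0, 0}, conns)
	if err != nil {
		fmt.Println(err)
		return
	}
	e.Tick(map[int]float64{0: 1})
	fmt.Println(e.Order, e.Values)
}
```

Because the order is fixed at construction, each tick is a single linear sweep: every node is visited exactly once, after all of its upstream nodes have been updated.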

The EvolutionEngine orchestrates the full NEAT-style evolution loop:

  1. Initialize population - create minimal genomes with input/output nodes (and region hidden nodes, if regions are defined)
  2. Evaluate - run each genome through the GenericDomain’s Evaluate() method in parallel (one goroutine per genome, bounded by a semaphore equal to CPU count)
  3. Speciate - group genomes into species based on compatibility distance
  4. Reproduce - select parents via tournament selection, produce offspring through crossover and mutation, with elitism for species with 5+ members

Key types:

  • EvalResult - fitness score plus a generic Metrics map (behavioral measurements)
  • GenerationStats - per-generation aggregates (best/avg/worst fitness, species count, best/avg metrics)

For details on all evolution parameters, see the Evolution Configuration Reference.
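As an illustration of the parent-selection step, the following is a generic tournament selection sketch. It assumes sampling with replacement and a caller-supplied tournament size k; the Genome struct, Tournament, and PickParent are hypothetical names, and the engine's actual selection details are not described on this page.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Genome is a hypothetical stand-in carrying only what selection needs.
type Genome struct {
	ID      int
	Fitness float64
}

// Tournament samples k genomes (with replacement) and returns the fittest.
func Tournament(pop []Genome, k int, rng *rand.Rand) Genome {
	best := pop[rng.Intn(len(pop))]
	for i := 1; i < k; i++ {
		c := pop[rng.Intn(len(pop))]
		if c.Fitness > best.Fitness {
			best = c
		}
	}
	return best
}

// PickParent is a convenience wrapper that builds a seeded RNG.
func PickParent(pop []Genome, k int, seed int64) Genome {
	return Tournament(pop, k, rand.New(rand.NewSource(seed)))
}

func main() {
	pop := []Genome{{1, 0.2}, {2, 0.9}, {3, 0.5}, {4, 0.1}}
	fmt.Println(PickParent(pop, 3, 42))
}
```

Larger values of k increase selection pressure, since the fittest genome in the population is more likely to appear in each sample.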

The MutationEngine applies eight NEAT-style mutation operators to genomes, each gated by a per-operator probability:

| Operator | Effect |
|---|---|
| weight_shift | Perturb or randomize connection weights |
| bias_shift | Perturb node biases |
| add_node | Split a connection, inserting a new hidden node |
| remove_node | Remove a hidden node and create bypass connections |
| add_connection | Add a connection between two unconnected nodes |
| remove_connection | Disable a connection (biased toward weak connections) |
| rewire | Move one endpoint of a connection to a different node |
| change_activation | Switch a hidden node’s activation function |

When regions are defined, structural mutations are region-aware: new nodes inherit region assignments and new connections preferentially stay within the same region.
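The sketch below shows how per-operator probability gating and the add_node split might look for two of the eight operators. It is illustrative only: the Genome and Conn types are hypothetical, regions are ignored, and the choice to disable the old connection and give the new pair weights of 1.0 and the old weight is the common NEAT convention, assumed here rather than taken from the Quale source.

```go
package main

import (
	"fmt"
	"math/rand"
)

type Conn struct {
	From, To int
	Weight   float64
	Enabled  bool
}

type Genome struct {
	NumNodes int
	Conns    []Conn
}

// AddNode splits connection i. Assumed convention: the old link is disabled
// and replaced by from->new (weight 1) and new->to (old weight).
func AddNode(g *Genome, i int) int {
	old := g.Conns[i]
	g.Conns[i].Enabled = false
	id := g.NumNodes
	g.NumNodes++
	g.Conns = append(g.Conns,
		Conn{old.From, id, 1.0, true},
		Conn{id, old.To, old.Weight, true})
	return id
}

// Mutate rolls once per operator, so several operators can fire in one call.
func Mutate(g *Genome, rng *rand.Rand, probs map[string]float64) {
	if rng.Float64() < probs["weight_shift"] {
		for i := range g.Conns {
			g.Conns[i].Weight += rng.NormFloat64() * 0.1 // assumed step size
		}
	}
	if rng.Float64() < probs["add_node"] && len(g.Conns) > 0 {
		AddNode(g, rng.Intn(len(g.Conns)))
	}
}

func main() {
	rng := rand.New(rand.NewSource(1))
	g := &Genome{NumNodes: 2, Conns: []Conn{{0, 1, 0.5, true}}}
	Mutate(g, rng, map[string]float64{"weight_shift": 0.8, "add_node": 0.03})
	fmt.Printf("%+v\n", *g)
}
```

Carrying the old weight onto the outgoing half of the split keeps the new structure's initial behavior close to the original connection, which gives the offspring a chance to survive long enough for the new node to be tuned.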


Each simulation tick follows a fixed 7-step pipeline. The tick loop (runtime/tick.go) drives this sequence, coordinating bytecode execution, brain propagation, and dynamics:

  1. World machines - Execute world-scope state machines (e.g. preceding train movement, block signal cascade)
  2. Perception - Run the compiled perception program - reads agent state, world state, and spatial queries; writes sensor values for the brain
  3. Brain fires - SignalEngine.Tick() propagates signals through the connectome (injected as a callback)
  4. Action + entities - Run the compiled action program - reads actuator outputs, applies physics, updates agent state. Detects entity crossings and fires on_cross, on_enter, on_pass handler programs
  5. Agent machines - Execute agent-scope state machines (e.g. AWS alert/acknowledge, vigilance monitoring)
  6. Dynamics cascade - DynamicsEngine.Tick() applies per-tick and conditional state rules, checks death conditions
  7. Fitness accumulators - Update per-tick metric accumulators and evaluate terminate-when conditions

Steps 1-7 repeat for each tick in a scenario. Multiple scenarios are run per genome evaluation, with results averaged. The tick loop is entirely driven by bytecode - the only external callback is the brain’s forward pass at step 3.
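The shape of this loop, with the brain injected as the only callback, can be sketched with a toy world. Everything here is hypothetical (the State fields, BrainFunc, Tick, RunScenario); in the real runtime each step runs compiled bytecode rather than hand-written Go.

```go
package main

import "fmt"

// State is a toy agent: it loses one unit of energy per tick and dies at zero.
type State struct {
	Energy   float64
	Position float64
	Alive    bool
	Sensors  []float64
	TicksRun int // per-tick accumulator
}

// BrainFunc is the injected forward pass: sensors in, actuators out.
type BrainFunc func(sensors []float64) []float64

// Tick runs the seven steps once, in order.
func Tick(s *State, brain BrainFunc) {
	// 1. World machines: nothing to advance in this toy world.
	// 2. Perception: expose agent state to the brain.
	s.Sensors[0] = s.Energy
	// 3. Brain fires: the only external callback.
	actuators := brain(s.Sensors)
	// 4. Action + entities: apply actuator output to agent state.
	s.Position += actuators[0]
	// 5. Agent machines: none in this toy world.
	// 6. Dynamics cascade: per-tick rule plus death condition.
	s.Energy--
	if s.Energy <= 0 {
		s.Alive = false
	}
	// 7. Fitness accumulators.
	s.TicksRun++
}

// RunScenario ticks until the agent dies or the tick budget runs out.
func RunScenario(s *State, brain BrainFunc, maxTicks int) {
	for t := 0; t < maxTicks && s.Alive; t++ {
		Tick(s, brain)
	}
}

func main() {
	s := &State{Energy: 3, Alive: true, Sensors: make([]float64, 1)}
	constant := func([]float64) []float64 { return []float64{2} }
	RunScenario(s, constant, 100)
	fmt.Println(s.Position, s.TicksRun, s.Alive) // prints 6 3 false
}
```

Passing the brain in as a function value keeps the loop independent of how the network is implemented, which mirrors the callback arrangement described in step 3.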

The scenario runner (runtime/scenario.go) bridges the evolution engine and the tick loop. For each scenario it:

  1. Creates a SignalEngine from the genome’s brain
  2. Wraps it as the tick loop’s brain callback
  3. Runs the tick loop until the agent dies, a terminate condition fires, or the tick budget runs out
  4. Computes fitness from final state, records, and accumulators using ComputeFitness()

Fitness is evaluated at the end of each scenario by the fitness evaluator (runtime/fitness.go). The computation follows a fixed order:

  1. Gates - hard constraints that can zero out the score (e.g. gate alive)
  2. Metrics - named numeric values computed from agent state, records, or tick accumulators
  3. Verbs - weighted combination directives (maximize, penalize, reward) that produce the final scalar score

Metrics come in four types:

  • Simple - a single expression against final agent state
  • Per-record - iterates over emitted records with avg/sum/min/max aggregation
  • Per-tick - reads from a running accumulator updated during the tick loop
  • Direct aggregate - references other metric names or raw counters
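The gates, metrics, verbs order can be illustrated with a small scoring function. This is a sketch, not ComputeFitness(): the Verb struct and ComputeScore are hypothetical, a failed gate is assumed to zero the score outright, and penalize is modeled as a negative weight, which may differ from how the real verbs are combined.

```go
package main

import "fmt"

// Verb is a hypothetical weighted directive over a named metric.
// Positive weights stand in for maximize/reward, negative for penalize.
type Verb struct {
	Metric string
	Weight float64
}

// ComputeScore applies gates first, then combines metrics through verbs.
func ComputeScore(gates []bool, metrics map[string]float64, verbs []Verb) float64 {
	// 1. Gates: any failed gate zeroes the score.
	for _, ok := range gates {
		if !ok {
			return 0
		}
	}
	// 2. Metrics are assumed to be already computed and passed in by name.
	// 3. Verbs: weighted sum producing the final scalar.
	score := 0.0
	for _, v := range verbs {
		score += v.Weight * metrics[v.Metric]
	}
	return score
}

func main() {
	metrics := map[string]float64{"distance": 10, "collisions": 2}
	verbs := []Verb{{"distance", 1}, {"collisions", -3}}
	fmt.Println(ComputeScore([]bool{true}, metrics, verbs))  // prints 4
	fmt.Println(ComputeScore([]bool{false}, metrics, verbs)) // prints 0
}
```

Checking gates before anything else means a genome that violates a hard constraint cannot recover its score through strong metrics.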

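Genome evaluation is the system’s primary bottleneck. Each generation evaluates population_size genomes, and each genome runs scenarios independent scenarios of up to ticks ticks each (scenarios and ticks here are the configured counts), so the per-generation cost grows with the product population_size × scenarios × ticks.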

The engine parallelizes this by evaluating genomes concurrently:

  • One goroutine per genome
  • A semaphore limits concurrency to runtime.NumCPU() goroutines
  • Each goroutine receives its own deterministic RNG (seeded from the engine’s RNG before the parallel section begins, ensuring reproducibility regardless of scheduling order)
  • Each goroutine builds its own Brain from the genome, creates its own world state, and runs independently - no shared mutable state

The Domain.Evaluate() method must be goroutine-safe. Since each call receives its own Brain and rng and creates its own world, this requirement is met without locks.
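The concurrency pattern described above (one goroutine per genome, a semaphore sized to the CPU count, seeds drawn before the parallel section) can be sketched as follows. EvaluateAll and RunDemo are hypothetical names, a genome is reduced to a single float64, and the evaluate callback merely stands in for Domain.Evaluate().

```go
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"sync"
)

// EvaluateAll scores every genome concurrently. evaluate stands in for
// Domain.Evaluate and must not touch shared mutable state.
func EvaluateAll(genomes []float64, master *rand.Rand,
	evaluate func(g float64, rng *rand.Rand) float64) []float64 {

	// Draw one seed per genome sequentially, before any goroutine starts,
	// so results do not depend on scheduling order.
	seeds := make([]int64, len(genomes))
	for i := range seeds {
		seeds[i] = master.Int63()
	}

	results := make([]float64, len(genomes))
	sem := make(chan struct{}, runtime.NumCPU())
	var wg sync.WaitGroup
	for i, g := range genomes {
		wg.Add(1)
		go func(i int, g float64) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			rng := rand.New(rand.NewSource(seeds[i]))
			results[i] = evaluate(g, rng) // each goroutine owns index i
		}(i, g)
	}
	wg.Wait()
	return results
}

// RunDemo evaluates eight toy genomes with a noisy fitness function.
func RunDemo(seed int64) []float64 {
	genomes := []float64{1, 2, 3, 4, 5, 6, 7, 8}
	noisy := func(g float64, rng *rand.Rand) float64 { return g + rng.Float64() }
	return EvaluateAll(genomes, rand.New(rand.NewSource(seed)), noisy)
}

func main() {
	fmt.Println(RunDemo(42))
	fmt.Println(RunDemo(42)) // same seed, same results
}
```

Sharing one RNG across goroutines would make results depend on which goroutine happened to draw first; seeding a private generator per genome avoids both that and the need for a lock.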


The gpu/ directory contains a C compute library that runs the same bytecode programs on GPU hardware. Each C header mirrors one Go source file:

| C Header | Go Equivalent | Purpose |
|---|---|---|
| core_vm.h | vm/interpreter.go | Bytecode interpreter (stack machine) |
| core_tick.h | runtime/tick.go | 7-step tick loop |
| core_machine.h | runtime/machine.go | State machine executor |
| core_fitness.h | runtime/fitness.go | Fitness computation (gates, metrics, verbs) |
| core_brain.h | core/signal.go | Brain signal propagation |
| core_dynamics.h | evolution/dynamics.go | Dynamics cascade |

All C functions are static inline in header files for direct inclusion in GPU kernel files. The opcode enum values in core_vm.h match vm/opcode.go exactly - both interpreters execute the same compiled bytecode format.

The GPU evaluation path runs all scenarios for a genome in parallel on the device, using the same 7-step tick pipeline. The Go side marshals genome and bytecode data to the C library, launches evaluation, and reads back fitness results.

To enable GPU evaluation, build the C library with CMake and pass the --gpu flag to quale evolve. See the CLI Reference for details.


| Extension | Purpose | Format |
|---|---|---|
| .quale | Source specification files | Text (Quale DSL) |
| .quale-ckpt | Evolution checkpoints (full population state) | Binary (Go gob encoding) |
| .quale-brain | Evolved brain snapshots | Binary (Go gob encoding) |
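For readers unfamiliar with gob, the round trip below shows how Go's encoding/gob package serializes a struct to bytes and back. The BrainSnapshot struct is a hypothetical stand-in; the actual contents and layout of .quale-ckpt and .quale-brain files are not documented on this page.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// BrainSnapshot is a hypothetical payload, not the real file schema.
type BrainSnapshot struct {
	Generation int
	Fitness    float64
	Weights    []float64
}

// Encode serializes a snapshot to gob bytes.
func Encode(s BrainSnapshot) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Decode reads a snapshot back from gob bytes.
func Decode(data []byte) (BrainSnapshot, error) {
	var s BrainSnapshot
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&s)
	return s, err
}

func main() {
	data, err := Encode(BrainSnapshot{Generation: 12, Fitness: 0.75, Weights: []float64{0.1, -0.4}})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	back, err := Decode(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%d bytes -> %+v\n", len(data), back)
}
```

Gob is a Go-specific format, so tools written in other languages would typically read this data through an intermediary such as the HTTP API in server/ rather than parse the files directly.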