Rail-004

Structured Brain Architecture vs Flat Topology: Regions, Plasticity, and the Trade-off Between Optimality and Realism

Author: Dan Battye
Date: 2026-03-15
Affiliation: Quale Project
Experiment ID: Rail-004

Abstract

We evolved train driver brains with structured regional architecture (24 pre-allocated hidden neurons across 3 functional regions) and runtime Hebbian plasticity, comparing against Rail-003b's flat topology (7 evolved hidden neurons, no structure, no plasticity). The structured brain achieved lower peak fitness (87.99 vs 99.69) but exhibited qualitatively different behavior: 15 sensors influenced throttle control (vs 4), dedicated reflex nodes produced binary signal responses, and the fatigue management region developed connections to AWS acknowledgment. The flat brain's superior score came from a degenerate speed-governor strategy that avoids signal encounters entirely - a mathematical shortcut unavailable to the structured brain whose pre-wired regions forced it to actively process signal information. These findings suggest that architectural constraints channel evolution toward naturalistic behavior at the cost of raw optimality, and that fitness score alone is an inadequate measure of behavioral realism.

1. Introduction

1.1 Background

Experiments Rail-001 through Rail-003b evolved train driver brains using flat topologies - all hidden neurons structurally identical, no predefined grouping, no runtime learning. The best result (Rail-003b, fitness 99.69) produced a brain with 7 hidden neurons and 36 connections that discovered emergent human factors phenomena: complacency countermeasures, dead man's switch patterns, and dual-pathway signal processing.

However, Rail-003b's strategy was fundamentally un-human. Its primary safety mechanism was a proportional speed governor (current_speed -> brake, weight +1.29) that kept speed permanently below the SPAD threshold. A real driver cannot operate this way - they must drive at operational speed and actively manage signal compliance through perception, judgment, and timely braking.
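
The effect of a single positive current_speed -> brake weight can be illustrated with a minimal sketch. Only the +1.29 weight comes from Rail-003b. The toy dynamics (unit acceleration and deceleration, a fixed time step, a clamped brake output) and the function names are assumptions made for illustration, not the Quale simulator.

```python
GOVERNOR_WEIGHT = 1.29  # current_speed -> brake weight reported for Rail-003b


def governor_brake(current_speed, weight=GOVERNOR_WEIGHT):
    """Brake demand proportional to speed, clamped to [0, 1] (clamp is assumed)."""
    return max(0.0, min(1.0, weight * current_speed))


def settle_speed(throttle, steps=500, dt=0.1):
    """Toy dynamics (hypothetical): speed rises with throttle, falls with brake."""
    speed = 0.0
    for _ in range(steps):
        speed += dt * (throttle - governor_brake(speed))
        speed = max(0.0, speed)
    return speed


# While the brake is unclamped, speed settles where throttle == weight * speed.
full_throttle_speed = settle_speed(throttle=1.0)
half_throttle_speed = settle_speed(throttle=0.5)
```

In this toy model even full throttle settles near 1/1.29 (about 0.78) of the normalised speed range, which is the sense in which a proportional governor caps speed without ever reading a signal.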

1.2 What Changed from Rail-003b to Rail-004

This experiment introduces two new Quale v0.2 features:

Regions - Named clusters of hidden neurons with distinct structural properties:

Region	Nodes	Density	Activation	Purpose
reflex	6	0.7	Step (binary)	Fast signal responses
situational_awareness	12	0.3	Sigmoid (graded)	Pattern recognition
fatigue_management	6	0.4	Sigmoid (graded)	Driver state tracking

Total: 24 pre-allocated hidden neurons with ~130 initial connections (intra-region wiring plus sparse input/output links). The NEAT mutation engine respects region boundaries: new nodes inherit their region's activation function, and 80% of new connections are intra-region.
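
A region-aware connection mutation of this kind might look like the sketch below. The region table and the 80% intra-region preference come from this report, and the node ID blocks match those listed in section 3.3. The class, the function names, and the assumption that the remaining 20% of new connections always cross regions are hypothetical, not Quale's actual API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    nodes: int
    density: float
    activation: str


REGIONS = [
    Region("reflex", 6, 0.7, "step"),
    Region("situational_awareness", 12, 0.3, "sigmoid"),
    Region("fatigue_management", 6, 0.4, "sigmoid"),
]


def assign_node_ids(regions, first_id=23):
    """Give each region a contiguous block of hidden-node IDs (18 inputs + 5 outputs come first)."""
    ids, next_id = {}, first_id
    for region in regions:
        ids[region.name] = list(range(next_id, next_id + region.nodes))
        next_id += region.nodes
    return ids


def pick_connection_endpoints(ids, rng, intra_prob=0.8):
    """Choose (source, target) for a new connection, preferring intra-region pairs."""
    names = list(ids)
    src_region = rng.choice(names)
    if rng.random() < intra_prob:
        dst_region = src_region  # stay inside the region
    else:
        # Assumed: the remaining mutations always cross a region boundary.
        dst_region = rng.choice([n for n in names if n != src_region])
    return rng.choice(ids[src_region]), rng.choice(ids[dst_region])


ids = assign_node_ids(REGIONS)
source, target = pick_connection_endpoints(ids, random.Random(0))
```

The sketch ignores connections to sensors and actuators and does not reject duplicate or self connections, both of which a full mutation operator would need to handle.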

Plasticity - Runtime weight adaptation during scenarios:

Mechanism	Parameters	Effect
Hebbian learning	rate: 0.005, max_weight: 2.0	Co-active connections strengthen during the scenario
Synaptic decay	rate: 0.0005, min_weight: 0.0	Inactive connections weaken toward zero
Homeostatic regulation	target_activity: 0.3, adjustment_rate: 0.003	Per-region gain adjustment to prevent saturation or silence

Plasticity changes persist across scenarios within a genome evaluation but reset between genomes. This creates within-lifetime learning without Lamarckian inheritance.
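
The persistence rule can be sketched as follows. The rates and bounds are the values from the table above. The additive update form, the 0.5 co-activity threshold, the choice to adapt weight magnitude while preserving sign, and all function names are assumptions, since the report does not give the exact update equations.

```python
HEBBIAN_RATE, MAX_WEIGHT = 0.005, 2.0
DECAY_RATE, MIN_WEIGHT = 0.0005, 0.0


def update_weight(weight, pre, post, active_threshold=0.5):
    """Apply one plasticity tick to a connection's magnitude, preserving its sign."""
    sign = -1.0 if weight < 0 else 1.0
    magnitude = abs(weight)
    if pre > active_threshold and post > active_threshold:
        # Hebbian: co-active endpoints strengthen the connection, up to max_weight.
        magnitude = min(MAX_WEIGHT, magnitude + HEBBIAN_RATE * pre * post)
    else:
        # Decay: otherwise the connection weakens toward min_weight.
        magnitude = max(MIN_WEIGHT, magnitude - DECAY_RATE)
    return sign * magnitude


def run_lifetime(evolved_weights, scenarios):
    """Return the learned weights after all scenarios; the genome is left untouched."""
    live = dict(evolved_weights)  # working copy, discarded after this genome
    for ticks in scenarios:       # no reset between scenarios
        for activity in ticks:    # activity: connection -> (pre, post) activations
            for conn, (pre, post) in activity.items():
                live[conn] = update_weight(live[conn], pre, post)
    return live
```

Because run_lifetime works on a copy, offspring inherit only the evolved weights, which is what keeps the learning non-Lamarckian.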

1.3 Hypothesis

A structured brain with dedicated functional regions and runtime plasticity will produce more naturalistic driving behavior than a flat topology, even if raw fitness is lower. The architectural constraints will channel evolution toward strategies that actively process signal information rather than degenerate speed-limiting shortcuts.

2. Materials and Methods

2.1 Experimental Configuration

Parameter	Rail-003b	Rail-004	Change
Population	300	300	Same
Generations	2000 (converged 349)	500 (converged 227)	Lower max; converged earlier
Sensors	18	18	Same
Actuators	5	5	Same
Initial hidden nodes	0	24 (3 regions)	New
Initial connections	~10 (sparse input->output)	~130 (intra-region + sparse IO)	New
Plasticity	None	Hebbian + decay + homeostatic	New
Region-aware mutations	No	Yes (80% intra-region preference)	New
Signal system	QLD 5-aspect	QLD 5-aspect	Same
Attention gating	Yes	Yes	Same

2.2 Region Design Rationale

Region sizes and properties were chosen to model a simplified human driver cognitive architecture:

Reflex (6 nodes, Step activation): Analogous to brainstem reflexes. Binary fire/don't-fire responses. High density (0.7) for fast internal processing. Intended for immediate signal reactions.
Situational awareness (12 nodes, Sigmoid activation): Analogous to parietal cortex spatial processing. Graded responses for nuanced assessment. Lower density (0.3) for selective, pattern-based computation. Largest region because situational assessment is the most complex task.
Fatigue management (6 nodes, Sigmoid activation): Analogous to hypothalamic fatigue monitoring. Medium density (0.4). Intended for tracking driver state over time.

2.3 Plasticity Parameters

Conservative learning rates chosen to avoid catastrophic weight instability:

Hebbian rate 0.005 (half of whitepaper default 0.01) - gradual strengthening
Decay rate 0.0005 (half of whitepaper default 0.001) - slow forgetting
Homeostatic target 0.3 (30% of region nodes active) - moderate activity level
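
One way to read the homeostatic target is as a per-region gain that is nudged whenever the fraction of active nodes departs from 0.3. The sketch below uses the target and adjustment rate from section 1.2. The 0.5 activity threshold, the additive gain update, and the function names are assumptions.

```python
TARGET_ACTIVITY = 0.3
ADJUSTMENT_RATE = 0.003


def region_activity(outputs, active_threshold=0.5):
    """Fraction of a region's nodes whose output exceeds the (assumed) threshold."""
    return sum(1 for value in outputs if value > active_threshold) / len(outputs)


def adjust_gain(gain, outputs):
    """Raise the gain when the region is too quiet, lower it when too active."""
    return gain + ADJUSTMENT_RATE * (TARGET_ACTIVITY - region_activity(outputs))


saturated = adjust_gain(1.0, [1.0] * 6)  # all six nodes firing: gain drops
silent = adjust_gain(1.0, [0.0] * 6)     # no nodes firing: gain rises
```

With a rate of 0.003 each step moves the gain by at most 0.0021, so the correction is slow relative to a single scenario tick.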

3. Results

3.1 Fitness Progression

Generation	Best Fitness	Avg Fitness	Species	Topology	Survival	Idle
0	37.47	7.48	1	47n/130c	100%	100%
5	78.99	30.61	1	47n/130c	78%	70%
50	81.49	64.92	15	49n/135c	93%	73%
100	82.46	59.16	13	51n/133c	93%	75%
200	86.13	58.55	16	56n/257c	95%	75%
227 (converged)	87.99	-	-	56n/197c	-	-

Figure: Rail-004 fitness progression over 227 generations (structured brain). Series: best fitness and average fitness by generation; values as in the table above.

3.2 Comparison with Rail-003b

Metric	Rail-003b (flat)	Rail-004 (structured)	Interpretation
Best fitness	99.69	87.99	Flat brain found a more optimal strategy
Convergence gen	349	227	Structured brain converged faster
Total hidden neurons	7	33 (24 initial + 9 evolved)	Structured brain is much larger
Enabled connections	36	197	5.5x more wiring
Sensors influencing throttle	4	15	Structured brain uses far more information
Sensors influencing brake	3	15	Same pattern - richer braking decisions
Sensors influencing attention	4	8	More attention inputs
Emergency brake wired	No (bias only)	No (bias only)	Same - neither brain uses it
Minimum idle rate	48%	71%	Flat brain drove more actively
Functional hidden neurons	5 of 7	33 of 33	All structured nodes participate

Figure: Sensor influence, flat (Rail-003b) vs structured (Rail-004). Categories: sensors to throttle, sensors to brake, sensors to attention; values as in the comparison table above.
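
The report does not state how "sensors influencing" an actuator was counted. One plausible definition is that a sensor influences an actuator if at least one path of enabled connections leads from it to the actuator. A minimal sketch under that assumption, with a hypothetical function name and a toy graph that borrows node labels from section 3.3:

```python
from collections import defaultdict, deque


def sensors_influencing(connections, sensors, actuator):
    """Return the sensors with at least one enabled directed path to `actuator`."""
    incoming = defaultdict(list)
    for src, dst, enabled in connections:
        if enabled:
            incoming[dst].append(src)
    # Breadth-first search backwards from the actuator.
    seen, queue = {actuator}, deque([actuator])
    while queue:
        node = queue.popleft()
        for src in incoming[node]:
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return seen & set(sensors)


toy_connections = [
    ("stress", "H32", True),
    ("H32", "brake", True),
    ("cognitive_load", "H34", True),
    ("H34", "brake", True),
    ("fatigue", "H41", True),         # H41 has no path to brake in this toy graph
    ("visibility", "brake", False),   # disabled connections are ignored
]
toy_sensors = ["stress", "cognitive_load", "fatigue", "visibility"]
brake_sensors = sensors_influencing(toy_connections, toy_sensors, "brake")
```

Reachability counts a sensor even when its path weights are tiny, so a weighted variant would be needed to compare the strength of influence rather than its presence.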

3.3 Evolved Topology Analysis

56 total nodes: 18 input + 33 hidden + 5 output

The 33 hidden neurons break down as:

6 reflex-region nodes (Step activation, nodes 23-28)
12 awareness-region nodes (Sigmoid, nodes 29-40)
6 fatigue-region nodes (Sigmoid, nodes 41-46)
9 evolution-added nodes (Sigmoid, non-contiguous node IDs from 119 to 1586)

Region specialisation:

Reflex region (nodes 23-28):

Feeds into attention (nodes 23, 25 -> attention with weights -0.72, -1.09)
Dense inter-node wiring (26 intra-region connections)
Receives from braking_distance, crossing_ahead, route_familiarity
Step activation produces binary signals: "danger/no danger"
Node 27 connects to acknowledge_aws (-0.79) - a reflex that suppresses AWS acknowledgment

Situational awareness region (nodes 29-40):

Largest region, handles primary throttle/brake computation
Node 32: stress -> H(32) (+1.62) and H(32) -> brake (+2.00, max weight) - stress triggers maximum braking
Node 33: central hub receiving from 8 sensors, feeding 8 other nodes - acts as a situation integrator
Node 34: receives cognitive_load (-1.45), at_station (+0.16), feeds brake (-1.98) - cognitive overload suppresses braking (dangerous but schedule-optimal)
Node 37: cognitive_load (+1.74), crossing_ahead (+1.83) -> throttle (-1.48), brake (-1.96) - high cognitive load near crossings suppresses both throttle and brake (freeze response)

Fatigue management region (nodes 41-46):

Node 41: fatigue (+1.77), aws_alert (+1.41) - fatigue and AWS converge
Node 44: feeds acknowledge_aws (-0.88) - fatigue-influenced AWS response
Node 46: fatigue (-1.11) -> modulates other fatigue nodes
The region developed an internal circuit where fatigue level modulates AWS acknowledgment timing

Evolution-added nodes (119, 269, 438, 597, 1021, 1130, 1383, 1565, 1586):

All Sigmoid activation (inter-region default)
Node 269: speed_limit (-1.61) -> feeds both throttle (-1.98) and brake (+0.25) - a speed limit processor
Node 1130: feeds throttle (-1.48), brake (+1.44) - evolved a throttle/brake coordinator
Node 1383: current_speed (+1.46) -> H(38) - a speed-to-awareness bridge
These nodes bridge between regions, creating inter-regional pathways that evolution discovered were necessary but the initial structure didn't provide
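
The freeze response attributed to node 37 in the situational awareness region can be checked with a small worked example. The four weights are those reported above. The logistic sigmoid, the absence of a bias term, sensor values in [0, 1], and the omission of any other inputs to node 37 are assumptions.

```python
import math


def sigmoid(x):
    """Standard logistic function (assumed form of the Sigmoid activation)."""
    return 1.0 / (1.0 + math.exp(-x))


def node37_contribution(cognitive_load, crossing_ahead):
    """Node 37's contribution to throttle and brake, using the reported weights."""
    hidden = sigmoid(1.74 * cognitive_load + 1.83 * crossing_ahead)
    return {"throttle": -1.48 * hidden, "brake": -1.96 * hidden}


calm = node37_contribution(cognitive_load=0.0, crossing_ahead=0.0)
overloaded = node37_contribution(cognitive_load=1.0, crossing_ahead=1.0)
```

Under these assumptions the overloaded case pushes throttle by about -1.44 and brake by about -1.91, roughly double the suppression in the calm case (-0.74 and -0.98), so both actuators are driven down together.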

4. Discussion

4.1 Why the Structured Brain Scored Lower

Three factors contributed to the 11.7-point fitness gap (99.69 vs 87.99):

Over-parameterisation: 197 connections means ~197 weights to optimise simultaneously. With 300 genomes evaluated over 227 generations, evolution had ~68,100 genome evaluations in which to tune those weights. Rail-003b had a comparable, somewhat larger budget (~104,700 evaluations over its 349 generations) but 5.5x fewer parameters to tune - a much easier optimisation surface.

Dense initial wiring creates noise: The reflex region started with density 0.7, meaning roughly 21 to 25 random connections among 6 nodes (0.7 of the 30 to 36 possible directed pairs, depending on whether self-connections are counted). Most of these are evolutionary garbage - random weights that inject noise into the signal path. Evolution must either repurpose or suppress them, which consumes generations that could be spent discovering useful pathways.

The speed governor is unavailable: Rail-003b's winning strategy was a simple current_speed -> brake proportional governor that kept speed below the SPAD threshold at all times. With 24 pre-wired hidden neurons between inputs and outputs, the direct input-to-output pathway is buried under layers of regional processing. The brain can't easily implement the trivial speed governor because signals must traverse regional nodes first.
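
The over-parameterisation argument above can be made concrete by dividing each run's evaluation budget by its connection count. The populations, convergence generations, and connection counts are taken from section 3.2. Using the final connection count for the whole run is a simplification, since topology grew during evolution.

```python
def evaluations_per_connection(population, generations, connections):
    """Genome evaluations available per tunable connection weight."""
    return population * generations / connections


rail_004 = evaluations_per_connection(300, 227, 197)
rail_003b = evaluations_per_connection(300, 349, 36)
budget_ratio = rail_003b / rail_004
```

This gives roughly 346 evaluations per connection for Rail-004 against roughly 2,908 for Rail-003b, a factor of about 8.4 in favour of the flat brain.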

4.2 Why the Structured Brain is More Realistic

Despite lower fitness, the structured brain's behavior is arguably more human:

Richer information processing: 15 sensors influence throttle (vs 4). The driver "considers" inputs including signal aspect, distance, speed, crossing proximity, station proximity, stress, cognitive load, fatigue, route familiarity, visibility, gradient, and pre-shift fatigue before deciding on throttle. A flat brain that reads only 4 sensors is not driving - it is applying a mathematical formula.

Region specialisation matches human cognition:

Reflex region (Step activation) produces binary danger assessments - analogous to the amygdala's threat detection
Awareness region (Sigmoid) produces graded situational assessments - analogous to cortical processing
Fatigue region modulates AWS response based on fatigue level - analogous to how fatigue degrades procedural compliance

Cognitive overload produces realistic failure modes: Node 37's response to high cognitive load near crossings - suppressing both throttle and brake simultaneously (freeze response) - is a documented human factors phenomenon. Under cognitive overload, drivers sometimes fail to act at all. The flat brain never exhibited this because it didn't process cognitive load in the context of crossings.

Fatigue affects specific behaviors, not general performance: The fatigue region's connection to AWS acknowledgment (but not to throttle or brake) suggests the brain learned that fatigue degrades procedural responses (acknowledging warnings) before it degrades operational responses (speed management). This matches real-world observations: fatigued drivers miss procedural checks before they miss signals.

4.3 The Optimality-Realism Trade-off

Rail-003b's flat brain achieved near-perfect fitness (99.69) through a strategy no human driver would use: maintain permanently low speed to avoid all signal encounters. Rail-004's structured brain achieved lower fitness (87.99) through a strategy that actively processes signals, manages cognitive load, and degrades realistically under fatigue.

This reveals a fundamental tension in connectome-based behavior evolution: unconstrained evolution finds degenerate shortcuts that maximise fitness without producing realistic behavior. Architectural constraints (regions) channel evolution away from these shortcuts and toward strategies that must actually process information through structured pathways - the way biological brains do.

The implication for fitness function design: if the goal is realistic behavior, the fitness function must make degenerate strategies impossible (as we did with the terminus penalty and idle-as-death rule in Rail-002), AND the brain architecture must force information through structured processing (as we did with regions in Rail-004).

4.4 Plasticity Impact

The plasticity mechanisms (Hebbian, decay, homeostatic) were active but their impact is difficult to isolate in this experiment because they co-vary with regions. A clean comparison would require:

Rail-004b: regions WITHOUT plasticity
Rail-004c: plasticity WITHOUT regions (flat brain + learning)

These controlled experiments would determine whether the behavioral differences come from structure, learning, or their interaction.
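
Together with Rail-003b and Rail-004, the two proposed runs form a 2x2 design over regions and plasticity. The sketch below enumerates the four cells and shows the standard main-effect and interaction contrasts for any single metric. The function names are hypothetical, and no Rail-004b or Rail-004c results exist yet, so the analysis function is shown without data.

```python
from itertools import product

# Experiment name for each (regions, plasticity) cell; Rail-004b/c are the proposed runs.
CELLS = {
    (False, False): "Rail-003b",
    (True, False): "Rail-004b",
    (False, True): "Rail-004c",
    (True, True): "Rail-004",
}


def design():
    """Enumerate the four conditions of the 2x2 design."""
    return [
        {"name": CELLS[(regions, plasticity)], "regions": regions, "plasticity": plasticity}
        for regions, plasticity in product([False, True], repeat=2)
    ]


def factorial_effects(metric):
    """Main effects and interaction from one metric value per experiment name."""
    f00, f10 = metric["Rail-003b"], metric["Rail-004b"]
    f01, f11 = metric["Rail-004c"], metric["Rail-004"]
    return {
        "regions": ((f10 - f00) + (f11 - f01)) / 2,
        "plasticity": ((f01 - f00) + (f11 - f10)) / 2,
        "interaction": (f11 - f10) - (f01 - f00),
    }
```

Rail-003b also differs from Rail-004 in maximum generations (section 2.1), so a rerun of the flat baseline under the Rail-004 settings would make the first cell a cleaner control.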

4.5 Evolution-Added Neurons

Nine neurons evolved on top of the 24 regional nodes. All 9 use Sigmoid activation (the inter-region mutation default). Their primary role is bridging between regions: node 269 bridges speed_limit into awareness-region processing, node 1383 bridges current_speed into the awareness region (via node 38), and node 1130 acts as a dedicated throttle/brake coordinator.

This suggests that the initial regional structure was incomplete - evolution needed inter-regional pathways that the intra-region density didn't provide. A future experiment could add initial inter-region connections (pathway hints from the whitepaper) to see if this reduces the need for evolution-added bridge nodes.

5. Emergent Behaviors: What the Structured Brain Invented

Designed (in .quale)	Emergent (evolved by brain)
3 named regions with specific node counts	Region specialisation matching intended function
Step activation for reflex nodes	Binary danger assessment feeding attention
Sigmoid for awareness nodes	Central situation integrator (node 33) processing 8 sensors
Fatigue region exists	Fatigue modulates AWS acknowledgment timing specifically
Plasticity parameters set	Cannot isolate plasticity effect (confounded with regions)
Hebbian rate 0.005	Connections between co-active pathways strengthened during scenarios
Inter-region connections evolve	Bridge neurons connecting speed processing to awareness
Cognitive load as a sensor	Freeze response under cognitive overload near crossings

6. Cross-Experiment Summary

Feature	Rail-001	Rail-002	Rail-003	Rail-003b	Rail-004
Throttle	Binary	Continuous	Continuous	Continuous	Continuous
Attention	Unrewarded	Rewarded	Causal	Causal	Causal
Regions	No	No	No	No	Yes (3)
Plasticity	No	No	No	No	Yes (all 3)
Best fitness	81.70	99.58	99.03	99.69	87.99
Hidden neurons	0	0	1	7	33
Connections	11	13	22	36	197
Sensors -> throttle	0	2	2	4	15
Primary strategy	Don't move	Speed governor	Active vigilance	Speed gov + vigilance	Multi-region processing
Human-like?	No	Partially	More	More	Most

Figure: Best fitness across all rail experiments (Rail-001, Rail-002, Rail-003, Rail-003b, Rail-004); values as in the Best fitness row of the table above.

7. Conclusion

Rail-004 demonstrated that structured brain architecture produces qualitatively different evolved behavior than flat topologies. The structured brain (24 pre-allocated hidden neurons in 3 regions, with plasticity) achieved lower fitness (87.99 vs 99.69) but exhibited richer, more realistic behavior: processing 15 sensors for speed decisions, developing region-specific specialisation matching intended cognitive functions, and producing realistic failure modes (cognitive overload freeze response, fatigue-degraded procedural compliance).

The central finding: architectural constraints trade optimality for realism. Unconstrained flat brains find degenerate shortcuts. Structured brains must process information through functional pathways, producing behavior that more closely resembles human cognition - including its failure modes.

Design principle: To evolve realistic behavior, constrain the brain's architecture to match the target organism's cognitive structure. Accept lower fitness scores as the cost of authenticity. Fitness measures how well the agent games the fitness function; behavioral analysis measures how realistically it performs the task.

8. Future Directions

Controlled ablation: Rail-004b (regions without plasticity) and Rail-004c (plasticity without regions) to isolate individual feature contributions.
Pathway hints: Pre-seed inter-region connections (sensor -> reflex, awareness -> actuator) to reduce evolution's need for bridge neurons.
Extended evolution: Run Rail-004 for 2000+ generations to determine if the structured brain eventually matches or exceeds flat brain fitness while retaining its richer behavior.
Signal speed: Add multi-tick signal delay through slow regions (v0.3) to model processing latency differences between reflexive and deliberative circuits.
Recurrence: Enable feedback loops in the fatigue management region (v0.3) to model fatigue memory - tracking accumulated fatigue over time rather than just current fatigue level.