Rail-004

Structured Brain Architecture vs Flat Topology: Regions, Plasticity, and the Trade-off Between Optimality and Realism

Abstract

We evolved train driver brains with structured regional architecture (24 pre-allocated hidden neurons across 3 functional regions) and runtime Hebbian plasticity, comparing against Rail-003b's flat topology (7 evolved hidden neurons, no structure, no plasticity). The structured brain achieved lower peak fitness (87.99 vs 99.69) but exhibited qualitatively different behavior: 15 sensors influenced throttle control (vs 4), dedicated reflex nodes produced binary signal responses, and the fatigue management region developed connections to AWS acknowledgment. The flat brain's superior score came from a degenerate speed-governor strategy that avoids signal encounters entirely - a mathematical shortcut unavailable to the structured brain whose pre-wired regions forced it to actively process signal information. These findings suggest that architectural constraints channel evolution toward naturalistic behavior at the cost of raw optimality, and that fitness score alone is an inadequate measure of behavioral realism.

1. Introduction

1.1 Background

Experiments Rail-001 through Rail-003b evolved train driver brains using flat topologies - all hidden neurons structurally identical, no predefined grouping, no runtime learning. The best result (Rail-003b, fitness 99.69) produced a brain with 7 hidden neurons and 36 connections that discovered emergent human factors phenomena: complacency countermeasures, dead man's switch patterns, and dual-pathway signal processing.

However, Rail-003b's strategy was fundamentally un-human. Its primary safety mechanism was a proportional speed governor (current_speed -> brake, weight +1.29) that kept speed permanently below the SPAD threshold. A real driver cannot operate this way - they must drive at operational speed and actively manage signal compliance through perception, judgment, and timely braking.
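To make the governor concrete, here is a minimal sketch of what a single proportional current_speed -> brake connection computes. Only the weight (+1.29) comes from the evolved brain; the normalised speed input and the clamp to [0, 1] are assumptions for illustration, not the Quale output stage.

```python
def governor_brake(current_speed: float, weight: float = 1.29) -> float:
    """Brake command from one proportional connection.

    Assumptions: `current_speed` is normalised to [0, 1] and the actuator
    output is clamped to [0, 1]. Only the weight is taken from Rail-003b.
    """
    return max(0.0, min(1.0, weight * current_speed))


# Braking rises linearly with speed until the output saturates.
for speed in (0.0, 0.25, 0.5, 1.0):
    print(f"speed={speed:.2f} -> brake={governor_brake(speed):.3f}")
```

The point of the sketch is that the brake command depends on speed alone: no signal, fatigue, or attention input takes part in the decision.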

1.2 What Changed from Rail-003b to Rail-004

This experiment introduces two new Quale v0.2 features:

Regions - Named clusters of hidden neurons with distinct structural properties:

| Region | Nodes | Density | Activation | Purpose |
| --- | --- | --- | --- | --- |
| reflex | 6 | 0.7 | Step (binary) | Fast signal responses |
| situational_awareness | 12 | 0.3 | Sigmoid (graded) | Pattern recognition |
| fatigue_management | 6 | 0.4 | Sigmoid (graded) | Driver state tracking |

Total: 24 pre-allocated hidden neurons and ~130 initial connections (intra-region wiring plus sparse input-output links). The NEAT mutation engine respects region boundaries: new nodes inherit their region's activation function, and 80% of new connections are intra-region.
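The 80% intra-region preference can be sketched as a biased choice of the target region when a new connection is proposed. This is an illustration under stated assumptions, not the Quale mutation engine: the `Region` class and `propose_connection` function are hypothetical names, the node ID ranges are the ones reported for the evolved brain in Section 3.3, and the uniform sampling of regions and nodes is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    nodes: tuple[int, ...]
    activation: str


# Node ID ranges as reported in the topology analysis (Section 3.3).
REGIONS = (
    Region("reflex", tuple(range(23, 29)), "step"),
    Region("situational_awareness", tuple(range(29, 41)), "sigmoid"),
    Region("fatigue_management", tuple(range(41, 47)), "sigmoid"),
)
INTRA_REGION_PROB = 0.8  # 80% of new connections stay inside one region


def propose_connection(rng: random.Random) -> tuple[int, int]:
    """Pick (source, target) hidden nodes with an intra-region bias.

    Assumption: source region and nodes are sampled uniformly, and
    self-connections are not filtered out.
    """
    source_region = rng.choice(REGIONS)
    source = rng.choice(source_region.nodes)
    if rng.random() < INTRA_REGION_PROB:
        target_region = source_region
    else:
        target_region = rng.choice([r for r in REGIONS if r is not source_region])
    return source, rng.choice(target_region.nodes)
```

Calling `propose_connection(random.Random(0))` repeatedly yields pairs of which about four in five share a region, which is the structural bias that keeps most new wiring local.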

Plasticity - Runtime weight adaptation during scenarios:

| Mechanism | Parameters | Effect |
| --- | --- | --- |
| Hebbian learning | rate: 0.005, max_weight: 2.0 | Co-active connections strengthen during the scenario |
| Synaptic decay | rate: 0.0005, min_weight: 0.0 | Inactive connections weaken toward zero |
| Homeostatic regulation | target_activity: 0.3, adjustment_rate: 0.003 | Per-region gain adjustment to prevent saturation or silence |

Plasticity changes persist across scenarios within a genome evaluation but reset between genomes. This creates within-lifetime learning without Lamarckian inheritance.
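As a rough illustration of how the Hebbian and decay parameters could interact on one connection, the sketch below applies a single update step. The parameter values are the ones listed above. The update rule itself is an assumption: in particular that the rule acts on weight magnitude while preserving sign, that decay is subtractive, and that "co-active" means both activations exceed a hypothetical `active_threshold`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlasticityParams:
    hebbian_rate: float = 0.005
    max_weight: float = 2.0
    decay_rate: float = 0.0005
    min_weight: float = 0.0


def update_weight(weight, pre, post, params=PlasticityParams(), active_threshold=0.1):
    """One plasticity step for a single connection.

    Assumptions (not from the source): the rule changes the weight's
    magnitude and keeps its sign; a connection is "co-active" when both
    endpoint activations exceed `active_threshold`; otherwise it decays.
    """
    sign = -1.0 if weight < 0 else 1.0
    magnitude = abs(weight)
    if pre > active_threshold and post > active_threshold:
        magnitude += params.hebbian_rate * pre * post  # Hebbian strengthening
    else:
        magnitude -= params.decay_rate                 # synaptic decay
    magnitude = min(params.max_weight, max(params.min_weight, magnitude))
    return sign * magnitude
```

Because the genome's weights are restored between genomes, a rule of this kind changes behavior within an evaluation only; nothing it learns is written back to the genome.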

1.3 Hypothesis

A structured brain with dedicated functional regions and runtime plasticity will produce more naturalistic driving behavior than a flat topology, even if raw fitness is lower. The architectural constraints will channel evolution toward strategies that actively process signal information rather than degenerate speed-limiting shortcuts.

2. Materials and Methods

2.1 Experimental Configuration

| Parameter | Rail-003b | Rail-004 | Change |
| --- | --- | --- | --- |
| Population | 300 | 300 | Same |
| Generations | 2000 max (converged at 349) | 500 max (converged at 227) | Lower max, earlier convergence |
| Sensors | 18 | 18 | Same |
| Actuators | 5 | 5 | Same |
| Initial hidden nodes | 0 | 24 (3 regions) | New |
| Initial connections | ~10 (sparse input->output) | ~130 (intra-region + sparse IO) | New |
| Plasticity | None | Hebbian + decay + homeostatic | New |
| Region-aware mutations | No | Yes (80% intra-region preference) | New |
| Signal system | QLD 5-aspect | QLD 5-aspect | Same |
| Attention gating | Yes | Yes | Same |

2.2 Region Design Rationale

Region sizes and properties were chosen to model a simplified human driver cognitive architecture:

  • Reflex (6 nodes, Step activation): Analogous to brainstem reflexes. Binary fire/don't-fire responses. High density (0.7) for fast internal processing. Intended for immediate signal reactions.
  • Situational awareness (12 nodes, Sigmoid activation): Analogous to parietal cortex spatial processing. Graded responses for nuanced assessment. Lower density (0.3) for selective, pattern-based computation. Largest region because situational assessment is the most complex task.
  • Fatigue management (6 nodes, Sigmoid activation): Analogous to hypothalamic fatigue monitoring. Medium density (0.4). Intended for tracking driver state over time.

2.3 Plasticity Parameters

Conservative learning rates chosen to avoid catastrophic weight instability:

  • Hebbian rate 0.005 (half of whitepaper default 0.01) - gradual strengthening
  • Decay rate 0.0005 (half of whitepaper default 0.001) - slow forgetting
  • Homeostatic target 0.3 (30% of region nodes active) - moderate activity level
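The homeostatic target can be read as a per-region feedback loop: if fewer than 30% of a region's nodes are active, the region's gain is nudged up; if more, it is nudged down. The sketch below is one plausible form of that loop, not the Quale implementation. The additive update and the `active_threshold` of 0.5 used to decide whether a node counts as active are assumptions.

```python
def adjust_gain(gain, activations, target_activity=0.3,
                adjustment_rate=0.003, active_threshold=0.5):
    """Nudge a region's gain toward the target fraction of active nodes.

    Assumptions: a node counts as active when its activation exceeds
    `active_threshold`, and the gain update is additive and proportional
    to the gap between target and observed activity.
    """
    active_fraction = sum(a > active_threshold for a in activations) / len(activations)
    return gain + adjustment_rate * (target_activity - active_fraction)


# A silent 6-node region gets a small gain increase; a saturated one a decrease.
print(adjust_gain(1.0, [0.0] * 6))
print(adjust_gain(1.0, [1.0] * 6))
```

With an adjustment rate of 0.003 the correction per step is small, which fits the conservative settings chosen for the other two mechanisms.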

3. Results

3.1 Fitness Progression

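| Generation | Best Fitness | Avg Fitness | Species | Topology | Survival | Idle |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 37.47 | 7.48 | 1 | 47n/130c | 100% | 100% |
| 5 | 78.99 | 30.61 | 1 | 47n/130c | 78% | 70% |
| 50 | 81.49 | 64.92 | 15 | 49n/135c | 93% | 73% |
| 100 | 82.46 | 59.16 | 13 | 51n/133c | 93% | 75% |
| 200 | 86.13 | 58.55 | 16 | 56n/257c | 95% | 75% |
| 227 (converged) | 87.99 | - | - | 56n/197c | - | - |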
Figure: Rail-004 fitness progression (227 generations, structured brain). Series: best fitness and average fitness by generation.

3.2 Comparison with Rail-003b

| Metric | Rail-003b (flat) | Rail-004 (structured) | Interpretation |
| --- | --- | --- | --- |
| Best fitness | 99.69 | 87.99 | Flat brain found a more optimal strategy |
| Convergence gen | 349 | 227 | Structured brain converged faster |
| Total hidden neurons | 7 | 33 (24 initial + 9 evolved) | Structured brain is much larger |
| Enabled connections | 36 | 197 | 5.5x more wiring |
| Sensors influencing throttle | 4 | 15 | Structured brain uses far more information |
| Sensors influencing brake | 3 | 15 | Same - richer braking decisions |
| Sensors influencing attention | 4 | 8 | More attention inputs |
| Emergency brake wired | No (bias only) | No (bias only) | Same - neither brain uses it |
| Minimum idle rate | 48% | 71% | Flat brain drove more actively |
| Functional hidden neurons | 5 of 7 | 33 of 33 | All structured nodes participate |
Figure: Sensor influence, flat (Rail-003b) vs structured (Rail-004). Categories: sensors to throttle, sensors to brake, sensors to attention.
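One way to obtain a "sensors influencing an actuator" count is graph reachability: a sensor influences an actuator if at least one directed path of enabled connections leads from the sensor to the actuator. The sketch below shows that calculation on a toy connection list. Treating influence as reachability is an assumption (the exact counting method is not stated here), `sensors_influencing` is a hypothetical helper, and the toy edges are illustrative rather than the full evolved genome.

```python
from collections import defaultdict, deque


def sensors_influencing(connections, sensors, actuator):
    """Return the sensors with a directed path of connections to `actuator`.

    `connections` is an iterable of (source, target) pairs for enabled
    connections. The search walks backwards from the actuator, so cycles
    in the hidden layer are handled by the `seen` set.
    """
    incoming = defaultdict(set)
    for src, dst in connections:
        incoming[dst].add(src)
    seen = {actuator}
    queue = deque([actuator])
    while queue:
        node = queue.popleft()
        for src in incoming[node]:
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return seen & set(sensors)


# Toy example: two sensors reach brake, one reaches only acknowledge_aws.
toy = [
    ("current_speed", "brake"),
    ("stress", "H32"), ("H32", "brake"),
    ("fatigue", "H41"), ("H41", "acknowledge_aws"),
]
print(sensors_influencing(toy, {"current_speed", "stress", "fatigue"}, "brake"))
```

Reachability counts a sensor even when its path carries a tiny weight, so a count produced this way is an upper bound on functional influence.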

3.3 Evolved Topology Analysis

56 total nodes: 18 input + 33 hidden + 5 output

The 33 hidden neurons break down as:

  • 6 reflex-region nodes (Step activation, nodes 23-28)
  • 12 awareness-region nodes (Sigmoid, nodes 29-40)
  • 6 fatigue-region nodes (Sigmoid, nodes 41-46)
  • 9 evolution-added nodes (Sigmoid, nodes 119-1586)

Region specialisation:

Reflex region (nodes 23-28):

  • Feeds into attention (nodes 23, 25 -> attention with weights -0.72, -1.09)
  • Dense inter-node wiring (26 intra-region connections)
  • Receives from braking_distance, crossing_ahead, route_familiarity
  • Step activation produces binary signals: "danger/no danger"
  • Node 27 connects to acknowledge_aws (-0.79) - a reflex that suppresses AWS acknowledgment

Situational awareness region (nodes 29-40):

  • Largest region, handles primary throttle/brake computation
  • Node 32: stress -> H(32) (+1.62) and H(32) -> brake (+2.00, max weight) - stress triggers maximum braking
  • Node 33: central hub receiving from 8 sensors, feeding 8 other nodes - acts as a situation integrator
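  • Node 34: receives cognitive_load (-1.45) and at_station (+0.16), feeds brake (-1.98) - when active, this node suppresses braking; because cognitive_load inhibits the node, the net path from cognitive load to brake is positive (the suppression lifts as load rises)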
  • Node 37: cognitive_load (+1.74), crossing_ahead (+1.83) -> throttle (-1.48), brake (-1.96) - high cognitive load near crossings suppresses both throttle and brake (freeze response)
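The freeze response attributed to node 37 can be checked with a small worked example using the four reported weights. This is a simplification: it assumes a standard logistic sigmoid, inputs normalised to [0, 1], no bias, and no other inputs to the node, none of which is confirmed by the topology dump.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def node37_contributions(cognitive_load: float, crossing_ahead: float) -> dict:
    """Node 37's contribution to throttle and brake for given inputs.

    Weights are the reported ones; the logistic activation, the missing
    bias term, and the absence of other inputs are assumptions.
    """
    activation = sigmoid(1.74 * cognitive_load + 1.83 * crossing_ahead)
    return {"throttle": -1.48 * activation, "brake": -1.96 * activation}


print(node37_contributions(0.0, 0.0))  # baseline
print(node37_contributions(1.0, 1.0))  # high load at a crossing
```

Both contributions become more negative as cognitive load and crossing proximity rise together, which is the simultaneous suppression of throttle and brake described above.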

Fatigue management region (nodes 41-46):

  • Node 41: fatigue (+1.77), aws_alert (+1.41) - fatigue and AWS converge
  • Node 44: feeds acknowledge_aws (-0.88) - fatigue-influenced AWS response
  • Node 46: fatigue (-1.11) -> modulates other fatigue nodes
  • The region developed an internal circuit where fatigue level modulates AWS acknowledgment timing

Evolution-added nodes (119, 269, 438, 597, 1021, 1130, 1383, 1565, 1586):

  • All Sigmoid activation (inter-region default)
  • Node 269: speed_limit (-1.61) -> feeds both throttle (-1.98) and brake (+0.25) - a speed limit processor
  • Node 1130: feeds throttle (-1.48), brake (+1.44) - evolved a throttle/brake coordinator
  • Node 1383: current_speed (+1.46) -> H(38) - a speed-to-awareness bridge
  • These nodes bridge between regions, creating inter-regional pathways that evolution discovered were necessary but the initial structure didn't provide

4. Discussion

4.1 Why the Structured Brain Scored Lower

Three factors contributed to the 11.7-point fitness gap (99.69 vs 87.99):

Over-parameterisation: 197 connections means ~197 weights to optimise simultaneously. With 300 genomes evaluated over 227 generations, evolution had ~68,100 evaluation opportunities to tune those weights. Rail-003b had more evaluations (300 x 349 = ~104,700) and 5.5x fewer parameters to tune (36 connections) - a much easier optimisation surface.
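The search budget behind this argument can be worked through directly from the population size, convergence generations, and connection counts reported in Sections 2.1 and 3.2. The "evaluations per weight" ratio below is a crude illustrative measure, not a metric used in the experiments, and the function names are hypothetical.

```python
def evaluations(population: int, generations: int) -> int:
    """Total genome evaluations up to convergence."""
    return population * generations


def evaluations_per_weight(population: int, generations: int, connections: int) -> float:
    """Crude search-budget measure: evaluations available per tunable weight."""
    return evaluations(population, generations) / connections


runs = {
    "Rail-003b": (349, 36),   # (convergence generation, enabled connections)
    "Rail-004": (227, 197),
}
for name, (gens, conns) in runs.items():
    total = evaluations(300, gens)
    ratio = evaluations_per_weight(300, gens, conns)
    print(f"{name}: {total} evaluations, {ratio:.0f} per weight")
```

On this measure the flat brain had roughly 8 times more evaluations per weight than the structured brain (about 2,908 vs 346).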

Dense initial wiring creates noise: The reflex region started with density 0.7, meaning ~30 random connections between 6 nodes. Most of these are evolutionary garbage - random weights that inject noise into the signal path. Evolution must either repurpose or suppress them, which consumes generations that could be spent discovering useful pathways.

The speed governor is unavailable: Rail-003b's winning strategy was a simple current_speed -> brake proportional governor that kept speed below the SPAD threshold at all times. With 24 pre-wired hidden neurons between inputs and outputs, the direct input-to-output pathway is buried under layers of regional processing. The brain can't easily implement the trivial speed governor because signals must traverse regional nodes first.

4.2 Why the Structured Brain is More Realistic

Despite lower fitness, the structured brain's behavior is arguably more human:

Richer information processing: 15 sensors influence throttle (vs 4). The driver "considers" inputs including signal aspect, distance, speed, crossing proximity, station proximity, stress, cognitive load, fatigue, route familiarity, visibility, gradient, and pre-shift fatigue before deciding on throttle. A flat brain that only reads 4 sensors is not driving - it's applying a mathematical formula.

Region specialisation matches human cognition:

  • Reflex region (Step activation) produces binary danger assessments - analogous to the amygdala's threat detection
  • Awareness region (Sigmoid) produces graded situational assessments - analogous to cortical processing
  • Fatigue region modulates AWS response based on fatigue level - analogous to how fatigue degrades procedural compliance

Cognitive overload produces realistic failure modes: Node 37's response to high cognitive load near crossings - suppressing both throttle and brake simultaneously (freeze response) - is a documented human factors phenomenon. Under cognitive overload, drivers sometimes fail to act at all. The flat brain never exhibited this because it didn't process cognitive load in the context of crossings.

Fatigue affects specific behaviors, not general performance: The fatigue region's connection to AWS acknowledgment (but not to throttle or brake) suggests the brain learned that fatigue degrades procedural responses (acknowledging warnings) before it degrades operational responses (speed management). This matches real-world observations: fatigued drivers miss procedural checks before they miss signals.

4.3 The Optimality-Realism Trade-off

Rail-003b's flat brain achieved near-perfect fitness (99.69) through a strategy no human driver would use: maintain permanently low speed to avoid all signal encounters. Rail-004's structured brain achieved lower fitness (87.99) through a strategy that actively processes signals, manages cognitive load, and degrades realistically under fatigue.

This reveals a fundamental tension in connectome-based behavior evolution: unconstrained evolution finds degenerate shortcuts that maximise fitness without producing realistic behavior. Architectural constraints (regions) channel evolution away from these shortcuts and toward strategies that must actually process information through structured pathways - the way biological brains do.

The implication for fitness function design: if the goal is realistic behavior, the fitness function must make degenerate strategies impossible (as we did with the terminus penalty and idle-as-death rule in Rail-002), AND the brain architecture must force information through structured processing (as we did with regions in Rail-004).

4.4 Plasticity Impact

The plasticity mechanisms (Hebbian, decay, homeostatic) were active but their impact is difficult to isolate in this experiment because they co-vary with regions. A clean comparison would require:

  • Rail-004b: regions WITHOUT plasticity
  • Rail-004c: plasticity WITHOUT regions (flat brain + learning)

These controlled experiments would determine whether the behavioral differences come from structure, learning, or their interaction.
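The proposed controls complete a 2x2 factorial design over the two features. The sketch below enumerates the four cells. The run names come from the text; the dictionary layout and the `ablation_grid` helper are hypothetical, and using Rail-003b as the no-regions, no-plasticity cell assumes its other settings are comparable.

```python
from itertools import product

# Names for each (regions, plasticity) combination, as used in the text.
RUN_NAMES = {
    (False, False): "Rail-003b",
    (True, True): "Rail-004",
    (True, False): "Rail-004b",
    (False, True): "Rail-004c",
}


def ablation_grid() -> list[dict]:
    """Enumerate every regions/plasticity combination as a run descriptor."""
    return [
        {"name": RUN_NAMES[(regions, plasticity)],
         "regions": regions,
         "plasticity": plasticity}
        for regions, plasticity in product((False, True), repeat=2)
    ]


for run in ablation_grid():
    print(run)
```

With all four cells available, the effect of each feature and of their interaction can be estimated separately.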

4.5 Evolution-Added Neurons

9 neurons evolved on top of the 24 regional nodes. All 9 use Sigmoid activation (the inter-region mutation default). Their primary role is bridging: Node 269 carries speed_limit to throttle and brake, Node 1383 bridges current_speed to the awareness region (via node 38), and Node 1130 acts as a dedicated throttle/brake coordinator.

This suggests that the initial regional structure was incomplete - evolution needed inter-regional pathways that the intra-region density didn't provide. A future experiment could add initial inter-region connections (pathway hints from the whitepaper) to see if this reduces the need for evolution-added bridge nodes.

5. Emergent Behaviors: What the Structured Brain Invented

| Designed (in .quale) | Emergent (evolved by brain) |
| --- | --- |
| 3 named regions with specific node counts | Region specialisation matching intended function |
| Step activation for reflex nodes | Binary danger assessment feeding attention |
| Sigmoid for awareness nodes | Central situation integrator (node 33) processing 8 sensors |
| Fatigue region exists | Fatigue modulates AWS acknowledgment timing specifically |
| Plasticity parameters set | Cannot isolate plasticity effect (confounded with regions) |
| Hebbian rate 0.005 | Connections between co-active pathways strengthened during scenarios |
| Inter-region connections evolve | Bridge neurons connecting speed processing to awareness |
| Cognitive load as a sensor | Freeze response under cognitive overload near crossings |

6. Cross-Experiment Summary

| Feature | Rail-001 | Rail-002 | Rail-003 | Rail-003b | Rail-004 |
| --- | --- | --- | --- | --- | --- |
| Throttle | Binary | Continuous | Continuous | Continuous | Continuous |
| Attention | Unrewarded | Rewarded | Causal | Causal | Causal |
| Regions | No | No | No | No | Yes (3) |
| Plasticity | No | No | No | No | Yes (all 3) |
| Best fitness | 81.70 | 99.58 | 99.03 | 99.69 | 87.99 |
| Hidden neurons | 0 | 0 | 1 | 7 | 33 |
| Connections | 11 | 13 | 22 | 36 | 197 |
| Sensors -> throttle | 0 | 2 | 2 | 4 | 15 |
| Primary strategy | Don't move | Speed governor | Active vigilance | Speed gov + vigilance | Multi-region processing |
| Human-like? | No | Partially | More | More | Most |
Figure: Best fitness across all rail experiments (Rail-001, Rail-002, Rail-003, Rail-003b, Rail-004).

7. Conclusion

Rail-004 demonstrated that structured brain architecture produces qualitatively different evolved behavior than flat topologies. The 24-neuron, 3-region brain with plasticity achieved lower fitness (87.99 vs 99.69) but exhibited richer, more realistic behavior: processing 15 sensors for speed decisions, developing region-specific specialisation matching intended cognitive functions, and producing realistic failure modes (cognitive overload freeze response, fatigue-degraded procedural compliance).

The central finding: architectural constraints trade optimality for realism. Unconstrained flat brains find degenerate shortcuts. Structured brains must process information through functional pathways, producing behavior that more closely resembles human cognition - including its failure modes.

Design principle: To evolve realistic behavior, constrain the brain's architecture to match the target organism's cognitive structure. Accept lower fitness scores as the cost of authenticity. Fitness measures how well the agent games the fitness function; behavioral analysis measures how realistically it performs the task.

8. Future Directions

  1. Controlled ablation: Rail-004b (regions without plasticity) and Rail-004c (plasticity without regions) to isolate individual feature contributions.
  2. Pathway hints: Pre-seed inter-region connections (sensor -> reflex, awareness -> actuator) to reduce evolution's need for bridge neurons.
  3. Extended evolution: Run Rail-004 for 2000+ generations to determine if the structured brain eventually matches or exceeds flat brain fitness while retaining its richer behavior.
  4. Signal speed: Add multi-tick signal delay through slow regions (v0.3) to model processing latency differences between reflexive and deliberative circuits.
  5. Recurrence: Enable feedback loops in the fatigue management region (v0.3) to model fatigue memory - tracking accumulated fatigue over time rather than just current fatigue level.