ADR-062: QEMU ESP32-S3 Swarm Configurator
| Field | Value |
|-------|-------|
| Status | Accepted |
| Date | 2026-03-14 |
| Authors | RuView Team |
| Relates | ADR-061 (QEMU testing platform), ADR-060 (channel/MAC filter), ADR-018 (binary frame), ADR-039 (edge intel) |
| Term | Definition |
|------|------------|
| Swarm | A group of N QEMU ESP32-S3 instances running simultaneously |
| Topology | How nodes are connected: star, mesh, line, ring |
| Role | Node function: sensor (collects CSI), coordinator (aggregates + forwards), gateway (bridges to host) |
| Scenario matrix | Cross-product of topology × node count × NVS config × mock scenario |
| Health oracle | Python process that monitors all node UART logs and declares swarm health |
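The scenario matrix defined above is a plain cross-product, which can be sketched with `itertools.product`. The axis values below are hypothetical placeholders (this ADR does not list the real matrix dimensions); only the four axis names come from the definition.

```python
from itertools import product

# Hypothetical axis values, chosen only to illustrate the cross-product.
TOPOLOGIES = ["star", "mesh", "line", "ring"]
NODE_COUNTS = [2, 3]
NVS_CONFIGS = ["default", "channel_override"]
MOCK_SCENARIOS = [0, 2, 3]  # empty room, walking person, fall event


def scenario_matrix(topologies, node_counts, nvs_configs, scenarios):
    """Return one dict per combination of the four axes."""
    return [
        {"topology": t, "node_count": n, "nvs_config": c, "scenario": s}
        for t, n, c, s in product(topologies, node_counts, nvs_configs, scenarios)
    ]


matrix = scenario_matrix(TOPOLOGIES, NODE_COUNTS, NVS_CONFIGS, MOCK_SCENARIOS)
print(len(matrix))  # 4 * 2 * 2 * 3 = 48
```

The matrix grows multiplicatively with each axis, which is one reason a small set of curated presets may be preferable to running the full cross-product in CI.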
ADR-061 Layer 3 provides a basic multi-node mesh test: N identical nodes with sequential TDM slots connected via a Linux bridge. This is useful but limited:
- All nodes are identical: real deployments have heterogeneous roles (sensor, coordinator, gateway)
- Single topology: only a fully-connected bridge; no star, line, or ring topologies
- No scenario variation per node: all nodes run the same mock CSI scenario
- Manual configuration: each test requires hand-editing env vars and arguments
- No swarm-level health monitoring: validation checks individual nodes, not collective behavior
- No cross-node timing validation: TDM slot ordering and inter-frame gaps aren't verified
Real WiFi-DensePose deployments use 3-8 ESP32-S3 nodes in various topologies. A single coordinator aggregates CSI from multiple sensors. The firmware must handle TDM conflicts, missing nodes, role-based behavior differences, and network partitions — none of which ADR-061 Layer 3 tests.
Build a QEMU Swarm Configurator — a YAML-driven tool that defines multi-node test scenarios declaratively and orchestrates them under QEMU with swarm-level validation.
┌─────────────────────────────────────────────────────┐
│ swarm_config.yaml                                   │
│ nodes: [{role: sensor, scenario: 2, channel: 6}]    │
│ topology: star                                      │
│ duration: 60s                                       │
│ assertions: [all_nodes_boot, tdm_no_collision, ...] │
└──────────────────────┬──────────────────────────────┘
                       │
          ┌────────────▼────────────┐
          │      qemu_swarm.py      │
          │     (orchestrator)      │
          └───┬────┬─────┬────┬─────┘
              │    │     │    │
         ┌────▼┐  ┌▼──┐  ▼   ┌▼─────┐
         │Node0│  │N1 │ ...  │N(n-1)│  QEMU instances
         │sens │  │sen│      │coord │
         └──┬──┘  └─┬─┘      └──┬───┘
            │       │           │
         ┌──▼───────▼───────────▼───┐
         │     Virtual Network      │  TAP bridge / SLIRP
         │    (topology-shaped)     │
         └─────────────┬────────────┘
                       │
         ┌─────────────▼────────────┐
         │    Aggregator (Rust)     │  Collects frames
         └─────────────┬────────────┘
                       │
         ┌─────────────▼────────────┐
         │      Health Oracle       │  Swarm-level assertions
         │    (swarm_health.py)     │
         └──────────────────────────┘
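The orchestrator's launch step in the diagram could look roughly like the following sketch, which only builds the per-node QEMU command lines. It assumes Espressif's QEMU fork (binary `qemu-system-xtensa`, machine `esp32s3`); the flash image and UART log file names are hypothetical, and networking flags are left out because they depend on the chosen topology.

```python
from pathlib import Path


def qemu_command(node_id, flash_image, log_dir, qemu_bin="qemu-system-xtensa"):
    """Build the argv for one QEMU ESP32-S3 instance (networking flags omitted)."""
    # Hypothetical naming scheme: one UART log file per node.
    uart_log = Path(log_dir) / f"node{node_id}_uart.log"
    return [
        qemu_bin,
        "-machine", "esp32s3",
        "-nographic",
        "-drive", f"file={flash_image},if=mtd,format=raw",
        "-serial", f"file:{uart_log}",
    ]


def swarm_commands(node_ids, image_dir, log_dir):
    """Map each node_id to its QEMU argv, assuming one flash image per node."""
    return {
        node_id: qemu_command(
            node_id, Path(image_dir) / f"node{node_id}_flash.bin", log_dir
        )
        for node_id in node_ids
    }


for node_id, argv in swarm_commands([0, 1, 2], "build", "logs").items():
    print(node_id, " ".join(argv))
```

Returning argv lists rather than starting processes keeps the sketch side-effect free; an orchestrator would typically pass each list to `subprocess.Popen` and hand the log paths to the health oracle.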
YAML Configuration Schema
# swarm_config.yaml
swarm:
  name: "3-sensor-star"
  duration_s: 60
  topology: star          # star | mesh | line | ring
  aggregator_port: 5005

nodes:
  - role: coordinator
    node_id: 0
    scenario: 0           # empty room (baseline)
    channel: 6
    edge_tier: 2
    is_gateway: true      # receives aggregated frames
  - role: sensor
    node_id: 1
    scenario: 2           # walking person
    channel: 6
    tdm_slot: 1           # TDM slot index (auto-assigned from node position if omitted)
  - role: sensor
    node_id: 2
    scenario: 3           # fall event
    channel: 6
    tdm_slot: 2

assertions:
  - all_nodes_boot
  - no_crashes
  - tdm_no_collision
  - all_nodes_produce_frames
  - coordinator_receives_from_all
  - fall_detected_by_node_2
  - frame_rate_above: 15  # Hz minimum per node
  - max_boot_time_s: 10
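A minimal sketch of how the orchestrator might validate a parsed config and apply the `tdm_slot` auto-assignment rule. It works on an already-parsed dict (YAML loading is left out), assumes `nodes` sits at the top level beside `swarm`, and interprets "node position" as the index in the `nodes` list. The function name `normalize_config` is hypothetical.

```python
ALLOWED_TOPOLOGIES = {"star", "mesh", "line", "ring"}
ALLOWED_ROLES = {"sensor", "coordinator", "gateway"}


def normalize_config(config):
    """Validate a parsed swarm config and fill in omitted tdm_slot values."""
    topology = config["swarm"]["topology"]
    if topology not in ALLOWED_TOPOLOGIES:
        raise ValueError(f"unknown topology: {topology!r}")

    nodes = [dict(node) for node in config["nodes"]]  # copy; leave the input untouched
    node_ids = [node["node_id"] for node in nodes]
    if len(set(node_ids)) != len(node_ids):
        raise ValueError("node_id values must be unique")

    for position, node in enumerate(nodes):
        if node["role"] not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {node['role']!r}")
        # Assumption: "node position" means the index in the nodes list.
        node.setdefault("tdm_slot", position)

    return {**config, "nodes": nodes}


example = {
    "swarm": {"name": "3-sensor-star", "duration_s": 60,
              "topology": "star", "aggregator_port": 5005},
    "nodes": [
        {"role": "coordinator", "node_id": 0, "scenario": 0, "channel": 6},
        {"role": "sensor", "node_id": 1, "scenario": 2, "channel": 6, "tdm_slot": 1},
        {"role": "sensor", "node_id": 2, "scenario": 3, "channel": 6},
    ],
}
normalized = normalize_config(example)
print([n["tdm_slot"] for n in normalized["nodes"]])  # [0, 1, 2]
```

Failing fast on an unknown topology or duplicate `node_id` keeps configuration mistakes from surfacing later as confusing QEMU or assertion failures.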
| Topology | Network | Description |
|----------|---------|-------------|
| star | All sensors connect to coordinator; coordinator has TAP to each sensor | Hub-and-spoke, most common |
| mesh | All nodes on same bridge (existing Layer 3 behavior) | Every node sees every other |
| line | Node 0 ↔ Node 1 ↔ Node 2 ↔ ... | Linear chain, tests multi-hop |
| ring | Like line but last connects to first | Circular, tests routing |
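The four topologies above can be expressed as sets of node-to-node links, which is one way an orchestrator might decide what virtual network to build. The sketch below is illustrative only: the function name `topology_links` is hypothetical, and for `mesh` the pairs describe reachability on the shared bridge rather than separate point-to-point links.

```python
def topology_links(topology, node_ids, coordinator_id=None):
    """Return the undirected links (as sorted id pairs) for a topology."""
    ids = list(node_ids)
    if topology == "star":
        # Assumption: the first node is the hub unless a coordinator is named.
        hub = ids[0] if coordinator_id is None else coordinator_id
        links = [(hub, n) for n in ids if n != hub]
    elif topology == "mesh":
        links = [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]]
    elif topology == "line":
        links = list(zip(ids, ids[1:]))
    elif topology == "ring":
        links = list(zip(ids, ids[1:]))
        if len(ids) > 2:
            links.append((ids[-1], ids[0]))  # close the loop
    else:
        raise ValueError(f"unknown topology: {topology!r}")
    return sorted(tuple(sorted(link)) for link in links)


print(topology_links("star", [0, 1, 2]))     # [(0, 1), (0, 2)]
print(topology_links("line", [0, 1, 2, 3]))  # [(0, 1), (1, 2), (2, 3)]
print(topology_links("ring", [0, 1, 2, 3]))  # [(0, 1), (0, 3), (1, 2), (2, 3)]
```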
| Role | Behavior | NVS Keys |
|------|----------|----------|
| sensor | Runs mock CSI, sends frames to coordinator | node_id, tdm_slot, target_ip |
| coordinator | Receives frames from sensors, runs edge aggregation | node_id, tdm_slot=0, edge_tier=2 |
| gateway | Like coordinator but also bridges to host UDP | node_id, target_ip=host, is_gateway=1 |
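The role table above can be read as a mapping from a node's role to the NVS values it needs. A minimal sketch follows; the key names come from the table, while the function name `nvs_values` and both IP addresses are hypothetical placeholders. It also assumes a sensor's `target_ip` is the coordinator's address, since sensors send their frames to the coordinator.

```python
def nvs_values(node, coordinator_ip="10.0.0.1", host_ip="10.0.0.254"):
    """Return the NVS key/value pairs for one node, following the role table."""
    role = node["role"]
    values = {"node_id": node["node_id"]}
    if role == "sensor":
        values["tdm_slot"] = node["tdm_slot"]
        values["target_ip"] = coordinator_ip  # assumption: sensors target the coordinator
    elif role == "coordinator":
        values["tdm_slot"] = 0
        values["edge_tier"] = 2
    elif role == "gateway":
        values["target_ip"] = host_ip  # placeholder for the host address
        values["is_gateway"] = 1
    else:
        raise ValueError(f"unknown role: {role!r}")
    return values


print(nvs_values({"role": "sensor", "node_id": 1, "tdm_slot": 1}))
print(nvs_values({"role": "coordinator", "node_id": 0}))
```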
| Assertion | What It Checks |
|-----------|----------------|
| all_nodes_boot | Every node's UART log shows boot indicators within timeout |
| no_crashes | No Guru Meditation, assert, panic in any log |
| tdm_no_collision | No two nodes transmit in the same TDM slot |
| all_nodes_produce_frames | Every sensor node's log contains CSI frame output |
| coordinator_receives_from_all | Coordinator log shows frames from each sensor's node_id |
| fall_detected_by_node_N | Node N's log reports a fall detection event |
| frame_rate_above | Each node produces at least N frames/second |
| max_boot_time_s | All nodes boot within N seconds |
| no_heap_errors | No OOM or heap corruption in any log |
| network_partitioned_recovery | After deliberate partition, nodes resume communication (future) |
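Three of the assertions above could be sketched as small predicates over the collected UART logs. The log line format `CSI frame t=<ms>` is a hypothetical stand-in, since the firmware's real log format is not specified here. The `tdm_no_collision` check below only inspects configured slots, a static approximation of the runtime check the table describes.

```python
import re

CRASH_MARKERS = ("Guru Meditation", "assert", "panic")
# Hypothetical log line format: "CSI frame t=<milliseconds>"
FRAME_RE = re.compile(r"CSI frame t=(\d+)")


def no_crashes(logs):
    """Pass when no node log contains a crash marker (plain substring match)."""
    return not any(
        marker in text for text in logs.values() for marker in CRASH_MARKERS
    )


def tdm_no_collision(nodes):
    """Pass when every configured tdm_slot is used by at most one node."""
    slots = [node["tdm_slot"] for node in nodes]
    return len(slots) == len(set(slots))


def frame_rate_above(logs, min_hz):
    """Pass when each node's frame timestamps imply at least min_hz frames/second."""
    for text in logs.values():
        stamps = [int(ms) for ms in FRAME_RE.findall(text)]
        if len(stamps) < 2 or stamps[-1] == stamps[0]:
            return False  # too few frames to estimate a rate
        rate = (len(stamps) - 1) * 1000.0 / (stamps[-1] - stamps[0])
        if rate < min_hz:
            return False
    return True


logs = {
    0: "boot ok\nCSI frame t=0\nCSI frame t=50\nCSI frame t=100\n",  # 20 Hz
    1: "boot ok\nCSI frame t=0\nCSI frame t=40\nCSI frame t=80\n",   # 25 Hz
}
print(no_crashes(logs), frame_rate_above(logs, 15))  # True True
```

Plain substring matching for `assert` would also flag benign lines that merely contain the word, so a real oracle would likely need stricter patterns.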
| Preset | Nodes | Topology | Purpose |
|--------|-------|----------|---------|
| smoke | 2 | star | Quick CI smoke test (15s) |
| standard | 3 | star | Default 3-node (sensor + sensor + coordinator) |
| large-mesh | 6 | mesh | Scale test with 6 fully-connected nodes |
| line-relay | 4 | line | Multi-hop relay chain |
| ring-fault | 4 | ring | Ring with fault injection mid-test |
| heterogeneous | 5 | star | Mixed scenarios: walk, fall, static, channel-sweep, empty |
| ci-matrix | 3 | star | CI-optimized preset (30s, minimal assertions) |
scripts/
├── qemu_swarm.py              # Main orchestrator (CLI entry point)
├── swarm_health.py            # Swarm-level health oracle
└── swarm_presets/
    ├── smoke.yaml
    ├── standard.yaml
    ├── large_mesh.yaml
    ├── line_relay.yaml
    ├── ring_fault.yaml
    ├── heterogeneous.yaml
    └── ci_matrix.yaml
.github/workflows/
└── firmware-qemu.yml          # MODIFIED: add swarm test job
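Preset names use hyphens (for example large-mesh) while the files under scripts/swarm_presets/ use underscores (large_mesh.yaml). One plausible way to resolve a preset name to its file is sketched below; the hyphen-to-underscore rule and the function name `preset_path` are assumptions, not something this ADR specifies.

```python
from pathlib import Path

PRESET_DIR = Path("scripts") / "swarm_presets"
PRESETS = (
    "smoke", "standard", "large-mesh", "line-relay",
    "ring-fault", "heterogeneous", "ci-matrix",
)


def preset_path(name, preset_dir=PRESET_DIR):
    """Map a preset name to its YAML file (assumed rule: hyphens become underscores)."""
    if name not in PRESETS:
        raise ValueError(f"unknown preset {name!r}; choose from {', '.join(PRESETS)}")
    return Path(preset_dir) / f"{name.replace('-', '_')}.yaml"


print(preset_path("large-mesh").name)  # large_mesh.yaml
```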
Positive
- Declarative testing: define swarm topology in YAML, not shell scripts
- Role-based nodes: test coordinator/sensor/gateway interactions
- Topology variety: star/mesh/line/ring match real deployment patterns
- Swarm-level assertions: validate collective behavior, not just individual nodes
- Preset library: quick CI smoke tests and thorough manual validation
- Reproducible: YAML configs are version-controlled and shareable
Negative
- Root is still required for TAP bridge topologies (star, line, ring); mesh can use SLIRP
- QEMU resource usage: 6+ QEMU instances use ~2GB RAM and may slow CI runners
- No real RF: inter-node communication is IP-based, not WiFi CSI multipath
References
- ADR-061: QEMU ESP32-S3 firmware testing platform (Layers 1-9)
- ADR-060: Channel override and MAC address filter provisioning
- ADR-018: Binary CSI frame format (magic 0xC5110001)
- ADR-039: Edge intelligence pipeline (biquad, vitals, fall detection)