How KWAM works

Deterministic, SHA-gated resilience — from the bit up to the fleet

KWAM is a small DSL, a deterministic erasure-coding runtime, an observability server, a consent-gated client, and an embeddable SDK. Every layer is content-verified; the model only advises.

The pipeline

Protect → Detect → Heal

A two-call SDK over a deterministic codec core.

# content-verified, SHA-gated — raises on any unrecoverable loss
from kwam import protect, recover

shards = protect("./src", durability="6-nines-target")   # RS/LRC encode, SHA-256 per fragment
recover(shards)   # reconstructs; never returns silently corrupted bytes

1 · Protect

Reed–Solomon / LRC encode into signed, content-addressed fragments spread across fault domains.
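As a rough illustration of what "signed, content-addressed" can mean, the sketch below names each fragment by the SHA-256 of its bytes and attaches a keyed tag. The `Fragment` type, the `make_fragment` and `split` helpers, and the demo key are hypothetical. KWAM's actual fragment format and signature scheme are not described here, and HMAC is only a standard-library stand-in for whatever signing it uses. No erasure coding is performed in this sketch.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    address: str    # hex SHA-256 of the payload (the content address)
    signature: str  # hex HMAC-SHA256 over the address (stand-in signature)
    payload: bytes


def make_fragment(payload: bytes, key: bytes) -> Fragment:
    """Name a payload by its own hash, then tag that name with a keyed MAC."""
    address = hashlib.sha256(payload).hexdigest()
    signature = hmac.new(key, address.encode(), hashlib.sha256).hexdigest()
    return Fragment(address, signature, payload)


def split(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-sized chunks, zero-padding the last one."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


# Hypothetical demo input and key.
fragments = [make_fragment(chunk, key=b"demo-key") for chunk in split(b"hello fleet", 4)]
```

Because the address is derived from the bytes themselves, two identical payloads always share an address, and any change to the bytes changes the name a reader expects to find.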

2 · Detect

Every fragment is SHA-256 gated on read. Mismatches are surfaced immediately: no silent corruption.
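A minimal sketch of a SHA-256 gate on read, assuming each fragment is stored alongside the digest recorded at encode time. `gated_read` and `CorruptFragmentError` are hypothetical names used for illustration, not part of the KWAM SDK.

```python
import hashlib


class CorruptFragmentError(Exception):
    """Raised when a fragment's bytes do not match its recorded digest."""


def gated_read(payload: bytes, expected_sha256: str) -> bytes:
    """Return the payload only if its SHA-256 matches the recorded digest."""
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        raise CorruptFragmentError(
            f"digest mismatch: expected {expected_sha256[:12]}, got {actual[:12]}"
        )
    return payload


good = b"fragment bytes"
digest = hashlib.sha256(good).hexdigest()
clean = gated_read(good, digest)  # passes the gate unchanged

flipped = bytes([good[0] ^ 0x01]) + good[1:]  # simulate a single bit flip
try:
    gated_read(flipped, digest)
    detected = False
except CorruptFragmentError:
    detected = True
```

The caller either gets the exact bytes that were written or an exception; there is no path that returns altered bytes.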

3 · Heal

The healer reconstructs from surviving fragments. Beyond the code distance, loss is reported, not masked.
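The heal-or-report behaviour can be sketched with the simplest possible erasure code: a single XOR parity fragment, which can rebuild exactly one missing fragment. This is a deliberately simpler stand-in for Reed–Solomon / LRC, not KWAM's codec, and `encode`, `heal`, and `UnrecoverableLoss` are hypothetical names.

```python
from functools import reduce
from typing import Optional


class UnrecoverableLoss(Exception):
    """Raised when more fragments are missing than the code can rebuild."""


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(chunks: list[bytes]) -> list[bytes]:
    """Append one XOR parity fragment to k equal-length data chunks."""
    return chunks + [reduce(xor_bytes, chunks)]


def heal(fragments: list[Optional[bytes]]) -> list[bytes]:
    """Rebuild at most one missing fragment (None); report anything beyond that."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise UnrecoverableLoss(f"fragments {missing} lost; single parity rebuilds only one")
    healed = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        healed[missing[0]] = reduce(xor_bytes, survivors)  # XOR of the rest
    return healed


stored = encode([b"ab", b"cd", b"ef"])
damaged = [stored[0], None, stored[2], stored[3]]  # one fragment lost
healed = heal(damaged)
```

With two fragments missing, `heal` raises instead of guessing, which mirrors the rule that loss beyond the code distance is reported rather than masked.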

Architecture

Five layers, one honest contract

Each component does one job; none of them invents data.

| Layer | What it does | Guarantee |
| --- | --- | --- |
| DSL + SDK | The KWAM language and the protect/recover SDK surface. | Deterministic, content-verified |
| Codec runtime | Reed–Solomon / LRC / fountain codecs, durability math, the healer. | Produces every byte; SHA-gated |
| NMS + HUD | Observability server: fleet topology, durability SLA, live metrics. | Simulated data labeled plainly |
| Client | Consent-gated responder + installer; read-only discovery. | Never self-mints consent |
| AI manifest | Open, decodable description of the system. | Never a covert payload |
Consent
Installation is gated on an externally minted, signed, single-use token. The server cannot forge it, and discovery is bounded and read-only, never a scanner.
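To make the "signed, single-use" part concrete, here is a small sketch of a token that verifies once and is rejected on replay. Everything in it is an assumption for illustration: the token layout, the `mint` and `redeem` names, and the use of HMAC. In particular, HMAC is a shared-key stand-in, so it does not reproduce the "server cannot forge it" property; that would need an asymmetric signature whose private key stays with the external minter.

```python
import hashlib
import hmac
import secrets

MINTER_KEY = secrets.token_bytes(32)  # hypothetical: held by the external minter
_spent: set[str] = set()              # nonces that have already been redeemed


def _tag(body: str) -> str:
    return hmac.new(MINTER_KEY, body.encode(), hashlib.sha256).hexdigest()


def mint(node_id: str) -> str:
    """Issue a token bound to one node (node_id must not contain ':')."""
    body = f"{node_id}:{secrets.token_hex(16)}"
    return f"{body}:{_tag(body)}"


def redeem(token: str, node_id: str) -> bool:
    """Accept a token once, for the node it names, if its tag verifies."""
    try:
        token_node, nonce, tag = token.split(":")
    except ValueError:
        return False
    expected = _tag(f"{token_node}:{nonce}")
    if token_node != node_id or not hmac.compare_digest(tag.encode(), expected.encode()):
        return False
    if nonce in _spent:
        return False  # replay: the token was already used
    _spent.add(nonce)
    return True
```

A token minted for one node is refused for any other, and a second call with the same token returns `False`.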
On the hardware

KWAM on the NVIDIA H100 fleet

KWAM is the orchestration layer that sits above your accelerators. It never touches the silicon's compute; it governs only the resilience of the Python codebases and data that run on it.

NVIDIA H100 SXM5 8-GPU baseboard, Lenovo ThinkSystem SR680a V3 inner chassis. Eight GPUs give KWAM eight independent fault domains to spread across.
Lenovo ThinkSystem NVIDIA H100 GPU, the unit of compute across which KWAM keeps Python codebases and data alive.

An H100 fleet is the most concentrated, most valuable compute most teams will ever run. KWAM is the resilience and orchestration layer that sits above it. It erasure-codes your Python codebase and its data into signed, content-addressed fragments, then spreads them across the fleet's natural fault domains so the loss of any single domain stays recoverable instead of fatal.

8 domains
Per SXM5 board. Eight GPUs become eight independent placement domains for the anti-affinity placer.
SHA‑256
On every fragment read from every H100 node. Corruption is detected, never returned silently.
6 nines
Durability design target across the fleet, engineered and defended, never sold as zero loss.
0 silent
Silently-corrupted bits. The guarantee KWAM makes outright, independent of the recovery math.
Mapped to the box

From one GPU up to the whole fleet

KWAM reads the physical shape of your deployment and treats every level of it as an independent fault domain. The deeper your fleet, the more places it can lose at once and still rebuild.

On a Lenovo ThinkSystem SR680a V3, that hierarchy runs from the individual H100, to the SXM5 baseboard, to the node, to the rack and power domain, and out to the availability zone. The placer keeps fragments in distinct domains at every level, so a dead GPU, a failed board, or a lost node never holds the only copy of your code.

Fault-domain ladder

GPU (×8 per baseboard)
↳ SXM5 baseboard
↳ node (SR680a V3)
↳ rack & power domain
↳ availability zone

anti-affinity keeps fragments in distinct domains at every level
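A minimal sketch of anti-affinity placement over the ladder above, assuming each candidate location is described by its zone, rack, node, and GPU. The `Slot` type and `place` function are hypothetical, and the greedy first-fit strategy is an illustration rather than KWAM's placer. Domain names are assumed to be globally unique, so that in a tree-shaped hierarchy, distinct domains at a coarse level are also distinct at every finer level beneath it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    zone: str
    rack: str
    node: str
    gpu: str


def place(n_fragments: int, slots: list[Slot], level: str) -> list[Slot]:
    """Greedily pick one slot per fragment, never reusing a domain at `level`."""
    chosen: list[Slot] = []
    used: set[str] = set()
    for slot in slots:
        if len(chosen) == n_fragments:
            break
        domain = getattr(slot, level)
        if domain not in used:
            chosen.append(slot)
            used.add(domain)
    if len(chosen) < n_fragments:
        raise ValueError(
            f"only {len(chosen)} distinct {level} domains for {n_fragments} fragments"
        )
    return chosen


# Hypothetical inventory: 1 zone, 2 racks, 2 nodes per rack, 2 GPUs per node.
slots = [
    Slot("az1", f"rack{r}", f"rack{r}-node{n}", f"rack{r}-node{n}-gpu{g}")
    for r in range(2) for n in range(2) for g in range(2)
]
per_node = place(4, slots, level="node")  # four fragments on four different nodes
```

Asking for more fragments than there are distinct domains raises an error instead of quietly doubling up, which is the point of anti-affinity.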

Why it matters here

Built for the way H100 fleets actually fail

Fault domains, mapped to the box

An 8-GPU SXM5 board gives KWAM eight independent placement domains in a single chassis. The anti-affinity placer keeps fragments in distinct domains, so a failed GPU or board never takes the only copy of your codebase.

SHA-gated, never silent

Every fragment read from an H100 node is SHA-256 verified. Corruption from a flaky link, a thermal event, or a radiation upset is surfaced immediately, never returned as a silently-corrupted byte.

The model only advises

KWAM's model can re-rank where to place or heal fragments across the fleet, but the deterministic RS/LRC codecs produce every byte. The model never produces a data byte, and never mints consent to a node.

Continuous healing in the background

As nodes drop and return, the healer rebuilds missing fragments from the survivors without pausing your workload. Reconstruction is deterministic and content-verified at every step.

Two calls in your code

protect() and recover() wrap any Python codebase. No rewrite of your training or inference stack, and nothing about the H100 compute path changes.

Honest under real loss

Push past the code distance and data is gone. KWAM reports exactly what it could not recover, with the sample size and confidence behind every measured recovery rate. Knowing is not the same as recovering, and we never blur the two.
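The paragraph above promises a sample size and a confidence level behind every measured recovery rate. One common way to attach a range to a measured rate is the Wilson score interval. The sketch below assumes that choice; KWAM's actual method is not stated, and the trial counts are invented for illustration.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a success rate (z = 1.96 is roughly 95% confidence)."""
    if trials <= 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Invented example: 997 successful recoveries out of 1000 injected-loss trials.
low, high = wilson_interval(997, 1000)
report = f"recovered 997/1000, 95% interval [{low:.4f}, {high:.4f}]"
```

Reporting the interval with the trial count keeps a small sample from being read as a precise rate: the same observed rate over 100 trials yields a visibly wider range.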

What KWAM does not do
KWAM does not accelerate, schedule, or alter H100 compute, and it makes no performance claim about the GPUs themselves. It governs the durability of the code and data on the fleet: detection is guaranteed, recovery is a six-nines design target, and anything it cannot recover is reported honestly.

NVIDIA, H100, and SXM are trademarks of NVIDIA Corporation. ThinkSystem is a trademark of Lenovo. Hardware shown for illustration. KWAM is independent software and is not affiliated with or endorsed by NVIDIA or Lenovo.

The math

Why "nines," not "zero"

Durability is a probability, engineered against fault-domain independence and code distance. Push past that distance and data is lost; KWAM's job is to detect and report that, not pretend it can't happen.

That residual probability is exactly what the nines describe. Detection (SHA-256) is the part we guarantee outright; recovery is the part we design toward six nines and measure honestly.
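A back-of-the-envelope version of this reasoning, under loudly stated assumptions: an (n, k) code survives as long as at most n − k of its n fragments are lost, and each fragment is lost independently with probability p within some window. The parameters below are hypothetical, and the model ignores repair time and correlated failures, so it is a sketch of the idea rather than KWAM's durability math.

```python
import math


def loss_probability(n: int, k: int, p: float) -> float:
    """P(more than n - k of n fragments fail), assuming independent failures."""
    return sum(
        math.comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(n - k + 1, n + 1)
    )


def nines(loss: float) -> float:
    """Number of nines of durability implied by a loss probability."""
    return math.inf if loss == 0 else -math.log10(loss)


# Hypothetical: 8 fragments, any 5 rebuild the data, 1% per-fragment loss.
wide = loss_probability(n=8, k=5, p=0.01)
# Same fleet, less redundancy: any 6 of 8 needed.
narrow = loss_probability(n=8, k=6, p=0.01)
```

Under these made-up numbers the first configuration lands just above six nines and the second falls well short, which shows why the figure is a design target tied to code distance and fault-domain independence rather than a constant.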

Design target

99.9999%

Six nines durability, engineered and defended, never marketed as perfection.

detect: guaranteed (SHA-256)
recover: design target (RS/LRC)

Space weather

Grounded in the real source

Accelerator fleets care about their radiation environment. KWAM reads space-weather signals from the authoritative source and keeps deep-space context clearly separate.

Honest sourcing
NOAA's Space Weather Prediction Center (SWPC) is the real space-weather source. JWST is deep-space context only and cannot forecast solar weather; we never imply otherwise.

Ready to scope a deployment?

KWAM is licensed directly. We'll map it to your fleet and your durability requirements.

Legal

Ownership & governing law

KWAM is our intellectual property, grounded in Swiss law.

Intellectual property & governing law

KWAM is the sole and exclusive property of the owners of KWAM.CH.

KWAM (its source code, the KWAM language, the JHMM reconstruction orchestrator, the deterministic codec runtime, and all associated AI components) is a proprietary computer program and the sole and exclusive intellectual property of KWAM.CH. As a computer program it is a protected work under the Swiss Federal Act on Copyright and Related Rights (Copyright Act, CopA), and the exclusive rights of use vest in KWAM.CH as employer; it is further protected as a trade secret under the Swiss Federal Act Against Unfair Competition (UCA). KWAM is offered by private licence only. All rights reserved.

CopA (SR 231.1) Art. 2 para. 3 & Art. 17 · UCA (SR 241) Art. 6 · Governed by the laws of Switzerland · Place of jurisdiction: Zürich