LeanGuard, a 395M-parameter label-only encoder, matches much larger chain-of-thought reasoning guards at roughly 100× lower inference cost, using a single forward pass.
On the same decoder base, adding a chain of thought does not improve accuracy. Re-sampling the chain changes the verdict on only ~5% of inputs, and a linear probe shows the verdict is fixed before the chain is written: the reasoning justifies a decision rather than computing it.
Under injected training-label noise, a single-pass encoder degrades more gracefully than generation, and at a strict 1% false-positive rate it retains 44.8 recall versus the reasoning guard's 10.1.
The decoder's verdict is essentially decided before the chain is generated; the later reasoning restates it at ~100× the cost.
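
The strict 1% false-positive operating point mentioned above can be illustrated with a short sketch. This is not the released evaluation code (which is not yet available); the function name `recall_at_fpr`, its arguments, and the tie-handling rule are assumptions made for illustration. It picks the loosest score threshold that keeps the false-positive rate on benign examples at or below the budget, then reports recall on harmful examples.

```python
import numpy as np

def recall_at_fpr(scores, labels, max_fpr=0.01):
    """Recall at the loosest threshold whose FPR stays <= max_fpr.

    scores: higher means "more likely harmful" (hypothetical convention).
    labels: 1/True for harmful, 0/False for benign.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    neg = np.sort(scores[~labels])[::-1]  # benign scores, descending
    pos = scores[labels]
    if len(neg) == 0 or len(pos) == 0:
        raise ValueError("need at least one benign and one harmful example")
    # Number of benign examples we are allowed to flag by mistake.
    k = int(np.floor(max_fpr * len(neg) + 1e-9))
    # Flag only scores strictly above the (k+1)-th highest benign score,
    # so at most k benign examples are flagged (fewer if scores tie).
    threshold = neg[k] if k < len(neg) else -np.inf
    return float(np.mean(pos > threshold))

# Toy example: 4 benign and 3 harmful scores, 25% FPR budget.
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.9, 0.2]
labels = [0, 0, 0, 0, 1, 1, 1]
print(recall_at_fpr(scores, labels, max_fpr=0.25))
```

With a strict budget such as 1%, the threshold sits in the extreme tail of the benign score distribution, so the metric rewards models whose scores separate the two classes cleanly rather than models that are merely accurate at a default threshold.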
Headline F1 (unweighted mean over prompt-harm, response-harm, refusal), evaluated cell-for-cell under the GuardReasoner protocol over eleven public benchmarks.
| Model | Backbone | Params | CoT | Single pass | Headline F1 |
|---|---|---|---|---|---|
| WildGuard-7B | decoder | 7B | ✓ | ✗ | 81.96 |
| GuardReasoner-1B | decoder | 1.24B | ✓ | ✗ | 82.05 |
| GuardReasoner-3B | decoder | 3B | ✓ | ✗ | 82.50 |
| LeanGuard (ours) | encoder | 395M | ✗ | ✓ | 82.90 |
LeanGuard matches or exceeds the reasoning guards at roughly 100× lower inference cost and wins or ties 9 of 13 per-benchmark cells against GuardReasoner-1B; a free 3-seed vote reaches 83.35. A classifier trained with 10% of its labels corrupted (82.16) still matches a clean GuardReasoner-1B (82.05).
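
The text above does not specify how the 3-seed vote is computed. As a minimal sketch, assuming a hard majority vote over binary labels predicted by models trained with different seeds (the function name `majority_vote` and the input layout are hypothetical), it could look like this:

```python
import numpy as np

def majority_vote(seed_predictions):
    """Combine binary predictions from several seeds by majority vote.

    seed_predictions: array-like of shape (n_seeds, n_examples) with 0/1 labels.
    An odd number of seeds avoids ties; on a tie this sketch predicts 0.
    """
    preds = np.asarray(seed_predictions, dtype=int)
    votes_for_harmful = preds.sum(axis=0)
    return (2 * votes_for_harmful > preds.shape[0]).astype(int)

# Three seeds, three examples.
print(majority_vote([[1, 0, 1],
                     [1, 1, 0],
                     [0, 0, 1]]))
```

Averaging class probabilities across seeds before thresholding is a common alternative; which variant produces the reported 83.35 is not stated here.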
| Resource | Status |
|---|---|
| Paper (arXiv preprint) | Available |
| Figures and project page | Available |
| Training and evaluation code | Coming soon |
| Google Colab demo (one-click reproduction) | Coming soon |
| Pretrained checkpoints + ONNX export (Hugging Face) | Coming soon |
| Training / evaluation dataset splits | Coming soon |
@article{na2026leanguard,
title = {Do Safety Guardrails Need to Reason? LeanGuard: A Fast
and Light Approach for Robust Moderation},
author = {Na, Dongbin},
journal = {arXiv preprint arXiv:XXXX.XXXXX},
year = {2026}
}