FINDING · DETECTION

Mamba-2's constrained scalar-times-identity A-matrix acts as an implicit regularizer for packet-byte sequences: under matched settings it yields higher mean F1 (0.9909 vs 0.9874), better worst-case F1 (0.9824 vs 0.9769), and 48% lower cross-dataset variance (4.77×10⁻⁵ vs 9.21×10⁻⁵) relative to Mamba-1, while delivering 30–60% faster backward passes and 2–4× lower GPU memory usage.

From 2026-kulatilleke-mambanetburst-direct-byte-levelMambaNetBurst: Direct Byte-level Network Traffic Classification without Tokenization or Pretraining · §V-B, §V-C, Table V · 2026 · arXiv preprint

Implications

Tags

censors
generic
techniques
ml-classifier

Extracted by claude-sonnet-4-6 — review before relying.