FINDING · DETECTION

Early downsampling via striding (stride=4) is the single most damaging ablation. It reduces average macro-F1 from 0.9909 to 0.9772, raises cross-dataset variance from 4.77×10⁻⁵ to 4.51×10⁻⁴, and drops the worst-case dataset to F1=0.9524, a far larger degradation than any other design choice, including Mamba-1 vs. Mamba-2.

From 2026-kulatilleke-mambanetburst-direct-byte-level · MambaNetBurst: Direct Byte-level Network Traffic Classification without Tokenization or Pretraining · §V-B, Table IV · 2026 · arXiv preprint
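
To put the reported numbers in perspective, the short sketch below recomputes the absolute macro-F1 drop, the variance ratio, and the implied cross-dataset standard deviations from the four values quoted in the finding. It also illustrates how stride-4 subsampling shrinks a byte sequence. The helper names (`summarize_ablation`, `strided_length`) and the plain "keep every fourth byte" subsampling are assumptions made for illustration only; the paper's actual striding layer may be implemented differently (for example, as a strided convolution).

```python
import math

# Values quoted in the finding (source: Table IV of the cited paper).
BASELINE_F1 = 0.9909
STRIDED_F1 = 0.9772
BASELINE_VAR = 4.77e-5
STRIDED_VAR = 4.51e-4


def summarize_ablation(base_f1, abl_f1, base_var, abl_var):
    """Return the F1 drop, variance ratio, and both standard deviations."""
    return {
        "f1_drop": base_f1 - abl_f1,
        "variance_ratio": abl_var / base_var,
        "base_std": math.sqrt(base_var),
        "ablated_std": math.sqrt(abl_var),
    }


def strided_length(n_bytes, stride):
    """Positions left after keeping every `stride`-th byte (hypothetical helper)."""
    return math.ceil(n_bytes / stride)


summary = summarize_ablation(BASELINE_F1, STRIDED_F1, BASELINE_VAR, STRIDED_VAR)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")

# Illustrative only: simple subsampling of a toy 10-byte payload with stride 4.
payload = bytes(range(10))
print(len(payload), "->", len(payload[::4]), "positions")
```

Under these assumptions, the absolute macro-F1 drop is 0.0137, the cross-dataset variance grows by a factor of roughly 9.5, and the standard deviation rises from about 0.007 to about 0.021. Stride 4 leaves only a quarter of the byte positions for later layers.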

Implications

Tags

censors
generic
techniques
ml-classifier
traffic-shape

Extracted by claude-sonnet-4-6 — review before relying.