FINDING · EVALUATION

Fine-tuned BERT and RoBERTa steganalysis discriminators achieve only 47.8–50.6% detection accuracy on stegotext generated with GPT-2, OPT-1.3B, and Llama-2-7B, which is indistinguishable from random guessing. Human evaluators perform similarly poorly (46.6–50.6% accuracy, F1 ≤ 51.5%), and the paper notes that statistical classifiers already outperform humans on this discrimination task.
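
Whether an accuracy near 50% is "indistinguishable from random guessing" depends on the size of the test set, which this note does not record. The sketch below shows one way to check this with an exact two-sided binomial test against a 50% chance rate. The function names and the sample size of 1,000 are hypothetical illustrations, not values or code from the paper, and the test assumes a balanced cover/stego test set.

```python
from math import exp, lgamma, log


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log-probability of exactly k successes in n trials (0 < p < 1)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)


def binomial_two_sided_p(correct: int, n: int, chance: float = 0.5) -> float:
    """Exact two-sided p-value for `correct` out of `n` against `chance`.

    Sums the probability of every outcome that is no more likely than
    the observed one (the usual "minimum likelihood" definition).
    """
    pmf = [exp(log_binom_pmf(k, n, chance)) for k in range(n + 1)]
    observed = pmf[correct]
    # Small relative tolerance so symmetric outcomes count as ties.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-7))
    return min(1.0, total)


if __name__ == "__main__":
    n_samples = 1000  # ASSUMPTION: hypothetical test-set size, not from the paper
    for accuracy in (0.478, 0.506):  # endpoints of the reported range
        correct = round(accuracy * n_samples)
        p_value = binomial_two_sided_p(correct, n_samples)
        print(f"accuracy={accuracy:.3f} n={n_samples} p-value={p_value:.3f}")
```

Under this hypothetical sample size, both endpoints of the reported range give p-values above 0.05, so chance performance would not be rejected. A much larger test set could change that conclusion, so the paper's actual evaluation sizes should be used before drawing on this check.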

From 2026-yan-efficient-provably-secure · Efficient Provably Secure Linguistic Steganography via Range Coding · §6.5, §6.6, Table 3, Table 8 · 2026 · arXiv preprint

Implications

Tags

techniques
ml-classifier
defenses
steganography

Extracted by claude-sonnet-4-6 — review before relying.