2012-weinberg-stegotorus
findings extracted from this paper
-
The StegoTorus HTTP module degrades severely with network latency: it can sustain only a 50 kB/s stream at latencies below 200 ms and fails entirely at higher rates or latencies, because the HTTP request-response pattern transfers only one or two 512-byte Tor cells per round-trip. Plain Tor and chopper-only StegoTorus show no measurable throughput degradation at latencies up to 450 ms. Increasing parallel HTTP connections improves low-latency throughput but does not recover high-latency performance.
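
The round-trip bottleneck can be made concrete with a back-of-envelope calculation. This is a rough sketch under a simplifying assumption that is not from the paper: each HTTP connection moves a fixed number of 512-byte cells per round-trip and does nothing else. The function names and the idea of counting connections are hypothetical, for illustration only.

```python
import math

CELL_BYTES = 512  # fixed Tor cell size


def goodput_ceiling(rtt_s: float, cells_per_round_trip: int, connections: int = 1) -> float:
    """Upper bound on payload bytes/second under the simplified model:
    every connection delivers `cells_per_round_trip` cells once per RTT."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return connections * cells_per_round_trip * CELL_BYTES / rtt_s


def connections_needed(target_bytes_per_s: float, rtt_s: float, cells_per_round_trip: int) -> int:
    """Smallest number of parallel connections whose combined ceiling
    reaches the target rate (again, only within this toy model)."""
    per_connection = goodput_ceiling(rtt_s, cells_per_round_trip)
    return math.ceil(target_bytes_per_s / per_connection)


if __name__ == "__main__":
    # Two cells per round-trip at 200 ms: about 5 kB/s per connection.
    print(goodput_ceiling(0.2, 2))
    # Parallel connections required for a 50 kB/s stream at 200 ms and 450 ms.
    print(connections_needed(50_000, 0.2, 2))
    print(connections_needed(50_000, 0.45, 2))
```

Within this model the per-connection ceiling falls in inverse proportion to latency, so the number of connections needed for a fixed-rate stream grows linearly with round-trip time.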
-
HTTP steganography in StegoTorus expands upstream traffic 41× and downstream traffic 12× relative to a direct connection (966,964 bytes uploaded vs. 23,643 bytes for a direct connection when transferring a 1 MB file). Chopper-only operation adds only ~2.7× upstream overhead, comparable to plain Tor. Maximum achievable goodput with the HTTP module is ~27 kB/s (~4× a 56 kbps modem), which the authors attribute to a minimum expansion factor of 8× inherent in contemporary steganographic schemes.
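
The arithmetic behind the upstream expansion factor and the modem comparison can be checked directly. The sketch below uses only the byte counts and rates quoted above; the helper name expansion_factor is hypothetical, and the modem is assumed to carry 56,000 bits per second with 1 kB taken as 1,000 bytes.

```python
def expansion_factor(observed_bytes: int, baseline_bytes: int) -> float:
    """Ratio of bytes sent with steganography to bytes sent directly."""
    if baseline_bytes <= 0:
        raise ValueError("baseline_bytes must be positive")
    return observed_bytes / baseline_bytes


# Upstream bytes for the 1 MB transfer: StegoTorus-HTTP vs. direct connection.
upstream = expansion_factor(966_964, 23_643)

# A 56 kbps modem expressed in kB/s (assumption: 56,000 bit/s, 1 kB = 1,000 bytes).
modem_kB_per_s = 56_000 / 8 / 1000

# Quoted HTTP-module goodput relative to the modem.
goodput_vs_modem = 27 / modem_kB_per_s

if __name__ == "__main__":
    print(f"upstream expansion: {upstream:.1f}x")
    print(f"goodput vs. modem:  {goodput_vs_modem:.2f}x")
```

Both results round to the figures quoted in the finding (41× and roughly 4×).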
-
A naive-Bayes website-fingerprinting classifier achieves AUC > 0.94 against vanilla Tor for 8 of 9 Alexa top-ten sites (e.g., Wikipedia 0.9991, YouTube 0.9947). Against StegoTorus-HTTP, AUC drops to ≤ 0.75 for 7 of 9 sites (YouTube 0.4125, Facebook 0.5413, Google 0.6928), which the authors argue is too low for practical perimeter-scale deployment where near-perfect precision is required to avoid error floods.
-
Tor's fixed 512-byte cells packed into TLS 1.0 records produce a characteristic TCP payload of 586 bytes (512 + 74 bytes of TLS overhead). A perimeter filter can track an exponential moving average of how often this length appears, updating τ ← ατ + (1−α)·1[l = 586] for each packet, where l is the TCP payload length and 1[·] is the indicator function, and flagging a flow as Tor once τ exceeds the threshold T (α = 0.1, T = 0.4). This identifies Tor flows within a few dozen packets and runs at backbone rates of ~540,000 packets/second on commodity hardware. Obfsproxy does not alter packet sizes or timings and therefore does not defeat this classifier.
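
A minimal sketch of the moving-average detector, taking the update rule literally as written above. The function name first_detection and the convention that a flow is flagged as soon as τ rises above T are illustrative assumptions, not the authors' code. Note that with α = 0.1 the newest packet carries weight 1 − α = 0.9, so under this literal reading a single 586-byte payload is enough to cross T = 0.4.

```python
from typing import Iterable, Optional

TOR_TLS_PAYLOAD = 586  # 512-byte cell + 74 bytes of TLS overhead
ALPHA = 0.1            # weight on the previous average, as quoted
THRESHOLD = 0.4        # T, as quoted


def first_detection(
    payload_lengths: Iterable[int],
    alpha: float = ALPHA,
    threshold: float = THRESHOLD,
    target: int = TOR_TLS_PAYLOAD,
) -> Optional[int]:
    """Return the 1-based index of the packet at which tau first exceeds
    the threshold, or None if the flow is never flagged."""
    tau = 0.0
    for index, length in enumerate(payload_lengths, start=1):
        indicator = 1.0 if length == target else 0.0
        # tau <- alpha * tau + (1 - alpha) * 1[l == 586]
        tau = alpha * tau + (1.0 - alpha) * indicator
        if tau > threshold:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical flows: payload lengths in bytes, one entry per packet.
    tor_like = [1448, 320, 586, 586, 586]
    other = [1448, 1448, 97, 1448, 512]
    print(first_detection(tor_like))  # flagged at the first 586-byte payload
    print(first_detection(other))     # never flagged
```

The detector keeps one float per flow and does constant work per packet, which is consistent with the claim that it can run at backbone packet rates.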
-
StegoTorus ships with a fixed set of packet traces and HTTP covertext databases but lets users record their own; classifiers trained on the distributed covertext will not generalize to user-generated databases. The paper further notes that reusing a small number of traces creates a statistical fingerprint, because censors can learn conversation patterns from packet sizes and timings alone, so trace diversity must be maintained over time.