TECHNIQUES

random-payload-detect Random / high-entropy payload detection

Censors flag flows whose first N bytes have higher entropy than typical legitimate protocols. The GFW's original Shadowsocks detection used this signal.

4 papers on file

2025-himmelberger-drivel Drivel: A Quantum-Safe Fully Encrypted Protocol Proxy ETH Zurich (MSc thesis) · 2025
2023-wu-fully-encrypted-detect How the Great Firewall of China detects and blocks fully encrypted traffic USENIX Security · 2023
2022-blocking-tls-circumvention Large scale blocking of TLS-based censorship circumvention tools in China gfw.report · 2022
2020-alice-shadowsocks-detection How China Detects and Blocks Shadowsocks IMC · 2020

39 findings tagged here

An uncertainty-aware filtering (UF) mechanism quantifies per-token reliability via Shannon entropy of the cross-modal header–payload attention matrix, finding that encrypted payloads still contain low-entropy tokens with stable cross-modal alignment that serve as reliable classification anchors — demonstrating that nominally randomized byte streams retain exploitable low-entropy structure.

2026-he-trafficmoe-heterogeneity-aware-mixture §III-D detection
Encrypted traffic exhibits a 'full-frequency' spectral property where both low- and high-frequency components are highly active with comparable intensity, unlike natural images which are dominated by low-frequency components. Fourier Transform analysis across CIC-IoT2023, DoHBrw2020, and ISCX-Tor2016 confirms this distinction is pervasive. This signature is an inherent consequence of encryption disrupting byte-level semantics into a visually disordered, noise-like spatial pattern.

2026-lian-decompose-understand-fuse §II detection
Ablation experiments show that removing the high-frequency branch from FreeUp degrades AUC from 86.68% to 77.09% on CIC-IoT2023 (−9.6 pp) and from 95.53% to 95.10% on ISCX-Tor2016. Removing the entire frequency-decoupled framework causes the largest degradation, dropping to 82.10% AUC on CIC-IoT2023 and 81.26% on DoHBrw2020, confirming that high-frequency components are the primary discriminative signal in encrypted traffic anomaly detection.

2026-lian-decompose-understand-fuse §V-C, Table II detection
I2P payload entropy is close to 8 bits per byte (Figure 9), confirming strong encryption that renders payload content analytically unusable. Across all CNN experiments, models trained on payload data alone achieved 72.5–76.5% accuracy versus 95.17–99.5% for metadata-only variants; encrypted payload acted as 'noise that confused the model' rather than as a signal.

2026-rohrer-convolutional-neural-networks-deanonymisation-i2p §IV-A, §V Experiment 2, Table IV detection
I2P payload entropy was measured at close to 8 bits per byte across sampled packets (Figure 9), confirming that payload content is cryptographically indistinguishable from random noise and provides no usable signal for classification. All experimental variants using raw payload alone achieved poor and high-variance accuracy (72.5–76.5%), while excluding payload improved accuracy to 99.5% in lab conditions.

2026-rohrer-convolutional-neural-networks-deanonymisation-i2p §IV-A / §V Discussions detection
Drivel evaluates its design against the GFW's fully-encrypted-traffic detector (documented in Wu et al. 2023). The thesis demonstrates that switching to post-quantum primitives does not by itself change the traffic's appearance to a statistical censor classifier — the fully-encrypted detection problem is independent of the underlying cryptographic algorithm and must be addressed at the traffic-shaping layer regardless of key-exchange choice.

2025-himmelberger-drivel §3, §4 detection
Drivel is an obfs4-style fully-encrypted proxy protocol that replaces obfs4's pre-quantum cryptographic primitives with post-quantum alternatives. It is one of the first circumvention protocols explicitly designed to remain secure under a quantum adversary, addressing the forward-secrecy threat to deployed circumvention traffic recorded today for future decryption.

2025-himmelberger-drivel §1, §2 defense
State-of-the-art ML-based obfs4 detection (Wang et al. decision tree) achieves 97% precision at equal base rates (λ=1) but precision collapses to 3% at a still-conservative λ=1,000; at λ=10⁶ precision approaches zero for all classifiers tested. This base-rate failure was previously uncharacterized because prior evaluations only considered balanced or near-balanced datasets.

2024-wails-precisely §IV-D (Scalability), Figure 3, Table I detection
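
The base-rate effect described above follows from the definition of precision. The sketch below is illustrative only: the TPR and FPR values are hypothetical, chosen so that precision is 97% at λ=1, and are not figures reported in the paper. Here λ is taken to mean the number of benign flows per obfs4 flow.

```python
def precision_at_base_rate(tpr: float, fpr: float, lam: float) -> float:
    """Precision when there are `lam` benign flows for every target flow.

    Per target flow, the classifier yields `tpr` true positives and
    `lam * fpr` false positives on average.
    """
    true_pos = tpr
    false_pos = lam * fpr
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point (assumption, not from the paper).
TPR = 0.97
FPR = 0.03

for lam in (1, 1_000, 1_000_000):
    print(f"lambda={lam:>9,}: precision={precision_at_base_rate(TPR, FPR, lam):.6f}")
```

With these assumed rates, precision is 0.97 at λ=1, about 0.031 at λ=1,000, and about 0.00003 at λ=10⁶, which mirrors the collapse the finding reports.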
obfs4 and obfs⋆ produce characteristic wire patterns—bursts of roughly MTU-sized payloads followed by a randomly-sized chaff packet—that CNN classifiers detect purely from packet-size sequences without payload inspection. A trivial per-bridge entropy-biasing re-encoding (obfs⋆) completely defeats the hand-tuned decision tree (0% precision, 0% recall) but does not reduce CNN detectability, because the CNN generalizes across size-distribution variants.

2024-wails-precisely §V-E, §IV-D-3, Figure 4 detection
Despite fully encrypted protocols existing since obfs2 in 2012, the first documented evidence of the GFW passively detecting them purely by randomness appeared only in 2021 — approximately a decade later — and was limited to certain foreign IP address ranges and a subsampled fraction of traffic. Meanwhile, the GFW had been discovering obfs2/obfs3 servers via active probing as early as 2013, indicating censors found active-probing-based address discovery cheaper and more reliable than passive statistical classifiers for this protocol family.

2023-fifield-comments §5 evaluation
The GFW detects Shadowsocks by flagging apparently high-entropy connections that are not TLS or HTTP, but this detection is brittle: connections are explicitly allowed if the first 6 bytes of the first packet of a flow are all printable ASCII characters (range 0x20–0x7E). Adding a 6-byte alphanumeric preamble to the Shadowsocks message definition is sufficient to bypass this heuristic and requires only a short patch to the protocol specification file.

2023-wails-proteus §3.2 detection
The GFW's fully-encrypted detector (deployed Nov 2021) operates by exempting likely-benign traffic and blocking the rest. Five inferred exemption rules applied to the first TCP payload (pkt): Ex1 — popcount(pkt)/len(pkt) ≤ 3.4 or ≥ 4.6 (bits/byte); Ex2 — first 6+ bytes are printable ASCII [0x20–0x7e]; Ex3 — more than 50% of bytes are printable ASCII; Ex4 — more than 20 contiguous printable ASCII bytes; Ex5 — first bytes match TLS or HTTP fingerprint. Traffic failing all five exemptions is blocked. Experiments confirmed all rules still held as of February 2023.

2023-wu-fully-encrypted-detect §4, Algorithm 1 detection
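
The five exemption rules can be sketched as a single function over the first TCP payload. This is a minimal reading of the rules as summarized above, not the paper's Algorithm 1 verbatim. In particular, the TLS and HTTP patterns used for Ex5 are simplified stand-ins (an assumption), and the function names are hypothetical.

```python
import re
from typing import Optional


def is_printable(b: int) -> bool:
    """Printable ASCII as used in the rules: 0x20..0x7e inclusive."""
    return 0x20 <= b <= 0x7E


def avg_popcount(pkt: bytes) -> float:
    """Average number of set bits per byte."""
    return sum(bin(b).count("1") for b in pkt) / len(pkt)


def longest_printable_run(pkt: bytes) -> int:
    best = run = 0
    for b in pkt:
        run = run + 1 if is_printable(b) else 0
        best = max(best, run)
    return best


# Simplified protocol fingerprints for Ex5 (assumption, not the real rule set).
TLS_PREFIX = re.compile(rb"^[\x16-\x17]\x03[\x00-\x09]", re.DOTALL)
HTTP_PREFIX = re.compile(rb"^(GET|PUT|POST|HEAD) ")


def exempt(pkt: bytes) -> Optional[str]:
    """Return the name of the first matching exemption rule, or None."""
    if not pkt:
        raise ValueError("expected a non-empty first payload")
    pc = avg_popcount(pkt)
    if pc <= 3.4 or pc >= 4.6:
        return "Ex1"
    if len(pkt) >= 6 and all(is_printable(b) for b in pkt[:6]):
        return "Ex2"
    if sum(is_printable(b) for b in pkt) > len(pkt) / 2:
        return "Ex3"
    if longest_printable_run(pkt) > 20:
        return "Ex4"
    if TLS_PREFIX.match(pkt) or HTTP_PREFIX.match(pkt):
        return "Ex5"
    return None  # no exemption applies: the flow is a blocking candidate


# Non-printable bytes with exactly 4 set bits each: no rule matches.
opaque = bytes([0xF0, 0x0F, 0xC3, 0x99]) * 50
print(exempt(opaque))              # None
print(exempt(b"abcdef" + opaque))  # Ex2
```

The second call shows why a six-byte printable prefix is enough to trigger Ex2 under this reading of the rules.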
The GFW applies the fully-encrypted detector probabilistically and only to a targeted subset of IP address space. Each qualifying connection is blocked with probability p = 26.3% (geometric distribution fit over 109,489 affected IPs in a 10% IPv4 scan); residual censorship then blocks the same 3-tuple (client IP, server IP, server port) for 180 seconds after a first block. The detector only monitors ~26% of connections and targets specific IP ranges of popular data centers (VPS providers such as Alibaba US, Constant, DigitalOcean, Linode); large CDNs (Akamai, Cloudflare) and most residential/enterprise IPs are unaffected. 98% of scanned IPs were unaffected. Simulated on live university traffic, the rules would block ~0.6% of normal connections as collateral damage.

2023-wu-fully-encrypted-detect §6, §6.3 detection
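
To make the per-connection probability concrete, the sketch below computes how quickly a monitored server is likely to see its first block. It assumes each qualifying connection is blocked independently with p = 0.263, which is what a geometric fit implies, and it ignores the 180-second residual censorship and the IP-range targeting. The function names are hypothetical.

```python
P_BLOCK = 0.263  # per-connection blocking probability reported above


def prob_blocked_within(n: int, p: float = P_BLOCK) -> float:
    """Probability that at least one of n qualifying connections is blocked."""
    return 1 - (1 - p) ** n


def expected_connections_until_block(p: float = P_BLOCK) -> float:
    """Mean of the geometric distribution: connections up to and including the first block."""
    return 1 / p


for n in (1, 3, 5, 10):
    print(f"n={n:>2}: P(blocked at least once) = {prob_blocked_within(n):.3f}")
print(f"expected connections until first block: {expected_connections_until_block():.2f}")
```

Under these assumptions, a server handling fully encrypted connections from inside the monitored ranges would expect its first block within about four connections.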
The October 2022 blocking wave is the confirmed operational deployment of the fully-encrypted-traffic detector later formalized in Wu et al. (USENIX Security 2023). The detector was therefore in live production from at least late 2022, months before the academic paper describing it was published. This event establishes that the GFW's passive fully-encrypted classifier operates at scale in adversarial real-world conditions, not just in controlled experiments.

2022-blocking-tls-circumvention full post deployment
Current randomized-payload circumvention tools (obfs4/ScrambleSuit, SkypeMorph, VoIP-tunneling) rely on censors 'defaulting open' — treating unidentified traffic as innocuous. If censors instead block all traffic not explicitly recognizable as meaningful plaintext, these tools fail entirely. The paper notes anecdotal evidence this is already occurring, including blocking of some TLS 1.3 connections.

2021-kaptchuk-meteor §1 Introduction detection
Variable-length sampling (Adaptation 2) achieves a provably secure but impractical encoding: a 16-byte plaintext encoded with GPT-2 requires 502–2994 tokens, produces 2.3–13.6 KiB of stegotext (149×–870× overhead), and takes 42–765 seconds even with GPU acceleration, depending on security parameter k=16–128.

2021-kaptchuk-meteor §4 / Table 1 evaluation
Classical public-key steganography (Algorithm 1 from [54]) has a 100% failure rate when encoding a 16-byte message using GPT-2, because GPT-2's per-token entropy drops near zero frequently and standard rejection sampling cannot find an acceptable token. Entropy bounding reduces failure to 0–10% but introduces detectable statistical bias: selected tokens come from a visibly different probability distribution than baseline samples.

2021-kaptchuk-meteor §4 Adapting Classical Steganographic Schemes / Figure 2b evaluation
Meteor encodes bits by embedding a PRG-masked random value into the token-sampling randomness of a generative model, recovering bits proportional to the shared prefix length of the sampled interval. Expected throughput per sampling event is asymptotically within 1/2 of the Shannon entropy of the channel (proven in Appendix A), so Meteor automatically adapts to high entropy variability without explicit signaling or padding.

2021-kaptchuk-meteor §5 Meteor / §5.2 defense
Meteor is proven secure against chosen-hiddentext attacks: any PPT adversary distinguishing Meteor output from honest model output can be reduced to breaking the underlying PRG. The scheme produces stegotext provably indistinguishable from the generative model's own output distribution, and requires only a shared public model — not a secret channel — making the model analogous to a common random string. On GPU the encoding overhead is ~1× model-load time; on CPU ~4.6×; on mobile ~49.5×.

2021-kaptchuk-meteor §5.2 / §6 / Table 2 defense
The GFW's passive classifier uses two features of the first data packet to flag probable Shadowsocks traffic: (1) high Shannon entropy (per-byte entropy > ~7 bits strongly correlates with replay probability, which is nearly 4x higher at entropy 7.2 than at 3.0) and (2) packet length in the range 160–700 bytes with specific remainders mod 16. A single data packet after the TCP handshake is sufficient to trigger the downstream active-probing pipeline.

2020-alice-shadowsocks-detection §4.2 detection
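
The two first-packet features named above are cheap to compute. The sketch below extracts them without asserting which length remainders the GFW keys on, since the summary does not list them. The function and field names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def first_packet_features(payload: bytes) -> dict:
    """Features of the first data packet that the finding above describes."""
    length = len(payload)
    return {
        "entropy_bits_per_byte": shannon_entropy(payload),
        "length": length,
        "length_in_160_700": 160 <= length <= 700,
        "length_mod_16": length % 16,
    }


print(first_packet_features(bytes(range(200))))
```

Note that the measured entropy of a short packet is bounded by log2 of its length, so a 200-byte payload cannot exceed about 7.64 bits per byte even when every byte is distinct.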
The GFW was observed detecting Shadowsocks servers by sending follow-up active probes after an initial Shadowsocks-sized client message, including permuted replays of the client's message and random-data probes of various sizes up to and exceeding Shadowsocks' unique 50-byte data limit. This defeats shadowsocks-libev's replay cache because the GFW permutes the replayed bytes rather than resending them verbatim.

2020-frolov-httpt §2 Background detection
Measured packet loss rates under GFW censorship (Feb–Apr 2017, client at Tsinghua University/CERNET): Tor with meek obfuscation suffers 4.4% average PLR; Shadowsocks (AES-256-CFB) suffers 0.77% PLR; native VPN (PPTP/L2TP) and OpenVPN both achieve ~0.21% PLR. For comparison, the same tools accessed from a US vantage point show PLR below 0.1%, confirming the excess loss is GFW-induced. The GFW's DPI and active probing techniques specifically target Tor and Shadowsocks protocol signatures.

2017-lu-accessing §4.3 evaluation
Wiley's Bayesian classifier against obfuscated protocols (Dust, SSL, obfs-openssh) found that entropy detection achieved 94% accuracy using only the first packet, timing-based detection achieved 89% accuracy over entire packet streams, and length-based detection achieved only 16% accuracy.

2016-khattak-sok §2.4.1 detection
Randomization-based obfuscation systems (obfs2/3, obfs4, ScrambleSuit, Dust) resist blacklist DPI but fail entirely under protocol-whitelist filtering, as explicitly demonstrated during the Iranian elections where censors permitted only known-good protocols. Pure randomization provides no signal of being a permitted protocol, making it trivially blockable under any whitelist regime.

2015-dyer-marionette §1, §2 detection
Rook constructs per-field symbol tables by observing 600 packets (~60 seconds) of real gameplay at session start, then restricts substituted values to only those previously observed with frequency within two orders of magnitude of the median. This ensures altered packets never contain field values that are absent or anomalously rare in legitimate traffic, defeating value-anomaly and out-of-range DPI filters.

2015-vines-rook §2.6 defense
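
The symbol-table filter can be sketched for a single packet field as follows. This is one reading of "within two orders of magnitude of the median", applied in both directions to the per-value observation counts. The function name and the sample data are hypothetical, and Rook's actual implementation may differ.

```python
from collections import Counter
from statistics import median


def build_symbol_table(observed_values, factor: float = 100.0) -> set:
    """Keep values whose observation count lies within `factor` of the median count."""
    counts = Counter(observed_values)
    mid = median(counts.values())
    return {v for v, c in counts.items() if mid / factor <= c <= mid * factor}


# 600 hypothetical observations of one field; value 4 appears only once.
observed = [1] * 300 + [2] * 200 + [3] * 99 + [4] * 1
print(sorted(build_symbol_table(observed)))  # [1, 2, 3]
```

Value 4 is dropped because its single occurrence falls more than two orders of magnitude below the median count of 149.5, so it would never be chosen as a substitute.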
The paper demonstrates that 'having no fingerprint is itself a fingerprint': randomizing obfuscators that emit uniformly random bytes from the first packet are detectable precisely because conventional protocols (TLS, SSH, HTTP) always begin with fixed plaintext headers. This structural distinction requires no deep payload parsing — the attack operates on only the first TCP packet — and achieves TPR=1.0 / FPR=0.002 against obfsproxy3/4 using commodity-implementable statistics.

2015-wang-seeing §1, §5.1 defense
Obfsproxy3 and obfsproxy4 are reliably detected by an entropy-distribution test (KS test, block size k=8) applied to the first 2,048 bytes of the first client-to-server packet, combined with a minimum payload-length check of 149 bytes. On three university campus datasets totaling over 14 million TCP flows, the test achieves TPR=1.0 with FPR ranging from 0.24% to 0.33%. Omitting the length check raises the SSL/TLS false-positive rate to approximately 23%.

2015-wang-seeing §5.1, Table 5 detection
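
A rough sketch of an entropy-distribution test of this kind is shown below. It is not the paper's procedure: it swaps in a two-sample KS statistic against a locally generated pseudo-random reference, and the decision threshold of 0.2 is an arbitrary assumption. Only the block size (8), the 2,048-byte window, and the 149-byte minimum length come from the finding above.

```python
import math
import random
from bisect import bisect_right
from collections import Counter

BLOCK = 8       # bytes per entropy block (from the finding)
WINDOW = 2048   # bytes of the first packet examined (from the finding)
MIN_LEN = 149   # minimum payload length (from the finding)


def block_entropies(payload: bytes, k: int = BLOCK) -> list:
    """Shannon entropy of each consecutive k-byte block in the window."""
    window = payload[:WINDOW]
    out = []
    for i in range(0, len(window) - k + 1, k):
        counts = Counter(window[i:i + k])
        h = -sum(c / k * math.log2(c / k) for c in counts.values())
        out.append(round(h, 6))
    return out


def ks_statistic(a: list, b: list) -> float:
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a_sorted, x) / len(a) - bisect_right(b_sorted, x) / len(b))
        for x in points
    )


def looks_uniformly_random(payload: bytes, threshold: float = 0.2, seed: int = 0) -> bool:
    """True if block entropies match a random reference (threshold is an assumption)."""
    if len(payload) < MIN_LEN:
        return False
    rng = random.Random(seed)
    reference = bytes(rng.getrandbits(8) for _ in range(WINDOW))
    return ks_statistic(block_entropies(payload), block_entropies(reference)) < threshold
```

A payload of all-zero bytes yields a KS statistic of 1.0 against the reference and is rejected, while a full window of random bytes typically stays well under the threshold.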
Analysis of the AOL search corpus shows an average search query length of 17.42 bytes with an entropy of 4.48 bits/byte, yielding 78.04 bits of deniable information per HTTP GET request. This entropy matches real user search behavior, making entropy-based traffic analysis unable to distinguish Facade traffic from genuine search sessions.

2014-jones-facade §5.2 defense
A pre-shared key enables encrypting the entire GoHop packet—header, payload, and padding bytes—achieving true randomness in the full byte stream. Standard VPN protocols such as OpenVPN encrypt only the payload while leaving headers in plaintext, exposing protocol-identifying fields to DPI without payload inspection. This design choice is a prerequisite for defeating header-based fingerprinting.

2014-wang-gohop §III.A defense
TapDance introduces chosen-ciphertext steganography, which allows the client to embed an arbitrary-length hidden message inside a valid TLS ciphertext without invalidating the TLS MAC or session. By exploiting ciphertext malleability in both stream-cipher (counter) mode and CBC mode, the client can choose specific byte values to appear in the ciphertext while constraining plaintext to a safe ASCII range (0x40–0x7F), encoding 6 bits of tag data per ciphertext byte. This provides unbounded covert-channel bandwidth, compared to the fixed 224-bit TLS nonce used by Telex and Decoy Routing or the 24-bit TCP ISN used by Cirripede.

2014-wustrow-tapdance §3, §6 defense
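
For the stream-cipher case, the idea can be illustrated with plain XOR. The sketch below assumes the client knows the keystream and places each 6-bit chunk in the low six bits of a ciphertext byte. That bit layout is an assumption made for illustration, and the real protocol's TLS record handling, MAC, and CBC-mode variant are not modeled.

```python
import os


def choose_plaintext(keystream: bytes, chunks: list) -> bytes:
    """Pick plaintext bytes in 0x40..0x7F whose ciphertext carries each 6-bit chunk."""
    out = bytearray()
    for ks, chunk in zip(keystream, chunks):
        if not 0 <= chunk < 64:
            raise ValueError("each chunk must fit in 6 bits")
        # The top two bits are fixed to 01, keeping the plaintext in 0x40..0x7F.
        # The low six bits cancel the keystream so the ciphertext shows `chunk`.
        out.append(0x40 | ((chunk ^ ks) & 0x3F))
    return bytes(out)


def xor_encrypt(plaintext: bytes, keystream: bytes) -> bytes:
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


def extract_chunks(ciphertext: bytes) -> list:
    """What an observer who knows the layout reads off the wire."""
    return [c & 0x3F for c in ciphertext]


keystream = os.urandom(4)  # stand-in for the cipher's keystream
tag = [0, 21, 42, 63]
plaintext = choose_plaintext(keystream, tag)
assert all(0x40 <= b <= 0x7F for b in plaintext)
assert extract_chunks(xor_encrypt(plaintext, keystream)) == tag
```

Because the 64 values in 0x40–0x7F leave six free bits per byte, the covert capacity grows with the length of the message rather than being capped by a fixed-size field.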
SSH transfers utilized only 15% of available bandwidth versus 85–89% for HTTP/HTTPS. When SSH was obfuscated by XORing payloads with a constant key (hiding the plaintext handshake), throughput dropped to near-zero during all trials. Applying the same obfuscation to HTTP transfers produced the same near-zero result, supporting the hypothesis that Iran whitelists known-approved protocols rather than blacklisting specific ones, which would preemptively block any unrecognized or randomized transport including Tor's obfsproxy.

2013-aryan-internet §4.4 detection
All trained ML classifiers (K-NN, Naive Bayes, ANN, SVM, vote ensemble) performed at near-chance levels when distinguishing RSA-encrypted stego messages from clean photos — best precision 52.05%, best meaningful recall 61.52% (K-NN on clean class). The authors attribute this to embedding only a few hundred bytes into cover photos hundreds of KB in size, with natural image entropy in noisy pixel regions being empirically indistinguishable from RSA-ciphertext statistics.

2013-invernizzi-message §5.1, Table 1 defense
The IBST construction is provably secure under the bilinear decisional Diffie-Hellman (BDDH) assumption in the random oracle model. Any adversary with advantage ε(λ) against IBST indistinguishability implies an adversary against BDDH with advantage at least ε(λ)/(e(1+qE)), where qE is the number of private-key extraction queries. Tags produced by the scheme are computationally indistinguishable from uniform random bitstrings for any party lacking the recipient's private key.

2013-ruffing-identity-based §3.2 Construction / Theorem 1 defense
BTP's wire protocol contains no handshakes, timeouts, or plaintext headers. Connections open with a pseudo-random b-byte tag that the recipient can compute in advance from its key state, making BTP frames indistinguishable from random data to a passive observer who does not know the shared secret.

2012-rogers-secure §2, §3.2 defense
Scrambling without secret key management can frustrate DPI-based censors if the de-scrambling function satisfies 'high-inertia' — meaning an adversary computing S⁻¹ on n inputs cannot use less than Θ(n) times the resources of a single commodity-PC user, including electricity, memory, and computation time. This forces bulk censorship to become computationally infeasible without over-censoring all scrambled content.

2011-bonneau-scrambling §1–2 defense
A politically active blogger in an anonymized censored country explicitly avoided BlackBerry encryption stating: 'they can't crack that encryption and they would just get suspicious. Cause they listen to me and listen to me and then suddenly I am encrypting and so that means I am really saying something they don't want me to.' This documents censor behavior where the mere use of strong encryption—independent of content—serves as a targeting signal.

2011-shklovski-online §Blocked sites as a form of protection policy
Censors responding to encryption-based circumvention have two escalation options: block all encrypted connections outright, or identify the underlying protocol via traffic signatures that persist even inside encrypted tunnels. The paper frames these as the two dominant censor responses to DPI being defeated by encryption.

2011-wright-fine-grained §3 detection
Undetectability of a message requires that it be indistinguishable from 'random noise' — an attacker cannot sufficiently distinguish whether the message exists or not. This is distinct from anonymity, which protects only the relationship between an IOI and a subject, not the IOI's existence itself. Undetectability is possible only for subjects not involved in the IOI; senders and recipients cannot achieve it against each other.

2010-pfitzmann-terminology §6 defense
Dagster requires both clients and servers to enforce a randomness predicate rand?(x) on every block before storage or forwarding, ensuring all server-stored data is statistically indistinguishable from uniform random noise. This provides server deniability — the operator can credibly deny knowledge of content — and also closes the attack present in Publius and Freenet where a malicious client could post plaintext, potentially exposing the operator for 'knowingly' hosting illegal content.

2001-stubblefield-dagster §4.2, §5.3 defense