TECHNIQUES
dpi Deep Packet Inspection
Inspecting payload bytes beyond the TCP/IP headers.
261 papers on file
- 2025-iran-shutdown-measurement Characterizing Iran's Phased National Internet Shutdown in 2025: A Progressive and Distributed Action
- 2026-ablove-characterizing Characterizing the Implementation of Censorship Policies in Chinese LLM Services
- 2026-almutairi-server Server, Client, or Relay? Dual-Role Detection of Circumvention Relays
- 2026-anon-letsvpn-vpn LetsVPN (快连) Announces Its Exit from Mainland China: Tears for the Traditional VPN Era? | 二毛
- 2026-article19-tightening-the-net Tightening the Net: China's Infrastructure of Oppression in Iran
- 2026-fares-game The Game Has Changed: Revisiting proxy distribution and game theory
- 2026-jois-assemblage Assemblage: Chipping Away at Censorship with Generative Steganography
- 2026-kamali-huma Huma: Censorship Circumvention via Web Protocol Tunneling with Deferred Traffic Replacement
- 2026-lipphardt-dual Dual Standards: Examining Content Moderation Disparities Between API and WebUI Interfaces in Large Language Models
- 2026-niere-dpyproxy-dns Towards Automated DNS Censorship Circumvention
- 2026-patterniha-mitm-domainfronting MITM-DomainFronting: client-only domain fronting via local TLS MITM with a user-installed CA
- 2026-ratliff-mirage Mirage: Private, Mobility-based Routing for Censorship Evasion
- 2026-sheffey-geedge Geedge Cases: Censorship Measurement Insights from the Geedge Networks Leak
- 2026-tolley-architectural Architectural VPN Vulnerabilities, Disclosure Fatigue, and Structural Failures
- 2026-zohaib-extended Extended Abstract: CensorAlert -- Leveraging LLM Agents for Automated Censorship Report Aggregation and Analysis
- 2025-alaraj-iran-refraction Measuring Censorship in Iran Using Refraction-based Proxies
- 2025-amnesty-pakistan-shadows Shadows of Control: Censorship and mass surveillance in Pakistan
- 2025-aryapour-stealth-blackout Iran's Stealth Internet Blackout: A New Model of Censorship
- 2025-fan-wallbleed Wallbleed: A Memory Disclosure Vulnerability in the Great Firewall of China
- 2025-geedge-mesa-leak Geedge & MESA Leak: Analyzing the Great Firewall's Largest Document Leak
- 2025-gfw-port443-rst Analysis of the GFW's Unconditional Port 443 Block on August 20, 2025
- 2025-habib-examining Examining Leading Pakistani Mobile Apps
- 2025-hyperion-cs-censor-has-new Censor has a new method of blocking
- 2025-interseclab-internet-coup The Internet Coup
- 2025-inyangson-amigo Amigo: Secure Group Mesh Messaging in Realistic Protest Settings
- 2025-jfm-silk-road-surveillance Silk Road of Surveillance
- 2025-kamali-anix Anix: Anonymous Blackout-Resistant Microblogging with Message Endorsing
- 2025-lange-i-ra-nconsistencies I(ra)nconsistencies: Novel Insights into Iran's Censorship
- 2025-lipphardt-1-800-censorship 1-800-Censorship: Analyzing internet censorship data using the Internet Yellow Pages
- 2025-miaan-stealth-blackout Iran's Stealth Blackout: A Multi-stakeholder Analysis of the June 2025 Internet Shutdown
- 2025-mixon-baca-hidden Hidden Links: Analyzing Secret Families of VPN Apps
- 2025-niere-encrypted Encrypted Client Hello (ECH) in Censorship Circumvention
- 2025-niere-transport Transport Layer Obscurity: Circumventing SNI Censorship on the TLS-Layer
- 2025-nourin-nobody Is Nobody There? Good! Globally Measuring Connection Tampering without Responsive Endhosts
- 2025-pereira-position Position Paper: A Case for Machine-Checked Verification of Circumvention Systems
- 2025-piotrowska-nym-iran-blackout Nym Report on Iran's Recent Internet Blackouts (June 2025): What it Means for Censorship Resistance and NymVPN
- 2025-rodriguez-revisiting Revisiting BAT Browsers: Protecting At-Risk Populations from Surveillance, Censorship, and Targeted Attacks
- 2025-sivan-sevilla-probing Probing the third-party infrastructure of digital news on the Web
- 2025-tai-irblock IRBlock: A Large-Scale Measurement Study of the Great Firewall of Iran
- 2025-tusing-minecraft-tunnels Minecraft tunnels for covert communications
- 2025-vilalonga-extended Extended Abstract: Using TURN Servers for Censorship Evasion
- 2025-vines-extended Extended Abstract: Nobody’s Fault but Mine: Using Unauthenticated Unidirectional Pushes for Client Update
- 2025-wails-censorship Censorship Evasion with Unidentified Protocol Generation
- 2025-wang-custom Is Custom Congestion Control a Bad Idea for Circumvention Tools?
- 2025-wendzel-survey A Survey of Internet Censorship and its Measurement: Methodology, Trends, and Challenges
- 2025-wrana-sok-surveillance SoK: The Spectre of Surveillance and Censorship in Future Internet Architectures
- 2025-wu-regional-censorship A Wall Behind A Wall: Emerging Regional Censorship in China
- 2025-zohaib-quic-sni Exposing and Circumventing SNI-based QUIC Censorship of the Great Firewall of China
- 2017-frolov-water-pluggable WATER: a programmable framework for pluggable transports
- 2024-chi-just Just add WATER: WebAssembly-based Circumvention Transports
- 2024-hoang-gfweb GFWeb: Measuring the Great Firewall's Web Censorship at Scale
- 2024-lorimer-extended Extended Abstract: Traffic Splitting for Pluggable Transports
- 2024-niere-http-smuggling Turning Attacks into Advantages: Evading HTTP Censorship with HTTP Request Smuggling
- 2024-sakamoto-bleeding Bleeding Wall: A Hematologic Examination on the Great Firewall
- 2024-xue-tspu-russia TSPU: Russia's decentralized censorship system
- 2024-zillien-look Look What's There! Utilizing the Internet's Existing Data for Censorship Circumvention with OPPRESSION
- 2023-ding-discop Discop: Provably secure steganography in practice based on "distribution copies"
- 2023-feng-study A Study of China's Censorship and Its Evasion Through the Lens of Online Gaming
- 2023-fifield-running Running a high-performance pluggable transports Tor bridge
- 2023-jia-voiceover Voiceover: Censorship-Circumventing Protocol Tunnels with Generative Modeling
- 2023-katira-censorwatch CensorWatch: On the Implementation of Online Censorship in India
- 2023-master-worldwide A Worldwide View of Nation-state Internet Censorship
- 2023-niere-poster Poster: Circumventing the GFW with TLS Record Fragmentation
- 2023-nourin-detecting Detecting Network Interference Without Endpoint Participation
- 2023-nourin-measuring Measuring and Evading Turkmenistan's Internet Censorship
- 2023-tulloch-lox Lox: Protecting the Social Graph in Bridge Distribution
- 2023-wails-proteus Proteus: Programmable Protocols for Censorship Circumvention
- 2023-wu-fully-encrypted-detect How the Great Firewall of China detects and blocks fully encrypted traffic
- 2022-blocking-tls-circumvention Large scale blocking of TLS-based censorship circumvention tools in China
- 2022-chang-covid-19 COVID-19 increased censorship circumvention and access to sensitive topics in China
- 2022-figueira-stegozoa Stegozoa: Enhancing WebRTC Covert Channels with Video Steganography for Internet Censorship Circumvention
- 2022-harrity-get GET /out: Automated Discovery of Application-Layer Censorship Evasion Strategies
- 2022-raman-network Network Measurement Methods for Locating and Examining Censorship Devices
- 2021-bock-your Your Censor is My Censor: Weaponizing Censorship Infrastructure for Availability Attacks
- 2021-kaptchuk-meteor Meteor: Cryptographically Secure Steganography for Realistic Distributions
- 2021-knockel-measuring Measuring QQMail's Automated Email Censorship in China
- 2021-lorimer-oustralopithecus OUStralopithecus: Overt User Simulation for Censorship Circumvention
- 2021-padmanabhan-multi-perspective A multi-perspective view of Internet censorship in Myanmar
- 2021-rambert-chinese Chinese Wall or Swiss Cheese? Keyword filtering in the Great Firewall of China
- 2021-rosen-balboa Balboa: Bobbing and Weaving around Network Censorship
- 2021-satija-blindtls BlindTLS: Circumventing TLS-Based HTTPS Censorship
- 2021-sharma-camoufler Camoufler: Accessing The Censored Web By Utilizing Instant Messaging Channels
- 2021-wei-domain Domain Shadowing: Leveraging Content Delivery Networks for Robust Blocking-Resistant Communications
- 2020-alharbi-opening Opening Digital Borders Cautiously yet Decisively: Digital Filtering in Saudi Arabia
- 2020-alice-shadowsocks-detection How China Detects and Blocks Shadowsocks
- 2020-anonymous-triplet-censors Triplet Censors: Demystifying Great Firewall's DNS Censorship Behavior
- 2020-barradas-poking Poking a Hole in the Wall: Efficient Censorship-Resistant Internet Communications by Parasitizing on WebRTC
- 2020-barradas-towards Towards a Scalable Censorship-Resistant Overlay Network based on WebRTC Covert Channels
- 2020-birtel-slitheen Slitheen++: Stealth TLS-based Decoy Routing
- 2020-bock-come Come as You Are: Helping Unmodified Clients Bypass Censorship with Server-side Evasion
- 2020-bock-detecting Detecting and Evading Censorship-in-Depth: A Case Study of Iran's Protocol Filter
- 2020-fifield-turbo Turbo Tunnel, a good way to design censorship circumvention protocols
- 2020-frolov-detecting Detecting Probe-resistant Proxies
- 2020-frolov-httpt HTTPT: A Probe-Resistant Proxy
- 2020-gfw-esni-blocking Exposing and Circumventing China's Censorship of ESNI
- 2020-minaei-moneymorph MoneyMorph: Censorship Resistant Rendezvous using Permissionless Cryptocurrencies
- 2020-nasr-massbrowser MassBrowser: Unblocking the Censored Web for the Masses, by the Masses
- 2020-oakley-protocol Protocol Proxy: An FTE-based covert channel
- 2020-raman-investigating Investigating Large Scale HTTPS Interception in Kazakhstan
- 2020-ramesh-decentralized Decentralized Control: A Case Study of Russia
- 2020-sharma-siegebreaker SiegeBreaker: An SDN Based Practical Decoy Routing System
- 2020-singh-india How India Censors the Web
- 2020-v2ray-weaknesses Summary on Recently Discovered V2Ray Weaknesses
- 2020-vandersloot-running Running Refraction Networking for Real
- 2020-wang-symtcp SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery
- 2020-zhu-characterizing Characterizing Transnational Internet Performance and the Great Bottleneck of China
- 2015-frolov-the-use-of-tls The use of TLS in censorship circumvention
- 2019-bock-geneva Geneva: Evolving Censorship Evasion Strategies
- 2019-chen-impact The Impact of Media Censorship: 1984 or Brave New World?
- 2019-frolov-conjure Conjure: Summoning Proxies from Unused Address Space
- 2019-hoang-measuring Measuring I2P Censorship at a Global Scale
- 2019-iszaevich-distributed Distributed Detection of Tor Directory Authorities Censorship in Mexico
- 2019-nasr-enemy Enemy At the Gateways: Censorship-Resilient Proxy Distribution Using Game Theory
- 2018-bocovich-secure Secure asymmetry and deployability for decoy routing systems
- 2018-dunna-analyzing Analyzing China's Blocking of Unpublished Tor Bridges
- 2018-hoang-empirical An Empirical Study of the I2P Anonymity Network and its Censorship Resistance
- 2018-hobbs-sudden How Sudden Censorship Can Increase Access to Information
- 2018-manfredi-multiflow MultiFlow: Cross-Connection Decoy Routing using TLS 1.3 Session Resumption
- 2018-martiny-proof-of-censorship Proof-of-Censorship: Enabling centralized censorship-resistant content providers
- 2018-nisar-incentivizing Incentivizing Censorship Measurements via Circumvention
- 2018-tschantz-bestiary A Bestiary of Blocking: The Motivations and Modes behind Website Unavailability
- 2018-wright-identifying On Identifying Anomalies in Tor Usage with Applications in Detecting Internet Censorship
- 2018-yadav-where Where The Light Gets In: Analyzing Web Censorship Mechanisms in India
- 2017-barradas-deltashaper DeltaShaper: Enabling Unobservable Censorship-resistant TCP Tunneling over Videoconferencing Streams
- 2017-bocovich-lavinia Lavinia: An audit-payment protocol for censorship-resistant storage
- 2017-cho-churn A Churn for the Better: Localizing Censorship using Network-level Path Churn and Network Tomography
- 2017-darer-filteredweb FilteredWeb: A Framework for the Automated Search-Based Discovery of Blocked URLs
- 2017-frolov-isp-scale An ISP-Scale Deployment of TapDance
- 2017-gebhart-internet Internet Censorship in Thailand: User Practices and Potential Threats
- 2017-gosain-devil-s The Devil's in The Details: Placing Decoy Routers in the Internet
- 2017-gosain-mending Mending Wall: On the Implementation of Censorship in India
- 2017-heydari-scalable Scalable Anti-Censorship Framework Using Moving Target Defense for Web Servers
- 2017-javaid-online Online Advertising under Internet Censorship
- 2017-jermyn-autosonda Autosonda: Discovering Rules and Triggers of Censorship Devices
- 2017-lee-usability A Usability Evaluation of Tor Launcher
- 2017-li-detor DeTor: Provably Avoiding Geographic Regions in Tor
- 2017-li-lib-cdot-erate lib·erate, (n): A library for exposing (traffic-classification) rules and avoiding them efficiently
- 2017-lu-accessing Accessing Google Scholar under Extreme Internet Censorship: A Legal Avenue
- 2017-matic-dissecting Dissecting Tor Bridges: a Security Evaluation of Their Private and Public Infrastructures
- 2017-morshed-when When the Internet Goes Down in Bangladesh
- 2017-nasr-waterfall The Waterfall of Liberty: Decoy Routing Circumvention that Resists Routing Attacks
- 2017-singh-characterizing Characterizing the Nature and Dynamics of Tor Exit Blocking
- 2017-tanash-decline The Decline of Social Media Censorship and the Rise of Self-Censorship after the 2016 Failed Turkish Coup
- 2017-wang-your Your State is Not Mine: A Closer Look at Evading Stateful Internet Censorship
- 2017-weinberg-topics Topics of Controversy: An Empirical Analysis of Web Censorship Lists
- 2016-aceto-analyzing Analyzing Internet Censorship in Pakistan
- 2016-al-saqaf-internet Internet Censorship Circumvention Tools: Escaping the Control of the Syrian Regime
- 2016-bocovich-slitheen Slitheen: Perfectly Imitated Decoy Routing through Traffic Replacement
- 2016-douglas-ghostpost GhostPost: Seamless Restoration of Censored Social Media Posts
- 2016-douglas-salmon Salmon: Robust Proxy Distribution for Censorship Circumvention
- 2016-elahi-framework A Framework for the Game-theoretic Analysis of Censorship Resistance
- 2016-fifield-censors Censors' Delay in Blocking Circumvention Proxies
- 2016-hahn-games Games Without Frontiers: Investigating Video Games as a Covert Channel
- 2016-khattak-sok SoK: Making Sense of Censorship Resistance Systems
- 2016-kohls-skypeline SkypeLine: Robust Hidden Data Transmission for VoIP
- 2016-li-mailet Mailet: Instant Social Networking under Censorship
- 2016-mcpherson-covertcast CovertCast: Using Live Streaming to Evade Internet Censorship
- 2016-nasr-game Game of Decoys: Optimal Decoy Routing Through Game Theory
- 2016-safaka-matryoshka Matryoshka: Hiding Secret Communication in Plain Sight
- 2016-tschantz-sok SoK: Towards Grounding Censorship Circumvention in Empiricism
- 2016-zarras-leveraging Leveraging Internet Services to Evade Censorship
- 2016-zolfaghari-practical Practical Censorship Evasion Leveraging Content Delivery Networks
- 2015-dyer-marionette Marionette: A Programmable Network-Traffic Obfuscation System
- 2015-ellard-rebound Rebound: Decoy Routing on Asymmetric Routes Via Error Messages
- 2015-ensafi-active-probing Examining how the Great Firewall discovers hidden circumvention servers
- 2015-ensafi-analyzing Analyzing the Great Firewall of China Over Space and Time
- 2015-fifield-blocking-resistant Blocking-resistant communication through domain fronting
- 2015-gill-characterizing Characterizing Web Censorship Worldwide: Another Look at the OpenNet Initiative Data
- 2015-hiruncharoenvate-algorithmically Algorithmically Bypassing Censorship on Sina Weibo with Nondeterministic Homophone Substitutions
- 2015-holowczak-cachebrowser CacheBrowser: Bypassing Chinese Censorship without Proxies Using Cached Content
- 2015-jones-can Can Censorship Measurements Be Safe(r)?
- 2015-jones-ethical Ethical Concerns for Censorship Measurement
- 2015-knockel-every Every Rose Has Its Thorn: Censorship and Surveillance on Social Video Platforms in China
- 2015-levin-alibi Alibi Routing
- 2015-marczak-analysis An Analysis of China's "Great Cannon"
- 2015-nisar-case A Case for Marrying Censorship Measurements with Circumvention
- 2015-tanash-known Known Unknowns: An Analysis of Twitter Censorship in Turkey
- 2015-vines-rook Rook: Using Video Games as a Low-Bandwidth Censorship Resistant Communication Platform
- 2015-wang-seeing Seeing through Network-Protocol Obfuscation
- 2014-brubaker-cloudtransport CloudTransport: Using Cloud Storage for Censorship-Resistant Networking
- 2014-chaabane-censorship Censorship in the Wild: Analyzing Internet Filtering in Syria
- 2014-connolly-trist TRIST: Circumventing Censorship with Transcoding-Resistant Image Steganography
- 2014-houmansadr-no No Direction Home: The True Cost of Routing Around Decoys
- 2014-jones-facade Facade: High-Throughput, Deniable Censorship Circumvention Using Web Search
- 2014-khattak-look A Look at the Consequences of Internet Censorship Through an ISP Lens
- 2014-king-reverse-engineering Reverse-engineering censorship in China: Randomized experimentation and participant observation
- 2014-li-facet Facet: Streaming over Videoconferencing for Censorship Circumvention
- 2014-luchaup-libfte LibFTE: A Toolkit for Constructing Practical, Format-Abiding Encryption Schemes
- 2014-morrison-toward Toward automatic censorship detection in microblogs
- 2014-tan-censorship Censorship Resistance as a Side-Effect
- 2014-wachs-censorship-resistant A Censorship-Resistant, Privacy-Enhancing and Fully Decentralized Name System
- 2014-wang-gohop GoHop: Personal VPN to Defend from Censorship
- 2014-wustrow-tapdance TapDance: End-to-Middle Anticensorship without Flow Blocking
- 2013-dalek-method A Method for Identifying and Confirming the Use of URL Filtering Products for Censorship
- 2013-dyer-protocol Protocol Misidentification Made Easy with Format-Transforming Encryption
- 2013-fifield-oss OSS: Using Online Scanning Services for Censorship Circumvention
- 2013-geddes-cover Cover Your ACKs: Pitfalls of Covert Channel Censorship Circumvention
- 2013-hasan-building Building Dissent Networks: Towards Effective Countermeasures against Large-Scale Communications Blackouts
- 2013-houmansadr-i I want my voice to be heard: IP over Voice-over-IP for unobservable censorship circumvention
- 2013-houmansadr-parrot The Parrot is Dead: Observing Unobservable Network Communications
- 2013-invernizzi-message Message In A Bottle: Sailing Past Censorship
- 2013-khattak-towards Towards Illuminating a Censorship Monitor's Model to Facilitate Evasion
- 2013-nabi-anatomy The Anatomy of Web Censorship in Pakistan
- 2013-robinson-collateral Collateral Freedom: A Snapshot of Chinese Internet Users Circumventing Censorship
- 2013-ruffing-identity-based Identity-Based Steganography and Its Applications to Censorship Resistance
- 2013-verkamp-five Five Incidents, One Theme: Twitter Spam as a Weapon to Drown Voices of Protest
- 2013-wachs-feasibility On the Feasibility of a Censorship Resistant Decentralized Name System
- 2013-wang-rbridge rBridge: User Reputation based Tor Bridge Distribution with Privacy Preservation
- 2013-winter-scramblesuit ScrambleSuit: A Polymorphic Network Protocol to Circumvent Censorship
- 2013-winter-towards Towards a Censorship Analyser for Tor
- 2013-zhou-sweet SWEET: Serving the Web by Exploiting Email Tunnels
- 2012-aase-whiskey Whiskey, Weed, and Wukan on the World Wide Web: On Measuring Censors' Resources and Motivations
- 2012-anderson-hidden The Hidden Internet of Iran: Private Address Allocations on a National Network
- 2012-appelbaum-technical Technical analysis of the Ultrasurf proxying software
- 2012-fifield-evading Evading Censorship with Browser-Based Proxies
- 2012-king-censorship How Censorship in China Allows Government Criticism but Silences Collective Expression
- 2012-lincoln-bootstrapping Bootstrapping Communications into an Anti-Censorship System
- 2012-ling-extensive Extensive Analysis and Large-Scale Empirical Evaluation of Tor Bridge Discovery
- 2012-moghaddam-skypemorph SkypeMorph: Protocol Obfuscation for Tor Bridges
- 2012-rogers-secure Secure Communication over Diverse Transports
- 2012-schuchard-routing Routing Around Decoys
- 2012-thomas-adapting Adapting Social Spam Infrastructure for Political Censorship
- 2012-vasserman-one-way One-way indexing for plausible deniability in censorship resistant storage
- 2012-wang-censorspoofer CensorSpoofer: Asymmetric Communication using IP Spoofing for Censorship-Resistant Web Browsing
- 2012-weinberg-stegotorus StegoTorus: A Camouflage Proxy for the Tor Anonymity System
- 2012-winter-great How the Great Firewall of China is Blocking Tor
- 2012-wright-regional Regional Variation in Chinese Internet Filtering
- 2011-bachrach-h00t #h00t: Censorship Resistant Microblogging
- 2011-bonneau-scrambling Scrambling for lightweight censorship resistance
- 2011-houmansadr-cirripede Cirripede: Circumvention Infrastructure using Router Redirection with Plausible Deniability
- 2011-jones-hiding Hiding Amongst the Clouds: A Proposal for Cloud-based Onion Routing
- 2011-karlin-decoy Decoy Routing: Toward Unblockable Internet Communication
- 2011-kathuria-bypassing Bypassing Internet Censorship for News Broadcasters
- 2011-knockel-three Three Researchers, Five Conjectures: An Empirical Analysis of TOM-Skype Censorship and Surveillance
- 2011-liu-tor Tor Instead of IP
- 2011-mccoy-proximax Proximax: A Measurement Based System for Proxies Dissemination
- 2011-seltzer-infrastructures Infrastructures of Censorship and Lessons from Copyright Resistance
- 2011-shklovski-online Online Contribution Practices in Countries that Engage in Internet Blocking and Censorship
- 2011-wiley-dust Dust: A Blocking-Resistant Internet Transport Protocol
- 2011-wright-fine-grained Fine-Grained Censorship Mapping: Information Sources, Legality and Ethics
- 2011-wustrow-telex Telex: Anticensorship in the Network Infrastructure
- 2011-xu-internet Internet Censorship in China: Where Does the Filtering Occur?
- 2010-burnett-chipping Chipping Away at Censorship Firewalls with User-Generated Content
- 2010-mahdian-fighting Fighting Censorship with Algorithms
- 2010-pfitzmann-terminology A terminology for talking about privacy by data minimization: Anonymity, Unlinkability, Undetectability, Unobservability, Pseudonymity, and Identity Management
- 2009-backes-anonymity Anonymity and Censorship Resistance in Unstructured Overlay Networks
- 2009-cao-skyf2f SkyF2F: Censorship Resistant via Skype Overlay Network
- 2009-mclachlan-risks On the risks of serving whenever you surf: Vulnerabilities in Tor's blocking resistance design
- 2008-aycock-good "Good" Worms and Human Rights
- 2008-sovran-pass Pass it on: Social Networks Stymie Censors
- 2006-clayton-ignoring Ignoring the Great Firewall of China
- 2006-dingledine-design Design of a blocking-resistant anonymity system
- 2005-perng-censorship Censorship Resistance Revisited
- 2004-danezis-economics The Economics of Censorship Resistance
- 2004-k-psell-achieve How to Achieve Blocking Resistance for Existing Systems Enabling Anonymous Web Surfing
- 2003-dornseif-government Government mandated blocking of foreign Web content
- 2003-feamster-thwarting Thwarting Web Censorship with Untrusted Messenger Discovery
- 2002-serjantov-anonymizing Anonymizing Censorship Resistant Systems
- 2001-handley-network Network Intrusion Detection: Evasion, Traffic Normalization, and End-to-End Protocol Semantics
- 1998-ptacek-insertion Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection
- 1996-anderson-eternity The Eternity Service
498 findings tagged here
-
During the June 2025 Iran shutdown, circumvention tool performance diverged sharply by transport design. Psiphon's multi-protocol architecture sustained 1.5 million concurrent users, roughly one-third of its normal Iranian base. Lantern's "proxyless" protocol (domain-fronting via CDN, ~40% of Lantern's Iranian traffic) showed moderate success. Tor usage collapsed during the blackout, but bridge connections surged and usage rebounded quickly after the shutdown was lifted. BeePass (serving 500k+ daily users at shutdown onset) used live A/B testing of port/obfuscation-prefix combinations to probe the censors' blocking parameters in real time. The Ceno Browser's P2P network grew from 600 active peers on June 13 to ~8,000 by July 11, indicating that decentralized fallback paths stayed up even during peak blocking.
-
The June 2025 Iran shutdown—carried out during the Iran-Israel war beginning ~June 19—did not use BGP route withdrawals as in 2019. Instead, authorities applied service-level restrictions at the national border: DNS poisoning of foreign destinations, protocol whitelisting permitting only pre-approved domestic services, and DPI to block circumvention-tool traffic. Iran's international traffic fell roughly 90% while the country's BGP routes remained advertised, making the shutdown invisible to BGP-based monitoring systems. OONI measurement volume, which totalled 121,333 in June 2025, collapsed to under 200 submissions on June 19-20.
-
During the June 2025 shutdown, Iranian authorities blocked international One-Time Password (OTP) SMS delivery, preventing new sign-ins to foreign secure-messaging platforms and VPN services. This forced users toward government-approved domestic platforms that lack security and privacy protections. The blockade of OTPs effectively weaponized account-recovery flows as a secondary shutdown layer, disproportionately affecting users who needed to activate new circumvention tools during the crisis.
-
Input blocking in Chinese LLM services (DeepSeek, Qwen, Kimi, Doubao) is overwhelmingly consistent: all four services persistently block the exact same queries across all 5 measurement samples in both Simplified and Traditional Chinese. Output blocking is far less consistent, with only 29 out of 349 output-blocked queries blocked across all 5 samples. Baidu-Chat is exceptional: it performs almost no input blocking but instead relies heavily on post-search and output blocking (78.6% of blocks are output-phase).
-
A three-stage detection pipeline exploiting the "dual-role" behavioral fingerprint of single-IP circumvention relays achieved 23.2% recall (96/414 ground-truth relays) with a 0.18% false-positive rate against 97,651 benign TLS servers, for an overall accuracy of 99.5%. The ground-truth set covered OpenVPN, WireGuard, and SOCKS relays identified in a 17 TB single-day backbone trace (WIDE Project, April 9, 2025).
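The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the false-positive rate is measured over the 97,651 benign TLS servers and that the implied false-positive count is 0.18% of that set rounded to the nearest integer. Variable names are illustrative and do not come from the paper.

```python
# Cross-check of the detection metrics quoted in the finding above.
# The false-positive count is an assumption: 0.18% of the benign set, rounded.
true_positives = 96
ground_truth_relays = 414
benign_servers = 97_651
false_positive_rate = 0.0018

recall = true_positives / ground_truth_relays
false_positives = round(false_positive_rate * benign_servers)
true_negatives = benign_servers - false_positives
accuracy = (true_positives + true_negatives) / (ground_truth_relays + benign_servers)

print(f"recall   = {recall:.1%}")
print(f"FP count = {false_positives}")
print(f"accuracy = {accuracy:.1%}")
```

Under these assumptions the high accuracy is driven mostly by the large benign class. Recall is the figure that describes how many relays are actually caught.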
-
The paper identifies a fundamental architectural vulnerability in single-IP circumvention designs: a relay must generate new observable flows (via DNS or TLS SNI) to reach end services after receiving client connections, creating a detectable server-and-client behavioral contrast. A relay accessing user-facing domains (news, social media) scores high on a Relay Suspicion Score (w=0.9) versus infrastructure domains (w=0.1). The paper argues this host-level signal is censorship-invariant and cannot be concealed by link obfuscation.
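A minimal sketch of how such a score could be computed, assuming it is the mean weight of the destinations a host contacts while also accepting inbound connections. Only the two weights (0.9 and 0.1) come from the finding. The averaging rule, the function names, and the example domains are hypothetical and are not the paper's definition.

```python
# Hypothetical Relay Suspicion Score. Only the weights 0.9 and 0.1 come from
# the finding; the averaging rule and the domain list are assumptions.
USER_FACING_WEIGHT = 0.9
INFRASTRUCTURE_WEIGHT = 0.1

# Assumed toy classification of destination domains.
USER_FACING = {"news.example", "social.example", "video.example"}


def domain_weight(domain: str) -> float:
    """Weight a destination by whether it is user-facing or infrastructure."""
    return USER_FACING_WEIGHT if domain in USER_FACING else INFRASTRUCTURE_WEIGHT


def relay_suspicion_score(accepts_inbound: bool, outbound_domains: list[str]) -> float:
    """Mean destination weight for a host that also serves inbound flows."""
    if not accepts_inbound or not outbound_domains:
        return 0.0  # no dual-role (server and client) behavior observed
    return sum(domain_weight(d) for d in outbound_domains) / len(outbound_domains)


relay = relay_suspicion_score(True, ["news.example", "social.example", "ntp.example"])
web_server = relay_suspicion_score(True, ["ntp.example", "ocsp.example"])
print(round(relay, 3), round(web_server, 3))
```

In this toy model a server that only contacts infrastructure domains (time sync, certificate status) scores low, while one that fetches news and social media on behalf of clients scores high.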
-
Stage 1 of the detection pipeline uses a lightweight heuristic: restrict analysis to IP addresses in "VPS-dense ASNs," which censors already target for resource-intensive inspection of fully-encrypted traffic. This pre-filter dramatically reduces the search space before applying the more expensive dual-role behavioral analysis. The evaluation was conducted without Stages 1 and 3 due to dataset limitations, so the reported 23.2% recall and 0.18% FPR describe the dual-role analysis alone and are presented as conservative estimates of the full pipeline's performance.
-
AnyTLS's default padding scheme operates across 8 levels (stop=8), with initial padding fixed at 30 bytes, small-data padding 100–400 bytes, and medium-to-large data padding chains of 400–500 bytes continuing through multiple 500–1000 byte segments. The 'c' (continue) marker allows multi-stage padding sequences within a single connection burst.
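The level structure described above can be sketched as a small parser and sampler. The "key=value" text layout used here is an assumption for illustration and is not taken from the AnyTLS specification. The numbers (stop=8, 30, 100-400, 400-500, 500-1000) come from the finding, and the number of chained segments at level 2 is arbitrary. The sketch only records the 'c' marker; how an implementation acts on it is out of scope.

```python
import random

# Illustrative scheme text. The layout is an assumption; only the byte ranges
# and stop=8 come from the finding above.
SCHEME_TEXT = """\
stop=8
0=30-30
1=100-400
2=400-500,c,500-1000,c,500-1000
"""


def parse_scheme(text):
    """Return (stop, {level: [segment, ...]}).

    A segment is either a (low, high) byte range or the literal marker "c".
    """
    stop, levels = None, {}
    for line in text.splitlines():
        key, _, value = line.partition("=")
        if key == "stop":
            stop = int(value)
            continue
        segments = []
        for token in value.split(","):
            if token == "c":
                segments.append("c")
            else:
                low, high = token.split("-")
                segments.append((int(low), int(high)))
        levels[int(key)] = segments
    return stop, levels


def sample_padding(segments, rng):
    """Draw one padding length per byte range; markers are skipped."""
    ranges = [s for s in segments if s != "c"]
    return [rng.randint(low, high) for low, high in ranges]


stop, levels = parse_scheme(SCHEME_TEXT)
rng = random.Random(0)  # fixed seed so runs are repeatable
for level in sorted(levels):
    print(level, sample_padding(levels[level], rng))
```

Level 0 always yields 30 bytes because its range is fixed, while the later levels draw a fresh length from each range on every call.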
-
AnyTLS is a TLS-based proxy protocol maintained by the sing-box team, designed in 2024 and first released in the sing-box dev-next branch. Its core mechanism wraps arbitrary proxy traffic in standard TLS and applies a configurable padding scheme (Padding Scheme) to enhance traffic concealment while maintaining compatibility with standard TLS infrastructure.
-
LetsVPN's exit is described as part of a broader pattern: a wave of 'airport' (proxy subscription service) outages across multiple providers in April 2026, documented in a companion article titled 'Detailed Breakdown of the April 2026 Outages at Major Airports' (2026年四月份各大机场断线详解), indicating coordinated or systematic GFW blocking affecting the circumvention ecosystem broadly during this period.
-
Two documented enforcement actions against VPN users were reported in April 2026 in China: one user was summoned, given a warning, and fined 300 RMB for using LetsVPN to access foreign platforms; a second user in Dalian was fined 500 RMB on April 2, 2026 for using 快喵VPN to log into Telegram.
-
The blog author, drawing on evaluation experience, concludes that LetsVPN's failure was not caused by IP exhaustion or ordinary node instability but by precise protocol-signature identification: once the GFW extracts a client handshake feature, it can simultaneously block all connections sharing that signature across hundreds of thousands of users.
-
LetsVPN permanently exited the Chinese mainland market in late April 2026 after its technical team spent 20 days making hourly adjustments and confirmed it could not restore connectivity. The official announcement designated April 8, 2026 as the effective service-termination date and initiated a full refund program.
-
The article documents that large-scale 'one-click' commercial VPN providers with static protocol stacks have become effectively non-viable in China, while subscription-based proxy node services using open-source clients (Clash, Shadowrocket) with server-side rapid IP and datacenter switching demonstrate substantially greater resilience to GFW blocking waves.
-
Article 19 documents that Iran's National Information Network (NIN / SHOMA) was designed with explicit reference to China's Great Firewall as a model, with institutional mirroring: Iran's Supreme Council of Cyberspace parallels the Cyberspace Administration of China, and both governments share a "cyber sovereignty" doctrine used to justify domestic content controls and cross-border technology transfer. The report frames Iran's filtering infrastructure as deliberately architected to replicate GFW capabilities, not as an independently developed system.
-
The report maps specific Belt and Road Initiative Digital Silk Road projects through which Chinese technology vendors have transferred censorship and surveillance infrastructure to Iran, including fiber backbone investments, data-center co-location agreements, and equipment supply chains. Specific vendors named include Huawei and ZTE as network infrastructure providers, with the report noting that equipment exports include filtering-capable hardware that Iran's ISPs have deployed at network choke points.
-
Simulations extending the ENEM19 game-theory framework show that ephemeral proxy schemes (modeled on Snowflake/Lantern) effectively neutralize both the "optimal" and "aggressive" censors from the original framework. In overprovisioned settings (proxies arriving at 250/step vs. 200 clients/step), even the null censor scenario outperforms either censor in equal-arrival settings. Over 90% of waiting users receive a proxy within 1 time step. The critical variable is not censor sophistication but proxy arrival rate relative to client demand—high proxy churn combined with high arrival rate defeats both enumeration strategies tested.
-
Adversarial pre-padding — prepending stochastic byte noise to packets — degrades ET-BERT encrypted traffic classification accuracy from >99% to 25.68%, exposing a structural vulnerability in all payload-byte-dependent detection systems. White-box adversarial attacks (Ayaka AH-MSI) additionally achieve evasion rates exceeding 99.5% against standard continuous-time sequence models via Manifold Shattering, where adversaries align malicious temporal distributions with benign baselines.
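A toy illustration of why pre-padding disturbs payload-byte features: prepending random bytes shifts every original byte to a new offset, so any feature tied to fixed positions changes. This is not the attack evaluated against ET-BERT. The function names, the `max_pad` parameter, and the "first n bytes" feature are assumptions, and a real deployment would also need the receiver to know how to strip the padding.

```python
import random


def pre_pad(payload: bytes, rng: random.Random, max_pad: int = 32) -> bytes:
    """Prepend 1..max_pad random bytes (max_pad is an assumed parameter)."""
    pad_len = rng.randint(1, max_pad)
    padding = bytes(rng.randrange(256) for _ in range(pad_len))
    return padding + payload


def leading_bytes(packet: bytes, n: int = 8) -> bytes:
    """Toy stand-in for a payload-byte feature: the first n bytes."""
    return packet[:n]


rng = random.Random(1)  # fixed seed so runs are repeatable
# Arbitrary example bytes standing in for a packet payload.
original = bytes.fromhex("160303007a020000760303") + bytes(40)
padded = pre_pad(original, rng)

print(leading_bytes(original).hex())  # feature seen without padding
print(leading_bytes(padded).hex())    # feature seen with padding
print(padded.endswith(original))      # the payload itself is intact
```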
-
Explicitly disentangling packet headers (structured, low-entropy) from encrypted payloads (high-entropy, stochastic) into separate MoE branches yields consistent gains across six datasets: 86.85% F1 on 120-class TLS 1.3 traffic (CSTNET-TLS), 97.88% F1 on USTC-TFC2016 malware/benign flows, and 92.65% F1 on imbalanced IoT traffic (CIC-IoT2022), demonstrating that headers and payloads carry fundamentally different and independently exploitable discriminative signals.
-
Assemblage's anti-censorship collateral damage argument rests on the economic and social value of AI-generated image communities. Blocking DeviantArt (65M MAU), Reddit (1.21B MAU), X/Twitter (611M MAU), or Telegram (1B MAU) to suppress steganographic circumvention would cause massive collateral damage to legitimate users—and to Chinese companies' revenue in the case of platforms popular in CN. The paper observes that even in authoritarian regimes, everyday users actively post AI-generated content, making blanket platform blocking politically and economically costly.
-
Lossy image compression is the primary practical barrier to deploying Assemblage on major platforms. Of 8 tested platforms, WeChat and Rednote (combined 2.6 billion MAU) failed because they serve only lossy-compressed downloads, destroying embedded steganographic content. Platforms that preserve lossless originals (Reddit, X/Twitter, DeviantArt, Discord, Imgur, Telegram) succeeded end-to-end. Discord serves ~30 KB compressed thumbnails by default but provides lossless originals via its native "Download" option.
-
Assemblage's diffusion-model steganography (Pulsar) encodes 300–618 bytes per image vector (mean ± SD by model). Generating one local state takes ~9.5 sec on an Apple M4 Pro; encoding takes ~4.4 sec; decoding takes ~4.2 sec. Sending a compressed 300-word message requires only K+h = 4+2 images using the church-256 model, with total send time ~90 sec and receive time ~30 sec. Perceptual-hash candidate detection runs in ~0.33 ms per image, making scanning all ~150 daily posts on /r/AIArt take under 1 second.
-
Assemblage inherits the bootstrapping limitation of all generative steganographic schemes: sender and receiver must share a symmetric key before communication begins. Public-key steganography exists in theory but does not currently support common image/text channels efficiently. The paper identifies three viable deployment scenarios: (1) travelers who carry a pre-shared secret before entering a censored region; (2) users in countries with episodic censorship who establish the key during uncensored periods; (3) a hybrid where a one-time signaling channel establishes the secret, after which Assemblage carries subsequent traffic.
-
WebSocket, required by HTTPT and WebTunnel to establish covert channels inside TLS connections, had an adoption rate as low as 6.3% of websites in 2021, sharply limiting the pool of volunteer websites that can act as proxies for these tools. By contrast, Huma's traffic replacement scheme embeds covert data in standard HTTP leaf objects (images, scripts, CSS), requiring only that the DW serve HTTP content — a near-universal property.
-
Striding with factor 4 (early downsampling) produces the largest single-factor degradation in the ablation study: average macro-F1 drops from 0.9909 to 0.9772, cross-dataset variance increases from 4.77×10⁻⁵ to 4.51×10⁻⁴, and worst-case dataset performance falls to a minimum of 0.9524. Fine-grained byte order and short-range structure (protocol headers, payload signatures, repeated byte motifs) carry essential discriminative signal that stride-based aggregation destroys.
-
A burst of just 5 packets truncated to 320 bytes each (1600 bytes total) suffices for macro-F1 ≥0.9824 across all six benchmarks; the classification token reads from the final recurrent state after a 4-layer Mamba-2 stack processing this fixed-length prefix, with no additional flow-level or session-level context required.
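The fixed-length input described above can be sketched as follows. This is an assumption-laden illustration in Python: zero-padding of short packets and short flows, and the helper name `build_burst`, are choices made here and are not confirmed by the source. IP-address masking is not shown.

```python
from typing import Iterable, List

BURST_PACKETS = 5       # packets kept per flow (from the entry above)
BYTES_PER_PACKET = 320  # bytes kept per packet (from the entry above)


def build_burst(packets: Iterable[bytes]) -> bytes:
    """Truncate or zero-pad the first five packets to 320 bytes each."""
    rows: List[bytes] = []
    for pkt in packets:
        if len(rows) == BURST_PACKETS:
            break
        # Slice to the cap, then right-pad short packets with zero bytes.
        rows.append(pkt[:BYTES_PER_PACKET].ljust(BYTES_PER_PACKET, b"\x00"))
    # Flows with fewer than five packets are padded with all-zero rows.
    while len(rows) < BURST_PACKETS:
        rows.append(b"\x00" * BYTES_PER_PACKET)
    return b"".join(rows)


if __name__ == "__main__":
    burst = build_burst([b"\x45" * 1500, b"\x45" * 60])
    print(len(burst))  # 5 * 320 bytes
```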
-
Classification from the first 5 packets × 320 bytes (1600-byte burst) achieves near-perfect accuracy across Tor (F1=0.9990), VPN (F1=0.9871), malware (F1=0.9954), and IoT attack traffic (F1=0.9966), with IP addresses masked and only header and initial payload retained. The earliest portion of each packet provides sufficient discriminative information for a classification decision made from a 1600-byte prefix drawn from the first five packets of a flow.
-
MambaNetBurst classifies Tor traffic (ISCXTor2016) at F1=0.9990 and VPN traffic (ISCXVPN2016) at F1=0.9871 using only the first 5 packets (1600 bytes total) with no pre-training, matching or exceeding pre-trained baselines like ET-BERT (ISCXTor F1=0.9967, ISCXVPN F1=0.9565) and NetMamba (ISCXTor F1=0.9986, ISCXVPN F1=0.9806) at 2.5–2.7M parameters.
-
MambaNetBurst achieves macro-F1 of 0.9990 on ISCXTor2016 and 0.9871 on ISCXVPN2016 without any pretraining, matching or exceeding heavily pretrained baselines such as ET-BERT (F1=0.9967/0.9565) and YaTC (F1=0.9986/0.9806). High-accuracy Tor and VPN traffic classification is achievable with a compact 2.5M-parameter supervised model requiring no labeled pretraining corpus.
-
TCP segmentation (splitting a DNS message into 20-byte TCP fragments) successfully circumvented DNS censorship in China for nearly all resolvers that support TCP. In Iran, TCP segmentation was only partially effective due to the censor's ability to reassemble TCP fragments when system load permits—some runs succeeded completely, others failed entirely across all resolvers. The "Last Response" mode (wait 3 seconds for the final UDP reply) was highly effective against China's on-path GFW injector for all resolvers except the fully IP-blocked Cloudflare 1.1.1.1 resolver.
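The "Last Response" mode above can be sketched as a UDP client that keeps listening for the full window and returns the final reply rather than the first. This is a rough Python illustration: the function names, the 4096-byte receive buffer, and the transaction-ID filter are assumptions made here, not details from the paper.

```python
import socket
import time
from typing import List, Optional


def pick_last(responses: List[bytes]) -> Optional[bytes]:
    """Return the final reply seen in the window, or None if there was none."""
    return responses[-1] if responses else None


def query_last_response(query: bytes, resolver: str, port: int = 53,
                        window: float = 3.0) -> Optional[bytes]:
    """Send one UDP DNS query and collect every reply for `window` seconds."""
    responses: List[bytes] = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(query, (resolver, port))
        deadline = time.monotonic() + window
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _addr = sock.recvfrom(4096)
            except socket.timeout:
                break
            # Keep only replies whose 2-byte DNS ID matches the query.
            if len(data) >= 2 and data[:2] == query[:2]:
                responses.append(data)
    return pick_last(responses)
```

The design relies on an on-path injector answering faster than the real resolver, so the genuine answer tends to arrive last; it cannot help when the resolver's IP address itself is blocked, as the entry notes for 1.1.1.1.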
-
QUICstep successfully circumvents the GFW's QUIC SNI censorship (active since April 2024) in live testing. Using an Alibaba VM in mainland China as client and an AWS instance in Northern Virginia as server, a native QUIC client was blocked after several fetches with the youtube.com SNI, while QUICstep consistently succeeded across 50 consecutive fetches. Seven tiktokcdn.com subdomains that were QUIC-SNI blocked were also reliably accessible via QUICstep. The approach routes only QUIC long-header (handshake) packets through a WireGuard tunnel; all subsequent short-header (data) packets travel the native path.
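The routing split described above depends on telling long-header packets from short-header ones, which QUIC signals in the most significant bit of the first byte. A minimal Python sketch of that decision follows; the path labels and function names are hypothetical, and this is not QUICstep's actual implementation.

```python
LONG_HEADER_BIT = 0x80  # QUIC "header form" bit: 1 = long header


def is_long_header(datagram: bytes) -> bool:
    """True if the first QUIC packet in the datagram uses a long header."""
    return bool(datagram) and bool(datagram[0] & LONG_HEADER_BIT)


def choose_path(datagram: bytes) -> str:
    """Send handshake (long-header) packets via the tunnel, the rest natively."""
    return "tunnel" if is_long_header(datagram) else "native"


if __name__ == "__main__":
    print(choose_path(bytes([0xC3]) + b"\x00" * 20))  # long header
    print(choose_path(bytes([0x43]) + b"\x00" * 20))  # short header
```

Only the first byte of the datagram is inspected, so a datagram that coalesces several QUIC packets is classified by its first packet; that simplification is an assumption of this sketch.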
-
TCP segmentation — splitting DNS-over-TCP messages into 20-byte fragments — successfully circumvented DNS censorship for 40 of 41 tested resolvers in China. In Iran, TCP segmentation is inconsistently effective: it succeeds in some scan runs and fails entirely in others, suggesting the Iranian censor can reassemble TCP fragments when processing capacity permits.
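A minimal Python sketch of the segmentation step described above. DNS over TCP prefixes each message with a two-byte length field; the framed message is then written in 20-byte pieces. The helper names are hypothetical, and disabling Nagle's algorithm with `TCP_NODELAY` only makes it likely, not guaranteed, that each write leaves the host as its own TCP segment.

```python
import socket
import struct
from typing import List


def frame_dns_over_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte big-endian length."""
    return struct.pack("!H", len(message)) + message


def segment(data: bytes, size: int = 20) -> List[bytes]:
    """Split data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def send_segmented(sock: socket.socket, message: bytes, size: int = 20) -> None:
    """Write the framed message to a connected TCP socket chunk by chunk."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    for chunk in segment(frame_dns_over_tcp(message), size):
        sock.sendall(chunk)
```

A censor that matches on single packets never sees the full query name in one segment; a censor that reassembles the TCP stream does, which is consistent with the inconsistent results reported for Iran.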
-
Under controlled lab conditions, a CNN trained on packet metadata (ports, sizes, TCP sequence numbers) achieved 99.5% accuracy classifying I2P packets with the 'Without payload' variant, versus only 72.5–76.5% using encrypted payload alone. However, when applied to the full recorded dataset, the 'Without payload' model's accuracy on the dominant irrelevant-traffic class dropped to 95.17%. It still classified 100% of target-class packets correctly, but the resulting false-positive rate was high enough to make it forensically unreliable.
-
The largest single source of censored domains in the GNL is MESA lab's SNI monitoring dataset (E21-SNI-Top200w.txt) containing 57,362 censored domains, and E21-SNI-Top120W-20221020.txt with 36,467 domains—totaling over 93K domains from network tap data alone for a single country (E21 = Ethiopia per InterSecLab attribution). A separate Xinjiang dataset (XJ-CUCC-SNI-Top200w.txt) contains 13,604 domains. These datasets "do not seem to come from popular domain lists, and instead appear to be gathered from network taps," confirming that Geedge builds censorship target lists directly from passive traffic observation.
-
Of 6,915,266 domains extracted from the 572 GiB Geedge Networks Leak (GNL), 298,955 censored domains (93.7% of all GNL-censored domains) appear in neither Tranco top-1M nor CitizenLab test lists. Measurements across China (Guangzhou/Nanjing), Myanmar, Pakistan, and Algeria confirmed censorship via DNS injection and SNI-based TLS connection termination. The GNL covers 25–62% of Tranco-censored domains across countries, showing substantial but incomplete overlap. This vendor-side ground truth reveals a censorship surface roughly two orders of magnitude larger than curated academic test lists.
-
The GNL reveals that Geedge actively maintains dedicated VPN-infrastructure tracking datasets. The China-specific component includes 7,016 domains in a "vpn-finder-plugins" repository (mesalab_git/intelligence-learning-engine) and 4,810 NordVPN server domains. A separate Pakistan-specific file lists 68 Psiphon CDN domains (geedge_docs/TSGEN/.../Psiphon-CDN_20240430.json), dated April 2024. A Myanmar deployment file (M22-VPN List.html, 27 domains) further confirms that country-specific VPN blocklists are operationally maintained. The "Appsketch" program reverse-engineers VPN apps to extract domains and IP addresses for blocking.
-
Obscura's browser-to-browser (B-B) WebRTC connections produce DTLS ClientHello and ServerHello messages indistinguishable from genuine browser traffic: across 100 captured handshakes compared against Facebook Messenger, Google Meet, Discord, and a reference WebRTC app using the dfind tool, no unique identifiers were found in C-C connections, and the sole Firefox-specific fingerprint (ServerHello length 86 bytes, cipher TLS_AES_128_GCM_SHA256, extension field length 46 bytes) matches the default Firefox WebRTC profile — meaning blocking it would also block all legitimate Firefox WebRTC users.
-
A differential degradation attack (DDA) that selectively drops RTP packets carrying the last packet of a video frame — exploiting the fact that a single lost packet causes the entire encoded frame to be discarded — reduces Protozoa's covert throughput to single-digit KBps at 1920×1080 with 15% frame loss and at 426×240 with 50% frame loss, while maintaining acceptable video quality for legitimate WebRTC traffic.
-
Authoritarian regimes blocked Snowflake primarily through DPI targeting fingerprints in Pion's DTLS handshake and TLS fingerprints in complementary WebRTC protocols, not through ML-based traffic analysis — confirming that cost-effective censors consistently favor simple, deterministic methods over computationally expensive classifiers.
-
Iran's censorship of refraction-networking proxies (Conjure via Psiphon) is not monolithic: different ISPs independently deploy different techniques and timelines. Over 800 million logged Conjure connections from July 2023–February 2025 across 10+ Iranian ASes show TCI (AS58224, ~33% of traffic) uses packet injection, while MCCI/Hamrah-e Avval (AS197207, ~22%) applies IP-based blocking, and some ASes (Parsonline AS16322, Shatel AS31549) show no proxy blocking at all.
-
Two Iranian ASes apply a protocol allowlist that drops traffic not matching known application-layer protocol patterns (after ~6 packets), independently of the destination IP. Experiments with fresh /26 phantom subnets showed that prefixing Conjure connections with a plain HTTP GET payload evaded this blocking for four weeks, while TLS Client Hello-prefixed and SSH-prefixed connections were blocked within 72 hours (TLS) or 72 hours after port rotation (SSH). HTTP GET on port 80 was the only tested prefix that survived the full experiment window.
-
Amnesty International's 102-page investigation identifies a multi-vendor surveillance stack deployed in Pakistan: Chinese DPI (Geedge/MESA-derived), Canadian social-media monitoring (Netsweeper), and Emirati commercial spyware (Pegasus and FinFisher). The system enables deep packet inspection, SNI-based filtering, and traffic-shape classification at national scale, including targeted interception of encrypted messaging apps and VPN traffic.
-
VPN search demand in Iran spiked approximately 707% during the June 2025 stealth blackout, as measured by Top10VPN analytics, making it one of the highest-documented circumvention-demand spikes associated with a single shutdown event. Despite this demand, many VPN connections failed because the protocol whitelist eliminated non-HTTPS tunneling methods and HTTP-level filters could detect known VPN signatures on port 443.
-
TTL-based path analysis showed that all censorship actions (DNS poisoning, HTTP injection, TLS resets) in the June 2025 shutdown occurred at the same network hop across all tested ISPs, indicating a single centralized national border gateway—likely TCI AS gateways—rather than per-ISP enforcement. Global BGP announcements were kept intact throughout, making the shutdown invisible to routing monitors while domestic connectivity collapsed.
-
Over 90% of tested censored domains returned private IP addresses in the 10.10.34.0/24 range (chiefly 10.10.34.34) via injected DNS replies during the June 2025 shutdown, with poisoned response TTLs often very low—consistent with inline DPI injection rather than a recursive DNS lookup. A small set of domains including Google and state-approved services were whitelisted and resolved correctly.
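A measurement script could flag the injected answers described above by testing each returned address against the reported range. The Python sketch below uses only the standard `ipaddress` module; the function names and the sample answers are illustrative, not data from the study.

```python
import ipaddress
from typing import Iterable

# Range reported in the entry above for injected answers.
SINKHOLE_NET = ipaddress.ip_network("10.10.34.0/24")


def is_poisoned(answer_ip: str) -> bool:
    """True if a DNS answer falls inside the reported sinkhole range."""
    return ipaddress.ip_address(answer_ip) in SINKHOLE_NET


def poisoned_fraction(answers: Iterable[str]) -> float:
    """Share of answers that point into the sinkhole range."""
    answers = list(answers)
    if not answers:
        return 0.0
    return sum(is_poisoned(a) for a in answers) / len(answers)


if __name__ == "__main__":
    sample = ["10.10.34.34", "10.10.34.35", "93.184.216.34"]  # made-up sample
    print([is_poisoned(a) for a in sample])
```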
-
Iran's June 2025 shutdown enforced a strict national protocol whitelist: only DNS (UDP/53), HTTP (port 80), and HTTPS (port 443) traffic from Iranian networks to external servers was forwarded; all other protocols—including OpenVPN (UDP/1194), SSH (port 22), and arbitrary TCP/UDP ports—were silently dropped without response by DPI at the border.
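The forwarding rule described above amounts to a three-entry allowlist keyed on transport protocol and destination port. A toy Python model follows; treating ports 80 and 443 as TCP-only is an assumption made here, and the real enforcement also inspects payloads, which this sketch ignores.

```python
# (transport, destination port) pairs that were forwarded, per the entry above.
# Assumption: 80 and 443 are allowed over TCP only.
ALLOWED = {("udp", 53), ("tcp", 80), ("tcp", 443)}


def forwarded(transport: str, dst_port: int) -> bool:
    """True if the border would forward this flow under the toy allowlist."""
    return (transport.lower(), dst_port) in ALLOWED


if __name__ == "__main__":
    for flow in [("udp", 53), ("tcp", 443), ("udp", 1194), ("tcp", 22)]:
        print(flow, "forwarded" if forwarded(*flow) else "dropped")
```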
-
TLS connections to blocked services (instagram.com, telegram.org) were terminated by TCP RST immediately after the client's ClientHello, before any certificate exchange, confirming SNI-based DPI that reads the plaintext SNI extension and aborts the handshake. HTTP filtering additionally matched Host headers and URL keywords case-sensitively, with injected HTTP 403 pages or TCP RST responses, and case-change evasions were sometimes effective.
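To show what "reading the plaintext SNI extension" involves, here is a minimal Python parser for a ClientHello carried in a single TLS record. It is a sketch under stated assumptions: it handles neither ClientHellos split across records or TCP segments nor Encrypted Client Hello, and it says nothing about how the censor's own parser is built.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a single-record ClientHello, else None."""
    try:
        if record[0] != 0x16:        # record type: handshake
            return None
        pos = 5                      # skip 5-byte record header
        if record[pos] != 0x01:      # handshake type: ClientHello
            return None
        pos += 4                     # handshake type + 3-byte length
        pos += 2 + 32                # client version + random
        pos += 1 + record[pos]       # session id
        (cs_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + cs_len            # cipher suites
        pos += 1 + record[pos]       # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:        # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                if record[pos + 2] != 0:
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
```

A middlebox that runs this kind of parse on the first client packet can reset the connection before any certificate is exchanged, which matches the timing reported above.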
-
Analysis of 5.1 billion Wallbleed responses revealed that the leaked memory contains fragments of live network traffic processed by the injection device: IP/TCP/UDP/HTTP headers and payloads (including plaintext traffic not related to DNS), x86_64 Linux stack frames with ASLR-consistent pointer patterns, and what appear to be glibc stack canaries. The 166 million UPnP/SSDP snippets in leaked memory suggest the GFW device shares a memory pool with traffic from private RFC 1918 addresses, hinting at internal management-plane traffic co-located with the censorship infrastructure. A side channel — the fixed cyclic ordering of false IP addresses across injection processes — distinguishes individual GFW injector processes from each other.
-
The September 2025 leak of ~600 GB from Geedge Networks and the MESA Lab (Institute of Information Engineering, Chinese Academy of Sciences) is the largest known document disclosure from the GFW vendor ecosystem. It establishes a direct lineage: MESA Lab (founded 2012 by Fang Binxing's team, annual contracted revenue >35M RMB by 2016) spun out Geedge Networks in 2018, with MESA alumni filling key engineering roles (e.g. Zheng Chao as CTO). The leak includes ~64 GB of MESA git repositories, ~35 GB of MESA internal documents, ~15 GB of Geedge internal documents, and a ~3 GB Jira export — providing direct access to source code, work logs, and internal communications behind GFW R&D.
-
Internal Geedge documents confirm active contracts to deploy GFW-derived censorship and surveillance infrastructure in Myanmar, Pakistan, Ethiopia, Kazakhstan, and at least one additional unidentified country under the Belt and Road framework, in addition to domestic deployments in Xinjiang, Jiangsu, and Fujian. The exported product (the Tiangou Secure Gateway / TSG line) is not a stripped-down export variant — leaked TSG documentation shows DPI, active-probing, ML classifiers, and granular per-region traffic control rules that mirror the domestic GFW capability set.
-
The August 20, 2025 unconditional RST event revealed an asymmetry in the GFW's triggering mechanism: for traffic originating inside China, both the client SYN and the server SYN+ACK each independently triggered three injected RST+ACK packets (six total per connection). For traffic to China from outside, only the Chinese server's SYN+ACK triggered RSTs — the foreign client's SYN alone was insufficient. This asymmetry implies the responsible device observed the SYN+ACK from the Chinese server as the trigger condition, not a port-match rule on the SYN.
-
Fragmenting large server responses across multiple independent TCP connections each below the ≈15–20 KB threshold circumvents the freeze, but at severe cost: downloading a 50 MB file requires approximately 2,560 separate TCP connections, which is operationally suspicious and significantly degrades throughput.
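The connection count above can be checked with a short calculation. In this Python sketch the use of binary units (1 MB = 1024 KB) is an assumption; under it, the 2,560 figure corresponds to the 20 KB upper end of the threshold, and the 15 KB lower end would need more connections.

```python
def connections_needed(file_bytes: int, per_connection_bytes: int) -> int:
    """Ceiling division: connections required if each carries at most the cap."""
    return -(-file_bytes // per_connection_bytes)


KIB = 1024
MIB = 1024 * KIB

if __name__ == "__main__":
    print(connections_needed(50 * MIB, 20 * KIB))  # 2560
    print(connections_needed(50 * MIB, 15 * KIB))  # 3414
```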
-
The freezing threshold is packet-count-based rather than strictly byte-based: the censor typically freezes after 25 packets have been sent in either direction (incoming or outgoing), which averages approximately 16 KB of payload. The limit applies to both TCP and UDP flows, and varies slightly by ISP.
-
The Russian DPI maintains two whitelists that exempt flows from the freeze: (1) a SNI-based whitelist covering select domains (visible in the TLS ClientHello), and (2) a CIDR-based whitelist of IP subnets for trusted destination servers. The SNI whitelist can be exploited by VLESS+Reality clients using an allowed SNI value as the apparent destination; the CIDR whitelist requires routing through an IP from a whitelisted prefix, making circumvention 'extremely difficult' without an intermediate node in a whitelisted subnet.
-
Russia's mobile operators (MTS, Beeline, MegaFon, Yota) deployed a TCP connection-freezing technique in mid-2025 that silently halts packet delivery after approximately 15–20 KB of server-to-client data within a single TCP connection, without sending RST packets, causing clients to stall until timeout. The trigger requires: (1) TLS 1.3 or TLS 1.2 over TCP, (2) destination IP located in a foreign datacenter ASN (e.g., Hetzner, DigitalOcean), and (3) cumulative in-connection payload exceeding the threshold.
-
Only SSH/SFTP and sometimes RDP are observed to pass through the Russian mobile network freeze without data-size limitations; raw TCP transfers without TLS and all common proxy protocols (VLESS, Reality, Trojan, Shadowsocks) are subject to the 15–20 KB per-connection cap. This suggests the censor's DPI whitelist is protocol-specific and that SSH's wire format is recognized as exempt.
-
InterSecLab frames the Geedge/TSG export program as the commoditization of national firewall capability: rather than each censor state independently developing detection infrastructure, they contract Geedge for a turnkey system incorporating the cumulative R&D of MESA Lab (>10 years, National Science and Technology Progress Award winners). This structural shift means the marginal cost for an autocratic government to acquire GFW-grade censorship is now a procurement decision, not a multi-year engineering program. The report identifies that Geedge's relationship with the MESA Lab gives customer states indirect access to ongoing academic R&D improvements, not just a static product.
-
InterSecLab's 76-page analysis of the Geedge/MESA leak (based on nine months of indexing and translating >100,000 documents) characterizes the Tiangou Secure Gateway (TSG) product line as a commercially deployable detection stack that combines deep packet inspection, real-time mobile subscriber monitoring, active probing, ML-based traffic classifiers, and granular per-region rule sets. TSG is not a research prototype — leaked documentation includes deployment timelines and client government interactions for Kazakhstan, Ethiopia, Pakistan, Myanmar, and one unnamed country, with censorship rules explicitly tailored to each region.
-
Amigo introduces a decentralized continuous key agreement protocol and a novel routing scheme for secure group mesh messaging over short-range radio (Bluetooth/Wi-Fi Direct) when governments disable the Internet during protests. Extensive simulations demonstrate that prior approaches fail to scale to realistic protest environments that have high link churn, physical spectrum contention, and dense mobility. Amigo's protest-specific optimizations address these conditions, but the simulations also show that scaling to protests with thousands of participants remains an open challenge.
-
Simulations show that previous secure mesh messaging systems fail to provide efficient private group communication under realistic protest conditions — specifically high node mobility, link churn, and RF spectrum contention — conditions that prior work did not evaluate. Bridgefy, the most widely deployed protest mesh app, was broken cryptographically in 2021 and 2022, and even its successor designs lack the scalability needed for protests with thousands of participants.
-
The report traces the specific corporate pathway through which Geedge Networks exported GFW-derived technology to Myanmar: via front companies, shell entities, and Belt and Road Initiative contract frameworks that obscure the Chinese state's direct involvement. The report names at least three intermediary entities used to transfer equipment and technical personnel to the Myanmar military, and documents that the same export channel was used for ongoing product updates post-deployment.
-
Justice for Myanmar documents that Geedge Networks supplied Myanmar's military junta with GFW-derived surveillance and censorship infrastructure under Belt and Road frameworks following the February 2021 coup. The deployed system (Tiangou Secure Gateway / TSG) incorporates the same DPI, active-probing, and ML-classifier capabilities as the domestic Chinese GFW, giving Myanmar one of the most technically capable censorship systems in Southeast Asia.
-
Bridgefy included both sender and receiver long-term identifiers on every message; Albrecht et al. found this unsafe and the deployed security upgrades proved insufficient, leaving Bridgefy unable to provide anonymity. Firechat similarly transmits long-term public user IDs with every message, uniquely identifying accounts to every recipient in the mesh.
-
Iran's HTTP censor exhibits several parsing inconsistencies exploitable for evasion: (1) it is case-sensitive and ignores mixed-case method variants such as "gET"; (2) it does not censor the Host header for HTTP version strings "HTTP", "1.1", and "example" (suggesting a version regex of HTTP/.*); (3) when the Host header is absent, the path is not censored for versions "HTTP" and "HTTP/1"; (4) the body is never analyzed regardless of version. All HTTP and DNS censorship occurs at the same last-hop border node, suggesting a centralized architecture.
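Probe requests for the inconsistencies listed above could be generated with a small builder like the Python sketch below. The helper name and the host names are hypothetical. Note that HTTP methods are case-sensitive, so many origin servers will reject a method such as "gET"; whether a given variant works end to end is not tested here.

```python
from typing import Optional


def build_request(host: Optional[str], path: str = "/", method: str = "GET",
                  version: str = "HTTP/1.1", body: bytes = b"") -> bytes:
    """Assemble a raw HTTP request with caller-controlled method and version."""
    lines = [f"{method} {path} {version}"]
    if host is not None:                 # omit the Host header when host is None
        lines.append(f"Host: {host}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + body


if __name__ == "__main__":
    # Variant (1): mixed-case method.
    print(build_request("blocked.example", method="gET"))
    # Variant (2): version string without the "HTTP/" prefix.
    print(build_request("blocked.example", version="1.1"))
    # Variant (3): no Host header, truncated version.
    print(build_request(None, path="/blocked-keyword", version="HTTP/1"))
```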
-
In China, multiple URLs show 100% failure rates across 3–7 ASNs with near-zero confirmed blockpage rates (e.g., hkleaks.ru, blockdx.co, libgen.space each at 100% failure, avg_confirmed ≈ 0), indicating that China increasingly uses non-blockpage mechanisms — connection drops, TCP anomalies — that evade blockpage-based detection while achieving complete access denial.
-
Ceno Browser's decentralized peer-to-peer network grew from approximately 600 active peers on June 13 to nearly 8,000 by July 11, 2025 — a 13× increase in under 30 days — with some Ceno connections remaining online throughout the full blackout, indicating that P2P architectures without fixed enumerable infrastructure can survive centralized application-layer shutdowns.
-
The June 2025 Iran shutdown achieved approximately 90% reduction in international traffic without BGP withdrawal by combining DNS poisoning, protocol whitelisting, and DPI at the national border — maintaining an outward appearance of normal connectivity for traditional monitoring tools while severing the population's access to the global Internet. Unlike the 2019 shutdown, which was implemented per-provider over 24+ hours, the 2025 operation was centralized and covert.
-
The Iranian government blocked international One-Time Passwords (OTPs) during the June 2025 shutdown, forcing citizens to abandon secure international platforms and migrate to government-approved domestic services with known security and privacy vulnerabilities — using authentication infrastructure as a deliberate chokepoint to coerce adoption of surveilled platforms at scale.
-
Lantern's proxyless protocol accounted for approximately 40% of its traffic during the June 2025 Iran shutdown, demonstrating that a direct-server / proxyless transport mode provided a significant load-bearing fallback when conventional proxy infrastructure was blocked by centralized DPI enforcement.
-
Psiphon's multi-protocol design maintained access for approximately 1.5 million users during the June 2025 Iran shutdown — roughly one-third of its normal user base — while traffic throttling rendered many single-protocol circumvention tools functionally useless for anything beyond basic text communication.
-
The DTLS ClientHello extensions field is the most prominent feature for fingerprinting Snowflake's Pion WebRTC stack. A passive DPI tool (dfind) validated against the MacMillan et al. dataset of 6,500 DTLS handshakes reliably identifies Pion-based implementations via unique extension byte patterns. Chrome randomized its extension list order starting with version 129.0.6668.58 (September 2024), yielding 6! = 720 unique permutations and hardening it against deterministic matching. Firefox adopted DTLS 1.3 by default from version 127 (May 2024), which changes the extension structure entirely and renders DTLS 1.2 mimicry obsolete for Firefox traffic.
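The 6! figure above is simply the number of orderings of six items. The Python sketch below confirms the count by enumeration; the extension labels are placeholders, not Chrome's actual extension list.

```python
import itertools
import math
from typing import Sequence


def count_orderings(extensions: Sequence[str]) -> int:
    """Number of distinct orderings of a list of distinct extensions."""
    return len(set(itertools.permutations(extensions)))


if __name__ == "__main__":
    placeholders = ["ext_a", "ext_b", "ext_c", "ext_d", "ext_e", "ext_f"]
    print(count_orderings(placeholders), math.factorial(6))  # 720 720
```

A signature that matches one fixed byte sequence for the extensions field covers only one of these orderings.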
-
Beyond business-filing cross-references, the paper introduces a method of linking VPN provider families by showing they share VPN server cryptographic credentials (Shadowsocks passwords, server TLS fingerprints) across distinct app identities. This extends prior ownership-attribution methods that relied solely on corporate registry data and code similarity, adding shared live infrastructure as a linkage signal that is harder for operators to obscure.
-
Three families of VPN apps with combined Google Play download counts exceeding 700 million share not only common ownership but hardcoded cryptographic credentials, including Shadowsocks passwords embedded in their APKs. An attacker who extracts these hardcoded passwords can passively decrypt all traffic of users of these apps. Business filing and APK analysis linked the families to the same operators; one previously-identified family (Innovative Connecting / Autumn Breeze / Lemon Clove) had already been linked to the People's Liberation Army.
-
Russian TSPU devices directly block ECH by dropping ClientHello messages that contain both an ECH extension and the outer SNI hostname "cloudflare-ech.com" — the static outer SNI Cloudflare advertises in all its ECH configurations. Blocking affects both TLS and QUIC. ECH connections to servers with Cloudflare ECH support but outside Cloudflare's official IP ranges are NOT blocked. TCP segmentation alone or TLS record fragmentation alone did NOT bypass TSPU ECH blocking, but combining both techniques did circumvent it. TSPU has also added TCP reassembly capabilities that defeat previously effective fragmentation-only bypasses.
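The combined bypass described above layers two splits: the ClientHello is first re-packed into several smaller TLS records, and the resulting byte stream is then cut into small TCP writes. The Python sketch below shows both steps on raw bytes. Fragment and segment sizes are arbitrary illustrative values, the function names are hypothetical, and the source does not specify the sizes or tools actually used.

```python
import struct
from typing import List


def fragment_tls_record(record: bytes, frag_size: int) -> List[bytes]:
    """Re-pack one TLS record's payload into several smaller records."""
    content_type, version = record[0], record[1:3]
    (length,) = struct.unpack_from("!H", record, 3)
    payload = record[5:5 + length]
    out = []
    for i in range(0, len(payload), frag_size):
        chunk = payload[i:i + frag_size]
        # Each fragment gets its own 5-byte record header.
        header = bytes([content_type]) + version + struct.pack("!H", len(chunk))
        out.append(header + chunk)
    return out


def tcp_segments(data: bytes, seg_size: int) -> List[bytes]:
    """Cut a byte stream into chunks intended to be sent as separate writes."""
    return [data[i:i + seg_size] for i in range(0, len(data), seg_size)]


def fragment_and_segment(record: bytes, frag_size: int, seg_size: int) -> List[bytes]:
    """Apply TLS record fragmentation first, then TCP segmentation."""
    stream = b"".join(fragment_tls_record(record, frag_size))
    return tcp_segments(stream, seg_size)
```

With both layers applied, a censor has to reassemble the TCP stream and also stitch the handshake back together across record boundaries before it can see the extension list.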
-
Censorship classifiers and traffic analysis attacks consistently exploit the initial seconds of a proxy connection, where packet-size, inter-arrival-time, and burst features are maximally discriminative. Cited work demonstrates that website fingerprinting classifiers trained solely on the first few seconds of Tor traffic achieve high accuracy, and real-world GFW detection of fully-encrypted protocols also targets early-connection bytes.
-
The proposed framework operates as a transparent shim between application and network layers, enforcing a configurable schedule over packet size, timing, and burst patterns. The shaping logic is transport-agnostic — applicable across TCP, UDP, QUIC, and TLS — and activates only after the underlying protocol handshake completes, making it reusable across heterogeneous circumvention stacks.
-
The framework is designed for adoption into existing censorship-resistant systems in the same manner as uTLS — as a drop-in Go library requiring minimal code changes. Primary integration targets are Tor pluggable transports and WireGuard-based VPNs that currently lack built-in traffic obfuscation. Predefined hand-crafted schedules are provided alongside GAN-generated ones to enable developer stress-testing without model inference.
-
Security arguments for existing circumvention systems are based on ad-hoc adversary models that are often incomplete or unrepresentative of real-world adversaries, leading to allegedly secure designs that fail against relatively straightforward attacks. Protocols that substitute or parasitize a cover application's encrypted traffic channel fail against application-aware adversaries who observe or induce violations of application-specific behavioral invariants — a weakness that pre-trained classifiers on custom traces fail to surface.
-
The paper proposes modeling HCS undetectability as a simulation-based cryptographic distinguishability problem: if traces produced by the real-world HCS channel are computationally indistinguishable from ideal-world application-channel traces (T_HCS ∼ T_simulator), the HCS achieves provable security against any adversary — passive or active. The simulation paradigm is parametric in adversary capability, meaning a single proof covers the full spectrum from passive SNI monitoring to active DPI.
-
Iran's June 2025 shutdown enforced a four-layer DPI topology: ISP-administered DPI boxes, centrally commanded DPI at large ISPs under the Communications Regulatory Authority, DPI at Tehran IX that filters domestic-only transit traffic, and DPI at internationally-linked networks — almost all funneling through AS48159 (Telecommunications Infrastructure Company, TIC).
-
Between 21–25 June 2025, Iranian fixed-line networks partially restored access via TCP-based protocols (SSH, WebSockets) while mobile networks and UDP-based protocols remained heavily restricted, indicating deliberate asymmetric enforcement to restore domestic data-center operation without re-enabling VPN circumvention.
-
During the June 2025 blackout, virtually all UDP-based protocols were blocked across major Iranian networks — WireGuard, AmneziaWG, QUIC, WebRTC, and OpenVPN — with the sole deliberate exception of UDP port 53 (DNS), preserved to avoid cascading failures in internal infrastructure.
-
All six Chinese browsers (Baidu Searchbox, UC Browser, QQ Browser, OPPO, Redmi/Mi, VIVO) transmit the full URL of every page visited—including HTTPS pages—along with page titles and search terms out-of-band to vendor servers, entirely bypassing VPN tunnel protection. In five of six cases this data is transmitted with no cryptography or weak cryptography (purely symmetric AES with hardcoded keys, or textbook RSA with a 128-bit modulus factorable in under 3 seconds), making it readable by any on-path actor between the VPN egress and the vendor's servers.
-
Of the four Chinese browsers offering incognito mode (Baidu Searchbox, UC Browser, QQ Browser, Redmi/Mi), all four continue to leak PII and three continue to transmit full browsing activity including URLs; UC Browser specifically sends data during incognito sessions encrypted with hardcoded AES/CBC key "Ine34@32b#jeRs2h" and a zero initialization vector to crash-upload endpoints. Incognito mode in these browsers provides no protection against vendor-side or on-path surveillance and creates false privacy expectations for circumvention tool users.
-
All six browsers grant dangerous Android permissions (READ_PHONE_STATE, INTERNET, ACCESS_NETWORK_STATE) to third-party SDKs; built-in phone browsers grant significantly more such permissions than app-store browsers. Baidu Mobile Tongji Analytics SDK—present in all six via Baidu as default search engine—collects IMEI, UUID, CUID, GAID, device MAC, and Bluetooth MAC, creating a persistent cross-app device fingerprint that identifies users across VPN sessions and survives IP changes.
-
The GFI's HTTP and HTTPS filters are now stateful (requiring an initial SYN packet with matching sequence numbers) and have been activated on all TCP ports, not only the standard ports 80 and 443 reported by prior studies. This is a significant departure from previous work, which found stateless HTTP/HTTPS blocking limited to standard ports. The HTTP filter injects a 403 Forbidden blockpage (not RST packets as used by the GFW), while the HTTPS filter injects a single RST+ACK packet. The GFI also exhibits TCP non-compliance (not requiring a full three-way handshake to trigger filtering), enabling outside-in measurement without in-country servers.
-
The GFI operates three distinct DNS/HTTP injectors with different fake IP addresses (10.10.34.34, 10.10.34.35, 10.10.34.36) and partially overlapping blocklists—mirroring the GFW's triplet-censor architecture. Injector 10.10.34.35 exhibits TTL reflection (injected response TTL = probe TTL − hop count), identical to the GFW. No IP exclusively receives injections from 10.10.34.34 (a smaller, selective component); the two primary injectors 10.10.34.35 and 10.10.34.36 handle the majority of censorship. Different injectors maintain distinct domain blocklists, meaning which domains a user sees as censored depends on routing through their AS.
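The TTL-reflection relation quoted above can be checked mechanically: a reflecting injector produces the same (probe TTL minus response TTL) offset for every probe TTL, whereas an injector with a fixed initial TTL does not. The following is a minimal sketch under that reading; the function name and the sample values are hypothetical and not taken from the paper.

```python
def reflection_hop_count(samples):
    """Return the constant (probe_ttl - response_ttl) offset, or None.

    samples: iterable of (probe_ttl, response_ttl) pairs observed from one
    injector. A TTL-reflecting injector yields a single offset across all
    probe TTLs; anything else returns None.
    """
    offsets = {probe - resp for probe, resp in samples}
    if len(offsets) == 1:
        return offsets.pop()
    return None


# Hypothetical measurements: the response TTL tracks the probe TTL.
print(reflection_hop_count([(64, 52), (100, 88), (128, 116)]))
# Hypothetical measurements: the response TTL is fixed, so no reflection.
print(reflection_hop_count([(64, 52), (128, 52)]))
```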
-
MinecruftPT encodes circumvention traffic steganographically inside the Minecraft Java Edition network protocol, making a censored connection appear to a network observer as an ordinary online Minecraft game session. The cover channel is a high-volume, varied-packet-size TCP protocol with a large and active user population, making statistical fingerprinting harder than for lower-volume cover protocols.
-
MinecruftPT achieves mimicry by implementing enough of the Minecraft protocol to pass as a real client-server game session, not just in header structure but in behavioral sequence. The paper evaluates it under DPI and traffic-shape analysis, finding that faithful protocol mimicry at the behavioral level (packet sequence, message types, timing) is necessary to defeat classifiers that go beyond simple byte-pattern matching.
-
MinecruftPT uses the TCP-based Minecraft protocol rather than a WebRTC/UDP approach. The paper notes this gives it an availability advantage in environments where WebRTC is filtered or where UDP is blocked — a common configuration in corporate or institutional networks and some national censorship regimes. This positions it as complementary to Snowflake in the circumvention transport portfolio.
-
The proposed system adopts the turbo tunnel architecture to provide a reliability layer over lossy TURN relay paths and to allow traffic reassembly at a single bridge across multiple TURN proxies. Three encapsulation modes are specified: direct application data inside TURN messages, DTLS datagrams via WebRTC data channels, and video frames inside WebRTC media streams — the latter two mimicking the encapsulation strategies of existing WebRTC circumvention systems such as Snowflake and TorKameleon.
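The reassembly idea can be illustrated with a small sketch: the client tags each chunk with a sequence number, spreads chunks over N paths, and the bridge orders them by sequence number regardless of which path delivered them. This is only an illustration under assumed framing (a 4-byte sequence number and 2-byte length, both hypothetical); a real reliability layer would also handle loss and retransmission, which this sketch omits.

```python
import struct

# Hypothetical frame header: sequence number (uint32) + payload length (uint16).
HEADER = struct.Struct("!IH")


def split_stream(data: bytes, n_proxies: int, chunk_size: int = 4):
    """Cut data into sequence-numbered frames assigned round-robin to N paths."""
    if n_proxies < 1:
        raise ValueError("need at least one proxy")
    paths = [[] for _ in range(n_proxies)]
    for seq, off in enumerate(range(0, len(data), chunk_size)):
        payload = data[off:off + chunk_size]
        paths[seq % n_proxies].append(HEADER.pack(seq, len(payload)) + payload)
    return paths


def reassemble(frames):
    """Bridge side: order frames by sequence number, ignoring arrival order."""
    chunks = {}
    for frame in frames:
        seq, length = HEADER.unpack_from(frame)
        chunks[seq] = frame[HEADER.size:HEADER.size + length]
    return b"".join(chunks[s] for s in sorted(chunks))


paths = split_stream(b"hello censored world", n_proxies=3)
# Simulate out-of-order arrival by draining the paths in reverse.
arrived = [frame for path in reversed(paths) for frame in path]
print(reassemble(arrived))
```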
-
TURN servers used by major applications such as Facebook Messenger for media relay are hypothesized to be less likely blocked in censored regions due to collateral damage to legitimate WebRTC traffic. Providers like Cloudflare, Metered Video, and ExpressTURN supply geographically distributed TURN infrastructure that can be used without any special configuration by a censorship evasion system.
-
The system targets a threat model where the censor performs passive DPI to fingerprint and block the client-to-TURN-proxy channel, and also conducts active enumeration attacks to discover and block proxy endpoints. The paper explicitly notes that traffic splitting may introduce distinct fingerprints of its own that require empirical evaluation — acknowledging that multi-path approaches are not fingerprint-free.
-
Traffic splitting across N TURN proxies (1 ≤ N ≤ M) is hypothesized to resist active probing because each TURN server responds to probing requests identically to a regular TURN server, providing no distinguishing signal. Additionally, proxy ephemerality combined with splitting allows on-the-fly migration to new proxies when existing ones are blocked, maintaining connectivity even under partial blocking.
-
Three open-source DPI tools (Zeek, libprotoident, nDPI) each fail to identify between 93% and 100% of UPGen flows. libprotoident misidentified 7% of UPGen flows as RTMP; nDPI and Zeek produced zero false labels. On a real-world MAWI/WIDE backbone capture, Zeek failed to recognize 90% of flows and nDPI failed on 67%, confirming that unidentified-protocol traffic is common in the wild; allowlisting without significant collateral damage (≥4%) is infeasible.
-
State-of-the-art ML classifiers (Deep Fingerprinting, Decision Tree, Random Forest, nPrintML) trained on known UPGen protocols and benign traffic always incur high out-of-distribution false-positive rates when attempting to block unknown UPGen protocols — in the vast majority of experiments the OOD FPR is 100%. The one exception (SSH OOD, Deep Fingerprinting) achieved a UPGen TPR of only 20%. By contrast, identical classifiers successfully generalize to block unknown Obfs4 flows with near-zero collateral damage in 3 of 4 cases.
-
UPGen's generator samples 18 independent parameters to produce 4.2×10^22 distinct structured encrypted protocols (entropy 38.4 bits). Each proxy is assigned a unique generated protocol, so identifying one protocol exposes only a single proxy. The generator was designed by studying 27 real-world encrypted protocols and sampling from observed structural patterns (greeting strings, handshake patterns, field orderings, key encodings).
-
The GFW detects fully encrypted protocols (FEPs) using ad-hoc rules, including the percentage of printable ASCII characters per packet (threshold: over 50%) and the observation that FEP entropy is considerably higher than that of normal encrypted TLS traffic. These rules change frequently, so rigid FEP designs cannot adapt.
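The two per-packet features named above are easy to compute. The sketch below measures the printable-ASCII fraction and the Shannon entropy of a payload; it does not reproduce the GFW's actual rule set. Treating bytes 0x20 through 0x7E as "printable" is an assumption, as are the function names.

```python
import math
from collections import Counter


def printable_fraction(payload: bytes) -> float:
    """Fraction of bytes in the assumed printable ASCII range 0x20..0x7E."""
    if not payload:
        return 0.0
    printable = sum(1 for b in payload if 0x20 <= b <= 0x7E)
    return printable / len(payload)


def shannon_entropy(payload: bytes) -> float:
    """Empirical byte entropy in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    n = len(payload)
    return -sum(c / n * math.log2(c / n) for c in Counter(payload).values())


for sample in (b"GET / HTTP/1.1\r\n", bytes(range(256))):
    frac = printable_fraction(sample)
    print(f"printable={frac:.3f} over_50pct={frac > 0.5} "
          f"entropy={shannon_entropy(sample):.2f}")
```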
-
The paper concludes with design guidelines for future FIA-based privacy-enhancing technologies, identifying that path-aware routing in SCION and NDN's in-network caching both create new surveillance exposure: SCION path headers reveal routing metadata to on-path censors; NDN caching at routers means content is replicated at points under censor control. The authors recommend that PETs built on FIAs treat these architectural features as threat vectors, not privacy benefits.
-
Wrana et al. systematically assess how well existing surveillance and censorship mechanisms can target users of Future Internet Architectures (FIAs) — including NDN, SCION, XIA, and MobilityFirst — finding that DPI and flow-correlation techniques from the current internet map onto FIA traffic with moderate adaptation. The paper identifies that FIA naming/addressing schemes introduce new censorship attack surfaces (e.g., content-name-based filtering in NDN) not present in IP-based architectures.
-
Since August 2023, Henan Province has operated its own TLS SNI-based and HTTP Host-based censorship middleboxes that inspect and block traffic exiting the province—a second filtering layer on top of the national GFW. The Henan Firewall is fingerprinted by a unique TCP RST+ACK injection carrying a fixed 10-byte payload (0x01 02 03 04 05 06 07 08 09 00), IP ID 0x0001, and an observed TTL of 58. Unlike the GFW, it injects resets only toward the client, performs no residual censorship, and requires no TCP handshake to trigger. Longitudinal testing (Nov 2023–Mar 2025, Tranco top 1M daily + 227M CZDS domains weekly) found the Henan Firewall blocked a cumulative 4.2 million domains—more than five times the GFW's cumulative blocklist—and at peak blocked ten times more domains than the GFW.
-
The Henan Firewall only inspects traffic leaving Henan Province toward the rest of the world—it does not inspect domestic intra-China traffic nor inbound traffic entering the province. This contrasts with the GFW, which operates bidirectionally at China's national border. Measurement across seven CN cities (Beijing, Shanghai, Chongqing, Guangzhou, Nanjing, Chengdu, Zhengzhou) found no evidence of comparable provincial firewalls in the other six locations, making Henan the only documented province with an autonomous censorship layer as of March 2025. The Henan Firewall also uses the same blocklist for both HTTP Host-based and TLS SNI-based censorship, whereas the GFW maintains separate domain lists per protocol.
-
The Henan Firewall is stateless in two exploitable ways: (1) it requires the TCP header to be exactly 20 bytes—enabling any TCP option (e.g., TCP Timestamps, which Windows disables by default) to bypass it entirely; (2) it does not perform TCP reassembly, so splitting a TLS ClientHello across two TCP segments such that the SNI extension straddles the boundary bypasses the censor. Both bypasses require only client-side changes and have already been implemented in Xray, GoodbyeDPI, and Shadowrocket. TLS record fragmentation (splitting the ClientHello across multiple TLS records within one TCP segment) also defeats both the Henan Firewall and the GFW, since neither performs TLS reassembly.
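TLS record fragmentation, as described above, re-encodes one record's payload as several records; a compliant endpoint concatenates the fragments, while a parser that reads only one record never sees the whole message. The sketch below shows the byte-level transformation only. The function names are hypothetical, and the demo payload is a placeholder string rather than a real ClientHello.

```python
import struct

# TLS record header: content type (1 byte), version (2), length (2).
REC = struct.Struct("!BHH")


def fragment_tls_record(record: bytes, split_at: int) -> bytes:
    """Re-encode one TLS record as two records carrying the same payload."""
    content_type, version, length = REC.unpack_from(record)
    payload = record[REC.size:REC.size + length]
    if len(payload) != length or not 0 < split_at < length:
        raise ValueError("truncated record or split point out of range")
    parts = (payload[:split_at], payload[split_at:])
    return b"".join(REC.pack(content_type, version, len(p)) + p for p in parts)


def reassemble_tls_records(stream: bytes) -> bytes:
    """What a compliant endpoint does: concatenate the record payloads."""
    out, off = b"", 0
    while off < len(stream):
        _, _, length = REC.unpack_from(stream, off)
        out += stream[off + REC.size:off + REC.size + length]
        off += REC.size + length
    return out


# Placeholder handshake record (type 22) whose payload stands in for an SNI.
record = REC.pack(22, 0x0301, 11) + b"example.com"
fragmented = fragment_tls_record(record, split_at=4)
print(b"example.com" in fragmented)        # name no longer contiguous
print(reassemble_tls_records(fragmented))  # endpoint still recovers it
```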
-
The computational cost of decrypting QUIC Initial packets limits the GFW's throughput: blocking effectiveness drops measurably as cross-border QUIC traffic increases and exhibits a diurnal pattern, falling during China's peak traffic hours. In a controlled experiment, sending QUIC Initial packets at 100–1500 kpps (TTL-limited so they reach the GFW but not end-hosts) caused GFW censorship effectiveness to decrease monotonically with sending rate, while equal-rate random-payload UDP traffic produced no such degradation—confirming the bottleneck is QUIC decryption, not raw bandwidth. A related availability attack using IP-spoofed QUIC Initials from one machine can cause the GFW to drop all UDP traffic between arbitrary Chinese hosts and any foreign endpoint for the 180-second residual window.
-
Since April 7, 2024, the GFW decrypts every QUIC client Initial packet at China's national border and blocks connections whose TLS ClientHello SNI matches a QUIC-specific blocklist. Blocking takes the form of dropping all subsequent UDP packets sharing the same (src-IP, dst-IP, dst-port) 3-tuple for 180 seconds, with no RST injection. The GFW applies a source-port heuristic: packets with src-port ≤ dst-port are not inspected, capturing >92% of real QUIC client Initials while processing only ~30% of all UDP traffic. The QUIC blocklist contains 58,207 unique FQDNs (Tranco, Oct 2024–Jan 2025), approximately 60% of the DNS blocklist in size; 33% of blocked FQDNs do not actually support QUIC, suggesting the list was derived from an existing domain-name blocklist rather than live QUIC service discovery.
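The behavior described above can be modeled as a small state machine: skip inspection when src-port ≤ dst-port, and on an SNI match install a 180-second drop rule keyed on the 3-tuple. The sketch below is a toy model for reasoning about measurements, not the GFW's implementation. The class name is hypothetical, decryption is abstracted into an `sni` argument, and UDP flow-state tracking is omitted.

```python
RESIDUAL_SECONDS = 180  # residual blocking window reported above


class QuicCensorModel:
    def __init__(self, blocklist):
        self.blocklist = set(blocklist)
        self.blocked_until = {}  # (src_ip, dst_ip, dst_port) -> expiry time

    def handle(self, now, src_ip, src_port, dst_ip, dst_port, sni=None):
        """Return 'drop' or 'pass' for one UDP datagram.

        sni is the ClientHello SNI if the datagram is a QUIC Initial that the
        censor could decrypt, else None.
        """
        key = (src_ip, dst_ip, dst_port)
        if self.blocked_until.get(key, float("-inf")) > now:
            return "drop"
        # Source-port heuristic: only src_port > dst_port is inspected.
        if src_port > dst_port and sni in self.blocklist:
            # Only subsequent packets are dropped; the trigger itself passes.
            self.blocked_until[key] = now + RESIDUAL_SECONDS
        return "pass"


censor = QuicCensorModel({"blocked.example"})
print(censor.handle(0, "10.0.0.1", 50000, "203.0.113.5", 443, "blocked.example"))
print(censor.handle(1, "10.0.0.1", 50001, "203.0.113.5", 443))
print(censor.handle(181, "10.0.0.1", 50001, "203.0.113.5", 443))
```

Note that the drop rule ignores the source port, so a new client port on the same 3-tuple is still blocked until the window expires.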
-
The GFW's QUIC censor does not reassemble QUIC client Initial packets that are split across multiple UDP datagrams, nor does it reassemble QUIC CRYPTO frames split within a single datagram. Three practical bypasses follow: (1) send any UDP datagram with a random payload before the QUIC Initial—the GFW uses 60-second UDP flow state and won't inspect a mid-flow packet; (2) fragment the TLS ClientHello SNI across multiple QUIC CRYPTO frames; (3) use an unknown QUIC version number in the first packet (Version Negotiation bypass, payload undecryptable). Chrome independently exploits (2) through its Chaos Protection feature (since 2021) and post-quantum Kyber key-agreement (since v124, Sep 2024), whose larger key sizes force fragmentation across UDP datagrams. As of January 2025, the GFW also does not block ECH-containing QUIC payloads unless the outer (cleartext) SNI is on the blocklist.
-
WATER (WebAssembly Transport Executables Runtime) defines a pluggable-transport architecture in which the transport logic is compiled to a WASM module that is loaded and executed at runtime by a thin Go host process. This separates the stable host ABI (dial, accept, read, write) from the rapidly evolving transport logic, allowing new or updated transports to be delivered as small WASM binaries without recompiling or redeploying the host application.
-
Testing from a VPS in Iran showed that standard DTLS handshakes are blocked at that vantage point, but Oscur0 avoids this blocking by transmitting only Application Data packets (with Connection ID extension per RFC 9146) after the initial one-shot setup packet, never completing a visible DTLS handshake. A proof-of-concept was implemented in approximately 600 lines of Go using the pion/dtls library.
-
Oscur0 eliminates Conjure's separate registration phase by steganographically encoding ECDH public key, phantom IP, and transport parameters into the encrypted application data of the first UDP (DTLS 1.2 with Connection ID) packet sent to the phantom IP, using Elligator encoding to make the public key indistinguishable from random bytes. This removes several round trips — registration, TCP handshake, and application handshake — compared to standard Conjure, and means censors cannot block the scheme by blocking registration alone.
-
Registration-dependent Refraction Networking schemes such as Conjure create multiple single points of failure: censors can block registration channels independently of phantom connections. Domain fronting, a primary registration channel, has been progressively banned by major CDNs — Microsoft Azure in 2021 and Fastly in early 2024 — reducing its viability as a covert registration mechanism.
-
Prior circumvention transports that tunneled over VoIP or voice-conferencing software were identifiable to censors by their TCP retransmission fingerprint: real VoIP applications do not retransmit dropped packets in the same way, making the covert channel's reliability mechanisms a distinguishing artifact. DTLS and QUIC avoid this because they natively support both fault-tolerant and sequential delivery modes without external indicators of which mode is active.
-
WATER (WebAssembly Transport Executables Runtime) separates transport logic from the host application by compiling it to a WASM module (WATM) that is distributed and loaded independently at runtime. Deploying a new or updated circumvention technique requires only distributing the new WATM binary and optional configuration — no change to the host application and no app-store update cycle is required.
-
Traditional circumvention tool development and deployment is slow because new strategies must be developed, integrated into each tool separately, and then distributed via platform app-stores. WATER's WASM module architecture specifically addresses this asymmetry: censors evolve blocking techniques quickly, while circumventors are bottlenecked by binary release cycles. The paper argues that dynamic WATM delivery breaks this bottleneck by decoupling transport updates from application releases.
-
A protocol-agnostic classifier that identifies RFC-mandated TCP behaviors (three-way handshake, 500ms ACK, 2×RMSS acknowledgement) leaking through UDP-based VPN tunnels achieves a false positive rate of 0.11–0.29% on real campus traffic, an order of magnitude lower than ML-based VPN detection techniques (FPR 1.4–5.5%) and on par with the GFW's estimated heuristic FPR of 0.6%.
-
GFWeb discovered that the GFW's bidirectional blocking is not symmetric: certain domains trigger blocking only when probed from inside China, not from outside. This overturns the prior assumption that the GFW blocks the same domains symmetrically in both directions. The paper also documents that the GFW has been upgraded to fix previously-reported evasion techniques, including overblocking mitigation and improved fragmented-packet reassembly, indicating active engineering iteration on the censor side.
-
GFWeb tested 1.02 billion domains against the GFW over 20 months and discovered 943,000 pay-level domains blocked by HTTP filters and 55,000 by HTTPS filters — the largest GFW blocklist dataset ever published. The HTTP-to-HTTPS ratio (17:1) confirms that the GFW's HTTPS keyword-based and SNI-based blocking covers far fewer domains than its HTTP host-header blocking, likely because HTTPS blocks carry higher collateral-damage risk.
-
Longitudinal GFWeb data spanning 20 months shows the GFW actively patched previously-published evasion findings during the measurement period: overblocking bugs reported in academic papers were fixed, and fragmented-packet reassembly failures that researchers used to bypass blocking were corrected. This demonstrates that the GFW operator monitors published research and iterates on the system in response to disclosed vulnerabilities.
-
Chivo Wallet posts logs of every in-app event to NewRelic ('log-api.newrelic.com'), including keystrokes — DUI national ID numbers, phone numbers, and passwords — without privacy-policy disclosure. Separately, MiTelcel (76% Mexican mobile market share, 10M+ downloads) leaks users' phone numbers and emails to five distinct third-party servers via the HTTP 'referer' field on every 'Experiencias' tab click.
-
In Latin America, censorship predominantly takes the form of targeted surveillance coupled with physical threats rather than network-level blocking. In Mexico, Pegasus infections of journalists and activists were documented between 2019 and 2022, at least 25 private spyware vendors sold surveillance tools to federal and state police, and at least 119 journalists have been killed since 2000. Dynamic analysis of 8 widely-used LATAM apps (combined 100M+ downloads) found security failures across all three assessed categories: cleartext traffic, undisclosed PII exfiltration to third parties, and unvetted external code update mechanisms.
-
MiClaro Colombia sends device latitude and longitude to multiple third-party servers without user disclosure, in violation of its own privacy policy. Among the four Movistar country variants, the Argentina app requests access to all phone-call-related permissions while the Uruguay app requests none — demonstrating that third-party SDK inclusion, background receivers, and dangerous permissions vary substantially by country version of the same ostensibly unified telco app.
-
The SAT Móvil app (Mexico's official tax service, 1M+ downloads) consistently fetches its 'Chat' page over cleartext HTTP, exposing citizen ID numbers (CURP, RFC), passwords, and tax documents to any in-path attacker. None of the four major Latin American telco apps (MiTelcel, MiTigo, MiClaro, MiMovistar) implement HSTS on SMS-delivered external links, making all of them uniformly vulnerable to SSL strip downgrade attacks.
-
Censors employing deep learning can use DTLS connection duration as a precise identifier to classify and block Snowflake traffic. The paper proposes switching PT connections after a variable time limit as a countermeasure to prevent duration-based classification.
-
China's Great Firewall showed anomalous inconsistency: 13 test vectors produced mixed outcomes—TCP RST injection on some executions and a clean server response on others—with circumvention rates between 10% and 35% across 100 executions per vector. The authors attribute this to heterogeneous GFW infrastructure components applying different HTTP parsing logic, a departure from the GFW's usual consistency.
-
Of 4,488 total HTTP Request Smuggling test vectors, 2,015 (44.9%) were accepted by at least one web server. CL*/TE vectors had a 99.0% acceptance rate (1,103/1,114); TE*/CL had 76.0% (859/1,130); CL/TE* had only 4.7% (53/1,130); and TE/CL* had 0%. Nginx 1.25.2 accepted 1,315 vectors while Apache 2.4.57 accepted only 11, reflecting HRS countermeasures added in Apache 2.4.25 and 2.4.52.
-
HTTP Request Smuggling—a web-security vulnerability that exploits CL/TE header parsing ambiguities between a front-end (censor) and back-end (web server)—can be systematically repurposed as a censorship circumvention technique. By hiding a censored Host in the body of a benign outer request, the censor parses only the uncensored outer request while the destination server processes both, successfully bypassing HTTP censorship in China (19 vectors), Iran (254 vectors), and Russia (all 2,015 vectors) from the evaluated vantage points.
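The parser divergence can be illustrated with two toy parsers that disagree on whether Content-Length or Transfer-Encoding frames the body. In this sketch the CL-preferring parser stands in for the censor and the TE-preferring parser for the server. This is one illustrative arrangement, not a vector taken from the paper; the host names and function names are hypothetical. Real servers and middleboxes handle conflicting framing headers in many different ways.

```python
def build_vector(outer_host: str, inner_host: str) -> bytes:
    """Outer request whose body hides a second request after a zero chunk."""
    inner = f"GET / HTTP/1.1\r\nHost: {inner_host}\r\n\r\n".encode()
    body = b"0\r\n\r\n" + inner  # zero-size chunk ends the body under TE
    head = (
        f"POST / HTTP/1.1\r\nHost: {outer_host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Transfer-Encoding: chunked\r\n\r\n"
    ).encode()
    return head + body


def skip_chunked(body: bytes) -> bytes:
    """Consume a chunked body and return whatever follows it."""
    while True:
        size_line, _, body = body.partition(b"\r\n")
        size = int(size_line.decode(), 16)
        if size == 0:
            return body[2:]  # drop the CRLF that closes the last chunk
        body = body[size + 2:]


def hosts_seen(stream: bytes, prefer: str) -> list:
    """Host header of each request a toy parser extracts from the stream."""
    hosts = []
    while stream:
        head, _, rest = stream.partition(b"\r\n\r\n")
        headers = {}
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        hosts.append(headers.get(b"host", b"").decode())
        chunked = headers.get(b"transfer-encoding", b"").lower() == b"chunked"
        if chunked and prefer == "TE":
            stream = skip_chunked(rest)
        else:
            stream = rest[int(headers.get(b"content-length", b"0").decode()):]
    return hosts


vector = build_vector("allowed.example", "blocked.example")
print(hosts_seen(vector, prefer="CL"))  # view of the CL-preferring parser
print(hosts_seen(vector, prefer="TE"))  # view of the TE-preferring parser
```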
-
Iran's censor contains an implementation bug: when the Content-Length header carries an invalid (non-integer) value and a Transfer-Encoding header is also present, the censor gracefully skips the invalid CL value and attempts to parse subsequent traffic, but fails to correctly interpret the TE header—causing it to pass the smuggled (censored) request. This bug enabled 254 of 2,015 evaluated test vectors to bypass Iranian censorship, all using the CL*/TE or CL/TE* vector types.
-
Russia's censor (at the Moscow/ASN-50867 vantage point) inspects only the first HTTP request in the first TCP segment of each TCP stream and never analyzes subsequent HTTP requests, whether in the same TCP packet or a later one. As a result, all 2,015 accepted test vectors evaded censorship, and the bypass is achievable with standards-compliant HTTP (e.g., whitespace or case variations in header names, which HTTP/1.1 explicitly permits).
-
TCP-compliant packet alphabets are insufficient for modeling stateful firewall evasion. Including non-TCP-compliant traffic — specifically flipped-direction SYNs, out-of-window seq/ack numbers, and packets that form a parallel TCP connection in the reverse direction — is what unlocks discovery of deep attack paths. Prior model-inference work (Alembic) that restricted itself to compliant sequences produced models incapable of generating any of the 6,000+ attacks Pryde found.
-
Pryde generates more than 6,000 successful and unique evasion attacks against 4 popular stateful firewalls, which is 2–3 orders of magnitude higher than censorship circumvention algorithms (e.g., Geneva) and black-box fuzzing. The gap arises because circumvention tools only uncover shallow evasion sequences and cannot systematically explore the full attack-state space.
-
China's GFW exhibited unusually inconsistent HTTP censorship behavior: 13 of the evaluated HRS test vectors circumvented the GFW in some executions but not others, with per-vector success rates between 10% and 35% across 100 executions per domain. The authors attribute this to two distinct parts of GFW infrastructure employing different HTTP censorship mechanisms, a departure from the GFW's typical consistency.
-
HTTP request smuggling (HRS) vectors that exploit CL/TE header parsing divergence between a censor-as-middlebox and a destination web server can circumvent HTTP censorship in China, Iran, and Russia. Of 4,488 test vectors derived from prior HRS research, 2,015 (44.9%) were accepted by at least one web server; CL*/TE vectors achieved a 99.0% web-server acceptance rate while TE/CL* vectors achieved 0%.
-
Iran's censor injects an HTTP block page consistently but contains an implementation bug: it fails to parse the TE header when a CL header with an invalid (non-integer) value is present, causing it to pass subsequent traffic. 254 of the evaluated test vectors circumvented Iran's censor; the 'Wrapping' CL*/TE strategy (e.g., 'Content-Length: <len>\u00FF\x0aX: X') was especially effective, exploiting this graceful-degradation fault.
-
The Russian censor at the tested Moscow vantage point (ASN 50867, China Unicom-equivalent private ISP) inspects only the first HTTP request in the first TCP segment of a TCP stream and never blocks a second HTTP request, whether coalesced in the same TCP packet or sent in a subsequent one. All 2,015 web-server-accepted test vectors evaded Russian censorship, including standards-compliant whitespace-injection vectors (e.g., 'Content-Length\x20: <len>\x20').
-
Web security vulnerabilities whose exploitation depends on parser divergence between two co-located systems are structurally isomorphic to censorship circumvention attacks, where the censor acts as the frontend parser and the destination server as the backend. The authors demonstrated this by directly converting all HRS test vectors from prior security research into circumvention probes with no modification, showing that censorship-circumvention techniques can be systematically constructed from existing vulnerability corpora.
-
TLS-Attacker's Workflow Traces and Modifiable Variables mechanisms allow testers to specify arbitrary protocol flows and apply field-level modifications — including adding, removing, or overwriting individual TLS message fields — without breaking the internal TLS state machine. This makes it the standard instrument for probing how DPI systems and active-probing detectors respond to non-standard or mutated TLS handshakes.
-
Amazon SQS routes client traffic through a single fixed HTTPS endpoint (https://sqs.us-east-1.amazonaws.com), making it infeasible for a censor to distinguish circumvention-bound SQS traffic from legitimate AWS service traffic; blocking this signaling channel would require blocking all Amazon SQS, imposing significant collateral damage on businesses and developers.
-
The GFW's DNS packet injector (Injector 3, identified by TTL mirroring and zero IP ID) contained an out-of-bounds read vulnerability: due to missing label-length and null-terminator validation, malformed DNS requests caused the injector to copy adjacent stack memory into forged responses. Over three days in October 2023, researchers collected over 1 TB of data containing over 13 billion leaks, ~87.43% with non-duplicate content, including live Internet traffic transiting China's backbone and stack frames of the GFW's packet-handling processes.
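The class of check the summary describes as missing (validating each label length and the terminating null byte against the message bounds) can be sketched as a defensive QNAME parser. This is not the GFW's code. The function name is hypothetical, and compression pointers are rejected for simplicity.

```python
def parse_qname(msg: bytes, offset: int = 12):
    """Parse an uncompressed DNS QNAME starting after the 12-byte header.

    Every length byte is validated against the message size before any
    label bytes are read. Returns (name, offset after the null terminator).
    """
    labels = []
    while True:
        if offset >= len(msg):
            raise ValueError("missing null terminator")
        length = msg[offset]
        offset += 1
        if length == 0:
            return ".".join(labels), offset
        if length > 63:
            raise ValueError("label too long or compression pointer")
        if offset + length > len(msg):
            raise ValueError("label length runs past end of message")
        labels.append(msg[offset:offset + length].decode("ascii", "replace"))
        offset += length


header = bytes(12)  # placeholder DNS header
query = header + b"\x07example\x03com\x00" + b"\x00\x01\x00\x01"
print(parse_qname(query))

try:
    # Length byte claims 63 bytes, but only 3 are present.
    parse_qname(header + b"\x3fabc")
except ValueError as err:
    print("rejected:", err)
```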
-
Automated pattern analysis of 13 billion leaked GFW memory frames found over 52.8 million HTTP/1.x protocol signatures, 984,567 Authorization headers, 1.9 million Cookie headers, 79,090 password-in-URL occurrences, and 59,326 SMTP/IMAP plaintext credential sequences — yielding over 3 million pieces of potentially sensitive data collected at a deliberately limited rate of 5,000 exploit packets per second.
-
Analysis of leaked stack frames confirmed the GFW's packet injector processes run on x86-64 Linux with ASLR and PIE enabled but without stack canaries, implying that buffer overflow vulnerabilities in the GFW may lack effective mitigation. Each injector process was inferred to use exactly four packet-handling threads, identified by up to four unique stack-address groups per return address (each group spanning within the 8 MB default Linux stack size).
-
GFW verification tests confirmed over 90% of OONI-detected DNS anomalies as true blocks: 429/457 domains in Beijing and 422/461 in Shanghai. In total, 527 unique domains were confirmed censored via DNS, HTTP, and HTTPS filters; an additional 718 domains were suspected to be blocked through IP-address-level blocking of their hosting servers rather than domain-level entries.
-
CenDTect, an unsupervised decision-tree system using iterative parallel DBSCAN, analyzed more than 70 billion Censored Planet data points (January 2019 – December 2022) and discovered 15,360 HTTP(S) censorship event clusters across 192 countries and 1,166 DNS event clusters across 77 countries. Manual validation against 38 known censorship events from news reports confirmed all human-identified events were recoverable from CenDTect's output. The system additionally identified more than 100 ASes in 32 countries with persistent ISP-level blocking and 11 temporary blocking events in 2022 correlated with elections, protests, and armed conflict.
-
The system uses a shared Pub/Sub topic for all users, where session IDs (SIDs) are visible to all subscribers on the broker topic. The paper argues this does not compromise user anonymity because SIDs are randomly generated per-session by client-side software with no link to user identity, and all subsequent bridge-info payloads are encrypted under a session-specific symmetric key exchanged via asymmetric encryption.
-
State-of-the-art ML-based obfs4 detection (Wang et al. decision tree) achieves 97% precision at equal base rates (λ=1) but precision collapses to 3% at a still-conservative λ=1,000; at λ=10⁶ precision approaches zero for all classifiers tested. This base-rate failure was previously uncharacterized because prior evaluations only considered balanced or near-balanced datasets.
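The collapse follows from base-rate arithmetic. Assuming λ denotes the ratio of benign flows to obfs4 flows and that the classifier's true- and false-positive rates do not change with λ (both are assumptions about the paper's setup), precision is TPR / (TPR + λ·FPR). The sketch below derives the false-positives-per-true-positive ratio from the balanced-case precision and rescales it; the function name is hypothetical.

```python
def precision_at_base_rate(precision_balanced: float, lam: float) -> float:
    """Precision when benign flows outnumber target flows by a factor lam.

    precision_balanced is the precision measured at lam = 1, from which the
    ratio FPR/TPR = (1 - p) / p is recovered.
    """
    fp_per_tp = (1 - precision_balanced) / precision_balanced
    return 1 / (1 + lam * fp_per_tp)


for lam in (1, 1_000, 1_000_000):
    print(lam, f"{precision_at_base_rate(0.97, lam):.5f}")
```

Under these assumptions, 97% precision at λ=1 corresponds to roughly 0.031 false positives per true positive, which gives about 3% precision at λ=1,000, in line with the figures quoted above.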
-
Obfuscated proxy traffic (including Shadowsocks, VMess, VLESS, Trojan, obfs4, and REALITY) can be reliably fingerprinted by detecting encapsulated TLS handshakes — the inner TLS ClientHello that appears inside an outer encrypted tunnel. This fingerprint is protocol-agnostic: any proxy that wraps TLS-bearing application traffic will produce it. The authors deployed a similarity-based classifier within a mid-size ISP serving over one million users and demonstrated detection with minimal collateral damage.
-
While stream multiplexing reduces the visibility of encapsulated TLS handshakes by merging inner connections, the paper cautions that multiplexing plus random padding alone is "inherently limited" as a long-term countermeasure. Censors can adapt by monitoring burst sizes and round-trip counts at the outer-connection level, which remain correlated with the number of inner TLS sessions regardless of padding.
-
TSPU devices perform in-line packet manipulation — they can inject RST packets, drop traffic, and throttle connections — rather than routing traffic to an out-of-band sniffer that votes to block. The inline placement means TSPU can act on the first-packet payload and impose latency on all matching flows, not only on those selected by sampling. Blocking decisions are therefore applied with high recall at the ISP edge, and circumvention tools that rely on short observation windows (e.g. only obfuscating the first N bytes) are vulnerable to continued inline inspection of subsequent traffic.
-
Russia's TSPU ("Технические средства противодействия угрозам", technical means of countering threats) system is deployed inline at individual ISP edges rather than at centralized internet exchange points, producing substantial per-ISP heterogeneity: some providers apply layer-7 SNI/Host filtering while others rely primarily on IP-prefix blocklists, and QUIC/HTTP3 is blocked at several major providers. Rollout timing and enforcement depth vary measurably across autonomous systems, meaning a single "Russia passes/fails" test fixture systematically underestimates blocking coverage.
-
Geneva packet-manipulation probing traffic exhibits distinctive features — corrupt data-offset fields, smaller packet sizes, overlapping TCP segments, TTL variance, and non-zero SYN packets — that allow simple ML classifiers (Decision Trees, Random Forests, Logistic Regression, SVM) to detect it with AUC > 0.99. A subsequent TRW-based IP-level detector can then block the source IP with high confidence after inspecting only 2 Geneva probing flows.
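A threshold random walk (TRW) detector applies sequential hypothesis testing: each per-flow verdict multiplies a likelihood ratio, and the source IP is blocked once the ratio crosses an upper threshold. The sketch below shows the mechanism only. All four parameters (per-flow TPR and FPR, and the error bounds alpha and beta) are illustrative assumptions rather than the paper's values, and the lower "benign" threshold is omitted.

```python
def flows_until_block(flags, tpr=0.99, fpr=0.01, alpha=0.001, beta=0.99):
    """Return the 1-based flow index at which the source would be blocked.

    flags: iterable of booleans, True if the per-flow classifier flagged the
    flow as probing traffic. Returns None if the threshold is never crossed.
    """
    upper = beta / alpha  # standard TRW upper threshold
    ratio = 1.0
    for i, flagged in enumerate(flags, start=1):
        # Likelihood of this verdict under "prober" vs. "benign".
        ratio *= (tpr / fpr) if flagged else ((1 - tpr) / (1 - fpr))
        if ratio >= upper:
            return i
    return None


print(flows_until_block([True, True, True]))
print(flows_until_block([True, False, True, True]))
```

With these assumed parameters, one flagged flow multiplies the ratio by about 99, so two flagged flows exceed the threshold of about 990.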
-
Replacement-based covert channels that substitute genuine media streams with ciphertext (Protozoa replacing WebRTC video, Balboa replacing audio) are immediately detectable when the censor controls or has plaintext access to the protocol gateway — for example, a WebRTC relay that decrypts and validates incoming media. Censors can also systematically suppress these channels by selectively degrading or blocking encrypted traffic for which they have no decryption trapdoor.
-
Protocol mimicry that replicates only statistical or syntactic traffic properties is insufficient for unobservability: Houmansadr et al. (2013) showed SkypeMorph was trivially detectable by the absence of Skype control channels, missing login-server communication, and failure to replicate implementation-specific bugs present in real Skype—demonstrating that full behavioral replication, not just traffic shaping, is required to withstand scrutiny.
-
HTTP-based blocking is the dominant censorship technique across Indian ISPs, observed in 64 of 71 measured ASes. However, the authors note it is largely ineffective because over 90% of web connections now use HTTPS, meaning ISPs cannot inspect the Host header for the vast majority of traffic, so HTTP blocking is bypassed by any HTTPS client.
-
HTTP/URL/keyword filtering was the most prevalent censorship method both during the measurement period (49% of countries) and historically (69%), despite 82% global HTTPS adoption. The authors attribute this persistence to censors lacking technical sophistication to upgrade, and to uneven HTTPS adoption leaving older methods effective in underserved regions.
-
Protocol fingerprinting — including DPI-based identification of VPNs, circumvention tools, and E2EE messengers — was active in only 6% of countries during the measurement period (13% all-time), but all confirmed instances came from focused individual studies, not from mass measurement platforms like OONI or Censored Planet. The authors flag encrypted traffic analysis (ETA) tools and next-generation firewalls (NGFWs) capable of blocking Signal or Tor Browser as an emerging threat to freedom of expression.
-
Residual censorship, where a censor detects an objectionable connection via one method and then blocks all traffic matching the same 3-tuple (client IP + server IP + port) or 4-tuple (client IP + port + server IP + port) for a short duration, was documented in China, Iran, and Kazakhstan. This means a single detected circumvention attempt can trigger temporary blocking of every connection that matches the tuple, including innocuous follow-up connections.
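
The difference between 3-tuple and 4-tuple residual blocking can be sketched with a small in-memory model. This is an illustration only: the class name, the 90-second duration, and the explicit `now` argument are assumptions made for the example, not values reported by the study.

```python
class ResidualBlocklist:
    """Toy model of residual censorship keyed on a 3-tuple or 4-tuple."""

    def __init__(self, duration, use_client_port=False):
        self.duration = duration              # seconds; hypothetical value
        self.use_client_port = use_client_port
        self._expiry = {}                     # tuple key -> time the block ends

    def _key(self, client_ip, client_port, server_ip, server_port):
        if self.use_client_port:              # 4-tuple mode
            return (client_ip, client_port, server_ip, server_port)
        return (client_ip, server_ip, server_port)  # 3-tuple mode

    def trigger(self, client_ip, client_port, server_ip, server_port, now):
        """Record that an objectionable connection was seen at time `now`."""
        key = self._key(client_ip, client_port, server_ip, server_port)
        self._expiry[key] = now + self.duration

    def is_blocked(self, client_ip, client_port, server_ip, server_port, now):
        """True while a matching residual block is still active."""
        key = self._key(client_ip, client_port, server_ip, server_port)
        return self._expiry.get(key, float("-inf")) > now


three = ResidualBlocklist(duration=90)
three.trigger("10.0.0.2", 40000, "203.0.113.5", 443, now=0)
# A fresh connection from a new source port is still blocked in 3-tuple mode.
print(three.is_blocked("10.0.0.2", 40001, "203.0.113.5", 443, now=10))
```

In 4-tuple mode the same follow-up connection would pass, because the new source port produces a different key.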
-
TLS-based filtering (SNI blocking) was active in 41% of 70 surveyed countries during the June 2020–May 2021 measurement period and 44% historically, driven by the 82% global HTTPS adoption rate (Mozilla telemetry, Oct 2021). China took the unprecedented step of blocking ESNI traffic entirely, and the authors note that widespread ECH deployment could render this entire censorship category obsolete.
-
The GFW's SNI inspection is a stateless single-record parser: it cannot detect the SNI extension when the ClientHello is split across multiple TLS records, even when all records are contained within the same TCP segment. In contrast, the GFW does detect SNI when it appears fully within the first TCP segment despite TCP fragmentation, indicating the reassembly gap is specific to the TLS record layer.
-
TCP fragmentation before the SNI extension circumvents the GFW, but TCP fragmentation placing the SNI in the first TCP segment does not. The paper notes the GFW is showing 'the first signs of successfully handling TCP fragmentation,' indicating active hardening of TCP-layer circumvention that makes TLS-layer techniques increasingly necessary.
-
TLS record fragmentation successfully circumvents the GFW in all tested configurations: splitting the ClientHello across multiple TLS records — whether the split falls before or after the SNI extension — bypasses GFW SNI-based blocking in every case (Table 1). TCP fragmentation after the SNI extension fails, but any TLS-layer fragmentation succeeds.
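
A minimal sketch of TLS record fragmentation, assuming the input is a single complete TLS record (5-byte header plus payload). The function name and the toy payload are illustrative; a real ClientHello would come from a TLS stack, and this sketch only re-frames bytes without sending anything.

```python
import struct


def fragment_tls_record(record: bytes, split_at: int) -> bytes:
    """Re-frame one TLS record as two records carrying the same payload."""
    if len(record) < 5:
        raise ValueError("need a full 5-byte TLS record header")
    # Header layout: content type (1 byte), version (2 bytes), length (2 bytes).
    content_type, version, length = struct.unpack("!B2sH", record[:5])
    payload = record[5:5 + length]
    if len(payload) != length:
        raise ValueError("record is truncated")
    if not 0 < split_at < length:
        raise ValueError("split_at must fall strictly inside the payload")
    parts = (payload[:split_at], payload[split_at:])
    # Each fragment gets its own header with the original type and version.
    return b"".join(
        struct.pack("!B2sH", content_type, version, len(part)) + part
        for part in parts
    )


toy = bytes([0x16, 0x03, 0x01, 0x00, 0x08]) + b"ABCDEFGH"
print(fragment_tls_record(toy, 3).hex())
```

Both records can still travel in one TCP segment, which matches the case described above where the split is at the TLS layer rather than the TCP layer.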
-
Using Geneva (genetic algorithm censorship evasion), five new evasion strategies were discovered that defeat Turkmenistan's censorship at both transport and application layers across DNS, HTTP, and HTTPS. The strategies exploit Turkmenistan's use of a commercial DPI box ("Golden DPI" by Qurium) and can be applied server-side without requiring changes to censored users' client software.
-
The paper introduces TMC, a remote measurement tool that infers domain-blocking status across DNS, HTTP, and HTTPS without requiring in-country vantage points, which are hard to obtain in a country of 6 million people with only 38% Internet penetration. TMC enabled the largest Turkmenistan censorship measurement to date by exploiting middlebox reflection properties observable from outside the country.
-
The largest measurement study of Turkmenistan censorship to date tested 15.5 million domains and found more than 122,000 domains censored using separate blocklists for DNS, HTTP, and HTTPS. Reverse-engineering the blocking rules revealed approximately 6,000 over-blocking rules that cause incidental filtering of more than 5.4 million additional domains — a 44x collateral damage ratio relative to intentionally blocked domains.
-
AS201776 (Miranda-Media Ltd) is responsible for the largest volume of Russian transit censorship by destination IP count, affecting approximately 16,000 IP addresses in Ukraine from US and Sydney vantage points. AS3216 (PJSC Vimpelcom) has the widest geographic reach—delivering blockpages for traffic destined to 8 countries—but impacts no more than 1,000 IP addresses per country from any single vantage point.
-
Scanning the IP address spaces of 18 countries surrounding Russia, the authors identify Russian transit censorship affecting at least 8 countries (Afghanistan, Azerbaijan, Kyrgyzstan, Kazakhstan, Lithuania, South Korea, Tajikistan, and Ukraine), attributable to at least 6 Russian ASes. Only 2 of these 8 countries (Kyrgyzstan and Kazakhstan) had been reported in prior work, and the collateral damage is characterized as a lower bound due to the study's blockpage-only methodology.
-
The study's three vantage points (US university, AWS Sydney, AWS Tokyo) produce substantially different transit censorship observations: the US vantage point detects blockpages in all 8 affected countries, while Sydney and Tokyo detect transit censorship only in Kazakhstan and Ukraine. This variance is attributed to routing path differences across vantage points, confirming that transit censorship coverage is highly path-dependent.
-
AS60299 (Mezhdugorodnyaya Mezhdunarodnaya Telefonnaya Stanciya Ltd) and AS201776 (Miranda-Media Ltd) deploy commercial DPI technology manufactured by Russian company VAS Experts to perform transit censorship. Ukraine is subject to transit censorship by the most Russian ASes (at least 5: AS3216, AS25227, AS35816, AS47203, AS201776), likely due to post-2022 re-routing of Ukrainian Internet traffic through Russian telecommunications infrastructure.
-
Manual analysis of 700+ unique packet groupings from possibly tampered connections yielded 19 high-confidence tampering signatures (up from 6 in prior work) covering 86.9% of all possibly tampered connections. Post-SYN signatures account for 43.2% of possibly tampered connections (99.5% matching a known signature), post-ACK for 16.1% (98.7%), and post-first-data-packet (PSH+ACK) for 5.3% (97.9%). Table 1 describes each signature as a flag-sequence pattern of the form ⟨X→Y⟩.
-
Censoring middleboxes predominantly use RST injection rather than in-path packet dropping because injecting forged RST/RST+ACK packets does not require the middlebox to sit in the data path — off-path copies of packets suffice. The GFW specifically injects both RST and RST+ACK packets simultaneously after an offending PSH, a known idiosyncratic signature, while Iran's censor uses post-handshake RST injection (⟨SYN;ACK→RST⟩) and packet drops at the same stage.
-
Following the invasion, Psiphon user counts and VPN usage in Russia increased many-fold and correlated with specific censorship events, while multiple access paths to Tor (direct connections, bridges, pluggable transports) were progressively blocked. Despite this surge, circumvention tools reached only a small fraction of all Russian Internet users, indicating that aggressive multi-vector blocking and lack of user awareness left most people unable to access censored resources.
-
OONI data shows anomaly rates in Russia's top five ASes (including Rostelecom AS12389, Vimpelcom AS8402) rose from roughly 7–11% in January and early February 2022 to 12–21% in mid-March 2022, with social-media and news domains such as Facebook, Twitter, Instagram, and BBC going from available to near-completely blocked after the invasion.
-
FSK-encoded Dolphin audio is distinguishable from normal human speech via offline amplitude analysis: Dolphin's mean signal amplitude is 0.4 (std 224) versus 205 (std 1590) for natural speech — approximately an order of magnitude lower — enabling classification by a telecom operator who records calls. The paper also notes that standard CRC checksums appearing periodically every chunk provide a unique detectable signature if the adversary attempts to decode the audio.
-
The proposed crowdsourced system runs multiple isolated Geneva training pools on a controlled server — one pool per censorship system (initially China and Iran) — and instructs volunteer browsers via JavaScript to send forbidden requests to isolated ports, with no download or software installation required from the user. The server monitors per-strategy success or failure to drive genetic evolution entirely from the server side.
-
Browsers cannot independently set the HTTP Host header or TLS SNI field, which rules out the standard censorship-trigger methods used in Geneva training. The paper proposes two workarounds: (1) keyword-based HTTP censorship triggers using forbidden strings in URL parameters, limited to censors that employ keyword filtering; and (2) registering domains whose names contain a censored substring to exploit overblocking by overbroad regular expressions (e.g., a rule meant for torproject.org that also matches mentorproject.org).
-
Server-side censorship evasion strategies require zero client-side changes: clients bypass censorship without installing software or even being aware of the evasion, and this approach has been adopted in production tools including Psiphon's packetman. The packet manipulations exploit weaknesses in how censors track or tear down TCP connections, occurring entirely at the server during the three-way handshake.
-
All existing automated server-side strategy discovery tools — Geneva, Alembic, and SymTCP — require researcher control of a client during training, even when the discovered strategies are deployed exclusively server-side. This dependency makes it infeasible to train against censors in networks where researchers cannot place a controlled machine.
-
Relying on third-party email providers to verify users was demonstrated by Ling et al. to leave Tor's BridgeDB vulnerable to censors capable of creating multiple accounts, enabling bridge enumeration via sock-puppet attacks at scale. Active and passive detection techniques — including traffic flow analysis, DPI, website fingerprinting, and active probing — have been demonstrated in prior work to reveal Tor bridges, making Tor inaccessible for the majority of users in some regions.
-
The GFW detects Shadowsocks by flagging apparently high-entropy connections that are not TLS or HTTP, but this detection is brittle: connections are explicitly allowed if the first 6 bytes of the first packet of a flow are all printable ASCII characters (range 0x20–0x7E). Adding a 6-byte alphanumeric preamble to the Shadowsocks message definition is sufficient to bypass this heuristic and requires only a short patch to the protocol specification file.
-
ShadowTLS relays are detectable via three active probing techniques exploiting behavioral discrepancies from the mask sites they mimic: (1) responding to plaintext HTTP on port 443 with FIN-ACK rather than an error (only 17% of TLS servers share this behavior), (2) silently ignoring non-TLS record data post-handshake rather than sending a fatal alert (only 0.14% of 30M hosts behaved this way), and (3) silently ignoring corrupted TLS Application Data records rather than sending a bad_record_mac alert (only 0.12% of hosts silent).
-
ShadowTLS is structurally limited to TLS 1.2 because in TLS 1.3 the Finished message is sent as encrypted Application Data (record type 0x17), preventing the relay from detecting handshake completion without decrypting the session. This forces ShadowTLS to advertise TLS 1.2, which is an increasingly anomalous fingerprint as TLS 1.3 adoption grows.
-
The GFW's fully-encrypted detector (deployed Nov 2021) operates by exempting likely-benign traffic and blocking the rest. Five inferred exemption rules applied to the first TCP payload (pkt): Ex1 — popcount(pkt)/len(pkt) ≤ 3.4 or ≥ 4.6 (bits/byte); Ex2 — first 6+ bytes are printable ASCII [0x20–0x7e]; Ex3 — more than 50% of bytes are printable ASCII; Ex4 — more than 20 contiguous printable ASCII bytes; Ex5 — first bytes match TLS or HTTP fingerprint. Traffic failing all five exemptions is blocked. Experiments confirmed all rules still held as of February 2023.
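
The five exemption rules can be sketched as a small classifier over the first TCP payload. Ex1 through Ex4 follow the thresholds quoted above; the Ex5 check is a placeholder, since the exact TLS and HTTP fingerprints are not given here, and all function names are illustrative.

```python
def is_printable(b: int) -> bool:
    return 0x20 <= b <= 0x7E


def longest_printable_run(pkt: bytes) -> int:
    best = cur = 0
    for b in pkt:
        cur = cur + 1 if is_printable(b) else 0
        best = max(best, cur)
    return best


def looks_like_tls_or_http(pkt: bytes) -> bool:
    # Placeholder for Ex5 (assumption): a TLS handshake record prefix
    # or a handful of common HTTP verbs.
    verbs = (b"GET ", b"POST ", b"PUT ", b"HEAD ")
    return pkt[:2] == b"\x16\x03" or pkt.startswith(verbs)


def exemptions(pkt: bytes) -> list:
    """Return the names of the exemption rules that the payload satisfies."""
    if not pkt:
        raise ValueError("empty payload")
    hits = []
    bits_per_byte = sum(bin(b).count("1") for b in pkt) / len(pkt)
    if bits_per_byte <= 3.4 or bits_per_byte >= 4.6:
        hits.append("Ex1")
    if len(pkt) >= 6 and all(is_printable(b) for b in pkt[:6]):
        hits.append("Ex2")
    if sum(is_printable(b) for b in pkt) > 0.5 * len(pkt):
        hits.append("Ex3")
    if longest_printable_run(pkt) > 20:
        hits.append("Ex4")
    if looks_like_tls_or_http(pkt):
        hits.append("Ex5")
    return hits


def would_block(pkt: bytes) -> bool:
    """Traffic is blocked only when no exemption applies."""
    return not exemptions(pkt)


random_like = bytes([0x0F, 0xF0] * 50)       # 4.0 bits set per byte, no ASCII
print(would_block(random_like))              # no exemption applies
print(exemptions(b"abcdef" + random_like))   # printable 6-byte prefix
```

The second call illustrates why a short printable prefix is enough to exempt an otherwise random-looking payload under this model.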
-
PushProxy's high-frequency downstream channel generates over 100 push notifications to load a typical webpage, contrasting sharply with the daily average of 46 push notifications received by a smartphone. This statistical anomaly makes PushProxy flows identifiable by simple rule-based filters without requiring sophisticated traffic analysis.
-
Starting October 3, 2022, more than 100 users reported simultaneous blocking of TLS-based circumvention servers running Trojan, Xray, V2Ray TLS+WebSocket, VLESS, and gRPC. Blocking was port-specific initially (mainly port 443, but also non-443 ports), then escalated to full IP blocking when users switched ports. Domain names were not added to DNS or SNI blocklists. naiveproxy was notably not affected. The blocking was dynamic in at least some cases (browsers could still reach the port, but circumvention tools could not), strongly indicating protocol-level identification rather than blind port blocking.
-
Extending Geneva's genetic algorithm to the application layer automatically discovered 77 unique HTTP evasion strategies and 9 DNS evasion strategies against censors in China, India, and Kazakhstan — all requiring only unprivileged usermode modifications with no TCP/IP header access. Against India's Airtel censor, 56 of the 77 strategies succeeded; 29 worked against Kazakhstan; 22 evaded China's keyword-based HTTP censorship and 27 evaded its Host-header censorship.
-
China's GFW keyword-based and Host-header HTTP censorship can be simultaneously defeated by a 'sandwich' strategy: a header with a name ≥64 bytes must appear before the Host header, the Host header value must start ≥1,281 bytes from the start of the headers, and the final header must be ≥129 bytes total — and the Host header must not be first or last. A 64+ byte header name alone is sufficient to defeat Host-header censorship because it prevents the GFW from reading further headers.
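
One way to satisfy the three size constraints at once is sketched below. It assumes that "start of the headers" means the first byte after the request line and that the 129-byte minimum for the final header excludes the trailing CRLF; both readings, the header names, and the filler bytes are assumptions made for the example.

```python
REQUEST_LINE = b"GET / HTTP/1.1\r\n"
HOST_PREFIX = b"Host: "


def build_sandwich(host: str) -> bytes:
    """Build a request whose Host header sits between two oversized headers."""
    long_name = b"X-" + b"a" * 62                   # 64-byte header name
    fixed = len(long_name) + len(b": ") + len(b"\r\n")
    filler = 1281 - len(HOST_PREFIX) - fixed        # pushes Host value to offset 1281
    first = long_name + b": " + b"b" * filler + b"\r\n"
    host_line = HOST_PREFIX + host.encode("ascii") + b"\r\n"
    tail_name = b"X-Tail: "
    last = tail_name + b"c" * (129 - len(tail_name)) + b"\r\n"  # 129 bytes before CRLF
    return REQUEST_LINE + first + host_line + last + b"\r\n"


def host_value_offset(request: bytes) -> int:
    """Offset of the Host value, counted from the first header byte."""
    headers = request.split(b"\r\n", 1)[1]
    marker = b"\r\n" + HOST_PREFIX
    return headers.index(marker) + len(marker)


req = build_sandwich("example.com")
print(host_value_offset(req), len(req))
```

The Host header is neither first nor last here, which is the remaining condition stated above.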
-
India's Airtel HTTP censor fails to reassemble TCP segments: padding any HTTP request to at least 1,449 bytes causes the IP+TCP overhead (52 bytes) to push the total past the Ethernet MTU of 1,500 bytes, forcing segmentation that the censor cannot handle and achieving 100% evasion. Kazakhstan requires the segmentation boundary to fall precisely between the Host header name and value (with two trailing spaces), rather than anywhere in the request.
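
The padding arithmetic can be checked with a short sketch. The MTU and overhead values are the ones quoted above; padding via an extra `X-Padding` header is an assumption for illustration, not necessarily how the paper pads requests.

```python
ETHERNET_MTU = 1500
IP_TCP_OVERHEAD = 52  # bytes, as quoted above


def min_payload_to_segment(mtu: int = ETHERNET_MTU,
                           overhead: int = IP_TCP_OVERHEAD) -> int:
    """Smallest payload that no longer fits in a single packet."""
    return mtu - overhead + 1


def pad_request(request: bytes, target: int) -> bytes:
    """Grow an HTTP request to at least `target` bytes with a filler header."""
    if len(request) >= target:
        return request
    head, sep, body = request.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("request has no header terminator")
    prefix = b"\r\nX-Padding: "                      # hypothetical header name
    fill = max(target - len(request) - len(prefix), 1)
    return head + prefix + b"a" * fill + sep + body


req = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
padded = pad_request(req, min_payload_to_segment())
print(len(padded), len(padded) + IP_TCP_OVERHEAD > ETHERNET_MTU)
```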
-
A central finding of the paper is that RFC-compliance in the censor creates evasion opportunities: the more faithfully a censor parses HTTP/DNS per the RFC, the more RFC-permitted variants it will pass that servers also accept, yielding more viable evasion strategies. In contrast, India's Airtel censor was the most brittle (56/77 strategies bypassed it) precisely because it failed on many legitimate RFC variants; China's more sophisticated parser left fewer openings.
-
OpenVPN's application-layer P_ACK packets — uniform in size and concentrated only in the handshake phase — provide a timing and count fingerprint detectable via threshold comparison over 10-packet bins. Tunnel-based obfuscation wrappers (Stunnel, SSH, obfs2/3, Shadowsocks) that do not add random padding preserve the 1:1 packet correspondence with the underlying OpenVPN stream, leaving 16 of 20 tested tunnel-based obfuscated configurations vulnerable to ACK fingerprinting.
-
A two-phase passive-filter-plus-active-probing framework deployed at a 1-million-user ISP identified 85.90% of vanilla OpenVPN flows (1,718/2,000) and 72.67% of obfuscated flows (1,468/2,020), with an upper-bound false positive rate of 0.0039% across over 10 million flows — three orders of magnitude lower than prior ML-based approaches (1.4–5.5%). The system processed 15 TB and 2 billion flows per day on a single commodity server.
-
Even with tls-auth/tls-crypt HMAC protection making OpenVPN servers nominally 'probe-resistant' (silent to unauthenticated clients), the framework fingerprints servers via TCP-level timing side channels: a complete 16-byte client-reset probe triggers an immediate connection drop (HMAC validation fails after full packet reassembly), while a 15-byte truncated probe causes the server to stall awaiting the final byte until a server-specific handshake timeout expires. Over 97% of non-OpenVPN endpoints have RST thresholds below 500 or above 4,000 bytes, versus OpenVPN's characteristic 1,550–1,660 bytes derived from default MTU configurations.
-
OpenVPN's unencrypted opcode header byte is exploited to fingerprint vanilla and XOR-obfuscated flows: the XOR patch specification excludes the first buffer byte (the opcode) from reversal, so opcodes are always XOR-ed with the same key byte and map deterministically to fixed ciphertext values. All 4 of the top-5 VPN providers that offer obfuscated services use XOR-based obfuscation, and all were flagged by opcode fingerprinting over 90% of the time.
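
A short sketch of why a fixed key byte leaves the opcode recoverable. It assumes the usual OpenVPN header layout (opcode in the high 5 bits, key ID in the low 3 bits) and a hypothetical key byte of 0x5A; it is an illustration of the deterministic mapping, not the paper's detector.

```python
# Opcode numbers from the OpenVPN protocol; only a few are listed here.
OPCODES = {
    "P_CONTROL_V1": 4,
    "P_ACK_V1": 5,
    "P_CONTROL_HARD_RESET_CLIENT_V2": 7,
    "P_CONTROL_HARD_RESET_SERVER_V2": 8,
}


def header_byte(opcode: int, key_id: int = 0) -> int:
    """First byte of an OpenVPN packet: opcode (5 bits) then key ID (3 bits)."""
    return (opcode << 3) | key_id


def obfuscated_header_bytes(key_byte: int) -> dict:
    """Ciphertext value of each header byte under a fixed XOR key byte."""
    return {name: header_byte(op) ^ key_byte for name, op in OPCODES.items()}


def infer_key_byte(first_client_byte: int) -> int:
    """Guess the key byte, assuming the first packet is a client hard reset."""
    return first_client_byte ^ header_byte(OPCODES["P_CONTROL_HARD_RESET_CLIENT_V2"])


table = obfuscated_header_bytes(0x5A)   # 0x5A is a hypothetical key byte
print({name: hex(value) for name, value in table.items()})
print(hex(infer_key_byte(table["P_CONTROL_HARD_RESET_CLIENT_V2"])))
```

Because XOR with a constant is a bijection, distinct opcodes always map to distinct ciphertext bytes, and the XOR difference between any two of them is unchanged by the obfuscation.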
-
Censoring middleboxes respond to non-compliant TCP sequences because they must handle asymmetric routing and cannot rely on observing both sides of a connection. The ⟨SYN; PSH+ACK⟩ sequence elicited responses from 69.6% of 184 tested censoring middleboxes with a maximum amplification of 7,455×, while a lone PSH+ACK with no prior handshake elicited responses from 33.2% of middleboxes.
-
Current randomized-payload circumvention tools (obfs4/ScrambleSuit, SkypeMorph, VoIP-tunneling) rely on censors 'defaulting open' — treating unidentified traffic as innocuous. If censors instead block all traffic not explicitly recognizable as meaningful plaintext, these tools fail entirely. The paper notes anecdotal evidence this is already occurring, including blocking of some TLS 1.3 connections.
-
Only 8% of keywords censored by Chinese chat clients (WeChat, Sina Weibo — ~63,200 total terms) are also censored by GFW packet inspection, demonstrating independently maintained blocklists. The GFW's packet-inspection chat-derived blocklist contains up to 1,221 distinct censored keywords for outbound traffic; just 68 keyword components account for all censored terms from Beijing, with 「六四」(June Fourth) alone responsible for more than half.
-
The GFW only inspects two locations within an HTTP request for censored keywords: the path component of the request line and the Host header, in UTF-8 and GB 18030 encodings (with %-decoding applied). Cookie headers, custom headers (e.g., X-Tension), and POST body fields are not monitored. Even in monitored positions, only approximately 75% of requests containing censored keywords actually trigger a TCP RST disconnection.
-
The GFW enforces SNI-based blocking on every TCP port (not just 443), triggering TCP RST injection and a penalty box for known-censored hostnames (e.g., facebook.com, zh.wikipedia.org) in the TLS ClientHello. The SNI blocklist is separate from the HTTP keyword blocklist — keyword-derived subdomains in the SNI did not trigger censorship. No evidence was found for indiscriminate HTTPS decryption or certificate substitution.
-
The GFW maintains two HTTP keyword sublists: 15 terms censored unconditionally, and approximately 60–63 additional terms censored only when the English word "search" also appears in the request URL. No other English word among the 10,000 most common, no Chinese search synonym (搜索, 查找, 关键词), and no common URL parameter abbreviation ("q", "kw", "s") replicates this expanded-censorship trigger.
-
Balboa currently supports only TLS 1.2 stream cipher suites, covering approximately 81% of TLS connections; an active censor can force non-stream cipher suite negotiation, causing Balboa to silently enter pass-through mode—a potential denial-of-service vector. Separately, if the server's traffic model deviates from the local baseline (e.g., the same audio file streamed repeatedly), a sufficiently powerful censor can detect the anomaly independently of whether Balboa is running.
-
Camoufler defeats active probing of its server endpoints by keeping server IM IDs private (shared only out-of-band with trusted clients) and configuring the server to respond only to those trusted IDs. An adversary systematically probing IM IDs to find Camoufler servers would receive no response from the server, making enumeration futile. When E2M-encrypted IM providers could collude with a censor, an additional application-layer key exchange (DH with RSA-wrapped ephemeral key, AES-256, PFS via key deletion) prevents the provider from revealing plaintext even under coercion.
-
Camoufler tunnels censored web traffic through real Instant Messaging applications (Signal, Telegram, WhatsApp, Slack, Skype), achieving a median page-load time of 3.6s (average 4.1s) over Signal and 2.3s median (average 2.7s) over Telegram for Alexa top-1,000 sites — compared to 120s for CovertCast loading BBC News and only 2.56 Kbps throughput for DeltaShaper. Over 90% of TTFB trials across 10 popular sites completed under 2s, with 50% under 1s.
-
Spain's blocking infrastructure, initially mandated for copyright and gambling enforcement, was repurposed to block 24 unique Catalan referendum URLs during October 2017, including the IPFS gateway and two GitHub Pages domains. GitHub Pages was blocked only via DNS manipulation (pointing to 127.0.0.1) rather than HTTP blocking specifically to avoid collateral blocking of all of GitHub.
-
Analyzing over 3 million OONI network measurements (2016–2020) from 17 ASes covering 98.45% of broadband and 90.94% of mobile subscribers in Spain, the study detected 16 unique blockpages, 2 DPI vendors (Fortinet/Fortigate in Telefonica; Allot in Vodafone), and 78 blocked websites across copyright, political, civil-rights, and referendum categories.
-
DPI blocking by Spanish ISPs (Fortinet/Telefonica) was circumvented by inserting a tab escape character (\t) into HTTP GET request headers, or by delaying HTTP GET transmission — the same techniques reported to have bypassed DPI blocking of Catalan referendum sites in 2017. Both techniques exploited the DPI's shallow, stateless inspection of the opening HTTP request.
-
Vodafone (AS12357, AS12430, AS6739) deployed Allot-based TLS interception to block womenonweb.org: the system resolver returned a legitimate IP (67.213.76.19), but connecting to it triggered a forged certificate signed by Allot; disabling TLS certificate validation fetched the Vodafone blockpage, confirming a man-in-the-middle box rather than a redirect. OONI's standard Web Connectivity test recorded only a generic ssl_error:certificate verify failed and missed this entirely.
-
Domain shadowing makes all three traffic indicators — connecting URL, SNI, and Host header — appear to belong to an allowed shadow domain while fetching content from a blocked back-end domain via CDN. Unlike domain fronting, it exploits a legitimate CDN feature (arbitrary back-end binding) rather than a SNI/Host mismatch quirk, so CDNs cannot disable it by enforcing header consistency without breaking legitimate use cases such as third-party service outsourcing via CNAME. The technique was demonstrated successfully accessing www.facebook.com from a heavily censored country.
-
Filtering rules in Saudi Arabia were uniform across all three major ISPs (STC, Zain, Mobily) and six vantage points spanning four geographically distributed cities (Riyadh, Jeddah, Makkah, Al-Khobar), indicating a single centralized national filtering infrastructure rather than per-ISP implementation.
-
Saudi Arabia progressively unblocked VoIP and messaging applications after 2017: all 18 tested apps were blocked during 2013–2017, 67% were accessible in 2018, 93% in 2019, and all except WeChat in 2020, following CITC's 2017 announcement lifting the ban on compliant applications.
-
The GFW's passive classifier uses two features of the first data packet to flag probable Shadowsocks traffic: (1) high Shannon entropy (per-byte entropy > ~7 bits strongly correlates with replay probability, which is nearly 4x higher at entropy 7.2 than at 3.0) and (2) packet length in the range 160–700 bytes with specific remainders mod 16. A single data packet after the TCP handshake is sufficient to trigger the downstream active-probing pipeline.
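
The two features can be computed as follows. This is a rough sketch: the 7.0-bit cutoff is a simplification of the correlation described above, and because the specific mod-16 remainders are not listed here, they are left as a caller-supplied parameter (the check is skipped when none are given).

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Per-byte Shannon entropy in bits (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def matches_passive_features(payload: bytes,
                             min_entropy: float = 7.0,
                             length_range=(160, 700),
                             remainders=None) -> bool:
    """True if the first data packet shows both features described above."""
    low, high = length_range
    if not low <= len(payload) <= high:
        return False
    if remainders is not None and len(payload) % 16 not in remainders:
        return False
    return shannon_entropy(payload) > min_entropy


high_entropy = bytes(range(256))      # every byte value once: 8 bits per byte
print(shannon_entropy(high_entropy))
print(matches_passive_features(high_entropy))
print(matches_passive_features(b"A" * 256))
```

Note that a short packet cannot reach 8 bits per byte: a payload of n bytes has entropy of at most log2(n) bits, so thresholds near 7 bits are only meaningful for payloads of well over 128 bytes.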
-
Once passive analysis flags a connection, the GFW sends seven distinct active probe types in staged sequence: five replay-based (R1–R5, where R1 is an identical replay and R2–R5 alter specific byte offsets to attack stream vs. AEAD cipher variants) and two non-replay random-length probes (NR1, NR2). The system operates in stages: R3/R4/R5 probes are withheld until the server responds to R1/R2, meaning a server with replay protection (like Shadowsocks-libev ≥ v3.3.1) never receives stage-2 probes, while one without (original OutlineVPN) escalates to full probing.
-
Protozoa's encoded media tunneling embeds covert IP packets directly into VP8-encoded frame bitstream partitions (EFBP) after lossy compression, rather than into raw pixel data. Because SRTP uses a stream cipher that preserves plaintext size, overwriting EFBP bits leaves encrypted packet sizes identical to legitimate sessions, and the covert channel achieves 98.8% utilization of available frame space at an average throughput of 1422 Kbps—a 3× improvement over Facet and roughly three orders of magnitude over DeltaShaper's 7 Kbps maximum.
-
Protozoa's encoded media tunneling achieves an AUC of 0.59 against a state-of-the-art ML traffic classifier using packet-size and inter-arrival-time features—near the 0.5 random-guessing baseline—compared to >99% detection rates for prior tools such as Facet and DeltaShaper. To block 80% of Protozoa flows (TPR=0.8), a censor would erroneously flag approximately 60% of legitimate WebRTC flows (FPR=0.6). This resistance holds across trace durations from 10–60 seconds (AUC range 0.56–0.61) and across RTT, bandwidth, and packet-loss variations.
-
Protozoa successfully bypassed censorship in China, Russia, and India using whereby.com as a carrier. Despite several WebRTC services being blocked in China (appr.tc, discordapp.com, hangouts.google.com, messenger.com), at least seven alternatives remained reachable (aws.amazon.com/chime, coderpad.io, gotomeeting.com, slack.com, whereby.com, and others), ensuring carrier availability. Covert sessions over the alternative services coderpad.io and appr.tc achieved AUCs of 0.58 and 0.60, respectively, and average throughput of 1388–1420 Kbps.
-
When a censor controls the WebRTC signaling plane, it can mount MITM attacks against CRON's vanilla covert encoding because the encoding 'fully replaces the video payload with an apparently random covert data signal that results in a scrambled video image at the receiver's endpoint.' By replaying the captured video through a WebRTC gateway, the censor obtains direct visual evidence of payload manipulation.
-
Protozoa creates a ≈1.4 Mbps covert channel over WebRTC by replacing encoded video frames with covert payload while preserving SRTP packet size and timing properties, making Protozoa flows 'hardly distinguishable from unmodified WebRTC streams using existing ML-based traffic classifiers.' Since all unencrypted packet fields remain intact, DPI cannot detect the tunnel either.
-
CRON's stego circuits defend against adversary-controlled WebRTC services by embedding covert data into encoded video frames at the compressed data domain using video steganography algorithms, maintaining the visual characteristics of the video feed rather than replacing it entirely. Endpoint authentication uses public-key encryption with keys exchanged out-of-band, preventing MITM key substitution through the censor-controlled signaling server.
-
Slitheen++ embeds covert upstream data by applying HTTP/2-like header field compression to overt HTTP requests, using the recovered space for covert data placement. This ensures that neither timing information nor observable changes to packet sizes or delays can reveal decoy routing use to an omniscient passive censor. GZIP compression was explicitly avoided to prevent the CRIME side-channel attack.
-
The original Slitheen appended covert upstream data directly to overt HTTP requests, significantly changing upstream traffic patterns and enabling censor identification even when traffic is encrypted. This upstream traffic analysis vulnerability—absent from Slitheen's original threat model—is the primary weakness Slitheen++ addresses.
-
All 25 applicable client-side Geneva strategies failed when mechanically translated to server-side analogs against China's GFW, even when the only structural difference was which endpoint sent the insertion packet. Experiments with the server placed inside China and client outside also failed, indicating the GFW tracks connection initiator identity and processes client versus server packets asymmetrically—meaning server-side circumvention requires a completely independent discovery approach.
-
China's GFW uses distinct, co-located censorship boxes—each with its own independent network stack implementation and bugs—for each application-layer protocol it censors. TCP-level strategies that exploit transport-layer bugs show dramatically different success rates per protocol: Strategy 1 (Simultaneous Open + Injected RST) achieves 89% for DNS but only 14% for HTTPS; Strategy 8 (TCP Window Reduction) achieves 100% for SMTP but only 2–3% for DNS, HTTP, and HTTPS. TTL-limited probes confirm all protocol boxes are co-located at the same network hop.
-
The paper identifies three distinct GFW resynchronization-state triggers with protocol-specific behavior: (1) a server payload on any non-SYN+ACK packet causes resync on the next SYN+ACK or client ACK-flagged packet for all protocols; (2) a server RST causes resync on the next client packet for all protocols except HTTPS; (3) a SYN+ACK with a corrupted acknowledgment number triggers resync only for FTP. Strategy 1's 50% per-attempt success rate for HTTP is confirmed to result from the 50% probability of the GFW entering the resynchronization state on an injected RST, consistent with Wang et al. [36].
-
The paper presents 11 purely server-side censorship evasion strategies requiring zero client-side software, successfully bypassing censorship in China, India, Iran, and Kazakhstan across DNS-over-TCP, FTP, HTTP, HTTPS, and SMTP. All strategies manipulate only TCP handshake packets (primarily the SYN+ACK) and were verified against 17 versions of 6 client operating systems (Windows XP–Server 2018, MacOS, iOS, Android, Ubuntu, CentOS) with unmodified clients.
-
TCP Window Reduction (Strategy 8)—reducing the SYN+ACK TCP window to 10 bytes and stripping wscale options, forcing the client to segment its request—achieves 100% evasion success against HTTP in India and Kazakhstan, 100% against HTTP and HTTPS in Iran, and 100% against SMTP in China, because none of these censors can reassemble TCP segments. The strategy is compatible with all 17 tested client OS versions when implemented without SYN+ACK payloads, making it the most broadly deployable server-side strategy found.
-
Using Geneva's genetic algorithm trained against Iran's live protocol filter, four evasion strategies achieving 100% success were discovered in under two hours: (1) injecting a fingerprint-matching PSH/ACK with a corrupt checksum before the real data; (2) sending two FIN packets before the SYN; (3) sending nine non-data-carrying packets (any flags, any seq/ack) during the handshake to exhaust the filter's per-flow packet limit; (4) a server-side variant that sends nine corrupted SYN+ACKs, inducing nine client RSTs before the real ACK, enabling fully unmodified clients to benefit.
-
The protocol filter's HTTPS fingerprint requires only that the first 5 bytes match a TLS header (type 0x16, version 0x03 0x01–0x03, correct length field); all subsequent bytes of the Client Hello are unchecked. Any TLS-based circumvention tool naturally satisfies this fingerprint and will bypass the filter by default. Furthermore, any one of the three permitted fingerprints (DNS, HTTP, HTTPS) can be used on any of the three monitored ports to whitelist an entire flow.
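
The 5-byte check can be sketched as below. The sketch assumes that a "correct length field" means the length equals the number of bytes that follow the header in the same packet; the function name is illustrative.

```python
def matches_https_fingerprint(first_payload: bytes) -> bool:
    """Check only the 5-byte TLS record header, as the filter is described to do."""
    if len(first_payload) < 5:
        return False
    content_type = first_payload[0]
    major, minor = first_payload[1], first_payload[2]
    length = int.from_bytes(first_payload[3:5], "big")
    return (
        content_type == 0x16                      # handshake record
        and major == 0x03 and 0x01 <= minor <= 0x03
        and length == len(first_payload) - 5      # assumed meaning of "correct"
    )


body = b"\x00" * 32                               # arbitrary, unchecked bytes
packet = b"\x16\x03\x01" + len(body).to_bytes(2, "big") + body
print(matches_https_fingerprint(packet))
```

Everything after the header is ignored, so the body in the example does not need to resemble a Client Hello.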
-
Testing the Alexa top-20,000 websites from within Iran, 3,595 IP addresses (17.9%) triggered the protocol filter at least 8 out of 10 times, and 3,499 (17.4%) were affected all 10 times. IP address provider is not correlated with filtering; instead, specific IP prefixes are targeted—for Cloudflare, only two prefixes (104.18.0.0/16 and 104.31.82.0/24) were fully affected while all others were unaffected.
-
Iran's protocol filter monitors only the first two data-carrying packets of a TCP connection on ports 53, 80, and 443, permitting only DNS, HTTP, and HTTPS. Once tripped, it drops all subsequent client-side packets for 60 seconds, with the timer resetting on each TCP retransmit. The filter is unidirectional (client-inside-Iran only), cannot reassemble TCP segments, and does not verify checksums.
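
The drop window and its reset behaviour can be modelled with a small timer. This is a toy sketch: the class and method names are hypothetical, and it treats every dropped client packet as a retransmit that restarts the 60-second timer.

```python
class ProtocolFilterTimer:
    """Toy model of the 60-second drop window described in the finding."""

    BLOCK_SECONDS = 60.0

    def __init__(self) -> None:
        self.blocked_until = None  # None means the flow is not currently blocked

    def trip(self, now: float) -> None:
        """Called when the first data packets fail every permitted fingerprint."""
        self.blocked_until = now + self.BLOCK_SECONDS

    def client_packet_dropped(self, now: float) -> bool:
        """Return True if a client packet sent at `now` is dropped.

        Assumption: each dropped packet (for example a TCP retransmit)
        restarts the timer.
        """
        if self.blocked_until is None or now >= self.blocked_until:
            self.blocked_until = None
            return False
        self.blocked_until = now + self.BLOCK_SECONDS
        return True
```

Under this model a client that keeps retransmitting never escapes the block; only 60 seconds of silence clears it.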
-
Existing segmentation strategies effective against Iran's standard HTTP DPI can be counterproductive when the protocol filter is also active: if the first segment is fewer than 8 bytes, it fails the HTTP fingerprint check and trips the filter. However, segmenting such that the first segment is a valid HTTP fingerprint (≥8 bytes, well-formed verb + space) while splitting the Host: header into the second segment defeats both the protocol filter and the standard DPI censor simultaneously.
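
One way to choose such a split point is sketched below. It assumes the split should fall on the line break before the Host header, so that the whole header lands in the second segment; the function name and the 8-byte constant's name are illustrative.

```python
MIN_FIRST_SEGMENT = 8  # per the finding, shorter first segments trip the filter


def split_before_host(request: bytes) -> tuple[bytes, bytes]:
    """Split an HTTP request so the Host header falls entirely in segment two."""
    host_at = request.lower().find(b"\r\nhost:")
    if host_at < 0:
        raise ValueError("request has no Host header")
    if host_at < MIN_FIRST_SEGMENT:
        raise ValueError("request line too short for a valid first segment")
    return request[:host_at], request[host_at:]
```

The first segment is then the request line itself (verb, space, path, version), and the function refuses requests whose request line is too short to satisfy the 8-byte minimum.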
-
The dnstt DNS-over-HTTPS tunnel, built on a KCP Turbo Tunnel session layer, achieved download speeds of 130 KB/s using Google and Cloudflare DoH resolvers and 30 KB/s using Quad9, compared to iodine's maximum of 2 KB/s over the same operators' UDP DNS resolvers — a 15–65× improvement. DNS-over-HTTPS hides message contents from the censor, removing the two main classical DNS tunnel detection vectors: unusual DNS message structure and plaintext tunnel domain names in queries.
-
Each probe-resistant proxy exposes a unique TCP close-threshold fingerprint: obfs4 closes with FIN at 8,192–16,384 bytes and RST at the next multiple of 1,448 bytes beyond that; Lampshade at FIN 256 bytes / RST 257 bytes; Shadowsocks-python and -outline both at FIN 50 bytes (outline also RST at 51); OSSH at FIN 24 bytes / RST 25 bytes. A binary-search tool using random probes can discover these thresholds remotely without knowing any shared secret, providing a protocol-specific fingerprint independent of payload content.
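
The threshold search can be sketched as an ordinary binary search over probe lengths. The oracle here is simulated; a real prober would open a fresh connection, send that many random bytes, and observe whether the server closes. Function names are hypothetical, and the sketch assumes close behaviour is monotonic in probe length.

```python
from typing import Callable


def find_close_threshold(closes_after: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest n in [lo, hi] for which closes_after(n) is True.

    Assumes monotonic behaviour: if the server closes after n bytes,
    it also closes after any larger probe.
    """
    if not closes_after(hi):
        raise ValueError("server never closed within the search range")
    while lo < hi:
        mid = (lo + hi) // 2
        if closes_after(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


def simulated_server(threshold: int) -> Callable[[int], bool]:
    """Stand-in oracle that closes once at least `threshold` bytes arrive."""
    return lambda n: n >= threshold
```

Each search needs only about log2(hi - lo) probes, which is why the thresholds are cheap to recover remotely.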
-
Censys scans of IPv4 HTTPS servers in June 2020 found that over 21% responded to a GET / with 400 Bad Request, 11.19% with 403 Forbidden, 8.62% with 404 Not Found, and 2.91% with 401 Unauthorized. These common error-response distributions provide a statistical baseline that HTTPT servers can match to avoid standing out to active probers.
-
Splitting the TLS ClientHello so that the first TCP segment is ≤4 bytes (less than the 5-byte TLS record header) defeats the GFW's ESNI detection with near 100% reliability. Geneva expressed this as `[TCP:flags:PA]-fragment{tcp:4:True}-|` (client-side) or a server-side window-size reduction to 4 bytes that forces the client to segment. This suggests the GFW's ESNI classifier cannot reassemble TCP segments across all protocol contexts.
-
The GFW's ESNI detector is keyed specifically to extension value `0xffce` (ESNI draft-01). Replacing `0xffce` with ECH draft values `0xff02`, `0xff03`, or `0xff04` produced no blocking as of August 2020. This indicates the GFW deployed a detector matching on a specific extension ID rather than detecting encrypted SNI generically.
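
A detector keyed on a single extension ID can be sketched as a walk over the TLS extensions block. This is an illustrative model, not the GFW's logic: the helper names are hypothetical, and the input is assumed to be the extensions list without its 2-byte total-length prefix.

```python
import struct

ESNI_DRAFT_EXTENSION = 0xFFCE  # the ID the finding reports as blocked


def extension_types(block: bytes) -> list[int]:
    """Parse a TLS extensions block: repeated (type u16, length u16, data)."""
    types, offset = [], 0
    while offset < len(block):
        if offset + 4 > len(block):
            raise ValueError("truncated extension header")
        ext_type, ext_len = struct.unpack_from("!HH", block, offset)
        offset += 4 + ext_len
        if offset > len(block):
            raise ValueError("truncated extension data")
        types.append(ext_type)
    return types


def id_keyed_detector(block: bytes) -> bool:
    """Model of a detector that matches one extension ID only."""
    return ESNI_DRAFT_EXTENSION in extension_types(block)


def make_extension(ext_type: int, data: bytes) -> bytes:
    """Build one extension for testing the detector."""
    return struct.pack("!HH", ext_type, len(data)) + data
```

Swapping `0xffce` for `0xff02` leaves the rest of the block unchanged but no longer matches, mirroring the behaviour reported above.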
-
The GFW blocks ESNI by dropping client-to-server packets whenever a TLS ClientHello containing the `0xffce` encrypted_server_name extension is sent over a completed TCP handshake. Unlike GFW censorship of SNI and HTTP (which uses RST injection to both endpoints), ESNI censorship uses unidirectional packet dropping with no injected packets. The blocking applies on all ports from 1 to 65535.
-
Against censors that detect blacklisted application protocols by examining only the first 30 packets of a flow (e.g., the technique in Wang et al. 2015), a single IP migration after 30 packets have been exchanged is sufficient to defeat detection while incurring minimal performance overhead—the client continues the connection normally on the new address.
-
MoneyMorph provides provable chosen-covertext attack security (SBS-CCA) for proxy bootstrapping, unlike prior email or social-media rendezvous approaches which offer only heuristic security. Under SBS-CCA, the censor's advantage in distinguishing a covertext-bearing transaction from a random transaction in the same space is negligible.
-
Zcash shielded transactions provide the highest per-transaction bandwidth of any tested cryptocurrency: 1148 bytes for the challenge covertext and 1168 bytes for the response, at a transaction fee of less than 0.01 USD. Bitcoin yields only 20/40 bytes at $0.34 fee and Ethereum only 20 bytes at $0.18 fee.
-
In a traffic sample from a major non-anonymous circumvention tool (3.56 TB total, Feb 21, 2008), 48% of all proxied traffic belonged to websites that were not censored in Iran. Integrating CacheBrowsing to fetch CDN-hosted censored content directly further saves 41% of Buddy bandwidth for Alexa top-1000 websites.
-
Protocol Proxy uses 'protected static protocols' — UDP-based protocols whose blocking causes severe collateral damage (e.g., Synchrophasor power-grid traffic, NTP) — as cover channels. Because any detection rule that fires on Protocol Proxy traffic also fires on legitimate PMU traffic, censors face a forced trade-off between blocking circumvention and disrupting critical infrastructure.
-
Observation-based FTE constructs each packet field exclusively from values previously observed in real host-protocol traffic, guaranteeing syntactic equivalence. Wireshark correctly decodes Protocol Proxy-generated packets as valid Synchrophasor frames with correct checksums, and the Phasor Data Concentrator hardware accepts them; any rule blocking Protocol Proxy traffic must therefore also block legitimate PMU packets.
-
Static protocols — UDP-based with no application-layer handshake — are immune to stateful protocol analysis that defeated SkypeMorph: without a handshake state machine, a censor cannot flag discrepancies between observed and expected protocol states. This eliminates the detection vector that Houmansadr et al. (2013) exploited to identify SkypeMorph via handshake mismatch.
-
Of 21.8 billion raw measurements, approximately 7% (1.5 billion) were initially flagged as blocked; iterative HTML clustering and DBSCAN image clustering then removed ~500 million false positives, leaving ~1 billion confirmed blocked measurements. The clustering process formed 457 new response clusters, of which 308 were confirmed blockpages and 149 were false positives, with Cloudflare bot-checks being a notable source of false positives in HTTPS measurements.
-
Kazakhstan's 2019 HTTPS interception affected 7.0% of 6,736 measured TLS hosts when probed from North America and 24% when probed from inside the country; all affected paths traversed AS9198 (Kazakhtelecom), with 95% of injections occurring at two specific IP addresses (92.47.151.210 or 92.47.150.198), indicating a highly centralized interception infrastructure.
-
The Kazakhstan interception system connected back to the origin TLS server before issuing a fake certificate, and in doing so exposed a unique TLS fingerprint (hash f09427b5aaf9304b): it used TLS record-layer version 1.0, ClientHello version 1.2, and offered only 13 cipher suites — a fingerprint virtually unseen in normal HTTPS traffic — allowing content providers to detect when a connection was being intercepted.
-
Kazakhstan's interception system triggered solely on the TLS SNI header: a connection was intercepted only if the SNI contained one of 37 targeted domains AND the path passed through specific AS9198 hops; the server's actual certificate needed to be browser-trusted but did not need to match the SNI domain, and interception could be triggered bidirectionally — from outside the country connecting to TLS hosts inside Kazakhstan.
-
Anonymization and circumvention tools (VPNs, Tor, etc.) are among the three most commonly blocked content categories across all commercial filters surveyed, alongside pornography and gambling. This holds across diverse products including Fortinet, Cisco, and government-deployed firewalls in Iran, Saudi Arabia, and Bahrain.
-
FilterMap identified 90 blockpage clusters from 90 vendors and actors across 103 countries using 374 million measurements from ~45,000 vantage points against 18,736 sensitive domains; 87 of these signatures were previously unknown. Commercial filters were detected in 36 out of 48 countries rated 'Not Free' or 'Partly Free' by Freedom House, with Fortinet alone present in at least 60 countries.
-
In HTTP tests, more than 50% of filter responses that indicated censorship contained an injected HTML blockpage; the remainder used TCP RST injection or connection timeout. In HTTPS measurements, canonical template matching had a failure rate of only 1.9%, and 95% of Hyperquack measurements completed within 3.5 hours across ~45,000 vantage points.
-
The Great Firewall of China does not inject blockpages — it resets connections via TCP RST injection — making it invisible to blockpage-based detection systems. In contrast, the Iran firewall accounted for 97.1% of disruptions observed in Iranian vantage points, and the Bahrain and Saudi Arabia firewalls caused 71.2% and 80.2% of disruptions respectively, all using application-layer blockpage injection.
-
Russia operates the most fragmented ISP-level filtering infrastructure in the dataset: FilterMap detected 41 distinct ISPs deploying blockpage-injecting filters, and 38 out of 49 filter clusters identified by Quack were deployed in Russian ISPs. All 41 Russian blockpages explicitly cited Federal Law as the reason for blocking.
-
Despite Russia's decentralized ISP ecosystem, 9 of 14 residential probes observed more than 90% of 98,098 tested blocklist domains blocked, and all 14 probes observed at least 49% blocked—demonstrating that coordinated nationwide censorship without centralized choke-points is achievable through legal mandates and commodity equipment alone.
-
SiegeBreaker's session bootstrapping (from initial email to installed SP redirection rule) averaged 3–4 seconds across 100 trials, with the dominant delay attributed to email handling (SMTP connection, Selenium composition) rather than network latency; this setup cost is not included in the download-time benchmarks. The auxiliary ping-based switch-selection signal encodes 48 bits across three ICMP header fields (IP-ID, ping sequence number, ping identifier), requiring ~281 trillion spoofed ping packets per client–OD pair to brute-force.
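
The 48-bit figure follows from three 16-bit header fields. The sketch below packs a value across them and computes the brute-force space; the order in which the fields carry the bits is an assumption made for illustration, as are the function names.

```python
FIELD_BITS = 16  # IP-ID, ICMP identifier and ICMP sequence number are 16-bit fields


def encode_signal(value: int) -> tuple[int, int, int]:
    """Split a 48-bit value into (ip_id, ping_seq, ping_ident)."""
    if not 0 <= value < 1 << (3 * FIELD_BITS):
        raise ValueError("value must fit in 48 bits")
    mask = (1 << FIELD_BITS) - 1
    return (value >> 32) & mask, (value >> 16) & mask, value & mask


def decode_signal(ip_id: int, ping_seq: int, ping_ident: int) -> int:
    """Reassemble the 48-bit value from the three header fields."""
    return (ip_id << 32) | (ping_seq << 16) | ping_ident


SEARCH_SPACE = 2 ** (3 * FIELD_BITS)  # 281,474,976,710,656, about 281 trillion
```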
-
SiegeBreaker explicitly acknowledges two unresolved attack vectors: (1) latency-based traffic analysis attacks (forced-asymmetry / RAD-style), which the system does not mitigate, and (2) website fingerprinting attacks against the proxied traffic, for which no defense is implemented. Additionally, the email-based control channel is vulnerable to a censor who can delay or block emails to the controller's address, disrupting rule installation before the client's SYN packet arrives.
-
All prior decoy routing systems (Cirripede, Telex, TapDance, Slitheen, Waterfall) require the DR to inspect every traversing flow — either all TCP SYN packets or all TLS flows — to identify DR requests, creating a privacy breach for non-DR users and a computational bottleneck. SiegeBreaker eliminates this by using an out-of-band email pre-registration (encrypted to the controller's 2048-bit RSA public key) that pins the controller's inspection rule to a single client-IP/OD-IP/ISN triple, so only authenticated potential DR flows are ever redirected.
-
Jio, India's largest ISP serving 49.7% of internet users, employs SNI inspection to block 2,951 out of 3,340 websites it censors — the first documented use of SNI-based blocking in India. No other of the six tested ISPs uses this technique.
-
V2Ray's HTTP obfuscation mode prepends an HTTP header only to the first TCP payload per connection and uses a hardcoded HTTP 500 response for all failure cases, making the mimicry trivially detectable: legitimate HTTP servers send headers on every response, and do not return 500 for protocol errors a real HTTP server would never encounter.
-
VMess authentication uses a timestamp-based credential with a maximum 120-second (average ~60-second) expiration window, allowing an attacker to replay a captured legitimate request within that window. By making 16 connections with altered Encryption Key bytes that enumerate all 16 possible Margin P padding-length values, a prober can confirm a VMess server by observing a non-repeated set of connection-close byte counts spanning a delta of 15.
-
During a major censorship event in April 2019, new censor techniques blocked many Psiphon transports while TapDance remained accessible, causing a 4× increase in the fraction of TapDance-enabled clients' traffic and daily users peaking above 25,000—with no measurable degradation in connection success rate or per-session throughput under the increased load.
-
The GFW's dominant exploitable discrepancy is accepting data packets whose TCP sequence number is ≤ the initial sequence number (ISN), while Linux rejects such packets as out-of-window. Of the 4,587 total successful evasions against the GFW, 3,152 were evasion-packet cases, and this single 'SEQ ≤ ISN' strategy accounts for the majority of them.
-
Snort contains two novel TCP Timestamp discrepancies versus Linux: it omits RFC 7323-mandated timestamp validation on RST packets in SYN_RECV state, and its PAWS TSval acceptance window is 'off by two' — a TSval of 0 or 0xffffffff following a packet with TSval 0x80000000 is accepted by Linux but rejected by Snort, enabling insertion-based evasion by crafting packets that fall in the divergent range.
-
Snort interprets the TCP urgent pointer as the offset to the last byte of urgent data and discards all payload bytes before that offset, while Linux consumes only 1 urgent byte and leaves the remaining payload intact. Injecting a packet with the URG flag and the urgent-pointer offset pointing to an insignificant padding byte allows the full sensitive payload to reach the server while Snort strips it — a novel evasion strategy not previously reported.
-
SymTCP uses selective symbolic execution over Linux's TCP implementation (S2E + KLEE) to enumerate all packet sequences reaching 47 binary-level accept or drop points from LISTEN to ESTABLISHED, then conducts differential testing against a blackbox DPI to confirm discrepancies; the open-sourced system requires no DPI source access and covers 37 of 47 drop points within the operationally relevant handshake window.
-
SymTCP generated 56,787 candidate insertion/evasion packets in approximately one hour using concolic execution over Linux's TCP stack. Evaluating a sampled set of 10,000 test cases against real DPI systems yielded 6,082 evasions against Zeek, 652 against Snort, and 4,587 against the Great Firewall of China — discovering 14 novel evasion strategies beyond those found by prior manual approaches.
-
A/B testing across HTTP, HTTPS, VPN, and Shadowsocks traffic found no measurable difference in packet loss rates, ruling out censorship-targeted protocol throttling as the cause. Even at probe rates as low as one packet per 10 seconds, loss rates were similar across all protocol variants, indicating no per-connection or per-protocol speed throttling by the GFW.
-
Frolov and Wustrow show that every major TLS-based circumvention tool (Tor Browser, Lantern, OpenVPN, Psiphon, etc.) produces a TLS ClientHello fingerprint that is statistically distinguishable from real Chrome or Firefox: differences include cipher-suite ordering, extension set, extension ordering, ALPN values, and curve preferences. A passive observer with a classifier over ClientHello fields can identify the tool with high precision without decrypting any traffic.
-
Beyond the ClientHello, circumvention tools diverge from real browsers in TLS record-layer behavior: Go's crypto/tls splits the first application-data write differently than NSS or BoringSSL, and Go does not send a TLS ChangeCipherSpec in the same byte sequence as Chrome. These post-handshake divergences are detectable even when the ClientHello has been patched with uTLS, requiring record-layer mimicry in addition to hello-field mimicry for full fingerprint resistance.
-
Evasion strategies are strongly censor-specific: TCB Teardown strategies that achieve 80–96% against the GFW fail completely (0%) against Kazakhstan's HTTPS MITM; India's Airtel is defeated uniquely by a 'Stutter Request' (duplicating the PSH/ACK and replacing IP length to 64) at 100% success, which scores only 3% against the GFW. Geneva converged on distinct species for each censor within 4–8 hours of live training.
-
Geneva, a genetic algorithm using four packet-manipulation primitives (drop, tamper, duplicate, fragment), independently re-derived 30 of 36 (83.3%) previously published evasion strategies in controlled lab experiments and discovered successful strategies in 23 of 27 live training sessions against China's GFW, yielding 4 unique species, 8 subspecies (5 novel), and 21 fundamentally different variants. Each training session ran for 4–8 hours against a real censor.
-
Geneva experiments revealed that the GFW determines TCP three-way handshake completion using only the presence of the ACK flag — without validating sequence numbers. Upon receiving a RST or RST/ACK before the handshake completes, the GFW enters a resynchronization state approximately 50% of the time rather than tearing down its TCB; strategies that exploit this pre-handshake window achieve 92–95% success rates (Strategies 3 and 4).
-
The GFW does not verify TCP checksums or validate RST flag combinations: Strategy 5 using the entirely invalid flag set FRAPUN with TTL 10 achieved 96% success. Separately, increasing the TCP data offset (dataofs) field to 10 in an insertion duplicate causes the GFW to reinterpret the beginning of the HTTP payload as TCP header bytes, preventing keyword detection and achieving 98% success (Strategy 2) — while the destination server discards the malformed packet.
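
The data-offset effect can be illustrated with a hand-built TCP header. This is a sketch under stated assumptions: ports, sequence numbers, the zero checksum, and the sample request are placeholders, and the parser simply trusts the data offset field (counted in 32-bit words) the way the finding describes the GFW doing.

```python
import struct


def build_tcp_segment(payload: bytes, data_offset_words: int) -> bytes:
    """Build a 20-byte TCP header (no options) followed by payload.

    Ports, sequence numbers and the zero checksum are placeholders.
    """
    offset_and_flags = (data_offset_words << 12) | 0x018  # PSH+ACK
    header = struct.pack("!HHIIHHHH", 40000, 80, 1, 1, offset_and_flags, 65535, 0, 0)
    return header + payload


def payload_as_parsed(segment: bytes) -> bytes:
    """Return the payload a parser sees if it trusts the data offset field."""
    data_offset_words = segment[12] >> 4
    return segment[data_offset_words * 4:]


# Hypothetical request; "forbidden" stands in for a censored keyword.
request = b"GET /?q=forbidden HTTP/1.1\r\nHost: example.com\r\n\r\n"
honest_view = payload_as_parsed(build_tcp_segment(request, 5))
shifted_view = payload_as_parsed(build_tcp_segment(request, 10))
```

With the offset raised from 5 to 10 words, the first 20 bytes of the request are read as TCP options, so the keyword never appears in the parsed payload.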
-
Geneva's Segmentation species — fragmenting HTTP requests at the TCP layer without IP fragmentation, segment overlapping, or insertion packets — achieved 94–98% success against the GFW, 100% against India's Airtel ISP, and 100% against Kazakhstan's HTTPS MITM, making it the only strategy class effective across all three tested censors. These strategies require neither raw sockets nor root privilege.
-
Conjure registration is unidirectional: the client embeds a steganographic ciphertext tag in a complete HTTPS request payload encrypted under a Diffie-Hellman shared secret, and the station passively observes it without sending any reply or spoofing packets. This design makes registration flows indistinguishable from normal HTTPS traffic and enables 25% more viable registration decoys than TapDance by removing the requirement to exclude decoys with short TCP windows or connection timeouts.
-
Feature importance analysis of XGBoost models reveals that Facet covert channels are identifiable primarily through packets in the 115–195 byte range (dominated by Skype audio packets), while DeltaShaper is identifiable through two distinct packet-length clusters: 85–100 bytes and 1105–1205 bytes. XGBoost assigns non-zero importance to only ~58% of the 300 quantized packet-length bins for Facet and ~42% of 600 bins for DeltaShaper, indicating that leakage is concentrated in a narrow portion of the packet-size distribution.
-
By 2018 the GFW shifted from blocking Tor bridges by (IP, port) tuples to blocking the entire IP address. A blocked bridge remains inaccessible for exactly 12 hours; the block renews to 12 hours if any additional Tor connection attempt is made during that window, after which the GFW re-scans and removes the IP from the blacklist if Tor is no longer running.
-
Meek over Azure CDN successfully established Tor circuits from China in all tests; meek over Amazon was inconsistent and often failed mid-circuit. Meek requires TLS on the bridge — without it the GFW blocks the bridge within minutes and purges it from the blacklist, suggesting a separate meek-specific detection and blocklist is maintained.
-
obfs4 successfully established Tor circuits on the authors' own unpublished bridge relays but failed to connect to any public obfs4 bridge, consistent with the GFW having scraped and blacklisted public bridge addresses. This demonstrates that address confidentiality is a prerequisite for obfs4's effectiveness, independent of its obfuscation properties.
-
I2P obfuscates payload content to prevent protocol identification, but flow analysis can still fingerprint I2P traffic because the first four handshake messages between I2P routers have fixed lengths of exactly 288, 304, 448, and 48 bytes. The I2P team acknowledged this and was developing an authenticated key agreement protocol to resist automated identification.
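
A length-only fingerprint of this kind needs no payload inspection at all. The sketch below assumes the observer can already recover message boundaries from the flow; the function name is hypothetical.

```python
I2P_HANDSHAKE_LENGTHS = (288, 304, 448, 48)  # from the finding above


def looks_like_i2p(message_lengths: list[int]) -> bool:
    """Flag a flow whose first four message lengths match the fixed handshake sizes."""
    return tuple(message_lengths[:4]) == I2P_HANDSHAKE_LENGTHS
```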
-
Across all tested countries, circumvention and anonymization tools are the most consistently blocked category: www.hotspotshield.com is blocked in 5 of 13 detected censoring countries, and three Tor Project properties (bridges.torproject.org, www.torproject.org, ooni.torproject.org) each appear in the top-10 most broadly blocked domains. Collateral damage is also documented — Iran blocks psiphonhealthyliving.com as a substring match for the psiphon.ca circumvention domain.
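
The collateral damage follows from matching on a substring rather than a domain suffix. In the sketch below the keyword `psiphon` is an assumed rule, since the finding does not give the censor's exact pattern, and both function names are illustrative.

```python
def substring_blocked(hostname: str, keyword: str) -> bool:
    """Block any hostname that contains the keyword anywhere."""
    return keyword in hostname.lower()


def suffix_blocked(hostname: str, domain: str) -> bool:
    """Block only the domain itself and its subdomains."""
    host = hostname.lower().rstrip(".")
    return host == domain or host.endswith("." + domain)
```

Substring matching catches `psiphonhealthyliving.com`, while suffix matching on `psiphon.ca` would block only the intended domain and its subdomains.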
-
By comparing echo-server (bidirectional) versus discard-server (inbound-only) results across 11 censoring countries, Quack finds that only 4 countries (China, Egypt, Jordan, Turkey) also block inbound traffic; the remaining 7 apply DPI exclusively to outbound data. Direction-sensitive blocking is a confirmed capability of deployed middleboxes.
-
Quack's echo-server technique achieves vantage-point coverage of 4,458 autonomous systems across 184 countries, roughly 6.6 times OONI's 678 ASes in 113 countries, while processing over 500 domain-server pairs per second from a single measurement machine. The public IPv4 space contains over 50,000 active echo servers daily, with 47,276 stable over 24 hours.
-
Iran's number of blocked domains increases from 25 (HTTP keyword blocking) to 374 (TLS SNI-based blocking) — a 15× increase — with the newly blocked domains shifting composition to predominantly News, Human Rights, and Anonymization tools. This demonstrates that Iran maintains a distinct, more aggressive SNI blocklist for HTTPS traffic that is largely invisible to HTTP-only measurement.
-
Stateful DPI disruption in censoring countries disengages within approximately 100 seconds in 99.9% of observed cases, with roughly 50% of servers recovering within 60 seconds. A 2-minute empirically determined delay is sufficient to distinguish stateful per-connection blocking from persistent blocking when retrying with innocuous payloads against the same server.
-
DeltaShaper embeds covert TCP/IP data into Skype's encrypted video stream using a virtual camera interface, treating Skype as a black box rather than mimicking its protocol. This approach provides active-attack resistance by design: any in-path perturbation affects covert and legitimate streams identically, because real Skype software processes both. The system achieves a goodput of 2.56 Kbps (with Reed-Solomon ECC) or 3.12 Kbps (without ECC) at optimal encoding parameters (320x240 area, 8x8 cell size, 6 bits/cell, 1 fps), with RTT of approximately 3 seconds.
-
Censors in Russia, Iran, and India implement all three measured censorship techniques simultaneously: block pages, RST injection, and TTL anomalies. Iran and Cyprus censoring ASes censor content across many URL categories (including General News, Internet Services, Pornography, Gambling), while most other censoring ASes restrict only a few category types.
-
A Random Forest classifier with 100 CART trees and a sqrt(C) feature-selection strategy achieves over 85% accuracy detecting Shadowsocks traffic from biflow statistics. Accuracy increases monotonically with train-set and test-set size before plateauing.
-
Shadowsocks traffic appears as ordinary TCP with no payload keywords or obvious protocol markers because the entire payload is encrypted; firewalls cannot distinguish it from generic TLS without behavioral flow analysis. This makes signature- and keyword-based detection ineffective against it.
-
The paper identifies that Shadowsocks can also serve as a transport layer for Tor and VPN connections, meaning a Shadowsocks flow detector functions as a first-stage classifier that unmasks compounded anonymity systems. The authors explicitly cite this as a motivation for detection.
-
82.2% of ad requests from Alexa top-500 websites are sent over HTTPS (Table 2), encrypting the HTTP Referer field. This prevents censors from correlating a user's direct-path ad request back to a censored publisher domain in the vast majority of cases; only the remaining 17.8% of HTTP ad requests are vulnerable to Referer-based traffic analysis.
-
Relay-based circumvention severely degrades ad relevance: across Alexa top-500 uncensored sites, the overlap between ad sets fetched via Tor and the direct-path ground truth averaged only 28%, with near-zero overlap for sites serving geo-targeted ads. For blocked sites, only ~16% of ads shown via Tor were in the user's language.
-
Of the 55 filters that inspected the HTTP Host header, 26 keyed only on the first Host header in a multi-Host request, 27 keyed only on the last, and only 2 examined both. Placing a benign Host header in the position the filter reads and the blocked URL in the other position bypassed the filter, and this divergence in behavior tracks RFC 7230's requirement to reject multi-Host requests with a 400 error — which none of the tested filters implemented.
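
The first-versus-last divergence can be sketched with two toy filters. The hostnames, blocklist, and function names are illustrative assumptions; which Host value the origin server honours is outside this model.

```python
def host_headers(request: bytes) -> list[str]:
    """Return all Host header values in order of appearance."""
    head = request.split(b"\r\n\r\n", 1)[0]
    hosts = []
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            hosts.append(value.strip().decode("ascii"))
    return hosts


def first_host_filter(request: bytes, blocklist: set[str]) -> bool:
    """Filter that keys only on the first Host header."""
    hosts = host_headers(request)
    return bool(hosts) and hosts[0] in blocklist


def last_host_filter(request: bytes, blocklist: set[str]) -> bool:
    """Filter that keys only on the last Host header."""
    hosts = host_headers(request)
    return bool(hosts) and hosts[-1] in blocklist
```

A request carrying `Host: benign.example` followed by `Host: blocked.example` passes the first-Host filter but is caught by the last-Host filter; reversing the order flips the outcome.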
-
HTTP GET fuzzing via subtle token modifications bypassed large fractions of filters: removing the `\r\n` before the Host header bypassed 36–38 of 44 Host-header filters; embedding the censored URL in the middle of a long hostname string bypassed 33–35 filters; placing the URL in an after-Host field with a non-empty Host bypassed 29–36 filters. Blacklist coverage was also weak: no filter blocked all 100 of the Alexa top adult sites, and some blocked as few as 31.
-
Among the 44 non-DNS filters, 11 did not reassemble TCP segments and 7 did not reassemble IP fragments before inspection, meaning a censored URL split across segment or fragment boundaries evaded detection. Five filters applied fragment/segment reassembly timeouts of under 2 seconds despite maintaining HTTP request state for more than 8.5 seconds, creating a window where a deliberately fragmented flow with artificial delay avoids inspection entirely.
-
Autosonda classified 76 commercial web filters in the NYC metropolitan area into three categories: 21 (27.63%) performed DNS blacklist filtering, 44 (57.89%) matched on the HTTP Host header of GET requests, and 11 (14.47%) performed a DNS lookup of the Host header value and blocked based on the resulting IP. Autosonda found circumvention paths for 100% of filters tested.
-
All 76 filters inspected only TCP traffic: sending the identical HTTP request over UDP bypassed censorship 100% of the time. Additionally, 17 of the 49 filters that censored requests to EC2 servers only inspected traffic on port 80 and passed through the same requests sent to port 9900 without modification. No filter triggered on URI query strings, so appending query parameters to any censored URL bypassed every tested filter.
-
Never-once avoidance succeeds for 75% of source-destination pairs that do not already terminate in the US (a highly routing-central country) at δ=0.5, and for nearly all pairs avoiding less central countries. Russia is the hardest case at ~35% success (δ=0.5) due to proximity to the dense European node cluster. The median successful source-destination pair has over 1,000 valid DeTor circuits when avoiding the US and 500 when avoiding China.
-
Middlebox classification state is ephemeral: the testbed carrier-grade DPI device flushes results after 120 seconds (or 10 seconds after a TCP RST), and the GFC flushes state after 40–240 seconds depending on time of day. A strategically timed pause before the matching payload, or a TTL-limited RST packet, causes the classifier to re-evaluate the connection as unclassified traffic.
-
Iran's censor and AT&T's Stream Saver restrict DPI inspection strictly to port 80; traffic on any other TCP port escapes classification entirely. Iran additionally inspects the full flow (not just initial packets), unlike T-Mobile and the testbed device which only inspect the first few packets, making packet-count-based evasion insufficient against Iran on port 80.
-
TCP segment splitting and out-of-order delivery evades DPI classification in the testbed, T-Mobile, and Iran, but fails against the GFC—which performs extensive packet validation and correctly reassembles reordered streams—and AT&T, which uses a transparent HTTP proxy that normalizes all traffic before inspection. Payload splitting to one byte in the first packet is sufficient to defeat packet-count-limited classifiers.
-
lib·erate's TTL-limited inert packet insertion—sending a decoy packet with TTL set to expire at the middlebox but carrying a misclassifying payload—successfully evades classification in a carrier-grade testbed DPI device, T-Mobile's Binge On, and the Great Firewall of China, but fails against Iran's censor and AT&T (Table 3). When bilateral server support is available, inserting a single dummy packet at flow start evades classification in all four deployments.
-
None of the operational networks tested—T-Mobile, AT&T, the Great Firewall of China, and Iran—classify UDP traffic; the authors describe this as 'a surprisingly easy way to evade their policies.' Iran's censor inspects the entire TCP flow but leaves UDP flows untouched across all tested applications.
-
Measured packet loss rates under GFW censorship (Feb–Apr 2017, client at Tsinghua University/CERNET): Tor with meek obfuscation suffers 4.4% average PLR; Shadowsocks (AES-256-CFB) suffers 0.77% PLR; native VPN (PPTP/L2TP) and OpenVPN both achieve ~0.21% PLR. For comparison, the same tools accessed from a US vantage point show PLR below 0.1%, confirming the excess loss is GFW-induced. The GFW's DPI and active probing techniques specifically target Tor and Shadowsocks protocol signatures.
-
China's Internet censorship ecosystem is bilateral: the GFW handles technical blocking while separate government agencies (MIIT, TCA, MPS, MSS) handle non-technical regulation, and 'these two components do not operate synchronously.' Google Scholar is considered a legal service by Chinese regulators but is incidentally blocked as collateral damage because it falls under the google.com domain, blocked since 2010.
-
ScholarCloud's 'message blinding' — a non-public byte mapping (f: [0, 2^8) → [0, 2^8)) applied between domestic and remote proxy — successfully evades GFW deep packet inspection with 0.22% average packet loss rate, statistically indistinguishable from native VPN (0.21%). The paper reports that even this simple encoding suffices because the GFW cannot classify the traffic; confidentiality of the algorithm is the operative property, not cryptographic strength. Because the operator controls both proxy endpoints, the blinding scheme can be rotated at any time without requiring client-side updates.
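
A byte mapping of this shape can be sketched as a secret permutation of the 256 byte values. The paper's actual mapping is non-public; this sketch assumes it is a bijection so that the remote proxy can invert it, and it derives the table from a seed with `random.Random`, which is not cryptographically secure and is used only for illustration.

```python
import random


def make_blinding_tables(seed: int) -> tuple[bytes, bytes]:
    """Build a byte permutation and its inverse from a shared secret seed.

    Assumption: the mapping is a bijection so the remote proxy can invert it.
    """
    values = list(range(256))
    random.Random(seed).shuffle(values)
    forward = bytes(values)
    inverse = bytearray(256)
    for plain, blinded in enumerate(values):
        inverse[blinded] = plain
    return forward, bytes(inverse)


def apply_table(data: bytes, table: bytes) -> bytes:
    """Map every byte of `data` through a 256-entry table."""
    return data.translate(table)
```

Rotating the scheme amounts to choosing a new seed on both proxies, which matches the point that no client-side update is needed.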
-
77% of public bridges offer only vanilla Tor, which is trivially detectable via TLS certificate pattern matching. An additional 15% offer Pluggable Transports with conflicting security properties (e.g., obfs4 + obfs3 + obfs2 co-deployed on the same bridge), allowing a censor to confirm and block the bridge via the weakest PT and thereby disable all stronger PTs on the same IP — including active-probing-resistant transports like obfs4 and ScrambleSuit.
-
Default bridges, whose IP addresses are hardcoded in the Tor Browser Bundle, carried 91.4% of all bridge clients globally in April 2016, along with 86.1% of bridge clients in Iran and 69.2% in Syria. Because these addresses are trivially obtainable from the Tor Browser Bundle configuration files, a censor can block the vast majority of bridge users in a country at any time.
-
Evaluation of the top 10,000 Alexa websites finds that 3,916 (39%) support HTTPS, of which 1,976 (50%) perform HTTP 3XX redirects that echo the requested path in the Location header and 812 (20%) replay the URL in HTTP 404 error responses — both usable as upstream covert channels readable by downstream-only decoy routers without intercepting upstream traffic.
-
The Republic of Cyprus National Betting Authority (NBA) blocklist grew from 95 URL entries in February 2013 to 2,563 entries in April 2017 — approximately 27 times its initial size — with entries specifying full URL paths rather than just domain names, requiring DPI-capable infrastructure for correct enforcement.
-
Cypriot ISPs could not enforce HTTPS URL entries from the NBA blocklist because SSL/TLS interception was not deployed; connections to port 443 for blocked domains simply timed out with no block page or user notification, meaning HTTPS entries were effectively under-blocked.
-
Tested across 11 vantage points in 9 Chinese cities against 77 Alexa-ranked websites (50 trials each, April–May 2017), most prior TCB evasion strategies are largely broken: TCB creation with SYN achieves only 6.9% success (88.9% Failure 2), TCB teardown with FIN achieves only 11.1% success (87.9% Failure 2), while in-order data overlapping with TTL-based insertion still achieves 90.6% success and only 3.7% Failure 2. Without any evasion strategy the baseline success rate is 2.8%.
-
The GFW evolved to create a TCB not only on SYN packets but also on SYN/ACK packets, and enters a 're-synchronization state' upon seeing multiple SYN packets, multiple SYN/ACK packets, or a SYN/ACK with an incorrect acknowledgment number. Once in this state, it re-synchronizes its TCB using the next client-to-server data packet or server SYN/ACK, invalidating prior TCB-creation evasion strategies that assumed the GFW used only the first SYN sequence number.
-
INTANG, a measurement-driven tool that caches the best-performing TCP evasion strategy per server IP, achieves an average success rate of 98.3% (range 93.7%–100%) from vantage points inside China. Four combined new strategies — Improved TCB Teardown, Improved In-order Data Overlapping, TCB Creation + Resync/Desync, and TCB Teardown + TCB Reversal — each independently achieve average success rates of 94.5%–96.2% inside China and 84.6%–92.7% outside China, with Failure 2 rates below 1.1%.
-
Packets carrying an unsolicited TCP MD5 option header (RFC 2385) are silently ignored by modern Linux servers (kernel ≥ 2.6) that have not negotiated MD5 authentication, yet are accepted and processed by the GFW as normal packets that update its TCB. Crucially, none of the observed middleboxes dropped packets with MD5 options, making the MD5 header the most universally applicable insertion packet type — usable with any TCP flag (SYN, RST, or data) and immune to middlebox filtering.
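For reference, the TCP MD5 signature option defined in RFC 2385 is 18 bytes long: kind 19, length 18, then a 16-byte digest. The sketch below only assembles those option bytes, padded with NOPs to a 32-bit boundary. Filling the digest with random bytes is an assumption made for illustration, and the helper name is hypothetical; building and sending the full insertion packet is out of scope here.

```python
import os
import struct
from typing import Optional

TCP_OPT_NOP = 1
TCP_OPT_MD5SIG = 19       # option kind assigned by RFC 2385
TCP_OPT_MD5SIG_LEN = 18   # kind + length + 16-byte digest


def md5_option(digest: Optional[bytes] = None) -> bytes:
    """Return the raw bytes of a TCP MD5 option, NOP-padded to 4 bytes."""
    if digest is None:
        digest = os.urandom(16)  # assumption: any 16 bytes will do
    if len(digest) != 16:
        raise ValueError("MD5 digest must be exactly 16 bytes")
    option = struct.pack("!BB", TCP_OPT_MD5SIG, TCP_OPT_MD5SIG_LEN) + digest
    padding = bytes([TCP_OPT_NOP]) * (-len(option) % 4)
    return option + padding


option_bytes = md5_option()
```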
-
Client-side middleboxes at every tested vantage point interfere with IP-layer evasion tactics: Aliyun (6/11 nodes) discards all IP fragments, while the Tianjin China Unicom node drops packets with wrong TCP checksums or no TCP flag. IP-layer discrepancies that survive routers (e.g., IP total-length > actual length) are still dropped by some middleboxes, making IP-layer manipulations unreliable across Chinese ISPs. TCP-layer manipulations are significantly more consistent across paths.
-
DNS-sly encodes downstream data by selecting A records from the IP address pool of CDN-hosted domains. For the top 25% of Alexa Top 500 domains, approximately one third of DNS responses contain more than 8 A records and ~15% contain 15 A records; the global IP pool has a median of ~2,000 IPs per domain (maximum ~16,000), enabling b = floor(log2(s!/(s-c)!)) bits per response.
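The capacity formula can be checked numerically. The sketch below reads s as the size of a domain's IP pool and c as the number of A records in one response, so s!/(s-c)! counts the ordered selections of c addresses; that reading of the symbols is an assumption inferred from the formula, and the function name is hypothetical.

```python
import math


def bits_per_response(pool_size: int, records: int) -> int:
    """floor(log2(s! / (s - c)!)) for pool size s and c A records."""
    if not 0 <= records <= pool_size:
        raise ValueError("need 0 <= records <= pool_size")
    arrangements = math.perm(pool_size, records)  # s! / (s - c)!
    # For a positive integer n, n.bit_length() - 1 equals floor(log2(n)).
    return arrangements.bit_length() - 1


median_pool_bits = bits_per_response(2000, 8)
```

Under these assumptions a response with 8 A records drawn from a 2,000-address pool carries 87 bits.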
-
DNS-sly requires out-of-band distribution of a 2.3 MB compressed bootstrap package (user profile map) before covert communication begins. The authors explicitly reject automated in-band bootstrapping to preserve deniability, accepting a hard scalability constraint as the cost; the particular censored environment tested did not interfere with DNS traffic at all, enabling successful censored-site retrieval at the same throughput rates as uncensored tests.
-
DNS-sly achieves statistical deniability by profiling each user's organic DNS behavior — recording accessed domains, semantic topics, and resolver-specific IP addresses — and constructing upstream requests that semantically overlap with that profile. Upstream communication is indistinguishable from normal DNS traffic in volume, frequency, and semantics; all DNS headers are fully legitimate with no unusual record types.
-
Slitheen replaces only 'leaf' HTTP resources (images, video) in overt-site responses with covert content, reusing all TCP/IP headers verbatim and forwarding packets immediately on arrival. This forces every observable feature—packet size, direction, inter-arrival timing—to be identical to a genuine access of the overt page, eliminating the censor's ability to apply latency analysis, website fingerprinting, or protocol fingerprinting to distinguish decoy sessions from normal traffic.
-
Measurement of the Alexa top 10,000 TLS sites showed that the fraction of traffic replaceable by a Slitheen relay varies from 0% (Facebook, due to large TLS records preventing leaf replacement) to 100% (Wikipedia, Yahoo). For representative sites: Reddit achieved 70% ±10% of leaf bytes replaced (19% ±3% of total page bytes), Gmail 87.7% ±0.2% of leaf bytes (23% ±9% total), and Quora 99% ±5% of leaf bytes (20% ±10% total), as reported in Table 2.
-
Salmon's defense against the active zig-zag attack — where a censor blocks a known server to force users onto new ones and watches for correlated reassignments — requires both per-user authentication (unique login credentials per server so unauthorized probes receive a plausible HTTPS page) and traffic camouflage. Without authentication, the server must respond as a functioning proxy to any connection, fully exposing itself to the censor; without camouflage, even a rejected connection may reveal the server's nature.
-
Adding a DPI apparatus with true positive rate TPR and false positive rate FPR creates three thresholds governing censor strategy, ordered Fam ≤ Fab ≤ Fmb whenever TPR ≥ FPR: allow all traffic (CTP ≤ Fam), deploy the apparatus (Fam < CTP ≤ Fmb), or block all traffic (CTP > Fmb). The apparatus does not qualitatively change the Nash equilibrium structure; it only shrinks the CTP range the circumventor can sustain.
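The three regimes can be written as a small decision function. This is a minimal sketch of the threshold rule only: the threshold values below are hypothetical, and the derivation of Fam and Fmb from TPR, FPR, and the cost parameters is not reproduced.

```python
def censor_strategy(ctp: float, f_am: float, f_mb: float) -> str:
    """Map a circumvention traffic proportion (CTP) to the censor's move."""
    if f_am > f_mb:
        raise ValueError("expected f_am <= f_mb")
    if ctp <= f_am:
        return "allow"        # CTP <= Fam: allow all traffic
    if ctp <= f_mb:
        return "deploy_dpi"   # Fam < CTP <= Fmb: deploy the apparatus
    return "block"            # CTP > Fmb: block all traffic


# Hypothetical thresholds, chosen only to exercise each branch.
moves = [censor_strategy(ctp, f_am=0.05, f_mb=0.30) for ctp in (0.01, 0.10, 0.50)]
```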
-
A censor can mount a zero-collateral-damage flooding attack by injecting fake CRS-protocol-conformant traffic into open channels, inflating the apparent CTP and evicting real circumvention traffic to throttled or sacrificial protocols. If injection is costless the censor can drive real circumvention throughput to zero while keeping all channels nominally open; the attack is equally effective against both throttling and dumping CTP control strategies.
-
In a single-round censorship game the only Nash equilibrium that keeps the channel open requires the circumvention traffic proportion (CTP) satisfy CTP ≤ F, where F = (βant+βbnt)/(αact+αbct+βant+βbnt). In repeated indefinite games a stable equilibrium exists at CTP = Z = (1−p)·CTPmax, where p is the per-round continuation probability, allowing a non-zero proportion of circumvention traffic to flow indefinitely without triggering shutdown.
-
Snowflake exclusively uses WebRTC data channels (on-wire protocol: DTLS), whereas the majority of WebRTC applications use media channels (DTLS-SRTP or SRTP/SDES); a censor can therefore block Snowflake by filtering data-channel flows alone without blocking WebRTC media applications, incurring minimal collateral damage and reducing the overblocking deterrent.
-
The authors extend Houmansadr et al.'s 'parrot is dead' argument to WebRTC: because WebRTC is a large multi-protocol framework, superficial mimicry that fails to replicate exact DTLS version, cipher suite ordering, certificate common name ('WebRTC'), 30-day validity period, STUN server selection, and ICE packet sequence leaves detectable residual distinguishers, making deep fingerprint conformance especially hard for standalone non-browser implementations such as Snowflake's client.
-
Among the five WebRTC applications analyzed (Google Hangouts, Facebook Messenger, OpenTokRTC, Sharefest, Snowflake), Snowflake is uniquely identifiable by its use of DTLSv1.2 (all others use DTLSv1.0), its 17 offered cipher suites, and its exclusive selection of TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256—a cipher suite not chosen by any other application in the study.
-
STUN and TURN packets carry a SOFTWARE attribute that explicitly names the server implementation (e.g., 'Citrix-3.2.5.1 Marshal West' for OpenTokRTC), and the choice of STUN servers, forced-TURN usage, and STUN message-type sequence (Binding-only vs. Allocate+CreatePermission vs. send-indication) differ across applications, providing a passive censor with reliable application-level fingerprints orthogonal to the DTLS layer.
-
Castle's packet-size and inter-packet-time distributions (measured via Kolmogorov-Smirnov statistic) fall within the variance observed between legitimate human-game sessions when using ≤50 units/command at ~1 command/second; the best-performing classifier (Herrmann) achieved only ~60% accuracy, roughly 10 percentage points above random guessing, against multiple Castle configurations, while two other classifiers (Liberatore, Shmatikov timing) performed near chance.
-
Of 73 censorship resistance systems surveyed through February 2016, only 11 address the Communication Establishment phase versus 62 for Conversation, even though Tschantz et al. document that real censorship attacks concentrate on Communication Establishment rather than on the Conversation tunnel.
-
The Great Firewall detects Tor bridges through a two-stage active-probing pipeline: GFW DPI first flags a flow as a potential Tor connection, then random Chinese IP addresses initiate Tor handshakes to the suspected bridge; if the handshake succeeds, the bridge IP:port combination is blocked.
-
χ² homogeneity tests on 70 audio signal pairs show that at SNR ≥ 25 dB the probability that a statistical test fails to distinguish modulated from original signals rises to 77.13% (i.e., the rate of successful discrimination is below 23%). Crucially, this analysis requires access to the original unmodulated signal; for live voice transmissions no such pairing is feasible for the censor, rendering statistical detection unrealizable in practice.
-
The paper's threat model explicitly assumes censors can enforce client-side VoIP software (e.g., TOM-Skype in China) giving the adversary access to the pre-encoding audio signal at both endpoints. Despite this, SkypeLine forces the censor into an all-or-nothing position: intercepting hidden data requires blocking the entire VoIP service, since no network-layer observable (packet headers, timing, encrypted payload) distinguishes steganographic from legitimate calls.
-
SkypeLine's m-ary modulation (Mode B using 128-bit Hadamard sequences) achieves a peak data rate of 2,407 bps, representing a 12,035% improvement over FHSS-based DSSS (Takahashi et al., 20 bps) and 19,256% over phase-coding techniques (Nutzinger et al., 12.5 bps). Four-layer parallel binary modulation (Mode A, Quattro) achieves a peak of 224 bps and mean of 106.61 bps at ≥99% reconstruction accuracy.
-
A Skype prototype operating under real-world conditions achieves 64 bps (WGN noise, no ECC) at ≥99% reconstruction accuracy and ≥23 dB SNR. With OPUS/Silk encoding (vector quantization), throughput is constrained to approximately 72 bps at two modulation layers; additional layers fail to satisfy the 99% accuracy bound because VQ codec noise reduction filters the embedded pseudo-noise sequences.
-
Wireshark captures of Skype traffic with and without hidden information at inaudible SNR show no statistically significant differences in inter-arrival times (mean IAT 0.019 s in all conditions) and only a 2.6% difference in mean packet length (130.34 bytes unmodulated vs. 126.98 bytes at inaudible SNR), well within one standard deviation (SD ≈ 12–14 bytes) and insufficient for reliable content-mismatch detection.
-
By transmitting application-level social media content over genuine SMTP/IMAP connections rather than imitating email protocols, Mailet achieves channel and content consistency, making it immune to the differential channel attacks (channel mismatch and content mismatch) that defeated earlier hide-within systems such as StegoTorus and FreeWave.
-
Mailet's GCM-based Credential Recovery (GCM-CR) achieves a 120x speedup over traditional garbled-circuit 2PC for privately reconstructing split credentials inside a live TLS record, enabling a single Mailet server to support up to 200 simultaneous sessions with each service request completing in approximately 1 second.
-
CovertCast uses the identical video codecs, streaming protocols (RTMP/HTTPS), and server endpoints as any other YouTube live stream, making it indistinguishable from regular streaming traffic to both passive protocol-analysis and active traffic-manipulation attacks. Any active attack that disrupts CovertCast connections — such as selective packet dropping — would equally disrupt all non-circumvention viewers of the same streaming service, imposing prohibitive collateral damage.
-
Protocol imitation systems (SkypeMorph, CensorSpoofer, StegoTorus) fail to achieve unobservability because they implement the target protocol only partially, creating statistical discrepancies that censors can detect. Houmansadr et al. (2013) demonstrated this as a fundamental flaw: unobservability by imitation is categorically insufficient as a circumvention design principle.
-
A survey of the top 10,000 Alexa websites found that only 6% (Class 1) are fully hosted on shared CDNs with HTTPS deployments that allow removal of destination leakage — the only class browsable with plausible unobservability against a competent DPI-equipped censor — while 64% are partial-CDN sites (Class 4) whose CDN-hosted content (images, videos) can still be reached via content wrappers or dynamic mirrors at negligible operational overhead.
-
A domain-based website fingerprinting attack against CDNBrowsing traffic — using the per-domain packet volume exchanged during a browsing session as a decision-tree feature vector — achieves 0.991 ± 0.002 accuracy against CacheBrowser on 100 China/Iran-blocked HTTPS pages, modestly outperforming the state-of-the-art k-NN classifier of Wang et al. (0.94 ± 0.002) while being two orders of magnitude faster: 0.60 CPU-seconds training and 10 µs classification versus 90 CPU-seconds and 0.05 CPU-seconds on an Intel Xeon 3.5 GHz processor.
-
Real-world CDN HTTPS deployments leak the identity of visited websites through three distinct channels — TLS certificate contents (A2, B1, B2 deployments), the plaintext SNI field (B1), and dedicated IP address mappings (B2) — enabling censors to block CDNBrowsing connections via standard DPI or IP filtering without collateral damage to non-forbidden CDN content. Each leakage channel requires inspecting only a single packet from an HTTPS connection, making the attack low-cost and deployable on off-the-shelf censorship boxes.
-
Table 1 of the survey documents that by 2013–2014 censors were deploying simultaneous blocking across BGP, DNS, IP/port filtering, TCP disruption, TLS, and application-layer keyword filtering. No single detection tool in the survey covers all six layers; the most comprehensive, OONI (2012), covers DNS, IP/port, TCP, TLS, keyword, and HTTP but notes only partial BGP coverage.
-
The paper formally defines circumvention as either preventing the trigger from being seen by the surveillance device, or countering the effects of the censoring action. This two-path decomposition — hide the trigger vs. nullify the enforcement — provides a clean design framework: a circumvention tool can succeed by making traffic unrecognizable (no trigger fires) or by routing around the blocking device (action nullified).
-
Marionette is the first programmable obfuscation system to simultaneously satisfy all five threat-model dimensions evaluated in Figure 2: resistance to blacklist DPI, whitelist DPI, statistical-test DPI, protocol-enforcing proxy traversal, and multi-layer traffic control, while sustaining throughput above 1 Mbps (up to 6.7 Mbps). Every prior system (obfs4, ScrambleSuit, SkypeMorph, StegoTorus, FTE, JumpBox, etc.) fails at least one dimension, most commonly stateful proxy traversal or statistical-feature control.
-
Randomization-based obfuscation systems (obfs2/3, obfs4, ScrambleSuit, Dust) resist blacklist DPI but fail entirely under protocol-whitelist filtering, as explicitly demonstrated during the Iranian elections where censors permitted only known-good protocols. Pure randomization provides no signal of being a permitted protocol, making it trivially blockable under any whitelist regime.
-
Format-Transforming Encryption (FTE) fails under proxy-induced ciphertext modification, since a single character change causes decryption failure, while Marionette's probabilistic context-free grammar (CFG) templates tolerate header rewriting, connection multiplexing, and content alteration by intermediate proxies. This was validated across 10,000 streams through Squid 3.4.9, with Marionette achieving 5.8 Mbps downstream and 0.41 Mbps upstream goodput.
-
He et al. found that 65% of sampled routes between public traceroute servers have some degree of AS-level asymmetry; John et al. found that asymmetry reaches 96% on Tier-1 ISP backbone links due to hot-potato routing. These figures invalidate the symmetric-route assumption underlying Telex and Cirripede and motivate a fully asymmetric design.
-
Rebound's mole protocol generates a characteristic traffic pattern — a steady stream of long HTTP GET requests followed by 404-style error responses — that may be identifiable via traffic analysis even though the channel is TLS-encrypted; the paper acknowledges this as an unmitigated vulnerability and notes that intermingling with ordinary requests reduces observability but further lowers effective throughput.
-
Rebound eliminates the stack-fingerprinting vulnerability present in Telex, Curveball, Cirripede, and TapDance by never forging packets addressed to the client; all data from the decoy router to the client travels through the real decoy host, so the TCP/IP stack fingerprint observed by a censor is always that of the genuine decoy.
-
The GFW's active-probing system launches probes at suspected circumvention servers within seconds (typically under 3 minutes) of observing a suspicious connection, making reactive defenses (e.g., delaying or rate-limiting probe responses) insufficient on their own to avoid detection and blocking.
-
The GFW sends protocol-specific probe payloads tailored to each circumvention tool: Tor bridges receive a TLS ClientHello mimicking Tor's own; obfs2/obfs3 servers receive random-looking payloads; Shadowsocks servers receive random bytes. A server that responds differently to these crafted probes versus innocent traffic (e.g., by sending a valid protocol handshake in response to a probe) reveals itself and is subsequently blocked.
-
Domain fronting exploits the fact that major CDN providers (Google, Amazon CloudFront, Akamai, Microsoft Azure) terminate TLS at the edge before inspecting the Host header, so the SNI visible to a censor names a permitted CDN domain (e.g., www.google.com) while the inner HTTP Host header routes the request to a blocked destination. Blocking the fronted service requires blocking the entire CDN, creating collateral damage that most censors are unwilling to accept for major providers.
-
The paper formally characterizes the censor's visibility gap: the SNI field in the TLS ClientHello and the HTTP Host header inside the tunnel are the two places that reveal destination, and CDNs that terminate TLS before forwarding HTTP requests prevent censors from correlating them. Any censor capable of correlating SNI to inner-Host (e.g., through CDN cooperation or plaintext HTTP/2 framing) can defeat domain fronting without CDN blocking.
-
Iran's censorship infrastructure shifted from fully decentralized (Jaccard similarity ~0 across ISPs in 2007) to highly centralized by June 2011, when the Jaccard similarity between the national gateway AS 12880 and two other ISPs reached 0.94 and 0.95. Almost all 2011 blocking was accompanied by a blockpage containing an iframe redirecting to internal IP 10.10.34.34, providing direct evidence of a single choke-point filtering infrastructure.
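The Jaccard similarity used here is the size of the intersection of two blocklists divided by the size of their union. A minimal sketch follows; the domain names are placeholders and do not come from the measurement.

```python
def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Placeholder blocklists observed from two hypothetical vantage points.
isp_one = {"a.example", "b.example", "c.example"}
isp_two = {"b.example", "c.example", "d.example"}
similarity = jaccard(isp_one, isp_two)  # 2 shared out of 4 total
```

A value near 0 means the two ISPs block almost disjoint sets of sites, while a value near 1 means nearly identical blocklists, which is why 0.94 and 0.95 point to shared filtering infrastructure.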
-
In high-censorship countries, locally curated URL lists elicit higher blocking rates than global lists: in China and Yemen, local content was blocked three to five times more often than globally sensitive content, which is attributed to language filtering and active censorship of local political discourse. China's 99% block rate on 'falun' in the HTTP path versus 81% for 'falun' in the domain name further illustrates trigger sensitivity.
-
Across MENA countries (UAE, Tunisia, Oman, Iran, Qatar, Yemen, Saudi Arabia, Burma), over 80% of blockpage-delivering tests delivered the blockpage without DNS redirection, indicating transparent web proxies performing deep HTTP inspection rather than the cheaper DNS-intercept approach dominant in China. McAfee SmartFilter was identified in Qatar, Saudi Arabia, and UAE; Netsweeper in Qatar, UAE, and Yemen.
-
Yemen's national ISP (YemenNet) uses explicit blockpages for social and Internet-tools content while applying stealthy techniques (TCP RST injection and unanswered HTTP GETs) specifically for political and conflict content that is constitutionally protected. Censorship also ceases intermittently when the ISP exhausts filtering product licenses.
-
The Great Cannon (GC) operates as a distinct in-path system — not an extension of the GFW — capable of both injecting and suppressing traffic, enabling full man-in-the-middle capability against targeted IP addresses. Unlike the on-path GFW, the GC only examines the first data packet of each connection (avoiding TCP bytestream reassembly), targets specific destination IP addresses rather than all border traffic, and maintains a per-source-IP flow cache of approximately 16,000 entries to ignore already-processed connections.
-
Routing traffic from a user on ISP-B through a peer relay on ISP-A (which applied only HTTP-level filtering and permitted HTTPS) produced the smallest page load times in most cross-ISP comparison runs, beating both HTTPS/domain-fronting and Tor. The performance gain is attributed to lower end-to-end latency on the intra-country cross-ISP path relative to international relay routes.
-
Across two major Pakistani ISPs, blocking mechanisms varied substantially for the same URL: ISP-A applied HTTP-level blocking with redirection to a block page, while ISP-B deployed multi-stage blocking combining DNS-level resolution to localhost and independent HTTP/HTTPS request dropping. A single ISP also used different filtering techniques for different URL categories (e.g., YouTube vs. HTTPS-accessible sites).
-
At least two ISPs (Cyta and Wind) returned fake HTTP 404 errors instead of mandated block pages for a portion of censored entries, and some ISPs served connection timeouts (port 443 blocked) with no explanation — in both cases obscuring deliberate censorship as an apparent network or server failure. Additionally, Cyta embedded Google Analytics on its block landing page to track users who attempted to access censored content.
-
Across eight Greek ISPs measured in June–August 2014, DNS hijacking was the dominant blocking method: seven of eight ISPs used it exclusively, while only Vodafone deployed DPI (Bluecoat WebProxy/6.0) for URL-level filtering. Compliance with the EEEP blacklist of 438 entries ranged from 21.91% (Forthnet) to 100% (Cosmote, HOL, OTE), with no ISP exactly matching the regulator's list.
-
Vodafone Greece's DPI system (Bluecoat WebProxy/6.0) performed exact-URL matching against the EEEP blacklist: requests to rivernilecasino.net and www.rivernilecasino.net passed through unblocked, while the exact blacklisted URL www.rivernilecasino.net/index.asp was intercepted and redirected to http://1.2.3.50/ups/no_access_gambling.htm. Subdomains of DNS-hijacked domains returned NXDOMAIN with no A record, making them silently unreachable rather than redirected.
-
Rook constructs per-field symbol tables by observing 600 packets (~60 seconds) of real gameplay at session start, then restricts substituted values to only those previously observed with frequency within two orders of magnitude of the median. This ensures altered packets never contain field values that are absent or anomalously rare in legitimate traffic, defeating value-anomaly and out-of-range DPI filters.
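One way to picture the symbol-table filter is the sketch below. It counts how often each value of a single field was observed and keeps only values whose count is no less than one hundredth of the median count. Treating "within two orders of magnitude" as a lower bound only is an interpretation, and the function name and sample data are hypothetical.

```python
from collections import Counter
from statistics import median


def build_symbol_table(observed_values, max_ratio: float = 100.0) -> list:
    """Keep field values that are not anomalously rare.

    A value survives if its count is at least median_count / max_ratio,
    where max_ratio = 100 stands for two orders of magnitude.
    """
    counts = Counter(observed_values)
    if not counts:
        return []
    median_count = median(counts.values())
    return sorted(v for v, n in counts.items() if n * max_ratio >= median_count)


# Hypothetical observations of one mutable field during warm-up.
observed = [7] * 1000 + [8] * 900 + [9] * 1100 + [250] * 2
symbols = build_symbol_table(observed)
```

In this made-up sample the value 250 appears only twice against a median count of 950, so it is excluded and never used as a substitute.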
-
Rook achieves 34 bits/second client-to-server and 26 bits/second server-to-client within Team Fortress 2, sufficient for OTR-encrypted real-time chat. Rook use did not trigger Valve Anti-Cheat warnings and did not noticeably degrade gameplay for co-located legitimate players.
-
Kolmogorov-Smirnov two-sample tests on packet-size distributions and inter-packet timing show that standard Rook (altering ~1-in-10 packets) is statistically indistinguishable from normal TF2 gameplay across 20 samples each. High-bandwidth Rook (1-in-2 packets) shows a slightly higher average bandwidth but remains difficult to distinguish on traffic-shape metrics.
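The two-sample Kolmogorov-Smirnov statistic behind such comparisons is the largest vertical gap between two empirical CDFs. The sketch below computes only that statistic in plain Python, without the significance test, and the packet sizes are made-up values rather than Rook measurements.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b) -> float:
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # Both ECDFs are step functions that change only at data points,
    # so checking every observed value is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Made-up packet sizes in bytes for two captures.
baseline = [60, 64, 64, 72, 80, 96]
candidate = [60, 64, 68, 72, 80, 96]
distance = ks_statistic(baseline, candidate)
```

A statistic close to 0 means the two distributions nearly coincide; a value of 1 means the samples do not overlap at all.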
-
Because Rook runs the actual game client and server rather than mimicking them, active anti-mimicry probes receive identical responses to a normal game instance. Systems based on protocol mimicry are vulnerable to probes that expose non-conforming behavior, but Rook eliminates this attack surface entirely.
-
Game-specific trigram analysis of mutable fields distinguishes high-bandwidth Rook (1-in-2 substitution rate) in server-side packet counts, showing clearly reduced distinct-trigram counts versus baseline. Standard Rook (1-in-10) produces only a few outliers and is not reliably distinguishable; any detector would face a high false-negative or false-positive rate against normal-rate Rook.
-
Format-transforming encryption (FTE) as deployed in the Tor Browser Bundle is detected by combining a URI Shannon-entropy threshold (≥5.5 bits) with an exact URI length check (239 bytes) on the first HTTP GET request. This embellished test produces only 264 false positives across approximately 10 million HTTP URIs in three campus datasets, while a length-only test causes roughly 15% false-positive rate over the same flows.
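The combined test can be sketched as an entropy computation plus a length check. The sketch assumes the entropy is taken over the byte-value frequencies of the URI, which the source does not spell out, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte of the byte-value distribution."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_like_fte(uri: bytes, min_entropy: float = 5.5, exact_len: int = 239) -> bool:
    """Flag a first-GET URI that has the exact length and high entropy."""
    return len(uri) == exact_len and shannon_entropy(uri) >= min_entropy


ordinary = b"/index.html?lang=en"
flagged = looks_like_fte(ordinary)  # wrong length, so not flagged
```

Requiring both conditions is what keeps false positives low: plenty of ordinary URIs are 239 bytes long, but few of those also look close to random.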
-
CART decision-tree classifiers trained on entropy-based and packet-header features detect all five Tor pluggable transports (obfsproxy3/4, FTE, meek-amazon, meek-google) with average PR-AUC=0.987, TPR=0.986, and FPR=0.003 on synthetic traces. On 14 million real campus flows the highest per-obfuscator FPR is 0.65%, and meek-google yields only 842 false positives across all three datasets. However, cross-environment portability is poor: classifiers trained on an Ubuntu/campus setup and tested on a Windows/home network achieve true-positive rates as low as 52% with false-positive rates reaching 12%.
-
Obfsproxy3 and obfsproxy4 are reliably detected by an entropy-distribution test (KS test, block size k=8) applied to the first 2,048 bytes of the first client-to-server packet, combined with a minimum payload-length check of 149 bytes. On three university campus datasets totaling over 14 million TCP flows, the test achieves TPR=1.0 with FPR ranging from 0.24% to 0.33%. Omitting the length check raises the SSL/TLS false-positive rate to approximately 23%.
-
A semantics-based attack that flags HTTP flows carrying structurally invalid PDF documents as StegoTorus produces false-positive rates as high as 43% across three campus datasets (10,847 PDF flows examined), because malformed, partial, and non-standard PDFs are common in real network traffic. By contrast, active HTTP-response fingerprinting of a suspected StegoTorus server yields only 0.03% false positives (3 matching servers out of 9,320 Alexa-top-10K servers), but requires active probing and is detectable by the proxy operator.
-
CloudTransport achieves 'entanglement' by using the exact same cloud-client libraries, protocols, and network servers as legitimate cloud storage applications, making it immune to protocol-discrepancy detection that defeated imitation systems like SkypeMorph. Iranian censors blocked Tor by exploiting differences in Diffie-Hellman moduli between genuine SSL and Tor's SSL and the expiration dates of Tor's SSL certificates; CloudTransport has no such discrepancies because it is not an imitation. Simple line-speed tests based on tell-tale differences in protocol headers or public keys cannot be used to recognize CloudTransport.
-
Syria's Blue Coat proxies blocked any URL containing the string "proxy," generating 3,954,795 censored requests (53.61% of all policy-censored traffic in Dfull). The collateral damage was severe: Google Toolbar's /tbproxy/af/query API calls and Facebook social plugins (/plugins/like.php at 43.04% and /extern/login_status.php at 38.99% of facebook.com censored traffic) together account for over 80% of censored facebook.com requests, all denied with 0 allowed counterparts.
-
Syrian censors used a custom Blue Coat URL-category to policy_redirect specific Facebook pages (Syrian.Revolution: 1,461 censored) while allowing 17.70M facebook.com requests overall — only 1.62M (8.4%) were censored. The URL-pattern matching was imprecise: www.facebook.com/Syrian.Revolution?ref=ts was blocked but the identical page with additional AJAX query parameters (__a=11&ajaxpipe=1) was not categorized as 'Blocked Site,' leaving some access through.
-
Port 9001 (Tor) ranked third among all blocked ports in Syria, behind only ports 80 and 443. Proxy SG-48 was responsible for a disproportionate share of Tor censorship — blocking Tor traffic for multiple consecutive days — while other proxies in the same deployment did not, indicating per-proxy policy specialization or traffic steering of suspected circumvention flows to dedicated blocking infrastructure.
-
By embedding messages in heavily quantized DCT frequency components at base JPEG quality 30, TRIST achieves near-zero bit error rates when images are transcoded to higher quality levels and back. The quantization mapping is many-to-one, so noise introduced by re-encoding tends to be stabilized on output, making the message robust against commodity transcoding proxies that re-encode images in-flight.
-
Using low DCT frequency components (indices 10, 9, 8, 3) at JPEG quality 30 achieves near-zero message error rates for image rescaling in the 75–95% range across a wide range of sharpening sigma values. Higher-frequency component sets (indices 18, 17, 16, 10) only survive rescaling above 100%, making them unsuitable for scenarios where censors reduce image dimensions.
-
TRIST integrated with StegoTorus as a one-hop SOCKS proxy introduces minimal additional bandwidth overhead: JPEG steganography throughput falls between StegoTorus's PDF and JSON schemes across link delays of 20–400 ms and 1–4 parallel circuits. The steganographic expansion factor is 1:6 to 1:12 (message bytes to cover JPEG file length), adequate for basic web surfing.
-
Page length comparison at a 30.19% size-difference threshold achieves a 95.03% true positive rate and 1.371% false positive rate for block page detection, outperforming DOM similarity (95.35% TP, 3.732% FP) on false positive rate and cosine similarity (97.94% TP, 1.938% FP, 74.23% precision) on precision. These metrics were evaluated via ten-fold cross-validation on the ONI dataset of ~500,000 entries from 49 countries spanning 2007–2012.
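A length-based detector of this kind can be sketched in a few lines. The sketch compares the length of a page fetched from the tested network with the length of the same page fetched from an uncensored control, and flags the result when the relative difference exceeds the threshold. Normalizing by the larger of the two lengths is an assumption, as are the function names and sample sizes.

```python
def length_difference(test_len: int, control_len: int) -> float:
    """Relative size difference, normalized by the larger page."""
    larger = max(test_len, control_len)
    if larger == 0:
        return 0.0
    return abs(test_len - control_len) / larger


def is_block_page(test_len: int, control_len: int, threshold: float = 0.3019) -> bool:
    """Flag the test page when it differs from the control by more than the threshold."""
    return length_difference(test_len, control_len) > threshold


# Hypothetical sizes: a short block page versus a much longer real page.
suspicious = is_block_page(test_len=1_000, control_len=6_000)
normal = is_block_page(test_len=9_500, control_len=10_000)
```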
-
Five commercial filtering products (FortiGuard, Squid, Netsweeper, Websense, WireFilter) were identified in 7 of 36 block-page clusters via copyright notices in HTML comments, HTTP header strings, or URL path patterns; the remaining 29 clusters contained no identifying markup. WireFilter was first detected in the wild in Saudi Arabia (AS 25019) in 2011, representing a newly deployed filtering product not previously observed in measurements.
-
Within a single country mandate, different ISPs implement censorship with different filtering tools and mechanisms: Thailand's AS 9737 and AS 17552 use structurally distinct block-page templates (vector 17 is ~1,000 bytes using div layout; vector 8 is ~6,000 bytes using table layout). Both ISPs actively obfuscate their filtering product by reporting generic 'Server: Apache/2.2.9 (Debian)' or 'Server: Apache' HTTP headers instead of the actual product identifier.
-
Facade encodes 78.04 bits per HTTP GET request using search-query terms, compared to Infranet's 3 bits per URL — a ~26× improvement — while maintaining comparable statistical deniability. StegoTorus encodes 12,000 bits per URL but offers no statistical deniability against traffic-pattern analysis.
-
Content inconsistency — transmitting non-native payloads (e.g., modem signals or general web traffic) over VBR-encoded VoIP/video channels — is sufficient for censors to detect camouflage systems via packet-length traffic analysis. Channel inconsistency — requiring reliable transport over a loss-tolerant UDP channel — allows selective disruption: dropping 5% of packets stalls SkypeMorph indefinitely, and dropping 90% for under one second desynchronizes the FreeWave modem.
-
DFA state-space explosion makes DFA-based FTE impractical for many realistic network-monitor regexes: the minimum DFA for `(a|b)*a(a|b){16}` has 131,073 states requiring 266 MB of precomputed tables, while the equivalent NFA has only 36 states requiring 73 KB, a reduction of more than three orders of magnitude (roughly 3,600×). Some formats in the Snort corpus required up to 383 MB under DFA-based ranking, rendering them prohibitive for deployment.
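The blow-up can be reproduced with a textbook subset construction over a hand-built NFA for `(a|b)*a(a|b){k}`. This is a sketch, not LibFTE's code: the NFA here has k+2 states, and the state counts produced by a particular regex compiler (such as the 36 and 131,073 quoted above) may differ slightly from this construction.

```python
def dfa_state_count(k: int) -> int:
    """Count reachable subset-construction states for (a|b)*a(a|b){k}."""

    # NFA: state 0 loops on a/b and may guess that the current 'a' is the
    # marked one; states 1..k+1 count the k trailing symbols; k+1 accepts.
    def step(states: frozenset, symbol: str) -> frozenset:
        successors = set()
        for state in states:
            if state == 0:
                successors.add(0)
                if symbol == "a":
                    successors.add(1)
            elif state <= k:
                successors.add(state + 1)
        return frozenset(successors)

    start = frozenset({0})
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for symbol in "ab":
            target = step(current, symbol)
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return len(seen)
```

Each DFA state has to remember which of the last k+1 symbols were `a`, so the count doubles with every increment of k (2^(k+1) states in this construction), while the NFA grows by a single state.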
-
LibFTE exposes a regex-based API (Python, C++, JavaScript) that instantiates DPI-defeating FTE schemes from a regular-expression format specification alone, without expert cryptographic knowledge. The DCRS FTE scheme implemented in the library makes ciphertexts indistinguishable from real HTTP, SMTP, SMB, or other network-protocol messages under state-of-the-art DPI, and was already integrated into the Tor Browser Bundle at time of publication.
-
LibFTE's NFA-based 'relaxed ranking' sidesteps the PSPACE-hardness obstacle that previously made direct NFA ranking unworkable. Across 3,458 Snort IDS regular expressions in the network-monitor-circumvention setting, NFA-based ranking reduces client/server memory requirements by as much as 30% compared to DFA-based approaches.
-
The paper argues that the advantage in the censor-vs-circumvention arms race lies with the censor due to fundamental asymmetry: a nation state controls centralized communication infrastructure while dissidents depend on it. Standalone anti-censorship tools therefore face a structurally disadvantaged security posture that iterative patching cannot overcome.
-
Centralized communication architectures have a single global point of failure: governments can leverage centralization to surveil with or without operator cooperation, as demonstrated by the Snowden revelations about Skype, Facebook, and Google. A compromised broker in a centralized design enables monitoring and censorship that spans all users of the service.
-
The paper sketches a decentralized DHT-based communication protocol where all payloads are encrypted in TLS and explicit redirection enables a form of onion routing. Because the censor cannot distinguish censored from non-censored streams, it is forced into a binary choice: block all protocol traffic (overblocking) or allow all of it.
-
Known attacks on existing circumvention tools include steganographic detection, enumeration of decoy-router locations, and machine-learning traffic classifiers. The paper acknowledges these defeat current approaches (Infranet, Collage, Telex, SkypeMorph, FreeWave) and argues that no iterative patch can neutralize the censor's long-term structural advantage.
-
GNS encrypts all DHT queries and responses using a zone-private-key-derived symmetric key (h = x·l mod n; query = H(hG)) such that a passive DHT observer can only mount a confirmation attack — requiring simultaneous knowledge of both the zone's public key and the specific label. Without both values, an adversary observing DHT traffic cannot determine the label, zone, or record data; even fully participating malicious DHT nodes see only opaque signed blobs unlinkable to their originating query.
-
A pre-shared key enables encrypting the entire GoHop packet—header, payload, and padding bytes—achieving true randomness in the full byte stream. Standard VPN protocols such as OpenVPN encrypt only the payload while leaving headers in plaintext, exposing protocol-identifying fields to DPI without payload inspection. This design choice is a prerequisite for defeating header-based fingerprinting.
-
Spreading UDP datagrams across a randomized port range breaks traditional 5-tuple-based session tracking, randomizes per-port inter-arrival times, and reduces per-port throughput to a small fraction of the aggregate—making per-flow statistical analysis significantly harder. Critically, the number of random ports does not reduce aggregate throughput: GoHop measured 76.8 Mbps (1 port) versus 78.5 Mbps (100 ports) at the virtual NIC.
-
Asymmetric IP routing is a fundamental constraint on prior E2M designs: tier-2 ISPs typically see around 25% of packets on asymmetric paths, while tier-1 ISPs can have up to 90% of packets on asymmetric flows. Because Telex requires observing both directions of a connection to derive the client-server TLS master secret, this asymmetry severely constrains where it can be deployed. TapDance resolves this by using chosen-ciphertext steganography to leak the master secret from client to station in a single upstream packet, making it functional under fully asymmetric routing.
-
TapDance introduces chosen-ciphertext steganography, which allows the client to embed an arbitrary-length hidden message inside a valid TLS ciphertext without invalidating the TLS MAC or session. By exploiting ciphertext malleability in both stream-cipher (counter) mode and CBC mode, the client can choose specific byte values to appear in the ciphertext while constraining plaintext to a safe ASCII range (0x40–0x7F), encoding 6 bits of tag data per ciphertext byte. This provides unbounded covert-channel bandwidth, compared to the fixed 224-bit TLS nonce used by Telex and Decoy Routing or the 24-bit TCP ISN used by Cirripede.
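For the stream-cipher case, the idea can be sketched with plain XOR. The code below is a toy model rather than TapDance's implementation: it assumes the client already knows the keystream bytes, and it assumes the tag is carried in the low six bits of each ciphertext byte (which bits are used is not stated above).

```python
def embed_tag(tag_sextets, keystream):
    """Choose plaintext bytes in 0x40-0x7F so the ciphertext's low 6 bits equal the tag."""
    plaintext = bytearray()
    for sextet, key_byte in zip(tag_sextets, keystream):
        if not 0 <= sextet < 64:
            raise ValueError("each tag symbol must fit in 6 bits")
        low_bits = sextet ^ (key_byte & 0x3F)  # cancels the keystream's low 6 bits
        plaintext.append(0x40 | low_bits)      # top two bits fixed to 01
    return bytes(plaintext)


def stream_encrypt(plaintext, keystream):
    """Stand-in for a stream cipher: XOR the plaintext with the keystream."""
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


def extract_tag(ciphertext):
    """Read the 6-bit tag symbols back out of the ciphertext bytes."""
    return [c & 0x3F for c in ciphertext]
```

Because the plaintext is chosen before encryption, the MAC is computed over an ordinary ciphertext and stays valid; the two fixed top bits are what limit each byte to six bits of tag data.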
-
Scanning a 1% sample of the IPv4 address space and the Alexa top-1-million domains, the authors found that over half of all TLS hosts will leave an incomplete HTTP request connection open for at least 60 seconds before sending data or closing the connection; many had timeouts exceeding 5 minutes. The 16-core TapDance station prototype processes over 12,000 tag verifications per second per core, with approximately 90% of CPU time consumed by a single ECC point multiplication on Curve25519. The station adds a median latency of 270 milliseconds to page downloads versus direct connections, and a single station instance can be overwhelmed by approximately 1.2 Gbps of TLS application-layer traffic.
-
Iran's censors preferred throttling over outright shutdown because it is less conspicuous and draws less controversy. The paper notes that NDT-style bulk-transfer tests cannot detect targeted, DPI-based throttling of specific protocols (VPN, Tor, streaming), since those present different traffic signatures than generic TCP bulk transfers. Iran's filtering infrastructure (TCI/ITC, AS12880) runs deep packet inspection as an auxiliary layer on top of ISP-level controls.
-
Measurement of Alexa top-500 websites across 18 categories found that over 50% of the internet's most-visited sites were blocked in Iran, with adult content blocked at over 95% and the Art category the third-most censored. DNS hijacking was applied selectively to only three domains (facebook.com, youtube.com, plus.google.com), while HTTP Host filtering accounted for the vast majority of blocks.
-
Traceroutes from one major Iranian ISP to 3,160 destination IPs across 13 countries consistently showed a single private-address node (10.10._._) as the first observable external hop, preceded by one of only two TCI-owned transit nodes. TTL-based probing confirmed that both HTTP and DNS blocking originated at this same centralized node, suggesting that the processing capacity of this national chokepoint is a key bottleneck in Iran's censorship infrastructure.
-
Iran's HTTP censorship allows the TCP three-way handshake to complete normally before acting on the HTTP GET request: the censor responds with a '403 Forbidden' and simultaneously sends 5 spoofed RST packets to the destination server (3 with in-sequence numbers, 2 with seemingly random offsets). No modifications to TCP/IP or HTTP headers were observed at either endpoint, ruling out a transparent proxy and pointing to inline DPI.
-
URL filtering appliances are frequently misconfigured to be externally visible on the global Internet, enabling passive identification via Shodan keyword searches on product-specific HTTP headers and management console paths (e.g., 'cfru=' for Blue Coat, '8080/webadmin/' for Netsweeper). This technique discovered previously unknown installations in Finland, Sweden, the Philippines, Thailand, Taiwan, Argentina, and Chile, as well as at large U.S. ISPs including AT&T, Verizon, BellSouth, Comcast, and Sprint.
-
tracebox identified a transparent HTTP proxy or IDS within a National Research Network (SUNET) that intercepted port-80 SYN probes but not port-21 SYN probes, producing shorter observed path lengths to port 80. It also found proxy misconfigurations causing forwarding loops for non-HTTP traffic, where ICMP replies alternated between two routers indefinitely.
-
Manually-generated FTE regexes achieve a 100% misclassification rate against all six tested DPI systems — appid, l7-filter, YAF, bro, nProbe, and the proprietary enterprise-grade DPI-X — for HTTP, SSH, and SMB target protocols. Each regex took less than 30 minutes to specify and debug against known classifiers.
-
FTE proxy overhead compared to socks-over-ssh: the intersection-ssh format incurred 0% average latency increase and only 16% bandwidth overhead (1,164 KB vs. 1,348 KB per Alexa Top 50 site). The worst-case auto-http format incurred 29% latency increase (5.5 s vs. 7.1 s) and 181% bandwidth overhead (3,279 KB), primarily due to ciphertext expansion and FTE/SOCKS negotiation on persistent empty TCP connections.
-
An FTE-tunneled Tor circuit using intersection, manual, and auto HTTP formats successfully traversed the Great Firewall of China from a VPS inside China to a server in the United States on port 80. A persistent tunnel polling a censored URL every five minutes remained active for one month until VPS account termination, with no blocking observed.
-
Regex-based DPI is fundamentally vulnerable to format-transforming encryption: because every tested system (including the proprietary enterprise-grade DPI-X, rated for 1.5 Gbps at $8,000) classifies protocols solely by membership in a regular language, any ciphertext can be guaranteed to match any chosen regex. The paper argues this forces DPI to adopt machine learning, active probing, or non-regular semantic checks — but notes that making such checks fast, scalable, and low-false-positive at line rate for arbitrary target protocols remains an open problem.
-
In the standard redirect design the cooperating proxy's IP address or domain name appears in plaintext HTTP redirect responses, because the censored client cannot present a valid TLS certificate to the OSS and must use plain HTTP. A censor inspecting OSS-bound traffic can extract the proxy address from the Location header or URL query parameters. The no-redirect variant (client and server each initiate single scans of each other) eliminates this leakage at the cost of higher latency and server-side OSS enumeration.
-
FreeWave's modem generates audio whose packet-length distribution has dramatically lower variance than human speech, even when transmitted through Skype's variable-bit-rate encoder; Figure 9 shows that English and Portuguese speech samples produce high-variance packet-length sequences while modem audio produces a narrow, nearly constant distribution, providing a reliable passive classifier for modem-over-VoIP traffic. This content mismatch persists even with perfect emulation of the VoIP protocol framing.
-
Protocol mimicry approaches (SkypeMorph, StegoTorus, CensorSpoofer) do not execute the target protocol in full and leave detectable discrepancies: SkypeMorph fails to replicate Skype's TCP handshake, and CensorSpoofer's IP-spoofing downstream channel enables active traffic analysis by censors who can inject manipulated packets and observe whether the purported VoIP endpoint reacts. The authors state that morphing approaches provide no provable indistinguishability, and protocol evolution further invalidates mimicry over time.
-
FreeWave-over-Skype produces traffic statistically indistinguishable from the genuine Skype-Speak state: average packet rate 49.91 pps vs. 50.31 pps for Skype-Speak, and average packet size 148.64 bytes vs. 146.50 bytes. However, the Skype-Silent state generates a similar packet rate but distinctly smaller packets (49.57 pps, 103.97 bytes avg), creating a detectable anomaly when both FreeWave endpoints appear to be 'speaking' simultaneously rather than alternating.
-
Hypothetical fixed parrot systems (SkypeMorph+ and StegoTorus+) that correct all passive detection failures remain unambiguously detectable via active and proactive attacks (Table II). Supernode cache flushing and TCP control channel manipulation — e.g., sending RST causes genuine Skype to drop the call immediately while parrots produce no reaction — distinguish them from genuine Skype because the parrot cannot actually execute Skype protocol logic.
-
CensorSpoofer's IP-spoofing architecture has an unfixable detection flaw: the spoofer cannot receive or respond to SIP probe messages (INVITE, invalid SIP, BYE for random call IDs) directed at the spoofed dummy host, so four SIP probing tests (Table IV) reliably distinguish CensorSpoofer from genuine Ekiga at a cost within reach of a local censor. The nmap-based dummy-host selection algorithm identifies only 12.1% of 10,000 random IPs as candidate hosts; SIP probing of 10,000 random addresses found zero IETF-based VoIP clients.
-
The authors enumerate 12 requirements a parrot system must satisfy simultaneously (Correct, SideProtocols, IntraDepend, InterDepend, Err, Network, Content, Patterns, Users, Geo, Soft, OS) while a censor need detect only one failure. They conclude 'unobservability by imitation is a fundamentally flawed approach' and recommend embedding covert traffic in genuine encrypted payloads of a real running protocol (e.g., FreeWave in Skype voice, SWEET in email), which constrains detection to OM adversaries performing large-scale multi-flow analysis.
-
SkypeMorph and StegoTorus-Embed fail 5 of 9 standard Skype identification tests (Table I), including the TCP control channel (T9), SoM packet headers (T3), and periodic message exchanges (T6/T7). All failures are detectable by a local (LO) passive censor at line speed without requiring ISP-scale statistical analysis.
-
The StegoTorus-HTTP module returns '200 OK' for non-existent URIs, produces no response to HEAD, OPTIONS, DELETE, and TEST method requests, and omits xref tables from generated PDF files. Using httprecon with 9 request types, the StegoTorus server is distinguishable from any real HTTP server by an OB (resource-limited) censor that records port-80 destination IPs at line speed and fingerprints them offline.
-
GFW reassembles both IP fragments and TCP segments for HTTP connections, but its overlap-resolution policy diverges from receiver behavior in documented cases: it prefers the original IP fragment in all overlap configurations except when the challenger is simultaneously left-long and right-long (IP2), and prefers a later left-equal TCP segment over the original (TCP5). The paper tests all 18 possible fragment overlap cases and confirms that placing a banned keyword only in the fragment version GFW discards achieves evasion.
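The effect of divergent overlap policies can be shown with a simplified reassembly model. This is an illustrative sketch only: it reduces the policy to "keep the first copy" versus "keep the last copy" of each byte, which is far coarser than the case-by-case behavior described above, and the function name is hypothetical.

```python
def reassemble(segments, prefer):
    """Rebuild a byte stream from (offset, data) pieces given in arrival order.

    prefer="first" keeps the earliest byte seen at each position;
    prefer="last" lets later pieces overwrite earlier ones.
    """
    if prefer not in ("first", "last"):
        raise ValueError("prefer must be 'first' or 'last'")
    stream = {}
    for offset, data in segments:
        for i, byte in enumerate(data):
            position = offset + i
            if prefer == "first" and position in stream:
                continue
            stream[position] = byte
    return bytes(stream[p] for p in sorted(stream))


# Two overlapping pieces: the second rewrites bytes 5..10 of the first.
pieces = [(0, b"GET /aaaaaa"), (5, b"banned")]
monitor_view = reassemble(pieces, prefer="first")
receiver_view = reassemble(pieces, prefer="last")
```

Here a monitor that keeps the first copy reconstructs `GET /aaaaaa`, while a receiver that keeps the last copy reconstructs `GET /banned`, so the two parties disagree about what was sent.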
-
GFW exhibits three confirmed HTTP analysis gaps: it inspects only the first Request-URI and Host header in HTTP-pipelined requests (HTTP3), will not scan beyond 2,048 bytes into a Request-URI (HTTP2), and recognizes only standard percent-encoding while ignoring alternative URI encodings such as overlong UTF-8 (HTTP4). The authors classify all three as low-difficulty fixes for the censor, meaning they may be patched quickly once disclosed.
-
GFW maintains TCP connection state for up to ≈10 hours and tolerates up to ≈1 GB of client-to-server data, but drastically reduces these limits when a sequence hole exists: it abandons state after buffering only 1 KB above the hole (TCP9) and times out holed connections in 60–90 minutes rather than ≈10 hours (TCP10). These thresholds were confirmed over repeated measurements and represent the maxima tested, not precise censor-configured limits.
-
GFW instantiates a TCB upon observing a bare SYN before any SYN-ACK (TCP1), enabling a split-connection evasion: a client sends a low-TTL SYN visible to GFW but not the server, then opens the real connection on the same 5-tuple with a different initial sequence number. GFW tracks the phantom TCB and fails to detect banned keywords on the real, desynchronized connection. This same behavior also renders GFW vulnerable to SYN-flooding-style memory exhaustion.
-
A TTL-limited bare FIN packet (without ACK) is sufficient to induce GFW to tear down its connection state for a live TCP session (TCP6b), because GFW accepts FIN packets that violate RFC 793's requirement for the ACK flag. After induced state teardown, subsequent packets carrying banned keywords on the same connection produce no RST, confirming the monitor has lost track of the flow.
-
Among 1,175 Chinese circumvention users surveyed in late 2012, purpose-built anti-censorship platforms showed severe attrition: Freegate had 44.3% former users but only 15.3% current users, while GoAgent and paid VPNs (piggybacking on commercially indispensable infrastructure) were the top two most-used tools in the past month. The median respondent had used four different types of circumvention tools, indicating frequent switching driven by blocking events.
-
China's 2012 real-name registration law for consumer-facing online services (including VPNs) is designed to enable censors to segment circumvention-related consumer VPN traffic from business VPN traffic — permitting selective blocking of consumer VPNs while leaving corporate VPNs operational. The GFW had already demonstrated protocol-level VPN blocking capability; registration provides the identifying information needed to apply that capability selectively rather than as a blunt instrument.
-
Tor, which has minimal commercial footprint and a distinctive network signature, was blocked throughout China using tailor-made GFW countermeasures and lost approximately 85% of its Chinese users as a result. In contrast to GoAgent and VPNs, China's censors can block Tor without significant economic collateral damage, making it uniquely vulnerable despite its strong privacy properties.
-
Key distribution is the primary bootstrapping weakness of steganography-based censorship-resistance systems: a censor can simply block stego-key distribution. Identity-based steganographic tagging (IBST) eliminates this attack surface by requiring only a single master public key, which can be bundled with the client software — no key distribution inside the censored area is necessary.
-
The IBST construction is provably secure under the bilinear decisional Diffie-Hellman (BDDH) assumption in the random oracle model. Any adversary with advantage ε(λ) against IBST indistinguishability implies an adversary against BDDH with advantage at least ε(λ)/(e·(1+qE)), where e is the base of the natural logarithm and qE is the number of private-key extraction queries. Tags produced by the scheme are computationally indistinguishable from uniform random bitstrings for any party lacking the recipient's private key.
-
ScrambleSuit defeats active probing by requiring clients to prove knowledge of an out-of-band shared secret before the server responds; a probing censor receives only silence. Two mechanisms are provided: session tickets (preferred for non-Tor applications) and an authenticated UniformDH handshake (optimized for Tor's shared-secret bridge distribution model), with both producing payloads computationally indistinguishable from random.
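The "silence unless authenticated" behavior can be sketched with an HMAC over random padding. This is not ScrambleSuit's wire format: the message layout, the `MARK_LEN` constant, and the function names are assumptions, and replay protection is omitted.

```python
import hashlib
import hmac
import os

MARK_LEN = 32  # length of the HMAC-SHA256 proof appended to the message


def client_hello(shared_secret: bytes, padding_len: int = 64) -> bytes:
    """Build a first message: random padding followed by a keyed proof."""
    padding = os.urandom(padding_len)
    proof = hmac.new(shared_secret, padding, hashlib.sha256).digest()
    return padding + proof


def server_respond(shared_secret: bytes, message: bytes):
    """Return a reply only for authenticated clients; None means stay silent."""
    if len(message) <= MARK_LEN:
        return None
    padding, proof = message[:-MARK_LEN], message[-MARK_LEN:]
    expected = hmac.new(shared_secret, padding, hashlib.sha256).digest()
    if not hmac.compare_digest(proof, expected):
        return None
    return b"handshake-continues"
```

A prober that does not hold the out-of-band secret cannot produce a valid proof, so it never elicits a response that would confirm the service.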
-
Tor's traffic contains a characteristic prevalence of 586-byte packets (Tor's 512-byte cells plus TLS header overhead) that form a strong flow-level fingerprint detectable from a few dozen captured packets. ScrambleSuit's packet length morphing eliminates this signature and shifts the distribution toward MTU-sized packets. The authors also note that defeating the VNG++ classifier, which relies on coarse features like connection duration, total bytes, and burstiness, would require only a marginal increase in ScrambleSuit's overhead.
-
DPI boxes used for censorship do not rely solely on simple regular expressions but also employ context-sensitive languages for protocol identification. The paper notes that precise knowledge of these DPI patterns could be fed directly into format-transforming encryption to enable targeted protocol misidentification.
-
Iran deployed a new Tor-blocking strategy in February 2013 that caused direct Tor user counts to collapse from over 50,000 to near zero within weeks, as recorded by Tor Project metrics.
-
As of March 2013, Tor is documented as blocked in China, Iran, Syria, Ethiopia, the UAE, and Kazakhstan. Blocking techniques range from simple IP address blacklisting to a sophisticated hybrid consisting of deep packet inspection (DPI) and active probing.
-
Tor's TLS handshake exhibited multiple distinguishing fingerprints — including the client cipher list, server certificates, and randomly generated SNIs — that were used for TLS-based filtering in Ethiopia, China, and Iran. Inferring the exact byte-level pattern matched by DPI boxes required manual analysis and remains a difficult open problem as of 2013.
-
SWEET argues that mimicking complex protocols (SkypeMorph, CensorSpoofer, StegoTorus) is fundamentally breakable because comprehensive imitation of today's protocols is infeasible. The paper instead advocates tunneling inside genuine traffic from actual, widely-used protocol providers — in this case real email services — so the censor observes authentic protocol behavior rather than a simulation.
-
When using a foreign encrypted email provider (AlienMail), the censor observes only an encrypted connection to the foreign mail server (e.g., Gmail's servers in the U.S.); it cannot see the recipient address or the SWEET server's IP, making spam-filtering-style blocking of the SWEET endpoint entirely infeasible. This anonymity is provided by the mail provider's own TLS, requiring no additional obfuscation from the client.
-
When using a domestic email provider that collaborates with the censor (DomesticMail), SWEET clients must embed tunneled data via steganography (image or text) and coordinate a secondary secret email account with the SWEET server out-of-band. This prevents the censor from discovering the SWEET server association via recipient-field inspection, but adds operational complexity and requires an out-of-band bootstrapping channel.
-
Blue Coat's commercial DPI hardware/software, deployed in Syria, was confirmed capable of detecting and blocking Ultrasurf connections. Blue Coat logs recovered from Syria additionally exposed real Ultrasurf user behavior, including unproxied traffic leaking to non-Ultrasurf servers before and after bootstrapping completed.
-
During disclosure, the UltraReach team confirmed to the researcher that the Ultrasurf protocol has no forward secrecy and uses RC4 without any integrity check (no MAC or HMAC). All recorded ciphertext can therefore be retrospectively decrypted once a session key is recovered, and the stream is trivially malleable.
-
Ultrasurf's DNS bootstrapping phase uses subdomain names that are always exactly 16 characters between delimiters and exclusively target .info TLDs, producing a constant byte-width network signature. The paper concludes that filtering this bootstrapping traffic is straightforward even without reverse engineering the client binary, as the client itself acts as a network discovery oracle for censors observing its connections.
-
Flash proxy tunnels carry inherent network-level fingerprints that survive application-layer obfuscation: WebSocket connections begin with a plaintext HTTP upgrade handshake followed by structured binary framing, and Flash socket connections open with a crossdomain XML policy request — both are distinguishable from ordinary TCP by a DPI middlebox.
-
OONI's experiment-control methodology explicitly favors false positives over false negatives: it is preferable to generate more censorship candidate events for further investigation than to miss genuine interference. Mismatch between experiment and control data is not always a definitive signal of manipulation but is treated as sufficient cause for flagging, and data collection and analysis are treated as distinct phases.
-
OONI observes that many interception devices deployed in the wild advertise their vendor and model information, making passive device identification feasible from probe-level observations alone. The framework is designed to locate interception devices and then apply probing techniques to fingerprint the specific vendor and product in use.
-
OONI's threat model assumes an adversary capable of country-wide traffic manipulation who may actively fingerprint and identify measurement probes. Prior measurement tools (e.g., ONI's rTurtle) used easily fingerprinted centralized DNS and HTTPS traffic, which the authors flag as a pattern to avoid. The authors acknowledge that anti-fingerprinting measures will likely reduce measurement accuracy — a trade-off unresolved at publication.
-
OONI's traffic manipulation test suite uses bidirectional traceroute comparison: asymmetry between inbound and outbound paths for specific source/destination port pairs is treated as an indicator that traffic is being diverted to an interception device. Additional per-flow indicators include timing differences in packets directed at specific ports and layer-7 header field manipulation detectable at the receiving endpoint.
-
SkypeMorph's packet size and inter-packet delay distributions are statistically indistinguishable from real Skype video calls: Kolmogorov-Smirnov tests on both the naïve traffic-shaping and enhanced Traffic Morphing outputs report p > 0.5, indicating no significant difference from the Skype target distribution. The original Tor traffic distribution, by contrast, is considerably different from Skype, validating the need for the morphing layer.
-
Encrypted channels expose only two statistical features to an external observer: packet sizes and inter-packet arrival times. Original Traffic Morphing (Wright et al. 2009) shaped only packet-size distributions, leaving inter-packet timing as an unobfuscated fingerprint identical to the source (Tor) distribution. SkypeMorph extends Traffic Morphing to jointly sample from nth-order conditional distributions of both packet sizes and inter-packet delays (tested up to n = 3), closing the timing gap.
-
BTP's wire protocol contains no handshakes, timeouts, or plaintext headers. Connections open with a pseudo-random b-byte tag that the recipient can compute in advance from its key state, making BTP frames indistinguishable from random data to a passive observer who does not know the shared secret.
-
China's censoring devices send four spoofed RST packets per filtered connection with varying sequence and ACK numbers and TTL values corresponding to roughly the hop count to the Chinese border; the IP ID field increments sequentially per TTL group, strongly implying a small cluster of out-of-band machines co-located at each border router. Because the device is out-of-band, the actual server response still arrives at the client but is preempted by the injected RSTs.
-
China's censoring device is stateful: it inspects only the first HTTP GET request after a TCP handshake and ignores subsequent requests or those without a preceding handshake. After blocking a request, it records the (src IP, dst IP, port, protocol) tuple and denies all further communication between that machine pair for approximately 12 hours, even for traffic that would not independently trigger censorship.
-
Across 11 countries, censorship execution falls into seven distinct categories: DNS redirect to localhost (Malaysia, Russia, Turkey), DNS redirect with warning page (South Korea), connection timeout with no notification (Bangladesh, India), spoofed TCP RST injection (China), spoofed HTTP 403 with warning page (Bahrain, Iran), HTTP 302 redirect (South Korea, Thailand), and spoofed HTTP 200 iframe response (Saudi Arabia). Four countries censor at DNS and eight at routers, with South Korea employing both layers simultaneously.
-
South Korea operates DNS-based and router-based censorship simultaneously; sites blocked at the DNS resolver are a strict subset of those blocked at the router, verified by switching to an external DNS resolver and observing continued blocking at the router layer. Alternate DNS resolvers alone are therefore insufficient to circumvent South Korean censorship, in contrast to Malaysia, Russia, and Turkey where DNS-only bypass is adequate.
-
Tor's fixed 512-byte cells packed into TLS 1.0 records produce a characteristic TCP payload of 586 bytes (512 + 74 bytes of TLS overhead). A perimeter filter running a simple exponential moving average (τ ← ατ + (1−α)1ₗ₌₅₈₆, α=0.1, T=0.4) identifies Tor flows within a few dozen packets; this attack succeeds at backbone rates of ~540,000 packets/second on commodity hardware. Obfsproxy does not alter packet sizes or timings and therefore does not defeat this classifier.
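The moving-average filter can be sketched as follows. The update rule is taken literally from the formula quoted above, with α weighting the previous estimate; if the original uses the opposite convention, swap `alpha` and `1 - alpha`. The function name and the zero initial estimate are assumptions.

```python
def flag_tor_flow(payload_lengths, alpha=0.1, threshold=0.4, target=586):
    """Return the 1-based packet index at which the flow is flagged, or None."""
    tau = 0.0  # running estimate of the share of target-length packets
    for index, length in enumerate(payload_lengths, start=1):
        indicator = 1.0 if length == target else 0.0
        tau = alpha * tau + (1 - alpha) * indicator
        if tau > threshold:
            return index
    return None
```

The filter keeps a single float per flow and does one multiply-add per packet, which is why it is cheap enough to run at backbone packet rates.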
-
The GFC identifies Tor connections via a unique TLS ClientHello cipher list sent by the Tor client. Once DPI boxes detect this fingerprint on outbound traffic, active scanning is initiated within minutes: scanners connect to the suspected bridge, attempt to build a Tor circuit, and if successful the IP:port tuple is blocked. This two-stage pipeline (fingerprint → confirm → block) allows dynamic bridge blocking without pre-enumeration.
-
Tor DPI fingerprinting by the GFC is applied exclusively to egress traffic (from inside China to the outside world). Simulated Tor connections between domestic Chinese nodes and between external nodes connecting inward to a Chinese VPS attracted zero active scans across multiple experimental runs, indicating the detection infrastructure is positioned on the border for outbound flows only.
-
Even with end-to-end encrypted messages, a censor observing subscription queries can detect anomalous interest in a short tag (e.g., a sudden domestic surge in followers of a foreign pop star's hashtag) and use timing/size traffic analysis to distinguish #h00t subscriptions from ordinary hashtag follows. The paper flags this as an open threat and proposes two mitigations: (1) push cover traffic for randomly selected short tags to all clients regardless of their actual subscriptions, or (2) silently redirect normal clients' hashtag follows to the corresponding #h00t short tags.
-
If a large site such as Google or Wikipedia scrambled all served content using a publicly known de-scrambling algorithm, the censor faces a strict all-or-nothing blocking decision: it cannot selectively filter banned scrambled content without blocking the entire site, since scrambled legitimate and banned content are computationally indistinguishable prior to running S⁻¹. This property scales the political cost of blocking proportionally to the size of the co-scrambling platform.
-
Scrambling without secret key management can frustrate DPI-based censors if the de-scrambling function satisfies 'high-inertia' — meaning an adversary computing S⁻¹ on n inputs cannot use less than Θ(n) times the resources of a single commodity-PC user, including electricity, memory, and computation time. This forces bulk censorship to become computationally infeasible without over-censoring all scrambled content.
-
Transmitting the de-scrambling algorithm S⁻¹ as in-page JavaScript alongside AJAX-fetched scrambled content eliminates the need to install special client software or to distribute trusted public keys. This removes the primary bootstrapping vulnerability shared by cryptographic censorship-resistance schemes, including Tor; Iran exploited that vulnerability when it blocked Tor by filtering its Diffie-Hellman parameter bit sequence.
-
The proposed multi-stage scrambling composes four orthogonal layers: (a) 128-bit AES with 20 bits stripped, requiring brute-force search; (b) an AES key derived from a CAPTCHA solution; (c) a memory-bound function key; and (d) blocks whose de-scrambling exploits JavaScript floating-point and string-processing quirks. Each layer independently forces a censor to build or emulate a distinct acceleration environment, multiplying total reverse-engineering cost.
-
Applying a BEAR all-or-nothing package transform (using a zero key) to message blocks forces any censor attempting to scan content to cache all blocks from all active concurrent transfers simultaneously, since no individual block reveals any information about the original message until all blocks are received. Artificially delaying block transmission amplifies censor state requirements proportionally.
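The all-or-nothing property can be illustrated with a small Python sketch of a BEAR-style three-round unbalanced Feistel transform under an all-zero key. HMAC-SHA256 as the keyed hash and SHAKE-256 as the stream generator are stand-ins chosen for this sketch, and all names are hypothetical; the paper's exact instantiation and block layout are not reproduced here.

```python
import hashlib
import hmac

HASH_LEN = 32               # SHA-256 output size, used as the left-half width
ZERO_KEY = bytes(HASH_LEN)  # publicly known all-zero key, as in the text


def _keyed_hash(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def _stream(seed: bytes, length: int) -> bytes:
    # SHAKE-256 used as a stand-in stream cipher keyed by the seed.
    return hashlib.shake_256(seed).digest(length)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def bear_encrypt(message: bytes, k1: bytes = ZERO_KEY, k2: bytes = ZERO_KEY) -> bytes:
    """Transform the whole message so every output byte depends on every input byte."""
    if len(message) <= HASH_LEN:
        raise ValueError("message must be longer than the hash output")
    left, right = message[:HASH_LEN], message[HASH_LEN:]
    left = _xor(left, _keyed_hash(k1, right))
    right = _xor(right, _stream(left, len(right)))
    left = _xor(left, _keyed_hash(k2, right))
    return left + right


def bear_decrypt(package: bytes, k1: bytes = ZERO_KEY, k2: bytes = ZERO_KEY) -> bytes:
    """Invert bear_encrypt; requires every byte of the package."""
    left, right = package[:HASH_LEN], package[HASH_LEN:]
    left = _xor(left, _keyed_hash(k2, right))
    right = _xor(right, _stream(left, len(right)))
    left = _xor(left, _keyed_hash(k1, right))
    return left + right
```

Because the first inverse step hashes the entire right half, a scanner holding only some of the transmitted blocks cannot begin to recover any plaintext, which is why it must buffer every in-flight transfer in full.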
-
Libya implemented escalating Internet disruptions before executing a sustained blackout: a 6.8-hour curfew on February 18 and an 8.3-hour curfew on February 19, followed by a 3.7-day near-total blackout beginning March 3. The authors detected what they believe were Libya's attempts to test firewall-based packet filtering before transitioning to more aggressive BGP-based disconnection, demonstrating a two-phase escalation pattern.
-
Using two CAIDA traces from March 2011, the byte volume of TCP SYN packets across all ports was only 4–7% that of port-443 traffic. Cirripede's registration design inspects only SYN packet headers rather than full HTTPS payloads, reducing the traffic an ISP must process by 14–25× compared to Telex/Decoy routing architectures that must reconstruct all port-443 TCP sessions.
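The 14–25× figure follows directly from the 4–7% byte ratio. A short Python check, where the function name is hypothetical and the only inputs are the two percentages quoted above:

```python
def processing_reduction(syn_to_443_ratio: float) -> float:
    """Factor by which a SYN-only inspector's byte load is smaller than
    that of a system processing all port-443 traffic."""
    if not 0.0 < syn_to_443_ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return 1.0 / syn_to_443_ratio


low = processing_reduction(0.07)   # about 14.3
high = processing_reduction(0.04)  # about 25
```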
-
A preplay attack defeats the TLS-sentinel covert channel: the adversary intercepts each ClientHello, immediately sends a copy to the decoy destination before the client's copy arrives, causing the sentinel to be consumed and poisoned. The client can never establish a decoy routing session while ordinary TLS to the decoy destination continues to work normally, giving the adversary both blocking capability and forensic confirmation that decoy routing was attempted. The paper notes this vulnerability is specific to the TLS sentinel and that alternatives such as port-knocking sentinels may not share it.
-
TCP flow hijacking by the decoy proxy is practical under an asymmetric routing assumption: expected sequence numbers are recoverable from ACK values in client-originated packets alone, so the decoy router need not observe return traffic. The proxy forges a TCP RST to the decoy destination and mimics its TCP options (timestamp, window scale, SACK) to reduce detectability; these options are conveyed encrypted inside the sentinel's 28-byte TLS random field.
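The sequence-number recovery relies only on standard TCP bookkeeping: the acknowledgment number in a client packet is the next byte the client expects from the server. A minimal Python sketch, with hypothetical type and function names, of how an on-path device that sees only the client-to-server direction could pick the numbers for a packet forged in the server's name:

```python
from dataclasses import dataclass

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass(frozen=True)
class ClientPacket:
    seq: int          # sequence number of the first payload byte
    ack: int          # next server byte the client expects
    payload_len: int  # number of payload bytes in this packet


def forged_server_numbers(pkt: ClientPacket) -> tuple:
    """Return (seq, ack) for a packet impersonating the server,
    derived solely from one client-originated packet."""
    forged_seq = pkt.ack % SEQ_MODULUS
    forged_ack = (pkt.seq + pkt.payload_len) % SEQ_MODULUS
    return forged_seq, forged_ack
```

For a client packet with seq=1000, ack=5000 and 200 payload bytes, the forged reply would use seq=5000 and ack=1200.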
-
Clients embed HMAC-derived, time-varying sentinels into the 28-byte random field of the TLS ClientHello message, which decoy routers can scan at line rate. Sentinels are keyed to the current hour and a per-hour sequence number, providing freshness. This covert channel requires no out-of-band signaling and is invisible to passive observers who see only a normal TLS handshake toward the decoy destination.
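A rough Python sketch of how such a sentinel could be derived and checked. The 28-byte field size, the hour-based freshness, and the per-hour sequence number come from the text; HMAC-SHA256, the message encoding, truncation to 28 bytes, and all names are assumptions for illustration rather than the paper's actual construction.

```python
import hashlib
import hmac
import struct

RANDOM_FIELD_LEN = 28  # bytes available in the ClientHello random field


def make_sentinel(shared_key: bytes, unix_time: int, seq: int) -> bytes:
    """Derive the sentinel for a given hour and per-hour sequence number."""
    hour = unix_time // 3600
    message = struct.pack(">QI", hour, seq)  # assumed encoding: 8-byte hour, 4-byte seq
    digest = hmac.new(shared_key, message, hashlib.sha256).digest()
    return digest[:RANDOM_FIELD_LEN]


def sentinels_for_hour(shared_key: bytes, unix_time: int, count: int) -> set:
    """Precompute the sentinels a decoy router should accept during this hour."""
    return {make_sentinel(shared_key, unix_time, s) for s in range(count)}


def is_sentinel(random_field: bytes, table: set) -> bool:
    """Constant-work check suitable for scanning every ClientHello."""
    return random_field in table
```

Precomputing the table once per hour reduces the per-handshake work at the router to a single set lookup, which is what makes line-rate scanning plausible.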
-
A politically active blogger in an anonymized censored country explicitly avoided BlackBerry encryption stating: 'they can't crack that encryption and they would just get suspicious. Cause they listen to me and listen to me and then suddenly I am encrypting and so that means I am really saying something they don't want me to.' This documents censor behavior where the mere use of strong encryption—independent of content—serves as a targeting signal.
-
A passive observer of BridgeSPA traffic sees only a TCP connection timeout on failed authorization or a successful TLS connection on success—exactly what they would observe with an unmodified Tor bridge. The ConnectionTag is indistinguishable from the normally-random ISN and timestamp fields in Linux 2.6, so no new observable artifact is introduced. However, BridgeSPA does not address the separate problem that Tor traffic itself remains fingerprint-distinguishable from HTTPS; this is an orthogonal concern.
-
Encrypted protocols such as SSL/TLS remain fully fingerprintable through their unencrypted handshakes: DPI can apply static string matching, packet-length comparison, and timing profiling to the cleartext cipher-negotiation and key-exchange phase to identify and block the protocol even though the payload is encrypted.
-
Dust defeats DPI fingerprinting by constructing all packets from entirely encrypted or single-use random bytes (defeating static string matching), appending a random number of random padding bytes to every packet (defeating length matching), and permitting a complete client–server conversation to be encoded in a single UDP or TCP packet (defeating timing analysis for sufficiently small payloads).
-
Dust eliminates the in-band key-exchange fingerprint surface via an out-of-band half-handshake: the server's public key, IP, port, and a single-use secret are bundled into a PBKDF-encrypted invite packet transmitted out-of-band; only the decryption password (not the server IP) appears in plaintext, defeating the email/IM IP-address blocking attacks documented against prior systems.
-
BitTorrent's Message Stream Encryption (MSE), despite omitting static strings from the handshake, can be identified with 96% accuracy using packet-size analysis and direction-of-packet-flow; MSE also uses a cleartext Diffie-Hellman key exchange, leaving an additional fingerprint surface.
-
The obfuscated-openssh handshake encrypts SSH with a key derived from an iterated-hash PBKDF whose slowness was intended to prevent real-time censor analysis; Wiley argues this defense fails because modern censors use statistical packet sampling with offline processing, and the slow key generation itself introduces a timing side-channel detectable from the inter-packet delay between the first and second packets.
-
Censors responding to encryption-based circumvention have two escalation options: block all encrypted connections outright, or identify the underlying protocol via traffic signatures that persist even inside encrypted tunnels. The paper frames these as the two dominant censor responses to DPI being defeated by encryption.
-
National-level filtering is not homogeneous: the administrative burden of maintaining up-to-date filtering rules at national scale leads states to delegate implementation to regional authorities or individual ISPs, producing measurable filtering differences between geographic regions and providers within the same country.
-
Telex embeds steganographic tags in TLS ClientHello nonces using elliptic-curve Diffie-Hellman, placing proxy stations at ISP level on paths between the censor's network and popular uncensored destinations. Because the cover destinations are ordinary popular HTTPS websites, the censor cannot block Telex without simultaneously blocking a large class of legitimate TLS traffic — converting the censor's own reluctance to over-block into an unblockability guarantee.
-
The study located 495 router interfaces with attached IDS filtering devices across China, with CHINANET holding 79.4% and CNCGROUP 17.4%. The two ISPs use fundamentally different placement strategies: CHINANET distributes filtering across provincial networks (80% of its 21 served provinces operate their own filtering devices, Guangdong alone hosting 84 of 374 CHINANET interfaces), while 90% of CNCGROUP's 82 filtering interfaces concentrate in its backbone.
-
CNCGROUP's filtering interface count has grown to three times its 2007 level, now accounting for 17.4% of all 495 filtering interfaces found, while CHINANET's count has remained stable since 2007. This divergence indicates CNCGROUP is actively expanding its censorship infrastructure while CHINANET's filtering capacity has matured.
-
China's AS-level topology is shallow and concentrated: CHINANET and CNCGROUP together account for 63.9% of 133 unique foreign peerings, 87% of internal ASes are within one hop of a border AS, and just 24 border/backbone ASes serve as effective choke points for all international traffic. The TTL of GFW RST packets is now crafted to prevent IDS localization by TTL inspection, requiring TTL-incrementing probe packets to identify filtering device positions.
-
The GFW is fully stateful as of 2010: probing all 11,824 Chinese IP prefixes with single TCP packets containing the keyword 'falun' produced no RST responses, confirming that a complete TCP handshake must precede any filtering trigger. Earlier measurements (2006, 2007) reported contradictory results; this study finds statefulness is now universal across all probed prefixes.
-
14 of 495 filtering interfaces (2.9%) are located in non-border internal ASes, all but two belonging to CHINANET provincial subsidiaries. The paper notes that CHINANET's provincial filtering architecture creates infrastructure capable of inspecting inter-provincial domestic traffic, even though there is no current evidence it is being used for that purpose.
-
Collage's threat model identifies the censor's two most dangerous capabilities as: (1) aggregate traffic-flow analysis (e.g., NetFlow statistics) to detect anomalous access patterns to specific content hosts, and (2) joining the system as a sender or receiver to discover content locations and mount denial-of-service or deniability attacks. The censor is assumed to monitor all egress traffic but is modeled as computationally limited against joint statistical distributions across arbitrary user pairs.
-
The paper demonstrates that no single steganographic algorithm can provide both availability and deniability, since almost all production algorithms have been broken and steganography alone does not hide the identities of communicating parties. Collage addresses this by treating the embedding algorithm as a swappable component in a layered architecture—vector layer, message layer, application layer—so that compromise of the embedding scheme does not compromise the system, and stronger algorithms (e.g., digital watermarking) can be substituted as they mature.
-
Collage leverages platform-scale user-generated content—Flickr's 3.6 billion images with 6 million new per day and Twitter's ~500K tweets/day as of 2009—as a covert channel substrate. Because the censor cannot block all UGC platforms simultaneously without removing massive amounts of legitimate content, the system achieves availability and user deniability that fixed-infrastructure proxies (e.g., Tor relays) cannot: accessing Flickr or Twitter does not implicate the user as a circumvention tool operator.
-
Pseudonymity uses persistent identifiers other than real names, enabling accountability while providing partial unlinkability; however, use of the same pseudonym across different contexts enables linkability: the attacker can link all data related to a pseudonym. Unlinkability of two messages requires that the attacker cannot sufficiently distinguish whether they share a sender or recipient; for a scenario with n senders, this holds iff the probability of common authorship is sufficiently close to 1/n.
-
Because Skype relies on a central login server, a censor can technically block it, but the paper observes that blocking widely deployed services such as Skype or Google inflicts real economic harm, and that cost acts as a credible deterrent against blocking. Additionally, Skype's proprietary, closed-source protocol and P2P architecture make it harder to characterize and selectively filter than open protocols.
-
The hybrid two-stage design's architectural vulnerability is that circumventing either stage independently defeats the system: end-users can tunnel via Tor or JAP to bypass both stages entirely, while content providers can serve different content to IWF crawlers versus real users, exploiting the fact that only 33% of IWF hotline reports were substantiated as potentially illegal. The system's precision is entirely contingent on content-provider cooperation, which cannot be assumed.
-
In measurements conducted over 10 days in early February 2006, the GFW scanned approximately two-thirds of packets from a 256-address block per hourly probe, with address selection following a structured (non-random) pattern consistent with simple modular assignment to a limited pool of IDS devices. After several days, the inspected fraction rose to nearly all addresses, suggesting a configuration change to expand capacity.
-
The GFW's keyword-blocking mechanism relies entirely on endpoints honoring injected TCP RST packets; because the IDS operates out-of-band and cannot remove packets already queued in the router's transmission path, configuring both endpoints to silently discard incoming RSTs (e.g., via `iptables -A INPUT -p tcp --tcp-flags RST RST -j DROP`) allows blocked content to transfer unimpeded. In a controlled experiment, 28 injected RSTs were ignored and the complete blocked web page was successfully retrieved.
-
The GFW performs no stateful TCP stream reassembly, inspecting one packet at a time: splitting the blocked keyword '?falun' across two TCP segments is sufficient to evade detection entirely. Cross-device state is also absent — triggering a block on one border AS (e.g., AS9929) had no effect on traffic transiting a different Chinese border AS.
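The difference between per-packet inspection and stream reassembly can be shown with a toy Python matcher. The keyword comes from the text; the request line, split point, and function names are made up for illustration.

```python
from typing import List

KEYWORD = b"?falun"


def per_packet_match(segments: List[bytes]) -> bool:
    """Inspect each TCP segment in isolation, as the text describes."""
    return any(KEYWORD in segment for segment in segments)


def reassembled_match(segments: List[bytes]) -> bool:
    """Inspect the reassembled byte stream, as the receiving server sees it."""
    return KEYWORD in b"".join(segments)


request = b"GET /search?falun HTTP/1.0\r\n\r\n"
cut = request.index(KEYWORD) + 3          # split in the middle of the keyword
split_segments = [request[:cut], request[cut:]]
```

Here `per_packet_match([request])` is true, `per_packet_match(split_segments)` is false, and `reassembled_match(split_segments)` is true: the server still receives the full request while a packet-at-a-time filter never sees the keyword whole.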
-
Tor's 2006 TLS handshake contained multiple identifying fingerprints exploitable by censors: the X.509 organizationName field was set to 'Tor', the relay nickname appeared in the commonName field, clients always presented certificates (unlike browsers), and Tor used two-certificate chains (identity cert + per-session TLS cert) while most consumer HTTPS services use a single certificate. The paper flags these as sufficient for a censor to identify Tor traffic without deep payload inspection.
-
The paper presents a systematic taxonomy of blocking criteria across ISO/OSI layers: circumstance-based (addresses including sender/receiver/kind/physical location; timing including send time, receive time, duration, frequency; data-transfer properties; services including protocols, names, addresses) and content-based (file type/MIME, statistical detection of encrypted or compressed data, pattern matching for keywords or phrases, and website fingerprinting via request-count/byte-volume signatures).
-
The protocol between blockee and volunteer forwarder is designed to be transport-layer independent from the outset, allowing substitution of plain TCP with SSL tunnels, SMTP, or steganographic channels as the censor escalates detection. The system is intentionally deployed in a weak initial form to observe how quickly and in what manner the censor adapts, then hardened iteratively based on measured censor behavior.
-
An attacker can conduct stealth port scans against a victim without revealing their own IP by exploiting a 'patsy' host whose OS uses a globally incrementing IP Identifier. The attacker sends the victim a SYN spoofed from the patsy's address and watches the patsy's IP ID: an open port makes the victim send a SYN-ACK that the patsy answers with a RST, so the ID advances by 2 rather than 1 between the attacker's probes, while a closed port makes the victim send a RST that the patsy silently ignores. Choosing a different patsy for each port makes the scan very hard to detect.
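The IP ID side channel can be modelled in a few lines of Python. This is a simulation only, with hypothetical class and function names; it assumes standard TCP behaviour in which an open port answers a SYN with a SYN-ACK (which the patsy resets) and a closed port answers with a RST (which the patsy ignores).

```python
class Patsy:
    """Host with one global IP ID counter shared by every packet it sends."""

    def __init__(self, start_id: int = 1000) -> None:
        self.ip_id = start_id

    def send(self) -> int:
        self.ip_id += 1
        return self.ip_id

    def receive(self, flags: str) -> None:
        # An unsolicited SYN-ACK is answered with a RST; a RST is dropped silently.
        if flags == "SYN-ACK":
            self.send()


class Victim:
    def __init__(self, open_ports) -> None:
        self.open_ports = set(open_ports)

    def on_spoofed_syn(self, port: int, patsy: Patsy) -> None:
        # The reply goes to the patsy because the SYN carried its address.
        patsy.receive("SYN-ACK" if port in self.open_ports else "RST")


def port_is_open(victim: Victim, patsy: Patsy, port: int) -> bool:
    before = patsy.send()                 # attacker elicits a reply, reads IP ID
    victim.on_spoofed_syn(port, patsy)    # spoofed SYN reaches the victim
    after = patsy.send()                  # attacker probes the patsy again
    return after - before == 2
```

With `Victim({22, 80})`, probing port 80 yields an ID jump of 2 and probing port 81 yields a jump of 1.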
-
The user-level norm normalizer processes a realistic 100,000-packet trace (88% TCP) at approximately 101,000 pkts/sec (397 Mb/s) with all normalizations enabled on a $1,000 AMD Athlon 1.1 GHz PC, compared to a memory-copy-only baseline of 727,270 pkts/sec; the authors conclude a kernel implementation could sustain a bidirectional 100 Mbps access link with sufficient headroom to weather high-speed small-packet flooding attacks.
-
Passive NIDS can be evaded via three fundamental classes of ambiguity: incomplete protocol analysis (none of the four commercial systems tested by Ptacek and Newsham in 1998 correctly reassembled IP fragments), divergent end-system behavior (different OS stacks resolve overlapping TCP retransmissions differently), and topology uncertainty (low-TTL packets may not reach the victim end-system, so the NIDS cannot determine which packets are delivered).
-
A traffic normalizer placed inline ('bump in the wire') can eliminate over 70 IP/TCP packet-level ambiguities before a NIDS inspects traffic — including fragment reassembly, TTL restoration, DF flag clearing, IP option removal, and cryptographic IP ID scrambling — leaving the classifier with an unambiguous byte stream and removing the degrees of freedom an attacker needs to evade detection.
-
Publius provides source anonymity once content is published but offers no connection-based anonymity at upload time. A network-layer eavesdropper between the publisher and the servers, or a server's connection log, can reveal the publisher's IP address. The paper explicitly states that Publius must be combined with a mix-network or crowd-anonymity tool (e.g., Crowds, Onion Routing) to protect publisher identity during the upload phase.
-
The paper proves that any network IDS operating without maintaining complete, OS-specific per-connection state cannot reliably reconstruct the byte stream seen by the end-system. TCP and IP reassembly ambiguities guarantee unavoidable blind spots unless the IDS performs full per-target OS emulation—a fundamental architectural limitation, not an implementation bug, that applies equally to any DPI-based censor.
-
IP-level fragment overlap attacks operate independently of TCP: the attacker crafts overlapping IP fragments whose reassembly by the IDS yields benign content while the end-system's reassembly yields the true payload. The paper demonstrates this is a separate attack surface from TCP-level evasion, exploitable below the transport layer before any TCP stream reconstruction begins.
-
Different operating systems apply different precedence rules when TCP segments overlap—some implementations use 'first data wins,' others 'last data wins.' An IDS applying a single universal reassembly policy will systematically diverge from the actual target end-system whenever overlapping segments appear, creating a predictable and repeatable evasion surface that is an inherent consequence of policy misalignment rather than a configuration flaw.
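The effect of the two precedence rules can be sketched in Python. The policy names follow the text; the segment representation (offset, data) and the example bytes are assumptions for illustration, and the sketch assumes the segments leave no gaps.

```python
from typing import List, Tuple


def reassemble(segments: List[Tuple[int, bytes]], policy: str) -> bytes:
    """Rebuild a byte stream from (offset, data) segments in arrival order.

    policy="first": bytes already received are never overwritten.
    policy="last":  later segments overwrite earlier bytes.
    """
    if policy not in ("first", "last"):
        raise ValueError("policy must be 'first' or 'last'")
    stream = {}
    for offset, data in segments:
        for i, byte in enumerate(data):
            position = offset + i
            if policy == "last" or position not in stream:
                stream[position] = byte
    return bytes(stream[pos] for pos in sorted(stream))


# The second segment overlaps one byte of the first.
overlapping = [(0, b"ATTXCK"), (3, b"A")]
```

`reassemble(overlapping, "first")` gives `b"ATTXCK"` and `reassemble(overlapping, "last")` gives `b"ATTACK"`, so an IDS using the first rule misses a signature that an end-system using the second rule receives intact.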
-
An 'evasion' attack exploits the mirror condition of insertion: the IDS drops a TCP segment that the end-system accepts, due to differences in overlap-resolution policy. The IDS reconstructs 'ATTCK' while the end-system sees 'ATTACK'; the missing segment carries the content that would trigger the signature, leaving the censor with an incomplete, non-matching view of the stream.
-
An 'insertion' attack sends TCP segments with forged TTL values low enough to expire at the IDS/censor but not at the true destination. The IDS incorporates the spurious segment into its reconstructed stream—seeing 'ATXTACK'—while the end-system assembles the intended byte stream 'ATTACK,' causing signature-based content matching to fail without disrupting delivery.
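A simplified Python model of the TTL-based insertion described above. Sequence numbers are abstracted away (each observer simply concatenates the packets that reach it, in order), a packet is assumed to reach an observer when its TTL is at least the observer's hop distance, and the hop counts and names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    data: bytes
    ttl: int  # TTL set by the sender


def stream_seen_at(packets: List[Packet], hops_away: int) -> bytes:
    """Bytes observed by a device `hops_away` hops from the sender."""
    return b"".join(p.data for p in packets if p.ttl >= hops_away)


IDS_HOPS = 3      # assumed position of the monitor
SERVER_HOPS = 8   # assumed position of the real destination

packets = [
    Packet(b"A", 64), Packet(b"T", 64),
    Packet(b"X", 4),   # decoy: survives to the IDS, expires before the server
    Packet(b"T", 64), Packet(b"A", 64), Packet(b"C", 64), Packet(b"K", 64),
]

ids_view = stream_seen_at(packets, IDS_HOPS)        # b"ATXTACK"
server_view = stream_seen_at(packets, SERVER_HOPS)  # b"ATTACK"
```

A signature for `b"ATTACK"` matches the server's view but not the IDS's view, which is the divergence the attack depends on.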