2022-harrity-get

GET /out: Automated Discovery of Application-Layer Censorship Evasion Strategies

Michael Harrity, Kevin Bock, Frederick Sell, Dave Levin · USENIX Security Symposium · 2022

Tags

censors: generic
techniques: dpi keyword-filtering
defenses: geneva

findings extracted from this paper

Extending Geneva's genetic algorithm to the application layer automatically discovered 77 unique HTTP evasion strategies and 9 DNS evasion strategies against censors in China, India, and Kazakhstan. All of them require only unprivileged user-mode modifications, with no access to TCP/IP headers. Of the 77 HTTP strategies, 56 succeeded against India's Airtel censor, 29 against Kazakhstan's censor, 22 against China's keyword-based HTTP censorship, and 27 against China's Host-header censorship.

§5.1, §6 evaluation dpi keyword-filtering rst-injection dns-poisoning cn in kz
China's Great Firewall runs three independent DNS censorship injectors in parallel. Setting the DNS qdcount field to 2 while including only one question (a violation of RFC 1035) evades all three injectors simultaneously, with a 100% success rate across 1,000 trials. However, among the eight open resolvers tested, only Cloudflare (1.1.1.1) answers such queries. Pairing DNS compression with an elevated qdcount also evades all three injectors 100% of the time, but only Cloudflare and Google (8.8.8.8) support it.

§6, Table 3 evaluation dns-poisoning cn
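
The qdcount finding above can be illustrated with a minimal sketch that builds a DNS query whose header advertises two questions while the message carries only one. This is not the paper's tooling: the helper names `encode_qname` and `build_query`, the transaction ID, and the choice of an A/IN question are assumptions made for illustration.

```python
import struct


def encode_qname(name: str) -> bytes:
    """Encode a domain name as length-prefixed DNS labels (RFC 1035, 3.1)."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"  # zero-length root label terminates the name


def build_query(name: str, qdcount: int = 1, txid: int = 0x1234) -> bytes:
    """Build a DNS query with exactly ONE question but an arbitrary QDCOUNT.

    Hypothetical helper: txid and the A/IN question are illustrative choices.
    """
    flags = 0x0100  # standard query with the recursion-desired bit set
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", txid, flags, qdcount, 0, 0, 0)
    question = encode_qname(name) + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


# The header claims two questions; only one actually follows it.
packet = build_query("example.com", qdcount=2)
```

To try it, the bytes could be sent over UDP port 53 to a resolver that tolerates the mismatch (per the finding above, Cloudflare's 1.1.1.1), for example with Python's `socket` module.
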
China's GFW keyword-based and Host-header HTTP censorship can be defeated simultaneously by a 'sandwich' strategy with four conditions: a header with a name of at least 64 bytes must appear before the Host header; the Host header value must start at least 1,281 bytes from the start of the headers; the final header must be at least 129 bytes in total; and the Host header must be neither first nor last. A header name of 64 or more bytes is sufficient on its own to defeat Host-header censorship, because it prevents the GFW from reading any further headers.

§5.2 defense dpi keyword-filtering cn
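
The four sandwich conditions above can be sketched as a request builder. This is a rough illustration under stated assumptions, not the strategy as Geneva emits it: the header names `X-aaa...` and `X-Pad-Last` are hypothetical, the 1,281-byte offset is measured from the first byte after the request line, and a header's total length is counted as name, colon, space, and value without the trailing CRLF. Whether a given server accepts such a request is a separate question.

```python
LONG_NAME_MIN_LEN = 64        # header name that must precede Host
HOST_VALUE_MIN_OFFSET = 1281  # where the Host value must start (assumed origin:
                              # first byte after the request line)
LAST_HEADER_MIN_LEN = 129     # total size of the final header (assumed: no CRLF)


def build_sandwich_request(host: str, path: str = "/") -> bytes:
    """Build a GET request with Host sandwiched between two padded headers."""
    request_line = f"GET {path} HTTP/1.1\r\n".encode("ascii")

    # First header: a 64-byte name, with a value sized so that the Host
    # value begins exactly at HOST_VALUE_MIN_OFFSET.
    long_name = b"X-" + b"a" * (LONG_NAME_MIN_LEN - 2)
    host_prefix = b"Host: "
    fixed = len(long_name) + len(b": ") + len(b"\r\n") + len(host_prefix)
    filler = b"b" * max(0, HOST_VALUE_MIN_OFFSET - fixed)
    first_line = long_name + b": " + filler + b"\r\n"

    # Middle header: Host, neither first nor last.
    host_line = host_prefix + host.encode("ascii") + b"\r\n"

    # Final header: padded to 129 bytes in total.
    last_name = b"X-Pad-Last"  # hypothetical name
    last_value = b"c" * (LAST_HEADER_MIN_LEN - len(last_name) - len(b": "))
    last_line = last_name + b": " + last_value + b"\r\n"

    return request_line + first_line + host_line + last_line + b"\r\n"


request = build_sandwich_request("example.com")
```
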
India's Airtel HTTP censor fails to reassemble TCP segments. Padding any HTTP request to at least 1,449 bytes means that the IP+TCP overhead (52 bytes) pushes the total to at least 1,501 bytes, past the Ethernet MTU of 1,500 bytes; this forces segmentation that the censor cannot handle and achieves 100% evasion. Evading Kazakhstan's censor is more demanding: the segment boundary must fall precisely between the Host header name and value (with two trailing spaces), rather than anywhere in the request.

§5.2 evaluation dpi keyword-filtering middlebox-interference in kz
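
The padding arithmetic above can be made concrete with a small sketch. The 1,500-byte MTU and 52-byte overhead come from the finding; the `pad_request` helper and the `X-Padding` filler header are hypothetical, and the source does not say how the padding should be added.

```python
ETHERNET_MTU = 1500
IP_TCP_OVERHEAD = 52  # IP + TCP header bytes, as stated in the finding
MIN_REQUEST_LEN = ETHERNET_MTU - IP_TCP_OVERHEAD + 1  # smallest size that overflows


def pad_request(request: bytes, target: int = MIN_REQUEST_LEN) -> bytes:
    """Append a filler header so the request is at least `target` bytes long.

    Hypothetical helper: 'X-Padding' is an illustrative header name.
    """
    if len(request) >= target:
        return request
    head, sep, body = request.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("request has no header terminator")
    prefix = b"\r\nX-Padding: "
    # Always add at least one filler byte so the header has a value.
    fill = max(1, target - len(request) - len(prefix))
    return head + prefix + b"x" * fill + sep + body


request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
padded = pad_request(request)
# With the overhead added, the padded request no longer fits in one frame.
needs_segmentation = len(padded) + IP_TCP_OVERHEAD > ETHERNET_MTU
```
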
A central finding of the paper is that RFC-compliance in the censor creates evasion opportunities: the more faithfully a censor parses HTTP/DNS per the RFC, the more RFC-permitted variants it will pass that servers also accept, yielding more viable evasion strategies. In contrast, India's Airtel censor was the most brittle (56/77 strategies bypassed it) precisely because it failed on many legitimate RFC variants; China's more sophisticated parser left fewer openings.

§5.1, §7 detection dpi keyword-filtering cn in kz