2024-hoang-gfweb
findings extracted from this paper
-
GFWeb discovered that the GFW's bidirectional blocking is not symmetric: certain domains trigger blocking only when probed from inside China, not from outside. This overturns the prior assumption that the GFW blocks the same domains symmetrically in both directions. The paper also documents that the GFW has been upgraded to fix previously-reported evasion techniques, including overblocking mitigation and improved fragmented-packet reassembly, indicating active engineering iteration on the censor side.
-
GFWeb tested 1.02 billion domains against the GFW over 20 months and discovered 943,000 pay-level domains blocked by HTTP filters and 55,000 by HTTPS filters — the largest GFW blocklist dataset ever published. The HTTP-to-HTTPS ratio (17:1) confirms that the GFW's HTTPS keyword-based and SNI-based blocking covers far fewer domains than its HTTP host-header blocking, likely because HTTPS blocks carry higher collateral-damage risk.
-
Longitudinal GFWeb data spanning 20 months shows the GFW actively patched previously-published evasion findings during the measurement period: overblocking bugs reported in academic papers were fixed, and fragmented-packet reassembly failures that researchers used to bypass blocking were corrected. This demonstrates that the GFW operator monitors published research and iterates on the system in response to disclosed vulnerabilities.