2015-holowczak-cachebrowser
findings extracted from this paper
-
Of GFW-blocked websites in the Alexa top 1000, 82% are already hosted on CDN infrastructure; for news websites specifically, the figure rises to 85%. This was measured by scraping GreatFire.org blocked-site data and verifying CDN hosting for each domain.
-
Akamai's China-based edge servers self-censor, returning HTTP 403 for GFW-forbidden content, while Akamai's mapping system (located outside China) returns valid edge server IPs to Chinese users even for forbidden domains, and non-Chinese Akamai edge servers serve all content freely. This partial self-censorship structure is driven by the requirement to operate CDN infrastructure inside China.
-
CacheBrowser bypasses GFW DNS poisoning by directly fetching CDN content from known edge server IPs, using a low-bandwidth out-of-band bootstrapper to seed its edge-server database. The SWEET email-based bootstrapper achieves median 5.4-second resolution latency with 95% of queries answered within 10 seconds across 100 runs—acceptable because CDN provider migrations occur only every few months.
-
CacheBrowser achieves significantly lower download latency than Tor when fetching CDN-hosted content from China, because content is retrieved directly from CDN edge servers without traversing third-party proxies. Fetching from non-default alternative CDN edge servers increases latency relative to the CDN-mapped optimum, but the overhead is not prohibitive for real-world browsing; geographically proximate alternative servers minimize the penalty.
-
The GFW universally uses DNS poisoning rather than IP blocking to censor CDN-hosted content. Across all tested CDN providers (Akamai, CloudFlare, CloudFront, EdgeCast, Fastly, SoftLayer), no CDN edge server IPs were IP-filtered, because a single provider like Akamai hosts content on 170,000 shared edge servers—blocking any IP would collaterally block hundreds of thousands of unrelated publishers.