2015-knockel-every
findings extracted from this paper
-
9158 version 6.9, in addition to applying its explicit keyword filter, replaces every English letter with an asterisk in any chat message that contains a run of six or more consecutive English letters. Combined with the explicit keywords 'http', 'www', and 'com' on its filter list, this amounts to a blanket URL-suppression mechanism that also incidentally blocks arbitrary English-language communication.
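
The masking rule described above can be sketched in a few lines. This is a hypothetical re-creation for illustration, not 9158's actual code: the function name `mask_message` is invented, and it assumes "English letters" means ASCII A-Z and a-z.

```python
import re

# Assumption: "English letters" are ASCII A-Z / a-z; the threshold of six
# comes from the finding above. All names here are hypothetical.
LETTER_RUN = re.compile(r"[A-Za-z]{6,}")
LETTER = re.compile(r"[A-Za-z]")


def mask_message(message: str) -> str:
    """Mask every English letter if the message has a run of 6+ letters."""
    if LETTER_RUN.search(message):
        # One long run is enough to mask ALL letters, not just that run.
        return LETTER.sub("*", message)
    return message


print(mask_message("hello world"))         # no run of six letters: unchanged
print(mask_message("see example.org/a1"))  # "example" triggers full masking
```

Under these assumptions, a single word such as "please" is enough to blank out every letter in the message, which is why the rule affects ordinary English chat and not only URLs.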
-
Reverse engineering of four Chinese social video platforms (YY, 9158, Sina Show, GuaGua) yielded 42 keyword lists totaling 17,547 unique keywords. Clustering the lists by Jaccard similarity shows very little overlap between lists from different companies, consistent with prior work that found only 3% overlap in unique keywords across TOM-Skype and Sina UC (4,256-keyword dataset). This is the largest unbiased cross-platform dataset supporting the view that Chinese platform censorship is decentralized rather than governed by a monolithic ruleset.
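
For reference, Jaccard similarity between two keyword lists is the size of their intersection divided by the size of their union. The sketch below uses invented toy lists (the names `list_a`, `list_b`, `list_c` and their contents are placeholders, not data from the paper) and shows only the pairwise measure, not the clustering step.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy keyword lists, purely illustrative.
lists = {
    "list_a": {"k1", "k2", "k3"},
    "list_b": {"k2", "k3", "k4"},
    "list_c": {"k9"},
}

# Pairwise similarities; a clustering step would consume these values.
for (name_x, x), (name_y, y) in combinations(lists.items(), 2):
    print(name_x, name_y, round(jaccard(x, y), 3))
```

A value near 0 for two lists from different companies is what "very little overlap" means in this measure; a value of 1 would mean the lists are identical.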
-
Between February and May 2015, YY High received 21 updates and 9158 Chat received 8. Updates tracked current events: Zhou Yongkang's name was added to 9158 Chat on May 6, roughly a month after his April 3 corruption indictment; YY Normal added and then removed Chinese Christian song titles between April 23 and April 30, during a church demolition controversy. GuaGua does not download keyword updates at all.
-
Keyword lists from all four social video platforms (SVPs) explicitly target both government criticism and collective action, contradicting King et al.'s claim that criticism is tolerated while collective action is suppressed. All four platforms censor references to Falun Gong and to current CPC leaders (including phonetic homonyms such as '习尽平'); over 90% of YY's event-related keywords (2,535 total) reference the June 4, 1989 Tiananmen Square Massacre, and derogatory phrases such as '共匪' (Communist gangsters) appear alongside keywords for collective action events.
-
YY version 7.1 silently exfiltrates the full text of any message that triggers a keyword, sending it via HTTP GET to sere.hiido.com along with the sender's user ID, the recipient's user ID, and the triggering keyword. The request to the surveillance endpoint is authenticated with md5(⌊unix_epoch/1000⌋ + ";username=report;password=pswd@1234"), built from hardcoded credentials, which makes the surveillance traffic structurally distinguishable from normal YY traffic.
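
The token formula above can be reproduced as a minimal sketch. Several details are assumptions not stated in the finding: that the floored value is rendered as a decimal string before concatenation, that the string is UTF-8 encoded, that the digest is sent as lowercase hex, and the units of `unix_epoch`. The function name `report_token` is hypothetical.

```python
import hashlib
import math

# Hardcoded credential suffix, copied from the formula in the finding above.
SUFFIX = ";username=report;password=pswd@1234"


def report_token(unix_epoch: float) -> str:
    """Sketch of md5(floor(unix_epoch / 1000) + SUFFIX).

    Assumptions: decimal rendering of the floored value, UTF-8 encoding,
    and a lowercase hex digest.
    """
    bucket = math.floor(unix_epoch / 1000)
    return hashlib.md5(f"{bucket}{SUFFIX}".encode("utf-8")).hexdigest()


# Timestamps that floor to the same bucket yield the same token.
print(report_token(1430000000) == report_token(1430000999))
```

Because the only varying input is the time bucket and the credentials are fixed in the client, anyone who knows the formula can recompute the token for a given time, which is part of what makes the traffic recognizable.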