2013-verkamp-five
findings extracted from this paper
-
In four of five incidents (all except Syria), spam accounts were registered in temporally clustered blocks while legitimate accounts were not; in Russia and Mexico, multiple distinct registration bursts were observed. Across all five incidents, spam account usernames were automatically generated, with China'12 and Mexico accounts following a {name}{name}{number} pattern padded to exactly 15 characters (Twitter's maximum), making algorithmic reverse-engineering feasible.
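
The two signals above can be sketched as simple checks. This is a minimal illustration, not the paper's method: it assumes the {name}{name}{number} pattern reduces to a run of letters followed by a run of digits, and the function names, the `min_accounts` threshold, and the sample data are all hypothetical.

```python
import re
from collections import Counter
from datetime import date, datetime

MAX_LEN = 15  # Twitter's username length limit, per the finding above
# Assumption: the name parts are alphabetic and the number is a trailing digit run.
PATTERN = re.compile(r"[A-Za-z]+[0-9]+")


def looks_generated(username):
    """True if the username is exactly 15 characters of letters then digits."""
    return len(username) == MAX_LEN and PATTERN.fullmatch(username) is not None


def registration_bursts(created_at, min_accounts=3):
    """Return the dates on which at least min_accounts accounts were registered."""
    per_day = Counter(ts.date() for ts in created_at)
    return sorted(day for day, n in per_day.items() if n >= min_accounts)


# Made-up registration times: three accounts on one day, one on another.
created = [
    datetime(2011, 1, 1, 10),
    datetime(2011, 1, 1, 11),
    datetime(2011, 1, 1, 12),
    datetime(2011, 3, 5, 9),
]
bursts = registration_bursts(created)
```

A daily bucket is only one possible granularity; shorter windows would be needed to separate multiple bursts within the same day.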
-
In the Russia and Mexico incidents, spam tweets showed statistically significant spikes at fixed minutes of the hour (5 and 15 minutes past the hour, respectively), consistent with cron-job automation. Despite this automation, both campaigns deliberately mimicked human diurnal activity patterns, with spam volume peaking at the same hours as legitimate traffic, to evade time-based anomaly detection.
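
One way to look for such minute-of-hour spikes is to bucket tweets by minute and flag buckets far above the typical count. The sketch below is a simple heuristic, not the significance test used in the paper; the `factor` threshold, the function name, and the synthetic timestamps are assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import median


def minute_spikes(timestamps, factor=3.0):
    """Return minutes of the hour whose tweet count exceeds factor x the median count."""
    counts = Counter(ts.minute for ts in timestamps)
    per_minute = [counts.get(m, 0) for m in range(60)]
    baseline = max(median(per_minute), 1)  # avoid a zero threshold on sparse data
    return [m for m, n in enumerate(per_minute) if n > factor * baseline]


# Synthetic data: one tweet every minute for an hour, plus a burst of 20 at :05.
start = datetime(2012, 1, 1, 12, 0)
background = [start + timedelta(minutes=i) for i in range(60)]
burst = [start.replace(minute=5, second=s) for s in range(20)]
spikes = minute_spikes(background + burst)
```

Because the buckets ignore the hour, a campaign that follows a human diurnal curve would still stand out here as long as its tweets fire at the same minute each hour.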
-
Default-profile usage was significantly elevated among spam accounts in China'11 (89.4% spam vs 51.2% non-spam), Russia (57.8% vs 34.7%), and China'12 (95.1% vs 47.8%); however, Mexico inverted this trend with only 1.7% of spam accounts using default profiles vs 27.0% of non-spam accounts, indicating that newer campaigns actively customize profiles to evade appearance-based detection.
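
The inversion is easy to see as a ratio of the reported percentages. The snippet below only restates the figures quoted above; the ratio is an illustrative summary, not a statistic from the paper, and no significance can be derived from it since sample sizes are not given here.

```python
# Default-profile usage in percent, as reported above: (spam, non-spam).
default_profile_pct = {
    "China'11": (89.4, 51.2),
    "Russia": (57.8, 34.7),
    "China'12": (95.1, 47.8),
    "Mexico": (1.7, 27.0),
}


def spam_to_nonspam_ratio(spam_pct, nonspam_pct):
    """Ratio above 1 means default profiles are more common among spam accounts."""
    return spam_pct / nonspam_pct


ratios = {name: spam_to_nonspam_ratio(*pcts) for name, pcts in default_profile_pct.items()}
for incident, ratio in ratios.items():
    direction = "elevated among spam" if ratio > 1 else "lower among spam"
    print(f"{incident}: ratio {ratio:.2f} ({direction})")
```

A detector that treats a default profile as a spam signal would therefore point in the wrong direction for the Mexico incident.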
-
Of the five political spam incidents studied, three were dominated by spam: it made up 62–73% of all tweets in the Russia, China'12, and Mexico incidents, compared with only 6% in Syria. In the China'12 incident, 1,700 spam accounts (14% of all accounts) generated 600,000 spam tweets (73% of total), with 10 individual accounts each producing over 5,000 tweets before shutdown; in Mexico, 50 accounts sustained 1,000 spam tweets per day throughout the incident.
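
The China'12 figures imply a large gap in per-account volume. The arithmetic below is derived only from the rounded numbers quoted above, so the totals are approximations rather than values reported by the paper; the variable names are illustrative.

```python
# China'12 figures as quoted above (rounded in the source).
spam_accounts = 1_700
spam_tweets = 600_000
spam_account_share = 0.14  # spam accounts as a fraction of all accounts
spam_tweet_share = 0.73    # spam tweets as a fraction of all tweets

# Implied totals, back-calculated from the shares.
total_accounts = spam_accounts / spam_account_share
total_tweets = spam_tweets / spam_tweet_share

tweets_per_spam_account = spam_tweets / spam_accounts
tweets_per_other_account = (total_tweets - spam_tweets) / (total_accounts - spam_accounts)

print(f"per spam account:  ~{tweets_per_spam_account:.0f}")
print(f"per other account: ~{tweets_per_other_account:.0f}")
```

On these rounded inputs, a spam account posted on the order of 350 tweets against roughly 20 for a non-spam account.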
-
Twitter's existing automated spam-filtering mechanisms caught only approximately 50% of politically motivated spam in the Russian parliamentary election incident, as reported by Thomas et al. (2012) and noted as the baseline for this study. Spammer behavior varied sufficiently across incidents (targeting strategy, URL usage, mention patterns, default-profile adoption) that supervised machine-learning classifiers trained on one incident are unlikely to generalize to others.
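
The generalization concern can be tested with a leave-one-incident-out split: train on all incidents but one, then evaluate on the one held out. This is a sketch of the evaluation setup only, not something the paper is stated to have run; the record layout, field names, and sample rows are hypothetical placeholders.

```python
def leave_one_incident_out(samples):
    """Yield (held_out, train, test) splits, holding out one incident at a time."""
    incidents = sorted({s["incident"] for s in samples})
    for held_out in incidents:
        train = [s for s in samples if s["incident"] != held_out]
        test = [s for s in samples if s["incident"] == held_out]
        yield held_out, train, test


# Placeholder rows; real samples would carry per-account or per-tweet features.
samples = [
    {"incident": "Russia", "is_spam": True},
    {"incident": "Mexico", "is_spam": True},
    {"incident": "Syria", "is_spam": False},
]
splits = list(leave_one_incident_out(samples))
```

A classifier that scores well under a random split but poorly under this split would be relying on incident-specific behavior.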