2026-zohaib-extended

Extended Abstract: CensorAlert -- Leveraging LLM Agents for Automated Censorship Report Aggregation and Analysis

Ali Zohaib, Jade Sheffey, Mingshi Wu, Amir Houmansadr · Free and Open Communications on the Internet · 2026

canonical link →

Tags

censors: generic
techniques: dpi measurement-platform

findings extracted from this paper

CensorAlert aggregates censorship signals from heterogeneous open sources — including OONI, Cloudflare Radar, NetBlocks, GitHub Issues of circumvention tools (Hysteria, Xray), Net4People BBS, NTC Party forum, Mastodon, X, Telegram channels, and arXiv — normalizing each item into a common schema preserving timestamp, source, raw data, and provenance with a link to the original content.

§2 (Data Sources) deployment measurement-platform
Systematic measurement platforms (OONI, Censored Planet, Cloudflare Radar, NetBlocks) have inherent blind spots due to geographic coverage and protocol-specific test constraints; critical censorship discoveries — including GFW fully-encrypted protocol blocking and regional Chinese censorship — were first surfaced by user reports on forums and GitHub issue pages of circumvention tools, not by automated measurement infrastructure.

§1 evaluation measurement-platformfully-encrypted-detect cn
CensorAlert generates text embeddings (OpenAI text-embedding-3-small) from each item's summary, title, and tags to cluster semantically similar reports within configurable time windows; near-duplicates such as reposts, copied headlines, and translations are collapsed into a single canonical post that preserves all original source URLs and metadata.

§2 (AI-based Scoring) deployment measurement-platform
CensorAlert's LLM agent scores each ingested item 0–5 on five independent dimensions — credibility, novelty, impact, timeliness, and verifiability — then computes a normalized significance score (0–10) from these components; items are processed every two hours using OpenAI GPT-5 Thinking (hosted on Azure) constrained to return structured JSON output.

§2 (AI-based Scoring) deployment measurement-platform