2025-walsh-improved-open-world-fingerprinting
findings extracted from this paper
-
Combinations of Bayesian methods, data augmentation with mixup, and NOTA defensive padding cut the open-world false positive rate (FPR) at 0.5 recall by up to 92% on HTTPS-only traffic and by up to 75% on Tor traffic, relative to the deterministic MSP baseline. Even with these improvements, sustaining a world size in the hundreds of millions requires accepting recall of 0.5–0.6 and precision of only 0.1–0.2. At precision 0.5 and recall 0.5, the maximum workable world size for HTTPS-only traffic is 37.5M videos (Table 3), far below YouTube's catalog of roughly 10 billion videos.
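The MSP baseline is not defined in these notes. A minimal sketch follows, assuming MSP means the usual maximum-softmax-probability rule: a trace is rejected as unmonitored when the classifier's top softmax probability falls below a threshold. The function names and the threshold value are hypothetical and not taken from the paper.

```python
import math

def msp_score(logits):
    """Maximum softmax probability for one trace's class logits."""
    m = max(logits)                            # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)

def classify_open_world(logits, threshold):
    """Return the predicted monitored class index, or None for 'unmonitored'.

    Assumption: a single global threshold on the MSP score separates
    monitored from unmonitored traces.
    """
    if msp_score(logits) < threshold:
        return None
    return max(range(len(logits)), key=lambda i: logits[i])

# A confident prediction is accepted, a flat one is rejected.
print(classify_open_world([5.0, 0.0, 0.0], threshold=0.9))
print(classify_open_world([0.1, 0.0, 0.0], threshold=0.9))
```

Raising the threshold trades recall for a lower FPR, which is the trade-off the recall and precision figures above describe.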
-
Extrapolating the empirical FPRs with Wang's base-rate-adjusted precision (𝜋_r), the best HTTPS-only approach sustains precision 0.5 at recall 0.5 only up to a world size of 37.5M videos. Relaxing precision to 0.1 at the same recall extends this to 337.5M, still well short of YouTube's ~10 billion catalog (Table 3). For Tor, the corresponding limits are 4.8M and 42.9M, making dragnet surveillance of unselected users on large platforms effectively infeasible at any acceptable precision with current techniques.
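The extrapolation can be sketched as follows. This is a reconstruction under two labelled assumptions, not the paper's code. First, 𝜋_r is taken in the simplified form TPR / (TPR + r · FPR), ignoring wrong-class positives. Second, world size is taken as r times 60 monitored videos, matching the 60-way setting mentioned in these notes. With the best-case FPRs at 0.5 recall quoted in these notes (0.0000008 for HTTPS-only, 0.0000063 for Tor), this reproduces the stated limits.

```python
def r_precision(tpr, fpr, r):
    """Simplified base-rate-adjusted precision: TPR / (TPR + r * FPR).

    r is the assumed ratio of unmonitored to monitored items.
    """
    return tpr / (tpr + r * fpr)

def max_world_size(fpr, recall, precision, n_monitored=60):
    """Largest world size that still meets the target precision.

    Solves precision = recall / (recall + r * fpr) for r, then scales by
    the number of monitored videos (assumed 60 here).
    """
    r = recall * (1.0 - precision) / (precision * fpr)
    return n_monitored * r

for name, fpr in [("HTTPS-only", 8e-7), ("Tor", 6.3e-6)]:
    for precision in (0.5, 0.1):
        size = max_world_size(fpr, recall=0.5, precision=precision)
        print(f"{name}: precision {precision} -> {size / 1e6:.1f}M videos")
```

Run as written, the loop prints 37.5M and 337.5M for HTTPS-only and 4.8M and 42.9M for Tor, which agrees with the figures above and suggests the two assumptions are at least consistent with the tables.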
-
When a fingerprinting model is trained on traffic collected from one geographic vantage point and tested on traffic from a different continent, the HTTPS-only open-world FPR at 0.5 recall increased by factors ranging from 2.8x (EU-West-2) to 50.3x (Africa) relative to the same-vantage baseline. This degradation occurred even though 60-way closed-world accuracy remained above 0.99 across all vantage-point pairs (Table 5). For Tor traffic the effect was weaker but still reached 25.2x (Asia-Pacific Southeast-1), showing that path diversity also disrupts Tor-based fingerprinting.
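To make the metric concrete, here is a minimal sketch of how an "FPR at 0.5 recall" and the resulting cross-vantage factor could be computed from per-trace confidence scores. It makes a simplifying assumption that any monitored trace scoring at or above the threshold counts as a true positive. The function name and the toy scores are hypothetical and are not data from the paper.

```python
import math

def fpr_at_recall(monitored_scores, unmonitored_scores, target_recall=0.5):
    """FPR at the highest threshold whose recall reaches target_recall.

    Simplification: a monitored trace at or above the threshold is counted
    as a true positive, regardless of which class was predicted.
    """
    ranked = sorted(monitored_scores, reverse=True)
    k = max(1, math.ceil(target_recall * len(ranked)))
    threshold = ranked[k - 1]                  # k-th highest monitored score
    false_positives = sum(1 for s in unmonitored_scores if s >= threshold)
    return false_positives / len(unmonitored_scores)

# Toy scores only: same-vantage versus cross-vantage unmonitored traces.
monitored = [0.9, 0.8, 0.7, 0.6]
same_vantage = fpr_at_recall(monitored, [0.95, 0.5, 0.4, 0.3])
cross_vantage = fpr_at_recall(monitored, [0.95, 0.85, 0.4, 0.3])
print(f"FPR increase factor: {cross_vantage / same_vantage:.1f}x")
```

Because closed-world accuracy only measures which monitored class is chosen, it can stay high while the score distribution of unmonitored traces shifts, which is the pattern described above.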
-
The paper establishes, for the first time in a large open-world scenario (64,000 unmonitored test videos), that HTTPS-only video stream fingerprinting is significantly easier than Tor-based fingerprinting. It also identifies a second-order network-condition effect introduced by DASH adaptive bitrate selection: clients request entirely different video segments at different quality levels depending on path conditions, so traffic traces from different geographic vantage points diverge at the application layer even when network conditions are nominally similar. NOTA and synthetic training samples are less effective on Tor data, which is attributed to the inherent noisiness of Tor traces.
-
Tor provides substantial and measurable protection against video stream fingerprinting: the best-case FPR at 0.5 recall is 0.0000063 for Tor versus 0.0000008 for HTTPS-only connections, roughly an 8x increase. Translating to world sizes, at 0.5 recall and 0.1 precision the maximum viable platform catalog is 42.9M videos over Tor versus 337.5M over HTTPS-only (Tables 3–4), confirming Tor degrades adversary capability even after an assumed prior website-fingerprinting step that identifies video platform visits.