2025-h-ller-evaluating
findings extracted from this paper
-
Of the six onion search engines compared, Ahmia provided the most onion addresses: 18,069 on a single day (18,000–22,000 from week to week), more than the five other sources combined (36,028 in total across all six engines). However, Ahmia deliberately excludes blacklisted services, and its blacklist contains more than 46,000 hashed addresses. Crawling onion services for 20 days yielded 48,745 unique v3 addresses, of which 11,809 (about 24%) were on Ahmia's blacklist, so any index-based collection systematically misses a significant share of the onion ecosystem by design.
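The blacklist comparison can be sketched as follows. This is a minimal illustration, not the authors' code: the notes only say the blacklist holds hashed addresses, so the hash scheme (MD5 over the lowercase hostname including ".onion") is an assumption, and `md5_hex` and `blacklist_overlap` are hypothetical names.

```python
import hashlib


def md5_hex(hostname: str) -> str:
    # ASSUMPTION: blacklist entries are MD5 hex digests of the lowercase
    # hostname including ".onion". Swap this function if the scheme differs.
    return hashlib.md5(hostname.strip().lower().encode("ascii")).hexdigest()


def blacklist_overlap(crawled, blacklist_hashes, hash_fn=md5_hex):
    """Return (blacklisted addresses, their share of the unique crawled set)."""
    unique = {addr.strip().lower() for addr in crawled}
    hits = {addr for addr in unique if hash_fn(addr) in blacklist_hashes}
    share = len(hits) / len(unique) if unique else 0.0
    return hits, share


# Toy data with placeholder hostnames (not real onion services).
crawled = ["aaaa.onion", "bbbb.onion", "BBBB.onion", "cccc.onion", "dddd.onion"]
blacklist = {md5_hex("bbbb.onion")}
hits, share = blacklist_overlap(crawled, blacklist)

# The share reported in the finding above, recomputed from its two counts.
reported_share = round(11_809 / 48_745, 3)
```

Because only hashes are published, the comparison has to hash each crawled address and test set membership; the blacklist itself cannot be turned back into addresses.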
-
By combining six onion search engines/repositories with clearnet search engines, Tor2web-style DNS leakage, and 20 days of self-run crawling (2.9 million pages), the authors assembled 482,614 unique v3 onion addresses, the largest known collection. Checking the collection against blinded public keys observed at HSDirs showed that the collected addresses accounted for 25% of the observed blinded keys but 66% of all successful service descriptor downloads, confirming a heavy-tailed usage distribution.
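Merging addresses from several sources requires validating and deduplicating them first. The sketch below is an illustration rather than the authors' pipeline: it decodes a v3 address into its 32-byte public key and checks the version byte and checksum as defined in the Tor rendezvous specification (v3). The function names `encode_v3`, `v3_pubkey`, and `unique_v3` are hypothetical, and the blinded-key derivation used for HSDir matching is not covered here.

```python
import base64
import hashlib

VERSION = b"\x03"


def _checksum(pubkey: bytes) -> bytes:
    # CHECKSUM = SHA3-256(".onion checksum" | PUBKEY | VERSION)[:2]
    return hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]


def encode_v3(pubkey: bytes) -> str:
    """Build the canonical address: base32(PUBKEY | CHECKSUM | VERSION) + ".onion"."""
    raw = pubkey + _checksum(pubkey) + VERSION
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def v3_pubkey(address: str):
    """Return the 32-byte public key of a v3 onion address, or None if invalid."""
    host = address.strip().lower()
    if not host.endswith(".onion"):
        return None
    label = host[: -len(".onion")].split(".")[-1]  # ignore any subdomain labels
    if len(label) != 56:
        return None
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    if version != VERSION or checksum != _checksum(pubkey):
        return None
    return pubkey


def unique_v3(addresses):
    """Deduplicate by public key and return canonical v3 addresses."""
    keys = {key for key in map(v3_pubkey, addresses) if key is not None}
    return {encode_v3(key) for key in keys}


# Placeholder key for illustration only; the checksum test does not verify
# that the bytes are a valid ed25519 point.
demo_key = bytes(range(32))
demo_addr = encode_v3(demo_key)
merged = unique_v3([demo_addr, "WWW." + demo_addr.upper(), "not-an-onion.example"])
```

Deduplicating on the decoded key rather than the raw string collapses case variants and subdomain prefixes of the same service into one entry.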