2012-vasserman-one-way
findings extracted from this paper
-
Content-oblivious replication delegates ongoing availability maintenance to 'manifest guarantors' — nodes holding content manifests — who periodically sample chunk replication factors and restore missing replicas without knowing the plaintext they protect, freeing the original publisher from any post-publication obligation. Two honest manifest holders (one content, one key) are sufficient to maintain replication with overwhelming probability even under adversarial conditions and high churn.
-
Simulation over erasure code parameters uniformly sampled from m∈[1,5] and n∈[5,500] shows that a 50-of-500 code is the best trade-off between overhead and robustness: it requires nearly 10× storage overhead to support 2^60 variable-size chunks and allows the network to tolerate more than 70% node failure before data is lost. Replication combined with erasure coding yields better durability than either strategy alone.
-
A hybrid garbage-collection scheme combining time-based expiry (last-access timestamp cutoff), popularity-based retention, and editor-signed manifest exemptions forces adversaries conducting pollution or exhaustion attacks to continuously re-access or re-upload junk to prevent its deletion. A single honest editor's signature is sufficient to exempt important but infrequently accessed content from deletion indefinitely, while malicious editors cannot explicitly remove content from the system.
-
One-way indexing separates a published file into encrypted content blocks (indexed by hash1(block)), a content manifest (indexed by hash2(keyword)), and a key manifest (indexed by hash3(keyword)), so a storer holding all content chunks cannot recover the plaintext or keywords without inverting a cryptographic one-way function. Using distinct hash functions for each manifest type also minimizes the probability that a single node stores both manifests, preventing correlation.
-
In a 250-node PlanetLab deployment with 10–15% silent node failures and high churn, the median user retrieved a 20MB file in 65–85 seconds end-to-end (search + manifest download + chunk fetch + reconstruction + decryption). 15.12% of DHT lookups and 11.24% of maintenance operations failed; 20% of nodes accounted for 80% of failures, yet nodes with working connections completed lookups and maintained sufficient guarantors for manifest replication.