2023-fifield-running
findings extracted from this paper
-
Localhost TCP connections between the pluggable transport, load balancer, and Tor processes exhaust the ephemeral port space because source and destination IP addresses are both 127.0.0.1, leaving only port numbers to distinguish sockets. The mitigation uses distinct addresses across the full 127.0.0.0/8 loopback range combined with a custom orport-srcaddr option that assigns random source addresses from 127.0.1.0/24, expanding available socket four-tuples by a factor of 256.
-
Operating system defaults create two additional scaling ceilings beyond CPU: (1) the default file descriptor limit is insufficient above ~64,000 simultaneous connections, requiring LimitNOFILE=1048576 (1 million) in the systemd service; and (2) Linux's conntrack default of 262,144 tracked connections was approached during peak hours for the Snowflake bridge, necessitating doubling the table to 524,288 via sysctl net.netfilter.nf_conntrack_max.
-
A single Tor process is limited to one CPU core, creating a performance ceiling that manifests at approximately 6,000 simultaneous users and 10 MB/s of Tor bandwidth. The solution is running multiple Tor processes (starting with 4, scaling to 12) sharing the same long-term identity keys, mediated by an HAProxy load balancer, which enabled a Snowflake bridge to scale from 2,000 to ~100,000 simultaneous users between December 2021 and February 2023.
-
Multiple Tor instances initialized with copied identity keys will independently rotate their medium-term onion keys on a 28-day schedule, causing clients with cached older keys to fail circuit construction. The fix is blocking Tor's onion key rotation by pre-creating directories at the filesystem rename targets (secret_onion_key.old, secret_onion_key_ntor.old), which now effectively makes onion keys long-term secrets requiring the same protection as identity keys.