2014-wang-gohop
findings extracted from this paper
-
GoHop without traffic shaping achieved 76.8–78.5 Mbps (virtual NIC) on a 1 Gbps LAN; traffic shaping reduced this to 58.1 Mbps (~26% overhead from fragmentation). In a Beijing-to-Seattle real-world download test, GoHop delivered 960–999 KB/s against a 1,544 KB/s direct baseline, with the 96.7 Mbps WAN link—not GoHop—as the bottleneck. This compares to Tor's 40–300 KB/s (30–80 KB/s with obfuscation plugins such as SkypeMorph).
-
Packet padding alone is insufficient to defeat statistical traffic analysis unless every packet is padded to MTU; small-size padding has minimal effect on classifier accuracy (citing Hjelmvik & John 2010). Traffic shaping that also fragments large packets—transforming the full packet-size CDF to match a target distribution rather than merely inflating small packets—is required to statistically impersonate a target traffic class.
-
A pre-shared key enables encrypting the entire GoHop packet—header, payload, and padding bytes—achieving true randomness in the full byte stream. Standard VPN protocols such as OpenVPN encrypt only the payload while leaving headers in plaintext, exposing protocol-identifying fields to DPI without payload inspection. This design choice is a prerequisite for defeating header-based fingerprinting.
-
Spreading UDP datagrams across a randomized port range breaks traditional 5-tuple-based session tracking, randomizes per-port inter-arrival times, and reduces per-port throughput to a small fraction of the aggregate—making per-flow statistical analysis significantly harder. Critically, the number of random ports does not reduce aggregate throughput: GoHop measured 76.8 Mbps (1 port) versus 78.5 Mbps (100 ports) at the virtual NIC.
-
GoHop's naïve traffic shaping targeting a uniform packet-size distribution (0–MTU) successfully morphed both HTTP and SSH flows: K-S test D values were 0.019 (HTTP) and 0.022 (SSH), both below the 0.025 rejection threshold, with p-values of 0.20 and 0.11 respectively. After shaping, packet-size CDFs and statistical metrics (mean ~782–783 bytes, variance ~163,600) for both protocols became nearly identical, eliminating the size signals that distinguish them.