2017-barradas-deltashaper
findings extracted from this paper
-
DeltaShaper embeds covert TCP/IP data into Skype's encrypted video stream through a virtual camera interface, treating Skype as a black box rather than mimicking its protocol. This approach resists active attacks by design: any in-path perturbation affects covert and legitimate streams identically, because real Skype software processes both. At the optimal encoding parameters (320x240 area, 8x8 cell size, 6 bits/cell, 1 fps), the system achieves a goodput of 2.56 Kbps with Reed-Solomon ECC or 3.12 Kbps without ECC, with an RTT of approximately 3 seconds.
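The raw capacity implied by these encoding parameters can be worked out directly. The sketch below is plain arithmetic on the numbers quoted above; the function name is hypothetical, and it assumes the cells tile the payload area exactly and that 1 Kbps means 1000 bits per second.

```python
def raw_bits_per_second(area_w, area_h, cell_w, cell_h, bits_per_cell, fps):
    """Raw capacity of a cell-grid encoding, before ECC or any other overhead."""
    if area_w % cell_w or area_h % cell_h:
        raise ValueError("cell size must tile the payload area exactly")
    cells = (area_w // cell_w) * (area_h // cell_h)
    return cells * bits_per_cell * fps


# 320x240 area, 8x8 cells, 6 bits/cell, 1 fps: 40 * 30 = 1200 cells per frame
raw = raw_bits_per_second(320, 240, 8, 8, 6, 1)
print(raw / 1000, "Kbps raw")  # 7.2 Kbps raw
```

The reported goodput (2.56 to 3.12 Kbps) is well below this 7.2 Kbps upper bound. The sketch does not model where the difference goes.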
-
Packet-length frequency distributions distinguish regular Skype calls from irregular streams when compared with Earth Mover's Distance (EMD): regular streams consistently produce EMD < 0.1 against a reference stream, while irregular streams range from 0.025 to 0.25. Because the two ranges overlap, no single threshold separates them perfectly. At the breakeven threshold ∆I = 0.066, an EMD classifier achieves 83% accuracy (equal sensitivity and specificity). An aggressive policy (∆A) catches all irregular streams at the cost of blocking 95% of legitimate calls; a conservative policy (∆C = 0.11) avoids false positives but lets 80% of irregular streams through.
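A threshold classifier of this kind can be sketched in a few lines. This is an illustration only, not the paper's implementation: the bin width, the maximum length, the function names, and the normalization of the EMD into [0, 1] are all assumptions, so the thresholds quoted above are meaningful only if the same normalization is used.

```python
from collections import Counter

DELTA_I = 0.066  # breakeven threshold quoted in the notes
DELTA_C = 0.11   # conservative threshold quoted in the notes


def length_histogram(packet_lengths, bin_width=10, max_len=1500):
    """Normalized frequency distribution of packet lengths over fixed bins.

    bin_width and max_len are hypothetical choices for this sketch.
    """
    if not packet_lengths:
        raise ValueError("need at least one packet")
    n_bins = max_len // bin_width + 1
    counts = Counter(min(length, max_len) // bin_width for length in packet_lengths)
    total = len(packet_lengths)
    return [counts.get(i, 0) / total for i in range(n_bins)]


def emd_1d(p, q):
    """EMD between two normalized histograms defined on the same bins.

    In one dimension the EMD is the summed absolute difference of the two
    CDFs; dividing by (bins - 1) scales the result into [0, 1] (assumption).
    """
    if len(p) != len(q) or len(p) < 2:
        raise ValueError("histograms must share the same bins")
    cum_p = cum_q = dist = 0.0
    for a, b in zip(p, q):
        cum_p += a
        cum_q += b
        dist += abs(cum_p - cum_q)
    return dist / (len(p) - 1)


def is_blocked(stream_lengths, reference_lengths, threshold=DELTA_I):
    """Flag a stream whose EMD against the reference exceeds the threshold."""
    emd = emd_1d(length_histogram(stream_lengths),
                 length_histogram(reference_lengths))
    return emd > threshold


reference = [1000] * 50 + [1100] * 50      # made-up packet lengths
print(is_blocked(reference, reference))    # False: identical distributions
print(is_blocked([200] * 100, reference))  # True: very different distribution
```

Raising the threshold from ∆I toward ∆C trades missed irregular streams for fewer blocked legitimate calls, which is the policy trade-off described above.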
-
Encoding parameters must be tuned jointly to remain unobservable: only specific combinations keep the EMD below the intermediate blocking threshold ∆I. At 1 fps, valid (area, cell size) configurations include (160x120, 4x4) and (320x240, 8x8); raising the frame rate above 1 fps pushes the EMD above ∆I for all multi-bit encodings. As bits per cell increase, video compression introduces more decoding errors, and error rates become unacceptable above 6 bits/cell for the (320x240, 8x8) configuration. This yields the candidate encoding: 320x240 area, 8x8 cells, 6 bits/cell, 1 fps.
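The link between bits per cell and decoding errors can be illustrated with a toy color quantizer. These notes do not say how DeltaShaper maps bits to cell colors, so the mapping below (bits split evenly across the R, G and B channels, evenly spaced levels, nearest-level decoding) is purely hypothetical, as are all function names. It only shows why packing more bits into a cell leaves less room for compression noise.

```python
def levels(bits_per_channel):
    """Evenly spaced intensity levels in 0..255 (hypothetical mapping)."""
    n = 2 ** bits_per_channel
    return [round(i * 255 / (n - 1)) for i in range(n)]


def encode_cell(value, bits_per_channel):
    """Map an integer of 3 * bits_per_channel bits to an (R, G, B) color."""
    lv = levels(bits_per_channel)
    mask = (1 << bits_per_channel) - 1
    return tuple(lv[(value >> (shift * bits_per_channel)) & mask]
                 for shift in (2, 1, 0))


def decode_cell(rgb, bits_per_channel):
    """Recover the integer by snapping each channel to its nearest level."""
    lv = levels(bits_per_channel)
    value = 0
    for channel in rgb:
        idx = min(range(len(lv)), key=lambda i: abs(lv[i] - channel))
        value = (value << bits_per_channel) | idx
    return value


def noise_margin(bits_per_channel):
    """Largest per-channel distortion that still decodes correctly."""
    lv = levels(bits_per_channel)
    return (lv[1] - lv[0]) / 2


for bits in (1, 2, 3):
    print(3 * bits, "bits/cell -> margin", noise_margin(bits))
```

In this toy model, 6 bits/cell tolerates a per-channel distortion of about 42 intensity levels, while 9 bits/cell tolerates only 18, so the same compression noise causes more decoding errors at higher bit counts.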
-
Across the eight combinations of four traffic features (packet length, bi-gram packet length, inter-packet time, bi-gram inter-packet time) and two similarity metrics (EMD, KS), adversarial classification accuracy against DeltaShaper streams ranges from 72% to 90% in unperturbed conditions. Bi-gram inter-packet times with EMD achieve 88% accuracy, matching packet length with EMD, but require roughly 10x the computation (~64 s vs ~6 s). Throttling bandwidth to 300 Kbps degrades classifier accuracy from 88% to 75%, but it also drops the Skype frame rate from 30 to 5 fps, and this collateral damage to legitimate calls limits a censor's ability to deploy throttling as a detection aid.
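Two of the building blocks named here, bi-gram features and the KS metric, can be sketched as follows. This is a minimal illustration with hypothetical function names: it assumes "KS" means the two-sample Kolmogorov-Smirnov statistic and that a bi-gram is a pair of consecutive observations. How the paper applies the metrics to bi-gram distributions is not described in these notes.

```python
from bisect import bisect_right
from collections import Counter


def bigram_distribution(values):
    """Relative frequency of each pair of consecutive observations."""
    pairs = list(zip(values, values[1:]))
    if not pairs:
        raise ValueError("need at least two observations")
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in points)


lengths = [100, 200, 100, 200]           # made-up packet lengths
print(bigram_distribution(lengths))      # two distinct bi-grams
print(ks_statistic([1, 2, 3], [1, 2, 3]))  # 0.0
print(ks_statistic([1, 2, 3], [4, 5, 6]))  # 1.0
```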
-
FreeWave, the VoIP-based predecessor, was vulnerable to passive traffic analysis because its covert Skype streams exhibited packet-size distributions different from those of legitimate calls, enabling detection with high probability. DeltaShaper's video-based approach with EMD-constrained encoding addresses this specific failure mode, but at a severe throughput cost: FreeWave achieves 18.75 Kbps, against DeltaShaper's 2.56–3.12 Kbps goodput. Throughput of other comparable systems: CovertCast ~168 Kbps (no unobservability constraints), Castle 3.48 Kbps, SkypeLine 0.064 Kbps, Rook 0.024–0.04 Kbps.