Attacks on mixnets

thisbefruit · October 30, 2023, 4:41pm

Since Mel’s latest project Earendil is a mixnet, I was looking into some common attacks on mixnets. Here’s a sample of what I found:

Active traffic manipulation attacks

An active attacker can not only monitor the network, but also delay, delete, replay, and inject new messages into the network (Danezis 03)

[Trivial] Replay attack: attacker wishes to find out the destination of a packet P. So they duplicate P going into a mix node and determines P’s next hop by seeing which mix node receives two packets from the current node, repeating as needed
- Solution: implement replay protection
Message corruption: attacker can corrupt a message and use the effects to identify that message at a later stage of mixing
- Solution: implement integrity checking
Trickling attack: attacker blocks all packets from entering a mixnet except for P. As threshold mixes usually have a maximum duration they hold packets for, and the absence of other packets does not prevent continuous-time mixes from releasing P, eventually the attacker is able to observe the destination of P
- Solution: inject cover traffic
Flooding (n-1) attack: this applies to threshold mixes. For a threshold mixnet with threshold = n, attacker only lets P out of all “legitimate” traffic into a mix node, then injects n-1 packets into the node, causing the node to release all n packets. As the attacker set the destinations of all the released packets except for P, he observes the next hop of P. He repeats until P reaches its destination
- Solution: use a continuous-time mixing strategy
- Trickling & flooding attack paper: https://www.freehaven.net/anonbib/cache/trickle02.pdf
More on active traffic manipulation attacks: https://www.freehaven.net/anonbib/cache/ih05-Luke.pdf

Passive observation attacks

[Trivial] Message size: attacker can identify, or at least reduce the anonymity set, of messages sent through the mixnet by observing the message size
- Solution: pad all messages to the same size

Big: receiver/route disclosure attacks

Earliest version (2000): intersection attack
- By position in mix route:
  - Assume attacker controls all but 1 node in a free-routing mixnet. Assume this network has a maximum allowed route length. Assume everyone sets their route length to the max. Then the anonymity set for a given packet going into the honest node is the set of packets that have the honest node at the same position on their route as the packet of interest.
  - If we assume that Alice sends multiple messages to Bob, then we could observe all packets going out of Alice and take the intersection of all possible destinations for each packet, using the method given above. The more packets we assume Alice sends to Bob, the smaller the intersection, whether Alice picks a different route each time or not.
  - Clearly, cascade mixes do not have this vulnerability, because a given mix node is always at a fixed position on the routes of all packets, since there is only one possible route.
- By finding the next mix node
  - Assume attacker controls all but one or a few node in a free-routing mixnet. Assume Alice sends several messages, only to Bob, each time using the same route. The attacker wants to know who Alice is talking to. Then the attacker can observe all the traffic coming out of the honest node H whenever a message from Alice enters H, obtaining a set of potential next nodes. The attacker takes the intersection of these sets, eventually narrowing down the exact next node on Alice’s route to Bob, and repeats this attack on the next node until he reaches Bob.
- The paper: https://www.freehaven.net/anonbib/cache/disad-free-routes.pdf
Next generation (2002 Dogan et al.): disclosure attack for any mix
- You can generalize the intersection attack to all mixnets. Assume a global passive adversary who can observe Alice sending packets into the mixnet, as well as all connections in the mixnet. Then they can apply the following algorithm:
  - Observation: the attacker observes Alice sending messages into the mixnet for a period of time, each time obtaining a set of potential receivers of the message. Eventually they obtain a number of mutually disjoint sets (let’s call the set of these sets M); each of the sets in M then contains a unique correspondent of Alice
  - Exclusion: the attacker continues to watch the network to gather more sets of potential receivers for Alice’s messages. Whenever they obtain a set o that shares elements with only one member s of M, they update that member of M with s intersect o. They repeat until each member of M only contains one element – a correspondent of Alice.
- With enough observations, you can always find out with 100% certitude a set of Alice’s conversation partners
- However the naive observation-exclusion algo given is NP hard
- The paper: https://www.freehaven.net/anonbib/cache/agrawal03.pdf
Gen 3: statistical disclosure attack for threshold mixes (2003 Danezis)
- Danezis finds a statistical, non-NP-hard way to do disclosure attacks that is surprisingly efficient
More statistical disclosure attacks
- Statistical disclosure attack for pool mixes (2004 Danezis + Serjantov)
- Statistical disclosure taking advantage of reply blocks: https://www.freehaven.net/anonbib/cache/danezis-pet2007.pdf
- Improved statistical disclosure:
  https://www.freehaven.net/anonbib/cache/hitting-set04.pdf
- Statistical disclosure using Bayesian inference
  - https://www.freehaven.net/anonbib/cache/DBLP:conf/ccs/TroncosoD09.pdf
  - Vida
- Flow separation: https://www.freehaven.net/anonbib/cache/pet05-zhu.pdf
Most recently, deep-learning disclosure attacks apply machine learning to learn correlation functions tailored for a particular system (rather than applying generic statistical algorithms). These are apparently significantly more efficient that traditional statistical disclosure attacks.
- https://www.freehaven.net/anonbib/cache/deepcorr-ccs2018.pdf

Mitigations for disclosure attacks: cover traffic + highly variable delays

Attacks on low-latency mixes (e.g. Tor)

Time-pattern fingerprinting
- By latency: we observe that with each additional connection though a Tor node, latency through that node increases (this is because Tor is constantly overloaded). So we can measure traffic patterns (for example, search for the pattern of visiting a particular website) going through nodes with only the ability to send traffic through nodes! Specifically, we can observe how much traffic is going through a node at any given time by sending probe traffic through that node and measuring the latency. It turns out this is a viable avenue for attack
  - https://www.freehaven.net/anonbib/cache/torta05.pdf
- Mitigation: stop overloading Tor nodes to the point where one additional connection increases latency!
- Next attack: more connections still means more CPU usage → hotter computer → affects clock skew, which we can measure!
  - https://www.freehaven.net/anonbib/cache/HotOrNot.pdf
- There are many, many more versions of time-pattern attacks on low-latency mixes like Tor. Website fingerprinting is currently an unsolved vulnerability in Tor.
  - Latest: fingerprinting via deep learning works very well:
    https://www.freehaven.net/anonbib/cache/deepwf-ccs2018.pdf

Other attacks

DDoS attack: this is not mixnet-specific. This paper analyzes the ways and amount of resources it would take to DDoS Tor in 2018/19
Internet routing-based attacks
Compulsory revelation of secret keys (such as by governments): forward secrecy is important!

Measuring mixnet security

Information-theoretic based approaches:
- Several papers by Claudia Diaz et al.
Covert side-channel based approach:
https://www.freehaven.net/anonbib/cache/newman:pet2004.pdf

Research history

Lots of research on mixnets (new designs, attacks, defenses, anonymity analyses, etc.) happened in the 2000’s
Then in the 2010’s the research became focused on Tor, maybe because Tor started becoming big
Recently, the community has been applying deep learning to implement cheaper and more powerful attacks against mixnets