{% extends "_layout.html" %} {% block title %}I2P's Threat Model{% endblock %} {% block content %}
Your level of anonymity can be described as how hard it is for someone to find out information you don't want them to know - who you are, where you are located, who you communicate with, or even when you communicate. "Perfect" anonymity is not a useful concept here - software will not make you indistinguishable from people who don't use computers or who are not on the Internet. Instead, I2P is working to provide sufficient anonymity to meet the real needs of whomever we can - from Joe Sixpack browsing porn to Tommy Trader sharing files to Irene Insurgent organizing an upcoming action.
The question of whether I2P provides sufficient anonymity for your particular needs is a hard one, but this page will hopefully assist in answering that question by exploring how I2P operates under various attacks so that you may decide whether it meets your needs.
We welcome further research and analysis on I2P's resistance to the threats described below. Both a review of the existing literature (much of it focused on Tor) and original work focused on I2P are needed.
I2P builds off the ideas of many other systems, but a few key points should be kept in mind when reviewing related literature:
Drawing on the attacks and analyses put forth in the anonymity literature (largely Traffic Analysis: Protocols, Attacks, Design Issues and Open Problems), the following briefly describes a wide variety of attacks as well as many of I2P's defenses. We will update this list to include new attacks as they are identified.
A brute force attack can be mounted by a global passive or active adversary, watching all the messages pass between all of the nodes and attempting to correlate which message follows which path. Mounting this attack against I2P should be nontrivial, as all peers in the network are frequently sending messages (both end to end and network maintenance messages), plus an end to end message changes size and data along its path. In addition, the external adversary does not have access to the messages either, as inter-router communication is both encrypted and streamed (making two 1024 byte messages indistinguishable from one 2048 byte message).
However, a powerful attacker can use brute force to detect trends - if they can send 5GB to an I2P destination and monitor everyone's network connection, they can eliminate all peers who did not receive 5GB of data. Techniques to defeat this attack exist, but may be prohibitively expensive (see: Tarzan's mimics or constant rate traffic). Most users are not concerned with this attack, as the cost of mounting it is extreme (and it often requires illegal activity). However, the attack is still possible, and those who want to defend against it would want to take appropriate countermeasures, such as not communicating with unknown destinations, not publishing one's current leaseSet in the network database, actively rerouting the associated tunnels 'mid stream', throttling the inbound tunnels themselves, and/or using restricted routes with trusted links to secure the local connection.
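The elimination step in that trend-detection attack can be sketched in a few lines. This is only an illustration under simplifying assumptions: the adversary is assumed to have exact per-peer inbound byte counts for the observation window, and the peer names, the `eliminate_by_volume` helper, and the figures are all hypothetical.

```python
from typing import Dict, Set

GB = 1024 ** 3


def eliminate_by_volume(observed_inbound: Dict[str, int], injected: int) -> Set[str]:
    """Keep only peers whose observed inbound volume could include the injected data."""
    return {peer for peer, seen in observed_inbound.items() if seen >= injected}


# Hypothetical per-peer inbound byte counts gathered by a global passive observer.
observations = {
    "peer-a": 6 * GB,
    "peer-b": 200 * 1024 ** 2,
    "peer-c": 5 * GB + 123,
    "peer-d": 1 * GB,
}

# The attacker pushed 5GB at the destination, so peers that saw less are ruled out.
candidates = eliminate_by_volume(observations, 5 * GB)
print(sorted(candidates))
```

Throttling inbound tunnels or constant rate traffic works against this filter by making the per-peer volumes look alike, so fewer peers can be ruled out.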
I2P's messages are unidirectional and do not necessarily imply that a reply will be sent. However, applications on top of I2P will most likely have recognizable patterns within the frequency of their messages - for instance, an HTTP request will be a small message with a large sequence of reply messages containing the HTTP response. Using this data as well as a broad view of the network topology, an attacker may be able to disqualify some links as being too slow to have passed the message along.
This sort of attack is powerful, but its applicability to I2P is not obvious, as the variation in message delays due to queuing, message processing, and throttling will often meet or exceed the time of passing a message along a single link - even when the attacker knows that a reply will be sent as soon as the message is received. There are some scenarios which will expose fairly automatic replies though - the streaming library does (with the SYN+ACK), as does the message mode of guaranteed delivery (with the DataMessage+DeliveryStatusMessage).
Without protocol scrubbing or higher latency, global active adversaries can gain substantial information. As such, people concerned with these attacks should increase the latency (per nontrivial delays or batching strategies), include protocol scrubbing, or use other advanced tunnel routing techniques.
References: Low-Resource Routing Attacks Against Anonymous Systems
Intersection attacks against low latency systems are extremely powerful - periodically make contact with the target and keep track of what peers are on the network. Over time, as node churn occurs the attacker will gain significant information about the target by simply intersecting the sets of peers that are online when a message successfully goes through. The cost of this attack is significant as the network grows, but may be feasible in some scenarios.
I2P does not attempt to address this for low latency communication, but does for peers who can afford delays (per nontrivial delays and batching strategies). In addition, this is only relevant for destinations that other people know about - a private group whose destination is only known to trusted peers does not have to worry, as an adversary can't "ping" them to mount the attack.
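The core of the intersection attack is plain set intersection. The sketch below assumes the attacker records the set of peers seen online each time a message to the target succeeds; the peer labels and the `intersect_online_sets` helper are hypothetical.

```python
from functools import reduce
from typing import Iterable, Set


def intersect_online_sets(rounds: Iterable[Set[str]]) -> Set[str]:
    """Intersect the sets of peers online during each successful contact."""
    rounds = list(rounds)
    if not rounds:
        return set()
    return reduce(lambda remaining, online: remaining & online, rounds)


# Hypothetical snapshots of which peers were online when the target answered.
snapshots = [
    {"a", "b", "c", "d"},
    {"a", "c", "d", "e"},
    {"c", "d", "f"},
    {"c", "g"},
]

suspects = intersect_online_sets(snapshots)
print(suspects)  # {'c'}
```

Each round of churn can only shrink the candidate set, which is why the attack gains power over time and why a destination that cannot be contacted by the adversary is not exposed to it.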
There are a whole slew of denial of service attacks available against I2P, each with different costs and consequences:
Tagging attacks - modifying a message so that it can later be identified further along the path - are by themselves impossible in I2P, as messages passed through tunnels are signed. However, if an attacker is the inbound tunnel gateway as well as a participant further along in that tunnel, with collusion they can identify the fact that they are in the same tunnel (and prior to adding unique hop ids and other updates, colluding peers within the same tunnel can recognize that fact without any effort). An attacker in an outbound tunnel and any part of an inbound tunnel cannot collude however, as the tunnel encryption pads and modifies the data separately for the inbound and outbound tunnels. External attackers cannot do anything, as the links are encrypted and messages signed.
Partitioning attacks - finding ways to segregate (technically or analytically) the peers in a network - are important to keep in mind when dealing with a powerful adversary, since the size of the network plays a key role in determining your anonymity. Technical partitioning by cutting links between peers to create fragmented networks is addressed by I2P's built-in network database, which maintains statistics about various peers so that any existing connections to other fragmented sections can be exploited to heal the network. However, if the attacker does disconnect all links to uncontrolled peers, essentially isolating the target, no amount of network database healing will fix it. At that point, the only thing the router can hope to do is notice that a significant number of previously reliable peers have become unavailable and alert the client that it is temporarily disconnected (this detection code is not implemented at the moment).
Partitioning the network analytically by looking for differences in how routers and destinations behave and grouping them accordingly is also a very powerful attack. For instance, an attacker harvesting the network database will know when a particular destination has 5 inbound tunnels in their LeaseSet while others have only 2 or 3, allowing the adversary to potentially partition clients by the number of tunnels selected. Another partition is possible when dealing with the nontrivial delays and batching strategies, as the tunnel gateways and the particular hops with non-zero delays will likely stand out. However, this data is only exposed to those specific hops, so to partition effectively on that matter, the attacker would need to control a significant portion of the network (and still that would only be a probabilistic partition, as they wouldn't know which other tunnels or messages have those delays).
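As a rough illustration of partitioning by tunnel count, the sketch below groups harvested destinations by how many leases they publish. The data layout (a mapping from destination name to a list of gateway labels) and the `partition_by_lease_count` helper are assumptions made for the example, not I2P's actual LeaseSet structure.

```python
from collections import defaultdict
from typing import Dict, List


def partition_by_lease_count(lease_sets: Dict[str, List[str]]) -> Dict[int, List[str]]:
    """Group destinations by the number of inbound tunnels they advertise."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for destination, leases in lease_sets.items():
        groups[len(leases)].append(destination)
    return dict(groups)


# Hypothetical harvest: destination -> advertised inbound tunnel gateways.
harvested = {
    "dest1": ["gwA", "gwB"],
    "dest2": ["gwC", "gwD", "gwE"],
    "dest3": ["gwF", "gwG", "gwH", "gwI", "gwJ"],
    "dest4": ["gwK", "gwL"],
}

groups = partition_by_lease_count(harvested)
for count, members in sorted(groups.items()):
    print(count, members)
```

A destination that lands in a group of one, like the 5-tunnel destination here, stands out from the crowd, which is the argument for keeping such settings close to the defaults.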
The predecessor attack is passively gathering statistics in an attempt to see what peers are 'close' to the destination by participating in their tunnels and keeping track of the previous or next hop (for outbound or inbound tunnels, respectively). Over time, using a perfectly random sample of peers and random ordering, an attacker would be able to see which peer shows up as 'closer' statistically more than the rest, and that peer would in turn be where the target is located.
I2P avoids this in four ways: first, the peers selected to participate in tunnels are not randomly sampled throughout the network - they are derived from the peer selection algorithm which breaks them into tiers. Second, with strict ordering of peers in a tunnel, the fact that a peer shows up more frequently does not mean they're the source. Third, with permuted tunnel length even 0 hop tunnels can provide plausible deniability as the occasional variation of the gateway will look like normal tunnels. Fourth, with restricted routes, only the peer with a restricted connection to the target will ever contact the target, while attackers will merely run into that gateway.
References: http://prisms.cs.umass.edu/brian/pubs/wright.tissec.2008.pdf which is an update to the 2004 predecessor attack paper http://prisms.cs.umass.edu/brian/pubs/wright-tissec.pdf.
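The statistical core of the predecessor attack, in the idealized setting of uniformly random peer selection, is a frequency count. The sketch below assumes the attacker has logged the adjacent hop for each tunnel it joined; the log contents and the `most_frequent_predecessor` helper are hypothetical, and it does not model I2P's tiered selection, strict ordering, or restricted routes.

```python
from collections import Counter
from typing import List, Tuple


def most_frequent_predecessor(observed: List[str]) -> Tuple[str, float]:
    """Return the most frequently seen adjacent hop and its share of observations."""
    if not observed:
        raise ValueError("no observations recorded")
    peer, hits = Counter(observed).most_common(1)[0]
    return peer, hits / len(observed)


# Hypothetical log of the adjacent hop seen in each tunnel the attacker joined.
log = ["p1", "p2", "target", "p3", "target", "target", "p1", "target"]

peer, share = most_frequent_predecessor(log)
print(peer, share)  # target 0.5
```

The defenses listed above aim to break the assumption this count relies on, namely that the peer appearing most often next to the attacker is the source.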
The harvesting attack can be used for legal attacks and to help mount other attacks by simply running a peer, seeing who it connects to, and harvesting whatever references to other peers it can find. I2P itself is built to optimize this attack, since there is the distributed network database containing just this information. However, this by itself isn't a problem, since there are ways to track down who is participating in the network with essentially all public systems - I2P just makes it easier to do (and, in turn, gains the ability to combat partitioning and provide well bounded discovery time). With basic and comprehensive restricted routes, this attack loses much of its power, as the "hidden" peers do not publish their contact addresses in the network database - only the tunnels through which they can be reached (as well as their public keys, etc).
Sybil describes a category of attacks where the adversary creates arbitrarily large numbers of colluding nodes and uses the increased numbers to help mount other attacks. For instance, if an attacker is in a network where peers are selected randomly and they want an 80% chance to be one of those peers, they simply create four times the number of nodes that are in the network (so that they control four out of every five peers) and roll the dice. When identity is free, Sybil can be a very potent technique for a powerful adversary. The primary technique to address this is simply to make identity 'non free' - Tarzan (among others) uses the fact that IP addresses are limited, while IIP used HashCash to 'charge' for creating a new identity. We currently have not implemented any particular technique to address Sybil, but do include placeholder certificates in the router's and destination's data structures which can contain a HashCash certificate of appropriate value when necessary (or some other certificate proving scarcity).
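The arithmetic behind the Sybil example can be checked directly. Assuming a single peer is picked uniformly at random, an attacker running s nodes alongside h honest nodes is chosen with probability s / (h + s). Under that assumption, an 80% chance takes four Sybil nodes per honest node, while five per honest node gives roughly 83%. The function names below are hypothetical.

```python
import math
from fractions import Fraction


def attacker_fraction(honest: int, sybil: int) -> float:
    """Probability that one uniformly random pick is an attacker-controlled node."""
    return sybil / (honest + sybil)


def sybils_needed(honest: int, target: Fraction) -> int:
    """Smallest s with s / (honest + s) >= target, using exact rational arithmetic."""
    if not 0 <= target < 1:
        raise ValueError("target must be in [0, 1)")
    # s / (h + s) >= t  rearranges to  s >= t * h / (1 - t)
    return math.ceil(target * honest / (1 - target))


honest_nodes = 1000
needed = sybils_needed(honest_nodes, Fraction(4, 5))
print(needed)                                         # 4000
print(attacker_fraction(honest_nodes, needed))        # 0.8
print(round(attacker_fraction(honest_nodes, 5000), 3))  # 0.833
```

`Fraction` is used for the target so that the ceiling is not thrown off by floating-point rounding in `1 - 0.8`.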
We are still assuming the security of the cryptographic primitives explained elsewhere. That includes the immediate detection of altered messages along the path, the inability to decrypt messages not addressed to you, and defense against man-in-the-middle attacks. The network protocol and data structures support securely padding messages to arbitrary sizes, so messages could be made constant size or garlic messages could be modified randomly so that some cloves appear to contain more subcloves than they actually do. At the moment, however, garlic, tunnel, and end to end messages include simple random padding.
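To make the idea of constant-size messages concrete, here is a minimal sketch of padding a payload to a fixed size with random bytes. It is not I2P's wire format: the two-byte length prefix, the 1024-byte target size, and the function names are assumptions made for the example.

```python
import os
import struct

LENGTH_PREFIX = 2  # bytes used to record the real payload length (assumed layout)


def pad_to_fixed(payload: bytes, size: int) -> bytes:
    """Prefix the payload with its length and fill the remainder with random bytes."""
    if len(payload) > 0xFFFF or len(payload) + LENGTH_PREFIX > size:
        raise ValueError("payload does not fit in the fixed message size")
    filler = os.urandom(size - LENGTH_PREFIX - len(payload))
    return struct.pack(">H", len(payload)) + payload + filler


def unpad(padded: bytes) -> bytes:
    """Recover the original payload from a padded message."""
    (length,) = struct.unpack(">H", padded[:LENGTH_PREFIX])
    return padded[LENGTH_PREFIX:LENGTH_PREFIX + length]


short = pad_to_fixed(b"hi", 1024)
longer = pad_to_fixed(b"x" * 900, 1024)
print(len(short), len(longer))  # 1024 1024
```

Once encrypted, both messages are the same length on the wire, so an observer cannot tell the short payload from the long one by size alone.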
In addition to the floodfill DOS attacks described above, floodfill routers are uniquely positioned to learn about network participants, due to their centralized role in the netDb, and the high frequency of communication with those participants. The specific potential threats and corresponding defenses are a topic for further research.
There are a few centralized or limited resources (some inside I2P, some not) that could be attacked or used as a vector for attacks. The absence of jrandom starting November 2007, followed by the loss of the i2p.net hosting service in January 2008, highlighted numerous centralized resources in the development and operation of the I2P network, most of which are now distributed. Attacks on externally-reachable resources mainly affect the ability of new users to find us, not the operation of the network itself.
These attacks aren't directly on the network, but instead go after its development team by either introducing legal hurdles for anyone contributing to the development of the software, or by using whatever means are available to get the developers to subvert the software. Traditional technical measures cannot defeat these attacks, and if someone threatened the lives of a developer's loved ones (or even just issued a court order along with a gag order, under threat of prison), everyone should expect that the code would be modified.
However, two techniques help defend against these attacks:
Try as we might, most nontrivial applications include errors in the design or implementation, and I2P is no exception. There may be bugs that could be exploited to attack the anonymity or security of the communication running over I2P in unexpected ways. To help withstand attacks against the design or protocols in use, we publish all designs and documentation in the open and ask for any and all criticism, in the hope that many eyes will improve the system. In addition, the code is being treated the same way, with little aversion towards reworking or throwing out something that isn't meeting the needs of the software system (including ease of modification). Documentation for the design and implementation of the network and the software components is an essential part of security, as without it developers are unlikely to spend the time to learn the software well enough to identify shortcomings and bugs.
To some extent, I2P could be enhanced to avoid peers operating at IP addresses listed in a blocklist. Several blocklists are commonly available in standard formats, listing anti-P2P organizations, potential state-level adversaries, and others.
To the extent that active peers actually do show up in the actual blocklist, blocking by only a subset of peers would tend to segment the network, exacerbate reachability problems, and decrease overall reliability. Therefore we would want to agree on a particular blocklist and enable it by default.
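A check of a peer's address against a blocklist could look roughly like the sketch below. The list format is an assumption (one CIDR range per line, with `#` comments), as are the function names; it is not the format of any particular published blocklist, and the addresses are reserved documentation ranges.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_blocklist(lines: Iterable[str]) -> List[Network]:
    """Parse one CIDR range per line, ignoring blank lines and '#' comments."""
    networks: List[Network] = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(address: str, networks: List[Network]) -> bool:
    """Return True if the address falls inside any listed range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


# Hypothetical blocklist using reserved documentation ranges.
blocklist = parse_blocklist([
    "# example entries only",
    "192.0.2.0/24",
    "198.51.100.0/24  # second range",
    "",
])

print(is_blocked("192.0.2.55", blocklist))   # True
print(is_blocked("203.0.113.9", blocklist))  # False
```

A linear scan is fine for a short list; a list with many thousands of ranges would want a sorted or tree-based lookup instead.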
Any blocking within i2p would have to be implemented at the following places in the code to be fully effective. Different places address the various possible malicious behaviors.
Blocklists are only a part (perhaps a small part) of an array of defenses against maliciousness. In large part the profiling system does a good job of measuring router behavior so that we don't need to trust anything in netDb. However there is more that can be done. For each of the areas in the list above there are improvements we can make in detecting badness.
If a blocklist is hosted at a central location with automatic updates the network is vulnerable to a central resource attack. Automatic subscription to a list gives the list provider the power to shut the i2p network down. Completely. So we probably won't be doing that. Inclusion in i2pupdate files perhaps? Signed or unsigned? In-I2P subscription?
Initial test results show that the common splist.txt blocklist is overly broad for our use - at the least we don't want to block Tor users that are in splist. Level1 + bogon + (maybe) level2 is more reasonable? But we don't want to block potential anonymity researchers at .edu locations. {% endblock %}