{% extends "_layout.html" %} {% block title %}How the Network Database (netDb) Works{% endblock %} {% block content %}

I2P's network database (netDb) is a specialized Kademlia-derived DHT containing just two types of data: router contact information and destination contact information. Each piece of data is signed by the appropriate party and verified by anyone who uses or stores it. In addition, the data carries liveness information, allowing irrelevant entries to be dropped, newer entries to replace older ones, and, for the paranoid, protection against certain classes of attack. (Note that this is also why I2P bundles the code needed to determine the correct time.)
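
The store-side rules above (verify the signature, drop stale entries, let newer entries replace older ones) can be sketched as follows. This is a minimal illustration and not the router's actual code: the `Entry` and `Store` names, the `verify` callback, and the millisecond timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Entry:
    key: bytes          # netDb key the entry is stored under
    published_ms: int   # liveness timestamp carried inside the signed data
    payload: bytes      # opaque signed structure (hypothetical encoding)


class Store:
    """Hypothetical local store: verified, fresh, newest-wins."""

    def __init__(self, verify: Callable[[Entry], bool], max_age_ms: int):
        self._verify = verify          # stand-in for real signature checking
        self._max_age_ms = max_age_ms
        self._entries: Dict[bytes, Entry] = {}

    def put(self, entry: Entry, now_ms: int) -> bool:
        if not self._verify(entry):
            return False               # bad signature: never stored
        if now_ms - entry.published_ms > self._max_age_ms:
            return False               # already stale on arrival
        current = self._entries.get(entry.key)
        if current is not None and current.published_ms >= entry.published_ms:
            return False               # we already hold this or a newer entry
        self._entries[entry.key] = entry
        return True

    def get(self, key: bytes, now_ms: int) -> Optional[Entry]:
        entry = self._entries.get(key)
        if entry is not None and now_ms - entry.published_ms > self._max_age_ms:
            del self._entries[key]     # drop expired entries lazily on read
            return None
        return entry
```

Both checks depend on comparing timestamps against the local clock, which is why an accurate notion of the current time matters.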

RouterInfo

When an I2P router wants to contact another router, it needs to know some key pieces of data, all of which are bundled up and signed by the router into a structure called the "RouterInfo", which is distributed under the key derived from the SHA256 of the router's identity. The structure itself contains:

The arbitrary text options are currently used to help debug the network, publishing various stats about the router's health. These stats will be disabled by default once I2P 1.0 is out, and can be disabled by adding "router.publishPeerRankings=false" to the router configuration. The data published can be seen on the router's netDb page, but should not be trusted.

LeaseSet

The second piece of data distributed in the netDb is a "LeaseSet", which documents a group of tunnel entry points (leases) for a particular client destination. Each of these leases specifies the tunnel's gateway router (by the hash of its identity), the tunnel ID on that router to which messages should be sent (a 4-byte number), and when that tunnel will expire. The LeaseSet itself is stored in the netDb under the key derived from the SHA256 of the destination.
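
As a rough illustration of the fields just described, a lease and a LeaseSet might be modelled like this. The class and field names, the epoch-millisecond expiry, and the treatment of the destination as opaque bytes are assumptions for the sketch; the real wire format is not shown here.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Lease:
    gateway_hash: bytes  # SHA256 of the gateway router's identity (32 bytes)
    tunnel_id: int       # 4-byte tunnel ID on that gateway
    expires_ms: int      # when the tunnel expires (epoch ms, assumed unit)

    def __post_init__(self):
        if len(self.gateway_hash) != 32:
            raise ValueError("gateway_hash must be a 32-byte SHA256 digest")
        if not 0 <= self.tunnel_id < 2**32:
            raise ValueError("tunnel_id must fit in 4 bytes")


@dataclass(frozen=True)
class LeaseSet:
    destination: bytes            # serialized destination, treated as opaque
    leases: Tuple[Lease, ...]

    def netdb_key(self) -> bytes:
        # The LeaseSet is stored under a key derived from SHA256(destination).
        return hashlib.sha256(self.destination).digest()

    def live_leases(self, now_ms: int) -> List[Lease]:
        # Only leases whose tunnels have not yet expired are usable.
        return [lease for lease in self.leases if lease.expires_ms > now_ms]
```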

In addition to these leases, the LeaseSet also includes the destination itself (namely, the destination's 2048-bit ElGamal encryption key, 1024-bit DSA signing key, and certificate) as well as an additional pair of signing and encryption keys. These additional keys can be used for garlic routing messages to the router on which the destination is located (though these keys are not the router's keys - they are generated by the client and given to the router to use). End-to-end client messages are still, of course, encrypted with the destination's public keys.

Bootstrapping

The netDb, being a DHT, is completely decentralized; however, you do need at least one reference to a peer so that the DHT's integration process can tie you in. This is accomplished by "reseeding" your router with the RouterInfo of an active peer - specifically, by retrieving their routerInfo-$hash.dat file and storing it in your netDb/ directory. Anyone can provide you with those files - you can even provide them to others by exposing your own netDb directory. To simplify the process of finding someone with a RouterInfo, an alias has been made to the netDb dir of one of the routers on dev.i2p.net.

Healing

While the Kademlia algorithm is fairly efficient at maintaining the necessary links, we keep additional statistics regarding the netDb's activity so that we can detect potential segmentation and actively avoid it. This is done as part of the peer profiling - with data points such as how many new and verifiable RouterInfo references a peer gives us, we can determine which peers know about groups of peers that we have never seen references to. When this occurs, we can take advantage of Kademlia's flexibility in exploration and send requests to that peer so as to integrate ourselves further with the part of the network seen by that well-integrated router.
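
The "new and verifiable references" data point could be tracked along these lines. This is only a sketch of the idea: `ReferenceStats`, its method names, and the ranking rule are hypothetical and do not describe the actual peer profiling code.

```python
from collections import Counter
from typing import List, Set


class ReferenceStats:
    """Hypothetical per-peer count of new, verifiable RouterInfo references."""

    def __init__(self) -> None:
        self.known: Set[bytes] = set()   # router hashes we have already seen
        self.new_refs: Counter = Counter()

    def record(self, from_peer: bytes, router_hash: bytes, verified: bool) -> None:
        # Only references that verify AND that we had never seen before count.
        if verified and router_hash not in self.known:
            self.known.add(router_hash)
            self.new_refs[from_peer] += 1

    def exploration_candidates(self, n: int) -> List[bytes]:
        # Peers that keep showing us unknown routers likely see a part of the
        # network we are poorly integrated with, so explore through them first.
        return [peer for peer, _ in self.new_refs.most_common(n)]
```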

Migration

Unlike traditional DHTs, the very act of conducting a search distributes the data as well, since rather than passing IP+port # pairs, references are given to the routers on which to query (namely, the SHA256 of those routers' identities). As such, iteratively searching for a particular destination's LeaseSet or router's RouterInfo will also provide you with the RouterInfo of the peers along the way.
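
A simplified walk shows why a search also spreads knowledge of peers. The sketch below assumes a hypothetical `query(peer, target)` callback that returns the value (if the peer has it) plus a list of router hashes to ask next; it visits referrals in arrival order rather than by closeness to the target, so it illustrates the side effect and not the real search strategy.

```python
from typing import Callable, List, Optional, Tuple

# A reply is (value or None, hashes of routers to query next).
Reply = Tuple[Optional[bytes], List[bytes]]


def iterative_search(
    target: bytes,
    start_peers: List[bytes],
    query: Callable[[bytes, bytes], Reply],
    max_queries: int = 20,
) -> Tuple[Optional[bytes], List[bytes]]:
    """Return (value or None, router hashes learned along the way)."""
    pending = list(start_peers)
    seen = set(start_peers)
    learned: List[bytes] = []
    queries = 0
    while pending and queries < max_queries:
        peer = pending.pop(0)
        queries += 1
        value, referrals = query(peer, target)
        for ref in referrals:
            if ref not in seen:
                # Each referral is a router we now have to learn about in
                # order to query it, so the search itself grows our view.
                seen.add(ref)
                learned.append(ref)
                pending.append(ref)
        if value is not None:
            return value, learned
    return None, learned
```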

In addition, due to the time sensitivity of the data, the information doesn't often need to be migrated. A LeaseSet is only valid for the 10 minutes that the referenced tunnels are around, so those entries can simply be dropped at expiration; they will be replaced at the new location when the router publishes a new LeaseSet.

To address the concerns of Sybil attacks, the location used to store entries varies over time. Rather than storing the RouterInfo on the peers closest to SHA256(router identity), it is stored on the peers closest to SHA256(router identity + YYYYMMdd), requiring an adversary to remount the attack daily so as to maintain closeness to the "current" keyspace. In addition, entries are probabilistically distributed to an additional peer outside of the target keyspace, so that a successful compromise of the K routers closest to the key will only degrade the search time.
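
The daily rotation can be sketched as follows. Several details are assumptions made for the example and are not stated above: the date is taken in UTC, it is appended as ASCII digits, and "closest" is measured with the usual Kademlia XOR metric over the 32-byte hashes.

```python
import hashlib
from datetime import datetime, timezone
from typing import List


def routing_key(identity: bytes, when: datetime) -> bytes:
    """SHA256(identity + YYYYMMdd); `when` must be timezone-aware."""
    date = when.astimezone(timezone.utc).strftime("%Y%m%d").encode("ascii")
    return hashlib.sha256(identity + date).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    # Kademlia-style distance: XOR of the two hashes read as big integers.
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_peers(key: bytes, peer_hashes: List[bytes], k: int) -> List[bytes]:
    # The k peers whose hashes are nearest to the (rotated) key.
    return sorted(peer_hashes, key=lambda p: xor_distance(p, key))[:k]
```

Because the key changes with the date, the set of closest peers for a given identity changes too, so positions an attacker occupied near yesterday's key are of no use today.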

Delivery

As with DNS lookups, the fact that someone is trying to retrieve the LeaseSet for a particular destination is sensitive (the fact that someone is publishing a LeaseSet even more so!). To address this, netDb searches and netDb store messages are simply sent through the router's exploratory tunnels.

Status

The netDb plays a very specific role in the I2P network, and the algorithms have been tuned towards our needs. This also means that it hasn't been tuned to address needs we have yet to run into. I2P is currently (2005/02/18) fairly small (only 200 nodes), and we have not yet had to deal with the situations in which Kademlia really shines - times when there are thousands or even millions of peers in the network. The netDb implementation more than adequately meets our needs at the moment, but there will likely be further tuning and bugfixing as the network grows.

{% endblock %}