{% extends "_layout.html" %}
{% block title %}How the Network Database (netDb) Works{% endblock %}
{% block content %}
<p>
Updated July 2010, current as of router version 0.8
</p>
<h2>Overview</h2>
<p>I2P's netDb is a specialized distributed database containing
just two types of data: router contact information (<b>RouterInfos</b>) and destination contact
information (<b>LeaseSets</b>). Each piece of data is signed by the appropriate party and verified
by anyone who uses or stores it. In addition, the data carries liveness information,
allowing irrelevant entries to be dropped and newer entries to replace
older ones, and providing protection against certain classes of attack.
</p>
<p>
The netDb is distributed with a simple technique called "floodfill".
Previously, the netDb also used the Kademlia DHT as a fallback algorithm. However,
it did not work well in our application, and it was completely disabled
in release 0.6.1.20. More information is <a href="status">below</a>.
</p>
<p>
Note that this document has been updated to include floodfill details,
but there are still some incorrect statements about the use of Kademlia that
need to be fixed up.
</p>
<h2><a name="routerInfo">RouterInfo</a></h2>
<p>When an I2P router wants to contact another router, it needs to know some
key pieces of data - all of which are bundled up and signed by the router into
a structure called the "RouterInfo", which is distributed under the key derived
from the SHA256 of the router's identity. The structure itself contains:</p><ul>
<li>The router's identity (a 2048-bit ElGamal encryption key, a 1024-bit DSA signing key, and a certificate)</li>
<li>The contact addresses at which it can be reached (e.g. TCP: example.org port 4108)</li>
<li>When this was published</li>
<li>A set of arbitrary text options</li>
<li>The signature of the above, generated by the identity's DSA signing key</li>
</ul>
<p>
The following text options, while not strictly required, are expected
to be present:
</p>
<ul>
<li><b>caps</b>
(Capabilities flags - used to indicate approximate bandwidth and reachability)</li>
<li><b>coreVersion</b>
(The core library version, always the same as the router version)</li>
<li><b>netId</b> = 2
(Basic network compatibility - a router will refuse to communicate with a peer having a different netId)</li>
<li><b>router.version</b>
(Used to determine compatibility with newer features and messages)</li>
<li><b>stat_uptime</b> = 90m
(Always sent as 90m, for compatibility with an older scheme where routers published their actual uptime,
and only sent tunnel requests to peers whose uptime was more than 60m)</li>
</ul>
<p>
These values are used by other routers for basic decisions.
Should we connect to this router? Should we attempt to route a tunnel through this router?
The bandwidth capability flag, in particular, is used only to determine whether
the router meets a minimum threshold for routing tunnels.
Above the minimum threshold, the advertised bandwidth is not used or trusted anywhere
in the router, except for display in the user interface and for debugging and network analysis.
</p>
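<p>
As a rough illustration of these basic decisions, the following minimal Python sketch checks a peer's
published options. The function names and the plain dictionary used to hold the options are assumptions
made for illustration, not the router's actual API; only the option names and the netId value of 2
come from the list above.
</p>

```python
# Hypothetical sketch: a RouterInfo's text options modelled as a dict of strings.
EXPECTED_NET_ID = "2"  # value given in the option list above

EXPECTED_OPTIONS = ("caps", "coreVersion", "netId", "router.version", "stat_uptime")


def missing_options(options):
    """Return the expected option names that a RouterInfo does not carry."""
    return [name for name in EXPECTED_OPTIONS if name not in options]


def should_communicate(options):
    """Refuse peers that advertise a different netId.

    Treating a missing netId as a mismatch is an assumption of this sketch.
    """
    return options.get("netId") == EXPECTED_NET_ID
```

<p>
Since the options are expected but not strictly required, the sketch reports missing ones
rather than rejecting the peer outright; only a netId mismatch causes a refusal.
</p>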
<p>Additional text options include
a small number of statistics about the router's health, which are aggregated by
sites such as <a href="http://stats.i2p/">stats.i2p</a>
for network performance analysis and debugging.
These statistics were chosen to provide data crucial to the developers,
such as tunnel build success rates, while balancing the need for such data
with the side-effects that could result from revealing this data.
Current statistics are limited to:
</p>
<ul>
<li>1 hour average bandwidth (average of outbound and inbound bandwidth)</li>
<li>Client and exploratory tunnel build success, reject, and timeout rates</li>
<li>1 hour average number of participating tunnels</li>
</ul>
<p>
The data
published can be seen in the router's user interface,
but is not used or trusted within the router.
As the network has matured, we have gradually removed most of the published
statistics to improve anonymity, and we plan to remove more in future releases.
</p>
<p>
<a href="common_structures_spec.html#struct_RouterInfo">RouterInfo specification</a>
</p>
<p>
<a href="http://docs.i2p2.de/core/net/i2p/data/RouterInfo.html">RouterInfo Javadoc</a>
</p>
<h2><a name="leaseSet">LeaseSet</a></h2>
<p>The second piece of data distributed in the netDb is a "LeaseSet" - documenting
a group of tunnel entry points (leases) for a particular client destination.
Each of these leases specifies the tunnel's gateway router (with the hash of its
identity), the tunnel ID on that router to send messages (a 4 byte number), and
when that tunnel will expire. The LeaseSet itself is stored in the netDb under
the key derived from the SHA256 of the destination.</p>
<p>In addition to these leases, the LeaseSet also includes the destination
itself (namely, the destination's 2048-bit ElGamal encryption key, 1024-bit DSA
signing key, and certificate) as well as an additional pair of signing and
encryption keys. These additional keys can be used for garlic routing messages
to the router on which the destination is located (though these keys are <b>not</b>
the router's keys - they are generated by the client and given to the router to
use).
</p>
<p>
FIXME:
End-to-end client messages are still, of course, encrypted with the
destination's public keys.
[UPDATE - This is no longer true; we don't do end-to-end client encryption
any more, as explained in
<a href="how_intro.html">the introduction</a>.
So is there any use for the first encryption key, signing key, and certificate?
Can they be removed?]
</p>
<p>
<a href="common_structures_spec.html#struct_Lease">Lease specification</a>
<br>
<a href="common_structures_spec.html#struct_LeaseSet">LeaseSet specification</a>
</p>
<p>
<a href="http://docs.i2p2.de/core/net/i2p/data/Lease.html">Lease Javadoc</a>
<br>
<a href="http://docs.i2p2.de/core/net/i2p/data/LeaseSet.html">LeaseSet Javadoc</a>
</p>
<h3><a name="revoked">Revoked LeaseSets</a></h3>
<p>
A LeaseSet may be <i>revoked</i> by publishing a new LeaseSet with zero leases.
</p>
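<p>
To make the structure concrete, here is a minimal Python sketch of a lease, a LeaseSet, and the revocation
check. The class and field names are illustrative assumptions, and the sketch stores only a hash of the
destination rather than the full destination and additional keys that a real LeaseSet carries.
</p>

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Lease:
    gateway_hash: bytes  # hash of the gateway router's identity
    tunnel_id: int       # 4 byte tunnel ID on that gateway
    expires_ms: int      # when the tunnel expires (the unit is an assumption)


@dataclass
class LeaseSet:
    destination_hash: bytes  # simplification: the real structure holds the destination itself
    leases: List[Lease] = field(default_factory=list)

    def is_revoked(self) -> bool:
        """A LeaseSet published with zero leases acts as a revocation."""
        return len(self.leases) == 0
```

<p>
Under this model, revoking a destination simply means publishing <code>LeaseSet(destination_hash)</code>
with an empty lease list.
</p>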
<h3><a name="encrypted">Encrypted LeaseSets</a></h3>
<p>
In an <i>encrypted</i> LeaseSet, all Leases are encrypted with a separate key.
The leases may only be decoded, and thus the destination may only be contacted,
by those with the key.
There is no flag or other direct indication that the LeaseSet is encrypted.
</p>
<h2><a name="bootstrap">Bootstrapping</a></h2>
<p>The netDb is decentralized; however, you do need at
least one reference to a peer so that the integration process
ties you in. This is accomplished by "reseeding" your router with the RouterInfo
of an active peer - specifically, by retrieving their <code>routerInfo-$hash.dat</code>
file and storing it in your <code>netDb/</code> directory. Anyone can provide
you with those files - you can even provide them to others by exposing your own
netDb directory. To simplify the process,
volunteers publish their netDb directories (or a subset) on the regular (non-i2p) network,
and the URLs of these directories are hardcoded in I2P.
When the router starts up for the first time, it automatically fetches from
one of these URLs, selected at random.
</p>
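<p>
The reseeding step can be sketched roughly as follows. This is an illustration only: the
<code>reseed</code> function and the caller-supplied <code>fetch</code> callback are hypothetical and
do not reflect the router's actual reseed code. Only the <code>routerInfo-$hash.dat</code> naming and
the random choice of URL are taken from the description above.
</p>

```python
import os
import random


def reseed(urls, fetch, netdb_dir, rng=random):
    """Fetch RouterInfo files from one randomly chosen URL into netdb_dir.

    `fetch(url)` is a hypothetical callback returning a dict that maps
    file names to file contents (bytes).
    """
    url = rng.choice(list(urls))
    os.makedirs(netdb_dir, exist_ok=True)
    saved = []
    for name, data in sorted(fetch(url).items()):
        # Ignore names containing path components, so nothing is written outside netdb_dir.
        if os.path.basename(name) != name:
            continue
        # Only keep files following the routerInfo-$hash.dat naming pattern.
        if name.startswith("routerInfo-") and name.endswith(".dat"):
            with open(os.path.join(netdb_dir, name), "wb") as f:
                f.write(data)
            saved.append(name)
    return url, saved
```

<p>
The files are untrusted input; as noted in the overview, each RouterInfo is signed and should be
verified before use, which this sketch does not attempt.
</p>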
<h2><a name="floodfill">Floodfill</a></h2>
<p>
(Adapted from a post by jrandom in the old Syndie, Nov. 26, 2005)
<br />
The floodfill netDb is really just a simple and perhaps temporary measure,
using the simplest possible algorithm - send the data to a peer in the
floodfill netDb, wait 10 seconds, pick a random peer in the netDb and ask them
for the entry to be sent, verifying its proper insertion / distribution. If the
verification peer doesn't reply, or they don't have the entry, the sender
repeats the process. When the peer in the floodfill netDb receives a netDb
store from a peer not in the floodfill netDb, they send it to all of the peers
in the floodfill netDb.
</p><p>
Peers still do netDb exploration and bootstrapping as before.
</p><p>
At one point, the Kademlia
search/store functionality was still in place. The peers
considered the floodfill peers as always being 'closer' to every key than any
peer not participating in the netDb. We fell back on the Kademlia
netDb if the floodfill peers failed for some reason or another.
However, Kademlia has since been disabled completely (see below).
</p><p>
Determining who is part of the floodfill netDb is trivial - it is exposed in each
router's published routerInfo.
</p><p>
As for efficiency, this algorithm is optimal when the netDb peers are known and
their number is appropriate for the uptime demands. Regarding
scaling, we've got two peers who participate in the netDb right now, and
they'll be able to handle the load by themselves until we've got 10k+ eepsites.
It's not as sexy as the old
Kademlia netDb, but there are subtle anonymity attacks against non-flooded
netDbs.
</p>
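<p>
The store-and-verify loop described at the start of this section can be sketched in Python as follows.
The function names, the <code>send_store</code> and <code>lookup</code> callbacks, and the
<code>max_attempts</code> limit are assumptions made for illustration; the text only specifies the
10 second wait, the random verification peer, and the retry on failure.
</p>

```python
import random
import time


def store_and_verify(key, entry, floodfills, send_store, lookup,
                     wait_seconds=10, max_attempts=3, sleep=time.sleep, rng=random):
    """Publish `entry`, then confirm that a randomly chosen peer can return it.

    `send_store(peer, key, entry)` and `lookup(peer, key)` are hypothetical
    transport callbacks; `lookup` returns the stored entry or None.
    Returns True once verification succeeds, False after max_attempts tries.
    """
    for _ in range(max_attempts):
        send_store(rng.choice(floodfills), key, entry)
        sleep(wait_seconds)                # give the entry time to be flooded
        verifier = rng.choice(floodfills)  # random peer used for verification
        if lookup(verifier, key) == entry:
            return True
        # No reply, or the entry is missing: repeat the whole process.
    return False


def on_store_received(key, entry, sender, other_floodfills, send_store, local_db):
    """Floodfill side: keep the entry and flood it if it came from a non-floodfill peer."""
    local_db[key] = entry
    if sender not in other_floodfills:
        for peer in other_floodfills:
            send_store(peer, key, entry)
```

<p>
Entries received from another floodfill peer are stored but not flooded again, which keeps a
single store from circulating indefinitely.
</p>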
<h3><a name="opt-in">Floodfill Router Opt-in</a></h3>
<p>
Unlike Tor, where the directory servers are hardcoded and trusted,
and operated by known entities,
the members of the I2P floodfill peer set need not be trusted and
change over time.
While some peers are manually configured to be floodfill,
others are simply high-bandwidth routers that automatically volunteer
when the number of floodfill peers drops below a threshold.
This prevents any long-term network damage from losing most or all
floodfills to an attack.
In turn, these peers will un-floodfill themselves when there are
too many floodfills outstanding.
</p>
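<p>
The opt-in rule amounts to a small decision function, sketched below in Python. The function name and the
<code>low_water</code> and <code>high_water</code> thresholds are hypothetical; the text does not give
concrete numbers, and the assumption that manually configured floodfills never step down is ours.
</p>

```python
def next_floodfill_state(is_floodfill, manually_configured, high_bandwidth,
                         known_floodfills, low_water, high_water):
    """Return True if this router should act as a floodfill.

    low_water and high_water are hypothetical thresholds on the number of
    floodfill peers this router currently knows about.
    """
    if manually_configured:
        return True  # assumption: manual floodfills never step down
    if not is_floodfill:
        # Volunteer only when floodfills are scarce and we have the bandwidth.
        return high_bandwidth and (low_water > known_floodfills)
    # Automatic floodfills step down when there are too many.
    return not (known_floodfills > high_water)
```

<p>
Using two separate thresholds rather than one leaves a band in which routers keep their current
state, so they do not flip back and forth as the count fluctuates.
</p>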
<p>
All netDb data are signed by their publisher, so a floodfill peer cannot
spoof netDb responses. All peers monitor the performance of the floodfill
routers they talk to, so that fake, malicious, or unresponsive floodfills
can be avoided.
While these defenses may not be sufficient to prevent all network disruption,
we continue to refine the automated detection of and responses to bad floodfills.
The available statistics should make the router ID of a troublemaker readily apparent.
We also have methods for users to manually block peers by router hash
or IP, and several channels to get the word out to users.
</p>
<h2><a name="healing">Healing</a></h2>
<p><i>Needs update since Kademlia is disabled.</i></p>
<p>While the Kademlia algorithm is fairly efficient at maintaining the necessary
links, we keep additional statistics regarding the netDb's activity so that we
can detect potential segmentation and actively avoid it. This is done as part of
the peer profiling - with data points such as how many new and verifiable
RouterInfo references a peer gives us, we can determine which peers know about
groups of peers that we have never seen references to. When this occurs, we can
take advantage of Kademlia's flexibility in exploration and send requests to that
peer so as to integrate ourselves further with the part of the network seen by
that well-integrated router.</p>
<h2><a name="migration">Migration</a></h2>
<p><i>Needs update since Kademlia is disabled.</i></p>
<p>Unlike traditional DHTs, the very act of conducting a search distributes the
data as well, since rather than passing IP+port # pairs, references are given to
the routers on which to query (namely, the SHA256 of those routers' identities).
As such, iteratively searching for a particular destination's LeaseSet or
router's RouterInfo will also provide you with the RouterInfo of the peers along
the way.</p>
<p>In addition, due to the time sensitivity of the data, the information doesn't
often need to be migrated - since a LeaseSet is only valid for the 10 minutes
that the referenced tunnels are around, those entries can simply be dropped at
expiration, since they will be replaced at the new location when the router
publishes a new LeaseSet.</p>
<p>To address the concerns of <a href="http://citeseer.ist.psu.edu/douceur02sybil.html">Sybil attacks</a>,
the location used to store entries varies over time. Rather than storing the
RouterInfo on the peers closest to SHA256(router identity), they are stored on
the peers closest to SHA256(router identity + YYYYMMdd), requiring an adversary
to remount the attack daily so as to maintain closeness to the "current"
keyspace. In addition, entries are probabilistically distributed to an additional
peer outside of the target keyspace, so that a successful compromise of the K
routers closest to the key will only degrade the search time.</p>
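<p>
The daily key rotation can be illustrated with a short Python sketch. The function names are
hypothetical, the exact byte encoding of the identity and date is an assumption, and the XOR distance
metric is the usual Kademlia choice rather than something stated above.
</p>

```python
import hashlib
from datetime import date


def routing_key(identity: bytes, day: date) -> bytes:
    """SHA256(router identity + YYYYMMdd); the ASCII date encoding is an assumption."""
    return hashlib.sha256(identity + day.strftime("%Y%m%d").encode("ascii")).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia-style distance between two keys of equal length."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_peers(target: bytes, peer_keys, k: int):
    """Return the k peer keys closest to the target key."""
    return sorted(peer_keys, key=lambda p: xor_distance(p, target))[:k]
```

<p>
Because the date is part of the hashed input, the same identity maps to an unrelated key each day,
so peers positioned close to today's key gain no advantage tomorrow.
</p>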
<h2><a name="delivery">Delivery</a></h2>
<p>As with DNS lookups, the fact that someone is trying to retrieve the LeaseSet
for a particular destination is sensitive (the fact that someone is <i>publishing</i>
a LeaseSet even more so!). To address this, netDb searches and netDb store
messages are simply sent through the router's exploratory tunnels.</p>
<h2><a name="multihome">MultiHoming</a></h2>
<p>Destinations may be hosted on multiple routers simultaneously, by using the same
private and public keys (traditionally named eepPriv.dat files).
As both instances will periodically publish their signed LeaseSets to the floodfill peers,
the most recently published LeaseSet will be returned to a peer requesting a database lookup.
As LeaseSets have (at most) a 10 minute lifetime, should a particular instance go down,
the outage will be 10 minutes at most, and generally much less than that.
The multihoming behavior has been verified with the test eepsite
<a href="http://multihome.i2p/">http://multihome.i2p/</a>.
</p>
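<p>
The "most recent LeaseSet wins" behavior can be modelled with a small Python sketch of a floodfill-side
store. The class and method names are hypothetical, and measuring the 10 minute lifetime from the
publication time is a simplification of the per-lease expiration described earlier.
</p>

```python
class LeaseSetStore:
    """Hypothetical floodfill-side store keyed by destination hash."""

    LIFETIME_MS = 10 * 60 * 1000  # LeaseSets live at most 10 minutes

    def __init__(self):
        self._entries = {}  # destination hash -> (published_ms, lease_set)

    def publish(self, dest_hash, published_ms, lease_set):
        current = self._entries.get(dest_hash)
        # Keep whichever LeaseSet was published most recently.
        if current is None or published_ms > current[0]:
            self._entries[dest_hash] = (published_ms, lease_set)

    def lookup(self, dest_hash, now_ms):
        current = self._entries.get(dest_hash)
        if current is None:
            return None
        if now_ms - current[0] > self.LIFETIME_MS:
            del self._entries[dest_hash]  # expired entries are simply dropped
            return None
        return current[1]
```

<p>
If one of two routers hosting the same destination goes down, lookups keep returning its last
LeaseSet only until the other router's next publication replaces it or the entry expires.
</p>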
<h2><a name="history">History</a></h2>
<p>
<a href="netdb_discussion.html">Moved to the netdb discussion page</a>.
</p>
<h2><a name="future">Future Work</a></h2>
{% endblock %}