{% extends "_layout.html" %}
{% block title %}The Network Database{% endblock %}
{% block content %}

<p>
Updated July 2010, current as of router version 0.8
</p>

<h2>Overview</h2>

<p>
I2P's netDb is a specialized distributed database, containing
just two types of data - router contact information (<b>RouterInfos</b>) and destination contact
information (<b>LeaseSets</b>). Each piece of data is signed by the appropriate party and verified
by anyone who uses or stores it. In addition, the data has liveliness information
within it, allowing irrelevant entries to be dropped, newer entries to replace
older ones, and protection against certain classes of attack.
</p>

<p>
The netDb is distributed with a simple technique called "floodfill",
where a subset of all routers, called "floodfill routers", maintains the distributed database.
</p>
<h2 id="routerInfo">RouterInfo</h2>

<p>
When an I2P router wants to contact another router, it needs to know some
key pieces of data - all of which are bundled up and signed by the router into
a structure called the "RouterInfo", which is distributed under the key derived
from the SHA256 of the router's identity. The structure itself contains:
</p>
<ul>
<li>The router's identity (a 2048-bit ElGamal encryption key, a 1024-bit DSA signing key, and a certificate)</li>
<li>The contact addresses at which it can be reached (e.g. TCP: example.org port 4108)</li>
<li>When this was published</li>
<li>A set of arbitrary text options</li>
<li>The signature of the above, generated by the identity's DSA signing key</li>
</ul>

<p>
The following text options, while not strictly required, are expected
to be present:
</p>
<ul>
<li><b>caps</b>
(Capabilities flags - used to indicate floodfill participation, approximate bandwidth, and perceived reachability)
<li><b>coreVersion</b>
(The core library version, always the same as the router version)
<li><b>netId</b> = 2
(Basic network compatibility - A router will refuse to communicate with a peer having a different netId)
<li><b>router.version</b>
(Used to determine compatibility with newer features and messages)
<li><b>stat_uptime</b> = 90m
(Always sent as 90m, for compatibility with an older scheme where routers published their actual uptime,
and only sent tunnel requests to peers whose uptime was more than 60m)
</ul>
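
<p>
As a rough illustration of how these published options might be consumed, the sketch below reads
the netId and caps values from a plain <code>java.util.Properties</code> object. The option names
come from this page; the use of the letter 'f' as the floodfill capability flag is an assumption
made for the example, not a value taken from the specification.
</p>
<pre><code>import java.util.Properties;

// Illustrative sketch only, not the router's actual parsing code.
public class RouterInfoOptionsCheck {

    static final String EXPECTED_NET_ID = "2";

    static boolean isCompatible(Properties options) {
        // Routers refuse to communicate with peers on a different netId.
        return EXPECTED_NET_ID.equals(options.getProperty("netId"));
    }

    static boolean advertisesFloodfill(Properties options) {
        // Floodfill participation is advertised as a capability flag in "caps";
        // 'f' is assumed here to be that flag.
        String caps = options.getProperty("caps", "");
        return caps.indexOf('f') >= 0;
    }

    public static void main(String[] args) {
        Properties opts = new Properties();
        opts.setProperty("caps", "f");            // plus whatever other flags the router publishes
        opts.setProperty("netId", "2");
        opts.setProperty("router.version", "0.8");
        opts.setProperty("stat_uptime", "90m");
        System.out.println("compatible: " + isCompatible(opts));
        System.out.println("floodfill:  " + advertisesFloodfill(opts));
    }
}
</code></pre>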

<p>
These values are used by other routers for basic decisions.
Should we connect to this router? Should we attempt to route a tunnel through this router?
The bandwidth capability flag, in particular, is used only to determine whether
the router meets a minimum threshold for routing tunnels.
Above the minimum threshold, the advertised bandwidth is not used or trusted anywhere
in the router, except for display in the user interface and for debugging and network analysis.
</p>

<p>
Additional text options include
a small number of statistics about the router's health, which are aggregated by
sites such as <a href="http://stats.i2p/">stats.i2p</a>
for network performance analysis and debugging.
These statistics were chosen to provide data crucial to the developers,
such as tunnel build success rates, while balancing the need for such data
with the side-effects that could result from revealing this data.
Current statistics are limited to:
</p>
<ul>
<li>1 hour average bandwidth (average of outbound and inbound bandwidth)
<li>Client and exploratory tunnel build success, reject, and timeout rates
<li>1 hour average number of participating tunnels
</ul>

<p>
The data published can be seen in the router's user interface,
but is not used or trusted within the router.
As the network has matured, we have gradually removed most of the published
statistics to improve anonymity, and we plan to remove more in future releases.
</p>

<p>
<a href="common_structures_spec.html#struct_RouterInfo">RouterInfo specification</a>
</p>
<p>
<a href="http://docs.i2p2.de/core/net/i2p/data/RouterInfo.html">RouterInfo Javadoc</a>
</p>

<h3>RouterInfo Expiration</h3>
<p>
RouterInfos have no set expiration time.
Each router is free to maintain its own local policy to trade off the frequency of RouterInfo lookups
with memory or disk usage.
In the current implementation, there are the following general policies:
</p>
<ul>
<li>There is no expiration during the first hour of uptime, as the persistent stored data may be old.
<li>As the number of local RouterInfos grows, the expiration time shrinks, in an attempt to maintain
a reasonable number of RouterInfos.
<li>RouterInfos containing <a href="udp.html">SSU</a> introducers expire in about an hour, as
the introducer list expires in about that time.
<li>Floodfills use a short expiration time for all local RouterInfos, as valid RouterInfos will
be frequently republished to them.
</ul>

<h3>RouterInfo Persistent Storage</h3>

<p>
RouterInfos are periodically written to disk so that they are available after a restart.
</p>

<h2 id="leaseSet">LeaseSet</h2>

<p>
The second piece of data distributed in the netDb is a "LeaseSet" - documenting
a group of tunnel entry points (leases) for a particular client destination.
Each of these leases specifies the tunnel's gateway router (with the hash of its
identity), the tunnel ID on that router to which messages are sent (a 4 byte number), and
when that tunnel will expire. The LeaseSet itself is stored in the netDb under
the key derived from the SHA256 of the destination.
</p>

<p>
In addition to these leases, the LeaseSet includes the destination
itself (namely, the destination's 2048-bit ElGamal encryption key, 1024-bit DSA
signing key, and certificate) and additional signing and
encryption public keys. The additional encryption public key is used for end-to-end encryption of
garlic messages.
The additional signing public key was intended for LeaseSet revocation but is currently unused.
</p>

<p>
<a href="common_structures_spec.html#struct_Lease">Lease specification</a>
<br />
<a href="common_structures_spec.html#struct_LeaseSet">LeaseSet specification</a>
</p>
<p>
<a href="http://docs.i2p2.de/core/net/i2p/data/Lease.html">Lease Javadoc</a>
<br />
<a href="http://docs.i2p2.de/core/net/i2p/data/LeaseSet.html">LeaseSet Javadoc</a>
</p>

<h3 id="unpublished">Unpublished LeaseSets</h3>
<p>
A LeaseSet for a destination used only for outgoing connections is <i>unpublished</i>.
It is never sent for publication to a floodfill router.
"Client" tunnels, such as those used for web browsing and IRC clients, are unpublished.
</p>

<h3 id="revoked">Revoked LeaseSets</h3>
<p>
A LeaseSet may be <i>revoked</i> by publishing a new LeaseSet with zero leases.
Revocations must be signed by the additional signing key in the LeaseSet.
Revocations are not fully implemented, and it is unclear if they have any practical use.
This is the only planned use for that signing key, so it is currently unused.
</p>

<h3 id="encrypted">Encrypted LeaseSets</h3>
<p>
In an <i>encrypted</i> LeaseSet, all Leases are encrypted with a separate DSA key.
The leases may only be decoded, and thus the destination may only be contacted,
by those with the key.
There is no flag or other direct indication that the LeaseSet is encrypted.
Encrypted LeaseSets are not widely used, and it is a topic for future work to
research whether the user interface and implementation of encrypted LeaseSets could be improved.
</p>

<h3>LeaseSet Expiration</h3>
<p>
All Leases (tunnels) are valid for 10 minutes; therefore, a LeaseSet expires
10 minutes after the earliest creation time of all its Leases.
</p>
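
<p>
A minimal sketch of this expiration rule, assuming only the lease creation times are known;
this mirrors the rule as stated above rather than the router's actual implementation:
</p>
<pre><code>import java.time.Duration;
import java.time.Instant;

public class LeaseSetExpiration {

    static final Duration LEASE_LIFETIME = Duration.ofMinutes(10);

    // LeaseSet expiration = earliest lease creation time + 10 minutes.
    static Instant leaseSetExpiration(Instant[] leaseCreationTimes) {
        Instant earliest = leaseCreationTimes[0];
        for (Instant t : leaseCreationTimes)
            if (t.isBefore(earliest))
                earliest = t;
        return earliest.plus(LEASE_LIFETIME);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant[] created = { now.minusSeconds(120), now.minusSeconds(30), now };
        System.out.println("LeaseSet expires at " + leaseSetExpiration(created));
    }
}
</code></pre>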

<h3>LeaseSet Persistent Storage</h3>
<p>
There is no persistent storage of LeaseSet data since they expire so quickly.
</p>

<h2 id="bootstrap">Bootstrapping</h2>
<p>
The netDb is decentralized; however, you do need at
least one reference to a peer so that the integration process
ties you in. This is accomplished by "reseeding" your router with the RouterInfo
of an active peer - specifically, by retrieving their <code>routerInfo-$hash.dat</code>
file and storing it in your <code>netDb/</code> directory. Anyone can provide
you with those files - you can even provide them to others by exposing your own
netDb directory. To simplify the process,
volunteers publish their netDb directories (or a subset) on the regular (non-i2p) network,
and the URLs of these directories are hardcoded in I2P.
When the router starts up for the first time, it automatically fetches from
one of these URLs, selected at random.
</p>
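
<p>
The sketch below shows the reseed step in its simplest possible form: fetch one or more
routerInfo files from a reseed host and drop them into the local <code>netDb/</code> directory.
The host name and file name are placeholders, and the real reseed code does considerably more
(directory listing, retries, multiple sources).
</p>
<pre><code>import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ReseedSketch {

    public static void main(String[] args) throws Exception {
        String reseedBase = "https://reseed.example.org/";   // hypothetical reseed host
        String[] files = { "routerInfo-$hash.dat" };          // placeholder; real names embed the router hash
        Path netDbDir = Path.of("netDb");
        Files.createDirectories(netDbDir);

        // Copy each published RouterInfo file into the local netDb directory.
        for (String name : files) {
            try (InputStream in = new URL(reseedBase + name).openStream()) {
                Files.copy(in, netDbDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
</code></pre>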

<h2 id="floodfill">Floodfill</h2>
<p>
The floodfill netDb is a simple distributed storage mechanism.
The storage algorithm is simple: send the data to the closest peer that has advertised itself
as a floodfill router. Then wait 10 seconds, pick another floodfill router and ask it
for the entry, to verify its proper insertion / distribution. If the
verification peer doesn't reply, or doesn't have the entry, the sender
repeats the process. When a peer in the floodfill netDb receives a netDb
store from a peer not in the floodfill netDb, it floods the entry to other
nearby floodfill peers, as described under Flooding below.
</p>
<p>
Determining who is part of the floodfill netDb is trivial - it is exposed in each
router's published routerInfo as a capability.
</p>
<p>
Floodfills have no central authority and do not form a "consensus" -
they only implement a simple DHT overlay.
</p>

<h3 id="opt-in">Floodfill Router Opt-in</h3>

<p>
Unlike Tor, where the directory servers are hardcoded and trusted,
and operated by known entities,
the members of the I2P floodfill peer set need not be trusted, and
change over time.
</p>
<p>
To increase reliability of the netDb, and minimize the impact
of netDb traffic on a router, floodfill is automatically enabled
only on routers that are configured with high bandwidth limits.
Routers with high bandwidth limits (which must be manually configured,
as the default is much lower) are presumed to be on lower-latency
connections, and are more likely to be available 24/7.
The current minimum share bandwidth for a floodfill router is 128 KBytes/sec.
</p>
<p>
In addition, a router must pass several additional tests for health
(outbound message queue time, job lag, etc.) before floodfill operation is
automatically enabled.
</p>
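
<p>
The sketch below captures the general shape of this automatic opt-in decision. The 128 KBytes/sec
share-bandwidth floor is taken from this page; the health thresholds are invented for the example
and do not reflect the router's actual limits.
</p>
<pre><code>public class FloodfillOptIn {

    static final int MIN_SHARE_BW_KBPS = 128;     // KBytes/sec, from this page
    static final long MAX_JOB_LAG_MS = 500;       // hypothetical health threshold
    static final long MAX_SEND_QUEUE_MS = 1000;   // hypothetical health threshold

    static boolean shouldEnableFloodfill(int shareBandwidthKBps,
                                         long jobLagMs,
                                         long outboundQueueMs,
                                         boolean manuallyConfigured) {
        if (manuallyConfigured)
            return true;                           // operator override
        if (shareBandwidthKBps &lt; MIN_SHARE_BW_KBPS)
            return false;                          // below the configured-bandwidth floor
        if (jobLagMs > MAX_JOB_LAG_MS)
            return false;                          // router is not healthy enough
        return outboundQueueMs &lt;= MAX_SEND_QUEUE_MS;
    }

    public static void main(String[] args) {
        System.out.println(shouldEnableFloodfill(200, 50, 120, false));   // true
        System.out.println(shouldEnableFloodfill(64, 50, 120, false));    // false: below bandwidth floor
    }
}
</code></pre>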

<p>
With the current rules for automatic opt-in, approximately 6% of
the routers in the network are floodfill routers.
</p>

<p>
While some peers are manually configured to be floodfill,
others are simply high-bandwidth routers who automatically volunteer
when the number of floodfill peers drops below a threshold.
This prevents any long-term network damage from losing most or all
floodfills to an attack.
In turn, these peers will un-floodfill themselves when there are
too many floodfills outstanding.
</p>

<h3>Floodfill Router Roles</h3>
<p>
The only services a floodfill router provides beyond those of a non-floodfill router
are accepting netDb stores and responding to netDb queries.
Since floodfills are generally high-bandwidth, they are more likely to participate in a high number of tunnels
(i.e. be a "relay" for others), but this is not directly related to their distributed database services.
</p>

<h2 id="kad">Kademlia Closeness Metric</h2>
<p>
The netDb uses a simple Kademlia-style XOR metric to determine closeness.
The SHA256 Hash of the key being looked up or stored is XOR-ed with
the hash of the router in question to determine closeness
(there is an additional daily keyspace rotation to increase the costs of Sybil attacks,
as <a href="#sybil-partial">explained below</a>).
</p>
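
<p>
As a small illustration of the metric (ignoring the daily rotation, which is sketched in the
Sybil section below), the XOR of two 32-byte hashes can be compared as an unsigned integer;
the peer with the smaller value is "closer" to the key:
</p>
<pre><code>import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class XorCloseness {

    static byte[] sha256(String s) throws Exception {
        return MessageDigest.getInstance("SHA-256")
                .digest(s.getBytes(StandardCharsets.UTF_8));
    }

    // XOR distance between a key and a peer's hash, as an unsigned integer.
    static BigInteger distance(byte[] key, byte[] peerHash) {
        byte[] xor = new byte[key.length];
        for (int i = 0; i &lt; key.length; i++)
            xor[i] = (byte) (key[i] ^ peerHash[i]);
        return new BigInteger(1, xor);
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for a routing key and two floodfill router hashes.
        byte[] key = sha256("some destination");
        byte[] peerA = sha256("floodfill A");
        byte[] peerB = sha256("floodfill B");
        // Negative result means peerA is closer to the key than peerB.
        System.out.println(distance(key, peerA).compareTo(distance(key, peerB)));
    }
}
</code></pre>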

<h2 id="delivery">Storage, Verification, and Lookup Mechanics</h2>

<h3>RouterInfo Storage to Peers</h3>
<p>
<a href="i2np.html">I2NP</a> DatabaseStoreMessages containing the local RouterInfo are exchanged with peers
as a part of the initialization of an
<a href="ntcp.html">NTCP</a>
or
<a href="udp.html">SSU</a>
transport connection.
</p>

<h3>LeaseSet Storage to Peers</h3>
<p>
<a href="i2np.html">I2NP</a> DatabaseStoreMessages containing the local LeaseSet are periodically exchanged with peers
by bundling them in a garlic message along with normal traffic from the related Destination.
This allows an initial response, and later responses, to be sent to an appropriate Lease,
without requiring any LeaseSet lookups, or requiring the communicating Destinations to have published LeaseSets at all.
</p>

<h3>RouterInfo Storage to Floodfills</h3>
<p>
A router publishes its own RouterInfo by directly connecting to a floodfill router
and sending it an <a href="i2np.html">I2NP</a> DatabaseStoreMessage
with a nonzero Reply Token. The message is not end-to-end garlic encrypted,
as this is a direct connection, so there are no intervening routers
(and no need to hide this data anyway).
The floodfill router replies with an
<a href="i2np.html">I2NP</a> DeliveryStatusMessage,
with the Message ID set to the value of the Reply Token.
</p>
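
<p>
The Reply Token therefore acts as a simple correlation ID between the store and its acknowledgment.
A hedged sketch of that bookkeeping, using placeholder classes rather than the real I2NP message
implementations:
</p>
<pre><code>import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class PendingStoreTracker {

    private final Map&lt;Long, String&gt; pending = new HashMap&lt;&gt;();  // reply token -> stored key
    private final Random rnd = new Random();

    // Record a store about to be sent; returns the nonzero reply token to put in the message.
    long registerStore(String storedKey) {
        long token;
        do {
            token = rnd.nextLong();
        } while (token == 0);            // zero means "no reply requested"
        pending.put(token, storedKey);
        return token;
    }

    // Called when a DeliveryStatusMessage arrives; its Message ID carries the reply token back.
    boolean onDeliveryStatus(long messageId) {
        return pending.remove(messageId) != null;   // true if this acknowledged a pending store
    }

    public static void main(String[] args) {
        PendingStoreTracker tracker = new PendingStoreTracker();
        long token = tracker.registerStore("routerInfo key");
        System.out.println("acknowledged: " + tracker.onDeliveryStatus(token));
    }
}
</code></pre>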

<h3>LeaseSet Storage to Floodfills</h3>
<p>
Storage of LeaseSets is much more sensitive than for RouterInfos, as a router
must take care that the LeaseSet cannot be associated with the router.
</p>

<p>
A router publishes a local LeaseSet by
sending an <a href="i2np.html">I2NP</a> DatabaseStoreMessage
with a nonzero Reply Token over an outbound client tunnel for that Destination.
The message is end-to-end garlic encrypted using the Destination's Session Key Manager,
to hide the message from the tunnel's outbound endpoint.
The floodfill router replies with an
<a href="i2np.html">I2NP</a> DeliveryStatusMessage,
with the Message ID set to the value of the Reply Token.
This message is sent back to one of the client's inbound tunnels.
</p>

<h3>Flooding</h3>
<p>
After a floodfill router receives a DatabaseStoreMessage containing a
valid RouterInfo or LeaseSet which is newer than that previously stored in its
local NetDb, it "floods" it.
To flood a NetDb entry, it looks up the 7 floodfill routers closest to the key
of the NetDb entry (the key is the SHA256 Hash of the RouterIdentity or Destination).
</p>
<p>
It then directly connects to each of those 7 peers
and sends each an <a href="i2np.html">I2NP</a> DatabaseStoreMessage
with a zero Reply Token. The message is not end-to-end garlic encrypted,
as this is a direct connection, so there are no intervening routers
(and no need to hide this data anyway).
The other routers do not reply or re-flood, as the Reply Token is zero.
</p>
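
<p>
A compact sketch of the flooding step under those rules: pick the 7 floodfill hashes closest
(by XOR) to the entry's key and send each a store with a zero Reply Token. The
<code>sendStore()</code> call and the short byte arrays stand in for the real direct-connection
send and the real 32-byte hashes.
</p>
<pre><code>import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class FloodSketch {

    static final int FLOOD_REDUNDANCY = 7;

    static BigInteger xorDistance(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i &lt; a.length; i++)
            out[i] = (byte) (a[i] ^ b[i]);
        return new BigInteger(1, out);
    }

    static void flood(byte[] entryKey, List&lt;byte[]&gt; floodfillHashes) {
        List&lt;byte[]&gt; sorted = new ArrayList&lt;&gt;(floodfillHashes);
        sorted.sort(Comparator.comparing((byte[] h) -> xorDistance(entryKey, h)));
        int n = Math.min(FLOOD_REDUNDANCY, sorted.size());
        for (int i = 0; i &lt; n; i++)
            sendStore(sorted.get(i), entryKey);   // zero Reply Token: no ack, no re-flood
    }

    static void sendStore(byte[] peerHash, byte[] entryKey) {
        System.out.println("store to peer " + new BigInteger(1, peerHash).toString(16));
    }

    public static void main(String[] args) {
        List&lt;byte[]&gt; ffs = new ArrayList&lt;&gt;();
        for (int i = 0; i &lt; 10; i++)
            ffs.add(new byte[] { (byte) i, 42, 7 });
        flood(new byte[] { 3, 40, 9 }, ffs);
    }
}
</code></pre>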

<h3 id="lookup">RouterInfo and LeaseSet Lookup</h3>
<p>
The <a href="i2np.html">I2NP</a> DatabaseLookupMessage is used to request a netdb entry from a floodfill router.
Lookups are sent out through one of the router's outbound exploratory tunnels.
The replies are specified to return via one of the router's inbound exploratory tunnels.
</p>
<p>
Lookups are generally sent to the two "good" floodfill routers closest to the requested key, in parallel.
</p>
<p>
If the key is found locally by the floodfill router, it responds with an
<a href="i2np.html">I2NP</a> DatabaseStoreMessage.
If the key is not found locally by the floodfill router, it responds with an
<a href="i2np.html">I2NP</a> DatabaseSearchReplyMessage
containing a list of other floodfill routers close to the key.
</p>

<p>
Lookups are not encrypted and thus are vulnerable to snooping by the outbound endpoint
(OBEP) of the outbound tunnel.
</p>
<p>
As the requesting router does not reveal itself, there is no recipient public key for the floodfill router to
encrypt the reply with. Therefore, the reply is exposed to the inbound gateway (IBGW)
of the inbound exploratory tunnel.
An appropriate method of encrypting the reply is a topic for future work.
</p>

<p>
(Reference:
<a href="http://www-users.cs.umn.edu/~hopper/hashing_it_out.pdf">Hashing it out in Public</a> Section 2.3 for terms below in italics)
</p>
<p>
Due to the relatively small size of the network, the flooding redundancy of 8x, and a lookup redundancy of 2x,
lookups are currently O(1) rather than O(log n) --
a router is highly likely to know a floodfill router close enough to the key to get the answer on the first try.
Neither <i>recursive</i> nor <i>iterative</i> routing for lookups is implemented.
</p>

<p>
<i>Node IDs</i> are <i>verifiable</i> in that we use the router hash directly as both the node ID and the Kademlia key.
Given the current size of the network, a router has
<i>detailed knowledge of the neighborhood of the destination ID space</i>.
</p>

<p>
Queries are sent through <i>multiple routes simultaneously</i>
to <i>reduce the chance of query failure</i>.
</p>

<p>
After network growth of 5x - 10x, there will be a significant chance of lookup failure due
to the O(1) lookup strategy, and implementation of an <i>iterative</i> lookup strategy will be required.
See <a href="#future">below</a> for more information.
</p>

<h3>RouterInfo Storage Verification</h3>
<p>
To verify a storage was successful, a router simply waits about 10 seconds,
then sends a lookup to another floodfill router close to the key
(but not the one the store was sent to).
Lookups are sent out through one of the router's outbound exploratory tunnels.
Lookups are end-to-end garlic encrypted to prevent snooping by the outbound endpoint (OBEP).
</p>
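
<p>
The delayed verify-and-repeat loop can be sketched as follows; <code>lookupAt()</code> and
<code>restore()</code> are placeholders for the real tunnel-routed lookup and store, and the
10 second delay follows the description above:
</p>
<pre><code>import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StoreVerifier {

    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    // verifyPeer must not be the floodfill the store was originally sent to.
    void scheduleVerify(String key, String verifyPeer) {
        timer.schedule(() -> {
            boolean found = lookupAt(verifyPeer, key);   // lookup at a different nearby floodfill
            if (!found)
                restore(key);                            // repeat the store/verify cycle
        }, 10, TimeUnit.SECONDS);
    }

    boolean lookupAt(String peer, String key) { return false; }   // placeholder
    void restore(String key) { System.out.println("re-storing " + key); }

    public static void main(String[] args) throws InterruptedException {
        StoreVerifier v = new StoreVerifier();
        v.scheduleVerify("some key", "a different nearby floodfill");
        Thread.sleep(11000);      // let the scheduled verification fire
        v.timer.shutdown();
    }
}
</code></pre>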

<h3>LeaseSet Storage Verification</h3>
<p>
To verify a storage was successful, a router simply waits about 10 seconds,
then sends a lookup to another floodfill router close to the key
(but not the one the store was sent to).
Lookups are sent out through one of the outbound client tunnels for the destination of the LeaseSet being verified.
To prevent snooping by the OBEP of the outbound tunnel,
lookups are end-to-end garlic encrypted.
The replies are specified to return via one of the client's inbound tunnels.
</p>
<p>
As for regular lookups, the reply is unencrypted,
thus exposing the reply to the inbound gateway (IBGW) of the reply tunnel, and
an appropriate method of encrypting the reply is a topic for future work.
As the IBGW for the reply is one of the gateways published in the LeaseSet,
the exposure is minimal.
</p>

<h3>Exploration</h3>
<p>
<i>Exploration</i> is a special form of netdb lookup, where a router attempts to learn about
new routers.
It does this by sending a floodfill router an <a href="i2np.html">I2NP</a> DatabaseLookupMessage, looking for a random key.
As this lookup will fail, the floodfill would normally respond with an
<a href="i2np.html">I2NP</a> DatabaseSearchReplyMessage containing hashes of floodfill routers close to the key.
This would not be helpful, as the requesting router probably already knows those floodfills,
and it would be impractical to add ALL floodfill routers to the "don't include" field of the lookup.
For an exploration query, the requesting router adds a router hash of all zeros to the
"don't include" field of the DatabaseLookupMessage.
The floodfill will then respond only with non-floodfill routers close to the requested key.
</p>
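
<p>
A minimal sketch of how such an exploration request might be assembled; the class below is a
stand-in for the real DatabaseLookupMessage, not its actual wire structure:
</p>
<pre><code>import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;

public class ExplorationLookup {

    final byte[] randomKey = new byte[32];               // same length as a SHA256 hash
    final List&lt;byte[]&gt; dontInclude = new ArrayList&lt;&gt;();

    ExplorationLookup() {
        new SecureRandom().nextBytes(randomKey);         // a key that will almost surely not be found
        dontInclude.add(new byte[32]);                   // all-zero hash marks this as an exploration query
    }

    public static void main(String[] args) {
        ExplorationLookup lookup = new ExplorationLookup();
        System.out.println("don't-include entries: " + lookup.dontInclude.size());
    }
}
</code></pre>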

<h3>Notes on Lookup Responses</h3>
<p>
The response to a lookup request is either a Database Store Message (on success) or a
Database Search Reply Message (on failure). The DSRM contains a 'from' router hash field
to indicate the source of the reply; the DSM does not.
The DSRM 'from' field is unauthenticated and may be spoofed or invalid.
There are no other response tags. Therefore, when making multiple requests in parallel, it is
difficult to monitor the performance of the various floodfill routers.
</p>

<h2 id="multihome">MultiHoming</h2>

<p>
Destinations may be hosted on multiple routers simultaneously, by using the same
private and public keys (traditionally stored in eepPriv.dat files).
As both instances will periodically publish their signed LeaseSets to the floodfill peers,
the most recently published LeaseSet will be returned to a peer requesting a database lookup.
As LeaseSets have (at most) a 10 minute lifetime, should a particular instance go down,
the outage will be 10 minutes at most, and generally much less than that.
The multihoming function has been verified and is in use by several services on the network.
</p>

<h2 id="threat">Threat Analysis</h2>
<p>
Also discussed on <a href="how_threatmodel.html#floodfill">the threat model page</a>.
</p>
<p>
A hostile user may attempt to harm the network by
creating one or more floodfill routers and crafting them to offer
bad, slow, or no responses.
Some scenarios are discussed below.
</p>

<h3>General Mitigation Through Growth</h3>
<p>
There are currently almost 100 floodfill routers in the network.
Most of the following attacks will become more difficult, or have less impact,
as the network size and number of floodfill routers increase.
</p>

<h3>General Mitigation Through Redundancy</h3>
<p>
Via flooding, all netdb entries are stored on the 8 floodfill routers closest to the key.
</p>

<h3>Forgeries</h3>
<p>
All netdb entries are signed by their creators, so no router may forge a
RouterInfo or LeaseSet.
</p>

<h3>Slow or Unresponsive</h3>
<p>
Each router maintains an expanded set of statistics in the
<a href="how_peerselection.html">peer profile</a> for each floodfill router,
covering various quality metrics for that peer.
The set includes:
</p>
<ul>
<li>Average response time
<li>Percentage of queries answered with the data requested
<li>Percentage of stores that were successfully verified
<li>Last successful store
<li>Last successful lookup
<li>Last response
</ul>

<p>
Each time a router needs to determine which floodfill router is closest to a key,
it uses these metrics to decide which floodfill routers are "good".
The methods, and thresholds, used to determine "goodness" are relatively new, and
are subject to further analysis and improvement.
While a completely unresponsive router will quickly be identified and avoided,
routers that are only sometimes malicious may be much harder to deal with.
</p>
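
<p>
To make the idea concrete, the sketch below filters and orders floodfill profiles using the
metrics listed above. The thresholds are invented for illustration only; the router's real
heuristics differ and continue to evolve.
</p>
<pre><code>import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class FloodfillProfile {

    String peer;
    double avgResponseMs;        // average response time
    double answerRate;           // fraction of queries answered with the data requested
    double verifiedStoreRate;    // fraction of stores that verified successfully
    long   lastResponseAgeMs;    // time since the last response of any kind

    boolean isGood() {
        if (lastResponseAgeMs > 60 * 60 * 1000L)
            return false;                            // silent for an hour: avoid it
        if (answerRate &lt; 0.25 || verifiedStoreRate &lt; 0.25)
            return false;                            // mostly useless or suspicious
        return avgResponseMs &lt; 3000;                 // hypothetical latency cutoff
    }

    static List&lt;FloodfillProfile&gt; goodOnes(List&lt;FloodfillProfile&gt; all) {
        List&lt;FloodfillProfile&gt; good = new ArrayList&lt;&gt;();
        for (FloodfillProfile p : all)
            if (p.isGood())
                good.add(p);
        good.sort(Comparator.comparingDouble((FloodfillProfile fp) -> fp.avgResponseMs));
        return good;
    }

    public static void main(String[] args) {
        FloodfillProfile p = new FloodfillProfile();
        p.peer = "example";
        p.avgResponseMs = 400;
        p.answerRate = 0.9;
        p.verifiedStoreRate = 0.8;
        p.lastResponseAgeMs = 5000;
        System.out.println(p.peer + " good: " + p.isGood());
    }
}
</code></pre>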

<h3>Sybil Attack (Full Keyspace)</h3>
<p>
An attacker may mount a
<a href="http://citeseer.ist.psu.edu/douceur02sybil.html">Sybil attack</a>
by creating a large number of floodfill routers spread throughout the keyspace.
</p>

<p>
(In a related example, a researcher recently created a
<a href="http://blog.torproject.org/blog/june-2010-progress-report">large number of Tor relays</a>.)
If successful, this could be an effective DOS attack on the entire network.
</p>

<p>
If the floodfills are not sufficiently misbehaving to be marked as "bad" using the peer profile
metrics described above, this is a difficult scenario to handle.
Tor's response can be much more nimble in the relay case, as the suspicious relays
can be manually removed from the consensus.
Some possible responses for the I2P network are listed below; however, none of them is completely satisfactory:
</p>
<ul>
<li>Compile a list of bad router hashes or IPs, and announce the list through various means
(console news, website, forum, etc.); users would have to manually download the list and
add it to their local "blacklist"
<li>Ask everyone in the network to enable floodfill manually (fight Sybil with more Sybil)
<li>Release a new software version that includes the hardcoded "bad" list
<li>Release a new software version that improves the peer profile metrics and thresholds,
in an attempt to automatically identify the "bad" peers
<li>Add software that disqualifies floodfills if too many of them are in a single IP block
<li>Implement an automatic subscription-based blacklist controlled by a single individual or group.
This would essentially implement a portion of the Tor "consensus" model.
Unfortunately it would also give a single individual or group the power to
block participation of any particular router or IP in the network,
or even to completely shutdown or destroy the entire network.
</ul>

<p>
This attack becomes more difficult as the network size grows.
</p>

<h3 id="sybil-partial">Sybil Attack (Partial Keyspace)</h3>
<p>
An attacker may mount a
<a href="http://citeseer.ist.psu.edu/douceur02sybil.html">Sybil attack</a>
by creating a small number (8-15) of floodfill routers clustered closely in the keyspace,
and distributing the RouterInfos for these routers widely.
Then, all lookups and stores for a key in that keyspace would be directed
to one of the attacker's routers.
If successful, this could be an effective DOS attack on a particular eepsite, for example.
</p>
<p>
As the keyspace is indexed by the cryptographic (SHA256) Hash of the key,
an attacker must use a brute-force method to repeatedly generate router hashes
until he has enough that are sufficiently close to the key.
The amount of computational power required for this, which is dependent on network
size, is unknown.
</p>

<p>
As a partial defense against this attack,
the algorithm used to determine Kademlia "closeness" varies over time.
Rather than using the Hash of the key (i.e. H(k)) to determine closeness,
we use the Hash of the key appended with the current date string, i.e. H(k + YYYYMMDD).
This is done by a function called the "routing key generator", which transforms the original key into a "routing key".
In other words, the entire netdb keyspace "rotates" every day at UTC midnight.
Any partial-keyspace attack would have to be regenerated every day, since
after the rotation, the attacking routers would no longer be close
to the target key, or to each other.
</p>
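
<p>
A sketch of the rotation as described above: hash the original key with the current UTC date
string appended. This mirrors the description on this page rather than the router's exact byte
layout.
</p>
<pre><code>import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class RoutingKeyGenerator {

    static byte[] routingKey(byte[] originalKey) throws Exception {
        String today = LocalDate.now(ZoneOffset.UTC)
                .format(DateTimeFormatter.BASIC_ISO_DATE);     // "YYYYMMDD"
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        sha.update(originalKey);                               // H(k + YYYYMMDD)
        sha.update(today.getBytes(StandardCharsets.UTF_8));
        return sha.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] key = MessageDigest.getInstance("SHA-256")
                .digest("example".getBytes(StandardCharsets.UTF_8));
        System.out.println(new java.math.BigInteger(1, routingKey(key)).toString(16));
    }
}
</code></pre>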

<p>
This attack becomes more difficult as the network size grows.
</p>

<p>
One consequence of daily keyspace rotation is that the distributed network database
may become unreliable for a few minutes after the rotation --
lookups will fail because the new "closest" router has not received a store yet.
The extent of the issue, and methods for mitigation
(for example netdb "handoffs" at midnight)
are a topic for further study.
</p>

<h3>Bootstrap Attacks</h3>
<p>
An attacker could attempt to boot new routers into an isolated
or majority-controlled network by taking over a reseed website,
or tricking the developers into adding his reseed website
to the hardcoded list in the router.
</p>
<p>
Several defenses are possible, and most of these are planned:
</p>
<ul>
<li>Switching from HTTP to HTTPS for reseeding, with SSL certificate verification
<li>Changing the reseed task to fetch a subset of RouterInfos from
each of several reseed sites rather than using only a single site
<li>Creating an out-of-network reseed monitoring service that
periodically polls reseed websites and verifies that the
data are not stale or inconsistent with other views of the network
<li>Bundling reseed data in the installer
</ul>

<h3>Query Capture</h3>
<p>
See also <a href="#lookup">lookup</a>.
(Reference:
<a href="http://www-users.cs.umn.edu/~hopper/hashing_it_out.pdf">Hashing it out in Public</a> Section 2.3 for terms below in italics)
</p>
<p>
Similar to a bootstrap attack, an attacker using a floodfill router could attempt to "steer"
peers to a subset of routers controlled by him by returning their references.
</p>
<p>
This is unlikely to work via exploration, because exploration is a low-frequency task.
Routers acquire a majority of their peer references through normal tunnel building activity.
Exploration results are generally limited to a few router hashes,
and each exploration query is directed to a random floodfill router.
</p>
<p>
Floodfill router references returned in an
<a href="i2np.html">I2NP</a> DatabaseSearchReplyMessage
response to a lookup are not immediately followed.
The requesting router does not trust that the references are
closer to the key (i.e. that they are <i>verifiably correct</i>), so the references are not immediately queried.
In other words, the Kademlia lookup is not iterative.
This makes the query capture attack described in
<a href="http://www-users.cs.umn.edu/~hopper/hashing_it_out.pdf">Hashing it out in Public</a>
much less likely, until <a href="#future">iterative lookups are implemented</a>.
</p>

<h3>DHT-Based Relay Selection</h3>
<p>
(Reference:
<a href="http://www-users.cs.umn.edu/~hopper/hashing_it_out.pdf">Hashing it out in Public</a> Section 3)
</p>
<p>
This doesn't have much to do with floodfill, but see
the <a href="how_peerselection.html">peer selection page</a>
for a discussion of the vulnerabilities of peer selection for tunnels.
</p>

<h2 id="history">History</h2>
<p>
<a href="netdb_discussion.html">Moved to the netdb discussion page</a>.
</p>

<h2 id="future">Future Work</h2>
<p>
After network growth of 5x - 10x, there will be a significant chance of lookup failure due
to the O(1) lookup strategy, and implementation of an <i>iterative</i> lookup strategy will be required.
</p>
<p>
End-to-end encryption of additional netDb lookups and responses.
</p>
<p>
Better methods for tracking lookup responses.
</p>

{% endblock %}