{% extends "_layout.html" %}

{% block title %}How the Network Database (netDb) Works{% endblock %}

{% block content %}

<p>I2P's netDb is a specialized distributed database, containing
just two types of data - router contact information (<b>RouterInfos</b>) and destination contact
information (<b>LeaseSets</b>). Each piece of data is signed by the appropriate party and verified
by anyone who uses or stores it. In addition, the data has liveliness information
within it, allowing irrelevant entries to be dropped, newer entries to replace
older ones, and, for the paranoid, protection against certain classes of attack.
(Note that this is also why I2P bundles in the necessary code to determine the
correct time.)</p>

<p>
The netDb is distributed with a simple technique called "floodfill".
Previously, the netDb also used the Kademlia DHT as a fallback algorithm. However,
it did not work well in our application, and it was completely disabled
in release 0.6.1.20. More information is <a href="#status">below</a>.
</p>

<p>
Note that this document has been updated to include floodfill details,
but there are still some incorrect statements about the use of Kademlia that
need to be fixed up.
</p>

<h2><a name="routerInfo">RouterInfo</a></h2>

<p>When an I2P router wants to contact another router, it needs to know some
key pieces of data - all of which are bundled up and signed by the router into
a structure called the "RouterInfo", which is distributed under the key derived
from the SHA256 of the router's identity. The structure itself contains:</p>
<ul>
<li>The router's identity (a 2048bit ElGamal encryption key, a 1024bit DSA signing key, and a certificate)</li>
<li>The contact addresses at which it can be reached (e.g. TCP: example.org port 4108)</li>
<li>When this was published</li>
<li>A set of arbitrary (uninterpreted) text options</li>
<li>The signature of the above, generated by the identity's DSA signing key</li>
</ul>
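
<p>
To make the structure concrete, here is a rough sketch of the fields a RouterInfo carries and
how its netDb key is derived. The class and field names are illustrative only, not I2P's actual API:
</p>
<pre>
import java.security.MessageDigest;
import java.util.Properties;

// Hypothetical sketch of the data a RouterInfo carries; these are not I2P's actual classes.
public class RouterInfoSketch {
    byte[] identity;        // ElGamal encryption key + DSA signing key + certificate, serialized
    String[] addresses;     // contact addresses, e.g. "TCP: example.org port 4108"
    long publishedDate;     // when this RouterInfo was published
    Properties options;     // arbitrary, uninterpreted text options (stats, flags, ...)
    byte[] signature;       // DSA signature over all of the above by the identity's signing key

    // The netDb key under which the entry is stored: the SHA256 of the router's identity.
    byte[] netDbKey() throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(identity);
    }
}
</pre>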

<p>The arbitrary text options are currently used to help debug the network,
publishing various stats about the router's health. These stats will be disabled
by default once I2P 1.0 is out, and can be disabled in the meantime by adding
"router.publishPeerRankings=false" to the router
<a href="http://localhost:7657/configadvanced.jsp">configuration</a>. The data
published can be seen on the router's <a href="http://localhost:7657/netdb.jsp">netDb</a>
page, but should not be trusted.</p>
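
<p>
For example, the option named above can be added to the router configuration as a plain
key=value line (or entered on the advanced configuration page linked above):
</p>
<pre>
# turn off publishing of peer-ranking stats in the published RouterInfo
router.publishPeerRankings=false
</pre>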

<h2><a name="leaseSet">LeaseSet</a></h2>

<p>The second piece of data distributed in the netDb is a "LeaseSet" - documenting
a group of tunnel entry points (leases) for a particular client destination.
Each of these leases specifies the tunnel's gateway router (with the hash of its
identity), the tunnel ID on that router to which messages should be sent (a 4 byte number), and
when that tunnel will expire. The LeaseSet itself is stored in the netDb under
the key derived from the SHA256 of the destination.</p>

<p>In addition to these leases, the LeaseSet also includes the destination
itself (namely, the destination's 2048bit ElGamal encryption key, 1024bit DSA
signing key, and certificate) as well as an additional pair of signing and
encryption keys. These additional keys can be used for garlic routing messages
to the router on which the destination is located (though these keys are <b>not</b>
the router's keys - they are generated by the client and given to the router to
use). End to end client messages are still, of course, encrypted with the
destination's public keys.
[UPDATE - This is no longer true; we don't do end-to-end client encryption
any more, as explained in
<a href="how_intro.html">the introduction</a>.
So is there any use for the first encryption key, signing key, and certificate?
Can they be removed?]
</p>
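
<p>
As with the RouterInfo above, a rough sketch of the fields may help; again, the names here
are illustrative only, not I2P's actual classes:
</p>
<pre>
// Hypothetical sketch of what a LeaseSet contains; field names are illustrative only.
public class LeaseSetSketch {
    static class Lease {
        byte[] gatewayHash;   // SHA256 hash of the tunnel gateway router's identity
        long tunnelId;        // the tunnel ID on that gateway (a 4 byte number)
        long expiration;      // when the referenced tunnel expires
    }
    byte[] destination;       // the destination's ElGamal key, DSA key, and certificate
    byte[] encryptionKey;     // additional encryption key, generated by the client for garlic messages
    byte[] signingKey;        // additional signing key, also client-generated
    Lease[] leases;           // the current tunnel entry points
    byte[] signature;         // signed by the destination's signing key
    // The whole structure is stored in the netDb under SHA256(destination).
}
</pre>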

<h2><a name="bootstrap">Bootstrapping</a></h2>

<p>The netDb is decentralized; however, you do need at
least one reference to a peer so that the integration process
ties you in. This is accomplished by "reseeding" your router with the RouterInfo
of an active peer - specifically, by retrieving their <code>routerInfo-$hash.dat</code>
file and storing it in your <code>netDb/</code> directory. Anyone can provide
you with those files - you can even provide them to others by exposing your own
netDb directory. To simplify the process,
volunteers publish their netDb directories (or a subset) on the regular (non-i2p) network,
and the URLs of these directories are hardcoded in I2P.
When the router starts up for the first time, it automatically fetches from
one of these URLs, selected at random.
</p>
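
<p>
In other words, reseeding amounts to little more than fetching published RouterInfo files
over HTTP and dropping them into the local netDb directory. A minimal sketch, using a
placeholder URL and file name rather than a real reseed host:
</p>
<pre>
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;

// Rough sketch of what reseeding amounts to: fetch someone's published
// routerInfo-$hash.dat files and drop them into the local netDb/ directory.
// The URL and file name below are placeholders, not an actual reseed host.
public class ReseedSketch {
    public static void main(String[] args) throws Exception {
        String base = "http://reseed.example.org/";     // hypothetical reseed URL
        String file = "routerInfo-EXAMPLEHASH.dat";     // one published RouterInfo file
        try (InputStream in = new URL(base + file).openStream();
             FileOutputStream out = new FileOutputStream("netDb/" + file)) {
            in.transferTo(out);                          // store it locally; the router loads it at startup
        }
    }
}
</pre>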

<h2><a name="floodfill">Floodfill</a></h2>

<p>
(Adapted from a post by jrandom in the old Syndie, Nov. 26, 2005)
<br />
The floodfill netDb is really just a simple and perhaps temporary measure,
using the simplest possible algorithm - send the data to a peer in the
floodfill netDb, wait 10 seconds, then pick a random peer in the netDb and ask them
for the entry, verifying its proper insertion / distribution. If the
verification peer doesn't reply, or they don't have the entry, the sender
repeats the process. When a peer in the floodfill netDb receives a netDb
store from a peer not in the floodfill netDb, it sends the entry to all of the peers
in the floodfill netDb.
</p>
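
<p>
A sketch of that store-and-verify loop, with a hypothetical Peer interface standing in
for the real netDb store and lookup messages:
</p>
<pre>
import java.util.Random;

// Sketch of the store-and-verify loop described above; Peer and its methods are
// hypothetical stand-ins, not I2P's actual interfaces.
public class FloodfillStoreSketch {
    interface Peer {
        void store(byte[] key, byte[] entry);
        byte[] lookup(byte[] key);          // returns null if the peer does not have the entry
    }

    static void publish(byte[] key, byte[] entry, Peer[] floodfills) throws InterruptedException {
        Random rnd = new Random();
        while (true) {
            Peer target = floodfills[rnd.nextInt(floodfills.length)];
            target.store(key, entry);       // the floodfill peer then floods the entry to the others
            Thread.sleep(10_000);           // wait 10 seconds
            Peer verifier = floodfills[rnd.nextInt(floodfills.length)];
            if (verifier.lookup(key) != null) return;   // verified: properly inserted and distributed
            // no reply, or the entry is missing: repeat the process
        }
    }
}
</pre>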

<p>
Peers still do netDb exploration and bootstrapping as before.
</p><p>
At one point, the Kademlia
search/store functionality was still in place. The peers
considered the floodfill peers as always being 'closer' to every key than any
peer not participating in the floodfill netDb, and they fell back on the Kademlia
netDb if the floodfill peers failed for some reason or another.
However, Kademlia has since been disabled completely (see below).
</p><p>
Determining who is part of the floodfill netDb is trivial - it is exposed in each
router's published routerInfo.
</p><p>
As for efficiency, this algorithm is optimal when the netDb peers are known and
their quantity is appropriate for the uptime demands placed on the network. Regarding
scaling, we've got two peers who participate in the netDb right now, and
they'll be able to handle the load by themselves until we've got 10k+ eepsites.
It's not as sexy as the old
Kademlia netDb, but there are subtle anonymity attacks against non-flooded
netDbs.
</p>

<h3><a name="opt-in">Floodfill Router Opt-in</a></h3>

<p>
Unlike Tor, where the directory servers are hardcoded and trusted,
and operated by known entities,
the members of the I2P floodfill peer set need not be trusted, and
they change over time.
While some peers are manually configured to be floodfill,
others are simply high-bandwidth routers who automatically volunteer
when the number of floodfill peers drops below a threshold.
This prevents any long-term network damage from losing most or all
floodfills to an attack.
In turn, these peers will un-floodfill themselves when there are
too many floodfills outstanding.
</p>
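
<p>
A minimal sketch of that volunteer logic; the thresholds and names here are invented for
illustration and do not reflect the router's actual limits or code:
</p>
<pre>
// Illustrative sketch of the volunteer logic described above; the thresholds and
// names are made up for illustration, not the router's actual configuration.
public class FloodfillVolunteerSketch {
    static final int MIN_FLOODFILLS = 60;    // hypothetical lower bound
    static final int MAX_FLOODFILLS = 100;   // hypothetical upper bound

    static boolean shouldBeFloodfill(int knownFloodfills, boolean highBandwidth, boolean isFloodfillNow) {
        if (!highBandwidth) return false;                     // only capable routers volunteer
        if (knownFloodfills < MIN_FLOODFILLS) return true;    // too few floodfills: volunteer
        if (knownFloodfills > MAX_FLOODFILLS) return false;   // too many outstanding: un-floodfill
        return isFloodfillNow;                                // otherwise keep the current role
    }
}
</pre>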

<p>
All netDb data are signed by their publisher, so a floodfill peer cannot
spoof netDb responses. All peers monitor the performance of the floodfill
routers they talk to, so that fake, malicious, or unresponsive floodfills
can be avoided.
While these defenses may be insufficient to prevent all network disruption,
we continue to refine the automated detection of and responses to bad floodfills.
The available statistics should make the router ID of a troublemaker readily apparent.
We also have methods for users to manually block peers by router hash
or IP, and several channels to get the word out to users.
</p>

<h2><a name="healing">Healing</a></h2>
<i>Needs update since Kademlia is disabled.</i>

<p>While the Kademlia algorithm is fairly efficient at maintaining the necessary
links, we keep additional statistics regarding the netDb's activity so that we
can detect potential segmentation and actively avoid it. This is done as part of
the peer profiling - with data points such as how many new and verifiable
RouterInfo references a peer gives us, we can determine which peers know about
groups of peers that we have never seen references to. When this occurs, we can
take advantage of Kademlia's flexibility in exploration and send requests to that
peer so as to integrate ourselves further with the part of the network seen by
that well-integrated router.</p>

<h2><a name="migration">Migration</a></h2>
<i>Needs update since Kademlia is disabled.</i>

<p>Unlike traditional DHTs, the very act of conducting a search distributes the
data as well, since rather than passing IP+port # pairs, references are given to
the routers on which to query (namely, the SHA256 of those routers' identities).
As such, iteratively searching for a particular destination's LeaseSet or
router's RouterInfo will also provide you with the RouterInfos of the peers along
the way.</p>

<p>In addition, due to the time sensitivity of the data, the information doesn't
often need to be migrated - since a LeaseSet is only valid for the 10 minutes
that the referenced tunnels are around, those entries can simply be dropped at
expiration, since they will be replaced at the new location when the router
publishes a new LeaseSet.</p>

<p>To address the concerns of <a href="http://citeseer.ist.psu.edu/douceur02sybil.html">Sybil attacks</a>,
the location used to store entries varies over time. Rather than storing the
RouterInfo on the peers closest to SHA256(router identity), they are stored on
the peers closest to SHA256(router identity + YYYYMMdd), requiring an adversary
to remount the attack daily so as to maintain closeness to the "current"
keyspace. In addition, entries are probabilistically distributed to an additional
peer outside of the target keyspace, so that a successful compromise of the K
routers closest to the key will only degrade the search time.</p>
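
<p>
A sketch of how such a date-rotated routing key can be computed; the exact serialization I2P
uses may differ, but the idea is simply to append the current date before hashing:
</p>
<pre>
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Sketch of the date-rotated storage key described above: the routing key is the
// SHA256 of the identity concatenated with today's date string (YYYYMMdd).
public class RoutingKeySketch {
    static byte[] dailyKey(byte[] identity) throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        byte[] date = fmt.format(new Date()).getBytes(StandardCharsets.US_ASCII);
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        sha.update(identity);   // the router identity
        sha.update(date);       // appending YYYYMMdd forces the keyspace to shift daily
        return sha.digest();
    }
}
</pre>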

<h2><a name="delivery">Delivery</a></h2>

<p>As with DNS lookups, the fact that someone is trying to retrieve the LeaseSet
for a particular destination is sensitive (the fact that someone is <i>publishing</i>
a LeaseSet is even more so!). To address this, netDb searches and netDb store
messages are simply sent through the router's exploratory tunnels.</p>

<h2><a name="multihome">MultiHoming</a></h2>

<p>Destinations may be hosted on multiple routers simultaneously, by using the same
private and public keys (traditionally stored in files named eepPriv.dat).
As both instances will periodically publish their signed LeaseSets to the floodfill peers,
the most recently published LeaseSet will be returned to a peer requesting a database lookup.
As LeaseSets have (at most) a 10 minute lifetime, should a particular instance go down,
the outage will be 10 minutes at most, and generally much less than that.
The multihoming behavior has been verified with the test eepsite
<a href="http://multihome.i2p/">http://multihome.i2p/</a>.
</p>

<h2><a name="status">History and Status</a></h2>

<h3>The Introduction of the Floodfill Algorithm</h3>
<p>
Floodfill was introduced in release 0.6.0.4, keeping Kademlia as a backup algorithm.
</p>

<p>
(Adapted from posts by jrandom in the old Syndie, Nov. 26, 2005)
<br />
As I've often said, I'm not particularly bound to any specific technology -
what matters to me is what will get results. While I've been working through
various netDb ideas over the last few years, the issues we've faced in the last
few weeks have brought some of them to a head. On the live net,
with the netDb redundancy factor set to 4 peers (meaning we keep sending an
entry to new peers until 4 of them confirm that they've got it) and the
per-peer timeout set to 4 times that peer's average reply time, we're
<b>still</b> getting an average of 40-60 peers sent to before 4 ACK the store.
That means sending to 36-56 more peers than should be necessary, each store using
tunnels and thereby crossing 2-4 links. Even further, that value is heavily
skewed, as the average number of peers sent to in a 'failed' store (meaning
fewer than 4 people ACKed the message after 60 seconds of sending messages out)
was in the 130-160 peers range.
</p><p>
This is insane, especially for a network with only perhaps 250 peers on it.
</p><p>
The simplest answer is to say "well, duh jrandom, it's broken. fix it", but
that doesn't quite get to the core of the issue. In line with another current
effort, it's likely that we have a substantial number of network issues due to
restricted routes - peers who cannot talk with some other peers, often due to
NAT or firewall issues. If, say, the K peers closest to a particular netDb
entry are behind a 'restricted route' such that the netDb store message could
reach them but some other peer's netDb lookup message could not, that entry
would be essentially unreachable. Following down those lines a bit further and
taking into consideration the fact that some restricted routes will be created
with hostile intent, it's clear that we're going to have to look closer into a
long-term netDb solution.
</p><p>
There are a few alternatives, but two worth mentioning in particular. The
first is to simply run the netDb as a Kademlia DHT using a subset of the full
network, where all of those peers are externally reachable. Peers who are not
participating in the netDb still query those peers but they don't receive
unsolicited netDb store or lookup messages. Participation in the netDb would
be both self-selecting and user-eliminating - routers would choose whether to
publish a flag in their routerInfo stating whether they want to participate,
while each router chooses which peers it wants to treat as part of the netDb
(peers who publish that flag but who never give any useful data would be
ignored, essentially eliminating them from the netDb).
</p>
<p>
Another alternative is a blast from the past, going back to the DTSTTCPW
(Do The Simplest Thing That Could Possibly Work)
mentality - a floodfill netDb, but like the alternative above, using only a
subset of the full network. When a user wants to publish an entry into the
floodfill netDb, they simply send it to one of the participating routers, wait
for an ACK, and then 30 seconds later, query another random participant in the
floodfill netDb to verify that it was properly distributed. If it was, great,
and if it wasn't, just repeat the process. When a floodfill router receives a
netDb store, it ACKs immediately and queues off the netDb store to all of its
known netDb peers. When a floodfill router receives a netDb lookup, if it
has the data, it replies with it, but if it doesn't, it replies with the
hashes for, say, 20 other peers in the floodfill netDb.
</p><p>
Looking at it from a network economics perspective, the floodfill netDb is
quite similar to the original broadcast netDb, except the cost for publishing
an entry is borne mostly by peers in the netDb, rather than by the publisher.
Fleshing this out a bit further and treating the netDb like a blackbox, we can
see the total bandwidth required by the netDb to be:<pre>
  recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
</pre>where<pre>
  N = number of routers in the entire network
  L = average number of client destinations on each router
      (+1 for the routerInfo)
  F = tunnel failure percentage
  R = tunnel rebuild period, as a fraction of the tunnel lifetime
  S = average netDb entry size
  T = tunnel lifetime
</pre>Plugging in a few values:<pre>
  recvKBps = 1000 * (5 + 1) * (1 + 0.05) * (1 + 0.2) * 2KB / 10m
           = 25.2KBps
</pre>That, in turn, scales linearly with N (at 100,000 peers, the netDb must
be able to handle netDb store messages totaling 2.5MBps, or, at 300 peers,
7.6KBps).
</p><p>
While the floodfill netDb would have each netDb participant receiving only a
small fraction of the client generated netDb stores directly, they would all
receive all entries eventually, so all of their links should be capable of
handling the full recvKBps. In turn, they'll all need to send
<tt>(recvKBps/sizeof(netDb)) * (sizeof(netDb)-1)</tt> to keep the other
peers in sync.
</p><p>
A floodfill netDb would not require either tunnel routing for netDb operation
or any special selection as to which entries it can answer 'safely', as the
basic assumption is that they are all storing everything. Oh, and with regards
to the netDb disk usage required, it's still fairly trivial for any modern
machine, requiring around 11MB for every 1000 peers <tt>(N * (L + 1) *
S)</tt>.
</p><p>
The Kademlia netDb would cut down on these numbers, ideally bringing them to K
over M times their value, with K = the redundancy factor and M being the number
of routers in the netDb (e.g. 5/100, giving a recvKBps of 126KBps and 536MB at
100,000 routers). The downside of the Kademlia netDb though is the increased
complexity of safe operation in a hostile environment.
</p>
<p>
What I'm thinking about now is to simply implement and deploy a floodfill netDb
in our existing live network, letting peers who want to use it pick out other
peers who are flagged as members and query them instead of querying the
traditional Kademlia netDb peers. The bandwidth and disk requirements at this
stage are trivial enough (7.6KBps and 3MB disk space) and it will remove the
netDb entirely from the debugging plan - issues that remain to be addressed
will be caused by something unrelated to the netDb.
</p><p>
How would peers be chosen to publish that flag saying they are a part of the
floodfill netDb? At the beginning, it could be done manually as an advanced
config option (ignored if the router is not able to verify its external
reachability). If too many peers set that flag, how do the netDb participants
pick which ones to eject? Again, at the beginning it could be done manually as
an advanced config option (after dropping peers which are unreachable). How do
we avoid netDb partitioning? By having the routers verify that the netDb is
doing the flood fill properly by querying K random netDb peers. How do routers
not participating in the netDb discover new routers to tunnel through? Perhaps
this could be done by sending a particular netDb lookup so that the netDb
router would respond not with peers in the netDb, but with random peers outside
the netDb.
</p><p>
I2P's netDb is very different from traditional load bearing DHTs - it only
carries network metadata, not any actual payload, which is why even a netDb
using a floodfill algorithm will be able to sustain an arbitrary amount of
eepsite/IRC/bt/mail/syndie/etc data. We can even do some optimizations as I2P
grows to distribute that load a bit further (perhaps passing bloom filters
between the netDb participants to see what they need to share), but it seems we
can get by with a much simpler solution for now.
</p><p>
One fact may be worth digging
into - not all leaseSets need to be published in the netDb! In fact, most
don't need to be - only those for destinations which will be receiving
unsolicited messages (aka servers). This is because the garlic wrapped
messages sent from one destination to another already bundle the sender's
leaseSet, so that any subsequent send/recv between those two destinations
(within a short period of time) works without any netDb activity.
</p><p>
So, back at those equations, we can change L from 5 to something like 0.1
(assuming only 1 out of every 50 destinations is a server). The previous
equations also brushed over the network load required to answer queries from
clients, but while that is highly variable (based on the user activity), it's
also very likely to be quite insignificant as compared to the publishing
frequency.
</p><p>
Anyway, still no magic, but a nice reduction to nearly 1/5th of the bandwidth/disk
space required (perhaps more later, depending upon whether the routerInfo
distribution goes directly as part of the peer establishment or only through
the netDb).
</p>

<h3>The Disabling of the Kademlia Algorithm</h3>
<p>
Kademlia was completely disabled in release 0.6.1.20.
</p><p>
(this is adapted from an IRC conversation with jrandom 11/07)
<br />
Kademlia requires a minimum level of service that the baseline could not offer (bandwidth, cpu),
even after adding in tiers (pure kad is absurd on that point).
Kademlia just wouldn't work. It was a nice idea, but not for a hostile and fluid environment.
</p>

<h3>Current Status</h3>
<p>The netDb plays a very specific role in the I2P network, and the algorithms
have been tuned towards our needs. This also means that it hasn't been tuned
to address the needs we have yet to run into. I2P is currently
fairly small (a few hundred routers).
There were some calculations that 3-5 floodfill routers should be able to handle
10,000 nodes in the network.
The netDb implementation more than adequately meets our
needs at the moment, but there will likely be further tuning and bugfixing as
the network grows.</p>

<h3>Update of Calculations 03-2008</h3>
<p>Current numbers:
<pre>
  recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
</pre>where<pre>
  N = number of routers in the entire network
  L = average number of client destinations on each router
      (+1 for the routerInfo)
  F = tunnel failure percentage
  R = tunnel rebuild period, as a fraction of the tunnel lifetime
  S = average netDb entry size
  T = tunnel lifetime
</pre>
Changes in assumptions:
<ul>
<li>L is now about .5, compared to .1 above, due to the popularity of i2psnark
and other apps.
<li>F is about .33, but bugs in tunnel testing are fixed in 0.6.1.33, so it will get much better.
<li>Since the netDb is about 2/3 5K routerInfos and 1/3 2K leaseSets, S = 4K.
RouterInfo size is shrinking in 0.6.1.32 and 0.6.1.33 as we remove unnecessary stats.
<li>R = tunnel rebuild period: 0.2 was very low - it was maybe 0.7 -
but build algorithm improvements in 0.6.1.32 should bring it down to about 0.2
as the network upgrades. Call it 0.5 now with half the network at .30 or earlier.
</ul>
<pre>
  recvKBps = 700 * (0.5 + 1) * (1 + 0.33) * (1 + 0.5) * 4KB / 10m
           ~= 14KBps
</pre>
This just accounts for the stores - what about the queries?
</p>
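
<p>
Worked out, the number above is just the formula with the new values plugged in; a quick
back-of-the-envelope check (illustrative only, not router code):
</p>
<pre>
// Back-of-the-envelope check of the store-bandwidth estimate above; not router code.
public class NetDbLoadEstimate {
    public static void main(String[] args) {
        double n = 700, l = 0.5, f = 0.33, r = 0.5, entryKB = 4, tunnelLifetimeSec = 600;
        double recvKBps = n * (l + 1) * (1 + f) * (1 + r) * entryKB / tunnelLifetimeSec;
        System.out.println("recvKBps = " + recvKBps);   // prints about 13.97, i.e. roughly 14 KBps
    }
}
</pre>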

<h3>The Return of the Kademlia Algorithm?</h3>
<p>
(this is adapted from <a href="meeting195.html">the I2P meeting Jan. 2, 2007</a>)
<br />
The Kademlia netDb just wasn't working properly.
Is it dead forever or will it be coming back?
If it comes back, the peers in the Kademlia netDb would be a very limited subset
of the routers in the network (basically an expanded number of floodfill peers, if/when the floodfill peers
cannot handle the load).
But until the floodfill peers cannot handle the load (and other peers cannot be added that can), it's unnecessary.
</p>

<h3>The Future of Floodfill</h3>
<p>
(this is adapted from an IRC conversation with jrandom 11/07)
<br />
Here's a proposal: Capacity class O is automatically floodfill.
Hmm.
Unless we're sure, we might end up with a fancy way of DDoS'ing all O class routers.
This is quite the case: we want to make sure the number of floodfill peers is as small as possible while providing sufficient reachability.
If/when netDb requests fail, then we need to increase the number of floodfill peers, but atm, I'm not aware of a netDb fetch problem.
There are 33 "O" class peers according to my records.
33 is a /lot/ to floodfill to.
</p><p>
So floodfill works best when the number of peers in that pool is firmly limited?
And the size of the floodfill pool shouldn't grow much, even if the network itself gradually would?
3-5 floodfill peers can handle 10K routers iirc (I posted a bunch of numbers on that explaining the details in the old syndie).
Sounds like a difficult requirement to fill with automatic opt-in,
especially if nodes opting in cannot trust data from others
(e.g. "let's see if I'm among the top 5")
and can only trust data about themselves (e.g. "I am definitely O class, and moving 150 KB/s, and up for 123 days").
And top 5 is hostile as well. Basically, it's the same as the tor directory servers - chosen by trusted people (aka devs).
Yeah, right now it could be exploited by opt-in, but that'd be trivial to detect and deal with.
Seems like in the end, we might need something more useful than Kademlia, and have only reasonably capable peers join that scheme.
N class and above should be a big enough quantity to suppress risk of an adversary causing denial of service, I'd hope.
But it would have to be different from floodfill then, in the sense that it wouldn't cause humongous traffic.
Large quantity? For a DHT-based netDb?
Not necessarily DHT-based.
</p>

<h3 id="todo">Floodfill TODO List</h3>
<p>
The network was down to only one floodfill for a couple of hours on March 13, 2008
(approx. 18:00 - 20:00 UTC),
and it caused a lot of trouble.
<p>
Two changes implemented in 0.6.1.33 should reduce the disruption caused
by floodfill peer removal or churn:
<ol>
<li>Randomize the floodfill peers used for search each time.
This will get you past the failing ones eventually.
This change also fixed a nasty bug that would sometimes drive the ff search code insane.
<li>Prefer the floodfill peers that are up.
The code now avoids peers that are shitlisted, failing, or not heard from in
half an hour, if possible (see the sketch after this list).
</ol>
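<p>
A sketch of what that selection policy amounts to; the Peer fields and the 30-minute cutoff
mirror the list above, but the structure and names are illustrative only, not the router's code:
</p>
<pre>
import java.util.Random;

// Sketch of the selection policy described in the list above; illustrative only.
public class FloodfillSelectSketch {
    static class Peer {
        boolean shitlisted;
        boolean failing;
        long lastHeardFrom;        // ms since epoch
    }

    // Returns up to 'count' floodfill peers to search from: filter out peers that look down,
    // then shuffle, so repeated searches don't keep hitting the same failing peer.
    static Peer[] pickSearchPeers(Peer[] floodfills, int count) {
        long cutoff = System.currentTimeMillis() - 30 * 60 * 1000;
        Peer[] up = new Peer[floodfills.length];
        int n = 0;
        for (Peer p : floodfills) {
            boolean heardRecently = p.lastHeardFrom >= cutoff;
            if (!p.shitlisted && !p.failing && heardRecently)
                up[n++] = p;                       // prefer the floodfill peers that are up
        }
        Random rnd = new Random();                 // Fisher-Yates shuffle: randomize the order each time
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            Peer tmp = up[i]; up[i] = up[j]; up[j] = tmp;
        }
        Peer[] result = new Peer[Math.min(count, n)];
        System.arraycopy(up, 0, result, 0, result.length);
        return result;
    }
}
</pre>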

<p>
One benefit is faster first contact to an eepsite (i.e. when you have to fetch
the leaseset first). The lookup timeout is 10s, so if you don't start out by
asking a peer that is down, you can save 10s.

<p>
There <i>may</i> be anonymity implications in these changes.
For example, in the floodfill <b>store</b> code, there are comments that
shitlisted peers are not avoided, since a peer could be "shitty" and then
see what happens.
Searches are much less vulnerable than stores -
they're much less frequent, and give less away.
So maybe we don't think we need to worry about it?
But if we want to tweak the changes, it would be easy to
send to a peer listed as "down" or shitlisted anyway, just not
count it as part of the 2 we are sending to
(since we don't really expect a reply).

<p>
There are several places where a floodfill peer is selected - this fix addresses only one -
who a regular peer searches from [2 at a time].
Other places where better floodfill selection should be implemented:
<ol>
<li>Who a regular peer stores to [1 at a time]
(random - need to add qualification, because timeouts are long)
<li>Who a regular peer searches to verify a store [1 at a time]
(random - need to add qualification, because timeouts are long)
<li>Who a ff peer sends in reply to a failed search (3 closest to the search)
<li>Who a ff peer floods to (all other ff peers)
<li>The list of ff peers sent in the NTCP every-6-hour "whisper"
(although this may no longer be necessary due to other ff improvements)
</ol>

<p>
Lots more that could and should be done -
<ul>
<li>
Use the "dbHistory" stats to better rate a floodfill peer's integration
<li>
Use the "dbHistory" stats to immediately react to floodfill peers that don't respond
<li>
Be smarter on retries - retries are handled by an upper layer, not in
FloodOnlySearchJob, so it does another random sort and tries again,
rather than purposefully skipping the ff peers we just tried.
<li>
Further improve integration stats
<li>
Actually use integration stats rather than just floodfill indication in netDb
<li>
Use latency stats too?
<li>
Further improve recognition of failing floodfill peers
</ul>

<p>
Recently completed -
<ul>
<li>
[In Release 0.6.3]
Implement automatic opt-in
to floodfill for some percentage of class O peers, based on analysis of the network.
<li>
[In Release 0.6.3]
Continue to reduce netDb entry size to reduce floodfill traffic -
we are now at the minimum number of stats required to monitor the network.
<li>
[In Release 0.6.3]
Manual list of floodfill peers to exclude
(<a href="how_threatmodel.html#blocklist">blocklists</a> by router ident)
<li>
[In Release 0.6.3]
Better floodfill peer selection for stores:
Avoid peers whose netDb is old, or that have a recent failed store,
or that are forever-shitlisted.
<li>
[In Release 0.6.4]
Prefer already-connected floodfill peers for RouterInfo stores, to
reduce the number of direct connections to floodfill peers.
<li>
[In Release 0.6.5]
Peers who are no longer floodfill send their routerInfo in response
to a query, so that the router doing the query will know that peer
is no longer floodfill.
<li>
[In Release 0.6.5]
Further tuning of the requirements to automatically become floodfill
<li>
[In Release 0.6.5]
Fix response time profiling in preparation for favoring fast floodfills
<li>
[In Release 0.6.5]
Improve blocklisting
<li>
[In Release 0.7]
Fix netDb exploration
<li>
[In Release 0.7]
Turn blocklisting on by default, block the known troublemakers
<li>
[Several improvements in recent releases, a continuing effort]
Reduce the resource demands on high-bandwidth and floodfill routers
</ul>

<p>
That's a long list, but it will take that much work to
have a network that's resistant to DOS from lots of peers turning the floodfill switch on and off,
or pretending to be a floodfill router.
None of this was a problem when we had only two ff routers, and they were both up
24/7. Again, jrandom's absence has pointed us to places that need improvement.
</p><p>
To assist in this effort, additional profile data for floodfill peers are
now (as of release 0.6.1.33) displayed on the "Profiles" page in
the router console.
We will use this to analyze which data are appropriate for
rating floodfill peers.
</p>

<p>
The network is currently quite resilient; however,
we will continue to enhance our algorithms for measuring and reacting to the performance and reliability
of floodfill peers. While we are not, at the moment, fully hardened to the potential threats of
malicious floodfills or a floodfill DDOS, most of the infrastructure is in place,
and we are well-positioned to react quickly
should the need arise.
</p>

<h3>Really current status</h3>

{% endblock %}