{% extends "_layout.html" %}
{% block title %}How Peer Selection and Profiling Work{% endblock %}
{% block content %}<p>Peer selection within I2P is simply the process of choosing which routers
on the network we want our messages to go through (that is, which peers we will
ask to join our tunnels). To accomplish this, we keep track of how each
peer performs (the peer's "profile") and use that data to estimate how
fast they are, how often they will be able to accept our requests, and
whether they seem to be overloaded or otherwise unable to reliably perform what
they agree to.</p>
<p>
Claimed bandwidth is untrusted and is <b>only</b> used to avoid those peers
advertising bandwidth too low to route tunnels.
All peer selection is done through profiling.
This makes
<a href="how_threatmodel.html#timing">timing attacks</a>
more difficult.
</p>
<p>
For more information see the paper
<a href="/_static/pdf/I2P-PET-CON-2009.1.pdf">Peer Profiling and Selection in the I2P Anonymous Network</a>
presented at
<a href="http://www.pet-con.org/index.php/PET_Convention_2009.1">PET-CON 2009.1</a>.
</p>
<h2>Peer profiles</h2>
<p>Each peer has a set of data points collected about it, including statistics
about how long it takes to reply to a network database query, how
often its tunnels fail, and how many new peers it is able to introduce
us to, as well as simple data points such as when we last heard from it or
when the last communication error occurred. The specific data points gathered
can be found in the <a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/PeerProfile.html">code</a>.
</p>
<p>Currently, there is no 'ejection' strategy to get rid of the profiles for
peers that are no longer active (or, when the network consists of thousands
of peers, to get rid of peers that are performing poorly). However, the size
of each profile is fairly small, and is unrelated to how much data is
collected about the peer, so a router can keep a few thousand active
peer profiles before the overhead becomes a serious concern. Once it becomes necessary,
we can simply compact the poorly performing profiles (keeping only the most
basic data) and maintain hundreds of thousands of profiles in memory. Beyond
that size, we can simply eject peers (e.g. keeping only the best 100,000).</p>
<p>
The actual in-memory size of a peer profile should be analyzed,
and unused portions should be removed.
This is a good topic for further research and optimization.
</p>
<h2>Peer summaries</h2>
<p>While the profiles themselves can be considered a summary of a peer's
performance, to allow for effective peer selection we break each summary down
into four simple values, representing the peer's speed, its capacity, how well
integrated into the network it is, and whether it is failing.</p>
<h3>Speed</h3>
<p>The speed calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/SpeedCalculator.html">here</a>)
simply goes through the profile and estimates how much data we can
send or receive on a single tunnel through the peer in a minute. For this estimate it just looks at past
performance.
</p><p>
Previous algorithms used a longer time period,
weighting more recent data more heavily, and extrapolating it for the future.
Another previous calculation used total (rather than per-tunnel) bandwidth,
but it was decided that this method overrated slow, high-capacity peers
(that is, slow in bandwidth but high-capacity in number of tunnels).
Even earlier methods were much more complex.
</p><p>
The current speed calculation has remained unchanged since early 2006,
and it is a topic for further study to verify its effectiveness
on today's network.
</p>
<h3>Capacity</h3>
<p>The capacity calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/CapacityCalculator.html">here</a>)
simply goes through the profile and estimates how many tunnels we think the peer
would agree to participate in over the next hour. For this estimate it looks at
how many the peer has agreed to lately, how many the peer rejected, and how many
of the agreed-to tunnels failed, and extrapolates the data. In addition, it
includes a 'growth factor' (when the peer isn't failing or rejecting requests) so
that we will keep trying to push their limits.
</p><p>
The tunnel-failure part of the calculation was disabled for
a long period, but it was reinstated for release 0.6.2,
as it became apparent that recognizing and avoiding unreliable and unreachable
peers is critically important.
Unfortunately, as the tunnel tests require the participation of several peers,
it is difficult to positively identify the cause of a failure.
The recent changes assign a probability of failure to each of the
peers, and use that probability in the capacity calculation.
These recent changes may require additional adjustment,
and are a topic for further study to verify their effectiveness
on today's network.
</p>
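<p>
As a rough illustration only, the idea of extrapolating from recent accepts,
rejections, and failures, plus a growth factor for well-behaved peers, might be
sketched as follows. This is not the CapacityCalculator logic: the function name,
the way the counts are combined, and the growth factor value are all assumptions
made for the example.
</p>

```python
def estimate_capacity(accepted, rejected, failed, growth_factor=1.0):
    """Hypothetical estimate of tunnels a peer may accept in the next hour.

    accepted, rejected, failed: counts observed over the last hour.
    The formula and the growth_factor default are assumptions for
    illustration, not values taken from I2P.
    """
    # Tunnels that were agreed to and did not fail afterwards.
    successful = max(accepted - failed, 0)
    # Penalize rejections: each one lowers the estimate.
    estimate = max(successful - rejected, 0)
    # Only push the peer's limits while it is behaving well.
    if rejected == 0 and failed == 0:
        estimate += growth_factor
    return float(estimate)
```

<p>
With these assumptions, a peer that accepted ten tunnels with no rejections or
failures is estimated slightly above ten, so the router keeps probing for its
real limit, while any rejection or failure removes that bonus.
</p>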
<h3>Integration</h3>
<p>The integration calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IntegrationCalculator.html">here</a>)
is important only for the network database (and in turn, only when trying to focus
the 'exploration' of the network to detect and heal splits). At the moment, though, it is
not used for anything, as the detection code is not necessary for generally
well connected networks. In any case, the calculation itself simply tracks how many
times the peer is able to tell us about a peer we didn't know (or give us updated data for
a peer we did know).</p>
<p>
In upcoming releases, the integration calculation will be used, combined with
other measurements of a peer's connectivity and reliability, to determine
suitability for use as a floodfill router in the <a href="how_networkdatabase.html">network database</a>.
By determining whether a peer is actually a useful floodfill router
(rather than just claiming to be), intelligent decisions can be made about
whether to send database queries and stores to that peer.
Also, high-capacity non-floodfill routers can analyze the number of
floodfill peers in the network and join the floodfill router pool if
insufficient floodfill peers are present,
thus eliminating a major weakness in today's floodfill scheme.
</p>
<h3>Failing</h3>
<p>The failing calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IsFailingCalculator.html">here</a>)
keeps track of a few data points and determines whether the peer is overloaded
or is unable to honor its agreements. When a peer is marked as failing, it will
be avoided whenever possible.</p>
<h2>Peer organization</h2>
<p>As mentioned above, we drill through each peer's profile to come up with a
few key calculations, and based upon those, we organize each peer into five
groups: fast, capable, well integrated, not failing, and failing. When the
router wants to build a tunnel, it looks for fast peers, while when it wants
to test peers it simply chooses capable ones. Within each of these groups,
however, the peer is selected at random, both to balance the load (and risk)
across peers and to prevent some simple attacks.</p>
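<p>
The random choice within a group can be sketched as below. This is a minimal
illustration, not the router's code: the function name <code>select_peers</code>
and its parameters are hypothetical, and peers are represented by plain identifiers.
</p>

```python
import random

def select_peers(tier, count, rng=random):
    """Pick `count` distinct peers uniformly at random from a tier.

    `tier` is any collection of peer identifiers; if it holds fewer than
    `count` peers, all of them are returned in random order.
    """
    candidates = list(tier)
    return rng.sample(candidates, min(count, len(candidates)))
```

<p>
For example, <code>select_peers(fast_peers, 2)</code> would return two distinct
members of a hypothetical <code>fast_peers</code> collection. A real router would
likely want a cryptographically strong random source; in this sketch one could pass
<code>random.SystemRandom()</code> as <code>rng</code>.
</p>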
<p>The groupings are not mutually exclusive, nor are they unrelated:</p>
<ul>
<li>A peer is active if we have sent a message to or received a message from the peer in the
last few minutes.</li>
<li>A peer is considered "high capacity" if its capacity calculation meets or
exceeds the median of all active peers.</li>
<li>A peer is considered "fast" if it is already "high capacity" and its
speed calculation meets or exceeds the median of all "high capacity" peers.</li>
<li>A peer is considered "well integrated" if its integration calculation meets
or exceeds the mean value of active peers.</li>
<li>A peer is considered "failing" if the failing calculation returns true.</li>
<li>A peer is considered "not failing" if it is neither "high capacity" nor "failing".</li>
</ul>
<p>These groupings are implemented in the <a
href="http://docs.i2p2.de/router/net/i2p/router/peermanager/ProfileOrganizer.html">ProfileOrganizer</a>'s
reorganize() method (using the calculateThresholds() and locked_placeProfile() methods in turn).</p>
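<p>
The threshold rules in the list above can be sketched roughly as follows. This is
an illustrative approximation, not the ProfileOrganizer implementation: the
<code>organize</code> function and the dictionary keys are hypothetical, and two
assumptions are made that the text does not confirm, namely that inactive peers are
simply skipped and that a failing peer is never placed in the "high capacity" group.
</p>

```python
from statistics import mean, median

def organize(peers):
    """Group peers into tiers using the threshold rules listed above.

    `peers` maps a peer id to a dict with hypothetical keys:
    'active' (bool), 'capacity', 'speed', 'integration' (numbers)
    and 'failing' (bool).
    """
    tiers = {"fast": set(), "high_capacity": set(), "well_integrated": set(),
             "not_failing": set(), "failing": set()}
    # Assumption: only active peers are considered at all.
    active = {pid: p for pid, p in peers.items() if p["active"]}
    if not active:
        return tiers

    capacity_threshold = median(p["capacity"] for p in active.values())
    integration_threshold = mean(p["integration"] for p in active.values())

    for pid, p in active.items():
        # Assumption: failing takes precedence over high capacity.
        if p["failing"]:
            tiers["failing"].add(pid)
        elif p["capacity"] >= capacity_threshold:
            tiers["high_capacity"].add(pid)
        else:
            tiers["not_failing"].add(pid)
        if p["integration"] >= integration_threshold:
            tiers["well_integrated"].add(pid)

    # "Fast" peers are the faster half of the "high capacity" peers.
    if tiers["high_capacity"]:
        speed_threshold = median(
            peers[pid]["speed"] for pid in tiers["high_capacity"])
        tiers["fast"] = {pid for pid in tiers["high_capacity"]
                         if peers[pid]["speed"] >= speed_threshold}
    return tiers
```

<p>
Because the thresholds are medians and means of the currently active peers, the
groups are relative: a peer's tier can change when other peers improve or degrade,
even if its own measurements stay the same.
</p>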
<h2>To Do</h2>
<p>
As described above, each of the calculations is critical to peer selection and
requires review to determine its effectiveness in today's network.
Also, the size of in-memory profiles must be studied,
unused portions removed, and an ejection strategy implemented.
</p>
{% endblock %}