
improvements moved to history page. TODO: remove duplication for SessionTags (longer lifetime vs synchronized PRNG).
119 lines
7.3 KiB
HTML
119 lines
7.3 KiB
HTML
{% extends "_layout.html" %}
|
|
{% block title %}Performance History{% endblock %}
|
|
{% block content %}
|
|
<p>Notable performance improvements have been made using the techniques below.
|
|
There is more to do, see the <a href="performance.html">Performance</a> page
|
|
for current issues and thoughts.</p>
|
|
|
|
<h2>Native math <b>[implemented]</b></h2>
|
|
<p>When I last profiled the I2P code, the vast majority of time was spent within
|
|
one function: java.math.BigInteger's
|
|
<a href="http://java.sun.com/j2se/1.4.2/docs/api/java/math/BigInteger.html#modPow(java.math.BigInteger,%20java.math.BigInteger)">modPow</a>.
|
|
Rather than try to tune this method, we'll call out to
|
|
<a href="http://www.swox.com/gmp/">GNU MP</a> - an insanely fast math library
|
|
(with tuned assembler for many architectures). (<i>Editor: see
|
|
<a href="jbigi">NativeBigInteger for faster public key cryptography</a></i>)</p>
|
|
<p>ugha and duck are working on the C/JNI glue code, and the existing java code
|
|
is already deployed with hooks for that whenever its ready. Preliminary results
|
|
look fantastic - running the router with the native GMP modPow is providing over
|
|
a 800% speedup in encryption performance, and the load was cut in half. This
|
|
was just on one user's machine, and things are nowhere near ready for packaging
|
|
and deployment, yet.</p>
|
|
|
|
<h2>Garlic wrapping a "reply" LeaseSet <b>[implemented but needs tuning]</b></h2>
|
|
<p>This algorithm tweak will only be relevant for applications that want their
|
|
peers to reply to them (though that includes everything that uses I2PTunnel or
|
|
mihi's ministreaming lib):</p>
|
|
<p>Previously, when Alice sent Bob a message, when Bob replied he had to do a
|
|
lookup in the network database - sending out a few requests to get Alice's
|
|
current LeaseSet. If he already has Alice's current LeaseSet, he can instead
|
|
just send his reply immediately - this is (part of) why it typically takes a
|
|
little longer talking to someone the first time you connect, but subsequent
|
|
communication is faster. Currently - for all clients - we wrap
|
|
the sender's current LeaseSet in the garlic that is delivered to the recipient,
|
|
so that when they go to reply, they'll <i>always</i> have the LeaseSet locally
|
|
stored - completely removing any need for a network database lookup on replies.
|
|
This trades off a large portion of the sender's bandwidth for that faster reply.
|
|
If we didn't do this very often,
|
|
overall network bandwidth usage would decrease, since the recipient doesn't
|
|
have to do the network database lookup.</p>
|
|
<p>
|
|
For unpublished LeaseSets such as "shared clients", this is the only way to
|
|
get the LeaseSet to Bob. Unfortunately this bundling every time adds
|
|
almost 100% overhead to a high-bandwidth connection, and much more to
|
|
a connection with smaller messages.
|
|
</p><p>
|
|
Changes scheduled for release 0.6.2 will bundle the LeaseSet only when
|
|
necessary, at the beginning of a connection or when the LeaseSet changes.
|
|
This will substantially reduce the total overhead of I2P messaging.
|
|
</p>
|
|
|
|
<h2>More efficient TCP rejection <b>[implemented]</b></h2>
|
|
<p>At the moment, all TCP connections do all of their peer validation after
|
|
going through the full (expensive) Diffie-Hellman handshaking to negotiate a
|
|
private session key. This means that if someone's clock is really wrong, or
|
|
their NAT/firewall/etc is improperly configured (or they're just running an
|
|
incompatible version of the router), they're going to consistently (though not
|
|
constantly, thanks to the shitlist) cause a futile expensive cryptographic
|
|
operation on all the peers they know about. While we will want to keep some
|
|
verification/validation within the encryption boundary, we'll want to update the
|
|
protocol to do some of it first, so that we can reject them cleanly
|
|
without wasting much CPU or other resources.</p>
|
|
|
|
<h2>Adjust the tunnel testing <b>[implemented]</b></h2>
|
|
<p>Rather than going with the fairly random scheme we have now, we should use a
|
|
more context aware algorithm for testing tunnels. e.g. if we already know its
|
|
passing valid data correctly, there's no need to test it, while if we haven't
|
|
seen any data through it recently, perhaps its worthwhile to throw some data its
|
|
way. This will reduce the tunnel contention due to excess messages, as well as
|
|
improve the speed at which we detect - and address - failing tunnels.</p>
|
|
|
|
<h2>Persistent Tunnel / Lease Selection</h2>
|
|
<p>Outbound tunnel selection implemented in 0.6.1.30, inbound lease selection
|
|
implemented in release 0.6.2.</p>
|
|
<p>Selecting tunnels and leases at random for every message creates a large
|
|
incidence of out-of-order delivery, which prevents the streaming lib from
|
|
increasing its window size as much as it could. By persisting with the
|
|
same selections for a given connection, the transfer rate is much faster.
|
|
</p>
|
|
|
|
<h2>Compress some data structures <b>[implemented]</b></h2>
|
|
<p>The I2NP messages and the data they contain is already defined in a fairly
|
|
compact structure, though one attribute of the RouterInfo structure is not -
|
|
"options" is a plain ASCII name = value mapping. Right now, we're filling it
|
|
with those published statistics - around 3300 bytes per peer. Trivial to
|
|
implement GZip compression would nearly cut that to 1/3 its size, and when you
|
|
consider how often RouterInfo structures are passed across the network, that's
|
|
significant savings - every time a router asks another router for a networkDb
|
|
entry that the peer doesn't have, it sends back 3-10 RouterInfo of them.</p>
|
|
|
|
<h2>Update the ministreaming protocol</h2> [replaced by full streaming protocol]
|
|
<p>Currently mihi's ministreaming library has a fairly simple stream negotiation
|
|
protocol - Alice sends Bob a SYN message, Bob replies with an ACK message, then
|
|
Alice and Bob send each other some data, until one of them sends the other a
|
|
CLOSE message. For long lasting connections (to an IRC server, for instance),
|
|
that overhead is negligible, but for simple one-off request/response situations
|
|
(an HTTP request/reply, for instance), that's more than twice as many messages as
|
|
necessary. If, however, Alice piggybacked her first payload in with the SYN
|
|
message, and Bob piggybacked his first reply with the ACK - and perhaps also
|
|
included the CLOSE flag - transient streams such as HTTP requests could be
|
|
reduced to a pair of messages, instead of the SYN+ACK+request+response+CLOSE.</p>
|
|
|
|
<h2>Implement full streaming protocol</h2> [<a href="streaming.html">implemented</a>]
|
|
<p>The ministreaming protocol takes advantage of a poor design decision in the
|
|
I2P client protocol (I2CP) - the exposure of "mode=GUARANTEED", allowing what
|
|
would otherwise be an unreliable, best-effort, message based protocol to be used
|
|
for reliable, blocking operation (under the covers, its still all unreliable and
|
|
message based, with the router providing delivery guarantees by garlic wrapping
|
|
an "ACK" message in with the payload, so once the data gets to the target, the
|
|
ACK message is forwarded back to us [through tunnels, of course]).</p>
|
|
<p>As I've
|
|
<a href="http://dev.i2p.net/pipermail/i2p/2004-March/000167.html">said</a>, having
|
|
I2PTunnel (and the ministreaming lib) go this route was the best thing that
|
|
could be done, but more efficient mechanisms are available. When we rip out the
|
|
"mode=GUARANTEED" functionality, we're essentially leaving ourselves with an
|
|
I2CP that looks like an anonymous IP layer, and as such, we'll be able to
|
|
implement the streaming library to take advantage of the design experiences of
|
|
the TCP layer - selective ACKs, congestion detection, nagle, etc.</p>
|
|
{% endblock %}
|