- tunnel-alt-creation rework
- More how_crypto and i2np_spec fixups - Quick NTCP fixup, move discussion to new page
@ -35,8 +35,8 @@ block is formatted (in network byte order):
<p>
The H(data) is the SHA256 of the data that is encrypted in the ElGamal block,
and is preceded by a random nonzero byte. The data encrypted in the block
can be up to 222 bytes long. Specifically, see
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalEngine.html">[the code]</a>.
can be up to 223 bytes long. See
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalEngine.html">the ElGamal Javadoc</a>.
<p>
ElGamal is never used on its own in I2P, but instead always as part of
<a href="how_elgamalaes">ElGamal/AES+SessionTag</a>.

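<p>
For illustration, here is a minimal Java sketch (not the actual net.i2p.crypto.ElGamalEngine code)
of assembling the plaintext block described above: a random nonzero byte, the SHA256 of the payload,
then the payload itself. The length limit quoted above (222 or 223 bytes, depending on the revision)
is not enforced here.
</p>
<pre>
{% filter escape %}
import java.security.MessageDigest;
import java.security.SecureRandom;

// Sketch only: builds [ 1 random nonzero byte ][ 32-byte SHA256(data) ][ data ],
// which is what then gets ElGamal encrypted.
public class ElGamalBlockSketch {
    public static byte[] buildBlock(byte[] data) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
        byte[] block = new byte[1 + 32 + data.length];
        byte[] first = new byte[1];
        SecureRandom rnd = new SecureRandom();
        do {
            rnd.nextBytes(first);          // retry until the leading byte is nonzero
        } while (first[0] == 0);
        block[0] = first[0];
        System.arraycopy(hash, 0, block, 1, 32);
        System.arraycopy(data, 0, block, 33, data.length);
        return block;
    }
}
{% endfilter %}
</pre>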
@ -174,7 +174,7 @@ iv_key :: SessionKey
reply_key :: SessionKey
             length -> 32 bytes

reply_iv :: Integer
reply_iv :: data
            length -> 16 bytes

flag :: Integer
@ -182,6 +182,7 @@ flag :: Integer

request_time :: Integer
                length -> 4 bytes
                Hours since the epoch, i.e. current time / 3600

send_message_id :: Integer
                   length -> 4 bytes
@ -191,17 +192,27 @@ padding :: Data

          source -> random

total length: 223

encrypted:

toPeer :: Hash
          length -> 16 bytes

encrypted_data :: ElGamal-2048 encrypted data
                  length -> 514
                  length -> 512

total length: 528

{% endfilter %}
</pre>
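<p>
As a quick illustration of the sizes above (a hypothetical helper, not I2P code): each encrypted
record is the 16-byte toPeer hash prefix followed by the 512-byte ElGamal ciphertext, presumably
letting a hop recognize the record addressed to it before attempting decryption.
</p>
<pre>
{% filter escape %}
import java.util.Arrays;

// Sketch of splitting one 528-byte encrypted build record per the layout above.
public class BuildRecordSketch {
    public static final int TO_PEER_LEN = 16;
    public static final int ELGAMAL_LEN = 512;
    public static final int RECORD_LEN  = TO_PEER_LEN + ELGAMAL_LEN;   // 528

    /** Returns the ElGamal ciphertext if the record is addressed to us, else null. */
    public static byte[] extractIfOurs(byte[] record, byte[] ourIdentHash) {
        byte[] toPeer    = Arrays.copyOfRange(record, 0, TO_PEER_LEN);
        byte[] ourPrefix = Arrays.copyOfRange(ourIdentHash, 0, TO_PEER_LEN);
        if (!Arrays.equals(toPeer, ourPrefix))
            return null;                                   // addressed to another hop
        return Arrays.copyOfRange(record, TO_PEER_LEN, RECORD_LEN);
    }
}
{% endfilter %}
</pre>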
<h4>Notes</h4>
<p>
See also the <a href="tunnel-alt-creation.html">tunnel creation specification</a>.
</p>


<h3 id="struct_BuildResponseRecord">BuildResponseRecord</h3>
<pre>
{% filter escape %}
@ -224,9 +235,17 @@ byte 527 : reply

encrypted:
bytes 0-527: AES-encrypted record (note: same size as BuildRequestRecord!)

total length: 528

{% endfilter %}
</pre>

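<p>
A minimal sketch of the AES step above (not I2P code). 528 bytes is an exact multiple of the
16-byte AES block size, so CBC mode needs no padding. The assumption here, which the request
record suggests but this page does not state explicitly, is that the key and IV are the
reply_key and reply_iv carried in the corresponding BuildRequestRecord.
</p>
<pre>
{% filter escape %}
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: AES/256/CBC over the 528-byte reply record, no padding needed.
public class BuildResponseEncryptSketch {
    public static byte[] encryptReply(byte[] record528, byte[] replyKey32, byte[] replyIv16)
            throws Exception {
        Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE,
                 new SecretKeySpec(replyKey32, "AES"),
                 new IvParameterSpec(replyIv16));
        return aes.doFinal(record528);                     // still 528 bytes
    }
}
{% endfilter %}
</pre>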
<h4>Notes</h4>
<p>
See also the <a href="tunnel-alt-creation.html">tunnel creation specification</a>.
</p>


<h2 id="messages">Messages</h2>
<table border=1>
@ -667,6 +686,11 @@ Total size: 8*528 = 4224 bytes
{% endfilter %}
</pre>

<h4>Notes</h4>
<p>
See also the <a href="tunnel-alt-creation.html">tunnel creation specification</a>.
</p>


<h3 id="msg_TunnelBuildReply">TunnelBuildReply</h3>
<pre>
@ -675,6 +699,11 @@ same format as TunnelBuild message
{% endfilter %}
</pre>

<h4>Notes</h4>
<p>
See also the <a href="tunnel-alt-creation.html">tunnel creation specification</a>.
</p>


<h3 id="msg_VariableTunnelBuild">VariableTunnelBuild</h3>
<pre>
{% filter escape %}
@ -697,9 +726,19 @@ Total size: 1 + $num*528
{% endfilter %}
</pre>
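<p>
The message sizes quoted above reduce to simple arithmetic; a tiny sketch (not I2P code):
a TunnelBuild message is a fixed 8 records of 528 bytes (4224 bytes total), and a
VariableTunnelBuild prepends a 1-byte record count, giving 1 + $num*528 bytes.
</p>
<pre>
{% filter escape %}
// Sketch of the build message sizes.
public class BuildMessageSizes {
    static final int RECORD_LEN = 528;

    static int tunnelBuildLength()              { return 8 * RECORD_LEN; }       // 4224
    static int variableTunnelBuildLength(int n) { return 1 + n * RECORD_LEN; }   // 1 + num*528
}
{% endfilter %}
</pre>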

<h4>Notes</h4>
<p>
See also the <a href="tunnel-alt-creation.html">tunnel creation specification</a>.
</p>


<h3 id="msg_VariableTunnelBuildReply">VariableTunnelBuildReply</h3>
<pre>
{% filter escape %}
same format as VariableTunnelBuild message
{% endfilter %}
</pre>

<h4>Notes</h4>
<p>
See also the <a href="tunnel-alt-creation.html">tunnel creation specification</a>.
</p>

@ -2,20 +2,25 @@
{% block title %}NTCP{% endblock %}
{% block content %}

<h1>NTCP (NIO-based TCP)</h1>
Updated August 2010 for release 0.8

<h2>NTCP (NIO-based TCP)</h2>

<p>
NTCP was introduced in I2P 0.6.1.22.
It is a Java NIO-based transport, enabled by default for outbound
connections only. Those who configure their NAT/firewall to allow
inbound connections and specify the external host and port
(dyndns/etc is okay) on /config.jsp can receive inbound connections.
NTCP is NIO based, so it doesn't suffer from the 1 thread per connection issues of the old TCP transport.
NTCP
is one of two <a href="transport.html">transports</a> currently implemented in I2P.
The other is <a href="udp.html">SSU</a>.
NTCP
is a Java NIO-based transport
introduced in I2P release 0.6.1.22.
Java NIO (new I/O) does not suffer from the 1 thread per connection issues of the old TCP transport.
</p><p>

As of 0.6.1.29, NTCP uses the IP/Port
By default,
NTCP uses the IP/Port
auto-detected by SSU. When enabled on config.jsp,
SSU will notify/restart NTCP when the external address changes.
SSU will notify/restart NTCP when the external address changes
or when the firewall status changes.
Now you can enable inbound TCP without a static IP or dyndns service.
</p><p>

@ -23,71 +28,47 @@ The NTCP code within I2P is relatively lightweight (1/4 the size of the SSU code
because it uses the underlying Java TCP transport.
</p>

<h2>Transport Bids and Transport Comparison</h2>

<h2>NTCP Protocol Specification</h2>


<h3>Standard Message Format</h3>
<p>
I2P supports multiple transports simultaneously.
A particular transport for an outbound connection is selected with "bids".
Each transport bids for the connection and the relative value of these bids
assigns the priority.
Transports may reply with different bids, depending on whether there is
already an established connection to the peer.
</p><p>

To compare the performance of UDP and NTCP,
you can adjust the value of i2np.udp.preferred in configadvanced.jsp
(introduced in I2P 0.6.1.29).
Possible settings are
"false" (default), "true", and "always".
The default setting results in the same behavior as before
(NTCP is preferred unless it isn't established and UDP is established).
</p><p>

The table below shows the new bid values. A lower bid is a higher priority.
<p>
<table border=1>
<tr>
<td><td colspan=3>i2np.udp.preferred setting
<tr>
<td>Transport<td>false<td>true<td>always
<tr>
<td>NTCP Established<td>25<td>25<td>25
<tr>
<td>UDP Established<td>50<td>15<td>15
<tr>
<td>NTCP Not established<td>70<td>70<td>70
<tr>
<td>UDP Not established<td>1000<td>65<td>20
</table>

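<p>
For illustration, the table translates into the following bid logic (a schematic sketch,
not the actual TransportManager code); the transport returning the lowest bid is selected:
</p>
<pre>
{% filter escape %}
// Schematic only: bid values taken straight from the table above.
public class BidSketch {
    static int ntcpBid(boolean established) {
        return established ? 25 : 70;                        // same for all settings
    }
    static int ssuBid(boolean established, String udpPreferred) {
        if (established)
            return "false".equals(udpPreferred) ? 50 : 15;
        if ("always".equals(udpPreferred)) return 20;
        return "true".equals(udpPreferred) ? 65 : 1000;
    }
    static String choose(boolean ntcpUp, boolean ssuUp, String udpPreferred) {
        return ntcpBid(ntcpUp) <= ssuBid(ssuUp, udpPreferred) ? "NTCP" : "SSU";
    }
}
{% endfilter %}
</pre>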
<h2>NTCP Transport Protocol</h2>


The NTCP transport sends individual I2NP messages AES/256/CBC encrypted with
a simple checksum. The unencrypted message is encoded as follows:
<pre>
 * Coordinate the connection to a single peer.
 *
 * The NTCP transport sends individual I2NP messages AES/256/CBC encrypted with
 * a simple checksum. The unencrypted message is encoded as follows:
 *  +-------+-------+--//--+---//----+-------+-------+-------+-------+
 *  | sizeof(data)  | data | padding | adler checksum of sz+data+pad |
 *  | sizeof(data)  | data | padding | Adler checksum of sz+data+pad |
 *  +-------+-------+--//--+---//----+-------+-------+-------+-------+
 * That message is then encrypted with the DH/2048 negotiated session key
 * (station to station authenticated per the EstablishState class) using the
 * last 16 bytes of the previous encrypted message as the IV.
 *
 * One special case is a metadata message where the sizeof(data) is 0. In
 * that case, the unencrypted message is encoded as:
</pre>
That message is then encrypted with the DH/2048 negotiated session key
(station to station authenticated per the EstablishState class) using the
last 16 bytes of the previous encrypted message as the IV.
</p>

<p>
0-15 bytes of padding are required to bring the total message length
(including the six size and checksum bytes) to a multiple of 16.
The maximum message size is currently 16 KB.
Therefore the maximum data size is currently 16 KB - 6, or 16378 bytes.
The minimum data size is 1.
</p>

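<p>
A minimal sketch of the plaintext framing described above (not the actual NTCP code): a 2-byte
size, the data, zero padding to a 16-byte boundary, and the 4-byte Adler-32 checksum of
size + data + padding at the end. The framed result is what then gets AES/256/CBC encrypted.
</p>
<pre>
{% filter escape %}
import java.nio.ByteBuffer;
import java.util.zip.Adler32;

// Sketch only: builds the unencrypted NTCP frame for one I2NP message.
public class NtcpFrameSketch {
    public static byte[] frame(byte[] data) {
        int unpadded = 2 + data.length + 4;            // size field + data + checksum
        int total = (unpadded + 15) / 16 * 16;         // pad with 0-15 bytes to a multiple of 16
        ByteBuffer buf = ByteBuffer.allocate(total);   // padding bytes stay zero in this sketch
        buf.putShort((short) data.length);
        buf.put(data);
        Adler32 adler = new Adler32();
        adler.update(buf.array(), 0, total - 4);       // checksum of sz + data + pad
        buf.position(total - 4);
        buf.putInt((int) adler.getValue());
        return buf.array();
    }
}
{% endfilter %}
</pre>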
<h3>Time Sync Message Format</h3>
<p>
One special case is a metadata message where the sizeof(data) is 0. In
that case, the unencrypted message is encoded as:
<pre>
 *  +-------+-------+-------+-------+-------+-------+-------+-------+
 *  |       0       |      timestamp in seconds     |  uninterpreted
 *  +-------+-------+-------+-------+-------+-------+-------+-------+
 *           uninterpreted          | adler checksum of sz+data+pad |
 *           uninterpreted          | Adler checksum of bytes 0-11  |
 *  +-------+-------+-------+-------+-------+-------+-------+-------+
 *
 *
</pre>
Total length: 16 bytes. The time sync message is sent at approximately 15 minute intervals.

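<p>
A sketch of the 16-byte time sync message laid out above (not I2P code): a zero size field,
the 4-byte timestamp in seconds, 6 uninterpreted bytes, and the Adler checksum of bytes 0-11.
</p>
<pre>
{% filter escape %}
import java.nio.ByteBuffer;
import java.util.zip.Adler32;

// Sketch only: the metadata (time sync) message, total length 16 bytes.
public class TimeSyncSketch {
    public static byte[] timeSyncMessage() {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putShort((short) 0);                                // sizeof(data) == 0
        buf.putInt((int) (System.currentTimeMillis() / 1000L)); // timestamp in seconds
        // bytes 6-11 are "uninterpreted"; left as zero in this sketch
        Adler32 adler = new Adler32();
        adler.update(buf.array(), 0, 12);
        buf.position(12);
        buf.putInt((int) adler.getValue());                     // Adler checksum of bytes 0-11
        return buf.array();
    }
}
{% endfilter %}
</pre>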
<h3>Establishment Sequence</h3>
In the establish state, the following communication happens.
There is a 2048-bit Diffie-Hellman exchange.
For more information see the <a href="how_cryptography.html#tcp">cryptography page</a>.
@ -99,571 +80,33 @@ For more information see the <a href="how_cryptography.html#tcp">cryptography pa
 * E(#+Alice.identity+tsA+padding+S(X+Y+Bob.identHash+tsA+tsB+padding), sk, hX_xor_Bob.identHash[16:31])--->
 * <----------------------E(S(X+Y+Alice.identHash+tsA+tsB)+padding, sk, prev)
</pre>

Todo: Explain this in words.

<h3>Check Connection Message</h3>
Alternately, when Bob receives a connection, it could be a
check connection (perhaps prompted by Bob asking for someone
to verify his listener).
It does not appear that 'check connection' is used.
However, for the record, check connections are formatted as follows:
<pre>
 * a check info connection will receive 256 bytes containing:
 * - 32 bytes of uninterpreted, ignored data
 * - 1 byte size
 * - that many bytes making up the local router's IP address (as reached by the remote side)
 * - 2 byte port number that the local router was reached on
 * - 4 byte i2p network time as known by the remote side (seconds since the epoch)
 * - uninterpreted padding data, up to byte 223
 * - xor of the local router's identity hash and the SHA256 of bytes 32 through bytes 223
Check Connection is not currently used.
However, for the record, check connections are formatted as follows.
A check info connection will receive 256 bytes containing:
<ul>
<li> 32 bytes of uninterpreted, ignored data
<li> 1 byte size
<li> that many bytes making up the local router's IP address (as reached by the remote side)
<li> 2 byte port number that the local router was reached on
<li> 4 byte i2p network time as known by the remote side (seconds since the epoch)
<li> uninterpreted padding data, up to byte 223
<li> xor of the local router's identity hash and the SHA256 of bytes 32 through bytes 223
</ul>
</pre>

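<p>
A hedged sketch of parsing the 256-byte check connection block above (not I2P code); the field
offsets follow the list, network byte order is assumed for the port and time fields, and the
trailing 32 bytes are checked as the XOR of the local identity hash with the SHA256 of bytes
32 through 223:
</p>
<pre>
{% filter escape %}
import java.security.MessageDigest;
import java.util.Arrays;

// Sketch only: verify a 256-byte check connection block per the layout above.
public class CheckConnectionSketch {
    public static boolean verify(byte[] block, byte[] localIdentHash) throws Exception {
        if (block.length != 256) return false;
        int ipLen = block[32] & 0xff;                          // 1-byte size after 32 ignored bytes
        byte[] ip = Arrays.copyOfRange(block, 33, 33 + ipLen); // IP as seen by the remote side
        int port  = ((block[33 + ipLen] & 0xff) << 8) | (block[34 + ipLen] & 0xff);
        long time = 0;
        for (int i = 0; i < 4; i++)                            // network time, seconds since epoch
            time = (time << 8) | (block[35 + ipLen + i] & 0xff);
        // bytes up to 223 are padding; bytes 224-255 hold identHash XOR SHA256(bytes 32-223)
        byte[] hash = MessageDigest.getInstance("SHA-256")
                                   .digest(Arrays.copyOfRange(block, 32, 224));
        for (int i = 0; i < 32; i++)
            if (block[224 + i] != (byte) (localIdentHash[i] ^ hash[i])) return false;
        return true;
    }
}
{% endfilter %}
</pre>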
<h2>Discussion</h2>
Now on the <a href="ntcp_discussion.html">NTCP Discussion Page</a>.

<h2>NTCP vs. SSU Discussion, March 2007</h2>
|
||||
<h3>NTCP questions</h3>
|
||||
(adapted from an IRC discussion between zzz and cervantes)
|
||||
<br />
|
||||
Why is NTCP preferred over SSU, doesn't NTCP have higher overhead and latency?
|
||||
It has better reliability.
|
||||
<br />
|
||||
Doesn't streaming lib over NTCP suffer from classic TCP-over-TCP issues?
|
||||
What if we had a really simple UDP transport for streaming-lib-originated traffic?
|
||||
I think SSU was meant to be the so-called really simple UDP transport - but it just proved too unreliable.
|
||||
|
||||
<h3>"NTCP Considered Harmful" Analysis by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-25.
|
||||
This was posted to stimulate discussion, don't take it too seriously.
|
||||
<p>
|
||||
Summary: NTCP has higher latency and overhead than SSU, and is more likely to
|
||||
collapse when used with the streaming lib. However, traffic is routed with a
|
||||
preference for NTCP over SSU and this is currently hardcoded.
|
||||
<h2><a name="future">Future Work</a></h2>
|
||||
<p>The maximum message size should be increased to approximately 32 KB.
|
||||
</p>
|
||||
|
||||
<h4>Discussion</h4>
|
||||
<p>
|
||||
We currently have two transports, NTCP and SSU. As currently implemented, NTCP
|
||||
has lower "bids" than SSU so it is preferred, except for the case where there
|
||||
is an established SSU connection but no established NTCP connection for a peer.
|
||||
</p><p>
|
||||
|
||||
SSU is similar to NTCP in that it implements acknowledgments, timeouts, and
|
||||
retransmissions. However SSU is I2P code with tight constraints on the
|
||||
timeouts and available statistics on round trip times, retransmissions, etc.
|
||||
NTCP is based on Java NIO TCP, which is a black box and presumably implements
|
||||
RFC standards, including very long maximum timeouts.
|
||||
</p><p>
|
||||
|
||||
The majority of traffic within I2P is streaming-lib originated (HTTP, IRC,
|
||||
Bittorrent) which is our implementation of TCP. As the lower-level transport is
|
||||
generally NTCP due to the lower bids, the system is subject to the well-known
|
||||
and dreaded problem of TCP-over-TCP
|
||||
http://sites.inka.de/~W1011/devel/tcp-tcp.html , where both the higher and
|
||||
lower layers of TCP are doing retransmissions at once, leading to collapse.
|
||||
</p><p>
|
||||
|
||||
Unlike in the PPP over SSH scenario described in the link above, we have
|
||||
several hops for the lower layer, each covered by a NTCP link. So each NTCP
|
||||
latency is generally much less than the higher-layer streaming lib latency.
|
||||
This lessens the chance of collapse.
|
||||
</p><p>
|
||||
|
||||
Also, the probabilities of collapse are lessened when the lower-layer TCP is
|
||||
tightly constrained with low timeouts and number of retransmissions compared to
|
||||
the higher layer.
|
||||
</p><p>
|
||||
|
||||
The .28 release increased the maximum streaming lib timeout from 10 sec to 45
|
||||
sec which greatly improved things. The SSU max timeout is 3 sec. The NTCP max
|
||||
timeout is presumably at least 60 sec, which is the RFC recommendation. There
|
||||
is no way to change NTCP parameters or monitor performance. Collapse of the
|
||||
NTCP layer is [editor: text lost]. Perhaps an external tool like tcpdump would help.
|
||||
</p><p>
|
||||
|
||||
However, running .28, the i2psnark reported upstream does not generally stay at
|
||||
a high level. It often goes down to 3-4 KBps before climbing back up. This is a
|
||||
signal that there are still collapses.
|
||||
</p><p>
|
||||
|
||||
SSU is also more efficient. NTCP has higher overhead and probably higher round
|
||||
trip times. When using NTCP the ratio of (tunnel output) / (i2psnark data
|
||||
output) is at least 3.5 : 1. Running an experiment where the code was modified
|
||||
to prefer SSU (the config option i2np.udp.alwaysPreferred has no effect in the
|
||||
current code), the ratio reduced to about 3 : 1, indicating better efficiency.
|
||||
</p><p>
|
||||
|
||||
As reported by streaming lib stats, things were much improved - lifetime window
|
||||
size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per ack down from
|
||||
1.11 to 1.07.
|
||||
</p><p>
|
||||
|
||||
That this was quite effective was surprising, given that we were only changing
|
||||
the transport for the first of 3 to 5 total hops the outbound messages would
|
||||
take.
|
||||
</p><p>
|
||||
|
||||
The effect on outbound i2psnark speeds wasn't clear due to normal variations.
|
||||
Also for the experiment, inbound NTCP was disabled. The effect on inbound
|
||||
speeds on i2psnark was not clear.
|
||||
</p>
|
||||
<h4>Proposals</h4>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
1A)
|
||||
This is easy -
|
||||
We should flip the bid priorities so that SSU is preferred for all traffic, if
|
||||
we can do this without causing all sorts of other trouble. This will fix the
|
||||
i2np.udp.alwaysPreferred configuration option so that it works (either as true
|
||||
or false).
|
||||
|
||||
<li>
|
||||
1B)
|
||||
Alternative to 1A), not so easy -
|
||||
If we can mark traffic without adversely affecting our anonymity goals, we
|
||||
should identify streaming-lib generated traffic and have SSU generate a low bid
|
||||
for that traffic. This tag will have to go with the message through each hop
|
||||
so that the forwarding routers also honor the SSU preference.
|
||||
|
||||
|
||||
<li>
|
||||
2)
|
||||
Bounding SSU even further (reducing maximum retransmissions from the current
|
||||
10) is probably wise to reduce the chance of collapse.
|
||||
|
||||
<li>
|
||||
3)
|
||||
We need further study on the benefits vs. harm of a semi-reliable protocol
|
||||
underneath the streaming lib. Are retransmissions over a single hop beneficial
|
||||
and a big win or are they worse than useless?
|
||||
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
|
||||
could perhaps add a no-ack-required message type in SSU if we don't want any
|
||||
retransmissions at all of streaming-lib traffic. Are tightly bounded
|
||||
retransmissions desirable?
|
||||
|
||||
<li>
|
||||
4)
|
||||
The priority sending code in .28 is only for NTCP. So far my testing hasn't
|
||||
shown much use for SSU priority as the messages don't queue up long enough for
|
||||
priorities to do any good. But more testing needed.
|
||||
|
||||
<li>
|
||||
5)
|
||||
The new streaming lib max timeout of 45s is probably still too low.
|
||||
The TCP RFC says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout (presumably 60s).
|
||||
</ul>
|
||||
|
||||
<h3>Response by jrandom</h3>
|
||||
Posted to new Syndie, 2007-03-27
|
||||
<p>
|
||||
On the whole, I'm open to experimenting with this, though remember why NTCP is
|
||||
there in the first place - SSU failed in a congestion collapse. NTCP "just
|
||||
works", and while 2-10% retransmission rates can be handled in normal
|
||||
single-hop networks, that gives us a 40% retransmission rate with 2 hop
|
||||
tunnels. If you loop in some of the measured SSU retransmission rates we saw
|
||||
back before NTCP was implemented (10-30+%), that gives us an 83% retransmission
|
||||
rate. Perhaps those rates were caused by the low 10 second timeout, but
|
||||
increasing that much would bite us (remember, multiply by 5 and you've got half
|
||||
the journey).
|
||||
</p><p>
|
||||
|
||||
Unlike TCP, we have no feedback from the tunnel to know whether the message
|
||||
made it - there are no tunnel level acks. We do have end to end ACKs, but only
|
||||
on a small number of messages (whenever we distribute new session tags) - out
|
||||
of the 1,553,591 client messages my router sent, we only attempted to ACK
|
||||
145,207 of them. The others may have failed silently or succeeded perfectly.
|
||||
</p><p>
|
||||
|
||||
I'm not convinced by the TCP-over-TCP argument for us, especially split across
|
||||
the various paths we transfer down. Measurements on I2P can convince me
|
||||
otherwise, of course.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
The NTCP max timeout is presumably at least 60 sec, which is the RFC
|
||||
recommendation. There is no way to change NTCP parameters or monitor
|
||||
performance.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
True, but net connections only get up to that level when something really bad
|
||||
is going on - the retransmission timeout on TCP is often on the order of tens
|
||||
or hundreds of milliseconds. As foofighter points out, they've got 20+ years
|
||||
experience and bugfixing in their TCP stacks, plus a billion dollar industry
|
||||
optimizing hardware and software to perform well according to whatever it is
|
||||
they do.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
NTCP has higher overhead and probably higher round trip times. when using NTCP
|
||||
the ratio of (tunnel output) / (i2psnark data output) is at least 3.5 : 1.
|
||||
Running an experiment where the code was modified to prefer SSU (the config
|
||||
option i2np.udp.alwaysPreferred has no effect in the current code), the ratio
|
||||
reduced to about 3 : 1, indicating better efficiency.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
This is very interesting data, though more as a matter of router congestion
|
||||
than bandwidth efficiency - you'd have to compare 3.5*$n*$NTCPRetransmissionPct
|
||||
./. 3.0*$n*$SSURetransmissionPct. This data point suggests there's something in
|
||||
the router that leads to excess local queuing of messages already being
|
||||
transferred.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
lifetime window size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per
|
||||
ACK down from 1.11 to 1.07.
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
Remember that the sends-per-ACK is only a sample not a full count (as we don't
|
||||
try to ACK every send). It's not a random sample either, but instead samples
|
||||
more heavily periods of inactivity or the initiation of a burst of activity -
|
||||
sustained load won't require many ACKs.
|
||||
</p><p>
|
||||
|
||||
Window sizes in that range are still woefully low to get the real benefit of
|
||||
AIMD, and still too low to transmit a single 32KB BT chunk (increasing the
|
||||
floor to 10 or 12 would cover that).
|
||||
</p><p>
|
||||
|
||||
Still, the wsize stat looks promising - over how long was that maintained?
|
||||
</p><p>
|
||||
|
||||
Actually, for testing purposes, you may want to look at
|
||||
StreamSinkClient/StreamSinkServer or even TestSwarm in
|
||||
apps/ministreaming/java/src/net/i2p/client/streaming/ - StreamSinkClient is a
|
||||
CLI app that sends a selected file to a selected destination and
|
||||
StreamSinkServer creates a destination and writes out any data sent to it
|
||||
(displaying size and transfer time). TestSwarm combines the two - flooding
|
||||
random data to whomever it connects to. That should give you the tools to
|
||||
measure sustained throughput capacity over the streaming lib, as opposed to BT
|
||||
choke/send.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
1A)
|
||||
|
||||
This is easy -
|
||||
We should flip the bid priorities so that SSU is preferred for all traffic, if
|
||||
we can do this without causing all sorts of other trouble. This will fix the
|
||||
i2np.udp.alwaysPreferred configuration option so that it works (either as true
|
||||
or false).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
Honoring i2np.udp.alwaysPreferred is a good idea in any case - please feel free
|
||||
to commit that change. Let's gather a bit more data though before switching the
|
||||
preferences, as NTCP was added to deal with an SSU-created congestion collapse.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
1B)
|
||||
Alternative to 1A), not so easy -
|
||||
If we can mark traffic without adversely affecting our anonymity goals, we
|
||||
should identify streaming-lib generated traffic
|
||||
and have SSU generate a low bid for that traffic. This tag will have to go with
|
||||
the message through each hop
|
||||
so that the forwarding routers also honor the SSU preference.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
In practice, there are three types of traffic - tunnel building/testing, netDb
|
||||
query/response, and streaming lib traffic. The network has been designed to
|
||||
make differentiating those three very hard.
|
||||
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
2)
|
||||
Bounding SSU even further (reducing maximum retransmissions from the current
|
||||
10) is probably wise to reduce the chance of collapse.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
|
||||
retransmissions is reasonable, from a transport layer, but if the other side is
|
||||
too congested to ACK in time (even with the implemented SACK/NACK capability),
|
||||
there's not much we can do.
|
||||
</p><p>
|
||||
|
||||
In my view, to really address the core issue we need to address why the router
|
||||
gets so congested to ACK in time (which, from what I've found, is due to CPU
|
||||
contention). Maybe we can juggle some things in the router's processing to make
|
||||
the transmission of an already existing tunnel higher CPU priority than
|
||||
decrypting a new tunnel request? Though we've got to be careful to avoid
|
||||
starvation.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
3)
|
||||
We need further study on the benefits vs. harm of a semi-reliable protocol
|
||||
underneath the streaming lib. Are retransmissions over a single hop beneficial
|
||||
and a big win or are they worse than useless?
|
||||
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
|
||||
could perhaps add a no-ACK-required message type in SSU if we don't want any
|
||||
retransmissions at all of streaming-lib traffic. Are tightly bounded
|
||||
retransmissions desirable?
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
Worth looking into - what if we just disabled SSU's retransmissions? It'd
|
||||
probably lead to much higher streaming lib resend rates, but maybe not.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
4)
|
||||
The priority sending code in .28 is only for NTCP. So far my testing hasn't
|
||||
shown much use for SSU priority as the messages don't queue up long enough for
|
||||
priorities to do any good. But more testing needed.
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
There's UDPTransport.PRIORITY_LIMITS and UDPTransport.PRIORITY_WEIGHT (honored
|
||||
by TimedWeightedPriorityMessageQueue), but currently the weights are almost all
|
||||
equal, so there's no effect. That could be adjusted, of course (but as you
|
||||
mention, if there's no queuing, it doesn't matter).
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
5)
|
||||
The new streaming lib max timeout of 45s is probably still too low. The TCP RFC
|
||||
says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout
|
||||
(presumably 60s).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
That 45s is the max retransmission timeout of the streaming lib though, not the
|
||||
stream timeout. TCP in practice has retransmission timeouts orders of magnitude
|
||||
less, though yes, can get to 60s on links running through exposed wires or
|
||||
satellite transmissions ;) If we increase the streaming lib retransmission
|
||||
timeout to e.g. 75 seconds, we could go get a beer before a web page loads
|
||||
(especially assuming less than a 98% reliable transport). That's one reason we
|
||||
prefer NTCP.
|
||||
</p>
|
||||
|
||||
|
||||
<h3>Response by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-31
|
||||
<p>
|
||||
|
||||
<i>
|
||||
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
|
||||
retransmissions is reasonable, from a transport layer, but if the other side is
|
||||
too congested to ACK in time (even with the implemented SACK/NACK capability),
|
||||
there's not much we can do.
|
||||
<br>
|
||||
In my view, to really address the core issue we need to address why the
|
||||
router gets so congested to ACK in time (which, from what I've found, is due to
|
||||
CPU contention). Maybe we can juggle some things in the router's processing to
|
||||
make the transmission of an already existing tunnel higher CPU priority than
|
||||
decrypting a new tunnel request? Though we've got to be careful to avoid
|
||||
starvation.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
One of my main stats-gathering techniques is turning on
|
||||
net.i2p.client.streaming.ConnectionPacketHandler=DEBUG and watching the RTT
|
||||
times and window sizes as they go by. To overgeneralize for a moment, it's
|
||||
common to see 3 types of connections: ~4s RTT, ~10s RTT, and ~30s RTT. Trying
|
||||
to knock down the 30s RTT connections is the goal. If CPU contention is the
|
||||
cause then maybe some juggling will do it.
|
||||
</p><p>
|
||||
|
||||
Reducing the SSU max retrans from 10 is really just a stab in the dark as we
|
||||
don't have good data on whether we are collapsing, having TCP-over-TCP issues,
|
||||
or what, so more data is needed.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
Worth looking into - what if we just disabled SSU's retransmissions? It'd
|
||||
probably lead to much higher streaming lib resend rates, but maybe not.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
What I don't understand, if you could elaborate, are the benefits of SSU
|
||||
retransmissions for non-streaming-lib traffic. Do we need tunnel messages (for
|
||||
example) to use a semi-reliable transport or can they use an unreliable or
|
||||
kinda-sorta-reliable transport (1 or 2 retransmissions max, for example)? In
|
||||
other words, why semi-reliability?
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
(but as you mention, if there's no queuing, it doesn't matter).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
I implemented priority sending for UDP but it kicked in about 100,000 times
|
||||
less often than the code on the NTCP side. Maybe that's a clue for further
|
||||
investigation or a hint - I don't understand why it would back up that much
|
||||
more often on NTCP, but maybe that's a hint on why NTCP performs worse.
|
||||
|
||||
</p>
|
||||
|
||||
<h3>Question answered by jrandom</h3>
|
||||
Posted to new Syndie, 2007-03-31
|
||||
<p>
|
||||
measured SSU retransmission rates we saw back before NTCP was implemented
|
||||
(10-30+%)
|
||||
</p><p>
|
||||
|
||||
Can the router itself measure this? If so, could a transport be selected based
|
||||
on measured performance? (i.e. if an SSU connection to a peer is dropping an
|
||||
unreasonable number of messages, prefer NTCP when sending to that peer)
|
||||
</p><p>
|
||||
|
||||
|
||||
|
||||
Yeah, it currently uses that stat right now as a poor-man's MTU detection (if
|
||||
the retransmission rate is high, it uses the small packet size, but if it's low,
|
||||
it uses the large packet size). We tried a few things when first introducing
|
||||
NTCP (and when first moving away from the original TCP transport) that would
|
||||
prefer SSU but fail that transport for a peer easily, causing it to fall back
|
||||
on NTCP. However, there's certainly more that could be done in that regard,
|
||||
though it gets complicated quickly (how/when to adjust/reset the bids, whether
|
||||
to share these preferences across multiple peers or not, whether to share it
|
||||
across multiple sessions with the same peer (and for how long), etc).
|
||||
|
||||
|
||||
<h3>Response by foofighter</h3>
|
||||
Posted to new Syndie, 2007-03-26
|
||||
<p>
|
||||
|
||||
If I've understood things right, the primary reason in favor of TCP (in
|
||||
general, both the old and new variety) was that you needn't worry about coding
|
||||
a good TCP stack. Which ain't impossibly hard to get right... just that
|
||||
existing TCP stacks have a 20 year lead.
|
||||
</p><p>
|
||||
|
||||
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
|
||||
UDP, except the following considerations:
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
A TCP-only network is very dependent on reachable peers (those who can forward
|
||||
incoming connections through their NAT)
|
||||
<li>
|
||||
Still even if reachable peers are rare, having them be high capacity somewhat
|
||||
alleviates the topological scarcity issues
|
||||
<li>
|
||||
UDP allows for "NAT hole punching" which lets people be "kind of
|
||||
pseudo-reachable" (with the help of introducers) who could otherwise only
|
||||
connect out
|
||||
<li>
|
||||
The "old" TCP transport implementation required lots of threads, which was a
|
||||
performance killer, while the "new" TCP transport does well with few threads
|
||||
<li>
|
||||
Routers of set A crap out when saturated with UDP. Routers of set B crap out
|
||||
when saturated with TCP.
|
||||
<li>
|
||||
It "feels" (as in, there are some indications but no scientific data or
|
||||
quality statistics) that A is more widely deployed than B
|
||||
<li>
|
||||
Some networks carry non-DNS UDP datagrams with an outright shitty quality,
|
||||
while still somewhat bothering to carry TCP streams.
|
||||
</ul>
|
||||
</p><p>
|
||||
|
||||
|
||||
On that background, a small diversity of transports (as many as needed, but not
|
||||
more) appears sensible in either case. Which should be the main transport,
|
||||
depends on how they perform. I've seen nasty stuff on my line when I
|
||||
tried to use its full capacity with UDP. Packet losses on the level of 35%.
|
||||
</p><p>
|
||||
|
||||
We could definitely try playing with UDP versus TCP priorities, but I'd urge
|
||||
caution in that. I would urge that they not be changed too radically all at
|
||||
once, or it might break things.
|
||||
|
||||
</p>
|
||||
|
||||
<h3>Response by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-27
|
||||
<p>
|
||||
<i>
|
||||
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
|
||||
UDP, except the following considerations:
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
These are all valid issues. However, you are considering the two protocols in
isolation, rather than thinking about what transport protocol is best for a
|
||||
particular higher-level protocol (i.e. streaming lib or not).
|
||||
</p><p>
|
||||
|
||||
What I'm saying is you have to take the streaming lib into consideration.
|
||||
|
||||
So either shift the preferences for everybody or treat streaming lib traffic
|
||||
differently.
|
||||
|
||||
That's what my proposal 1B) is talking about - have a different preference for
|
||||
streaming-lib traffic than for non streaming-lib traffic (for example tunnel
|
||||
build messages).
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
|
||||
On that background, a small diversity of transports (as many as needed, but
|
||||
not more) appears sensible in either case. Which should be the main transport,
|
||||
depends on their performance-wise. I've seen nasty stuff on my line when I
|
||||
tried to use its full capacity with UDP. Packet losses on the level of 35%.
|
||||
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
Agreed. The new .28 may have made things better for packet loss over UDP, or
|
||||
maybe not.
|
||||
|
||||
One important point - the transport code does remember failures of a transport.
|
||||
So if UDP is the preferred transport, it will try it first, but if it fails for
|
||||
a particular destination, the next attempt for that destination it will try
|
||||
NTCP rather than trying UDP again.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
We could definitely try playing with UDP versus TCP priorities, but I'd urge
|
||||
caution in that. I would urge that they not be changed too radically all at
|
||||
once, or it might break things.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
We have four tuning knobs - the four bid values (SSU and NTCP, for
|
||||
already-connected and not-already-connected).
|
||||
We could make SSU be preferred over NTCP only if both are connected, for
|
||||
example, but try NTCP first if neither transport is connected.
|
||||
</p><p>
|
||||
|
||||
The other way to do it gradually is only shifting the streaming lib traffic
|
||||
(the 1B proposal) however that could be hard and may have anonymity
|
||||
implications, I don't know. Or maybe shift the traffic only for the first
|
||||
outbound hop (i.e. don't propagate the flag to the next router), which gives
|
||||
you only partial benefit but might be more anonymous and easier.
|
||||
</p>
|
||||
|
||||
<h3>Results of the Discussion</h3>
|
||||
... and other related changes in the same timeframe (2007):
|
||||
<ul>
|
||||
<li>
|
||||
Significant tuning of the streaming lib parameters,
|
||||
greatly increasing outbound performance, was implemented in 0.6.1.28
|
||||
<li>
|
||||
Priority sending for NTCP was implemented in 0.6.1.28
|
||||
<li>
|
||||
Priority sending for SSU was implemented by zzz but was never checked in
|
||||
<li>
|
||||
The advanced transport bid control
|
||||
i2np.udp.preferred was implemented in 0.6.1.29.
|
||||
<li>
|
||||
Pushback for NTCP was implemented in 0.6.1.30, disabled in 0.6.1.31 due to anonymity concerns,
|
||||
and re-enabled with improvements to address those concerns in 0.6.1.32.
|
||||
<li>
|
||||
None of zzz's proposals 1-5 have been implemented.
|
||||
</ul>
|
||||
|
||||
{% endblock %}
|
||||
|
www.i2p2/pages/ntcp_discussion.html (new file, 559 lines)
@ -0,0 +1,559 @@
|
||||
{% extends "_layout.html" %}
|
||||
{% block title %}NTCP Discussion{% endblock %}
|
||||
{% block content %}
|
||||
|
||||
Following is a discussion about NTCP that took place in March 2007.
|
||||
It has not been updated to reflect current implementation.
|
||||
For the current NTCP specification see <a href="ntcp.html">the main NTCP page</a>.
|
||||
|
||||
<h2>NTCP vs. SSU Discussion, March 2007</h2>
|
||||
<h3>NTCP questions</h3>
|
||||
(adapted from an IRC discussion between zzz and cervantes)
|
||||
<br />
|
||||
Why is NTCP preferred over SSU, doesn't NTCP have higher overhead and latency?
|
||||
It has better reliability.
|
||||
<br />
|
||||
Doesn't streaming lib over NTCP suffer from classic TCP-over-TCP issues?
|
||||
What if we had a really simple UDP transport for streaming-lib-originated traffic?
|
||||
I think SSU was meant to be the so-called really simple UDP transport - but it just proved too unreliable.
|
||||
|
||||
<h3>"NTCP Considered Harmful" Analysis by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-25.
|
||||
This was posted to stimulate discussion, don't take it too seriously.
|
||||
<p>
|
||||
Summary: NTCP has higher latency and overhead than SSU, and is more likely to
|
||||
collapse when used with the streaming lib. However, traffic is routed with a
|
||||
preference for NTCP over SSU and this is currently hardcoded.
|
||||
</p>
|
||||
|
||||
<h4>Discussion</h4>
|
||||
<p>
|
||||
We currently have two transports, NTCP and SSU. As currently implemented, NTCP
|
||||
has lower "bids" than SSU so it is preferred, except for the case where there
|
||||
is an established SSU connection but no established NTCP connection for a peer.
|
||||
</p><p>
|
||||
|
||||
SSU is similar to NTCP in that it implements acknowledgments, timeouts, and
|
||||
retransmissions. However SSU is I2P code with tight constraints on the
|
||||
timeouts and available statistics on round trip times, retransmissions, etc.
|
||||
NTCP is based on Java NIO TCP, which is a black box and presumably implements
|
||||
RFC standards, including very long maximum timeouts.
|
||||
</p><p>
|
||||
|
||||
The majority of traffic within I2P is streaming-lib originated (HTTP, IRC,
|
||||
Bittorrent) which is our implementation of TCP. As the lower-level transport is
|
||||
generally NTCP due to the lower bids, the system is subject to the well-known
|
||||
and dreaded problem of TCP-over-TCP
|
||||
http://sites.inka.de/~W1011/devel/tcp-tcp.html , where both the higher and
|
||||
lower layers of TCP are doing retransmissions at once, leading to collapse.
|
||||
</p><p>
|
||||
|
||||
Unlike in the PPP over SSH scenario described in the link above, we have
|
||||
several hops for the lower layer, each covered by a NTCP link. So each NTCP
|
||||
latency is generally much less than the higher-layer streaming lib latency.
|
||||
This lessens the chance of collapse.
|
||||
</p><p>
|
||||
|
||||
Also, the probabilities of collapse are lessened when the lower-layer TCP is
|
||||
tightly constrained with low timeouts and number of retransmissions compared to
|
||||
the higher layer.
|
||||
</p><p>
|
||||
|
||||
The .28 release increased the maximum streaming lib timeout from 10 sec to 45
|
||||
sec which greatly improved things. The SSU max timeout is 3 sec. The NTCP max
|
||||
timeout is presumably at least 60 sec, which is the RFC recommendation. There
|
||||
is no way to change NTCP parameters or monitor performance. Collapse of the
|
||||
NTCP layer is [editor: text lost]. Perhaps an external tool like tcpdump would help.
|
||||
</p><p>
|
||||
|
||||
However, running .28, the i2psnark reported upstream does not generally stay at
|
||||
a high level. It often goes down to 3-4 KBps before climbing back up. This is a
|
||||
signal that there are still collapses.
|
||||
</p><p>
|
||||
|
||||
SSU is also more efficient. NTCP has higher overhead and probably higher round
|
||||
trip times. When using NTCP the ratio of (tunnel output) / (i2psnark data
|
||||
output) is at least 3.5 : 1. Running an experiment where the code was modified
|
||||
to prefer SSU (the config option i2np.udp.alwaysPreferred has no effect in the
|
||||
current code), the ratio reduced to about 3 : 1, indicating better efficiency.
|
||||
</p><p>
|
||||
|
||||
As reported by streaming lib stats, things were much improved - lifetime window
|
||||
size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per ack down from
|
||||
1.11 to 1.07.
|
||||
</p><p>
|
||||
|
||||
That this was quite effective was surprising, given that we were only changing
|
||||
the transport for the first of 3 to 5 total hops the outbound messages would
|
||||
take.
|
||||
</p><p>
|
||||
|
||||
The effect on outbound i2psnark speeds wasn't clear due to normal variations.
|
||||
Also for the experiment, inbound NTCP was disabled. The effect on inbound
|
||||
speeds on i2psnark was not clear.
|
||||
</p>
|
||||
<h4>Proposals</h4>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
1A)
|
||||
This is easy -
|
||||
We should flip the bid priorities so that SSU is preferred for all traffic, if
|
||||
we can do this without causing all sorts of other trouble. This will fix the
|
||||
i2np.udp.alwaysPreferred configuration option so that it works (either as true
|
||||
or false).
|
||||
|
||||
<li>
|
||||
1B)
|
||||
Alternative to 1A), not so easy -
|
||||
If we can mark traffic without adversely affecting our anonymity goals, we
|
||||
should identify streaming-lib generated traffic and have SSU generate a low bid
|
||||
for that traffic. This tag will have to go with the message through each hop
|
||||
so that the forwarding routers also honor the SSU preference.
|
||||
|
||||
|
||||
<li>
|
||||
2)
|
||||
Bounding SSU even further (reducing maximum retransmissions from the current
|
||||
10) is probably wise to reduce the chance of collapse.
|
||||
|
||||
<li>
|
||||
3)
|
||||
We need further study on the benefits vs. harm of a semi-reliable protocol
|
||||
underneath the streaming lib. Are retransmissions over a single hop beneficial
|
||||
and a big win or are they worse than useless?
|
||||
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
|
||||
could perhaps add a no-ack-required message type in SSU if we don't want any
|
||||
retransmissions at all of streaming-lib traffic. Are tightly bounded
|
||||
retransmissions desirable?
|
||||
|
||||
<li>
|
||||
4)
|
||||
The priority sending code in .28 is only for NTCP. So far my testing hasn't
|
||||
shown much use for SSU priority as the messages don't queue up long enough for
|
||||
priorities to do any good. But more testing needed.
|
||||
|
||||
<li>
|
||||
5)
|
||||
The new streaming lib max timeout of 45s is probably still too low.
|
||||
The TCP RFC says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout (presumably 60s).
|
||||
</ul>
|
||||
|
||||
<h3>Response by jrandom</h3>
|
||||
Posted to new Syndie, 2007-03-27
|
||||
<p>
|
||||
On the whole, I'm open to experimenting with this, though remember why NTCP is
|
||||
there in the first place - SSU failed in a congestion collapse. NTCP "just
|
||||
works", and while 2-10% retransmission rates can be handled in normal
|
||||
single-hop networks, that gives us a 40% retransmission rate with 2 hop
|
||||
tunnels. If you loop in some of the measured SSU retransmission rates we saw
|
||||
back before NTCP was implemented (10-30+%), that gives us an 83% retransmission
|
||||
rate. Perhaps those rates were caused by the low 10 second timeout, but
|
||||
increasing that much would bite us (remember, multiply by 5 and you've got half
|
||||
the journey).
|
||||
</p><p>
|
||||
|
||||
Unlike TCP, we have no feedback from the tunnel to know whether the message
|
||||
made it - there are no tunnel level acks. We do have end to end ACKs, but only
|
||||
on a small number of messages (whenever we distribute new session tags) - out
|
||||
of the 1,553,591 client messages my router sent, we only attempted to ACK
|
||||
145,207 of them. The others may have failed silently or succeeded perfectly.
|
||||
</p><p>
|
||||
|
||||
I'm not convinced by the TCP-over-TCP argument for us, especially split across
|
||||
the various paths we transfer down. Measurements on I2P can convince me
|
||||
otherwise, of course.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
The NTCP max timeout is presumably at least 60 sec, which is the RFC
|
||||
recommendation. There is no way to change NTCP parameters or monitor
|
||||
performance.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
True, but net connections only get up to that level when something really bad
|
||||
is going on - the retransmission timeout on TCP is often on the order of tens
|
||||
or hundreds of milliseconds. As foofighter points out, they've got 20+ years
|
||||
experience and bugfixing in their TCP stacks, plus a billion dollar industry
|
||||
optimizing hardware and software to perform well according to whatever it is
|
||||
they do.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
NTCP has higher overhead and probably higher round trip times. when using NTCP
|
||||
the ratio of (tunnel output) / (i2psnark data output) is at least 3.5 : 1.
|
||||
Running an experiment where the code was modified to prefer SSU (the config
|
||||
option i2np.udp.alwaysPreferred has no effect in the current code), the ratio
|
||||
reduced to about 3 : 1, indicating better efficiency.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
This is very interesting data, though more as a matter of router congestion
|
||||
than bandwidth efficiency - you'd have to compare 3.5*$n*$NTCPRetransmissionPct
|
||||
./. 3.0*$n*$SSURetransmissionPct. This data point suggests there's something in
|
||||
the router that leads to excess local queuing of messages already being
|
||||
transferred.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
lifetime window size up from 6.3 to 7.5, RTT down from 11.5s to 10s, sends per
|
||||
ACK down from 1.11 to 1.07.
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
Remember that the sends-per-ACK is only a sample not a full count (as we don't
|
||||
try to ACK every send). It's not a random sample either, but instead samples
|
||||
more heavily periods of inactivity or the initiation of a burst of activity -
|
||||
sustained load won't require many ACKs.
|
||||
</p><p>
|
||||
|
||||
Window sizes in that range are still woefully low to get the real benefit of
|
||||
AIMD, and still too low to transmit a single 32KB BT chunk (increasing the
|
||||
floor to 10 or 12 would cover that).
|
||||
</p><p>
|
||||
|
||||
Still, the wsize stat looks promising - over how long was that maintained?
|
||||
</p><p>
|
||||
|
||||
Actually, for testing purposes, you may want to look at
|
||||
StreamSinkClient/StreamSinkServer or even TestSwarm in
|
||||
apps/ministreaming/java/src/net/i2p/client/streaming/ - StreamSinkClient is a
|
||||
CLI app that sends a selected file to a selected destination and
|
||||
StreamSinkServer creates a destination and writes out any data sent to it
|
||||
(displaying size and transfer time). TestSwarm combines the two - flooding
|
||||
random data to whomever it connects to. That should give you the tools to
|
||||
measure sustained throughput capacity over the streaming lib, as opposed to BT
|
||||
choke/send.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
1A)
|
||||
|
||||
This is easy -
|
||||
We should flip the bid priorities so that SSU is preferred for all traffic, if
|
||||
we can do this without causing all sorts of other trouble. This will fix the
|
||||
i2np.udp.alwaysPreferred configuration option so that it works (either as true
|
||||
or false).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
Honoring i2np.udp.alwaysPreferred is a good idea in any case - please feel free
|
||||
to commit that change. Let's gather a bit more data though before switching the
|
||||
preferences, as NTCP was added to deal with an SSU-created congestion collapse.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
1B)
|
||||
Alternative to 1A), not so easy -
|
||||
If we can mark traffic without adversely affecting our anonymity goals, we
|
||||
should identify streaming-lib generated traffic
|
||||
and have SSU generate a low bid for that traffic. This tag will have to go with
|
||||
the message through each hop
|
||||
so that the forwarding routers also honor the SSU preference.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
In practice, there are three types of traffic - tunnel building/testing, netDb
|
||||
query/response, and streaming lib traffic. The network has been designed to
|
||||
make differentiating those three very hard.
|
||||
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
2)
|
||||
Bounding SSU even further (reducing maximum retransmissions from the current
|
||||
10) is probably wise to reduce the chance of collapse.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
|
||||
retransmissions is reasonable, from a transport layer, but if the other side is
|
||||
too congested to ACK in time (even with the implemented SACK/NACK capability),
|
||||
there's not much we can do.
|
||||
</p><p>
|
||||
|
||||
In my view, to really address the core issue we need to address why the router
|
||||
gets so congested to ACK in time (which, from what I've found, is due to CPU
|
||||
contention). Maybe we can juggle some things in the router's processing to make
|
||||
the transmission of an already existing tunnel higher CPU priority than
|
||||
decrypting a new tunnel request? Though we've got to be careful to avoid
|
||||
starvation.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
3)
|
||||
We need further study on the benefits vs. harm of a semi-reliable protocol
|
||||
underneath the streaming lib. Are retransmissions over a single hop beneficial
|
||||
and a big win or are they worse than useless?
|
||||
We could do a new SUU (secure unreliable UDP) but probably not worth it. We
|
||||
could perhaps add a no-ACK-required message type in SSU if we don't want any
|
||||
retransmissions at all of streaming-lib traffic. Are tightly bounded
|
||||
retransmissions desirable?
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
Worth looking into - what if we just disabled SSU's retransmissions? It'd
|
||||
probably lead to much higher streaming lib resend rates, but maybe not.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
4)
|
||||
The priority sending code in .28 is only for NTCP. So far my testing hasn't
|
||||
shown much use for SSU priority as the messages don't queue up long enough for
|
||||
priorities to do any good. But more testing needed.
|
||||
</i>
|
||||
|
||||
</p><p>
|
||||
|
||||
There's UDPTransport.PRIORITY_LIMITS and UDPTransport.PRIORITY_WEIGHT (honored
|
||||
by TimedWeightedPriorityMessageQueue), but currently the weights are almost all
|
||||
equal, so there's no effect. That could be adjusted, of course (but as you
|
||||
mention, if there's no queuing, it doesn't matter).
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
5)
|
||||
The new streaming lib max timeout of 45s is probably still too low. The TCP RFC
|
||||
says 60s. It probably shouldn't be shorter than the underlying NTCP max timeout
|
||||
(presumably 60s).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
|
||||
That 45s is the max retransmission timeout of the streaming lib though, not the
|
||||
stream timeout. TCP in practice has retransmission timeouts orders of magnitude
|
||||
less, though yes, can get to 60s on links running through exposed wires or
|
||||
satellite transmissions ;) If we increase the streaming lib retransmission
|
||||
timeout to e.g. 75 seconds, we could go get a beer before a web page loads
|
||||
(especially assuming less than a 98% reliable transport). That's one reason we
|
||||
prefer NTCP.
|
||||
</p>
|
||||
|
||||
|
||||
<h3>Response by zzz</h3>
|
||||
Posted to new Syndie, 2007-03-31
|
||||
<p>
|
||||
|
||||
<i>
|
||||
At 10 retransmissions, we're up shit creek already, I agree. One, maybe two
|
||||
retransmissions is reasonable, from a transport layer, but if the other side is
|
||||
too congested to ACK in time (even with the implemented SACK/NACK capability),
|
||||
there's not much we can do.
|
||||
<br>
|
||||
In my view, to really address the core issue we need to address why the
|
||||
router gets so congested to ACK in time (which, from what I've found, is due to
|
||||
CPU contention). Maybe we can juggle some things in the router's processing to
|
||||
make the transmission of an already existing tunnel higher CPU priority than
|
||||
decrypting a new tunnel request? Though we've got to be careful to avoid
|
||||
starvation.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
One of my main stats-gathering techniques is turning on
|
||||
net.i2p.client.streaming.ConnectionPacketHandler=DEBUG and watching the RTT
|
||||
times and window sizes as they go by. To overgeneralize for a moment, it's
|
||||
common to see 3 types of connections: ~4s RTT, ~10s RTT, and ~30s RTT. Trying
|
||||
to knock down the 30s RTT connections is the goal. If CPU contention is the
|
||||
cause then maybe some juggling will do it.
|
||||
</p><p>
|
||||
|
||||
Reducing the SSU max retrans from 10 is really just a stab in the dark as we
|
||||
don't have good data on whether we are collapsing, having TCP-over-TCP issues,
|
||||
or what, so more data is needed.
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
Worth looking into - what if we just disabled SSU's retransmissions? It'd
|
||||
probably lead to much higher streaming lib resend rates, but maybe not.
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
What I don't understand, if you could elaborate, are the benefits of SSU
|
||||
retransmissions for non-streaming-lib traffic. Do we need tunnel messages (for
|
||||
example) to use a semi-reliable transport or can they use an unreliable or
|
||||
kinda-sorta-reliable transport (1 or 2 retransmissions max, for example)? In
|
||||
other words, why semi-reliability?
|
||||
</p><p>
|
||||
|
||||
<i>
|
||||
(but as you mention, if there's no queuing, it doesn't matter).
|
||||
</i>
|
||||
</p><p>
|
||||
|
||||
I implemented priority sending for UDP but it kicked in about 100,000 times
|
||||
less often than the code on the NTCP side. Maybe that's a clue for further
|
||||
investigation or a hint - I don't understand why it would back up that much
|
||||
more often on NTCP, but maybe that's a hint on why NTCP performs worse.
|
||||
|
||||
</p>
|
||||
|
||||
<h3>Question answered by jrandom</h3>
|
||||
Posted to new Syndie, 2007-03-31
|
||||
<p>
|
||||
measured SSU retransmission rates we saw back before NTCP was implemented
|
||||
(10-30+%)
|
||||
</p><p>
|
||||
|
||||
Can the router itself measure this? If so, could a transport be selected based
|
||||
on measured performance? (i.e. if an SSU connection to a peer is dropping an
|
||||
unreasonable number of messages, prefer NTCP when sending to that peer)
|
||||
</p><p>
|
||||
|
||||
|
||||
|
||||
Yeah, it currently uses that stat right now as a poor-man's MTU detection (if
|
||||
the retransmission rate is high, it uses the small packet size, but if it's low,
|
||||
it uses the large packet size). We tried a few things when first introducing
|
||||
NTCP (and when first moving away from the original TCP transport) that would
|
||||
prefer SSU but fail that transport for a peer easily, causing it to fall back
|
||||
on NTCP. However, there's certainly more that could be done in that regard,
|
||||
though it gets complicated quickly (how/when to adjust/reset the bids, whether
|
||||
to share these preferences across multiple peers or not, whether to share it
|
||||
across multiple sessions with the same peer (and for how long), etc).
|
||||
|
||||
|
||||
<h3>Response by foofighter</h3>
Posted to new Syndie, 2007-03-26
<p>

If I've understood things right, the primary reason in favor of TCP (in
general, both the old and new variety) was that you needn't worry about coding
a good TCP stack. Which ain't impossibly hard to get right... just that
existing TCP stacks have a 20 year lead.
</p><p>

AFAIK, there hasn't been much deep theory behind the preference of TCP versus
UDP, except the following considerations:

<ul>
<li>
A TCP-only network is very dependent on reachable peers (those who can forward
incoming connections through their NAT)
<li>
Still, even if reachable peers are rare, having them be high capacity somewhat
alleviates the topological scarcity issues
<li>
UDP allows for "NAT hole punching", which lets people be "kind of
pseudo-reachable" (with the help of introducers) who could otherwise only
connect out
<li>
The "old" TCP transport implementation required lots of threads, which was a
performance killer, while the "new" TCP transport does well with few threads
<li>
Routers of set A crap out when saturated with UDP. Routers of set B crap out
when saturated with TCP.
<li>
It "feels" (as in, there are some indications but no scientific data or
quality statistics) that A is more widely deployed than B
<li>
Some networks carry non-DNS UDP datagrams with an outright shitty quality,
while still somewhat bothering to carry TCP streams.
</ul>
</p><p>

On that background, a small diversity of transports (as many as needed, but not
more) appears sensible in either case. Which one should be the main transport
depends on how they perform. I've seen nasty stuff on my line when I
tried to use its full capacity with UDP. Packet losses on the level of 35%.
</p><p>

We could definitely try playing with UDP versus TCP priorities, but I'd urge
caution in that. I would urge that they not be changed too radically all at
once, or it might break things.

</p>
<h3>Response by zzz</h3>
Posted to new Syndie, 2007-03-27
<p>
<i>
AFAIK, there hasn't been much deep theory behind the preference of TCP versus
UDP, except the following considerations:
</i>

</p><p>

These are all valid issues. However, you are considering the two protocols in
isolation, rather than thinking about what transport protocol is best for a
particular higher-level protocol (i.e. streaming lib or not).
</p><p>

What I'm saying is you have to take the streaming lib into consideration.

So either shift the preferences for everybody or treat streaming lib traffic
differently.

That's what my proposal 1B) is talking about - have a different preference for
streaming-lib traffic than for non-streaming-lib traffic (for example tunnel
build messages).
</p><p>

<i>

On that background, a small diversity of transports (as many as needed, but
not more) appears sensible in either case. Which one should be the main transport
depends on how they perform. I've seen nasty stuff on my line when I
tried to use its full capacity with UDP. Packet losses on the level of 35%.

</i>
</p><p>

Agreed. The new .28 may have made things better for packet loss over UDP, or
maybe not.

One important point - the transport code does remember failures of a transport.
So if UDP is the preferred transport, it will try it first, but if it fails for
a particular destination, the next attempt for that destination will try
NTCP rather than trying UDP again.
</p><p>

<i>
We could definitely try playing with UDP versus TCP priorities, but I'd urge
caution in that. I would urge that they not be changed too radically all at
once, or it might break things.
</i>
</p><p>

We have four tuning knobs - the four bid values (SSU and NTCP, for
already-connected and not-already-connected).
We could make SSU be preferred over NTCP only if both are connected, for
example, but try NTCP first if neither transport is connected.
</p><p>

The other way to do it gradually is to shift only the streaming lib traffic
(the 1B proposal); however, that could be hard and may have anonymity
implications, I don't know. Or maybe shift the traffic only for the first
outbound hop (i.e. don't propagate the flag to the next router), which gives
you only partial benefit but might be more anonymous and easier.
</p>
<h3>Results of the Discussion</h3>
... and other related changes in the same timeframe (2007):
<ul>
<li>
Significant tuning of the streaming lib parameters,
greatly increasing outbound performance, was implemented in 0.6.1.28
<li>
Priority sending for NTCP was implemented in 0.6.1.28
<li>
Priority sending for SSU was implemented by zzz but was never checked in
<li>
The advanced transport bid control
i2np.udp.preferred was implemented in 0.6.1.29.
<li>
Pushback for NTCP was implemented in 0.6.1.30, disabled in 0.6.1.31 due to anonymity concerns,
and re-enabled with improvements to address those concerns in 0.6.1.32.
<li>
None of zzz's proposals 1-5 have been implemented.
</ul>

{% endblock %}
@ -2,32 +2,51 @@
{% block title %}Tunnel Creation{% endblock %}
{% block content %}

<b>Note: This documents the current tunnel build implementation as of release 0.6.1.10.</b>
<br>
<pre>
1) <a href="#tunnelCreate.overview">Tunnel creation</a>
1.1) <a href="#tunnelCreate.requestRecord">Tunnel creation request record</a>
1.2) <a href="#tunnelCreate.hopProcessing">Hop processing</a>
1.3) <a href="#tunnelCreate.replyRecord">Tunnel creation reply record</a>
1.4) <a href="#tunnelCreate.requestPreparation">Request preparation</a>
1.5) <a href="#tunnelCreate.requestDelivery">Request delivery</a>
1.6) <a href="#tunnelCreate.endpointHandling">Endpoint handling</a>
1.7) <a href="#tunnelCreate.replyProcessing">Reply processing</a>
2) <a href="#tunnelCreate.notes">Notes</a>
</pre>
This page documents the current tunnel build implementation.
Updated August 2010 for release 0.8
<h2 id="tunnelCreate.overview">1) Tunnel creation encryption:</h2>
<h2 id="tunnelCreate.overview">Tunnel Creation Specification</h2>

<p>
This document specifies the details of the encrypted tunnel build messages
used to create tunnels using a "non-interactive telescoping" method.
See <a href="tunnel-alt.html">the tunnel build document</a>
for an overview of the process, including peer selection and ordering methods.

<p>The tunnel creation is accomplished by a single message passed along
the path of peers in the tunnel, rewritten in place, and transmitted
back to the tunnel creator. This single tunnel message is made up
of a fixed number of records (8) - one for each potential peer in
of a variable number of records (up to 8) - one for each potential peer in
the tunnel. Individual records are asymmetrically encrypted to be
read only by a specific peer along the path, while an additional
symmetric layer of encryption is added at each hop so as to expose
the asymmetrically encrypted record only at the appropriate time.</p>

<h3 id="tunnelCreate.requestRecord">1.1) Tunnel creation request record</h3>
<h3 id="number">Number of Records</h3>
Not all records must contain valid data.
The build message for a 3-hop tunnel, for example, may contain more records
to hide the actual length of the tunnel from the participants.
There are two build message types. The original
<a href="i2np_spec.html#msg_TunnelBuild">Tunnel Build Message</a> (TBM)
contains 8 records, which is more than enough for any practical tunnel length.
The recently-implemented
<a href="i2np_spec.html#msg_VariableTunnelBuild">Variable Tunnel Build Message</a> (VTBM)
contains 1 to 8 records. The originator may trade off the size of the message
against the desired amount of tunnel length obfuscation.
<p>
In the current network, most tunnels are 2 or 3 hops long.
The current implementation uses a 5-record VTBM to build tunnels of 4 hops or less,
and the 8-record TBM for longer tunnels.
The 5-record VTBM (which fits in three 1KB tunnel messages) reduces network traffic
and increases build success rate, because larger messages are more likely to be dropped.
<p>
The reply message must be the same type and length as the build message.
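<p>
To illustrate the policy just described, here is a minimal sketch (Java; the
method and constant names are illustrative and not taken from the router source)
of how an originator might pick a record count for a planned tunnel:
</p>
<pre>
{% filter escape %}
// Sketch only: chooses a build message size for a planned tunnel,
// following the policy described above (5-record VTBM for tunnels of
// 4 hops or less, 8-record TBM otherwise).
static int chooseRecordCount(int hops) {
    final int SHORT_RECORDS = 5;   // VTBM, fits in three 1KB tunnel messages
    final int FULL_RECORDS  = 8;   // original TBM
    // One record per hop; unused records are filled with random data
    // to hide the true tunnel length from the participants.
    return (hops <= 4) ? SHORT_RECORDS : FULL_RECORDS;
}
{% endfilter %}
</pre>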
<h3 id="tunnelCreate.requestRecord">Request Record Specification</h3>

Also specified in the
<a href="i2np_spec.html#struct_BuildRequestRecord">I2NP Specification</a>

<p>Cleartext of the record, visible only to the hop being asked:</p><pre>
bytes 0-3: tunnel ID to receive messages as
@ -49,49 +68,79 @@ endpoint, they specify where the rewritten tunnel creation reply
message should be sent. In addition, the next message ID specifies the
message ID that the message (or reply) should use.</p>

<p>The flags field currently has two bits defined:</p><pre>
bit 0: if set, allow messages from anyone
bit 1: if set, allow messages to anyone, and send the reply to the
specified next hop in a tunnel message</pre>
<p>The flags field contains the following:
<pre>
Bit order: 76543210 (bit 7 is MSB)
bit 7: if set, allow messages from anyone
bit 6: if set, allow messages to anyone, and send the reply to the
specified next hop in a tunnel message
bits 5-0: Undefined
</pre>

<p>That cleartext record is ElGamal 2048 encrypted with the hop's
Bit 7 indicates that the hop will be an inbound gateway (IBGW).
Bit 6 indicates that the hop will be an outbound endpoint (OBEP).
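<p>
A minimal sketch of how a hop might test the flag byte (Java; the constant and
method names are illustrative, not from the router source):
</p>
<pre>
{% filter escape %}
// Sketch only: interprets the request record's flag byte as described above.
static final int FLAG_FROM_ANYONE = 0x80; // bit 7: allow messages from anyone (IBGW)
static final int FLAG_TO_ANYONE   = 0x40; // bit 6: allow messages to anyone (OBEP)

static boolean isInboundGateway(byte flags)   { return (flags & FLAG_FROM_ANYONE) != 0; }
static boolean isOutboundEndpoint(byte flags) { return (flags & FLAG_TO_ANYONE) != 0; }
{% endfilter %}
</pre>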
<h4 id="encryption">Request Encryption</h4>

<p>That cleartext record is <a href="how_cryptography.html#elgamal">ElGamal 2048 encrypted</a> with the hop's
public encryption key and formatted into a 528 byte record:</p><pre>
bytes 0-15: SHA-256-128 of the current hop's router identity
bytes 0-15: First 16 bytes of the SHA-256 of the current hop's router identity
bytes 16-527: ElGamal-2048 encrypted request record</pre>

<p>Since the cleartext uses the full field, there is no need for
additional padding beyond <code>SHA256(cleartext) + cleartext</code>.</p>
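<p>
A minimal sketch of assembling the 528-byte encrypted request record described
above (Java; <code>elGamalEncrypt</code> is a placeholder standing in for the
router's ElGamal-2048 engine, not an actual API):
</p>
<pre>
{% filter escape %}
// Sketch only: assembles the 528-byte encrypted request record described above.
// elGamalEncrypt() is a placeholder (an assumption, not a real API) for the
// router's ElGamal-2048 engine, which is taken to yield a 512-byte block here.
static byte[] buildEncryptedRequestRecord(byte[] cleartextRecord,
                                          byte[] hopIdentityHash /* SHA-256, 32 bytes */,
                                          byte[] hopPublicKey) {
    byte[] out = new byte[528];
    // bytes 0-15: first 16 bytes of the hop's router identity hash
    System.arraycopy(hopIdentityHash, 0, out, 0, 16);
    // bytes 16-527: ElGamal-2048 encryption of the cleartext request record
    byte[] elg = elGamalEncrypt(cleartextRecord, hopPublicKey); // 512 bytes
    System.arraycopy(elg, 0, out, 16, 512);
    return out;
}

// Placeholder for the asymmetric step; this sketch does not implement it.
static byte[] elGamalEncrypt(byte[] data, byte[] hopPublicKey) {
    throw new UnsupportedOperationException("sketch only");
}
{% endfilter %}
</pre>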
<h3 id="tunnelCreate.hopProcessing">1.2) Hop processing</h3>
<h3 id="tunnelCreate.hopProcessing">Hop Processing and Encryption</h3>

<p>When a hop receives a TunnelBuildMessage, it looks through the 8
<p>When a hop receives a TunnelBuildMessage, it looks through the
records contained within it for one starting with their own identity
hash (trimmed to 16 bytes). It then decrypts the ElGamal block from
that record and retrieves the protected cleartext. At that point,
they make sure the tunnel request is not a duplicate by feeding the
AES-256 reply key into a bloom filter and making sure the request
time is within an hour of current. Duplicates or invalid requests
AES-256 reply key into a bloom filter.
Duplicates or invalid requests
are dropped.</p>

<p>After deciding whether they will agree to participate in the tunnel
or not, they replace the record that had contained the request with
an encrypted reply block. All other records are AES-256/CBC
encrypted with the included reply key and IV (though each is
an encrypted reply block. All other records are <a href="how_cryptography.html#AES">AES-256/CBC
encrypted</a> with the included reply key and IV (though each is
encrypted separately, rather than chained across records).</p>
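<p>
A minimal sketch of the symmetric half of the hop processing described above
(Java; the method names are illustrative, and the duplicate check and the
participation decision are omitted):
</p>
<pre>
{% filter escape %}
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: after a hop has found its own record by the 16-byte
// identity-hash prefix and ElGamal-decrypted it (yielding, among other
// fields, the 32-byte reply key and 16-byte reply IV), it AES-256/CBC
// encrypts every other 528-byte record with that key and IV.
// Each record is encrypted separately; CBC chaining does not run across records.
static void encryptOtherRecords(byte[][] records, int myIndex,
                                byte[] replyKey, byte[] replyIv) throws Exception {
    Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
    SecretKeySpec key = new SecretKeySpec(replyKey, "AES");
    for (int i = 0; i < records.length; i++) {
        if (i == myIndex) continue;  // this slot is replaced by the hop's reply record
        aes.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(replyIv));
        // 528 bytes is a multiple of the 16-byte AES block size, so no padding is needed
        records[i] = aes.doFinal(records[i]);
    }
}

// Locating the hop's own record: compare the first 16 bytes of each record
// with the first 16 bytes of the hop's own router identity hash.
static int findMyRecord(byte[][] records, byte[] myIdentityHash) {
    for (int i = 0; i < records.length; i++) {
        if (Arrays.equals(Arrays.copyOfRange(records[i], 0, 16),
                          Arrays.copyOfRange(myIdentityHash, 0, 16)))
            return i;
    }
    return -1; // no record addressed to us
}
{% endfilter %}
</pre>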
<h3 id="tunnelCreate.replyRecord">1.3) Tunnel creation reply record</h3>
<h4 id="tunnelCreate.replyRecord">Reply Record Specification</h4>

<p>After the current hop reads their record, they replace it with a
reply record stating whether or not they agree to participate in the
tunnel, and if they do not, they classify their reason for
rejection. This is simply a 1 byte value, with 0x0 meaning they
agree to participate in the tunnel, and higher values meaning higher
levels of rejection. The reply is encrypted with the AES session
key delivered to it in the encrypted block, padded with random data
until it reaches the full record size:</p><pre>
AES-256-CBC(SHA-256(padding+status) + padding + status, key, IV)</pre>
levels of rejection.
<p>
The following rejection codes are defined:
<ul>
<li>
TUNNEL_REJECT_PROBABALISTIC_REJECT = 10
<li>
TUNNEL_REJECT_TRANSIENT_OVERLOAD = 20
<li>
TUNNEL_REJECT_BANDWIDTH = 30
<li>
TUNNEL_REJECT_CRIT = 50
</ul>
To hide other causes, such as router shutdown, from peers, the current implementation
uses TUNNEL_REJECT_BANDWIDTH for almost all rejections.
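<p>
For illustration, the rejection codes above as Java constants, together with the
blanket use of TUNNEL_REJECT_BANDWIDTH just described (the helper method is
illustrative, not from the router source):
</p>
<pre>
{% filter escape %}
// Reply status values (0 = accept; higher values = stronger rejection).
static final int TUNNEL_ACCEPT = 0;
static final int TUNNEL_REJECT_PROBABALISTIC_REJECT = 10;
static final int TUNNEL_REJECT_TRANSIENT_OVERLOAD = 20;
static final int TUNNEL_REJECT_BANDWIDTH = 30;
static final int TUNNEL_REJECT_CRIT = 50;

// Sketch: to avoid leaking the real cause (e.g. imminent shutdown),
// report TUNNEL_REJECT_BANDWIDTH for almost all rejections.
static int replyStatus(boolean accept) {
    return accept ? TUNNEL_ACCEPT : TUNNEL_REJECT_BANDWIDTH;
}
{% endfilter %}
</pre>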
<h3 id="tunnelCreate.requestPreparation">1.4) Request preparation</h3>
<p>
The reply is encrypted with the AES session
key delivered to it in the encrypted block, padded with 495 bytes of random data
to reach the full record size.
The padding is placed before the status byte:
</p><pre>
AES-256-CBC(SHA-256(padding+status) + padding + status, key, IV)</pre>
This is also described in the
<a href="i2np_spec.html#msg_TunnelBuildReply">I2NP spec</a>.
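<p>
A minimal sketch of building and encrypting a reply record under the layout
given above (Java; 32-byte hash, 495 bytes of random padding, 1 status byte):
</p>
<pre>
{% filter escape %}
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: builds the 528-byte reply record
//   SHA-256(padding + status) || padding (495 random bytes) || status (1 byte)
// and AES-256/CBC encrypts it with the reply key and IV from the request.
static byte[] buildReplyRecord(int status, byte[] replyKey, byte[] replyIv) throws Exception {
    byte[] paddingAndStatus = new byte[496];
    new SecureRandom().nextBytes(paddingAndStatus);
    paddingAndStatus[495] = (byte) status;               // status is the last byte

    byte[] hash = MessageDigest.getInstance("SHA-256").digest(paddingAndStatus);

    byte[] cleartext = new byte[528];
    System.arraycopy(hash, 0, cleartext, 0, 32);
    System.arraycopy(paddingAndStatus, 0, cleartext, 32, 496);

    Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
    aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(replyKey, "AES"),
             new IvParameterSpec(replyIv));
    return aes.doFinal(cleartext);                       // 528 bytes, no padding needed
}
{% endfilter %}
</pre>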
<h3 id="tunnelCreate.requestPreparation">Request Preparation</h3>

<p>When building a new request, all of the records must first be
built and asymmetrically encrypted. Each record should then be
@ -103,31 +152,49 @@ right hop after their predecessor encrypts it.</p>

<p>The excess records not needed for individual requests are simply
filled with random data by the creator.</p>
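<p>
The hunk above elides the middle of that paragraph; the step it describes is,
by this reading, the creator's pre-decryption of each asymmetrically encrypted
record with the reply keys and IVs of the hops earlier in the path, so that
those hops' later encryptions cancel out and the ElGamal block surfaces at the
right hop. A minimal sketch under that assumption (Java, illustrative names):
</p>
<pre>
{% filter escape %}
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch only (assumption, see above): the creator pre-decrypts the record
// destined for hop i with the reply keys/IVs of hops i-1 .. 0, so that each
// preceding hop's AES encryption, applied as the message travels down the
// path, restores the record to its ElGamal-encrypted form at hop i.
static byte[] preDecryptForHop(byte[] record, int i,
                               byte[][] replyKeys, byte[][] replyIvs) throws Exception {
    Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
    for (int j = i - 1; j >= 0; j--) {          // hops are 0-indexed here
        aes.init(Cipher.DECRYPT_MODE,
                 new SecretKeySpec(replyKeys[j], "AES"),
                 new IvParameterSpec(replyIvs[j]));
        record = aes.doFinal(record);
    }
    return record;
}
{% endfilter %}
</pre>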
<h3 id="tunnelCreate.requestDelivery">1.5) Request delivery</h3>
<h3 id="tunnelCreate.requestDelivery">Request Delivery</h3>

<p>For outbound tunnels, the delivery is done directly from the tunnel
creator to the first hop, packaging up the TunnelBuildMessage as if
the creator was just another hop in the tunnel. For inbound
tunnels, the delivery is done through an existing outbound tunnel
(and during startup, when no outbound tunnel exists yet, a fake 0
hop outbound tunnel is used).</p>
tunnels, the delivery is done through an existing outbound tunnel.
The outbound tunnel is generally from the same pool as the new tunnel being built.
If no outbound tunnel is available in that pool, an outbound exploratory tunnel is used.
At startup, when no outbound exploratory tunnel exists yet, a fake 0-hop
outbound tunnel is used.</p>

<h3 id="tunnelCreate.endpointHandling">1.6) Endpoint handling</h3>
<h3 id="tunnelCreate.endpointHandling">Endpoint Handling</h3>

<p>When the request reaches an outbound endpoint (as determined by the
<p>
For creation of an outbound tunnel,
when the request reaches an outbound endpoint (as determined by the
'allow messages to anyone' flag), the hop is processed as usual,
encrypting a reply in place of the record and encrypting all of the
other records, but since there is no 'next hop' to forward the
TunnelBuildMessage on to, it instead places the encrypted reply
records into a TunnelBuildReplyMessage and delivers it to the
records into a
<a href="i2np_spec.html#msg_TunnelBuildReply">TunnelBuildReplyMessage</a>
or
<a href="i2np_spec.html#msg_VariableTunnelBuildReply">VariableTunnelBuildReplyMessage</a>
(the type of message and number of records must match that of the request)
and delivers it to the
reply tunnel specified within the request record. That reply tunnel
forwards the reply records down to the tunnel creator for
processing, as below.</p>

<p>When the request reaches the inbound endpoint (also known as the
tunnel creator), the router processes each of the replies, as below.</p>
<p>The reply tunnel was specified by the creator as follows:
Generally it is an inbound tunnel from the same pool as the new outbound tunnel being built.
If no inbound tunnel is available in that pool, an inbound exploratory tunnel is used.
At startup, when no inbound exploratory tunnel exists yet, a fake 0-hop
inbound tunnel is used.</p>

<h3 id="tunnelCreate.replyProcessing">1.7) Reply processing</h3>
<p>
For creation of an inbound tunnel,
when the request reaches the inbound endpoint (also known as the
tunnel creator), there is no need to generate an explicit Reply Message, and
the router processes each of the replies, as below.</p>

<h3 id="tunnelCreate.replyProcessing">Reply Processing by the Request Creator</h3>

<p>To process the reply records, the creator simply has to AES decrypt
each record individually, using the reply key and IV of each hop in
@ -137,18 +204,37 @@ why they refuse. If they all agree, the tunnel is considered
created and may be used immediately, but if anyone refuses, the
tunnel is discarded.</p>
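<p>
A minimal sketch of the creator-side decryption (Java, illustrative; it assumes,
per the hop behavior above, that reply record <i>i</i> carries the AES layers of
hop <i>i</i> and of every later hop, which are removed in reverse order before
the hash and status byte are checked):
</p>
<pre>
{% filter escape %}
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: peel the AES layers off reply record i (encrypted by hop i and
// by every later hop), then verify SHA-256(padding + status) and return the
// status byte (0 = agreed, non-zero = rejection code).
static int readReplyStatus(byte[] record, int i,
                           byte[][] replyKeys, byte[][] replyIvs) throws Exception {
    Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
    int lastHop = replyKeys.length - 1;
    for (int j = lastHop; j >= i; j--) {        // reverse order: last hop's layer first
        aes.init(Cipher.DECRYPT_MODE,
                 new SecretKeySpec(replyKeys[j], "AES"),
                 new IvParameterSpec(replyIvs[j]));
        record = aes.doFinal(record);
    }
    byte[] hash = MessageDigest.getInstance("SHA-256")
                               .digest(Arrays.copyOfRange(record, 32, 528));
    if (!Arrays.equals(hash, Arrays.copyOfRange(record, 0, 32)))
        throw new IllegalStateException("tampered reply record - tunnel is discarded");
    return record[527] & 0xff;                  // the status byte
}
{% endfilter %}
</pre>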
<h2 id="tunnelCreate.notes">2) Notes</h2>
<p>
The agreements and rejections are noted in each peer's
<a href="how_peerselection.html">profile</a>, to be used in future assessments
of peer tunnel capacity.

<h2 id="tunnelCreate.notes">History and Notes</h2>
<p>
This strategy came about during a discussion on the I2P mailing list
between Michael Rogers, Matthew Toseland (toad), and jrandom regarding
the predecessor attack. See: <ul>
<li><a href="http://osdir.com/ml/network.i2p/2005-10/msg00138.html">Summary</a></li>
<li><a href="http://osdir.com/ml/network.i2p/2005-10/msg00129.html">Reasoning</a></li>
</ul>
It was introduced in release 0.6.1.10 on 2006-02-16, which was the last time
a non-backward-compatible change was made in I2P.
</p>
<p>
Notes:
<ul>
<li>This does not prevent two hostile peers within a tunnel from
<li>This design does not prevent two hostile peers within a tunnel from
tagging one or more request or reply records to detect that they are
within the same tunnel, but doing so can be detected by the tunnel
creator when reading the reply, causing the tunnel to be marked as
invalid.</li>
<li>This does not include a proof of work on the asymmetrically
<li>This design does not include a proof of work on the asymmetrically
encrypted section, though the 16 byte identity hash could be cut in
half with the later replaced by a hashcash function of up to 2^64
cost. This will not immediately be pursued, however.</li>
<li>This alone does not prevent two hostile peers within a tunnel from
half with the latter replaced by a hashcash function of up to 2^64
cost.</li>
<li>This design alone does not prevent two hostile peers within a tunnel from
using timing information to determine whether they are in the same
tunnel. The use of batched and synchronized request delivery
could help (batching up requests and sending them off on the
@ -159,12 +245,34 @@ window would work (though doing that would require a high degree of
clock synchronization). Alternately, perhaps individual hops could
inject a random delay before forwarding on the request?</li>
<li>Are there any nonfatal methods of tagging the request?</li>
<li>This strategy came about during a discussion on the I2P mailing list
between Michael Rogers, Matthew Toseland (toad), and jrandom regarding
the predecessor attack. See: <ul>
<li><a href="http://osdir.com/ml/network.i2p/2005-10/msg00138.html">Summary</a></li>
<li><a href="http://osdir.com/ml/network.i2p/2005-10/msg00129.html">Reasoning</a></li>
</ul></li>
</ul>
<h2 id="ref">References</h2>
<ul>
<li>
<a href="http://prisms.cs.umass.edu/brian/pubs/wright-tissec.pdf">Predecessor
attack</a>
<li>
<a href="http://prisms.cs.umass.edu/brian/pubs/wright.tissec.2008.pdf">2008
update</a>
</ul>

<h2 id="future">Future Work</h2>
<ul>
<li>
It appears that, in the current implementation, the originator leaves one record empty
for itself, which is not necessary. Thus a message of n records can only build a
tunnel of n-1 hops. This is to be researched and verified.
If it is possible to use the remaining record without compromising anonymity,
we should do so.
<li>
The usefulness of a timestamp with an hour resolution is questionable,
and the constraint is not currently enforced.
Therefore the request time field is unused.
This should be researched and possibly changed.
<li>
Further analysis of possible tagging and timing attacks described in the above notes.
</ul>


{% endblock %}