From 8fa8d7739f5341c58d67d7e92589dbf5085da8cf Mon Sep 17 00:00:00 2001
From: jrandom
Date: Sun, 9 Jan 2005 23:01:34 +0000
Subject: [PATCH] work in progress, but i want it in cvs so i dont lose it again

---
 router/doc/tunnel.html | 373 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 373 insertions(+)
 create mode 100644 router/doc/tunnel.html

diff --git a/router/doc/tunnel.html b/router/doc/tunnel.html
new file mode 100644
index 000000000..cddeaf2c0
--- /dev/null
+++ b/router/doc/tunnel.html
@@ -0,0 +1,373 @@
+
+1) Tunnel overview
+2) Tunnel operation
+2.1) Message preprocessing
+2.2) Gateway processing
+2.3) Participant processing
+2.4) Endpoint processing
+2.5) Padding
+2.6) Tunnel fragmentation
+2.7) Alternatives
+2.7.1) Don't use a checksum block
+2.7.2) Adjust tunnel processing midstream
+2.7.3) Use bidirectional tunnels
+2.7.4) Use smaller hashes
+3) Tunnel building
+3.1) Peer selection
+3.2) Request delivery
+3.3) Pooling
+4) Tunnel throttling
+5) Mixing/batching
+
+ +

1) Tunnel overview

+ +

Within I2P, messages are passed in one direction through a virtual +tunnel of peers, using whatever means are available to pass the +message on to the next hop. Messages arrive at the tunnel's +gateway, get bundled up for the path, and are forwarded on to the +next hop in the tunnel, which processes and verifies the validity +of the message and sends it on to the next hop, and so on, until +it reaches the tunnel endpoint. That endpoint takes the messages +bundled up by the gateway and forwards them as instructed - either +to another router, to another tunnel on another router, or locally.

+ +

Tunnels all work the same, but can be segmented into two different +groups - inbound tunnels and outbound tunnels. The inbound tunnels +have an untrusted gateway which passes messages down towards the +tunnel creator, which serves as the tunnel endpoint. For outbound +tunnels, the tunnel creator serves as the gateway, passing messages +out to the remote endpoint.

+ +

The tunnel's creator selects exactly which peers will participate +in the tunnel, and provides each with the necessary configuration +data. Tunnels may vary in length from 0 hops (where the gateway +is also the endpoint) to 9 hops (where there are 7 peers after +the gateway and before the endpoint). It is the intent to make +it hard for either participants or third parties to determine +the length of a tunnel, or even for colluding participants to +determine whether they are a part of the same tunnel at all +(barring the situation where colluding peers are next to each other +in the tunnel). Messages that have been corrupted are also dropped +as soon as possible, reducing network load.

+ +

Beyond their length, there are additional configurable parameters +for each tunnel that can be used, such as a throttle on the size or +frequency of messages delivered, how padding should be used, how +long a tunnel should be in operation, whether to inject chaff +messages, whether to use fragmentation, and what, if any, batching +strategies should be employed.

+ +

In practice, a series of tunnel pools are used for different +purposes - each local client destination has its own set of inbound +tunnels and outbound tunnels, configured to meet its anonymity and +performance needs. In addition, the router itself maintains a series +of pools for participating in the network database and for managing +the tunnels themselves.

+ +

I2P is an inherently packet switched network, even with these +tunnels, allowing it to take advantage of multiple tunnels running +in parallel, increasing resilience and balancing load. Outside of +the core I2P layer, there is an optional end to end streaming library +available for client applications, exposing TCP-esque operation, +including message reordering, retransmission, congestion control, etc.

+ +

2) Tunnel operation

+ +

Tunnel operation has four distinct processes, taken on by various +peers in the tunnel. First, the tunnel gateway accumulates a number +of tunnel messages and preprocesses them into something for tunnel +delivery. Next, that gateway encrypts that preprocessed data, then +forwards it to the first hop. That peer, and subsequent tunnel +participants, unwrap a layer of the encryption, verifying the +integrity of the message, then forward it on to the next peer. +Eventually, the message arrives at the endpoint where the messages +bundled by the gateway are split out again and forwarded on as +requested.

+ +

2.1) Message preprocessing

+ +

When the gateway wants to deliver data through the tunnel, it first +gathers zero or more I2NP messages (no more than 32KB worth), +selects how much padding will be used, and decides how each I2NP +message should be handled by the tunnel endpoint, encoding that +data into the raw tunnel payload:

+ + +

The instructions are encoded as follows:

+ + +

The I2NP message is encoded in its standard form, and the +preprocessed payload must be padded to a multiple of 16 bytes.
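The 16 byte alignment rule can be sketched as follows. This is only an illustration: the helper names are hypothetical, and appending zero bytes at the end is an assumption, since the payload layout decides where padding actually goes and what it contains.

```python
BLOCK = 16  # the preprocessed payload must be a multiple of 16 bytes


def padding_needed(length: int, block: int = BLOCK) -> int:
    """Return how many pad bytes bring `length` up to a multiple of `block`."""
    return (-length) % block


def pad_payload(payload: bytes, block: int = BLOCK) -> bytes:
    # Assumption: padding is appended as zero bytes. The real payload
    # format decides where the padding sits and what it contains.
    return payload + b"\x00" * padding_needed(len(payload), block)
```

For example, a 37 byte payload needs 11 bytes of padding to reach 48 bytes, while a 48 byte payload needs none.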

+ +

2.2) Gateway processing

+ +

After the preprocessing of messages into a padded payload, the gateway +encrypts the payload with the eight keys, building a checksum block so +that each peer can verify the integrity of the payload at any time, as +well as an end to end verification block for the tunnel endpoint to +verify the integrity of the checksum block. The specific details follow.

+ +

The encryption used is such that decryption +merely requires running over the data with AES in CTR mode, calculating the +SHA256 of a certain fixed portion of the message (bytes 16 through $size-288), +and searching for that hash in the checksum block. There is a fixed number +of hops defined (8 peers after the gateway) so that we can verify the message +without either leaking the position in the tunnel or having the message +continually "shrink" as layers are peeled off. For tunnels shorter than 9 +hops, the tunnel creator will take the place of the excess hops, decrypting +with their keys (for outbound tunnels, this is done at the beginning, and for +inbound tunnels, the end).

+ +

The hard part in the encryption is building that entangled checksum block, +which requires essentially finding out what the hash of the payload will look +like at each step, randomly ordering those hashes, then building a matrix of +what each of those randomly ordered hashes will look like at each step. +To visualize this a bit:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
IV | Payload | eH[0] | eH[1] | eH[2] | eH[3] | eH[4] | eH[5] | eH[6] | eH[7] | V
peer0, key=K[0]
  recv: IV[0], P[0], V[0]
  send: IV[1], P[1], H(P[1]), V[1]
peer1, key=K[1]
  recv:
  send: IV[2], P[2], H(P[2]), V[2]
peer2, key=K[2]
  recv:
  send: IV[3], P[3], H(P[3]), V[3]
peer3, key=K[3]
  recv:
  send: IV[4], P[4], H(P[4]), V[4]
peer4, key=K[4]
  recv:
  send: IV[5], P[5], H(P[5]), V[5]
peer5, key=K[5]
  recv:
  send: IV[6], P[6], H(P[6]), V[6]
peer6, key=K[6]
  recv:
  send: IV[7], P[7], H(P[7]), V[7]
peer7, key=K[7]
  recv:
  send: IV[8], P[8], H(P[8]), V[8]
+ +

In the above, P[8] is the same as the original data being passed through the +tunnel (the preprocessed messages), and V[8] is the SHA256 of eH[0-7] as seen on +peer7 after decryption. For +cells in the matrix "higher up" than the hash, their value is derived by encrypting +the cell below it with the key for the peer below it, using the end of the column +to the left of it as the IV. For cells in the matrix "lower down" than the hash, +they're equal to the cell above them, decrypted by the current peer's key, using +the end of the previous encrypted block on that row as the IV.

+ +

With this randomized matrix of checksum blocks, each peer will be able to find +the hash of the payload, or if it is not there, know that the message is corrupt. +The entanglement by using CTR mode increases the difficulty in tagging the +checksum blocks themselves, but it is still possible for that tagging to go +briefly undetected if the columns after the tagged data have already been used +to check the payload at a peer. In any case, the tunnel endpoint (peer 7) knows +for certain whether any of the checksum blocks have been tagged, as that would +corrupt the verification block (V[8]).

+ +

The IV[0] is a random 16 byte value, and IV[i] is the first 16 bytes of +H(D(IV[i-1], K[i-1])). We don't use the same IV along the path, as that would +allow trivial collusion, and we use the hash of the decrypted value to propagate +the IV so as to hamper key leakage.
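The IV chain can be sketched roughly as below. The function names are hypothetical, and `toy_decrypt_block` is a plain XOR stand-in so that the example runs with only the standard library. It is not AES and offers no security; a real implementation would decrypt the IV with the peer's AES key.

```python
import hashlib
from typing import Callable, List

IV_LEN = 16


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    # NOT AES: an XOR stand-in for D(IV, K), used only to keep this
    # sketch self-contained. zip() stops at the 16 byte block length.
    return bytes(b ^ k for b, k in zip(block, key))


def next_iv(iv: bytes, key: bytes,
            decrypt_block: Callable[[bytes, bytes], bytes] = toy_decrypt_block) -> bytes:
    """IV[i] = first 16 bytes of H(D(IV[i-1], K[i-1]))."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be 16 bytes")
    return hashlib.sha256(decrypt_block(iv, key)).digest()[:IV_LEN]


def iv_chain(iv0: bytes, keys: List[bytes]) -> List[bytes]:
    """Return [IV[0], IV[1], ..., IV[len(keys)]] for the given hop keys."""
    ivs = [iv0]
    for key in keys:
        ivs.append(next_iv(ivs[-1], key))
    return ivs
```

With eight hop keys the chain holds nine values, IV[0] through IV[8], matching the rows of the matrix above.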

+ +

2.3) Participant processing

+ +

When a participant in a tunnel receives a message, they decrypt a layer with their +tunnel key using AES256 in CTR mode with the first 16 bytes as the IV. They then +calculate the hash of what they see as the payload (bytes 16 through $size-288) and +search for that hash within the decrypted checksum block. If no match is found, the +message is discarded. Otherwise, the IV is updated by decrypting it and replacing it +with the first 16 bytes of its hash. The resulting message is then forwarded on to +the next peer for processing.
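The integrity check a participant performs after decrypting its layer might look like the sketch below. It assumes the message has already been decrypted, and that the trailing 288 bytes are eight 32 byte checksum slots followed by the 32 byte verification hash. That split is inferred from the eH[0] through eH[7] columns and the SHA256 size; the names are hypothetical.

```python
import hashlib

IV_LEN = 16
HASH_LEN = 32                                # SHA256 digest size
SLOTS = 8                                    # eH[0] .. eH[7]
TRAILER_LEN = SLOTS * HASH_LEN + HASH_LEN    # 256 + 32 = 288


def split_message(msg: bytes):
    """Split a decrypted tunnel message into IV, payload, checksum block, V."""
    if len(msg) < IV_LEN + TRAILER_LEN:
        raise ValueError("message too short")
    iv = msg[:IV_LEN]
    payload = msg[IV_LEN:len(msg) - TRAILER_LEN]   # bytes 16 through size-288
    checksum_block = msg[len(msg) - TRAILER_LEN:len(msg) - HASH_LEN]
    verification = msg[len(msg) - HASH_LEN:]
    return iv, payload, checksum_block, verification


def payload_hash_present(decrypted: bytes) -> bool:
    """True if SHA256(payload) appears in one of the checksum slots."""
    _, payload, checksum_block, _ = split_message(decrypted)
    digest = hashlib.sha256(payload).digest()
    slots = [checksum_block[i:i + HASH_LEN]
             for i in range(0, len(checksum_block), HASH_LEN)]
    return digest in slots
```

A participant would discard the message when `payload_hash_present` returns False, and otherwise update the IV and forward the message.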

+ +

2.4) Endpoint processing

+ +

When a message reaches the tunnel endpoint, it decrypts and verifies it like +a normal participant. If the checksum block has a valid match, the endpoint then +computes the hash of the checksum block itself (as seen after decryption) and compares +that to the decrypted verification hash (the last 32 bytes). If that verification +hash does not match, the endpoint takes note of the tagging attempt by one of the +tunnel participants and perhaps discards the message.
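The endpoint's extra check can be sketched as below, assuming the message is already decrypted and that the checksum block is the 256 bytes immediately before the final 32 byte verification hash. The function name is hypothetical.

```python
import hashlib
import hmac

IV_LEN = 16
HASH_LEN = 32
CHECKSUM_LEN = 8 * HASH_LEN  # eight 32 byte checksum slots


def verification_ok(decrypted: bytes) -> bool:
    """True if the last 32 bytes equal SHA256 of the checksum block."""
    if len(decrypted) < IV_LEN + CHECKSUM_LEN + HASH_LEN:
        raise ValueError("message too short")
    checksum_block = decrypted[-(CHECKSUM_LEN + HASH_LEN):-HASH_LEN]
    verification = decrypted[-HASH_LEN:]
    expected = hashlib.sha256(checksum_block).digest()
    # compare_digest avoids leaking the mismatch position through timing
    return hmac.compare_digest(expected, verification)
```

A False result indicates that some checksum slot was altered along the path, which is the tagging attempt the endpoint takes note of.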

+ +

At this point, the tunnel endpoint has the preprocessed data sent by the gateway, +which it may then parse out into the included I2NP messages and forward them as +requested in their delivery instructions.

+ +

2.5) Padding

+ +

Several tunnel padding strategies are possible, each with their own merits:

+ + + +

Which to use? No padding is most efficient, random padding is what +we have now, and fixed size would either be an extreme waste or force us to +implement fragmentation. Padding to the closest exponential size (a la Freenet) +seems promising. Perhaps we should gather some stats on the net as to what size +messages are, then see what costs and benefits would arise from different +strategies?
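To get a feel for the cost of exponential padding, here is a small sketch that rounds a message size up to the next power of two and reports the overhead. The 16 byte floor and the function names are assumptions made for illustration.

```python
def exponential_padded_size(length: int, minimum: int = 16) -> int:
    """Smallest size of the form minimum * 2**k that is >= length."""
    size = minimum
    while size < length:
        size *= 2
    return size


def padding_overhead(length: int, minimum: int = 16) -> int:
    """Number of pad bytes added by exponential padding."""
    return exponential_padded_size(length, minimum) - length
```

A 1000 byte message pads to 1024 bytes (24 bytes of overhead), while a 1025 byte message pads to 2048 bytes (1023 bytes of overhead), so the worst case approaches doubling the message size.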

+ +

2.6) Tunnel fragmentation

+ +

For various padding and mixing schemes, it may be useful from an anonymity +perspective to fragment a single I2NP message into multiple parts, each delivered +separately through different tunnel messages. The endpoint may or may not +support that fragmentation (discarding or hanging on to fragments as needed), +and handling fragmentation will not immediately be implemented.

+ +

2.7) Alternatives

+ +

2.7.1) Don't use a checksum block

+ +

One alternative to the above process is to remove the checksum block +completely and replace the verification hash with a plain hash of the payload. +This would simplify processing at the tunnel gateway and save 256 bytes of +bandwidth at each hop. On the other hand, attackers within the tunnel could +trivially adjust the message size to one which is easily traceable by +colluding external observers in addition to later tunnel participants. The +corruption would also incur the waste of the entire bandwidth necessary to +pass on the message. Without the per-hop validation, it would also be possible +to consume excess network resources by building extremely long tunnels, or by +building loops into the tunnel.

+ +

2.7.2) Adjust tunnel processing midstream

+ +

While the simple tunnel routing algorithm should be sufficient for most cases, +there are three alternatives that can be explored:

+ + +

2.7.3) Use bidirectional tunnels

+ +

The current strategy of using two separate tunnels for inbound and outbound +communication is not the only technique available, and it does have anonymity +implications. On the positive side, using separate tunnels lessens the +traffic data exposed for analysis to participants in a tunnel - for instance, +peers in an outbound tunnel from a web browser would only see the traffic of +an HTTP GET, while the peers in an inbound tunnel would see the payload +delivered along the tunnel. With bidirectional tunnels, all participants would +have access to the fact that e.g. 1KB was sent in one direction, then 100KB +in the other. On the negative side, using unidirectional tunnels means that +there are two sets of peers which need to be profiled and accounted for, and +additional care must be taken to address the increased speed of predecessor +attacks. The tunnel pooling and building process outlined below should +minimize the worries of the predecessor attack, though if it were desired, +it wouldn't be much trouble to build both the inbound and outbound tunnels +along the same peers.

+ +

2.7.4) Use smaller hashes

+ +

At the moment, the plan is to reuse the existing SHA256 code and build +all of the checksum and verification hashes as 32 byte SHA256 values. 20 +byte SHA1 would likely be more than sufficient, and perhaps something smaller would do.
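The bandwidth at stake is simple to work out. The sketch below assumes the per-message trailer is eight checksum hashes plus one verification hash, as in the layout described earlier; `trailer_bytes` is a hypothetical helper.

```python
import hashlib

SLOTS = 8  # checksum hashes eH[0] .. eH[7]


def trailer_bytes(digest_size: int, slots: int = SLOTS) -> int:
    """Bytes used by the checksum block plus the verification hash."""
    return (slots + 1) * digest_size


sha256_trailer = trailer_bytes(hashlib.sha256().digest_size)  # 9 * 32 = 288
sha1_trailer = trailer_bytes(hashlib.sha1().digest_size)      # 9 * 20 = 180
savings = sha256_trailer - sha1_trailer                       # 108 bytes per message
```

Under these assumptions, switching to 20 byte hashes would save 108 bytes per tunnel message at every hop.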

+ +

3) Tunnel building

+ +

3.1) Peer selection

+

3.2) Request delivery

+

3.3) Pooling

+ +

4) Tunnel throttling

+ +

5) Mixing/batching

+