{% extends "_layout.html" %} {% block title %}To Do List{% endblock %} {% block content %}
Below is a more detailed (yet still incomplete) discussion of the major areas of future development on the core I2P network, spanning the plausibly planned releases. This does not include stego transports, porting to wireless devices, or tools to secure the local machine, nor does it include client applications that will be essential in I2P's success. There are probably other things that will come up, especially as I2P gets more peer review, but these are the main 'big things'. See also the roadmap. Want to help? Get involved!
The functionality of allowing routers to fully participate within the network while behind firewalls and NATs that they do not control requires some basic restricted route operation (since those peers will not be able to receive inbound connections). To do this successfully, peers are considered in one of two ways: those with publicly reachable interfaces, and those without (hidden peers).
Peers with no publicly reachable address simply connect to a few peers, build a tunnel through them, and publish a reference to those tunnels within their RouterInfo structure in the network database.
When someone wants to contact any particular router, they first must get its RouterInfo from the network database, which will tell them whether they can connect directly (e.g. the peer has a publicly reachable interface) or whether they need to contact them indirectly. Direct connections occur as normal, while indirect connections are done through one of the published tunnels.
When a router just wants to get a message or two to a specific hidden peer, they can simply use the indirect tunnel for sending the payload. However, if the router wants to talk to the hidden peer often (for instance, as part of a tunnel), they will send a garlic routed message through the indirect tunnel to that hidden peer, which unwraps to contain a message that should be sent back to the originating router. The hidden peer then establishes an outbound connection to the originating router, and from then on those two routers can talk to each other directly over that newly established connection.
Of course, that only works if the originating peer can receive connections (they aren't also hidden). However, if the originating peer is hidden, they can simply direct the garlic routed message to come back to the originating peer's inbound tunnel.
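A rough sketch of that decision flow might look like the following; every class and method name here is an illustrative stand-in rather than the actual I2P router API, and the real logic lives in the transport and garlic routing layers:

```java
/**
 * Hypothetical pseudo-flow of the scheme described above. All types and methods
 * are illustrative stand-ins, not real I2P router code.
 */
class HiddenPeerContactSketch {
    static class PeerRef {
        boolean publiclyReachable;        // taken from the RouterInfo in the netDb
        String publishedTunnelGateway;    // reference to one of the hidden peer's tunnels
    }

    void sendTo(PeerRef peer, byte[] message, boolean longLived) {
        if (peer.publiclyReachable) {
            openDirectConnection(peer, message);
        } else if (!longLived) {
            // just a message or two: deliver it through the published inbound tunnel
            sendThroughTunnel(peer.publishedTunnelGateway, message);
        } else {
            // longer-lived relationship: garlic-route a "connect back to me" request
            // through the tunnel; the hidden peer then dials out to us directly
            sendThroughTunnel(peer.publishedTunnelGateway, wrapConnectBackRequest(message));
        }
    }

    // stubs standing in for the real transport and garlic routing machinery
    void openDirectConnection(PeerRef peer, byte[] msg) { }
    void sendThroughTunnel(String gateway, byte[] msg) { }
    byte[] wrapConnectBackRequest(byte[] msg) { return msg; }
}
```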
This is not meant to provide a way for a peer's IP address to be concealed, merely as a way to let people behind firewalls and NATs fully operate within the network. Concealing the peer's IP address adds a little more work, as described below.
With this technique, any router can participate as any part of a tunnel. For efficiency purposes, a hidden peer would be a bad choice for an inbound gateway, and within any given tunnel, two neighboring peers wouldn't want to be hidden. But that is not technically necessary.
Standard TCP communication in Java generally requires blocking socket calls, and to keep a blocked socket from hanging the entire system, those blocking calls are done on their own threads. Our current TCP transport is implemented in a naive fashion - for each peer we are talking to, we have one thread reading and one thread writing. The reader thread simply loops a bunch of read() calls, building I2NP messages and adding them to our internal inbound message queue, and the writer thread pulls messages off a per-connection outbound message queue and shoves the data through write() calls.
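For illustration, a minimal version of this per-connection threading model might look like the following (the class and message framing are simplified stand-ins, not the actual transport code):

```java
import java.io.*;
import java.net.Socket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Sketch of the naive per-connection model: one reader thread and one writer
 * thread for every peer we are talking to.
 */
class NaivePeerConnection {
    private final Socket socket;
    // per-connection outbound queue, plus the router's shared inbound queue
    private final BlockingQueue<byte[]> outbound = new LinkedBlockingQueue<>();
    private final BlockingQueue<byte[]> routerInbound;

    NaivePeerConnection(Socket socket, BlockingQueue<byte[]> routerInbound) {
        this.socket = socket;
        this.routerInbound = routerInbound;
        new Thread(this::readLoop, "reader-" + socket.getInetAddress()).start();
        new Thread(this::writeLoop, "writer-" + socket.getInetAddress()).start();
    }

    private void readLoop() {
        try (DataInputStream in = new DataInputStream(socket.getInputStream())) {
            while (true) {
                int len = in.readInt();              // blocks until data arrives
                byte[] msg = new byte[len];
                in.readFully(msg);                   // build one message
                routerInbound.put(msg);              // hand it to the router core
            }
        } catch (IOException | InterruptedException e) { /* connection closed */ }
    }

    private void writeLoop() {
        try (DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
            while (true) {
                byte[] msg = outbound.take();        // blocks until something to send
                out.writeInt(msg.length);
                out.write(msg);
                out.flush();
            }
        } catch (IOException | InterruptedException e) { /* connection closed */ }
    }

    void send(byte[] msg) { outbound.add(msg); }
}
```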
We do this fairly efficiently, from a CPU perspective - at any time, almost all of these threads are sitting idle, blocked waiting for something to do. However, each thread consumes real resources (on older Linux kernels, for instance, each thread would often be implemented as a fork()'ed process). As the network grows, the number of peers each router will want to talk with will increase (remember, I2P is fully connected, meaning that any given peer should know how to get a message to any other peer, and restricted route support will probably not significantly reduce the number of connections necessary). This means that with a 100,000 router network, each router will have up to 199,998 threads just to deal with the TCP connections!
Obviously, that just won't work. We need to use a transport layer that can scale. In Java, we have two main camps:
Sending and receiving UDP datagrams is a connectionless operation - if we are communicating with 100,000 peers, we simply stick the UDP packets in a queue and have a single thread pulling them off the queue and shoving them out the pipe (and to receive, have a single thread pulling in any UDP packets received and adding them to an inbound queue).
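A bare-bones sketch of that queue-driven approach, assuming plain java.net datagram sockets (the class, queue sizes, and framing are illustrative only):

```java
import java.net.*;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Sketch of the connectionless UDP approach: one sender thread drains an
 * outbound queue and one receiver thread fills an inbound queue, no matter
 * how many peers we are talking to.
 */
class UdpTransportSketch {
    static class OutPacket {
        final InetSocketAddress to; final byte[] data;
        OutPacket(InetSocketAddress to, byte[] data) { this.to = to; this.data = data; }
    }

    private final DatagramSocket socket;
    private final BlockingQueue<OutPacket> outbound = new LinkedBlockingQueue<>();
    private final BlockingQueue<DatagramPacket> inbound = new LinkedBlockingQueue<>();

    UdpTransportSketch(int port) throws SocketException {
        socket = new DatagramSocket(port);
        new Thread(this::sendLoop, "udp-send").start();
        new Thread(this::recvLoop, "udp-recv").start();
    }

    private void sendLoop() {
        try {
            while (true) {
                OutPacket p = outbound.take();
                socket.send(new DatagramPacket(p.data, p.data.length, p.to));
            }
        } catch (Exception e) { /* shutting down */ }
    }

    private void recvLoop() {
        try {
            while (true) {
                byte[] buf = new byte[16 * 1024];
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet);     // one thread handles packets from all peers
                inbound.put(packet);
            }
        } catch (Exception e) { /* shutting down */ }
    }

    void send(InetSocketAddress peer, byte[] data) { outbound.add(new OutPacket(peer, data)); }
}
```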
However, moving to UDP means losing the benefits of TCP's ordering, congestion control, MTU discovery, etc. Implementing that code will take significant work; however, I2P doesn't need it to be as strong as TCP. Specifically, a while ago I was taking some measurements in the simulator and on the live net, and the vast majority of messages transferred would fit easily within a single unfragmented UDP packet, while the largest of the messages would fit within 20-30 packets. As mule pointed out, TCP adds significant overhead when dealing with so many small packets, as the ACKs alone are within an order of magnitude of the payload size. With UDP, we can optimize the transport for both efficiency and resilience by taking into account I2P's particular needs.
It will be a lot of work though.
In Java 1.4, a set of "New I/O" packages was introduced, allowing Java developers to take advantage of the operating system's nonblocking IO capabilities - allowing you to maintain a large number of concurrent IO operations without requiring a separate thread for each. There is much promise with this approach, as we can scalably handle a large number of concurrent connections and we don't have to write a mini-TCP stack with UDP. However, the NIO packages have not proven themselves to be battle-ready, as the Freenet developers found. In addition, requiring NIO support would mean we can't run on any of the open source JVMs like Kaffe, as GNU/Classpath has only limited support for NIO. (note: this may not be the case anymore, as there has been some progress on Classpath's NIO, but it is an unknown quantity)
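As a sketch of what NIO buys us, the following single thread multiplexes accepts and reads across any number of nonblocking sockets (the port and buffer handling are illustrative, not a proposed transport design):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

/** Minimal selector-based read loop: one thread, many connections, no blocking reads. */
class NioReadLoopSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress(8887));   // example port
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(16 * 1024);
        while (true) {
            selector.select();                                // blocks in one place for all sockets
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel peer = server.accept();
                    peer.configureBlocking(false);
                    peer.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel peer = (SocketChannel) key.channel();
                    buf.clear();
                    int read = peer.read(buf);                // nonblocking read
                    if (read < 0) { key.cancel(); peer.close(); }
                    // else: append buf contents to that peer's message reassembly state
                }
            }
        }
    }
}
```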
Another alternative along the same lines is the Non Blocking I/O package - essentially a cleanroom NIO implementation (written before NIO was around). It works by using some native OS code to do the nonblocking IO, passing off events through Java. It seems to be working with Kaffe, though there doesn't seem to be much development activity on it lately (likely due to 1.4's NIO deployment).
Within the current network database and profile management implementation, we have taken the liberty of some practical shortcuts. For instance, we don't have the code to drop peer references from the K-buckets, as we don't have enough peers to even plausibly fill any of them, so instead, we just keep the peers in whatever bucket is appropriate. Another example deals with the peer profiles - the memory required to maintain each peer's profile is small enough that we can keep thousands of full blown profiles in memory without problems. While we have the capacity to use trimmed down profiles (which we can maintain 100s of thousands in memory), we don't have any code to deal with moving a profile from a "minimal profile" to a "full profile", a "full profile" to a "minimal profile", or to simply eject a profile altogether. It just wouldn't be practical to write that code yet, since we aren't going to need it for a while.
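The distinction might look something like this in code - an illustrative shape only, not the actual profile classes, with the summary values chosen as stand-ins for whatever the peer selection logic actually needs:

```java
/** Sketch of the full vs. minimal profile split: the full profile keeps detailed
 *  history, while the minimal one keeps only what peer selection needs, so
 *  hundreds of thousands can fit in memory. */
class PeerProfileSketch {
    static class MinimalProfile {
        byte[] peerHash;
        double speed, capacity, integration;   // summary values used for tier selection
        long lastHeardFrom;
    }

    static class FullProfile extends MinimalProfile {
        long[] recentResponseTimesMs;          // detailed history kept only for active peers
        int tunnelsAgreedTo, tunnelsRejected, tunnelsFailed;
    }

    /** Demote an idle peer: keep the summary values, drop the detailed history. */
    static MinimalProfile demote(FullProfile full) {
        MinimalProfile m = new MinimalProfile();
        m.peerHash = full.peerHash;
        m.speed = full.speed;
        m.capacity = full.capacity;
        m.integration = full.integration;
        m.lastHeardFrom = full.lastHeardFrom;
        return m;
    }
}
```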
That said, as the network grows we are going to want to keep these considerations in mind. We will have some work to do, but we can put it off for later.
Right now, if Alice builds a four hop inbound tunnel starting at Elvis, going to Dave, then to Charlie, then Bob, and finally Alice (A<--B<--C<--D<--E), all five of them will know they are participating in tunnel "123", as the messages are tagged as such. What we want to do is give each hop its own unique tunnel hop ID - Charlie will receive messages on tunnel 234 and forward them to tunnel 876 on Bob. The intent is to prevent Bob or Charlie from knowing that they are in Alice's tunnel, since if every hop in the tunnel shares the same tunnel ID, collusion attacks take little work.
Adding a unique tunnel ID per hop isn't hard, but by itself it is insufficient. If Dave and Bob are under the control of the same attacker, they wouldn't be able to tell they are in the same tunnel from the tunnel ID alone, but they would be able to tell from the message bodies and verification structures simply by comparing them. To prevent that, the tunnel must use layered encryption along the path, both on the payload of the tunneled message and on the verification structure (used to prevent simple tagging attacks). This requires some simple modifications to the TunnelMessage, as well as the inclusion of per-hop secret keys delivered during tunnel creation and given to the tunnel's gateway. We must fix a maximum tunnel length (e.g. 16 hops) and instruct the gateway to encrypt the message to each of the 16 delivered secret keys, in reverse order, and to encrypt the signature of the hash of the (encrypted) payload at each step. The gateway then sends that 16-step encrypted message, along with a 16-step and 16-wide encrypted mapping, to the first hop, which decrypts the mapping and the payload with its secret key, looks in the 16-wide mapping for the entry associated with its own hop (keyed by the per-hop tunnel ID), and verifies the payload by checking it against the associated signed hash.
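To make the reverse-order layering concrete, here is a stripped-down sketch of the gateway-side encryption (assuming the payload is already padded to the AES block size and that per-hop IVs are delivered alongside the keys; the real scheme also has to cover the verification structure and the per-hop mapping):

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/**
 * Illustrative gateway-side layered encryption: the payload is encrypted once
 * per hop, in reverse order, so each hop peels exactly one layer with its own
 * secret key. Not the actual TunnelMessage code.
 */
class LayeredTunnelEncryptionSketch {
    static byte[] encryptForTunnel(byte[] payload, byte[][] hopKeys, byte[][] hopIvs) throws Exception {
        byte[] current = payload;
        // walk the hops from the endpoint back toward the gateway
        for (int hop = hopKeys.length - 1; hop >= 0; hop--) {
            Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE,
                     new SecretKeySpec(hopKeys[hop], "AES"),
                     new IvParameterSpec(hopIvs[hop]));
            current = aes.doFinal(current);   // add one layer for this hop
        }
        return current;                        // hop 0 removes the outermost layer first
    }
}
```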
The tunnel gateway does still have more information than the other peers in the tunnel, and compromising both the gateway and a tunnel participant would allow those peers to collude, exposing the fact that they are both in the same tunnel. In addition, neighboring peers know that they are in the same tunnel anyway, as they know who they send the message to (and with IP-based transports without restricted routes, they know who they got it from). However, the above two techniques significantly increase the cost of gaining meaningful samples when dealing with longer tunnels.
As Connelly proposed to deal with the predecessor attack (2008 update), keeping the order of peers within our tunnels consistent (i.e. whenever Alice creates a tunnel with both Bob and Charlie in it, Bob's next hop is always Charlie) addresses the issue, since Bob doesn't get to substantially sample Alice's peer selection group. We may even want to explicitly allow Bob to participate in Alice's tunnels in only one way - receiving a message from Dave and sending it to Charlie - and if any of those peers are not available to participate in the tunnel (due to overload, network disconnection, etc.), avoid asking Bob to participate in any tunnels until they are back online.
More analysis is necessary for revising the tunnel creation - at the moment, we simply select and order randomly within the peer's top tier of peers (ones with fast + high capacity).
Adding a strict ordering to peers in a tunnel also improves the anonymity of peers with 0-hop tunnels, as otherwise the fact that a peer's gateway is always the same would be particularly damning. However, peers with 0-hop tunnels may want to periodically use a 1-hop tunnel to simulate the failure of a normally reliable gateway peer (so every MTBF*(tunnel duration) minutes, use a 1-hop tunnel).
Without tunnel length permutation, if someone were to somehow detect that a destination had a particular number of hops, they might be able to use that information to identify the router the destination is located on, per the predecessor attack. For instance, if everyone has 2-hop tunnels and Bob receives a tunnel message from Charlie and forwards it to Alice, Bob knows Alice is the final router in the tunnel. If Bob were to identify what destination that tunnel served (by colluding with the gateway and harvesting the network database for all of the LeaseSets), he would know the router on which that destination is located (and without restricted routes, that would mean what IP address the destination is on).
To counter such user behavior, tunnel lengths should be permuted, using algorithms based on the length requested (for example, the 1/MTBF length change for 0-hop tunnels outlined above).
The restricted route functionality described before was simply a functional issue - how to let peers who would not otherwise be able to communicate do so. However, the concept of allowing restricted routes includes additional capabilities. For instance, if a router absolutely cannot risk communicating directly with any untrusted peers, it can set up trusted links through those peers, using them to both send and receive all of its messages. Hidden peers who want to be completely isolated would also refuse to connect to peers who attempt to get them to establish a direct connection (as in the garlic routing technique outlined before) - they can simply take the garlic clove that has a request for delivery to a particular peer and tunnel route that message out one of the hidden peer's trusted links with instructions to forward it as requested.
Within the network, we will want some way to deter people from consuming too many resources or from creating so many peers that they can mount a Sybil attack. Traditional techniques, such as having a peer see who is requesting a resource or who is running a peer, aren't appropriate for use within I2P, as doing so would compromise the anonymity of the system. Instead, we want to make certain requests "expensive".
Hashcash is one technique that we can use to anonymously increase the "cost" of doing certain activities, such as creating a new router identity (done only once on installation), creating a new destination (done only once when creating a service), or requesting that a peer participate in a tunnel (done often, perhaps 2-300 times per hour). We don't know the "correct" cost of each type of certificate yet, but with some research and experimentation, we could set a base level that is sufficiently expensive while not an excessive burden for people with few resources.
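As a toy illustration of the hashcash idea (the hash function, stamp format, and difficulty here are assumptions, not a proposed certificate format), minting a stamp means searching for a nonce whose hash has enough leading zero bits, while verification costs only a single hash:

```java
import java.security.MessageDigest;

/** Toy hashcash-style proof of work over an identity blob. */
class HashcashSketch {
    /** Search for a nonce whose SHA-256 over (identity || nonce) has enough leading zero bits. */
    static long mint(byte[] identity, int difficultyBits) throws Exception {
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        for (long nonce = 0; ; nonce++) {
            sha.reset();
            sha.update(identity);
            sha.update(Long.toString(nonce).getBytes("UTF-8"));
            if (leadingZeroBits(sha.digest()) >= difficultyBits)
                return nonce;    // the verifier recomputes one hash to check the stamp
        }
    }

    static int leadingZeroBits(byte[] hash) {
        int bits = 0;
        for (byte b : hash) {
            if (b == 0) { bits += 8; continue; }
            int v = b & 0xff;
            while ((v & 0x80) == 0) { bits++; v <<= 1; }
            break;
        }
        return bits;
    }
}
```

Raising the difficulty by one bit roughly doubles the expected work for the requester, which is how the "base level" cost could be tuned.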
There are a few other algorithms that we can explore for making those requests for resources "nonfree", and further research on that front is appropriate.
To powerful passive external observers as well as large colluding internal observers, standard tunnel routing is vulnerable to traffic analysis attacks - simply watching the size and frequency of messages being passed between routers. To defend against these, we will want to essentially turn some of the tunnels into their own mix cascades - delaying messages received at the gateway and passing them in batches, reordering them as necessary, and injecting dummy messages (indistinguishable from other "real" tunnel messages by peers in the path). There has been a significant amount of research on these algorithms that we can lean on prior to implementing the various tunnel mixing strategies.
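A minimal batch-and-reorder strategy at a gateway could be sketched as follows; the batch size, dummy construction, and flush policy are placeholders for whatever the research literature actually recommends:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Sketch of a gateway-side mixing strategy: hold, pad with dummies, shuffle, flush. */
class BatchMixSketch {
    private final List<byte[]> pending = new ArrayList<>();
    private final Random random = new Random();
    private static final int BATCH_SIZE = 16;            // illustrative only

    /** A real tunnel message arrives at the gateway and waits for the next flush. */
    synchronized void offer(byte[] tunnelMessage) { pending.add(tunnelMessage); }

    /** Called periodically: pad with dummies, reorder, and emit a fixed-size batch. */
    synchronized List<byte[]> flush() {
        while (pending.size() < BATCH_SIZE)
            pending.add(newDummyMessage());               // dummies match real message size
        Collections.shuffle(pending, random);
        List<byte[]> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;                                     // forward these down the tunnel
    }

    private byte[] newDummyMessage() {
        byte[] dummy = new byte[1024];                    // placeholder size
        random.nextBytes(dummy);
        return dummy;
    }
}
```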
In addition to the anonymity aspects of more varied tunnel operation, there is a functional dimension as well. Each peer only has a certain amount of data they can route for the network, and to keep any particular tunnel from consuming an unreasonable portion of that bandwidth, they will want to include some throttles on the tunnel. For instance, a tunnel may be configured to throttle itself after passing 600 messages (1 per second), 2.4MB (4KBps), or exceeding some moving average (8KBps for the last minute). Excess messages may be delayed or summarily dropped. With this sort of throttling, peers can provide ATM-like QoS support for their tunnels, refusing to agree to allocate more bandwidth than the peer has available.
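For example, a per-tunnel throttle combining the limits mentioned above might look like this (the constants mirror the numbers in the text, and the moving-average decay is an illustrative choice, not a specified algorithm):

```java
/** Sketch of per-tunnel throttling: reject or delay messages once the tunnel
 *  exceeds its message count, total bytes, or recent moving-average rate. */
class TunnelThrottleSketch {
    private long messages, bytes;
    private double avgBytesPerSecond;             // exponentially weighted moving average
    private long lastUpdateMs = System.currentTimeMillis();

    private static final long MAX_MESSAGES = 600;         // ~1 per second over a 10 minute tunnel
    private static final long MAX_BYTES = 2400000L;       // ~4KBps over a 10 minute tunnel
    private static final double MAX_AVG_BPS = 8 * 1024;   // 8KBps over the last minute

    synchronized boolean allow(int messageSize) {
        long now = System.currentTimeMillis();
        double dt = Math.max(1, now - lastUpdateMs) / 1000.0;
        lastUpdateMs = now;
        // decay the average toward the instantaneous rate (roughly a 60 second window)
        double alpha = Math.min(1.0, dt / 60.0);
        avgBytesPerSecond = (1 - alpha) * avgBytesPerSecond + alpha * (messageSize / dt);

        if (messages >= MAX_MESSAGES || bytes + messageSize > MAX_BYTES
                || avgBytesPerSecond > MAX_AVG_BPS)
            return false;                          // caller may delay or drop the message
        messages++;
        bytes += messageSize;
        return true;
    }
}
```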
In addition, we may want to implement code to dynamically reroute tunnels to avoid failed peers or to inject additional hops into the path. This can be done by garlic routing a message to any particular peer in a tunnel with instructions to redefine the next-hop in the tunnel.
Beyond the per-tunnel batching and mixing strategy, there are further capabilities for protecting against powerful attackers, such as allowing each step in a garlic routed path to define a delay or window in which it should be forwarded on. This would enable protections against the long term intersection attack, as a peer could send a message that looks perfectly standard to most peers that pass it along, except at any peers where the clove exposed includes delay instructions.
Selecting tunnels and leases at random for every message creates a large incidence of out-of-order delivery, which prevents the streaming lib from increasing its window size as much as it could. By persisting with the same selections for a given connection, the transfer rate is much faster.
I2P bundled a reply leaseset (typically 1056 bytes) with every outbound client message, which was a massive overhead. Fixed in 0.6.2.
Right now, our ElGamal/AES+SessionTag algorithm works by tagging each encrypted message with a unique random 32 byte nonce (a "session tag"), identifying that message as being encrypted with the associated AES session's key. This prevents peers from distinguishing messages that are part of the same session, since each message has a completely new random tag. To accomplish this, every few messages we bundle a whole new set of session tags within the encrypted message itself, transparently delivering a way to identify future messages. We then have to keep track of which messages are successfully delivered so that we know which tags we may use.
This works fine and is fairly robust; however, it is inefficient in terms of bandwidth usage, as it requires the delivery of these tags ahead of time (and not all tags may be necessary, or some may be wasted, due to their expiration). On average, predelivering the session tag costs 32 bytes per message (the size of a tag). As Taral suggested though, that size can be avoided by replacing the delivery of the tags with a synchronized PRNG - when a new session is established (through an ElGamal encrypted block), both sides seed a PRNG for use and generate the session tags on demand (with the recipient precalculating the next few possible values to handle out of order delivery).
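A sketch of what that synchronized tag generation could look like, assuming a simple hash-based construction seeded from the ElGamal exchange (the actual PRNG choice and windowing policy would need real design work):

```java
import java.security.MessageDigest;

/** Illustrative synchronized-PRNG session tags: both sides derive the same
 *  32 byte tag stream from a shared seed instead of predelivering tags. */
class SessionTagPrngSketch {
    private final byte[] seed;     // shared secret established in the ElGamal block
    private long index;

    SessionTagPrngSketch(byte[] seed) { this.seed = seed.clone(); }

    /** Next 32-byte tag: hash the seed together with a counter (illustrative construction). */
    byte[] nextTag() throws Exception {
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        sha.update(seed);
        sha.update(Long.toString(index++).getBytes("UTF-8"));
        return sha.digest();            // 32 bytes, same size as today's session tags
    }

    /** Recipient side: peek at the next few expected tags to handle out-of-order delivery. */
    byte[][] upcomingTags(int count) throws Exception {
        byte[][] tags = new byte[count][];
        long saved = index;
        for (int i = 0; i < count; i++) tags[i] = nextTag();
        index = saved;                  // don't consume; just precalculate for lookup
        return tags;
    }
}
```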
Since I2P 0.4.2, we have had a full sliding window streaming library, improving upon the older fixed window size and resend delay implementation greatly. However, there are still a few avenues for further optimization: