- Start work on how_netdb

- Split floodfill discussion to new page
- Add javadoc links in common data structures
- Fix javadoc links in how* again
zzz
2010-07-28 15:37:50 +00:00
parent c225ce313e
commit 46e1be1e8c
9 changed files with 559 additions and 432 deletions

View File

@ -58,6 +58,8 @@
256 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/PublicKey.html">Javadoc</a></h4>
<h2 id="type_PrivateKey">PrivateKey</h2>
<h4>Description</h4>
<p>
@ -68,6 +70,8 @@
256 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/PrivateKey.html">Javadoc</a></h4>
<h2 id="type_SessionKey">SessionKey</h2>
<h4>Description</h4>
<p>
@ -78,6 +82,8 @@
32 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/SessionKey.html">Javadoc</a></h4>
<h2 id="type_SigningPublicKey">SigningPublicKey</h2>
<h4>Description</h4>
<p>
@ -88,6 +94,8 @@
128 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/SigningPublicKey.html">Javadoc</a></h4>
<h2 id="type_SigningPrivateKey">SigningPrivateKey</h2>
<h4>Description</h4>
<p>
@ -98,6 +106,8 @@
20 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/SigningPrivateKey.html">Javadoc</a></h4>
<h2 id="type_Signature">Signature</h2>
<h4>Description</h4>
<p>
@ -108,6 +118,8 @@
40 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/Signature.html">Javadoc</a></h4>
<h2 id="type_Hash">Hash</h2>
<h4>Description</h4>
<p>
@ -118,6 +130,8 @@
32 bytes
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/Hash.html">Javadoc</a></h4>
<h2 id="type_TunnelId">TunnelId</h2>
<h4>Description</h4>
<p>
@ -128,6 +142,8 @@
4 byte <a href="#type_Integer">Integer</a>
</p>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/TunnelId.html">Javadoc</a></h4>
<h2 id="type_Certificate">Certificate</h2>
<h4>Description</h4>
<p>
@ -160,6 +176,8 @@ payload :: data
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/Certificate.html">Javadoc</a></h4>
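<p>
For quick reference, the sizes documented above, collected as constants.
The names here are illustrative only; the I2P data classes define their own equivalents.
<pre>
// Field sizes of the common data types, as documented on this page.
public final class CommonStructureSizes {
    public static final int PUBLIC_KEY_BYTES          = 256;
    public static final int PRIVATE_KEY_BYTES         = 256;
    public static final int SESSION_KEY_BYTES         = 32;
    public static final int SIGNING_PUBLIC_KEY_BYTES  = 128;
    public static final int SIGNING_PRIVATE_KEY_BYTES = 20;
    public static final int SIGNATURE_BYTES           = 40;
    public static final int HASH_BYTES                = 32;
    public static final int TUNNEL_ID_BYTES           = 4;
}
</pre>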
<h1>Common structure specification</h1>
@ -207,6 +225,8 @@ certificate :: Certificate
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/RouterIdentity.html">Javadoc</a></h4>
<h2 id="struct_Destination">Destination</h2>
<h4>Description</h4>
<p>
@ -249,6 +269,8 @@ certificate :: Certificate
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/Destination.html">Javadoc</a></h4>
<h2 id="struct_Lease">Lease</h2>
<h4>Description</h4>
<p>
@ -290,6 +312,8 @@ end_date :: Date
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/Lease.html">Javadoc</a></h4>
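<p>
As an illustration, a minimal parser for a single Lease, assuming the layout in
the full specification above (32-byte tunnel gateway hash, 4-byte tunnel ID,
8-byte end date, in network byte order); this is a sketch, not the I2P parser.
<pre>
import java.nio.ByteBuffer;

public class LeaseExample {
    // Parse one 44-byte Lease (big-endian, per the network byte order rule).
    public static void parse(byte[] lease) {
        ByteBuffer buf = ByteBuffer.wrap(lease);
        byte[] gateway = new byte[32];
        buf.get(gateway);                  // SHA256 hash of the gateway router
        int tunnelId = buf.getInt();       // 4-byte TunnelId (interpret as unsigned)
        long endDate = buf.getLong();      // Date, milliseconds since the epoch
        System.out.println("tunnel " + tunnelId + " expires at " + endDate);
    }
}
</pre>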
<h2 id="struct_LeaseSet">LeaseSet</h2>
<h4>Description</h4>
<p>
@ -389,6 +413,8 @@ signature :: Signature
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/LeaseSet.html">Javadoc</a></h4>
<h2 id="struct_RouterAddress">RouterAddress</h2>
@ -432,6 +458,8 @@ options :: Mapping
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/RouterAddress.html">Javadoc</a></h4>
<h2 id="struct_RouterInfo">RouterInfo</h2>
<h4>Description</h4>
<p>
@ -505,4 +533,6 @@ options :: Mapping
{% endfilter %}
</pre>
<h4><a href="http://docs.i2p2.de/core/net/i2p/data/RouterInfo.html">Javadoc</a></h4>
{% endblock %}

View File

@ -36,7 +36,7 @@ block is formatted (in network byte order):
The H(data) is the SHA256 of the data that is encrypted in the ElGamal block,
and is preceded by a random nonzero byte. The data encrypted in the block
can be up to 222 bytes long. Specifically, see
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalEngine.java">[the code]</a>.
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalEngine.html">[the code]</a>.
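<p>
As a sketch, the plaintext block just described (a random nonzero byte, the
32-byte SHA256 of the data, then up to 222 bytes of data) can be assembled as
follows; the ElGamal encryption of the block itself is done in ElGamalEngine.
<pre>
import java.security.MessageDigest;
import java.security.SecureRandom;

public class ElGamalBlockExample {
    // Build the plaintext: 1 random nonzero byte, then H(data), then the data.
    public static byte[] buildBlock(byte[] data) throws Exception {
        if (data.length > 222)
            throw new IllegalArgumentException("data too long");
        SecureRandom rnd = new SecureRandom();
        byte[] nonzero = new byte[1];
        do {
            rnd.nextBytes(nonzero);
        } while (nonzero[0] == 0);
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
        byte[] block = new byte[1 + 32 + data.length];
        block[0] = nonzero[0];
        System.arraycopy(hash, 0, block, 1, 32);
        System.arraycopy(data, 0, block, 33, data.length);
        return block;
    }
}
</pre>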
<p>
ElGamal is never used on its own in I2P, but instead always as part of
<a href="how_elgamalaes">ElGamal/AES+SessionTag</a>.
@ -55,14 +55,14 @@ Using 2 as the generator.
<p>
We use 256bit AES in CBC mode with PKCS#5 padding for 16 byte blocks (aka each block is end
padded with the number of pad bytes). Specifically, see
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixAESEngine.java">[the CBC code]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixAESEngine.html">[the CBC code]</a>
and the Cryptix AES
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixRijndael_Algorithm.java">[implementation]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixRijndael_Algorithm.html">[implementation]</a>
<p>
For situations where we stream AES data, we still use the same algorithm, as implemented in
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESOutputStream.java">[AESOutputStream]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESInputStream.java">[AESInputStream]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESOutputStream.html">[AESOutputStream]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESInputStream.html">[AESInputStream]</a>
<p>
For situations where we know the size of the data to be sent, we AES encrypt the following:
<p>
@ -77,13 +77,13 @@ this entire segment (from H(data) through the end of the random bytes) is AES en
<p>
This code is implemented in the safeEncrypt and safeDecrypt methods of
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESEngine.java">[AESEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESEngine.html">[AESEngine]</a>
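<p>
For illustration, the core AES-256/CBC/PKCS#5 operation via the standard Java
crypto API; the I2P code additionally handles the H(data), length, and random
padding layout described above.
<pre>
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCbcExample {
    // AES-256 in CBC mode with PKCS#5 padding; key is 32 bytes, IV is 16 bytes.
    public static byte[] encrypt(byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c.doFinal(data);
    }
}
</pre>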
<p>
<H2><a name="DSA">DSA</a></H2>
<p>
Signatures are generated and verified with 1024bit DSA, as implemented in
<a href="http://docs.i2p2.de/core/net/i2p/crypto/DSAEngine.java">[DSAEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/DSAEngine.html">[DSAEngine]</a>
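<p>
A minimal JCE sketch of 1024bit DSA signing and verification. Note that the
JCE returns a DER-encoded signature, while I2P's Signature type stores the raw
40-byte value, so the two are not byte-compatible; this only illustrates the algorithm.
<pre>
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class DsaExample {
    public static void main(String[] args) throws Exception {
        // Generate a 1024-bit DSA key pair, then sign and verify a message.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");
        kpg.initialize(1024);
        KeyPair kp = kpg.generateKeyPair();

        byte[] message = "hello".getBytes("UTF-8");
        Signature signer = Signature.getInstance("SHA1withDSA");
        signer.initSign(kp.getPrivate());
        signer.update(message);
        byte[] sig = signer.sign();

        Signature verifier = Signature.getInstance("SHA1withDSA");
        verifier.initVerify(kp.getPublic());
        verifier.update(message);
        System.out.println("valid: " + verifier.verify(sig));
    }
}
</pre>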
<p>
<H3>The DSA constants</H3>
@ -135,7 +135,7 @@ Signatures are generated and verified with 1024bit DSA, as implemented in
<p>
Hashes within I2P are plain old SHA256, as implemented in
<a href="http://docs.i2p2.de/core/net/i2p/crypto/SHA256Generator.java">[SHA256Generator]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/SHA256Generator.html">[SHA256Generator]</a>
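<p>
For reference, the same 32-byte hash via the standard JDK digest:
<pre>
import java.security.MessageDigest;

public class Sha256Example {
    // Compute the SHA-256 hash used throughout I2P, here with the
    // standard JDK provider rather than I2P's SHA256Generator.
    public static byte[] hash(byte[] data) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }
}
</pre>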
<h2>Transports</h2>

View File

@ -38,7 +38,7 @@ The H(data) is the SHA256 value of the data that is encrypted in the ElGamal block
The H(data) is preceded by a random byte that is not zero.
The data in the block can be up to 222 bytes long.
The specification is
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalEngine.java">[in the source]</a>.
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalEngine.html">[in the source]</a>.
<p>
ElGamal is never used on its own in I2P; it is always part of
<a href="how_elgamalaes">ElGamal/AES+SessionTag</a>.
@ -59,14 +59,14 @@ We use 256bit AES in CBC mode with PKCS#5 padding for 16 byte blocks
(which means that each block is end padded with the number of pad bytes
as data).
For the specification, see the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixAESEngine.java">[CBC source]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixAESEngine.html">[CBC source]</a>
and, for the Cryptix AES, the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixRijndael_Algorithm.java">[implementation here]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/CryptixRijndael_Algorithm.html">[implementation here]</a>
<p>
In situations where we stream AES data, we use the same algorithms, as implemented in
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESOutputStream.java">[AESOutputStream]</a> and
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESInputStream.java">[AESInputStream]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESOutputStream.html">[AESOutputStream]</a> and
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESInputStream.html">[AESInputStream]</a>.
<p>
For situations where the size of the data to be sent is known,
@ -83,14 +83,14 @@ end of the random data is AES encrypted (256bit CBC with PKCS#5
<p>
This code is implemented in the safeEncrypt and safeDecrypt methods of the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESEngine.java">[AESEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/AESEngine.html">[AESEngine]</a>.
<p>
<H2>DSA</H2>
<p>
Signatures are generated and verified with 1024bit DSA, as implemented in the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/DSAEngine.java">[DSAEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/DSAEngine.html">[DSAEngine]</a>.
<p>
<H3>The DSA constants</H3>
@ -143,7 +143,7 @@ is implemented.
<p>
Hashes within I2P are the well-known, plain old SHA256, as implemented in the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/SHA256Generator.java">[SHA256Generator]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/SHA256Generator.html">[SHA256Generator]</a>.
<p>
<H2>TCP connections</H2>
@ -170,11 +170,11 @@ use the RouterInfo structure (which contains the ElGamal and DSA keys)
<p>
The basic TCP connection algorithms are implemented in the establishConnection()
method (which in turn calls exchangeKey() and identifyStationToStation()) in
<a href="http://docs.i2p2.de/router/net/i2p/router/transport/tcp/TCPConnection.java">[TCPConnection]</a>.
<a href="http://docs.i2p2.de/router/net/i2p/router/transport/tcp/TCPConnection.html">[TCPConnection]</a>.
<p>
This is extended by the
<a href="http://docs.i2p2.de/router/net/i2p/router/transport/tcp/RestrictiveTCPConnection.java">[RestrictiveTCPConnection]</a>
<a href="http://docs.i2p2.de/router/net/i2p/router/transport/tcp/RestrictiveTCPConnection.html">[RestrictiveTCPConnection]</a>
class, which updates the establishConnection() method to also verify the
protocol version, the clock time, and the publicly reachable IP
address of the peer. (Since we do not yet have any

View File

@ -18,7 +18,7 @@ the three possible conditions:</p>
<p>
If its ElGamal encrypted to us, the message is considered a new session, and
is encrypted per encryptNewSession(...) in
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.java">[ElGamalAESEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.html">[ElGamalAESEngine]</a>
as follows -</p>
<p>An initial ElGamal block, encrypted <a href="how_cryptography">as before</a>:</p>
@ -63,7 +63,7 @@ the old one.</p>
brief period (30 minutes currently) until they are used (and discarded).
They are used by packaging in a message that is not preceded by an
ElGamal block. Instead, it is encrypted per encryptExistingSession(...) in
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.java">[ElGamalAESEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.html">[ElGamalAESEngine]</a>
as follows -</p>
<PRE>

View File

@ -22,7 +22,7 @@ test whether it fits one of the three possibilities:
If it is ElGamal encrypted to us, the message is treated as a new
session and is encrypted per encryptNewSession(...)
in the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.java">[ElGamalAESEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.html">[ElGamalAESEngine]</a>
as follows -</p>
<p>An initial ElGamal block, encrypted <a href="how_cryptography">as before</a>:</p>
@ -67,7 +67,7 @@ replaced.</p>
time (currently 30 minutes) until they are used (and discarded).
They are used for packaging a message without a preceding ElGamal block.
Instead, it is encrypted per encryptExistingSession(...) in the
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.java">[ElGamalAESEngine]</a>
<a href="http://docs.i2p2.de/core/net/i2p/crypto/ElGamalAESEngine.html">[ElGamalAESEngine]</a>
encrypted as follows -</p>
<PRE>

View File

@ -1,14 +1,19 @@
{% extends "_layout.html" %}
{% block title %}How the Network Database (netDb) Works{% endblock %}
{% block content %}
<p>
Updated July 2010, current as of router version 0.8
<h2>Overview</h2>
<p>I2P's netDb is a specialized distributed database, containing
just two types of data - router contact information (<b>RouterInfos</b>) and destination contact
information (<b>LeaseSets</b>). Each piece of data is signed by the appropriate party and verified
by anyone who uses or stores it. In addition, the data has liveliness information
within it, allowing irrelevant entries to be dropped, newer entries to replace
older ones, and, for the paranoid, protection against certain classes of attack.
(Note that this is also why I2P bundles in the necessary code to determine the
correct time).</p>
older ones, and protection against certain classes of attack.
</p>
<p>
The netDb is distributed with a simple technique called "floodfill".
@ -32,17 +37,64 @@ from the SHA256 of the router's identity. The structure itself contains:</p><ul
<li>The router's identity (a 2048bit ElGamal encryption key, a 1024bit DSA signing key, and a certificate)</li>
<li>The contact addresses at which it can be reached (e.g. TCP: example.org port 4108)</li>
<li>When this was published</li>
<li>A set of arbitrary (uninterpreted) text options</li>
<li>A set of arbitrary text options</li>
<li>The signature of the above, generated by the identity's DSA signing key</li>
</ul>
<p>The arbitrary text options are currently used to help debug the network,
publishing various stats about the router's health. These stats will be disabled
by default once I2P 1.0 is out, and can be disabled by adding
"router.publishPeerRankings=false" to the router
<a href="http://localhost:7657/configadvanced.jsp">configuration</a>. The data
published can be seen on the router's <a href="http://localhost:7657/netdb.jsp">netDb</a>
page, but should not be trusted.</p>
<p>
The following text options, while not strictly required, are expected
to be present:
<ul>
<li><b>caps</b>
(Capabilities flags - used to indicate approximate bandwidth and reachability)
<li><b>coreVersion</b>
(The core library version, always the same as the router version)
<li><b>netId</b> = 2
(Basic network compatibility - A router will refuse to communicate with a peer having a different netId)
<li><b>router.version</b>
(Used to determine compatibility with newer features and messages)
<li><b>stat_uptime</b> = 90m
(Always sent as 90m, for compatibility with an older scheme where routers published their actual uptime,
and only sent tunnel requests to peers whose uptime was more than 60m)
</ul>
These values are used by other routers for basic decisions.
Should we connect to this router? Should we attempt to route a tunnel through this router?
The bandwidth capability flag, in particular, is used only to determine whether
the router meets a minimum threshold for routing tunnels.
Above the minimum threshold, the advertised bandwidth is not used or trusted anywhere
in the router, except for display in the user interface and for debugging and network analysis.
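<p>
A hypothetical sketch of such a basic decision over the published options; the
option names match the list above, but the logic is illustrative, not the
router's actual code.
<pre>
import java.util.Properties;

public class RouterInfoOptionsExample {
    static final String OUR_NET_ID = "2";

    // Refuse peers with a different netId, and require a published caps
    // string before considering the router at all.
    public static boolean shouldTalkTo(Properties options) {
        if (!OUR_NET_ID.equals(options.getProperty("netId")))
            return false;                 // incompatible network
        String caps = options.getProperty("caps", "");
        return caps.length() > 0;         // has capability flags
    }
}
</pre>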
<p>Additional text options include
a small number of statistics about the router's health, which are aggregated by
sites such as <a href="http://stats.i2p/">stats.i2p</a>
for network performance analysis and debugging.
These statistics were chosen to provide data crucial to the developers,
such as tunnel build success rates, while balancing the need for such data
with the side-effects that could result from revealing this data.
Current statistics are limited to:
<ul>
<li>1 hour average bandwidth (average of outbound and inbound bandwidth)
<li>Client and exploratory tunnel build success, reject, and timeout rates
<li>1 hour average number of participating tunnels
</ul>
The data
published can be seen in the router's user interface,
but is not used or trusted within the router.
As the network has matured, we have gradually removed most of the published
statistics to improve anonymity, and we plan to remove more in future releases.
</p>
<p>
<a href="common_structures_spec.html#struct_RouterInfo">RouterInfo specification</a>
<p>
<a href="http://docs.i2p2.de/core/net/i2p/data/RouterInfo.html">RouterInfo Javadoc</a>
<h2><a name="leaseSet">LeaseSet</a></h2>
@ -59,7 +111,11 @@ signing key, and certificate) as well as an additional pair of signing and
encryption keys. These additional keys can be used for garlic routing messages
to the router on which the destination is located (though these keys are <b>not</b>
the router's keys - they are generated by the client and given to the router to
use). End to end client messages are still, of course, encrypted with the
use).
FIXME
End to end client messages are still, of course, encrypted with the
destination's public keys.
[UPDATE - This is no longer true; we don't do end-to-end client encryption
any more, as explained in
@ -68,6 +124,31 @@ So is there any use for the first encryption key, signing key, and certificate?
Can they be removed?]
</p>
<p>
<a href="common_structures_spec.html#struct_Lease">Lease specification</a>
<br>
<a href="common_structures_spec.html#struct_LeaseSet">LeaseSet specification</a>
<p>
<a href="http://docs.i2p2.de/core/net/i2p/data/Lease.html">Lease Javadoc</a>
<br>
<a href="http://docs.i2p2.de/core/net/i2p/data/LeaseSet.html">LeaseSet Javadoc</a>
<h3><a name="revoked">Revoked LeaseSets</a></h3>
A LeaseSet may be <i>revoked</i> by publishing a new LeaseSet with zero leases.
<h3><a name="encrypted">Encrypted LeaseSets</a></h3>
In an <i>encrypted</i> LeaseSet, all Leases are encrypted with a separate DSA key.
The leases may only be decoded, and thus the destination may only be contacted,
by those with the key.
There is no flag or other direct indication that the LeaseSet is encrypted.
<h2><a name="bootstrap">Bootstrapping</a></h2>
<p>The netDb is decentralized, however you do need at
@ -201,394 +282,11 @@ The multihoming behavior has been verified with the test eepsite
<a href="http://multihome.i2p/">http://multihome.i2p/</a>.
</p>
<h2><a name="status">History and Status</a></h2>
<h2><a name="history">History</a></h2>
<h3>The Introduction of the Floodfill Algorithm</h3>
<p>
Floodfill was introduced in release 0.6.0.4, keeping Kademlia as a backup algorithm.
</p>
<a href="netdb_discussion.html">Moved to the netdb disussion page</a>.
<p>
(Adapted from posts by jrandom in the old Syndie, Nov. 26, 2005)
<br />
As I've often said, I'm not particularly bound to any specific technology -
what matters to me is what will get results. While I've been working through
various netDb ideas over the last few years, the issues we've faced in the last
few weeks have brought some of them to a head. On the live net,
with the netDb redundancy factor set to 4 peers (meaning we keep sending an
entry to new peers until 4 of them confirm that they've got it) and the
per-peer timeout set to 4 times that peer's average reply time, we're
<b>still</b> getting an average of 40-60 peers sent to before 4 ACK the store.
That means sending 36-56 times as many messages as should go out, each using
tunnels and thereby crossing 2-4 links. Even further, that value is heavily
skewed, as the average number of peers sent to in a 'failed' store (meaning
less than 4 people ACKed the message after 60 seconds of sending messages out)
was in the 130-160 peers range.
</p><p>
This is insane, especially for a network with only perhaps 250 peers on it.
</p><p>
The simplest answer is to say "well, duh jrandom, it's broken. fix it", but
that doesn't quite get to the core of the issue. In line with another current
effort, it's likely that we have a substantial number of network issues due to
restricted routes - peers who cannot talk with some other peers, often due to
NAT or firewall issues. If, say, the K peers closest to a particular netDb
entry are behind a 'restricted route' such that the netDb store message could
reach them but some other peer's netDb lookup message could not, that entry
would be essentially unreachable. Following down those lines a bit further and
taking into consideration the fact that some restricted routes will be created
with hostile intent, it's clear that we're going to have to look closer into a
long term netDb solution.
</p><p>
There are a few alternatives, but two worth mentioning in particular. The
first is to simply run the netDb as a Kademlia DHT using a subset of the full
network, where all of those peers are externally reachable. Peers who are not
participating in the netDb still query those peers but they don't receive
unsolicited netDb store or lookup messages. Participation in the netDb would
be both self-selecting and user-eliminating - routers would choose whether to
publish a flag in their routerInfo stating whether they want to participate
while each router chooses which peers it wants to treat as part of the netDb
(peers who publish that flag but who never give any useful data would be
ignored, essentially eliminating them from the netDb).
</p><p>
Another alternative is a blast from the past, going back to the DTSTTCPW
(Do The Simplest Thing That Could Possibly Work)
mentality - a floodfill netDb, but like the alternative above, using only a
subset of the full network. When a user wants to publish an entry into the
floodfill netDb, they simply send it to one of the participating routers, wait
for an ACK, and then 30 seconds later, query another random participant in the
floodfill netDb to verify that it was properly distributed. If it was, great,
and if it wasn't, just repeat the process. When a floodfill router receives a
netDb store, they ACK immediately and queue off the netDb store to all of their
known netDb peers. When a floodfill router receives a netDb lookup, if they
have the data, they reply with it, but if they don't, they reply with the
hashes for, say, 20 other peers in the floodfill netDb.
</p><p>
Looking at it from a network economics perspective, the floodfill netDb is
quite similar to the original broadcast netDb, except the cost for publishing
an entry is borne mostly by peers in the netDb, rather than by the publisher.
Fleshing this out a bit further and treating the netDb like a blackbox, we can
see the total bandwidth required by the netDb to be:<pre>
recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
</pre>where<pre>
N = number of routers in the entire network
L = average number of client destinations on each router
(+1 for the routerInfo)
F = tunnel failure percentage
R = tunnel rebuild period, as a fraction of the tunnel lifetime
S = average netDb entry size
T = tunnel lifetime
</pre>Plugging in a few values:<pre>
recvKBps = 1000 * (5 + 1) * (1 + 0.05) * (1 + 0.2) * 2KB / 10m
= 25.2KBps
</pre>That, in turn, scales linearly with N (at 100,000 peers, the netDb must
be able to handle netDb store messages totaling 2.5MBps, or, at 300 peers,
7.6KBps).
</p><p>
While the floodfill netDb would have each netDb participant receiving only a
small fraction of the client generated netDb stores directly, they would all
receive all entries eventually, so all of their links should be capable of
handling the full recvKBps. In turn, they'll all need to send
<tt>(recvKBps/sizeof(netDb)) * (sizeof(netDb)-1)</tt> to keep the other
peers in sync.
</p><p>
A floodfill netDb would not require either tunnel routing for netDb operation
or any special selection as to which entries it can answer 'safely', as the
basic assumption is that they are all storing everything. Oh, and with regards
to the netDb disk usage required, it's still fairly trivial for any modern
machine, requiring around 11MB for every 1000 peers <tt>(N * (L + 1) *
S)</tt>.
</p><p>
The Kademlia netDb would cut down on these numbers, ideally bringing them to K
over M times their value, with K = the redundancy factor and M being the number
of routers in the netDb (e.g. 5/100, giving a recvKBps of 126KBps and 536MB at
100,000 routers). The downside of the Kademlia netDb though is the increased
complexity of safe operation in a hostile environment.
</p><p>
What I'm thinking about now is to simply implement and deploy a floodfill netDb
in our existing live network, letting peers who want to use it pick out other
peers who are flagged as members and query them instead of querying the
traditional Kademlia netDb peers. The bandwidth and disk requirements at this
stage are trivial enough (7.6KBps and 3MB disk space) and it will remove the
netDb entirely from the debugging plan - issues that remain to be addressed
will be caused by something unrelated to the netDb.
</p><p>
How would peers be chosen to publish that flag saying they are a part of the
floodfill netDb? At the beginning, it could be done manually as an advanced
config option (ignored if the router is not able to verify its external
reachability). If too many peers set that flag, how do the netDb participants
pick which ones to eject? Again, at the beginning it could be done manually as
an advanced config option (after dropping peers which are unreachable). How do
we avoid netDb partitioning? By having the routers verify that the netDb is
doing the flood fill properly by querying K random netDb peers. How do routers
not participating in the netDb discover new routers to tunnel through? Perhaps
this could be done by sending a particular netDb lookup so that the netDb
router would respond not with peers in the netDb, but with random peers outside
the netDb.
</p><p>
I2P's netDb is very different from traditional load bearing DHTs - it only
carries network metadata, not any actual payload, which is why even a netDb
using a floodfill algorithm will be able to sustain an arbitrary amount of
eepsite/IRC/bt/mail/syndie/etc data. We can even do some optimizations as I2P
grows to distribute that load a bit further (perhaps passing bloom filters
between the netDb participants to see what they need to share), but it seems we
can get by with a much simpler solution for now.
</p><p>
One fact may be worth digging
into - not all leaseSets need to be published in the netDb! In fact, most
don't need to be - only those for destinations which will be receiving
unsolicited messages (aka servers). This is because the garlic wrapped
messages sent from one destination to another already bundle the sender's
leaseSet so that any subsequent send/recv between those two destinations
(within a short period of time) work without any netDb activity.
</p><p>
So, back at those equations, we can change L from 5 to something like 0.1
(assuming only 1 out of every 50 destinations is a server). The previous
equations also brushed over the network load required to answer queries from
clients, but while that is highly variable (based on the user activity), it's
also very likely to be quite insignificant as compared to the publishing
frequency.
</p><p>
Anyway, still no magic, but a nice reduction to nearly 1/5th of the bandwidth/disk
space required (perhaps more later, depending upon whether the routerInfo
distribution goes directly as part of the peer establishment or only through
the netDb).
</p>
<h3>The Disabling of the Kademlia Algorithm</h3>
<p>
Kademlia was completely disabled in release 0.6.1.20.
</p><p>
(this is adapted from an IRC conversation with jrandom 11/07)
<br />
Kademlia requires a minimum level of service that the baseline could not offer (bandwidth, cpu),
even after adding in tiers (pure kad is absurd on that point).
Kademlia just wouldn't work. It was a nice idea, but not for a hostile and fluid environment.
</p>
<h3>Current Status</h3>
<p>The netDb plays a very specific role in the I2P network, and the algorithms
have been tuned towards our needs. This also means that it hasn't been tuned
to address the needs we have yet to run into. I2P is currently
fairly small (a few hundred routers).
There were some calculations that 3-5 floodfill routers should be able to handle
10,000 nodes in the network.
The netDb implementation more than adequately meets our
needs at the moment, but there will likely be further tuning and bugfixing as
the network grows.</p>
<h3>Update of Calculations 03-2008</h3>
<p>Current numbers:
<pre>
recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
</pre>where<pre>
N = number of routers in the entire network
L = average number of client destinations on each router
(+1 for the routerInfo)
F = tunnel failure percentage
R = tunnel rebuild period, as a fraction of the tunnel lifetime
S = average netDb entry size
T = tunnel lifetime
</pre>
Changes in assumptions:
<ul>
<li>L is now about .5, compared to .1 above, due to the popularity of i2psnark
and other apps.
<li>F is about .33, but bugs in tunnel testing are fixed in 0.6.1.33, so it will get much better.
<li>Since netDb is about 2/3 5K routerInfos and 1/3 2K leaseSets, S = 4K.
RouterInfo size is shrinking in 0.6.1.32 and 0.6.1.33 as we remove unnecessary stats.
<li>R = tunnel build period: 0.2 was very low - it was maybe 0.7 -
but build algorithm improvements in 0.6.1.32 should bring it down to about 0.2
as the network upgrades. Call it 0.5 now with half the network at .30 or earlier.
</ul>
<pre> recvKBps = 700 * (0.5 + 1) * (1 + 0.33) * (1 + 0.5) * 4KB / 10m
          ~= 14KBps
</pre>
This just accounts for the stores - what about the queries?
<h3>The Return of the Kademlia Algorithm?</h3>
<p>
(this is adapted from <a href="meeting195.html">the I2P meeting Jan. 2, 2007</a>)
<br />
The Kademlia netDb just wasn't working properly.
Is it dead forever or will it be coming back?
If it comes back, the peers in the Kademlia netDb would be a very limited subset
of the routers in the network (basically an expanded number of floodfill peers, if/when the floodfill peers
cannot handle the load).
But until the floodfill peers cannot handle the load (and other peers cannot be added that can), it's unnecessary.
</p>
<h3>The Future of Floodfill</h3>
<p>
(this is adapted from an IRC conversation with jrandom 11/07)
<br />
Here's a proposal: Capacity class O is automatically floodfill.
Hmm.
Unless we're sure, we might end up with a fancy way of DDoS'ing all O class routers.
This is quite the case: we want to make sure the number of floodfill peers is as small as possible while providing sufficient reachability.
If/when netDb requests fail, then we need to increase the number of floodfill peers, but atm, I'm not aware of a netDb fetch problem.
There are 33 "O" class peers according to my records.
33 is a /lot/ to floodfill to.
</p><p>
So floodfill works best when the number of peers in that pool is firmly limited?
And the size of the floodfill pool shouldn't grow much, even if the network itself gradually would?
3-5 floodfill peers can handle 10K routers iirc (I posted a bunch of numbers on that explaining the details in the old syndie).
Sounds like a difficult requirement to fill with automatic opt-in,
especially if nodes opting in cannot trust data from others.
e.g. "let's see if I'm among the top 5",
and can only trust data about themselves (e.g. "I am definitely O class, and moving 150 KB/s, and up for 123 days").
And top 5 is hostile as well. Basically, it's the same as the tor directory servers - chosen by trusted people (aka devs).
Yeah, right now it could be exploited by opt-in, but that'd be trivial to detect and deal with.
Seems like in the end, we might need something more useful than Kademlia, and have only reasonably capable peers join that scheme.
N class and above should be a big enough quantity to suppress risk of an adversary causing denial of service, I'd hope.
But it would have to be different from floodfill then, in the sense that it wouldn't cause humongous traffic.
Large quantity? For a DHT based netDb?
Not necessarily DHT-based.
</p>
<h3 id="todo">Floodfill TODO List</h3>
<p>
The network was down to only one floodfill for a couple of hours on March 13, 2008
(approx. 18:00 - 20:00 UTC),
and it caused a lot of trouble.
<p>
Two changes implemented in 0.6.1.33 should reduce the disruption caused
by floodfill peer removal or churn:
<ol>
<li>Randomize the floodfill peers used for search each time.
This will get you past the failing ones eventually.
This change also fixed a nasty bug that would sometimes drive the ff search code insane.
<li>Prefer the floodfill peers that are up.
The code now avoids peers that are shitlisted, failing, or not heard from in
half an hour, if possible.
</ol>
<p>
One benefit is faster first contact to an eepsite (i.e. when you had to fetch
the leaseset first). The lookup timeout is 10s, so if you don't start out by
asking a peer that is down, you can save 10s.
<p>
There <i>may</i> be anonymity implications in these changes.
For example, in the floodfill <b>store</b> code, there are comments that
shitlisted peers are not avoided, since a peer could be "shitty" and then
see what happens.
Searches are much less vulnerable than stores -
they're much less frequent, and give less away.
So maybe we don't think we need to worry about it?
But if we want to tweak the changes, it would be easy to
send to a peer listed as "down" or shitlisted anyway, just not
count it as part of the 2 we are sending to
(since we don't really expect a reply).
<p>
There are several places where a floodfill peer is selected - this fix addresses only one -
who a regular peer searches from [2 at a time].
Other places where better floodfill selection should be implemented:
<ol>
<li>Who a regular peer stores to [1 at a time]
(random - need to add qualification, because timeouts are long)
<li>Who a regular peer searches to verify a store [1 at a time]
(random - need to add qualification, because timeouts are long)
<li>Who a ff peer sends in reply to a failed search (3 closest to the search)
<li>Who a ff peer floods to (all other ff peers)
<li>The list of ff peers sent in the NTCP every-6-hour "whisper"
(although this may no longer be necessary due to other ff improvements)
</ol>
<p>
Lots more that could and should be done -
<ul>
<li>
Use the "dbHistory" stats to better rate a floodfill peer's integration
<li>
Use the "dbHistory" stats to immediately react to floodfill peers that don't respond
<li>
Be smarter on retries - retries are handled by an upper layer, not in
FloodOnlySearchJob, so it does another random sort and tries again,
rather than purposefully skipping the ff peers we just tried.
<li>
Improve integration stats more
<li>
Actually use integration stats rather than just floodfill indication in netDb
<li>
Use latency stats too?
<li>
More improvement on recognizing failing floodfill peers
</ul>
<p>
Recently completed -
<ul>
<li>
[In Release 0.6.3]
Implement automatic opt-in
to floodfill for some percentage of class O peers, based on analysis of the network.
<li>
[In Release 0.6.3]
Continue to reduce netDb entry size to reduce floodfill traffic -
we are now at the minimum number of stats required to monitor the network.
<li>
[In Release 0.6.3]
Manual list of floodfill peers to exclude?
(<a href="how_threatmodel.html#blocklist">blocklists</a> by router ident)
<li>
[In Release 0.6.3]
Better floodfill peer selection for stores:
Avoid peers whose netDb is old, or have a recent failed store,
or are forever-shitlisted.
<li>
[In Release 0.6.4]
Prefer already-connected floodfill peers for RouterInfo stores, to
reduce number of direct connections to floodfill peers.
<li>
[In Release 0.6.5]
Peers who are no longer floodfill send their routerInfo in response
to a query, so that the router doing the query will know that peer
is no longer floodfill.
<li>
[In Release 0.6.5]
Further tuning of the requirements to automatically become floodfill
<li>
[In Release 0.6.5]
Fix response time profiling in preparation for favoring fast floodfills
<li>
[In Release 0.6.5]
Improve blocklisting
<li>
[In Release 0.7]
Fix netDb exploration
<li>
[In Release 0.7]
Turn blocklisting on by default, block the known troublemakers
<li>
[Several improvements in recent releases, a continuing effort]
Reduce the resource demands on high-bandwidth and floodfill routers
</ul>
<p>
That's a long list but it will take that much work to
have a network that's resistant to DOS from lots of peers turning the floodfill switch on and off.
Or pretending to be a floodfill router.
None of this was a problem when we had only two ff routers, and they were both up
24/7. Again, jrandom's absence has pointed us to places that need improvement.
</p><p>
To assist in this effort, additional profile data for floodfill peers are
now (as of release 0.6.1.33) displayed on the "Profiles" page in
the router console.
We will use this to analyze which data are appropriate for
rating floodfill peers.
</p>
<p>
The network is currently quite resilient; however,
we will continue to enhance our algorithms for measuring and reacting to the performance and reliability
of floodfill peers. While we are not, at the moment, fully hardened to the potential threats of
malicious floodfills or a floodfill DDOS, most of the infrastructure is in place,
and we are well-positioned to react quickly
should the need arise.
</p>
<h3>Really current status</h3>
<h2><a name="future">Future Work</a></h2>
{% endblock %}

View File

@ -29,7 +29,7 @@ about how long it takes for them to reply to a network database query, how
often their tunnels fail, and how many new peers they are able to introduce
us to, as well as simple data points such as when we last heard from them or
when the last communication error occurred. The specific data points gathered
can be found in the <a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/PeerProfile.java">code</a>
can be found in the <a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/PeerProfile.html">code</a>
</p>
<p>Currently, there is no 'ejection' strategy to get rid of the profiles for
@ -60,7 +60,7 @@ integrated into the network it is, and whether it is failing.</p>
<h3>Speed</h3>
<p>The speed calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/SpeedCalculator.java">here</a>)
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/SpeedCalculator.html">here</a>)
simply goes through the profile and estimates how much data we can
send or receive on a single tunnel through the peer in a minute. For this estimate it just looks at past
performance.
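<p>
A toy estimate in that spirit, averaging recent per-minute throughput samples
observed through the peer; the real SpeedCalculator reads the router's profile
structures, so this is only an illustration.
<pre>
public class SpeedEstimate {
    // Estimate KB per minute through a peer from past per-minute samples.
    public static double estimateKBPerMinute(double[] samples) {
        if (samples.length == 0)
            return 0.0;
        double sum = 0.0;
        for (double s : samples)
            sum += s;
        return sum / samples.length;
    }
}
</pre>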
@ -80,7 +80,7 @@ on today's network.
<h3>Capacity</h3>
<p>The capacity calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/CapacityCalculator.java">here</a>)
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/CapacityCalculator.html">here</a>)
simply goes through the profile and estimates how many tunnels we think the peer
would agree to participate in over the next hour. For this estimate it looks at
how many the peer has agreed to lately, how many the peer rejected, and how many
@ -104,7 +104,7 @@ on today's network.
<h3>Integration</h3>
<p>The integration calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IntegrationCalculator.java">here</a>)
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IntegrationCalculator.html">here</a>)
is important only for the network database (and in turn, only when trying to focus
the 'exploration' of the network to detect and heal splits). At the moment it is
not used for anything though, as the detection code is not necessary for generally
@ -128,7 +128,7 @@ thus eliminating a major weakness in today's floodfill scheme.
<h3>Failing</h3>
<p>The failing calculation (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IsFailingCalculator.java">here</a>)
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IsFailingCalculator.html">here</a>)
keeps track of a few data points and determines whether the peer is overloaded
or is unable to honor its agreements. When a peer is marked as failing, it will
be avoided whenever possible.</p>
@ -157,7 +157,7 @@ across peers as well as to prevent some simple attacks.</p>
</ul>
These groupings are implemented in the <a
href="http://docs.i2p2.de/router/net/i2p/router/peermanager/ProfileOrganizer.java">ProfileOrganizer</a>'s
href="http://docs.i2p2.de/router/net/i2p/router/peermanager/ProfileOrganizer.html">ProfileOrganizer</a>'s
reorganize() method (using the calculateThresholds() and locked_placeProfile() methods in turn).</p>
<h2>To Do</h2>

View File

@ -23,7 +23,7 @@ can tell us. Simpler data points are also included, such as the
time of the last contact or when the last communication error
occurred. The individual data points gathered can be
read in Monotone (link to follow).</p>
<!-- <a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/PeerProfile.java">code</a>
<!-- <a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/PeerProfile.html">code</a>
</p> -->
<p>Currently, there is no "ejection" strategy to get rid of the profiles for peers
@ -55,7 +55,7 @@ integrated into the network it is, and whether it is failing.</p>
<p>The speed calculation (as implemented in the source code)
<!--
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/SpeedCalculator.java">here</a>) -->
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/SpeedCalculator.html">here</a>) -->
simply goes through the profile and estimates how much data we can
send or receive on a single tunnel through the peer in a minute.
For this it simply uses the values of
@ -77,7 +77,7 @@ on the current network.
<h3>Capacity</h3>
<p>The capacity calculation <!--(as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/CapacityCalculator.java">here</a>) -->
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/CapacityCalculator.html">here</a>) -->
goes through the profiles and estimates how many of the tunnels we request
the peer will accept in the next hour. For this, the method looks at
how many tunnels the peer has accepted lately, how
@ -104,7 +104,7 @@ on effectiveness in the current network.
<p>The integration calculation
<!-- (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IntegrationCalculator.java">here</a>) -->
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IntegrationCalculator.html">here</a>) -->
is needed only for the network database (and, once built in, for detecting
and healing network `splits`). At the moment it is not used for anything,
since the code for detecting splits in well-connected networks
@ -131,7 +131,7 @@ weakness in the current floodfill scheme fixed.
<p>The failing calculation
<!-- (as implemented
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IsFailingCalculator.java">here</a>) -->
<a href="http://docs.i2p2.de/router/net/i2p/router/peermanager/IsFailingCalculator.html">here</a>) -->
for a peer checks a few data points from the profiles and detects whether the
peer is overloaded or cannot keep its promises for building tunnels. As soon
as a peer is marked as "failing", it will
@ -165,7 +165,7 @@ relationships to one another: <ul>
<!--
These groupings are implemented in the <a
href="http://docs.i2p2.de/router/net/i2p/router/peermanager/ProfileOrganizer.java">ProfileOrganizer</a>'s
href="http://docs.i2p2.de/router/net/i2p/router/peermanager/ProfileOrganizer.html">ProfileOrganizer</a>'s
reorganize() method (using the calculateThresholds() and locked_placeProfile() methods in turn).</p>
-->

View File

@ -0,0 +1,399 @@
{% extends "_layout.html" %}
{% block title %}Network Database Discussion{% endblock %}
{% block content %}
<p>
NOTE: The following is a discussion of the history of netdb implementation and is not current information.
See <a href="how_networkdatabase.html">the main netdb page</a> for current documentation.
<h2><a name="status">History</a></h2>
<h3>The Introduction of the Floodfill Algorithm</h3>
<p>
Floodfill was introduced in release 0.6.0.4, keeping Kademlia as a backup algorithm.
</p>
<p>
(Adapted from posts by jrandom in the old Syndie, Nov. 26, 2005)
<br />
As I've often said, I'm not particularly bound to any specific technology -
what matters to me is what will get results. While I've been working through
various netDb ideas over the last few years, the issues we've faced in the last
few weeks have brought some of them to a head. On the live net,
with the netDb redundancy factor set to 4 peers (meaning we keep sending an
entry to new peers until 4 of them confirm that they've got it) and the
per-peer timeout set to 4 times that peer's average reply time, we're
<b>still</b> getting an average of 40-60 peers sent to before 4 ACK the store.
That means sending 36-56 times as many messages as should go out, each using
tunnels and thereby crossing 2-4 links. Even further, that value is heavily
skewed, as the average number of peers sent to in a 'failed' store (meaning
less than 4 people ACKed the message after 60 seconds of sending messages out)
was in the 130-160 peers range.
</p><p>
This is insane, especially for a network with only perhaps 250 peers on it.
</p><p>
The simplest answer is to say "well, duh jrandom, it's broken. fix it", but
that doesn't quite get to the core of the issue. In line with another current
effort, it's likely that we have a substantial number of network issues due to
restricted routes - peers who cannot talk with some other peers, often due to
NAT or firewall issues. If, say, the K peers closest to a particular netDb
entry are behind a 'restricted route' such that the netDb store message could
reach them but some other peer's netDb lookup message could not, that entry
would be essentially unreachable. Following down those lines a bit further and
taking into consideration the fact that some restricted routes will be created
with hostile intent, it's clear that we're going to have to look closer into a
long term netDb solution.
</p><p>
There are a few alternatives, but two worth mentioning in particular. The
first is to simply run the netDb as a Kademlia DHT using a subset of the full
network, where all of those peers are externally reachable. Peers who are not
participating in the netDb still query those peers but they don't receive
unsolicited netDb store or lookup messages. Participation in the netDb would
be both self-selecting and user-eliminating - routers would choose whether to
publish a flag in their routerInfo stating whether they want to participate
while each router chooses which peers it wants to treat as part of the netDb
(peers who publish that flag but who never give any useful data would be
ignored, essentially eliminating them from the netDb).
</p><p>
Another alternative is a blast from the past, going back to the DTSTTCPW
(Do The Simplest Thing That Could Possibly Work)
mentality - a floodfill netDb, but like the alternative above, using only a
subset of the full network. When a user wants to publish an entry into the
floodfill netDb, they simply send it to one of the participating routers, wait
for an ACK, and then 30 seconds later, query another random participant in the
floodfill netDb to verify that it was properly distributed. If it was, great,
and if it wasn't, just repeat the process. When a floodfill router receives a
netDb store, they ACK immediately and queue off the netDb store to all of their
known netDb peers. When a floodfill router receives a netDb lookup, if they
have the data, they reply with it, but if they don't, they reply with the
hashes for, say, 20 other peers in the floodfill netDb.
</p><p>
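The publish-and-verify loop just described can be sketched as follows; the
types are illustrative stand-ins, not the router's API, and only the protocol
steps (store, wait for the ACK, wait 30 seconds, verify, repeat) come from
the description above.
<pre>
import java.util.Random;

// Hypothetical types standing in for the real router classes.
interface FloodfillPeer {
    boolean store(byte[] key, byte[] entry);  // true once the peer ACKs the store
    byte[] lookup(byte[] key);                // null if the peer lacks the entry
}

class FloodfillPublisher {
    // Send the store to a random participant, wait 30 seconds, then verify
    // via a second random participant; repeat until confirmed.
    static void publish(byte[] key, byte[] entry, FloodfillPeer[] peers, Random rnd)
            throws InterruptedException {
        while (true) {
            FloodfillPeer target = peers[rnd.nextInt(peers.length)];
            if (!target.store(key, entry))
                continue;                     // no ACK - try another participant
            Thread.sleep(30 * 1000);          // wait before verifying
            FloodfillPeer verifier = peers[rnd.nextInt(peers.length)];
            if (verifier.lookup(key) != null)
                return;                       // properly distributed
        }
    }
}
</pre>
<p>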
Looking at it from a network economics perspective, the floodfill netDb is
quite similar to the original broadcast netDb, except the cost for publishing
an entry is borne mostly by peers in the netDb, rather than by the publisher.
Fleshing this out a bit further and treating the netDb like a blackbox, we can
see the total bandwidth required by the netDb to be:<pre>
recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
</pre>where<pre>
N = number of routers in the entire network
L = average number of client destinations on each router
(+1 for the routerInfo)
F = tunnel failure percentage
R = tunnel rebuild period, as a fraction of the tunnel lifetime
S = average netDb entry size
T = tunnel lifetime
</pre>Plugging in a few values:<pre>
recvKBps = 1000 * (5 + 1) * (1 + 0.05) * (1 + 0.2) * 2KB / 10m
= 25.2KBps
</pre>That, in turn, scales linearly with N (at 100,000 peers, the netDb must
be able to handle netDb store messages totaling 2.5MBps, or, at 300 peers,
7.6KBps).
</p><p>
While the floodfill netDb would have each netDb participant receiving only a
small fraction of the client generated netDb stores directly, they would all
receive all entries eventually, so all of their links should be capable of
handling the full recvKBps. In turn, they'll all need to send
<tt>(recvKBps/sizeof(netDb)) * (sizeof(netDb)-1)</tt> to keep the other
peers in sync.
</p><p>
A floodfill netDb would not require either tunnel routing for netDb operation
or any special selection as to which entries it can answer 'safely', as the
basic assumption is that they are all storing everything. Oh, and with regards
to the netDb disk usage required, it's still fairly trivial for any modern
machine, requiring around 11MB for every 1000 peers <tt>(N * (L + 1) *
S)</tt>.
</p><p>
The Kademlia netDb would cut down on these numbers, ideally bringing them to K
over M times their value, with K = the redundancy factor and M being the number
of routers in the netDb (e.g. 5/100, giving a recvKBps of 126KBps and 536MB at
100,000 routers). The downside of the Kademlia netDb though is the increased
complexity of safe operation in a hostile environment.
</p><p>
What I'm thinking about now is to simply implement and deploy a floodfill netDb
in our existing live network, letting peers who want to use it pick out other
peers who are flagged as members and query them instead of querying the
traditional Kademlia netDb peers. The bandwidth and disk requirements at this
stage are trivial enough (7.6KBps and 3MB disk space) and it will remove the
netDb entirely from the debugging plan - issues that remain to be addressed
will be caused by something unrelated to the netDb.
</p><p>
How would peers be chosen to publish that flag saying they are a part of the
floodfill netDb? At the beginning, it could be done manually as an advanced
config option (ignored if the router is not able to verify its external
reachability). If too many peers set that flag, how do the netDb participants
pick which ones to eject? Again, at the beginning it could be done manually as
an advanced config option (after dropping peers which are unreachable). How do
we avoid netDb partitioning? By having the routers verify that the netDb is
doing the flood fill properly by querying K random netDb peers. How do routers
not participating in the netDb discover new routers to tunnel through? Perhaps
this could be done by sending a particular netDb lookup so that the netDb
router would respond not with peers in the netDb, but with random peers outside
the netDb.
</p><p>
I2P's netDb is very different from traditional load bearing DHTs - it only
carries network metadata, not any actual payload, which is why even a netDb
using a floodfill algorithm will be able to sustain an arbitrary amount of
eepsite/IRC/bt/mail/syndie/etc data. We can even do some optimizations as I2P
grows to distribute that load a bit further (perhaps passing bloom filters
between the netDb participants to see what they need to share), but it seems we
can get by with a much simpler solution for now.
</p><p>
One fact may be worth digging
into - not all leaseSets need to be published in the netDb! In fact, most
don't need to be - only those for destinations which will be receiving
unsolicited messages (aka servers). This is because the garlic wrapped
messages sent from one destination to another already bundle the sender's
leaseSet so that any subsequent send/recv between those two destinations
(within a short period of time) work without any netDb activity.
</p><p>
So, back at those equations, we can change L from 5 to something like 0.1
(assuming only 1 out of every 50 destinations is a server). The previous
equations also brushed over the network load required to answer queries from
clients, but while that is highly variable (based on the user activity), it's
also very likely to be quite insignificant as compared to the publishing
frequency.
</p><p>
Anyway, still no magic, but a nice reduction to nearly 1/5th of the bandwidth/disk
space required (perhaps more later, depending upon whether the routerInfo
distribution goes directly as part of the peer establishment or only through
the netDb).
</p>
<h3>The Disabling of the Kademlia Algorithm</h3>
<p>
Kademlia was completely disabled in release 0.6.1.20.
</p><p>
(this is adapted from an IRC conversation with jrandom 11/07)
<br />
Kademlia requires a minimum level of service that the baseline could not offer (bandwidth, cpu),
even after adding in tiers (pure kad is absurd on that point).
Kademlia just wouldn't work. It was a nice idea, but not for a hostile and fluid environment.
</p>
<h3>Current Status</h3>
<p>The netDb plays a very specific role in the I2P network, and the algorithms
have been tuned towards our needs. This also means that it hasn't been tuned
to address the needs we have yet to run into. I2P is currently
fairly small (a few hundred routers).
There were some calculations that 3-5 floodfill routers should be able to handle
10,000 nodes in the network.
The netDb implementation more than adequately meets our
needs at the moment, but there will likely be further tuning and bugfixing as
the network grows.</p>
<h3>Update of Calculations 03-2008</h3>
<p>Current numbers:
<pre>
recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
</pre>where<pre>
N = number of routers in the entire network
L = average number of client destinations on each router
(+1 for the routerInfo)
F = tunnel failure percentage
R = tunnel rebuild period, as a fraction of the tunnel lifetime
S = average netDb entry size
T = tunnel lifetime
</pre>
Changes in assumptions:
<ul>
<li>L is now about .5, compared to .1 above, due to the popularity of i2psnark
and other apps.
<li>F is about .33, but bugs in tunnel testing are fixed in 0.6.1.33, so it will get much better.
<li>Since netDb is about 2/3 5K routerInfos and 1/3 2K leaseSets, S = 4K.
RouterInfo size is shrinking in 0.6.1.32 and 0.6.1.33 as we remove unnecessary stats.
<li>R = tunnel build period: 0.2 was very low - it was maybe 0.7 -
but build algorithm improvements in 0.6.1.32 should bring it down to about 0.2
as the network upgrades. Call it 0.5 now with half the network at .30 or earlier.
</ul>
<pre> recvKBps = 700 * (0.5 + 1) * (1 + 0.33) * (1 + 0.5) * 4KB / 10m
          ~= 14KBps
</pre>
This just accounts for the stores - what about the queries?
<h3>The Return of the Kademlia Algorithm?</h3>
<p>
(this is adapted from <a href="meeting195.html">the I2P meeting Jan. 2, 2007</a>)
<br />
The Kademlia netDb just wasn't working properly.
Is it dead forever or will it be coming back?
If it comes back, the peers in the Kademlia netDb would be a very limited subset
of the routers in the network (basically an expanded number of floodfill peers, if/when the floodfill peers
cannot handle the load).
But until the floodfill peers cannot handle the load (and other peers cannot be added that can), it's unnecessary.
</p>
<h3>The Future of Floodfill</h3>
<p>
(this is adapted from an IRC conversation with jrandom 11/07)
<br />
Here's a proposal: Capacity class O is automatically floodfill.
Hmm.
Unless we're sure, we might end up with a fancy way of DDoS'ing all O class routers.
This is quite the case: we want to make sure the number of floodfill peers is as small as possible while providing sufficient reachability.
If/when netDb requests fail, then we need to increase the number of floodfill peers, but atm, I'm not aware of a netDb fetch problem.
There are 33 "O" class peers according to my records.
33 is a /lot/ to floodfill to.
</p><p>
So floodfill works best when the number of peers in that pool is firmly limited?
And the size of the floodfill pool shouldn't grow much, even if the network itself gradually would?
3-5 floodfill peers can handle 10K routers iirc (I posted a bunch of numbers on that explaining the details in the old syndie).
Sounds like a difficult requirement to fill with automatic opt-in,
especially if nodes opting in cannot trust data from others.
e.g. "let's see if I'm among the top 5",
and can only trust data about themselves (e.g. "I am definitely O class, and moving 150 KB/s, and up for 123 days").
And top 5 is hostile as well. Basically, it's the same as the tor directory servers - chosen by trusted people (aka devs).
Yeah, right now it could be exploited by opt-in, but that'd be trivial to detect and deal with.
Seems like in the end, we might need something more useful than Kademlia, and have only reasonably capable peers join that scheme.
N class and above should be a big enough quantity to suppress risk of an adversary causing denial of service, I'd hope.
But it would have to be different from floodfill then, in the sense that it wouldn't cause humongous traffic.
Large quantity? For a DHT based netDb?
Not necessarily DHT-based.
</p>
<h3 id="todo">Floodfill TODO List</h3>
<p>
NOTE: The following is not current information.
See <a href="how_networkdatabase.html">the main netdb page</a> for the current status and a list of future work.
<p>
The network was down to only one floodfill for a couple of hours on March 13, 2008
(approx. 18:00 - 20:00 UTC),
and it caused a lot of trouble.
<p>
Two changes implemented in 0.6.1.33 should reduce the disruption caused
by floodfill peer removal or churn:
<ol>
<li>Randomize the floodfill peers used for search each time.
This will get you past the failing ones eventually.
This change also fixed a nasty bug that would sometimes drive the ff search code insane.
<li>Prefer the floodfill peers that are up.
The code now avoids peers that are shitlisted, failing, or not heard from in
half an hour, if possible (see the sketch after this list).
</ol>
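<p>
A sketch of the selection policy in the two changes above; the types are
illustrative stand-ins for the router's profile classes, not actual I2P code.
<pre>
import java.util.Random;

// Hypothetical view of a floodfill peer's health, per the list above.
interface FfPeer {
    boolean isShitlisted();
    boolean isFailing();
    long lastHeardFrom();    // milliseconds since the epoch
}

class FloodfillSelector {
    static final long HALF_HOUR = 30 * 60 * 1000L;

    static FfPeer pickOne(FfPeer[] peers, Random rnd, long now) {
        // 1. Randomize the order each time, so failing peers are eventually skipped.
        for (int i = peers.length - 1; i > 0; i--) {  // Fisher-Yates shuffle
            int j = rnd.nextInt(i + 1);
            FfPeer tmp = peers[i]; peers[i] = peers[j]; peers[j] = tmp;
        }
        // 2. Prefer a peer that appears to be up.
        for (FfPeer p : peers) {
            if (p.isShitlisted())
                continue;
            if (p.isFailing())
                continue;
            if (p.lastHeardFrom() + HALF_HOUR > now)
                return p;
        }
        // Fall back to any peer if none qualify.
        return peers.length > 0 ? peers[0] : null;
    }
}
</pre>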
<p>
One benefit is faster first contact to an eepsite (i.e. when you had to fetch
the leaseset first). The lookup timeout is 10s, so if you don't start out by
asking a peer that is down, you can save 10s.
<p>
There <i>may</i> be anonymity implications in these changes.
For example, in the floodfill <b>store</b> code, there are comments that
shitlisted peers are not avoided, since a peer could be "shitty" and then
see what happens.
Searches are much less vulnerable than stores -
they're much less frequent, and give less away.
So maybe we don't think we need to worry about it?
But if we want to tweak the changes, it would be easy to
send to a peer listed as "down" or shitlisted anyway, just not
count it as part of the 2 we are sending to
(since we don't really expect a reply).
<p>
There are several places where a floodfill peer is selected - this fix addresses only one -
who a regular peer searches from [2 at a time].
Other places where better floodfill selection should be implemented:
<ol>
<li>Who a regular peer stores to [1 at a time]
(random - need to add qualification, because timeouts are long)
<li>Who a regular peer searches to verify a store [1 at a time]
(random - need to add qualification, because timeouts are long)
<li>Who a ff peer sends in reply to a failed search (3 closest to the search)
<li>Who a ff peer floods to (all other ff peers)
<li>The list of ff peers sent in the NTCP every-6-hour "whisper"
(although this may no longer be necessary due to other ff improvements)
</ol>
<p>
Lots more that could and should be done -
<ul>
<li>
Use the "dbHistory" stats to better rate a floodfill peer's integration
<li>
Use the "dbHistory" stats to immediately react to floodfill peers that don't respond
<li>
Be smarter on retries - retries are handled by an upper layer, not in
FloodOnlySearchJob, so it does another random sort and tries again,
rather than purposefully skipping the ff peers we just tried.
<li>
Improve integration stats more
<li>
Actually use integration stats rather than just floodfill indication in netDb
<li>
Use latency stats too?
<li>
More improvement on recognizing failing floodfill peers
</ul>
<p>
Recently completed -
<ul>
<li>
[In Release 0.6.3]
Implement automatic opt-in
to floodfill for some percentage of class O peers, based on analysis of the network.
<li>
[In Release 0.6.3]
Continue to reduce netDb entry size to reduce floodfill traffic -
we are now at the minimum number of stats required to monitor the network.
<li>
[In Release 0.6.3]
Manual list of floodfill peers to exclude?
(<a href="how_threatmodel.html#blocklist">blocklists</a> by router ident)
<li>
[In Release 0.6.3]
Better floodfill peer selection for stores:
Avoid peers whose netDb is old, or have a recent failed store,
or are forever-shitlisted.
<li>
[In Release 0.6.4]
Prefer already-connected floodfill peers for RouterInfo stores, to
reduce number of direct connections to floodfill peers.
<li>
[In Release 0.6.5]
Peers who are no longer floodfill send their routerInfo in response
to a query, so that the router doing the query will know that peer
is no longer floodfill.
<li>
[In Release 0.6.5]
Further tuning of the requirements to automatically become floodfill
<li>
[In Release 0.6.5]
Fix response time profiling in preparation for favoring fast floodfills
<li>
[In Release 0.6.5]
Improve blocklisting
<li>
[In Release 0.7]
Fix netDb exploration
<li>
[In Release 0.7]
Turn blocklisting on by default, block the known troublemakers
<li>
[Several improvements in recent releases, a continuing effort]
Reduce the resource demands on high-bandwidth and floodfill routers
</ul>
<p>
That's a long list but it will take that much work to
have a network that's resistant to DOS from lots of peers turning the floodfill switch on and off.
Or pretending to be a floodfill router.
None of this was a problem when we had only two ff routers, and they were both up
24/7. Again, jrandom's absence has pointed us to places that need improvement.
</p><p>
To assist in this effort, additional profile data for floodfill peers are
now (as of release 0.6.1.33) displayed on the "Profiles" page in
the router console.
We will use this to analyze which data are appropriate for
rating floodfill peers.
</p>
<p>
The network is currently quite resilient; however,
we will continue to enhance our algorithms for measuring and reacting to the performance and reliability
of floodfill peers. While we are not, at the moment, fully hardened to the potential threats of
malicious floodfills or a floodfill DDOS, most of the infrastructure is in place,
and we are well-positioned to react quickly
should the need arise.
</p>
{% endblock %}