diff --git a/www.i2p2/pages/common_structures_spec.html b/www.i2p2/pages/common_structures_spec.html index c4aac2e4..37fd9568 100644 --- a/www.i2p2/pages/common_structures_spec.html +++ b/www.i2p2/pages/common_structures_spec.html @@ -58,6 +58,8 @@ 256 byte Integer
+@@ -68,6 +70,8 @@ 256 byte Integer
+@@ -78,6 +82,8 @@ 32 byte Integer
+@@ -88,6 +94,8 @@ 128 byte Integer
+@@ -98,6 +106,8 @@ 20 byte Integer
+@@ -108,6 +118,8 @@ 40 byte Integer
+@@ -118,6 +130,8 @@ 32 bytes
+@@ -128,6 +142,8 @@ 4 byte Integer
+@@ -160,6 +176,8 @@ payload :: data {% endfilter %} +
@@ -249,6 +269,8 @@ certificate :: Certificate {% endfilter %} +
@@ -290,6 +312,8 @@ end_date :: Date {% endfilter %} +
@@ -389,6 +413,8 @@ signature :: Signature {% endfilter %} +
@@ -505,4 +533,6 @@ options :: Mapping {% endfilter %} +
ElGamal is never used on its own in I2P, but instead always as part of ElGamal/AES+SessionTag. @@ -55,14 +55,14 @@ Using 2 as the generator.
We use 256bit AES in CBC mode with PKCS#5 padding for 16 byte blocks (aka each block is end padded with the number of pad bytes). Specifically, see -[the CBC code] +[the CBC code] and the Cryptix AES -[implementation] +[implementation]
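The padding rule described above (fill the final block with N bytes, each of value N) can be sketched as follows. This is only an illustration of the padding step, not the I2P or Cryptix code; the helper names `pad` and `unpad` are hypothetical. Strictly speaking, PKCS#5 is defined for 8 byte blocks and the same rule for 16 byte blocks is usually called PKCS#7, so the sketch assumes the generalized form.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append n bytes of value n so the length is a multiple of block_size."""
    n = block_size - (len(data) % block_size)  # always in 1..block_size
    return data + bytes([n]) * n


def unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip the padding added by pad(), rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("length must be a positive multiple of the block size")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]
```

Note that input which is already a multiple of 16 bytes still gains a full block of padding, so that the receiver can always remove it unambiguously.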
For situations where we stream AES data, we still use the same algorithm, as implemented in -[AESOutputStream] -[AESInputStream] +[AESOutputStream] +[AESInputStream]
For situations where we know the size of the data to be sent, we AES encrypt the following:
@@ -77,13 +77,13 @@ this entire segment (from H(data) through the end of the random bytes) is AES en
This code is implemented in the safeEncrypt and safeDecrypt methods of -[AESEngine] +[AESEngine]
Signatures are generated and verified with 1024bit DSA, as implemented in -[DSAEngine] +[DSAEngine]
Hashes within I2P are plain old SHA256, as implemented in -[SHA256Generator] +[SHA256Generator]
ElGamal wird in I2P nie alleine genutzt, es ist immer ein Teil von ElGamal/AES+SessionTag. @@ -59,14 +59,14 @@ Wir benutzen 256bit AES im CBC Modus mit dem PKCS#5 Padding für 16 Byte Blo (welches bedeuted, das jeder Block mit der Anzahl der aufgefüllten Bytes als Daten aufgefüllt wird). Zur Spezifikation schaue in den - [CBC Quelltext] + [CBC Quelltext] und für die Crytix AES -[Implementation hier] +[Implementation hier]
In Situationen, in denen wir AES Daten streamen, nutzen wir die selben Algorhytmen, wie sie in -[AESOutputStream] und -[AESInputStream] +[AESOutputStream] und +[AESInputStream] implementiert sind.
Für Situationen, in denen die Gröse der zu sendenden Daten bekannt ist, @@ -83,14 +83,14 @@ Ende der zufälligen Daten ist AES verschlüsselt (256bit CBC mit PKCS#5
Dieser Code ist in den safeEncrypt und safeDecrypt Methoden der -[AESEngine] +[AESEngine] implementiert.
Signaturen werden mit 1024bit DSA erzeugt und verifiziert, wie es in der -[DSAEngine] +[DSAEngine] implementiert ist.
Hashes in I2P sind bekannte, alte SHA256 wie es im -[SHA256Generator] +[SHA256Generator] implementiert ist.
Die grundlegenden TCP Verbindungsalgorhythmen sind in der establishConnection() Funktion implementiert (die wiederrum exchangeKey() und identifyStationToStation()) in -[TCPConnection] +[TCPConnection] aufruft).
Dieses wird erweitert durch die -[RestrictiveTCPConnection] +[RestrictiveTCPConnection] Funktion, die die establishConnection() Methode aktualisiert um die Protokoll Version, die Uhrzeit und die öffentlich erreichbare IP Adresse des Knotens verifizieren zu können. (Da wir noch keine diff --git a/www.i2p2/pages/how_elgamalaes.html b/www.i2p2/pages/how_elgamalaes.html index f8650fcd..1e68a98d 100644 --- a/www.i2p2/pages/how_elgamalaes.html +++ b/www.i2p2/pages/how_elgamalaes.html @@ -18,7 +18,7 @@ the three possible conditions:
If its ElGamal encrypted to us, the message is considered a new session, and is encrypted per encryptNewSession(...) in -[ElGamalAESEngine] +[ElGamalAESEngine] as follows -
An initial ElGamal block, encrypted as before:
@@ -63,7 +63,7 @@ the old one. brief period (30 minutes currently) until they are used (and discarded). They are used by packaging in a message that is not preceded by an ElGamal block. Instead, it is encrypted per encryptExistingSession(...) in -[ElGamalAESEngine] +[ElGamalAESEngine] as follows -diff --git a/www.i2p2/pages/how_elgamalaes_de.html b/www.i2p2/pages/how_elgamalaes_de.html index 4511ab31..40d42629 100644 --- a/www.i2p2/pages/how_elgamalaes_de.html +++ b/www.i2p2/pages/how_elgamalaes_de.html @@ -22,7 +22,7 @@ testen, ob diese in eine der drei Möglichkeiten passt: Falls es für uns ElGamal verschlüsselt ist, wird die Nachricht als neue Session behandelt und wird verschlüsselt mittels encryptNewSession(....) in der -[ElGamalAESEngine] +[ElGamalAESEngine] wie folgend -Ein initialer ElGamal Block, verschlüsselt wie zuvor:
@@ -67,7 +67,7 @@ ersetzt. Zeit (zur Zeit 30 Minuten) gespeichert bis sie gebraucht (und weggeworfen) werden. Sie werden für das Einpacken einer Nachricht ohne vorhergehnden ElGamal Block genutzt. Dabei wird es mit encryptExistingSession(...) in der -[ElGamalAESEngine] +[ElGamalAESEngine] wie folgend verschlüsselt -diff --git a/www.i2p2/pages/how_networkdatabase.html b/www.i2p2/pages/how_networkdatabase.html index bfa2c3ad..45ec4f82 100644 --- a/www.i2p2/pages/how_networkdatabase.html +++ b/www.i2p2/pages/how_networkdatabase.html @@ -1,14 +1,19 @@ {% extends "_layout.html" %} {% block title %}How the Network Database (netDb) Works{% endblock %} {% block content %} + ++Updated July 2010, current as of router version 0.8 + +
Overview
+I2P's netDb is a specialized distributed database, containing just two types of data - router contact information (RouterInfos) and destination contact information (LeaseSets). Each piece of data is signed by the appropriate party and verified by anyone who uses or stores it. In addition, the data has liveliness information within it, allowing irrelevant entries to be dropped, newer entries to replace -older ones, and, for the paranoid, protection against certain classes of attack. -(Note that this is also why I2P bundles in the necessary code to determine the -correct time).
+older ones, and protection against certain classes of attack. +The netDb is distributed with a simple technique called "floodfill". @@ -32,17 +37,64 @@ from the SHA256 of the router's identity. The structure itself contains:
The arbitrary text options are currently used to help debug the network, -publishing various stats about the router's health. These stats will be disabled -by default once I2P 1.0 is out, and can be disabled by adding -"router.publishPeerRankings=false" to the router -configuration. The data -published can be seen on the router's netDb -page, but should not be trusted.
++The following text options, while not strictly required, are expected +to be present: +
Additional text options include +a small number of statistics about the router's health, which are aggregated by +sites such as stats.i2p +for network performance analysis and debugging. +These statistics were chosen to provide data crucial to the developers, +such as tunnel build success rates, while balancing the need for such data +with the side-effects that could result from revealing this data. +Current statistics are limited to: +
+Lease specification
+
+LeaseSet specification
+
+Lease Javadoc
+
+LeaseSet Javadoc
+
+
+
The netDb is decentralized, however you do need at @@ -201,394 +282,11 @@ The multihoming behavior has been verified with the test eepsite http://multihome.i2p/.
--Floodfill was introduced in release 0.6.0.4, keeping Kademlia as a backup algorithm. -
+Moved to the netdb discussion page. -
-(Adapted from posts by jrandom in the old Syndie, Nov. 26, 2005)
-
-As I've often said, I'm not particularly bound to any specific technology -
-what matters to me is what will get results. While I've been working through
-various netDb ideas over the last few years, the issues we've faced in the last
-few weeks have brought some of them to a head. On the live net,
-with the netDb redundancy factor set to 4 peers (meaning we keep sending an
-entry to new peers until 4 of them confirm that they've got it) and the
-per-peer timeout set to 4 times that peer's average reply time, we're
-still getting an average of 40-60 peers sent to before 4 ACK the store.
-That means sending 36-56 times as many messages as should go out, each using
-tunnels and thereby crossing 2-4 links. Even further, that value is heavily
-skewed, as the average number of peers sent to in a 'failed' store (meaning
-less than 4 people ACKed the message after 60 seconds of sending messages out)
-was in the 130-160 peers range.
-
-This is insane, especially for a network with only perhaps 250 peers on it. -
-The simplest answer is to say "well, duh jrandom, it's broken. fix it", but -that doesn't quite get to the core of the issue. In line with another current -effort, it's likely that we have a substantial number of network issues due to -restricted routes - peers who cannot talk with some other peers, often due to -NAT or firewall issues. If, say, the K peers closest to a particular netDb -entry are behind a 'restricted route' such that the netDb store message could -reach them but some other peer's netDb lookup message could not, that entry -would be essentially unreachable. Following down those lines a bit further and -taking into consideration the fact that some restricted routes will be created -with hostile intent, its clear that we're going to have to look closer into a -long term netDb solution. -
-There are a few alternatives, but two worth mentioning in particular. The -first is to simply run the netDb as a Kademlia DHT using a subset of the full -network, where all of those peers are externally reachable. Peers who are not -participating in the netDb still query those peers but they don't receive -unsolicited netDb store or lookup messages. Participation in the netDb would -be both self-selecting and user-eliminating - routers would choose whether to -publish a flag in their routerInfo stating whether they want to participate -while each router chooses which peers it wants to treat as part of the netDb -(peers who publish that flag but who never give any useful data would be -ignored, essentially eliminating them from the netDb). -
-Another alternative is a blast from the past, going back to the DTSTTCPW -(Do The Simplest Thing That Could Possibly Work) -mentality - a floodfill netDb, but like the alternative above, using only a -subset of the full network. When a user wants to publish an entry into the -floodfill netDb, they simply send it to one of the participating routers, wait -for an ACK, and then 30 seconds later, query another random participant in the -floodfill netDb to verify that it was properly distributed. If it was, great, -and if it wasn't, just repeat the process. When a floodfill router receives a -netDb store, they ACK immediately and queue off the netDb store to all of its -known netDb peers. When a floodfill router receives a netDb lookup, if they -have the data, they reply with it, but if they don't, they reply with the -hashes for, say, 20 other peers in the floodfill netDb. -
-Looking at it from a network economics perspective, the floodfill netDb is -quite similar to the original broadcast netDb, except the cost for publishing -an entry is borne mostly by peers in the netDb, rather than by the publisher. -Fleshing this out a bit further and treating the netDb like a blackbox, we can -see the total bandwidth required by the netDb to be:
- recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T -where
- N = number of routers in the entire network - L = average number of client destinations on each router - (+1 for the routerInfo) - F = tunnel failure percentage - R = tunnel rebuild period, as a fraction of the tunnel lifetime - S = average netDb entry size - T = tunnel lifetime -Plugging in a few values:
- recvKBps = 1000 * (5 + 1) * (1 + 0.05) * (1 + 0.2) * 2KB / 10m - = 25.2KBps -That, in turn, scales linearly with N (at 100,000 peers, the netDb must -be able to handle netDb store messages totaling 2.5MBps, or, at 300 peers, -7.6KBps). -
-While the floodfill netDb would have each netDb participant receiving only a -small fraction of the client generated netDb stores directly, they would all -receive all entries eventually, so all of their links should be capable of -handling the full recvKBps. In turn, they'll all need to send -(recvKBps/sizeof(netDb)) * (sizeof(netDb)-1) to keep the other -peers in sync. -
-A floodfill netDb would not require either tunnel routing for netDb operation -or any special selection as to which entries it can answer 'safely', as the -basic assumption is that they are all storing everything. Oh, and with regards -to the netDb disk usage required, its still fairly trivial for any modern -machine, requiring around 11MB for every 1000 peers (N * (L + 1) * -S). -
-The Kademlia netDb would cut down on these numbers, ideally bringing them to K -over M times their value, with K = the redundancy factor and M being the number -of routers in the netDb (e.g. 5/100, giving a recvKBps of 126KBps and 536MB at -100,000 routers). The downside of the Kademlia netDb though is the increased -complexity of safe operation in a hostile environment. -
-What I'm thinking about now is to simply implement and deploy a floodfill netDb -in our existing live network, letting peers who want to use it pick out other -peers who are flagged as members and query them instead of querying the -traditional Kademlia netDb peers. The bandwidth and disk requirements at this -stage are trivial enough (7.6KBps and 3MB disk space) and it will remove the -netDb entirely from the debugging plan - issues that remain to be addressed -will be caused by something unrelated to the netDb. -
-How would peers be chosen to publish that flag saying they are a part of the -floodfill netDb? At the beginning, it could be done manually as an advanced -config option (ignored if the router is not able to verify its external -reachability). If too many peers set that flag, how do the netDb participants -pick which ones to eject? Again, at the beginning it could be done manually as -an advanced config option (after dropping peers which are unreachable). How do -we avoid netDb partitioning? By having the routers verify that the netDb is -doing the flood fill properly by querying K random netDb peers. How do routers -not participating in the netDb discover new routers to tunnel through? Perhaps -this could be done by sending a particular netDb lookup so that the netDb -router would respond not with peers in the netDb, but with random peers outside -the netDb. -
-I2P's netDb is very different from traditional load bearing DHTs - it only -carries network metadata, not any actual payload, which is why even a netDb -using a floodfill algorithm will be able to sustain an arbitrary amount of -eepsite/IRC/bt/mail/syndie/etc data. We can even do some optimizations as I2P -grows to distribute that load a bit further (perhaps passing bloom filters -between the netDb participants to see what they need to share), but it seems we -can get by with a much simpler solution for now. -
-One fact may be worth digging -into - not all leaseSets need to be published in the netDb! In fact, most -don't need to be - only those for destinations which will be receiving -unsolicited messages (aka servers). This is because the garlic wrapped -messages sent from one destination to another already bundles the sender's -leaseSet so that any subsequent send/recv between those two destinations -(within a short period of time) work without any netDb activity. -
-So, back at those equations, we can change L from 5 to something like 0.1 -(assuming only 1 out of every 50 destinations is a server). The previous -equations also brushed over the network load required to answer queries from -clients, but while that is highly variable (based on the user activity), it's -also very likely to be quite insignificant as compared to the publishing -frequency. -
-Anyway, still no magic, but a nice reduction of nearly 1/5th the bandwidth/disk -space required (perhaps more later, depending upon whether the routerInfo -distribution goes directly as part of the peer establishment or only through -the netDb). -
- --Kademlia was completely disabled in release 0.6.1.20. -
-(this is adapted from an IRC conversation with jrandom 11/07)
-
-Kademlia requires a minimum level of service that the baseline could not offer (bandwidth, cpu),
-even after adding in tiers (pure kad is absurd on that point).
-Kademlia just wouldn't work. It was a nice idea, but not for a hostile and fluid environment.
-
The netDb plays a very specific role in the I2P network, and the algorithms -have been tuned towards our needs. This also means that it hasn't been tuned -to address the needs we have yet to run into. I2P is currently -fairly small (a few hundred routers). -There were some calculations that 3-5 floodfill routers should be able to handle -10,000 nodes in the network. -The netDb implementation more than adequately meets our -needs at the moment, but there will likely be further tuning and bugfixing as -the network grows.
- -Current numbers: -
- recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T -where
- N = number of routers in the entire network - L = average number of client destinations on each router - (+1 for the routerInfo) - F = tunnel failure percentage - R = tunnel rebuild period, as a fraction of the tunnel lifetime - S = average netDb entry size - T = tunnel lifetime --Changes in assumptions: -
recvKBps = 700 * (0.5 + 1) * (1 + 0.33) * (1 + 0.5) * 4KB / 10m - ~= 28KBps --This just accounts for the stores - what about the queries? - - -
-(this is adapted from the I2P meeting Jan. 2, 2007)
-
-The Kademlia netDb just wasn't working properly.
-Is it dead forever or will it be coming back?
-If it comes back, the peers in the Kademlia netDb would be a very limited subset
-of the routers in the network (basically an expanded number of floodfill peers, if/when the floodfill peers
-cannot handle the load).
-But until the floodfill peers cannot handle the load (and other peers cannot be added that can), it's unnecessary.
-
-(this is adapted from an IRC conversation with jrandom 11/07)
-
-Here's a proposal: Capacity class O is automatically floodfill.
-Hmm.
-Unless we're sure, we might end up with a fancy way of DDoS'ing all O class routers.
-This is quite the case: we want to make sure the number of floodfill is as small as possible while providing sufficient reachability.
-If/when netDb requests fail, then we need to increase the number of floodfill peers, but atm, I'm not aware of a netDb fetch problem.
-There are 33 "O" class peers according to my records.
-33 is a /lot/ to floodfill to.
-
-So floodfill works best when the number of peers in that pool is firmly limited? -And the size of the floodfill pool shouldn't grow much, even if the network itself gradually would? -3-5 floodfill peers can handle 10K routers iirc (I posted a bunch of numbers on that explaining the details in the old syndie). -Sounds like a difficult requirement to fill with automatic opt-in, -especially if nodes opting in cannot trust data from others. -e.g. "let's see if I'm among the top 5", -and can only trust data about themselves (e.g. "I am definitely O class, and moving 150 KB/s, and up for 123 days"). -And top 5 is hostile as well. Basically, it's the same as the tor directory servers - chosen by trusted people (aka devs). -Yeah, right now it could be exploited by opt-in, but that'd be trivial to detect and deal with. -Seems like in the end, we might need something more useful than Kademlia, and have only reasonably capable peers join that scheme. -N class and above should be a big enough quantity to suppress risk of an adversary causing denial of service, I'd hope. -But it would have to be different from floodfill then, in the sense that it wouldn't cause humongous traffic. -Large quantity? For a DHT based netDb? -Not necessarily DHT-based. -
- --The network was down to only one floodfill for a couple of hours on March 13, 2008 -(approx. 18:00 - 20:00 UTC), -and it caused a lot of trouble. -
-Two changes implemented in 0.6.1.33 should reduce the disruption caused -by floodfill peer removal or churn: -
-One benefit is faster first contact to an eepsite (i.e. when you had to fetch -the leaseset first). The lookup timeout is 10s, so if you don't start out by -asking a peer that is down, you can save 10s. - -
-There may be anonymity implications in these changes. -For example, in the floodfill store code, there are comments that -shitlisted peers are not avoided, since a peer could be "shitty" and then -see what happens. -Searches are much less vulnerable than stores - -they're much less frequent, and give less away. -So maybe we don't think we need to worry about it? -But if we want to tweak the changes, it would be easy to -send to a peer listed as "down" or shitlisted anyway, just not -count it as part of the 2 we are sending to -(since we don't really expect a reply). - -
-There are several places where a floodfill peer is selected - this fix addresses only one - -who a regular peer searches from [2 at a time]. -Other places where better floodfill selection should be implemented: -
-Lots more that could and should be done - -
-Recently completed - -
-That's a long list but it will take that much work to -have a network that's resistant to DOS from lots of peers turning the floodfill switch on and off. -Or pretending to be a floodfill router. -None of this was a problem when we had only two ff routers, and they were both up -24/7. Again, jrandom's absence has pointed us to places that need improvement. - -
-To assist in this effort, additional profile data for floodfill peers are -now (as of release 0.6.1.33) displayed on the "Profiles" page in -the router console. -We will use this to analyze which data are appropriate for -rating floodfill peers. -
- --The network is currently quite resilient, however -we will continue to enhance our algorithms for measuring and reacting to the performance and reliability -of floodfill peers. While we are not, at the moment, fully hardened to the potential threats of -malicious floodfills or a floodfill DDOS, most of the infrastructure is in place, -and we are well-positioned to react quickly -should the need arise. -
- -Currently, there is no 'ejection' strategy to get rid of the profiles for @@ -60,7 +60,7 @@ integrated into the network it is, and whether it is failing.
The speed calculation (as implemented -here) +here) simply goes through the profile and estimates how much data we can send or receive on a single tunnel through the peer in a minute. For this estimate it just looks at past performance. @@ -80,7 +80,7 @@ on today's network.
The capacity calculation (as implemented -here) +here) simply goes through the profile and estimates how many tunnels we think the peer would agree to participate in over the next hour. For this estimate it looks at how many the peer has agreed to lately, how many the peer rejected, and how many @@ -104,7 +104,7 @@ on today's network.
The integration calculation (as implemented -here) +here) is important only for the network database (and in turn, only when trying to focus the 'exploration' of the network to detect and heal splits). At the moment it is not used for anything though, as the detection code is not necessary for generally @@ -128,7 +128,7 @@ thus eliminating a major weakness in today's floodfill scheme.
The failing calculation (as implemented -here) +here) keeps track of a few data points and determines whether the peer is overloaded or is unable to honor its agreements. When a peer is marked as failing, it will be avoided whenever possible.
@@ -157,7 +157,7 @@ across peers as well as to prevent some simple attacks. These groupings are implemented in the ProfileOrganizer's +href="http://docs.i2p2.de/router/net/i2p/router/peermanager/ProfileOrganizer.html">ProfileOrganizer's reorganize() method (using the calculateThresholds() and locked_placeProfile() methods in turn).Momentan gibt es keine "Löschstrategie, um sich der Profile der Knoten @@ -55,7 +55,7 @@ Netzwerk integriert ist und ob er nicht funktioniert.
Die Berechnung der Geschwindigkeit (wie im Quellkode implementiert) +here) --!> wird einfach mit den Profilen durchgeführt und schätzt, wie viele Daten wir durch einen einzigen Tunnel durch diesen Knoten in einer Minute senden oder empfangen können. Dafür werden einfach die Werte der @@ -77,7 +77,7 @@ dem derzeitigem Netzwerk zu bestätigen.
die Berechnung der Kapazität +here) --!> bearbeitet die Profile und schätzt wie viele von uns angefragte Tunnel der Knoten in der nächsten Stunde akzeptiert. Dafür betrachtet die Methode, wieviele Tunnel der Knoten in letzter Zeit akzeptiert hat, wie @@ -104,7 +104,7 @@ zur Effektivität im aktuellem Netzwerk.
Die Berechnung der Integration +here) --!> ist nur für die Netzwerkdatenbank notwendig (und, falls eingebaut, zum Erkennen und reparieren von Netzwerk`splits`). Zur Zeit wird es für nichts benutzt, da der Kode zum Erkennen der Splits in gut verbundenen Netzwerken @@ -131,7 +131,7 @@ Schwachstelle im derzeitigem Floodfillsystem behoben.
Die Berechung zum Versagen diff --git a/www.i2p2/pages/netdb_discussion.html b/www.i2p2/pages/netdb_discussion.html new file mode 100644 index 00000000..f29a1f3c --- /dev/null +++ b/www.i2p2/pages/netdb_discussion.html @@ -0,0 +1,399 @@ +{% extends "_layout.html" %} +{% block title %}Network Database Discussion{% endblock %} +{% block content %} +
+NOTE: The following is a discussion of the history of netdb implementation and is not current information. +See the main netdb page for current documentation. + +
+Floodfill was introduced in release 0.6.0.4, keeping Kademlia as a backup algorithm. +
+ +
+(Adapted from posts by jrandom in the old Syndie, Nov. 26, 2005)
+
+As I've often said, I'm not particularly bound to any specific technology -
+what matters to me is what will get results. While I've been working through
+various netDb ideas over the last few years, the issues we've faced in the last
+few weeks have brought some of them to a head. On the live net,
+with the netDb redundancy factor set to 4 peers (meaning we keep sending an
+entry to new peers until 4 of them confirm that they've got it) and the
+per-peer timeout set to 4 times that peer's average reply time, we're
+still getting an average of 40-60 peers sent to before 4 ACK the store.
+That means sending 36-56 times as many messages as should go out, each using
+tunnels and thereby crossing 2-4 links. Even further, that value is heavily
+skewed, as the average number of peers sent to in a 'failed' store (meaning
+less than 4 people ACKed the message after 60 seconds of sending messages out)
+was in the 130-160 peers range.
+
+This is insane, especially for a network with only perhaps 250 peers on it. +
+The simplest answer is to say "well, duh jrandom, it's broken. fix it", but +that doesn't quite get to the core of the issue. In line with another current +effort, it's likely that we have a substantial number of network issues due to +restricted routes - peers who cannot talk with some other peers, often due to +NAT or firewall issues. If, say, the K peers closest to a particular netDb +entry are behind a 'restricted route' such that the netDb store message could +reach them but some other peer's netDb lookup message could not, that entry +would be essentially unreachable. Following down those lines a bit further and +taking into consideration the fact that some restricted routes will be created +with hostile intent, it's clear that we're going to have to look closer into a +long-term netDb solution. +
+There are a few alternatives, but two worth mentioning in particular. The +first is to simply run the netDb as a Kademlia DHT using a subset of the full +network, where all of those peers are externally reachable. Peers who are not +participating in the netDb still query those peers but they don't receive +unsolicited netDb store or lookup messages. Participation in the netDb would +be both self-selecting and user-eliminating - routers would choose whether to +publish a flag in their routerInfo stating whether they want to participate +while each router chooses which peers it wants to treat as part of the netDb +(peers who publish that flag but who never give any useful data would be +ignored, essentially eliminating them from the netDb). +
+Another alternative is a blast from the past, going back to the DTSTTCPW +(Do The Simplest Thing That Could Possibly Work) +mentality - a floodfill netDb, but like the alternative above, using only a +subset of the full network. When a user wants to publish an entry into the +floodfill netDb, they simply send it to one of the participating routers, wait +for an ACK, and then 30 seconds later, query another random participant in the +floodfill netDb to verify that it was properly distributed. If it was, great, +and if it wasn't, just repeat the process. When a floodfill router receives a +netDb store, they ACK immediately and queue off the netDb store to all of its +known netDb peers. When a floodfill router receives a netDb lookup, if they +have the data, they reply with it, but if they don't, they reply with the +hashes for, say, 20 other peers in the floodfill netDb. +
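The publish, verify, and lookup behavior proposed above can be sketched as a toy in-memory model. This is not I2P code: the class and function names are hypothetical, the 30 second wait before verification is omitted, and the rule that peers receiving a flooded store do not flood it again is an assumption added here to keep the sketch from looping.

```python
import random


class FloodfillRouter:
    """Toy floodfill participant (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.peers = []    # other floodfill routers this one knows about
        self.entries = {}  # key -> netDb entry

    def store(self, key, value, flood=True):
        """Accept an entry, flood it to known peers, and return an ACK."""
        self.entries[key] = value
        if flood:
            for peer in self.peers:
                peer.store(key, value, flood=False)  # assumption: no re-flood
        return True

    def lookup(self, key, max_referrals=20):
        """Reply with the data if held, else with up to 20 other peers."""
        if key in self.entries:
            return ("data", self.entries[key])
        return ("peers", [p.name for p in self.peers[:max_referrals]])


def build_mesh(names):
    """Create routers that all know each other."""
    routers = [FloodfillRouter(n) for n in names]
    for r in routers:
        r.peers = [p for p in routers if p is not r]
    return routers


def publish(floodfills, key, value, rng, max_attempts=3):
    """Store with one participant, then verify with another random one."""
    if len(floodfills) < 2:
        raise ValueError("need at least two floodfill routers to verify")
    for _ in range(max_attempts):
        target = rng.choice(floodfills)
        if not target.store(key, value):
            continue
        verifier = rng.choice([r for r in floodfills if r is not target])
        kind, _ = verifier.lookup(key)
        if kind == "data":
            return True  # entry was distributed; otherwise repeat
    return False
```

For example, `publish(build_mesh(["ff1", "ff2", "ff3"]), "key", "entry", random.Random(0))` stores the entry on every router in the mesh, and a lookup for an unknown key returns the names of the other participants instead of data.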
+Looking at it from a network economics perspective, the floodfill netDb is +quite similar to the original broadcast netDb, except the cost for publishing +an entry is borne mostly by peers in the netDb, rather than by the publisher. +Fleshing this out a bit further and treating the netDb like a blackbox, we can +see the total bandwidth required by the netDb to be:
+ recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T +where
+ N = number of routers in the entire network + L = average number of client destinations on each router + (+1 for the routerInfo) + F = tunnel failure percentage + R = tunnel rebuild period, as a fraction of the tunnel lifetime + S = average netDb entry size + T = tunnel lifetime +Plugging in a few values:
+ recvKBps = 1000 * (5 + 1) * (1 + 0.05) * (1 + 0.2) * 2KB / 10m + = 25.2KBps +That, in turn, scales linearly with N (at 100,000 peers, the netDb must +be able to handle netDb store messages totaling 2.5MBps, or, at 300 peers, +7.6KBps). +
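The arithmetic above can be reproduced with a short script. This is only a restatement of the formula from the post; the function name `recv_kbps` and its parameter names are chosen here for illustration, with S given in KB and T in seconds (10 minutes = 600 s).

```python
def recv_kbps(n, l, f, r, s_kb, t_seconds):
    """recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T"""
    return n * (l + 1) * (1 + f) * (1 + r) * s_kb / t_seconds


# Values from the post: L = 5, F = 0.05, R = 0.2, S = 2 KB, T = 10 minutes.
for peers in (300, 1000, 100_000):
    rate = recv_kbps(peers, l=5, f=0.05, r=0.2, s_kb=2, t_seconds=600)
    print(f"{peers:>7} peers: {rate:.1f} KBps")
```

Since N appears only as a single factor, the result scales linearly with the number of routers, which gives the 7.6KBps, 25.2KBps, and roughly 2.5MBps figures quoted in the post.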
+While the floodfill netDb would have each netDb participant receiving only a +small fraction of the client generated netDb stores directly, they would all +receive all entries eventually, so all of their links should be capable of +handling the full recvKBps. In turn, they'll all need to send +(recvKBps/sizeof(netDb)) * (sizeof(netDb)-1) to keep the other +peers in sync. +
+A floodfill netDb would not require either tunnel routing for netDb operation +or any special selection as to which entries it can answer 'safely', as the +basic assumption is that they are all storing everything. Oh, and with regard +to the netDb disk usage required, it's still fairly trivial for any modern +machine, requiring around 11MB for every 1000 peers (N * (L + 1) * +S). +
+The Kademlia netDb would cut down on these numbers, ideally bringing them to K +over M times their value, with K = the redundancy factor and M being the number +of routers in the netDb (e.g. 5/100, giving a recvKBps of 126KBps and 536MB at +100,000 routers). The downside of the Kademlia netDb though is the increased +complexity of safe operation in a hostile environment. +
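As a rough check of the K over M scaling applied to bandwidth, the sketch below recomputes the floodfill rate at 100,000 routers and multiplies it by 5/100. It also reproduces the floodfill disk estimate N * (L + 1) * S for 1000 peers. The function names are hypothetical, and the inputs (L = 5, F = 0.05, R = 0.2, S = 2 KB, T = 600 s) are the ones used earlier in the post.

```python
def recv_kbps(n, l, f, r, s_kb, t_seconds):
    """Floodfill receive rate: N * (L + 1) * (1 + F) * (1 + R) * S / T."""
    return n * (l + 1) * (1 + f) * (1 + r) * s_kb / t_seconds


def kademlia_share(value, k, m):
    """Scale a floodfill figure by K/M (redundancy factor over netDb size)."""
    return value * k / m


flood_rate = recv_kbps(100_000, l=5, f=0.05, r=0.2, s_kb=2, t_seconds=600)
kad_rate = kademlia_share(flood_rate, k=5, m=100)
print(f"floodfill: {flood_rate:.0f} KBps, Kademlia at 5/100: {kad_rate:.0f} KBps")

# Floodfill disk usage for 1000 peers: N * (L + 1) * S, in KB.
disk_kb_per_1000 = 1000 * (5 + 1) * 2
print(f"disk per 1000 peers: {disk_kb_per_1000 / 1024:.1f} MB")
```

The bandwidth result matches the 126KBps quoted in the post, and the disk figure comes out just under 12MB, in line with the "around 11MB" estimate.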
+What I'm thinking about now is to simply implement and deploy a floodfill netDb +in our existing live network, letting peers who want to use it pick out other +peers who are flagged as members and query them instead of querying the +traditional Kademlia netDb peers. The bandwidth and disk requirements at this +stage are trivial enough (7.6KBps and 3MB disk space) and it will remove the +netDb entirely from the debugging plan - issues that remain to be addressed +will be caused by something unrelated to the netDb. +
+How would peers be chosen to publish that flag saying they are a part of the +floodfill netDb? At the beginning, it could be done manually as an advanced +config option (ignored if the router is not able to verify its external +reachability). If too many peers set that flag, how do the netDb participants +pick which ones to eject? Again, at the beginning it could be done manually as +an advanced config option (after dropping peers which are unreachable). How do +we avoid netDb partitioning? By having the routers verify that the netDb is +doing the flood fill properly by querying K random netDb peers. How do routers +not participating in the netDb discover new routers to tunnel through? Perhaps +this could be done by sending a particular netDb lookup so that the netDb +router would respond not with peers in the netDb, but with random peers outside +the netDb. +
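The partitioning check mentioned above (querying K random netDb peers) could look roughly like the following. This is a hypothetical sketch, not router code: `flood_looks_complete`, the `lookup` callback, and the peer names are all assumptions made for illustration.

```python
import random


def flood_looks_complete(key, netdb_peers, lookup, k=3, rng=None):
    """Ask up to k randomly chosen floodfill peers for `key`.

    `lookup(peer, key)` is a caller-supplied function that returns the entry
    or None. Returns (ok, missing), where `missing` lists sampled peers that
    did not have the entry.
    """
    rng = rng or random.Random()
    peers = list(netdb_peers)
    sample = rng.sample(peers, min(k, len(peers)))
    missing = [peer for peer in sample if lookup(peer, key) is None]
    return len(missing) == 0, missing


# Toy in-memory "network": p3 never received the flooded entry.
stores = {"p1": {"dest": "leaseSet"}, "p2": {"dest": "leaseSet"}, "p3": {}}
ok, missing = flood_looks_complete(
    "dest", stores, lambda peer, key: stores[peer].get(key), k=3
)
```

Sampling at random rather than always asking the same peers means a router that silently drops floods is eventually noticed by someone.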
+I2P's netDb is very different from traditional load bearing DHTs - it only +carries network metadata, not any actual payload, which is why even a netDb +using a floodfill algorithm will be able to sustain an arbitrary amount of +eepsite/IRC/bt/mail/syndie/etc data. We can even do some optimizations as I2P +grows to distribute that load a bit further (perhaps passing bloom filters +between the netDb participants to see what they need to share), but it seems we +can get by with a much simpler solution for now. +
+One fact may be worth digging
+into - not all leaseSets need to be published in the netDb! In fact, most
+don't need to be - only those for destinations which will be receiving
+unsolicited messages (aka servers). This is because the garlic wrapped
+messages sent from one destination to another already bundle the sender's
+leaseSet so that any subsequent send/recv between those two destinations
+(within a short period of time) works without any netDb activity.
+
+So, back at those equations, we can change L from 5 to something like 0.1 +(assuming only 1 out of every 50 destinations is a server). The previous +equations also brushed over the network load required to answer queries from +clients, but while that is highly variable (based on the user activity), it's +also very likely to be quite insignificant as compared to the publishing +frequency. +
+Anyway, still no magic, but a nice reduction to nearly 1/5th of the bandwidth/disk
+space required (perhaps more later, depending upon whether the routerInfo
+distribution goes directly as part of the peer establishment or only through
+the netDb).
+
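Since both the bandwidth and the disk formulas are proportional to (L + 1), the effect of dropping L from 5 to 0.1 can be checked in isolation. A small sketch (the name `entries_per_router` is illustrative only):

```python
def entries_per_router(l):
    """The (L + 1) factor: published leaseSets plus the routerInfo."""
    return l + 1


ratio = entries_per_router(0.1) / entries_per_router(5)
print(round(ratio, 3))  # 0.183
```

A ratio of about 0.18 means the requirement shrinks to a bit under one fifth of the earlier estimate, with all other factors held constant.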
+ ++Kademlia was completely disabled in release 0.6.1.20. +
+(this is adapted from an IRC conversation with jrandom 11/07)
+
+Kademlia requires a minimum level of service that the baseline could not offer (bandwidth, cpu),
+even after adding in tiers (pure kad is absurd on that point).
+Kademlia just wouldn't work. It was a nice idea, but not for a hostile and fluid environment.
+
The netDb plays a very specific role in the I2P network, and the algorithms +have been tuned towards our needs. This also means that it hasn't been tuned +to address the needs we have yet to run into. I2P is currently +fairly small (a few hundred routers). +There were some calculations that 3-5 floodfill routers should be able to handle +10,000 nodes in the network. +The netDb implementation more than adequately meets our +needs at the moment, but there will likely be further tuning and bugfixing as +the network grows.
+ +Current numbers: +
+    recvKBps = N * (L + 1) * (1 + F) * (1 + R) * S / T
+where
+    N = number of routers in the entire network
+    L = average number of client destinations on each router
+        (+1 for the routerInfo)
+    F = tunnel failure percentage
+    R = tunnel rebuild period, as a fraction of the tunnel lifetime
+    S = average netDb entry size
+    T = tunnel lifetime
+
+Changes in assumptions:
+
     recvKBps = 700 * (0.5 + 1) * (1 + 0.33) * (1 + 0.5) * 4KB / 10m
+            ~= 14KBps
+
+This just accounts for the stores - what about the queries?
+
+
+
+(this is adapted from the I2P meeting Jan. 2, 2007)
+
+The Kademlia netDb just wasn't working properly.
+Is it dead forever or will it be coming back?
+If it comes back, the peers in the Kademlia netDb would be a very limited subset
+of the routers in the network (basically an expanded number of floodfill peers, if/when the floodfill peers
+cannot handle the load).
+But until the floodfill peers cannot handle the load (and other peers cannot be added that can), it's unnecessary.
+
+(this is adapted from an IRC conversation with jrandom 11/07)
+
+Here's a proposal: Capacity class O is automatically floodfill.
+Hmm.
+Unless we're sure, we might end up with a fancy way of DDoS'ing all O class routers.
+This is quite the case: we want to make sure the number of floodfill is as small as possible while providing sufficient reachability.
+If/when netDb requests fail, then we need to increase the number of floodfill peers, but atm, I'm not aware of a netDb fetch problem.
+There are 33 "O" class peers according to my records.
+33 is a /lot/ to floodfill to.
+
+So floodfill works best when the number of peers in that pool is firmly limited?
+And the size of the floodfill pool shouldn't grow much, even if the network itself gradually would?
+3-5 floodfill peers can handle 10K routers iirc (I posted a bunch of numbers on that explaining the details in the old syndie).
+Sounds like a difficult requirement to fill with automatic opt-in,
+especially if nodes opting in cannot trust data from others.
+e.g. "let's see if I'm among the top 5",
+and can only trust data about themselves (e.g. "I am definitely O class, and moving 150 KB/s, and up for 123 days").
+And top 5 is hostile as well. Basically, it's the same as the tor directory servers - chosen by trusted people (aka devs).
+Yeah, right now it could be exploited by opt-in, but that'd be trivial to detect and deal with.
+Seems like in the end, we might need something more useful than Kademlia, and have only reasonably capable peers join that scheme.
+N class and above should be a big enough quantity to suppress risk of an adversary causing denial of service, I'd hope.
+But it would have to be different from floodfill then, in the sense that it wouldn't cause humongous traffic.
+Large quantity? For a DHT based netDb?
+Not necessarily DHT-based.
+
+ ++NOTE: The following is not current information. +See the main netdb page for the current status and a list of future work. +
+The network was down to only one floodfill for a couple of hours on March 13, 2008 +(approx. 18:00 - 20:00 UTC), +and it caused a lot of trouble. +
+Two changes implemented in 0.6.1.33 should reduce the disruption caused +by floodfill peer removal or churn: +
+One benefit is faster first contact to an eepsite (i.e. when you had to fetch +the leaseset first). The lookup timeout is 10s, so if you don't start out by +asking a peer that is down, you can save 10s. + +
+There may be anonymity implications in these changes. +For example, in the floodfill store code, there are comments that +shitlisted peers are not avoided, since a peer could be "shitty" and then +see what happens. +Searches are much less vulnerable than stores - +they're much less frequent, and give less away. +So maybe we don't think we need to worry about it? +But if we want to tweak the changes, it would be easy to +send to a peer listed as "down" or shitlisted anyway, just not +count it as part of the 2 we are sending to +(since we don't really expect a reply). + +
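The tweak described above (still sending to a "down" or shitlisted peer, but not counting it toward the 2 queried) can be sketched as follows. Everything here is hypothetical: the function name, the `down` and `shitlisted` sets, and the peer labels are illustrative and are not taken from the router source.

```python
def pick_search_targets(floodfills, down, shitlisted, count=2, include_suspect=False):
    """Split floodfill peers into counted targets and optional extras.

    counted: up to `count` peers that are neither down nor shitlisted; these
             are the ones a reply is actually expected from.
    extras:  suspect peers that are queried anyway when `include_suspect` is
             set, so that being marked bad does not change what a peer sees.
    """
    counted, extras = [], []
    for peer in floodfills:
        if peer in down or peer in shitlisted:
            if include_suspect:
                extras.append(peer)
            continue
        if len(counted) < count:
            counted.append(peer)
    return counted, extras


counted, extras = pick_search_targets(
    ["a", "b", "c", "d"], down={"a"}, shitlisted={"c"}, include_suspect=True
)
```

The sketch walks the candidates in the order given; a real selection would presumably randomize that order first.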
+There are several places where a floodfill peer is selected - this fix addresses only one - +who a regular peer searches from [2 at a time]. +Other places where better floodfill selection should be implemented: +
+Lots more that could and should be done - +
+Recently completed - +
+That's a long list but it will take that much work to +have a network that's resistant to DOS from lots of peers turning the floodfill switch on and off. +Or pretending to be a floodfill router. +None of this was a problem when we had only two ff routers, and they were both up +24/7. Again, jrandom's absence has pointed us to places that need improvement. + +
+To assist in this effort, additional profile data for floodfill peers are +now (as of release 0.6.1.33) displayed on the "Profiles" page in +the router console. +We will use this to analyze which data are appropriate for +rating floodfill peers. +
+
+
+The network is currently quite resilient; however,
+we will continue to enhance our algorithms for measuring and reacting to the performance and reliability
+of floodfill peers. While we are not, at the moment, fully hardened to the potential threats of
+malicious floodfills or a floodfill DDOS, most of the infrastructure is in place,
+and we are well-positioned to react quickly
+should the need arise.
+
+ + +{% endblock %}