-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi y'all, belated status notes this week

* Index

1) Net status
2) Router dev status
3) Syndie rationale continued
4) Syndie dev status
5) Distributed version control
6) ???

* 1) Net status

The past week or two have been fairly stable on irc and other
services, though dev.i2p/squid.i2p/www.i2p/cvs.i2p had a few bumps
(due to temporary OS-related issues). Things seem to be at a steady
state at the moment.

* 2) Router dev status

The flip side to the Syndie discussion is "so, what does that mean
for the router?", and to answer that, let me explain a bit about
where router development stands right now.

On the whole, the thing holding the router back from 1.0 is, in my
view, its performance, not its anonymity properties. Certainly,
there are anonymity issues to improve, but while we get pretty good
performance for an anonymous network, that performance is not
sufficient for wider use. In addition, improvements to the anonymity
of the network will not improve its performance (in most instances I
can think of, anonymity improvements reduce throughput and increase
latency). We need to sort out the performance issues first: if the
performance is insufficient, the whole system is insufficient,
regardless of how strong its anonymity techniques are.

So, what is keeping our performance back? Oddly enough, it seems to
be our CPU usage. Before we get to exactly why, a little more
background first.

- To prevent partitioning attacks, we all need to plausibly build
  our tunnels from the same pool of routers.
- To allow the tunnels to be of manageable length (and source
  routed), the routers in that pool must be directly reachable by
  anyone.
- The bandwidth cost of receiving and rejecting tunnel join requests
  exceeds the capacity of dialup users on burst.

Therefore, we need tiers of routers - some globally reachable with
high bandwidth limits (tier A), some not (tier B). This has in
effect already been implemented through the capacity information in
the netDb, and as of a day or two ago, the ratio of tier B to tier A
has been around 3 to 1 (278 routers of cap K vs. 93 of cap L, M, N,
or O).

Now, there are basically two scarce resources to be managed in
tier A - bandwidth and CPU. Bandwidth can be managed by the usual
means (split the load across a wide pool, have some peers handle
insane amounts [e.g. those on T3s], and reject or throttle
individual tunnels and connections).

Managing CPU usage is harder. The primary CPU bottleneck seen on
tier A routers is the decryption of tunnel build requests. Large
routers can be (and are) entirely consumed by this activity - for
instance, the lifetime average tunnel decrypt time on one of my
routers is 225ms, and the lifetime *average* frequency of tunnel
request decryption is 254 events per 60 seconds, or 4.2 per second.
Simply multiplying those two together shows that roughly 95% of the
CPU is consumed by tunnel request decryption alone (and that doesn't
take into consideration the spikes in the event counts). That router
still somehow manages to participate in 4000-6000 tunnels at a time,
accepting approximately 80% of the decrypted requests.

Unfortunately, because the CPU on that router is so heavily loaded,
it has to drop a significant number of tunnel build requests before
they can even be decrypted (otherwise the requests would sit on the
queue so long that even if they were accepted, the original
requestor would have considered them lost, or considered the router
too loaded to do anything anyway). In that light, the router's 80%
accept rate looks much worse - over its lifetime, it decrypted
around 250k requests (meaning around 200k were accepted), but it had
to drop around 430k requests in the decrypt queue due to CPU
overload (turning that 80% accept rate into 30%).

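To make that arithmetic explicit, here is a quick back-of-envelope
sketch (all numbers taken from the stats above):

    # Back-of-envelope check of the CPU load and the effective accept
    # rate, using the lifetime stats quoted above.
    decrypt_ms = 225.0      # avg time to decrypt one tunnel build request
    events_per_min = 254.0  # avg request decryptions per 60 seconds

    busy = (decrypt_ms / 1000.0) * (events_per_min / 60.0)
    print("CPU busy decrypting: %.0f%%" % (busy * 100))        # ~95%

    decrypted = 250000           # requests actually decrypted
    accepted = decrypted * 0.80  # ~200k accepted
    dropped = 430000             # dropped undecrypted, CPU overload

    effective = accepted / (decrypted + dropped)
    print("Effective accept rate: %.0f%%" % (effective * 100)) # ~29%
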
The solutions seem to lie along the lines of reducing the CPU cost
of tunnel request decryption. If we cut the CPU time by an order of
magnitude, that would increase the tier A routers' capacity
substantially, thereby reducing rejections (both explicit and
implicit, due to dropped requests). That in turn would increase the
tunnel build success rate, thereby reducing the frequency of lease
expirations, which would then reduce the bandwidth load on the
network due to tunnel rebuilding.

One method for doing this would be to change the tunnel build
requests from using 2048-bit ElGamal to, say, 1024-bit or 768-bit.
The problem there, though, is that if you break the encryption on a
tunnel build request message, you know the full path of the tunnel.
Even if we went this route, how much would it buy us? An improvement
of an order of magnitude in the decryption time could be wiped out
by an increase of an order of magnitude in the ratio of tier B to
tier A (aka the freeriders problem), and then we'd be stuck, since
there's no way we could move to 512- or 256-bit ElGamal (and still
look at ourselves in the mirror ;)

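For a rough sense of what the key size buys: ElGamal decryption is
dominated by a modular exponentiation, whose cost grows roughly
cubically with the modulus size, so halving the modulus should cut
decrypt time by around 8x. A purely illustrative timing sketch (not
the router's actual crypto code, and absolute numbers depend on the
machine):

    # Time the modular exponentiation that dominates ElGamal
    # decryption at a few modulus sizes. Illustrative only.
    import random, time

    def modexp_ms(bits, trials=20):
        random.seed(42)
        m = random.getrandbits(bits) | (1 << (bits - 1)) | 1  # odd, full-size
        base = random.getrandbits(bits) % m
        exp = random.getrandbits(bits)
        start = time.perf_counter()
        for _ in range(trials):
            pow(base, exp, m)
        return (time.perf_counter() - start) * 1000.0 / trials

    for bits in (768, 1024, 2048):
        print("%4d-bit: %.2f ms" % (bits, modexp_ms(bits)))
    # Expect roughly (2048/1024)^3 = 8x between 2048 and 1024 bits.
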
One alternative would be to use weaker crypto but drop the
protection against packet counting attacks that we added with the
new tunnel build process. That would allow us to use entirely
ephemeral negotiated keys in a Tor-like telescopic tunnel (though,
again, it would expose the tunnel creator to trivial passive packet
counting attacks that identify a service).

Another idea is to publish and use even more explicit load
information in the netDb, allowing clients to more accurately detect
situations like the one above, where a high bandwidth router drops
60% of its tunnel request messages without even looking at them.
There are a few experiments worth doing along this avenue, and they
can be done with full backwards compatibility, so we should be
seeing them soon.

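As a sketch of what that might look like - with the field names
invented purely for illustration, not the actual netDb options:

    # Hypothetical shape of explicit load stats a router could publish
    # in its netDb entry; the names here are made up for illustration.
    class RouterLoadStats:
        def __init__(self, received, dropped, accepted):
            self.received = received  # tunnel requests seen this window
            self.dropped = dropped    # dropped undecrypted (CPU overload)
            self.accepted = accepted

        def drop_rate(self):
            return self.dropped / self.received if self.received else 0.0

    # A client could then simply skip peers that shed most requests:
    def usable(peer, max_drop=0.5):
        return peer.drop_rate() <= max_drop
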
So, that's the bottleneck in the router/network as I see it today.
Any and all suggestions for how we can deal with it would be very
much appreciated.

* 3) Syndie rationale continued

There's a meaty post up on the forum regarding Syndie and where it
fits in with things - check it out at
http://forum.i2p.net/viewtopic.php?t=1910

Also, I'd just like to highlight two snippets from the Syndie docs
being worked on. First, from irc (and the not-yet-out-there FAQ):

<bar> a question i've been pondering is, who is later going to have balls big enough to host syndie production servers/archives?
<bar> aren't those going to be as easy to track down as the eepsites are today?
<jrandom> public syndie archives do not have the ability to *read* the content posted to forums, unless the forums publish the keys to do so
<jrandom> and see the second paragraph of usecases.html
<jrandom> of course, those hosting archives given lawful orders to drop a forum will probably do so
<jrandom> (but then people can move to another archive, without disrupting the forum's operation)
<void> yeah, you should mention the fact that migration to a different medium is going to be seamless
<bar> if my archive shuts down, i can upload my whole forum to a new one, right?
<jrandom> 'zactly bar
<void> they can use two methods at the same time while migrating
<void> and anyone is able to synchronize the mediums
<jrandom> right void

The relevant section of (the not yet published) Syndie usecases.html
is:

While many different groups often want to organize discussions into
an online forum, the centralized nature of traditional forums
(websites, BBSes, etc) can be a problem. For instance, the site
hosting the forum can be taken offline through denial of service
attacks or administrative action. In addition, the single host
offers a simple point to monitor the group's activity, so that even
if a forum is pseudonymous, those pseudonyms can be tied to the IP
that posted or read individual messages.

In addition, not only are the forums decentralized, they are
organized in an ad-hoc manner yet fully compatible with other
organization techniques. This means that one small group of people
can run their forum using one technique (distributing the messages
by pasting them on a wiki site), while another runs their forum
using another technique (posting their messages in a distributed
hashtable like OpenDHT), yet if one person is aware of both
techniques, they can synchronize the two forums together. This lets
the people who were only aware of the wiki site talk to people who
were only aware of the OpenDHT service without knowing anything
about each other. Extended further, Syndie allows individual cells
to control their own exposure while communicating across the whole
organization.

* 4) Syndie dev status

There's been lots of progress on Syndie lately, with 7 alpha
releases handed out to folks on the irc channel. Most of the major
issues in the scriptable interface have been addressed, and I'm
hoping we can get the Syndie 1.0 release out later this month.

Did I just say "1.0"? You betcha! While Syndie 1.0 will be a text
based application, and won't even match the usability of other text
based apps (such as mutt or tin), it will provide the full range of
functionality, allow HTTP and file based syndication strategies, and
hopefully demonstrate Syndie's capabilities to potential developers.

Right now, I'm penciling in a Syndie 1.1 release (allowing people to
organize their archives and reading habits better) and maybe a 1.2
release to integrate some search functionality (both simple searches
and maybe lucene's fulltext searches). Syndie 2.0 will probably be
the first GUI release, with the browser plugin coming with 3.0.
Support for additional archives and message distribution networks
will be coming when implemented, of course (freenet,
mixminion/mixmaster/smtp, opendht, gnutella, etc).

I realize, though, that Syndie 1.0 won't be the earth shaker that
some want, as text based apps are really for the geeks, but I'd like
to try to break us of the habit of viewing "1.0" as a terminal
release and instead consider it a beginning.

* 5) Distributed version control

So far, I've been mucking around with subversion as the vcs for
Syndie, even though I'm only really fluent in CVS and clearcase.
This is because I'm offline most of the time, and even when I am
online, dialup is slow, so subversion's local diff/revert/etc has
been quite handy. However, yesterday void poked me with the
suggestion that we look into one of the distributed systems instead.

I looked at them a few years back when evaluating a vcs for I2P, but
I dismissed them because I didn't need their offline functionality
(I had good net access then), so learning them wasn't worthwhile.
That's not the case anymore, so I'm looking at them a bit more now.

From what I can see, darcs, monotone, and codeville are the top
contenders, and darcs' patch-based vcs seems particularly
attractive. For instance, I can do all my work locally and just scp
up the gzip'ed & gpg'ed diffs to an apache directory on dev.i2p.net,
and people can contribute their own changes by posting their gzip'ed
and gpg'ed diffs to locations of their choice. When it comes time to
tag a release, I'd make a darcs diff which specifies the set of
patches contained within the release and push that .gz'ed/.gpg'ed
diff up like the others (as well as push out actual .tar.bz2, .exe,
and .zip files, of course ;)

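As a sketch of that flow (the host, paths, and repo URL here are
made up for illustration - only the general darcs/gzip/gpg/scp
pattern matters):

    # Bundle local darcs patches, gzip and gpg-sign the bundle, and
    # scp it up to a plain web directory. Paths/URLs are hypothetical.
    import gzip, shutil, subprocess

    bundle = "syndie-changes.dpatch"

    # Write the unpushed patches out as a bundle instead of sending mail.
    subprocess.run(["darcs", "send", "--output", bundle,
                    "http://dev.i2p.net/darcs/syndie/"], check=True)

    # Compress the bundle...
    with open(bundle, "rb") as src, gzip.open(bundle + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)

    # ...make a detached armored signature...
    subprocess.run(["gpg", "--detach-sign", "--armor", bundle + ".gz"],
                   check=True)

    # ...and publish both files to the apache directory.
    subprocess.run(["scp", bundle + ".gz", bundle + ".gz.asc",
                    "dev.i2p.net:/var/www/patches/"], check=True)
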
And, as a particularly interesting point, these gzip'ed/gpg'ed diffs
can be posted as attachments to Syndie messages, allowing Syndie to
be self-hosting.

Anyone have any experience with these suckers though? Any advice?

* 6) ???

Only 24 screenfuls of text this time (including the forum post) ;)
I unfortunately wasn't able to make it to the meeting, but as
always, I'd love to hear from you if you've got any ideas or
suggestions - just post up to the list, the forum, or swing on by
IRC.

=jr

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)

iD8DBQFFI8RHzgi8JTPcjUkRAuHoAJ0Ym4sOLHlii2eHdwyQYS0IregZzACffi4E
H/X9NBh7t6KTc9dibqLdgow=
=WLl9
-----END PGP SIGNATURE-----