{% extends "global/layout.html" %}
{% block title %}{% trans %}I2P Blockfile Specification{% endtrans %}{% endblock %}
{% block lastupdated %}{% trans %}January 2012{% endtrans %}{% endblock %}
{% block accuratefor %}0.8.12{% endblock %}
{% block content %}
<h2>{% trans %}Blockfile and Hosts Database Specification{% endtrans %}</h2>
<h3>{% trans %}Overview{% endtrans %}</h3>
<p>{% trans naming=site_url('docs/naming') -%}
This document specifies
the I2P blockfile file format
and the tables in the hostsdb.blockfile used by the Blockfile <a href="{{ naming }}">Naming Service</a>.
{%- endtrans %}</p>
<p>{% trans -%}
The blockfile provides fast Destination lookup in a compact format. While the blockfile page overhead is substantial,
the destinations are stored in binary rather than in Base 64 as in the hosts.txt format.
In addition, the blockfile provides the capability of arbitrary metadata storage
(such as added date, source, and comments) for each entry.
The metadata may be used in the future to provide advanced addressbook features.
The blockfile storage requirement is a modest increase over the hosts.txt format, and the blockfile provides
an approximately 10x reduction in lookup times.
{%- endtrans %}</p>
<p>{% trans url='http://www.metanotion.net/software/sandbox/block.html' -%}
A blockfile is simply on-disk storage of multiple sorted maps (key-value pairs),
implemented as skiplists.
The blockfile format is adapted from the
<a href="{{ url }}">Metanotion Blockfile Database</a>.
First we will define the file format, then the use of that format by the BlockfileNamingService.
{%- endtrans %}</p>
<h3>{% trans %}Blockfile Format{% endtrans %}</h3>
<p>{% trans -%}
The original blockfile spec was modified to add magic numbers to each page.
The file is structured in 1024-byte pages. Pages are numbered starting from 1.
The "superblock" is always at page 1, i.e. starting at byte 0 in the file.
The metaindex skiplist is always at page 2, i.e. starting at byte 1024 in the file.
{%- endtrans %}</p>
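<p>The page numbering above implies a simple mapping from page number to byte offset.
The following is a minimal sketch of that arithmetic; the names <code>PAGE_SIZE</code> and
<code>page_offset</code> are illustrative and do not come from the I2P source.</p>

```python
PAGE_SIZE = 1024        # page size stated in this specification
SUPERBLOCK_PAGE = 1     # the superblock is always page 1
METAINDEX_PAGE = 2      # the metaindex skiplist is always page 2


def page_offset(page: int) -> int:
    """Return the byte offset in the file of a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * PAGE_SIZE
```

<p>With this helper, page 1 maps to byte 0 and page 2 maps to byte 1024, matching the text above.</p>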
<p>{% trans -%}
All 2-byte integer values are unsigned.
All 4-byte integer values (page numbers) are signed and negative values are illegal.
{%- endtrans %}</p>
<p>{% trans -%}
The database is designed to be opened and accessed by a single thread.
The BlockfileNamingService provides synchronization.
{%- endtrans %}</p>
<p>{% trans -%}
Superblock format:
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-5      Magic number          0x3141de493250 ("1A" 0xde "I2P")
6        Major version         0x01
7        Minor version         0x01
8-15     File length           Total length in bytes
16-19    First free list page
20-21    Mounted flag          0x01 = yes
22-23    Span size             max number of key/value pairs per span (16 for hostsdb)
24-1023  unused
{% endhighlight %}
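<p>As an illustration of the superblock layout, the sketch below decodes the first 24 bytes of page 1.
It assumes big-endian byte order and a signed 8-byte file length (neither is stated explicitly in this document),
and the names <code>Superblock</code> and <code>parse_superblock</code> are hypothetical.</p>

```python
import struct
from typing import NamedTuple

SUPERBLOCK_MAGIC = bytes.fromhex("3141de493250")  # "1A" 0xde "I2P"


class Superblock(NamedTuple):
    major: int
    minor: int
    file_length: int
    first_free_list_page: int
    mounted: int
    span_size: int


def parse_superblock(page: bytes) -> Superblock:
    """Decode the superblock fields from the first page of a blockfile."""
    if page[:6] != SUPERBLOCK_MAGIC:
        raise ValueError("bad superblock magic")
    # Assumption: big-endian. B = 1 byte, q = 8-byte signed,
    # i = 4-byte signed page number, H = 2-byte unsigned.
    fields = struct.unpack_from(">BBqiHH", page, 6)
    return Superblock(*fields)
```

<p>A caller would typically read the first 1024 bytes of the file and pass them to <code>parse_superblock</code>.</p>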
<p>{% trans -%}
Skip list block page format:
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-7      Magic number      0x536b69704c697374 "SkipList"
8-11     First span page
12-15    First level page
16-19    Size   (total number of keys - may only be valid at startup)
20-23    Spans  (total number of spans - may only be valid at startup)
24-27    Levels (total number of levels - may only be valid at startup)
28-1023  unused
{% endhighlight %}
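<p>The skip list block page can be decoded in the same way. This sketch assumes big-endian byte order;
<code>parse_skiplist_page</code> and the dictionary keys are illustrative names only.</p>

```python
import struct

SKIPLIST_MAGIC = b"SkipList"  # 0x536b69704c697374


def parse_skiplist_page(page: bytes) -> dict:
    """Decode the header of a skip list block page."""
    if page[:8] != SKIPLIST_MAGIC:
        raise ValueError("bad skip list magic")
    # Five signed 4-byte integers follow the magic number (assumed big-endian).
    first_span, first_level, size, spans, levels = struct.unpack_from(">5i", page, 8)
    return {
        "first_span_page": first_span,
        "first_level_page": first_level,
        "size": size,      # may only be valid at startup
        "spans": spans,    # may only be valid at startup
        "levels": levels,  # may only be valid at startup
    }
```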
<p>{% trans -%}
Skip level block page format is as follows.
All levels have a span. Not all spans have levels.
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-7      Magic number      0x42534c6576656c73 "BSLevels"
8-9      Max height
10-11    Current height
12-15    Span page
16-      Next level pages ('current height' entries, 4 bytes each, lowest first)
         remaining bytes unused
{% endhighlight %}
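<p>The variable-length list of next level pages can be read once the current height is known.
The sketch below assumes big-endian byte order and uses the hypothetical name <code>parse_level_page</code>.</p>

```python
import struct

LEVEL_MAGIC = b"BSLevels"  # 0x42534c6576656c73


def parse_level_page(page: bytes) -> dict:
    """Decode a skip level block page, including its next-level pointers."""
    if page[:8] != LEVEL_MAGIC:
        raise ValueError("bad skip level magic")
    # Two unsigned 2-byte heights, then a signed 4-byte span page.
    max_height, height, span_page = struct.unpack_from(">HHi", page, 8)
    if height > max_height or 16 + 4 * height > len(page):
        raise ValueError("invalid current height")
    # 'height' page numbers, 4 bytes each, lowest level first.
    next_levels = list(struct.unpack_from(">%di" % height, page, 16))
    return {
        "max_height": max_height,
        "height": height,
        "span_page": span_page,
        "next_levels": next_levels,
    }
```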
<p>{% trans -%}
Skip span block page format is as follows.
Key/value structures are sorted by key within each span and across all spans.
Spans other than the first span may not be empty.
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-3      Magic number      0x5370616e "Span"
4-7      First continuation page or 0
8-11     Previous span page or 0
12-15    Next span page or 0
16-17    Max keys (16 for hostsdb)
18-19    Size (current number of keys)
20-1023  key/value structures
{% endhighlight %}
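<p>A sketch of decoding the fixed 20-byte span header follows. Big-endian byte order is assumed,
and <code>parse_span_header</code> is an illustrative name; the key/value structures that start
at byte 20 are not decoded here.</p>

```python
import struct

SPAN_MAGIC = b"Span"  # 0x5370616e
SPAN_DATA_START = 20  # key/value structures begin here


def parse_span_header(page: bytes) -> dict:
    """Decode the header of a skip span block page."""
    if page[:4] != SPAN_MAGIC:
        raise ValueError("bad span magic")
    # Three signed 4-byte page numbers, then two unsigned 2-byte counts.
    cont, prev, nxt, max_keys, size = struct.unpack_from(">iiiHH", page, 4)
    if size > max_keys:
        raise ValueError("span holds more keys than its maximum")
    return {
        "first_continuation_page": cont,  # 0 if none
        "previous_span_page": prev,       # 0 if none
        "next_span_page": nxt,            # 0 if none
        "max_keys": max_keys,
        "size": size,
    }
```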
<p>{% trans -%}
Span Continuation block page format:
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-3      Magic number      0x434f4e54 "CONT"
4-7      Next continuation page or 0
8-1023   key/value structures
{% endhighlight %}
<p>{% trans -%}
Key/value structure format is as follows.
Key and value lengths must not be split across pages, i.e. all 4 bytes must be on the same page.
If there is not enough room, the last 1-3 bytes of a page are unused and the lengths will
be at offset 8 in the continuation page.
Key and value data may be split across pages.
Max key and value lengths are 65535 bytes.
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-1      key length in bytes
2-3      value length in bytes
4-       key data
         value data
{% endhighlight %}
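<p>The rule that the two length fields are never split across pages can be sketched as follows.
<code>header_position</code> decides where the next 4-byte length header lives, and <code>parse_entry</code>
decodes one key/value structure from a buffer in which the data has already been reassembled
(a simplification made for this example). Big-endian byte order is assumed and both function names are hypothetical.</p>

```python
import struct
from typing import Tuple

PAGE_SIZE = 1024
CONT_DATA_START = 8  # key/value data begins at byte 8 of a continuation page


def header_position(offset: int) -> Tuple[bool, int]:
    """Given the in-page offset where the previous entry ended, return
    (moves_to_continuation_page, offset_of_length_header)."""
    if PAGE_SIZE - offset < 4:
        # Fewer than 4 bytes remain: they are left unused and the
        # lengths are stored at offset 8 of the continuation page.
        return True, CONT_DATA_START
    return False, offset


def parse_entry(buf: bytes, offset: int) -> Tuple[bytes, bytes, int]:
    """Decode one key/value structure from contiguous bytes.
    Returns (key, value, offset_after_entry)."""
    key_len, val_len = struct.unpack_from(">HH", buf, offset)
    start = offset + 4
    end = start + key_len + val_len
    if end > len(buf):
        raise ValueError("entry extends past the end of the buffer")
    return buf[start:start + key_len], buf[start + key_len:end], end
```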
<p>{% trans -%}
Free list block page format:
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-7      Magic number      0x2366724c69737423 "#frList#"
8-11     Next free list block or 0 if none
12-15    Number of valid free pages in this block (0 - 252)
16-1023  Free pages (4 bytes each), only the first (valid number) are valid
{% endhighlight %}
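<p>The limit of 252 free pages per block follows from the layout: (1024 - 16) / 4 = 252.
The sketch below reads the valid entries from a free list block, assuming big-endian byte order;
<code>parse_free_list_page</code> is an illustrative name.</p>

```python
import struct

FREE_LIST_MAGIC = b"#frList#"          # 0x2366724c69737423
MAX_FREE_PAGES = (1024 - 16) // 4      # 252 entries fit after the 16-byte header


def parse_free_list_page(page: bytes) -> dict:
    """Decode a free list block page and return its valid free page numbers."""
    if page[:8] != FREE_LIST_MAGIC:
        raise ValueError("bad free list magic")
    next_block, count = struct.unpack_from(">ii", page, 8)
    if not 0 <= count <= MAX_FREE_PAGES:
        raise ValueError("invalid free page count")
    # Only the first 'count' entries are valid; the rest are ignored.
    free_pages = list(struct.unpack_from(">%di" % count, page, 16))
    return {"next_block": next_block, "free_pages": free_pages}
```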
<p>{% trans -%}
Free page block format:
{%- endtrans %}</p>
{% highlight %}
Byte     Contents
0-7      Magic number      0x7e2146524545217e "~!FREE!~"
8-1023   unused
{% endhighlight %}
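<p>Because every page type begins with a distinct magic number, a page can be classified by its leading bytes.
The following sketch collects the magic numbers listed in this document; the type labels and the function
<code>page_type</code> are hypothetical names chosen for this example.</p>

```python
from typing import Optional

# Magic numbers as given in the page format tables of this document.
PAGE_MAGICS = {
    "superblock": bytes.fromhex("3141de493250"),
    "skiplist": b"SkipList",
    "levels": b"BSLevels",
    "span": b"Span",
    "continuation": b"CONT",
    "freelist": b"#frList#",
    "free": b"~!FREE!~",
}


def page_type(page: bytes) -> Optional[str]:
    """Return a label for the page based on its magic number, or None."""
    for name, magic in PAGE_MAGICS.items():
        if page.startswith(magic):
            return name
    return None
```

<p>Such a check could be used as a sanity test before decoding a page with a format-specific parser.</p>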
<p>{% trans -%}
The metaindex (located at page 2) is a mapping of US-ASCII strings to 4-byte integers.
The key is the name of the skiplist and the value is the page index of the skiplist.
{%- endtrans %}</p>
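<p>A metaindex entry can therefore be decoded from its raw key and value bytes as sketched below.
Big-endian byte order is assumed, and <code>decode_metaindex_entry</code> is an illustrative name.</p>

```python
import struct
from typing import Tuple


def decode_metaindex_entry(key: bytes, value: bytes) -> Tuple[str, int]:
    """Decode one metaindex entry into (skiplist name, skiplist page)."""
    name = key.decode("ascii")           # keys are US-ASCII strings
    (page,) = struct.unpack(">i", value)  # values are signed 4-byte integers
    if page < 1:
        raise ValueError("page numbers start at 1")
    return name, page
```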
<h3>{% trans %}Blockfile Naming Service Tables{% endtrans %}</h3>
<p>{% trans -%}
The tables created and used by the BlockfileNamingService are as follows.
The maximum number of entries per span is 16.
{%- endtrans %}</p>
<h4>{% trans %}Properties Skiplist{% endtrans %}</h4>
<p>{% trans INFO='"%%__INFO__%%"' -%}
{{ INFO }} is the master database skiplist with String/Properties key/value entries containing only one entry:
{%- endtrans %}</p>
<pre>
    "info": a Properties (UTF-8 String/String Map), serialized as a <a href="{{ site_url('docs/spec/common-structures') }}#type_Mapping">Mapping</a>:
            "version":   "2"
            "created":   Java long time (ms)
            "upgraded":  Java long time (ms) (as of database version 2)
            "lists":     Comma-separated list of host databases, to be
                         searched in-order for lookups. Almost always "privatehosts.txt,userhosts.txt,hosts.txt".
</pre>
<h4>{% trans %}Reverse Lookup Skiplist{% endtrans %}</h4>
<p>{% trans REVERSE='"%%__REVERSE__%%"' -%}
{{ REVERSE }} is the reverse lookup skiplist with Integer/Properties key/value entries
(as of database version 2):
{%- endtrans %}</p>
<pre>
The skiplist keys are 4-byte Integers, the first 4 bytes of the hash of the Destination.
The skiplist values are each a Properties (a UTF-8 String/String Map) serialized as a <a href="{{ site_url('docs/spec/common-structures') }}#type_Mapping">Mapping</a>.
There may be multiple entries in the properties; each one is a reverse mapping,
as there may be more than one hostname for a given destination,
or there could be collisions with the same first 4 bytes of the hash.
Each property key is a hostname.
Each property value is the empty string.
</pre>
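<p>Deriving a reverse lookup key from a serialized Destination can be sketched as follows.
This example assumes that the hash is SHA-256 over the binary Destination and that the 4 bytes
are interpreted as a big-endian signed integer; neither detail is stated in this document, and
<code>reverse_key</code> is a hypothetical name.</p>

```python
import hashlib
import struct


def reverse_key(destination: bytes) -> int:
    """Return the 4-byte integer key for the reverse lookup skiplist."""
    # Assumption: the "hash of the Destination" is SHA-256 of its binary form.
    digest = hashlib.sha256(destination).digest()
    # Assumption: first 4 bytes read as a big-endian signed 32-bit integer.
    (key,) = struct.unpack(">i", digest[:4])
    return key
```

<p>Because only 4 bytes of the hash are used, distinct destinations may share a key, which is why each value
holds a set of hostnames rather than a single one.</p>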
<h4>{% trans %}hosts.txt, userhosts.txt, and privatehosts.txt Skiplists{% endtrans %}</h4>
<p>{% trans -%}
For each host database, there is a skiplist containing
the hosts for that database.
The keys/values in these skiplists are as follows:
{%- endtrans %}</p>
<pre>
    key:    a UTF-8 String (the hostname)
    value:  a DestEntry, which is a Properties (a UTF-8 String/String Map) serialized as a <a href="{{ site_url('docs/spec/common-structures') }}#type_Mapping">Mapping</a>
            followed by a binary Destination (serialized <a href="{{ site_url('docs/spec/common-structures') }}#struct_Destination">as usual</a>).
</pre>
<p>{% trans -%}
The DestEntry Properties typically contains:
{%- endtrans %}</p>
<pre>
    "a":     The time added (Java long time in ms)
    "s":     The original source of the entry (typically a file name or subscription URL)
    others:  TBD
</pre>
<p>{% trans -%}
Hostname keys are stored in lower-case and always end in ".i2p".
{%- endtrans %}</p>
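<p>A lookup key can be prepared from a user-supplied hostname as sketched below. The function name
<code>hostname_key</code> is illustrative, and rejecting names that do not end in ".i2p" is a choice
made for this example rather than documented behavior.</p>

```python
def hostname_key(hostname: str) -> bytes:
    """Return the UTF-8 skiplist key for a hostname."""
    key = hostname.strip().lower()  # keys are stored in lower-case
    if not key.endswith(".i2p"):
        raise ValueError("hostname keys always end in .i2p")
    return key.encode("utf-8")
```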
{% endblock %}