Rework XDR encoding

Jakob Borg
2014-02-20 17:40:15 +01:00
parent 87d473dc8f
commit 5837277f8d
27 changed files with 1843 additions and 1029 deletions


@@ -1,26 +1,29 @@
Block Exchange Protocol v1.0
============================
Block Exchange Protocol v1
==========================
Introduction and Definitions
----------------------------
The BEP is used between two or more _nodes_ thus forming a _cluster_.
Each node has a _repository_ of files described by the _local model_,
containing modification times and block hashes. The local model is sent
to the other nodes in the cluster. The union of all files in the local
models, with files selected for most recent modification time, forms the
_global model_. Each node strives to get its repository in sync with
the global model by requesting missing blocks from the other nodes.
BEP is used between two or more _nodes_ thus forming a _cluster_. Each
node has one or more _repositories_ of files described by the _local
model_, containing metadata and block hashes. The local model is sent to
the other nodes in the cluster. The union of all files in the local
models, with files selected for highest change version, forms the
_global model_. Each node strives to get its repositories in sync with
the global model by requesting missing or outdated blocks from the other
nodes in the cluster.
File data is described and transferred in units of _blocks_, each being
128 KiB (131072 bytes) in size.
Transport and Authentication
----------------------------
The BEP itself does not provide retransmissions, compression, encryption
nor authentication. It is expected that this is performed at lower
layers of the networking stack. A typical deployment stack should be
similar to the following:
BEP itself does not provide retransmissions, compression, encryption nor
authentication. It is expected that this is performed at lower layers of
the networking stack. The typical deployment stack is the following:
|-----------------------------|
+-----------------------------|
| Block Exchange Protocol |
|-----------------------------|
| Compression (RFC 1951) |
@@ -48,73 +51,127 @@ message boundary.
Messages
--------
Every message starts with one 32 bit word indicating the message version
and type. For BEP v1.0 the Version field is set to zero. Future versions
with incompatible message formats will increment the Version field. The
reserved bits must be set to zero.
Every message starts with one 32 bit word indicating the message
version, type and ID.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ver=0 | Message ID | Type | Reserved |
| Ver | Type | Message ID | Reply To |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
All data following the message header is in XDR (RFC 1014) encoding.
The actual data types in use by BEP, in XDR naming convention, are:
For BEP v1 the Version field is set to zero. Future versions with
incompatible message formats will increment the Version field.
The Type field indicates the type of data following the message header
and is one of the integers defined below.
The Message ID is set to a unique value for each transmitted message. In
request messages the Reply To is set to zero. In response messages it is
set to the message ID of the corresponding request.
All data following the message header is in XDR (RFC 1014) encoding. All
fields smaller than 32 bits and all variable length data are padded to a
multiple of 32 bits. The actual data types in use by BEP, in XDR naming
convention, are:
- (unsigned) int -- (unsigned) 32 bit integer
- (unsigned) hyper -- (unsigned) 64 bit integer
- opaque<> -- variable length opaque data
- string<> -- variable length string
The encoding of opaque<> and string<> are identical, the distinction is
solely in interpretation. Opaque data should not be interpreted as such,
but can be compared bytewise to other opaque data. All strings use the
UTF-8 encoding.
The transmitted length of string and opaque data is the length of actual
data, excluding any added padding. The encodings of opaque<> and string<>
are identical, the distinction being solely in interpretation. Opaque
data should not be interpreted but can be compared bytewise to other
opaque data. All strings use the UTF-8 encoding.
### Index (Type = 1)
The Index message defines the contents of the sender's repository. An Index
message is sent by each peer immediately upon connection and whenever the
local repository contents change. However, if a peer has no data to
advertise (the repository is empty, or it is set to only import data) it
is allowed but not required to send an empty Index message (a file list of
zero length). If the repository contents change from non-empty to empty,
an empty Index message must be sent. There is no response to the Index
message.
The Index message defines the contents of the sender's repository. An
Index message is sent by each peer immediately upon connection. A peer
with no data to advertise (the repository is empty, or it is set to only
import data) is allowed but not required to send an empty Index message
(a file list of zero length). If the repository contents change from
non-empty to empty, an empty Index message must be sent. There is no
response to the Index message.
struct IndexMessage {
string Repository<>;
FileInfo Files<>;
}
#### Graphical Representation
struct FileInfo {
string Name<>;
unsigned int Flags;
hyper Modified;
unsigned int Version;
BlockInfo Blocks<>;
}
IndexMessage Structure:
struct BlockInfo {
unsigned int Length;
opaque Hash<>;
}
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Repository |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Repository (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Files |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Zero or more FileInfo Structures \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
FileInfo Structure:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Name |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Name (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Modified (64 bits) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Blocks |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Zero or more BlockInfo Structures \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
BlockInfo Structure:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Hash |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Hash (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
#### Fields
The Repository field identifies the repository that the index message
pertains to. For single repository implementations an empty repository
ID is acceptable.
ID is acceptable, or the word "default". The Name is the file name path
relative to the repository root. The combination of Repository and Name
uniquely identifies each file in a cluster.
The file name is the part relative to the repository root. The
modification time is expressed as the number of seconds since the Unix
Epoch. The version field is a counter that increments each time the file
changes but resets to zero each time the modification time is updated. This
is used to signal changes to the file (or file metadata) while the
modification time remains unchanged. The hash algorithm is implied by
the hash length. Currently, the hash must be 32 bytes long and computed
by SHA256.
The Version field is a counter that is initially zero for each file. It
is incremented each time a change is detected. The combination of
Repository, Name and Version uniquely identifies the contents of a file
at a certain point in time.
The flags field is made up of the following single bit flags:
The Flags field is made up of the following single bit flags:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -136,62 +193,128 @@ The flags field is made up of the following single bit flags:
- Bit 0 through 17 are reserved for future use and shall be set to
zero.
The hash algorithm is implied by the Hash length. Currently, the hash
must be 32 bytes long and computed by SHA256.
The Modified time is expressed as the number of seconds since the Unix
Epoch. On the rare occasion that a file is simultaneously and
independently modified by two nodes in the same cluster and thus ends up
on the same Version number after modification, the Modified field is
used as a tie breaker.
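A minimal Go sketch of how a node might pick between two announcements of the same file when building the global model. The text only says that Modified is the tie breaker; that the later modification time wins is an assumption here, and the `newer` helper and the trimmed-down `FileInfo` type are hypothetical.

```go
package main

import "fmt"

// FileInfo carries only the fields needed for the comparison.
type FileInfo struct {
	Name     string
	Modified int64  // seconds since the Unix Epoch
	Version  uint32 // change counter, initially zero
}

// newer reports whether a should be preferred over b. The highest
// Version wins; on equal Version the Modified time breaks the tie
// (assumption: the later time is preferred).
func newer(a, b FileInfo) bool {
	if a.Version != b.Version {
		return a.Version > b.Version
	}
	return a.Modified > b.Modified
}

func main() {
	local := FileInfo{Name: "notes.txt", Modified: 1392914415, Version: 3}
	remote := FileInfo{Name: "notes.txt", Modified: 1392914500, Version: 3}
	fmt.Println(newer(remote, local)) // prints: true
}
```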
The Size field is the size of the file, in bytes.
The Blocks list contains the size and hash for each block in the file.
Each block represents a 128 KiB slice of the file, except for the last
block which may represent a smaller amount of data.
#### XDR
struct IndexMessage {
string Repository<>;
FileInfo Files<>;
}
struct FileInfo {
string Name<>;
unsigned int Flags;
hyper Modified;
unsigned int Version;
BlockInfo Blocks<>;
}
struct BlockInfo {
unsigned int Size;
opaque Hash<>;
}
### Request (Type = 2)
The Request message expresses the desire to receive a data block
corresponding to a part of a certain file in the peer's repository.
The requested block must correspond exactly to one block seen in the
peer's Index message. The hash field must be set to the expected value by
the sender. The receiver may validate that this is actually the case
before transmitting data. Each Request message must be met with a Response
#### Graphical Representation
RequestMessage Structure:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Repository |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Repository (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Name |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Name (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Offset (64 bits) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
#### Fields
The Repository and Name fields are as documented for the Index message.
The Offset and Size fields specify the region of the file to be
transferred. This should equate to exactly one block as seen in an Index
message.
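Since a request should cover exactly one block from the Index, the Offset and Size values can be derived from the announced block sizes. A minimal sketch in Go, where `requestsFor` is a hypothetical helper and the Go `RequestMessage` type contains only the fields described in this section:

```go
package main

import "fmt"

// RequestMessage holds the fields described for the Request message.
type RequestMessage struct {
	Repository string
	Name       string
	Offset     uint64
	Size       uint32
}

// requestsFor builds one request per block. Offsets are the running sum
// of the preceding block sizes, so each request equates to exactly one
// block as seen in the Index message.
func requestsFor(repo, name string, sizes []uint32) []RequestMessage {
	reqs := make([]RequestMessage, 0, len(sizes))
	var offset uint64
	for _, s := range sizes {
		reqs = append(reqs, RequestMessage{repo, name, offset, s})
		offset += uint64(s)
	}
	return reqs
}

func main() {
	// A 300000 byte file: two full 128 KiB blocks and a shorter last one.
	for _, r := range requestsFor("default", "a.bin", []uint32{131072, 131072, 37856}) {
		fmt.Printf("offset=%d size=%d\n", r.Offset, r.Size)
	}
}
```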
#### XDR
struct RequestMessage {
string Repository<>;
string Name<>;
unsigned hyper Offset;
unsigned int Length;
opaque Hash<>;
unsigned int Size;
}
The hash algorithm is implied by the hash length. Currently, the hash
must be 32 bytes long and computed by SHA256.
The Message ID in the header must be set to a unique value to be able to
correlate the request with the response message.
### Response (Type = 3)
The Response message is sent in response to a Request message. In case the
requested data was not available (an outdated block was requested, or
the file has been deleted), the Data field is empty.
The Response message is sent in response to a Request message.
#### Graphical Representation
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Data (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
#### Fields
The Data field contains either a full 128 KiB block, a shorter block in
the case of the last block in a file, or is empty (zero length) if the
requested block is not available.
#### XDR
struct ResponseMessage {
opaque Data<>;
}
The Message ID in the header is used to correlate requests and
responses.
### Ping (Type = 4)
The Ping message is used to determine that a connection is alive, and to
keep connections alive through state tracking network elements such as
firewalls and NAT gateways. The Ping message has no contents.
struct PingMessage {
}
### Pong (Type = 5)
The Pong message is sent in response to a Ping. The Pong message has no
contents, but copies the Message ID from the Ping.
struct PongMessage {
}
### IndexUpdate (Type = 6)
### Index Update (Type = 6)
This message has exactly the same structure as the Index message.
However instead of replacing the contents of the repository in the
@@ -206,26 +329,59 @@ configuration, version, etc. It is sent at connection initiation and,
optionally, when any of the sent parameters have changed. The message is
in the form of a list of (key, value) pairs, both of string type.
Key IDs apart from the well known ones are implementation specific. An
implementation is expected to ignore unknown keys. An implementation may
impose limits on key and value size.
Well known keys:
- "clientId" -- The name of the implementation. Example: "syncthing".
- "clientVersion" -- The version of the client. Example: "v1.0.33-47".
Following the SemVer 2.0 specification for version strings is
encouraged but not enforced.
#### Graphical Representation
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Options |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Zero or more KeyValue Structures \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
KeyValue Structure:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Key |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Key (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length of Value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Value (variable length) \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
#### XDR
struct OptionsMessage {
KeyValue Options<>;
}
struct KeyValue {
string Key;
string Value;
string Key<>;
string Value<>;
}
Key IDs apart from the well known ones are implementation
specific. An implementation is expected to ignore unknown keys. An
implementation may impose limits on key and value size.
Well known keys:
- "clientId" -- The name of the implementation. Example: "syncthing".
- "clientVersion" -- The version of the client. Example: "v1.0.33-47".
Following the SemVer 2.0 specification for version strings is
encouraged but not enforced.
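A receiver might process the option list as in this Go sketch, picking out the well known keys and ignoring everything else. The `clientInfo` helper and the example key "x-custom" are hypothetical.

```go
package main

import "fmt"

// KeyValue mirrors the XDR structure of the same name.
type KeyValue struct {
	Key   string
	Value string
}

// clientInfo extracts the well known keys. Unknown keys are ignored,
// as an implementation is expected to do.
func clientInfo(opts []KeyValue) (id, version string) {
	for _, kv := range opts {
		switch kv.Key {
		case "clientId":
			id = kv.Value
		case "clientVersion":
			version = kv.Value
		default:
			// implementation specific key, not understood here
		}
	}
	return id, version
}

func main() {
	opts := []KeyValue{
		{"clientId", "syncthing"},
		{"clientVersion", "v1.0.33-47"},
		{"x-custom", "ignored"},
	}
	id, version := clientInfo(opts)
	fmt.Println(id, version) // prints: syncthing v1.0.33-47
}
```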
Example Exchange
----------------
@@ -239,7 +395,7 @@ Example Exchange
7. <-Response
8. <-Response
9. <-Response
10. Index->
10. Index Update->
...
11. Ping->
12. <-Pong
@@ -250,7 +406,7 @@ of the data in the cluster. In this example, peer A has four missing or
outdated blocks. At 2 through 5 peer A sends requests for these blocks.
The requests are received by peer B, who retrieves the data from the
repository and transmits Response records (6 through 9). Peer A updates
its repository contents and transmits an updated Index message (10).
its repository contents and transmits an Index Update message (10).
Both peers enter idle state after 10. At some later time 11, peer A
determines that it has not seen data from B for some time and sends a
Ping request. A response is sent at 12.