Refactor compression support, now at message level.

This commit is contained in:
Jakob Borg
2014-07-28 11:31:22 +02:00
parent 6a441d5013
commit 6c5c14f35f
8 changed files with 418 additions and 249 deletions

View File

@@ -25,13 +25,11 @@ Transport and Authentication
----------------------------
BEP is deployed as the highest level in a protocol stack, with the lower
level protocols providing compression, encryption and authentication.
level protocols providing encryption and authentication.
+-----------------------------|
| Block Exchange Protocol |
|-----------------------------|
| Compression (LZ4) |
|-----------------------------|
| Encryption & Auth (TLS 1.2) |
|-----------------------------|
| TCP |
@@ -62,48 +60,19 @@ requests are received.
The underlying transport protocol MUST be TCP.
Compression
-----------
All data is sent within compressed blocks. Blocks are compressed using
the LZ4 format and algorithm described in
https://code.google.com/p/lz4/. Each compressed block is preceded by a
header consisting of three 32 bit words, in network order (big endian):
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Magic (0x0x5e63b278) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Uncompressed Block Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ Compressed Data \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Data Length indicates the length of data following the Data Length
field until the next header, i.e. the length of the Compressed Data
section plus four bytes for the Uncompressed Block Length field. The
Uncompressed Block Length indicates the amount of data that will result
when decompressing the Compressed Data section.
A single BEP message SHOULD be sent as a single compressed block. A
single compressed block MAY NOT contain more than one BEP message.
Messages
--------
Every message starts with one 32 bit word indicating the message
version, type and ID. The header is in network byte order, i.e. big
endian.
version, type and ID, followed by the length of the message. The header
is in network byte order, i.e. big endian.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ver | Message ID | Type | Reserved |
| Ver | Message ID | Type | Reserved |C|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
For BEP v1 the Version field is set to zero. Future versions with
@@ -125,10 +94,28 @@ The Type field indicates the type of data following the message header
and is one of the integers defined below. A message of an unknown type
is a protocol error and MUST result in the connection being terminated.
All data following the message header MUST be in XDR (RFC 1014)
encoding. All fields shorter than 32 bits and all variable length data
MUST be padded to a multiple of 32 bits. The actual data types in use by
BEP, in XDR naming convention, are the following:
The Compression bit "C" indicates the compression used for the message.
For C=1:
* The Length field contains the length, in bytes, of the
compressed message data.
* The message data is compressed using the LZ4 format and algorithm
described in https://code.google.com/p/lz4/.
For C=0:
* The Length field contains the length, in bytes, of the
uncompressed message data.
* The message is not compressed.
All data within the the message (post decompression, if compression is
in use) MUST be in XDR (RFC 1014) encoding. All fields shorter than 32
bits and all variable length data MUST be padded to a multiple of 32
bits. The actual data types in use by BEP, in XDR naming convention, are
the following:
- (unsigned) int -- (unsigned) 32 bit integer
- (unsigned) hyper -- (unsigned) 64 bit integer