Crate rbml
Unstable (rustc_private)
Really Bad Markup Language (rbml) is an internal serialization format of rustc. This is not intended to be used by users.
Originally based on the Extensible Binary Markup Language (EBML; http://www.matroska.org/technical/specs/rfc/index.html), it is now a separate format tuned for Rust object metadata.
Encoding
An RBML document consists of a tag, a length and data. The encoded data can contain multiple RBML documents concatenated.
Tags are a hint for the following data.
A tag is a number from 0x000 to 0xfff, where 0xf0 through 0xff are reserved.
Tags less than 0xf0 are encoded in one literal byte.
Tags greater than 0xff are encoded in two big-endian bytes, where the tag number is ORed with 0xf000. (E.g. tag 0x123 = f1 23)
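The tag rules above can be sketched in Rust as follows. The function names `encode_tag` and `decode_tag` are hypothetical and are not part of the rbml crate. Rejecting a two-byte form that decodes to a value below 0x100 is an assumption made here, since the text only says that range is reserved or encoded in one byte.

```rust
/// Encodes a tag number into `out`.
/// Tags below 0xf0 take one byte; tags 0x100..=0xfff take two big-endian
/// bytes ORed with 0xf000. Everything else is rejected.
fn encode_tag(tag: u16, out: &mut Vec<u8>) -> Result<(), String> {
    match tag {
        0x000..=0x0ef => {
            out.push(tag as u8);
            Ok(())
        }
        0x100..=0xfff => {
            out.extend_from_slice(&(tag | 0xf000).to_be_bytes());
            Ok(())
        }
        _ => Err(format!("reserved or out-of-range tag: {:#x}", tag)),
    }
}

/// Decodes a tag from the start of `data`.
/// Returns the tag number and the number of bytes consumed.
fn decode_tag(data: &[u8]) -> Option<(u16, usize)> {
    let first = *data.first()?;
    if first < 0xf0 {
        return Some((first as u16, 1));
    }
    let second = *data.get(1)?;
    let tag = (((first & 0x0f) as u16) << 8) | second as u16;
    // Assumption: a two-byte form below 0x100 is treated as invalid.
    if tag < 0x100 {
        None
    } else {
        Some((tag, 2))
    }
}
```

With this sketch, tag 0x123 encodes to the bytes f1 23 and decodes back to 0x123, matching the example in the text.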
Lengths encode the length of the following data. A length is a variable-length unsigned integer in one of the following forms:
80 through fe for lengths up to 0x7e;
40 ff through 7f ff for lengths up to 0x3fff;
20 40 00 through 3f ff ff for lengths up to 0x1fffff;
10 20 00 00 through 1f ff ff ff for lengths up to 0xfffffff.
The "overlong" form is allowed so that the length can be encoded
without the prior knowledge of the encoded data.
For example, the length 0 can be represented either by 80, 40 00, 20 00 00 or 10 00 00 00.
The encoder tries to minimize the length if possible.
Also, some predefined tags listed below are so commonly used that
their lengths are omitted ("implicit length").
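A minimal Rust sketch of the length forms, including the overlong variants, might look like this. The names `encode_len`, `encode_len_sized` and `decode_len` are hypothetical. The sketch assumes that lengths 0x7f through 0xfe, which cannot use the one-byte form because ff is not listed, fall back to the two-byte form, and that a leading byte of ff or anything below 10 is invalid.

```rust
/// Encodes `len` using exactly `width` bytes (1 to 4), allowing overlong forms.
fn encode_len_sized(len: u32, width: usize, out: &mut Vec<u8>) -> Result<(), String> {
    let max: u32 = match width {
        1 => 0x7e,
        2 => 0x3fff,
        3 => 0x1f_ffff,
        4 => 0xfff_ffff,
        _ => return Err(format!("unsupported width: {}", width)),
    };
    if len > max {
        return Err(format!("length {:#x} does not fit in {} byte(s)", len, width));
    }
    // The marker bit in the first byte is 0x80, 0x40, 0x20 or 0x10.
    let marker: u32 = 0x80 >> (width - 1);
    let word = len | (marker << (8 * (width - 1)));
    out.extend_from_slice(&word.to_be_bytes()[4 - width..]);
    Ok(())
}

/// Encodes `len` in the shortest form that can hold it.
fn encode_len(len: u32, out: &mut Vec<u8>) -> Result<(), String> {
    let width = match len {
        0..=0x7e => 1,
        0x7f..=0x3fff => 2,
        0x4000..=0x1f_ffff => 3,
        0x20_0000..=0xfff_ffff => 4,
        _ => return Err(format!("length too large: {:#x}", len)),
    };
    encode_len_sized(len, width, out)
}

/// Decodes a length from the start of `data`.
/// Returns the length and the number of bytes consumed.
fn decode_len(data: &[u8]) -> Option<(u32, usize)> {
    let first = *data.first()?;
    let (width, mask) = match first {
        0x80..=0xfe => (1, 0x7f),
        0x40..=0x7f => (2, 0x3f),
        0x20..=0x3f => (3, 0x1f),
        0x10..=0x1f => (4, 0x0f),
        _ => return None,
    };
    let bytes = data.get(..width)?;
    let mut value = (first & mask) as u32;
    for &b in &bytes[1..] {
        value = (value << 8) | b as u32;
    }
    Some((value, width))
}
```

An encoder that does not know the body size in advance could reserve four bytes with `encode_len_sized(0, 4, ..)` and patch the value later, which is the situation the overlong form is meant for.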
Data can be either binary bytes or zero or more nested RBML documents. Nested documents cannot overflow, and should be entirely contained within a parent document.
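The containment rule can be illustrated with a deliberately simplified Rust sketch. The hypothetical function `split_docs` assumes every document uses a one-byte tag followed by a one-byte explicit length, so it does not handle two-byte tags, longer length forms, or tags whose length is omitted. It returns `None` as soon as a document would run past the end of its enclosing data.

```rust
/// Splits `data` into a sequence of (tag, body) documents.
/// Simplifying assumptions: one-byte tags and one-byte explicit lengths only.
/// Returns `None` if a header is malformed or a body overflows `data`.
fn split_docs(data: &[u8]) -> Option<Vec<(u8, &[u8])>> {
    let mut docs = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        let tag = data[pos];
        let len_byte = *data.get(pos + 1)?;
        if tag >= 0xf0 || !(0x80..=0xfe).contains(&len_byte) {
            return None;
        }
        let len = (len_byte & 0x7f) as usize;
        let start = pos + 2;
        // `get` fails when the body would extend past the parent.
        let body = data.get(start..start + len)?;
        docs.push((tag, body));
        pos = start + len;
    }
    Some(docs)
}
```

Because a body is just a byte slice, the same function can be applied again to a body to walk nested documents one level down.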
Predefined Tags
Most RBML tags are defined by the application.
(For the Rust object metadata, see also rustc::metadata::common.)
However, RBML itself defines a set of predefined tags intended for the auto-serialization implementation.
Predefined tags with an implicit length:
U8 (00): 1-byte unsigned integer.
U16 (01): 2-byte big endian unsigned integer.
U32 (02): 4-byte big endian unsigned integer.
U64 (03): 8-byte big endian unsigned integer.
  Any of U* tags can be used to encode primitive unsigned integer types, as long as it is no greater than the actual size. For example, u8 can only be represented via the U8 tag.
I8 (04): 1-byte signed integer.
I16 (05): 2-byte big endian signed integer.
I32 (06): 4-byte big endian signed integer.
I64 (07): 8-byte big endian signed integer.
  Similar to U* tags. Always uses two's complement encoding.
Bool (08): 1-byte boolean value, 00 for false and 01 for true.
Char (09): 4-byte big endian Unicode scalar value. Surrogate pairs or out-of-bound values are invalid.
F32 (0a): 4-byte big endian unsigned integer representing IEEE 754 binary32 floating-point format.
F64 (0b): 8-byte big endian unsigned integer representing IEEE 754 binary64 floating-point format.
Sub8 (0c): 1-byte unsigned integer for supplementary information.
Sub32 (0d): 4-byte unsigned integer for supplementary information.
  Those two tags normally occur as the first subdocument of certain tags, namely Enum, Vec and Map, to provide a variant or size information. They can be used interchangeably.
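As an illustration of the implicit-length tags, the Rust sketch below maps each tag to its fixed payload size and encodes a `u32` with the smallest `U*` tag that can hold the value. The names `implicit_len` and `encode_u32` are hypothetical. Choosing the smallest tag is one possible policy permitted by the text, not necessarily what rustc does.

```rust
/// Returns the fixed payload size in bytes for an implicit-length tag,
/// or `None` if the tag carries an explicit length.
fn implicit_len(tag: u8) -> Option<usize> {
    match tag {
        0x00 | 0x04 | 0x08 | 0x0c => Some(1),        // U8, I8, Bool, Sub8
        0x01 | 0x05 => Some(2),                      // U16, I16
        0x02 | 0x06 | 0x09 | 0x0a | 0x0d => Some(4), // U32, I32, Char, F32, Sub32
        0x03 | 0x07 | 0x0b => Some(8),               // U64, I64, F64
        _ => None,
    }
}

/// Encodes `v` as a U8, U16 or U32 document, whichever is smallest.
/// No length field is written because these tags have an implicit length.
fn encode_u32(v: u32, out: &mut Vec<u8>) {
    if v <= u8::MAX as u32 {
        out.push(0x00); // U8
        out.push(v as u8);
    } else if v <= u16::MAX as u32 {
        out.push(0x01); // U16
        out.extend_from_slice(&(v as u16).to_be_bytes());
    } else {
        out.push(0x02); // U32
        out.extend_from_slice(&v.to_be_bytes());
    }
}
```

A decoder reading such a value would accept any of the three tags when the target type is `u32`, but only `U8` when the target type is `u8`.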
Predefined tags with an explicit length:
Str (10): A UTF-8-encoded string.
Enum (11): An enum.
  The first subdocument should be Sub* tags with a variant ID. Subsequent subdocuments, if any, encode variant arguments.
Vec (12): A vector (sequence).
VecElt (13): A vector element.
  The first subdocument should be Sub* tags with the number of elements. Subsequent subdocuments should be VecElt tag per each element.
Map (14): A map (associated array).
MapKey (15): A key part of the map entry.
MapVal (16): A value part of the map entry.
  The first subdocument should be Sub* tags with the number of entries. Subsequent subdocuments should be an alternating sequence of MapKey and MapVal tags per each entry.
Opaque (17): An opaque, custom-format tag. Used to wrap ordinary custom tags or data in the auto-serialized context. Rustc typically uses this to encode type information.
The first 0x20 tags are reserved by RBML; custom tags start at 0x20.
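To show how the explicit-length tags nest, here is a small Rust sketch that builds a `Str` document and a `Vec` document of bytes. The helpers `wrap`, `encode_str` and `encode_u8_vec` are hypothetical. The sketch assumes bodies short enough for the one-byte length form, and it assumes each element is stored as a `U8` document inside its `VecElt`, which the text does not spell out.

```rust
/// Wraps `body` in a document with a one-byte tag and a one-byte length.
/// Panics if the tag needs two bytes or the body is longer than 0x7e bytes.
fn wrap(tag: u8, body: &[u8]) -> Vec<u8> {
    assert!(tag < 0xf0, "tag needs the two-byte form");
    assert!(body.len() <= 0x7e, "body too long for the one-byte length form");
    let mut out = vec![tag, 0x80 | body.len() as u8];
    out.extend_from_slice(body);
    out
}

/// Encodes a string as a Str (10) document.
fn encode_str(s: &str) -> Vec<u8> {
    wrap(0x10, s.as_bytes())
}

/// Encodes a byte sequence as a Vec (12) document:
/// a Sub8 (0c) element count followed by one VecElt (13) per element.
fn encode_u8_vec(items: &[u8]) -> Vec<u8> {
    assert!(items.len() <= 0xff, "count does not fit in Sub8");
    let mut body = vec![0x0c, items.len() as u8];
    for &item in items {
        // Assumption: each element is a U8 (00) document inside a VecElt.
        body.extend(wrap(0x13, &[0x00, item]));
    }
    wrap(0x12, &body)
}
```

For example, `encode_str("hi")` yields the bytes 10 82 68 69, and `encode_u8_vec(&[7, 9])` yields 12 8a 0c 02 13 82 00 07 13 82 00 09.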
Reexports
pub use self::EbmlEncoderTag::*;
pub use self::Error::*;
Modules
leb128 | [Unstable]
opaque | [Unstable]
reader | [Unstable]
writer | [Unstable]
Structs
Doc | [Unstable] Common data structures
TaggedDoc | [Unstable]
Enums
EbmlEncoderTag | [Unstable]
Error | [Unstable]