This specification defines a binary serialization of RDF based on CBOR. Like CBOR the serialization is optimized for small code size and fairly small message size. The serialization is suitable for systems and devices that are possibly constrained in terms of network or computation.¶
By re-using the existing and well-defined CBOR data model, RDF terms can be efficiently encoded. In particular, mappings of common RDF literal datatypes into a binary CBOR representation are defined. Furthermore we use compression techniques such as Incremental Encoding and Bitmap Triples to make the serialization compact.¶
The serialization is defined using the Concise Data Definition Language (CDDL). This allows a precise and concise definition, enabling wide implementation and usage.¶
We also describe how the serialization can be used to make RDF content-addressable. A group of RDF statements can then be addressed by a unique identifier determined exactly by the contents of the statements. This allows RDF data to be made available more robustly and enables the use of RDF in decentralized systems.¶
The Resource Description Framework (RDF) [RDF] is a data model for structured content. The data is modeled as a graph with nodes and edges labeled with unique identifiers that are at the same time references to further data (Internationalized Resource Identifiers, IRIs [RFC3987]).
Together with foundational principles such as the open-world assumption (no single agent has complete knowledge) and the unique name assumption (the same thing has the same name regardless of context) this graph structure makes RDF well-suited for decentralized systems.¶
Nevertheless, RDF and the Semantic Web (the vision behind RDF) have failed to provide a robust and decentralized foundation. The reasons for this have been argued to include low availability of content and challenges in scaling systems that query large data sets [Polleres20].
We would like to take the counterpoint and argue that the reason why RDF has failed to provide a robust and decentralized foundation is not the challenge of scaling to large systems, but of scaling to small systems. RDF/CBOR is an attempt to make RDF usable by decentralized systems that exchange small pieces of content that are aggregated locally. An example of such a system is the network of servers speaking the ActivityPub protocol [ActivityPub].
We acknowledge the diversity of actors and devices that generate and consume content and emphasize the necessity for a well-defined encoding as well as an encoding that can be re-implemented.¶
The Concise Binary Object Representation (CBOR) [RFC8949] is a binary data serialization that provides basic data types (string, integer, arrays, etc.) as well as extendable tags for annotating more complex data types. By using CBOR we can re-use the already defined data types and tags. Implementations of CBOR exist for a wide range of languages and platforms and can be re-used to implement RDF/CBOR. Furthermore, we use the Concise Data Definition Language (CDDL) [RFC8610] which allows a concise and unambiguous description of CBOR data structures used in RDF/CBOR.¶
Finally, RDF/CBOR allows content-addressing. A group of RDF statements can be identified by a unique identifier that is determined by the content of the statements themselves. This enables caching and duplication, which makes content available more robustly.
The objectives and requirements of RDF/CBOR are (roughly in order):¶
The specified serialization takes much inspiration from the HDT serialization [HDT]. HDT is a binary serialization of RDF optimized for large data sets and allowing in-place queries without loading the entire content. RDF/CBOR uses the same encoding of RDF terms into a dictionary (see Section 3.1) and triples (see Section 3.2). RDF/CBOR does not have a headers section for meta-data. Unlike HDT, RDF/CBOR uses variable length encoding of data items (by using CBOR). This can make binary representation more compact, but prevents random-access queries and in-place queries.¶
RDF/CBOR can be used for stream processing of RDF triples. When encoding large data sets, triples are packed into smaller groups that can be decoded independently. This is inspired by the ERI serialization [ERI]. Unlike ERI we don't provide a generic and abstract model, but a concrete encoding. The term molecule is adopted from ERI.¶
RDF/CBOR improves on previous attempts at using CBOR for RDF [Sahlmann2018]. We attempt to combine the compression techniques from HDT with the built-in datatypes provided by CBOR.
CBOR-LD is a CBOR based serialization for Linked Data. CBOR-LD is based on the JSON-LD serialization [JSON-LD] and requires the JSON-LD processing algorithms. This limits usability of the serialization on constrained devices. Furthermore, CBOR-LD uses JSON-LD context, user defined and possibly remote schemas, that are required when decoding to RDF. This makes CBOR-LD unsuitable for systems with limited connectivity.¶
Previous work on signing RDF data includes [Tummarello05] and [TrustyURIs]. Both use an existing serialization (N-Triples) that is not designed to have a canonical representation. RDF/CBOR provides a canonical representation, making the content-addressing and signing scheme more robust. A W3C working group has recently been established to develop a canonical representation of RDF (RDF Dataset Canonicalization and Hash Working Group Charter).
This work is based on a previous paper titled Content-addressable RDF.¶
In Section 2 we present an encoding of RDF terms (IRIs, literals and blank nodes) to CBOR. This uses existing CBOR datatypes and tags, allowing a fairly compact and straightforward encoding.
The core idea of RDF/CBOR is to split large sets of RDF triples into smaller groups and to encode such groups individually. The smaller groups are called molecules and are encoded using some compression tricks. Section 3 describes the encoding of molecules.¶
In Section 4 we describe how the encoding of molecules can be used to make RDF content-addressable. This requires defining a suitable grouping of triples as well as a canonical serialization based on the serialization of molecules.¶
Section 5 describes how multiple molecules can be combined to a stream. This is a relatively straight-forward construction but one that permits future optimizations.¶
We conclude with some final remarks and an outlook on what could be possible in Section 7.
Examples of encoded content are provided in Appendix A.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
The encoding is defined using the Concise Data Definition Language (CDDL) [RFC8610]. A basic understanding of CBOR and CDDL is required to read this document.¶
An RDF triple consists of three components: subject, predicate and object. The components are IRIs [RFC3987], literals or blank nodes. Collectively IRIs, literals and blank nodes are called RDF terms. In this section we present an encoding of individual terms to CBOR.¶
The encoding of RDF terms is defined with the CDDL rules iri, literal and blank-node:
The rules iri, literal and blank-node are described in the following sections.
In general IRIs [RFC3987] can be encoded as CBOR text strings. Some IRIs (in particular URNs [RFC2141]) represent binary identifiers. For two kinds of such binary URNs we define specialized encodings that map the URNs directly to a binary CBOR representation (for UUID and ERIS URNs). This allows such binary URNs to be encoded much more efficiently.¶
The CDDL rule for encoding IRIs is:¶
For generic IRIs we use the tag 266 [IRI_CBOR].
We define CBOR encodings for UUID URNs [RFC4122] and ERIS URNs [ERIS] using the tags 37 [UUID_CBOR] and 276 respectively.
As the CBOR encodings can only encode URNs without fragment parts, we introduce the fragment constructor tag 305. This can be used to construct binary URNs with fragment parts.
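The IRI mappings above can be illustrated with a small hand-rolled encoder. This is a minimal sketch, not a definitive implementation: the helper names (`cbor_head`, `cbor_text`, `cbor_bytes`, `cbor_tag`, `encode_iri`) are hypothetical, ERIS URNs (tag 276) are not handled, and the lenient parsing of `uuid.UUID` is an assumption that a canonicalizing encoder would probably want to tighten.

```python
import uuid

def cbor_head(major: int, argument: int) -> bytes:
    """Initial byte plus argument in the shortest form (RFC 8949, Section 3)."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    for additional, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([(major << 5) | additional]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")

def cbor_text(text: str) -> bytes:
    data = text.encode("utf-8")
    return cbor_head(3, len(data)) + data

def cbor_bytes(data: bytes) -> bytes:
    return cbor_head(2, len(data)) + data

def cbor_tag(number: int, content: bytes) -> bytes:
    return cbor_head(6, number) + content

def encode_iri(iri: str) -> bytes:
    """Hypothetical helper: tag 37 for UUID URNs, tag 266 for all other IRIs."""
    base, has_fragment, fragment = iri.partition("#")
    if base.startswith("urn:uuid:"):
        try:
            binary = cbor_tag(37, cbor_bytes(uuid.UUID(base[len("urn:uuid:"):]).bytes))
        except ValueError:
            binary = None  # not a well-formed UUID: fall back to tag 266
        if binary is not None and has_fragment:
            # Fragment constructor: 305([binary-urn, fragment])
            return cbor_tag(305, cbor_head(4, 2) + binary + cbor_text(fragment))
        if binary is not None:
            return binary
    return cbor_tag(266, cbor_text(iri))

print(encode_iri("urn:uuid:1da600cf-c852-469a-936f-e608d3d90d9b#a").hex())
# d9013182d825501da600cfc852469a936fe608d3d90d9b6161
```

The printed bytes agree with the corresponding row of the term examples in Appendix A.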
A language-tagged string literal with datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#langString is encoded using CBOR tag 38 [draft-ietf-core-problem-details-08]:
Note that the CBOR tag 38 allows a third element to indicate text direction. We do not use such a third element as it cannot be mapped to RDF language-tagged strings.
A literal with datatype xsd:string is encoded as a CBOR text string:
A literal with datatype xsd:boolean is encoded as a CBOR boolean:
A literal with datatype xsd:integer is encoded as a CBOR integer or as a Bignum if the integer is larger than what can be expressed in 64 bits (see section 3.4.3 of [RFC8949]):
Serialization of bignums MUST leave out any leading zeroes.¶
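The integer rule can be sketched as follows. This is an illustration under assumptions: `cbor_head` and `encode_integer` are hypothetical names, and the sketch reads "expressed in 64 bits" as the range that CBOR major types 0 and 1 can hold (a 64-bit argument), switching to tags 2 and 3 outside that range. Leading zeroes are avoided by sizing the byte string from the bit length of the magnitude.

```python
def cbor_head(major: int, argument: int) -> bytes:
    """Initial byte plus argument in the shortest form (RFC 8949, Section 3)."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    for additional, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([(major << 5) | additional]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")

def encode_integer(n: int) -> bytes:
    """Hypothetical helper for xsd:integer literals."""
    if 0 <= n < 1 << 64:
        return cbor_head(0, n)            # unsigned integer
    if -(1 << 64) <= n < 0:
        return cbor_head(1, -1 - n)       # negative integer
    # Bignum: tag 2 holds n, tag 3 holds -1 - n, as a big-endian byte string.
    tag, magnitude = (2, n) if n > 0 else (3, -1 - n)
    data = magnitude.to_bytes((magnitude.bit_length() + 7) // 8, "big")
    return cbor_head(6, tag) + cbor_head(2, len(data)) + data

print(encode_integer(42).hex())       # 182a
print(encode_integer(2 ** 64).hex())  # c249010000000000000000
```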
A literal with datatype xsd:float is encoded as a CBOR single-precision float:
A literal with datatype xsd:double is encoded as a CBOR double-precision float:
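As an illustration of the two floating-point mappings, the sketch below fixes the width by datatype: xsd:float always uses the 4-byte form (initial byte 0xfa) and xsd:double the 8-byte form (initial byte 0xfb). The function names are hypothetical, and values outside the single-precision range are assumed to be rejected by the caller.

```python
import struct

def encode_float(value: float) -> bytes:
    """xsd:float: major type 7, additional information 26, IEEE 754 binary32."""
    return b"\xfa" + struct.pack(">f", value)

def encode_double(value: float) -> bytes:
    """xsd:double: major type 7, additional information 27, IEEE 754 binary64."""
    return b"\xfb" + struct.pack(">d", value)

print(encode_float(1.5).hex())   # fa3fc00000
print(encode_double(1.5).hex())  # fb3ff8000000000000
```

Keeping the width tied to the datatype means a decoder can recover xsd:float versus xsd:double from the CBOR item alone.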
A literal with datatype xsd:dateTime is encoded using CBOR tag 0 followed by a text string in the standard format described by the date-time production in [RFC3339]:
A literal with datatype xsd:hexBinary is encoded using CBOR tag 23 followed by a binary string:
A literal with datatype xsd:base64Binary is encoded as a CBOR binary string:
Note that we do not use the tag 22 as defined in section 3.4.5.2 of [RFC8949] to explicitly mark conversion to Base64. Instead, we assume by default that binary strings correspond to Base64 encoded content. This makes the encoding more efficient.
For RDF literals with other literal types we define the CBOR tag 303. The content of the tag is a CBOR array with exactly two items: the datatype IRI and the lexical form of the literal.
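The two tagged literal forms (tag 38 for language-tagged strings and tag 303 for other datatypes) can be sketched as below. The helper names are hypothetical, and encoding the datatype IRI with tag 266 is taken from the generic literal example in Appendix A rather than from a rule stated here.

```python
def cbor_head(major: int, argument: int) -> bytes:
    """Initial byte plus argument in the shortest form (RFC 8949, Section 3)."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    for additional, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([(major << 5) | additional]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")

def cbor_text(text: str) -> bytes:
    data = text.encode("utf-8")
    return cbor_head(3, len(data)) + data

def encode_lang_string(language: str, value: str) -> bytes:
    """38([language, value]); no third (text direction) element is emitted."""
    return cbor_head(6, 38) + cbor_head(4, 2) + cbor_text(language) + cbor_text(value)

def encode_generic_literal(datatype_iri: str, lexical_form: str) -> bytes:
    """303([266(datatype IRI), lexical form])."""
    datatype = cbor_head(6, 266) + cbor_text(datatype_iri)
    return cbor_head(6, 303) + cbor_head(4, 2) + datatype + cbor_text(lexical_form)

print(encode_lang_string("en", "Hello World!").hex())
# d8268262656e6c48656c6c6f20576f726c6421
```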
The usage of blank nodes is discouraged. For legacy reasons an encoding is provided by defining the CBOR tag 304. The content of the tag is the blank node identifier as a text string:
An RDF molecule is a group of RDF triples that are encoded together. Triples in a molecule share the same dictionary. Encoding a molecule requires all triples of the molecule to be available. When decoding, the entire molecule must be kept in memory to decode its triples.
Molecules allow large RDF datasets to be split into smaller groupings, enabling usage on constrained devices and allowing stream-processing of RDF data. RDF/CBOR molecules correspond exactly to the molecules defined in the ERI serialization [ERI].¶
Small molecules require less memory and processing capacity to encode and decode, whereas larger molecules allow more compact encodings. The most basic molecule is a single triple. More natural groupings are triples that share a common subject (subject-molecule as defined in [ERI]) or fragment-molecules (see Section 4.1). Users, libraries and applications MAY use any molecule grouping. It is RECOMMENDED to use fragment-molecules.¶
An encoded molecule consists of two sections:¶
Concretely, a molecule is encoded as a CBOR array with 5 items:¶
The items predicate-bitmap, predicates, object-bitmap and objects encode the triples of the molecule as bitmap triples. The meaning of the individual values is explained in Section 3.2.
A molecule MAY be tagged with the CBOR tag 301.
The collection of RDF terms appearing in a molecule is called the vocabulary. A dictionary is a structure that encodes the vocabulary efficiently and assigns every term an integer identifier. The integer identifiers can then be used when encoding triples.¶
This allows more compact representation of the molecule by allowing the triple encoding to just use the integer identifiers of terms. Terms that appear multiple times in the molecule are only encoded once. Furthermore, we can compress the dictionary. This is a simple, effective and widely-used optimization for encoding RDF data [RDF-Dict].¶
When encoding a dictionary, the vocabulary is provided as a sorted sequence of terms in the following order:
Within the two groups terms are ordered as follows:¶
Within the term types we use lexicographical order.¶
A dictionary is encoded as CBOR array of the terms. Terms are encoded as described in Section 2. If the preceding term in the CBOR array encodes an IRI and the current term is also an IRI that shares a prefix that is longer than 9 characters, then instead of encoding the current IRI in full, we encode the length of the shared prefix along with the suffix of the current term.¶
For example if the vocabulary contains the terms:¶
https://www.w3.org/ns/activitystreams#Create
https://www.w3.org/ns/activitystreams#actor
https://www.w3.org/ns/activitystreams#object
We encode the second and third terms more compactly by indicating that they share a prefix of length 38 with the preceding term:
["https://www.w3.org/ns/activitystreams#Create", [38, "actor"], [38, "object"]]
This compression method is called Incremental Encoding [Witten99] and is also used in the HDT serialization [HDT]. It is very effective when encoding IRIs appearing in RDF as shared prefixes are very common.¶
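The incremental encoding of IRIs described above can be sketched as follows. This is a simplified illustration: terms are plain Python strings rather than tagged CBOR items, prefix lengths are counted in Unicode code points (an assumption about what "characters" means), and the names `compress`, `expand` and `MIN_SHARED` are hypothetical.

```python
MIN_SHARED = 9  # a shared prefix must be longer than this to be used

def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def compress(iris: list) -> list:
    """Replace an IRI by [prefix length, suffix] relative to the preceding IRI."""
    entries, previous = [], None
    for iri in iris:
        n = shared_prefix_length(previous, iri) if previous is not None else 0
        entries.append([n, iri[n:]] if n > MIN_SHARED else iri)
        previous = iri  # always compare against the full preceding IRI
    return entries

def expand(entries: list) -> list:
    """Inverse of compress: rebuild every IRI from its predecessor."""
    iris = []
    for entry in entries:
        if isinstance(entry, list):
            n, suffix = entry
            iris.append(iris[-1][:n] + suffix)
        else:
            iris.append(entry)
    return iris

vocabulary = [
    "https://www.w3.org/ns/activitystreams#Create",
    "https://www.w3.org/ns/activitystreams#actor",
    "https://www.w3.org/ns/activitystreams#object",
]
print(compress(vocabulary))
# ['https://www.w3.org/ns/activitystreams#Create', [38, 'actor'], [38, 'object']]
```

Because every entry refers only to its immediate predecessor, a decoder needs to remember just one fully expanded IRI while reading the dictionary.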
In CDDL the encoding of a dictionary is defined as:¶
References to terms appearing in the dictionary are simply the integer index of the term as appearing in the dictionary:¶
Note that this requires dictionary references to be used only in contexts where they cannot be confused with literals of datatype xsd:integer (see Section 2.2.2.3). This is the case for our encoding of triples and is more efficient than using the explicit references proposed by Packed CBOR [draft-ietf-cbor-packed-07].
In this section we describe the encoding of triples in a molecule.¶
Triples are assumed to be sorted according to lexicographical order using the same rules as the dictionary (see previous section).¶
As a first step, triples can be represented as a list of integer triples where the integers are references to dictionary terms:¶
[[0, 1, 5], [0, 2, 1], [1, 0, 2], [1, 1, 3], [1, 1, 4], [1, 3, 0], [2, 2, 1]]¶
We represent the triples as a list of subjects and lists of grouped predicates and objects:¶
[0, 1, 2] // subjects
[[1, 2], [0, 1, 3], [2]] // predicates
[[5], [1], [2], [3, 4], [0], [1]] // objects
Every predicate group corresponds to a subject and every object group corresponds to a predicate.
Because we used the same ordering for triples as for terms in the dictionary, the subject list is redundant and can be omitted in the encoding.
The resulting encoding is called compact triples:
[[1, 2], [0, 1, 3], [2]] // predicates
[[5], [1], [2], [3, 4], [0], [1]] // objects
We can improve further by encoding the groupings of predicates and objects with a bitmap. We collapse the lists of predicates and objects into flat lists, but remember the last element of every group by setting a bit in a bitmap at the corresponding position. Position 0 corresponds to the least-significant bit, as in the examples in Appendix A:
predicate-bitmap: 0b110010
predicates: [1, 2, 0, 1, 3, 2]
object-bitmap: 0b1110111
objects: [5, 1, 2, 3, 4, 0, 1]
The predicate-bitmap and object-bitmap are encoded as CBOR integers. If a bitmap becomes larger than what can be represented with CBOR integers, CBOR Bignums are used.
This representation of triples is called bitmap triples. This is exactly the encoding used in the HDT serialization [HDT].¶
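The construction of bitmap triples from sorted integer triples, and the reverse direction, can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes that the element at position i of a list is marked by bit i counted from the least-significant bit, which is how the bitmaps in the Appendix A examples read, and that subjects are numbered 0, 1, 2, ... in dictionary order so that the subject list can be left out.

```python
def encode_bitmap_triples(triples):
    """Turn sorted, duplicate-free (s, p, o) index triples into bitmap triples."""
    triples = [tuple(t) for t in triples]
    predicates, objects = [], []
    predicate_bitmap = object_bitmap = 0
    for i, (s, p, o) in enumerate(triples):
        following = triples[i + 1] if i + 1 < len(triples) else None
        if i == 0 or triples[i - 1][:2] != (s, p):
            predicates.append(p)  # first triple of a new (subject, predicate) pair
        objects.append(o)
        if following is None or following[:2] != (s, p):
            object_bitmap |= 1 << (len(objects) - 1)  # o closes its object group
            if following is None or following[0] != s:
                # p closes the predicate group of subject s
                predicate_bitmap |= 1 << (len(predicates) - 1)
    return predicate_bitmap, predicates, object_bitmap, objects

def decode_bitmap_triples(predicate_bitmap, predicates, object_bitmap, objects):
    """Rebuild the (s, p, o) index triples; the subject is the group counter."""
    triples, subject, pi = [], 0, 0
    for oi, o in enumerate(objects):
        triples.append((subject, predicates[pi], o))
        if (object_bitmap >> oi) & 1:         # end of an object group
            if (predicate_bitmap >> pi) & 1:  # end of a predicate group
                subject += 1
            pi += 1
    return triples

# The small content-addressable molecule from Appendix A as index triples.
molecule = [(0, 3, 4), (0, 6, 2), (0, 8, 1), (0, 9, 10), (1, 3, 5), (1, 7, 11)]
pb, ps, ob, os_ = encode_bitmap_triples(molecule)
print(bin(pb), ps, bin(ob), os_)
# 0b101000 [3, 6, 8, 9, 3, 7] 0b111111 [4, 2, 1, 10, 5, 11]
```

Python integers are unbounded, so the same code covers bitmaps that would need CBOR Bignums when serialized.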
Most existing RDF content is location-addressed. The IRIs are pointers to hosts that hold the content. If the host goes down the content is no longer available. This happens frequently enough to seriously undermine the robustness of systems relying on RDF [Polleres20].¶
Availability of content can be increased by caching the content on multiple peers. However, this results in the content receiving a new location. The original identifier does not match the location of the cache. Caching location-addressed content is complicated.¶
An alternative to identifying content by its location is to identify it by the content itself. This is called content-addressing. The hash of some content is computed and used as a unique identifier for the content.
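The basic idea can be illustrated with a few lines of code. This is a sketch under loud assumptions: `content_urn` is a hypothetical helper, and the choice of a 32-byte Blake2b digest rendered as unpadded Base32 under a urn:blake2b: prefix is made only for illustration. It is not meant to reproduce the exact identifier layout used in the examples of this document.

```python
import base64
import hashlib

def content_urn(encoded: bytes, digest_size: int = 32) -> str:
    """Derive an identifier purely from the bytes of some encoded content."""
    digest = hashlib.blake2b(encoded, digest_size=digest_size).digest()
    label = base64.b32encode(digest).decode("ascii").rstrip("=")
    return "urn:blake2b:" + label

original = bytes.fromhex("d8268262656e6c48656c6c6f20576f726c6421")
copy_on_other_peer = bytes(original)

# The same bytes yield the same identifier, wherever they are stored.
print(content_urn(original) == content_urn(copy_on_other_peer))  # True
```

Any peer holding a copy can recompute the identifier and thereby check that the content was not altered, which is what makes caching straightforward.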
In this section we illustrate how RDF data can be content-addressed. There are two concepts we need:¶
IRIs may include fragment identifiers. Fragment identifiers identify a secondary resource that is usually a part of, view of, defined in, or described in the primary resource.¶
In many transfer protocols, such as HTTP, fetching a resource with a fragment identifier (e.g. http://example.com/resource#part-a) will return the primary (or base) resource (http://example.com/resource) that contains the requested resource with fragment identifier (and any other sub-resources with fragment identifiers).
Fragment identifiers form a natural grouping of RDF triples and we define a fragment-molecule with this intuition.¶
Given an IRI base subject s that does not have a fragment part, a fragment-molecule is the set of triples where either:
For a blank node subject b, a fragment-molecule is the set of triples with b in subject position.
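Grouping triples into fragment-molecules can be sketched as follows. The sketch assumes that a triple belongs to the molecule of base subject s when its subject is either s itself or s followed by a fragment part; terms are plain strings, blank nodes are written as `_:label`, and the function names are hypothetical.

```python
from collections import defaultdict

def base_subject(subject: str) -> str:
    """Strip the fragment part, if any; blank node labels are left unchanged."""
    return subject.split("#", 1)[0]

def fragment_molecules(triples):
    """Group (subject, predicate, object) string triples by base subject."""
    groups = defaultdict(list)
    for s, p, o in triples:
        groups[base_subject(s)].append((s, p, o))
    return dict(groups)

triples = [
    ("https://example.com/activity", "as:actor", "xmpp:pukkamustard@jblis.xyz"),
    ("https://example.com/activity#object", "rdf:type", "as:Note"),
    ("_:b0", "rdf:type", "as:Note"),
]
for base, group in fragment_molecules(triples).items():
    print(base, len(group))
# https://example.com/activity 2
# _:b0 1
```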
When content-addressing, the base subject is replaced with an identifier that is determined exactly by the content of the molecule. Such identifiers are URNs. Blank node base subjects are likewise always replaced with URNs, so there is no need to use blank nodes at all. Blank nodes are not permitted in content-addressed molecules.
A content-addressable molecule is encoded like a regular molecule (see Section 3). The only difference is the types of terms encoded in the dictionary. We must make sure that terms are in a canonical form and that the base subject is replaced with a place-holder.¶
The CBOR tag 302 is defined for content-addressable molecules and MUST be used to clearly identify such molecules.
RDF terms are encoded as described in Section 2 with the additional requirement that for generic literals, the canonical lexical form is used.¶
As the base subject of a fragment-molecule cannot be part of the encoding itself, we use the CBOR undefined item as a place-holder value:
Similarly, we must make sure that in references to fragments of the molecule, the base subject is not present. We use the fragment constructor tag 305 with a single text string:
Finally the encoding of RDF terms appearing in a content-addressable fragment molecule is:¶
The dictionary is encoded as defined in Section 3.1 with terms ca-terms.
Ordering between term types is as follows:¶
Within the term types we use lexicographical order.¶
Multiple RDF/CBOR molecules and content-addressable molecules may be combined in a CBOR sequence [RFC8742]. In some cases it might be useful to explicitly tag a sequence (or stream) of RDF/CBOR molecules. For this we define the CBOR tag 300.
Note that the array holding stream elements may be an indefinite-length array.¶
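Both ways of combining molecules can be sketched at the byte level. The function names are hypothetical and the molecules are assumed to be already encoded CBOR items. A CBOR sequence is the plain concatenation of items, while the tagged stream here wraps them in an indefinite-length array (start byte 0x9f, terminated by the break byte 0xff) under tag 300.

```python
def cbor_sequence(encoded_molecules) -> bytes:
    """RFC 8742: a CBOR sequence is the concatenation of the encoded items."""
    return b"".join(encoded_molecules)

def tagged_stream(encoded_molecules) -> bytes:
    """300([_ molecule, molecule, ...]) using an indefinite-length array."""
    tag_300 = b"\xd9\x01\x2c"  # major type 6, 2-byte argument 300
    return tag_300 + b"\x9f" + b"".join(encoded_molecules) + b"\xff"

# Two placeholder items (empty arrays) stand in for encoded molecules.
print(tagged_stream([b"\x80", b"\x80"]).hex())  # d9012c9f8080ff
```

With the indefinite-length form a producer can emit the header first and append molecules as they become available, without knowing their number in advance.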
This specification requires the assignment of CBOR tags for various RDF/CBOR types. The tags are added to the CBOR Tags Registry as defined in RFC 8949 [RFC8949].
Tag | Data Item | Semantics |
---|---|---|
300 | array | RDF/CBOR Stream (see Section 5) |
301 | array | RDF/CBOR Molecule (see Section 3) |
302 | array | RDF/CBOR Content-Addressable Molecule (see Section 4) |
303 | array | RDF/CBOR generic literal (see Section 2.2.3) |
304 | text string | RDF/CBOR blank node (see Section 2.3) |
305 | array or text string | RDF/CBOR fragment constructor (see Section 2.1.1 and Section 4.2.1) |
We have described a binary RDF serialization based on CBOR. A reference implementation is provided in OCaml (see the ocaml-rdf library).¶
After implementing parsers for a bunch of other RDF serializations, using CBOR might not be such a bad idea. Low-level details of parsing are handled by CBOR and we can specify and document the encoding very concisely.¶
Initial tests seem promising. Further performance tests and comparisons with serializations such as HDT should be done.¶
Some possible improvements include:¶
Comments, feedback and questions are very welcome. Please get in touch with the author by mail or join the #openEngiadina IRC channel on the Libera network.
Development of RDF/CBOR was done as part of the openEngiadina project and was supported by the NLnet Foundation through the NGI0 Discovery Fund.
The openEngiadina developer rustra is imprisoned as a victim of political repression in Belarus. Read his last words in court and an interview with him. Consider donating to the Anarchist Black Cross Belarus. Support victims of repression and resist any form of repression and oppression. Resistance is not futile.¶
Term | CBOR Diagnostic | Encoded |
---|---|---|
<https://example.com/> | 266("https://example.com/") | 0xd9010a7468747470733a2f2f6578616d706c652e636f6d2f |
<https://example.com#fragment> | 266("https://example.com#fragment") | 0xd9010a781c68747470733a2f2f6578616d706c652e636f6d23667261676d656e74 |
<urn:uuid:1da600cf-c852-469a-936f-e608d3d90d9b> | 37(h'1da600cfc852469a936fe608d3d90d9b') | 0xd825501da600cfc852469a936fe608d3d90d9b |
<urn:uuid:1da600cf-c852-469a-936f-e608d3d90d9b#a> | 305([37(h'1da600cfc852469a936fe608d3d90d9b'), "a"]) | 0xd9013182d825501da600cfc852469a936fe608d3d90d9b6161 |
"Hello World!"@en | 38(["en", "Hello World!"]) | 0xd8268262656e6c48656c6c6f20576f726c6421 |
"asdf" | "asdf" | 0x6461736466 |
true | true | 0xf5 |
42 | 42 | 0x182a |
1.5 | 1.5 | 0xfa3fc00000 |
"POINT(7.9736903 47.5412464)"^^<http://www.opengis.net/ont/geosparql#wktLiteral> | 303([266("http://www.opengis.net/ont/geosparql#wktLiteral"), "POINT(7.9736903 47.5412464)"]) | 0xd9012f82d9010a782f687474703a2f2f7777772e6f70656e6769732e6e65742f6f6e742f67656f73706172716c23776b744c69746572616c781b504f494e5428372e393733363930332034372e3534313234363429 |
_:bnode0 | 304("bnode0") | 0xd9013066626e6f646530 |
Some sample RDF data in Turtle:¶
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix mo: <http://purl.org/ontology/mo/> .

<https://example.com/activity>
    a as:Create ;
    as:actor <xmpp:pukkamustard@jblis.xyz> ;
    as:published "2022-08-18T09:04:45-00:00"^^xsd:dateTime ;
    as:object <https://example.com/activity#object> .

<https://example.com/activity#object>
    a as:Create ;
    a as:Note ;
    geo:lat "45.1864" ;
    geo:long "5.7361" ;
    as:content "RDF/CBOR allows the efficient encoding of small pieces of content"@en .

<urn:uuid:c34d4219-5fbb-4e54-9217-1cbdaf831a64>
    a as:Listen ;
    as:published "2022-08-13T09:04:45-00:00"^^xsd:dateTime ;
    as:actor <xmpp:pukkamustard@jblis.xyz> ;
    as:object <urn:uuid:c34d4219-5fbb-4e54-9217-1cbdaf831a64#track> .

<urn:uuid:c34d4219-5fbb-4e54-9217-1cbdaf831a64#track>
    a mo:Track ;
    dcterms:creator "Funki Porcini" ;
    dcterms:title "Back Home" ;
    mo:musicbrainz <urn:uuid:a9dae29a-3f23-4c4b-804d-e125d4582adf> ;
    mo:release <urn:uuid:0f028066-5891-322e-ad8d-6aa588063a2e> ;
    foaf:maker <urn:uuid:2adb429d-e39c-467b-b175-3f40440ff630> .
The sample RDF data can be encoded in a single molecule (all triples are grouped together). The encoding in CBOR diagnostic notation:¶
[
  // dictionary
  [37(h'c34d42195fbb4e5492171cbdaf831a64'),
   [45, "#track"],
   266("https://example.com/activity"),
   [28, "#object"],
   37(h'0f0280665891322ead8d6aa588063a2e'),
   37(h'2adb429de39c467bb1753f40440ff630'),
   37(h'a9dae29a3f234c4b804de125d4582adf'),
   266("xmpp:pukkamustard@jblis.xyz"),
   266("http://purl.org/dc/terms/creator"),
   [25, "title"],
   [16, "ontology/mo/Track"],
   [28, "musicbrainz"],
   [28, "release"],
   266("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
   [18, "2003/01/geo/wgs84_pos#lat"],
   [41, "ong"],
   266("https://www.w3.org/ns/activitystreams#Create"),
   [38, "Listen"],
   [38, "Note"],
   [38, "actor"],
   [38, "content"],
   [38, "object"],
   [38, "published"],
   266("http://xmlns.com/foaf/0.1/maker"),
   38(["en", "RDF/CBOR allows the efficient encoding of small pieces of content"]),
   0("2022-08-13T09:04:45-00:00"),
   0("2022-08-18T09:04:45-00:00"),
   "45.1864",
   "5.7361",
   "Back Home",
   "Funki Porcini"],
  // bitmap triples
  0b100010001000001000,
  [13, 19, 21, 22, 8, 9, 11, 12, 13, 23, 13, 19, 21, 22, 13, 14, 15, 20],
  0b1111011111111111111,
  [17, 7, 1, 25, 30, 29, 6, 4, 10, 5, 16, 7, 3, 26, 16, 18, 27, 28, 24]
]
The binary representation in octets (715 bytes):¶
8598 1fd8 2550 c34d 4219 5fbb 4e54 9217 1cbd af83 1a64 8218 2d66 2374 7261 636b d901 0a78 1c68 7474 7073 3a2f 2f65 7861 6d70 6c65 2e63 6f6d 2f61 6374 6976 6974 7982 181c 6723 6f62 6a65 6374 d825 500f 0280 6658 9132 2ead 8d6a a588 063a 2ed8 2550 2adb 429d e39c 467b b175 3f40 440f f630 d825 50a9 dae2 9a3f 234c 4b80 4de1 25d4 582a dfd9 010a 781b 786d 7070 3a70 756b 6b61 6d75 7374 6172 6440 6a62 6c69 732e 7879 7ad9 010a 7820 6874 7470 3a2f 2f70 7572 6c2e 6f72 672f 6463 2f74 6572 6d73 2f63 7265 6174 6f72 8218 1965 7469 746c 6582 1071 6f6e 746f 6c6f 6779 2f6d 6f2f 5472 6163 6b82 181c 6b6d 7573 6963 6272 6169 6e7a 8218 1c67 7265 6c65 6173 65d9 010a 782f 6874 7470 3a2f 2f77 7777 2e77 332e 6f72 672f 3139 3939 2f30 322f 3232 2d72 6466 2d73 796e 7461 782d 6e73 2374 7970 6582 1278 1932 3030 332f 3031 2f67 656f 2f77 6773 3834 5f70 6f73 236c 6174 8218 2963 6f6e 67d9 010a 782c 6874 7470 733a 2f2f 7777 772e 7733 2e6f 7267 2f6e 732f 6163 7469 7669 7479 7374 7265 616d 7323 4372 6561 7465 8218 2666 4c69 7374 656e 8218 2664 4e6f 7465 8218 2665 6163 746f 7282 1826 6763 6f6e 7465 6e74 8218 2666 6f62 6a65 6374 8218 2669 7075 626c 6973 6865 64d9 010a 781f 6874 7470 3a2f 2f78 6d6c 6e73 2e63 6f6d 2f66 6f61 662f 302e 312f 6d61 6b65 72d8 2682 6265 6e78 4152 4446 2f43 424f 5220 616c 6c6f 7773 2074 6865 2065 6666 6963 6965 6e74 2065 6e63 6f64 696e 6720 6f66 2073 6d61 6c6c 2070 6965 6365 7320 6f66 2063 6f6e 7465 6e74 c078 1932 3032 322d 3038 2d31 3354 3039 3a30 343a 3435 2d30 303a 3030 c078 1932 3032 322d 3038 2d31 3854 3039 3a30 343a 3435 2d30 303a 3030 6734 352e 3138 3634 6635 2e37 3336 3169 4261 636b 2048 6f6d 656d 4675 6e6b 6920 506f 7263 696e 691a 0002 2208 920d 1315 1608 090b 0c0d 170d 1315 160d 0e0f 141a 0007 bfff 9311 0701 1819 181e 181d 0604 0a05 1007 0318 1a10 1218 1b18 1c18 18¶
A small RDF molecule in Turtle:¶
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<>
    a as:Create ;
    as:actor <xmpp:pukkamustard@jblis.xyz> ;
    as:published "2022-08-18T09:04:45-00:00"^^xsd:dateTime ;
    as:object <#object> .

<#object>
    a as:Note ;
    as:content "RDF is underused in decentralized systems. RDF/CBOR is an attempt to change that." .
Encoded as a content-addressable molecule in CBOR diagnostic notation:
302([
  // dictionary
  [undefined,
   305("object"),
   266("xmpp:pukkamustard@jblis.xyz"),
   266("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
   266("https://www.w3.org/ns/activitystreams#Create"),
   [38, "Note"],
   [38, "actor"],
   [38, "content"],
   [38, "object"],
   [38, "published"],
   0("2022-08-18T09:04:45-00:00"),
   "RDF is underused in decentralized systems. RDF/CBOR is an attempt to change that."],
  // bitmap triples
  0b101000,
  [3, 6, 8, 9, 3, 7],
  0b111111,
  [4, 2, 1, 10, 5, 11]
])
The binary representation in octets (329 bytes):¶
d901 2e85 8cf7 d901 3166 6f62 6a65 6374 d901 0a78 1b78 6d70 703a 7075 6b6b 616d 7573 7461 7264 406a 626c 6973 2e78 797a d901 0a78 2f68 7474 703a 2f2f 7777 772e 7733 2e6f 7267 2f31 3939 392f 3032 2f32 322d 7264 662d 7379 6e74 6178 2d6e 7323 7479 7065 d901 0a78 2c68 7474 7073 3a2f 2f77 7777 2e77 332e 6f72 672f 6e73 2f61 6374 6976 6974 7973 7472 6561 6d73 2343 7265 6174 6582 1826 644e 6f74 6582 1826 6561 6374 6f72 8218 2667 636f 6e74 656e 7482 1826 666f 626a 6563 7482 1826 6970 7562 6c69 7368 6564 c078 1932 3032 322d 3038 2d31 3854 3039 3a30 343a 3435 2d30 303a 3030 7851 5244 4620 6973 2075 6e64 6572 7573 6564 2069 6e20 6465 6365 6e74 7261 6c69 7a65 6420 7379 7374 656d 732e 2052 4446 2f43 424f 5220 6973 2061 6e20 6174 7465 6d70 7420 746f 2063 6861 6e67 6520 7468 6174 2e18 2886 0306 0809 0307 183f 8604 0201 0a05 0b¶
The URN of the molecule when using the Blake2b-256 hash function is urn:blake2b:7B6VYVGTSQC7KWXANVA4PYUP6VDGSNIOLYX4QLY7AF5CKHAIMJ4QE7U3DTGPCSSFEW4PIJ4OFZ4AEZVYEOZV3KW476RDGUFZR4JGOOY.
Back as Turtle using the computed base subject:¶
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<urn:blake2b:7B6VYVGTSQC7KWXANVA4PYUP6VDGSNIOLYX4QLY7AF5CKHAIMJ4QE7U3DTGPCSSFEW4PIJ4OFZ4AEZVYEOZV3KW476RDGUFZR4JGOOY>
    a as:Create ;
    as:actor <xmpp:pukkamustard@jblis.xyz> ;
    as:published "2022-08-18T09:04:45-00:00"^^xsd:dateTime ;
    as:object <#object> .

<urn:blake2b:7B6VYVGTSQC7KWXANVA4PYUP6VDGSNIOLYX4QLY7AF5CKHAIMJ4QE7U3DTGPCSSFEW4PIJ4OFZ4AEZVYEOZV3KW476RDGUFZR4JGOOY#object>
    a as:Note ;
    as:content "RDF is underused in decentralized systems. RDF/CBOR is an attempt to change that." .