JSON-LD
JSON-LD is a W3C standard for serializing semantic web RDF graph data as JSON. With it, you can encode a graph using JSON like this:
{ "@context": "http://schema.org", "@id": "http://microsoft.com/people/#SatyaNadella", "name": "Satya Nadella", "jobTitle": "CEO"}
This document explains the rationale behind the semantic web approach to representing data. It also explains the basics of JSON-LD syntax so that you can understand it when you read it and so you can encode your own data as JSON-LD.
If you'd like a more detailed introduction to these concepts and technologies, please read our tutorial.
RDF and semantic web
JSON-LD serializes Resource Description Framework (or RDF) data. Fluree implements JSON-LD for transactions; you send JSON-LD objects to Fluree to add data to your database.
In RDF, a resource is anything that you want to capture information about; it's analogous to an entity. The Resource Description Framework is a standard that lays out rules for capturing data -- describing resources -- such that the data can be unambiguously understood by any system that understands RDF. It's like a protocol for representing data, similar to how TCP/IP and HTTP are protocols for exchanging information on the internet.
Without such a standard, every data system is responsible for creating its own way to represent data: What should fields be named? How do you refer to entities? How do you capture relationships between entities? Data systems must, in effect, create their own idiosyncratic representational systems for their data.
Distributed data -- data that is not managed by a central authority, with an uncoordinated network of producers and consumers -- is much more useful when it's translated into the same representation. Developers have spent untold hours translating among these idiosyncratic systems, and a shared standard like RDF removes much of the need for this kind of work.
The RDF spec has rules for two aspects of representing data:
- How to describe the relationship between two resources
- How to unambiguously refer to resources
Relationships are captured as subject, predicate, object (SPO) triples. The examples below are informal (indicated by the <> brackets), meaning they don't follow formatting specifications, but they convey the underlying idea:
<Reggie> <loves> <honey>
<Reggie> <loves> <guitar>
<Megan> <is the mother of> <Ruby>
<Ruby> <loves> <mangoes>
subject | predicate | object |
---|---|---|
<Reggie> | <loves> | <honey> |
<Reggie> | <loves> | <guitar> |
<Megan> | <is the mother of> | <Ruby> |
<Ruby> | <loves> | <mangoes> |
Triples like this allow us to describe relationships between any two things. This is the essence of data: two things, and the relationship between them.
Triples can be used to represent graphs:
This is an essential aspect of RDF: it's used to represent graph data.
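As a rough illustration, the informal triples from the table can be grouped into a graph structure in Python. This is only a sketch under the assumption that triples are stored as plain tuples; the name `build_graph` is made up for this example and is not part of RDF or any library.

```python
from collections import defaultdict

# Informal (subject, predicate, object) triples from the table above.
triples = [
    ("Reggie", "loves", "honey"),
    ("Reggie", "loves", "guitar"),
    ("Megan", "is the mother of", "Ruby"),
    ("Ruby", "loves", "mangoes"),
]


def build_graph(triples):
    """Group triples by subject so each node maps to its outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)

# Print every edge in the graph as "node --predicate--> target".
for node, edges in graph.items():
    for predicate, target in edges:
        print(f"{node} --{predicate}--> {target}")
```

Each subject becomes a node, and each predicate becomes a labeled edge pointing at the object.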
So that's the first part of the data "language" that RDF describes. The second part is how to unambiguously refer to resources: my <Reggie> may or may not be the same as your <Reggie>; for that matter, my <loves> might be different from yours. Resolving these kinds of ambiguities is a huge part of the work of translating among data systems.
RDF introduces structure to our data representations so that we can unambiguously refer to the things we're describing and their relationships. Instead of freeform plain text, we use Internationalized Resource Identifiers, or IRIs (a generalization of URIs that also allows Unicode characters). The above data might be captured like this:
http://reggies-domain.com/#person http://emotions.com/love "honey"
http://reggies-domain.com/#person http://emotions.com/love "guitar"
http://megan-family.com/megan http://schema.com/parent http://megan-family.com/ruby
http://megan-family.com/ruby http://emotions.com/love "mangoes"
If you're using the IRI http://megan-family.com/megan and I'm also using that IRI, we can be certain that we're referring to the same entity. Likewise, wherever we use the IRI http://schema.com/parent, we know we're both referring to the same kind of relationship between two entities.
Note that the objects in SPO triples can be strings ("honey" and "mangoes" above). They can also be numbers and other values; for more information see RDF datatypes.
JSON-LD is a standard for serializing these RDF triples into JSON. Serializing RDF data as JSON is desirable because there is ample language and tooling support for working with JSON.
The rest of this document explains basic vocabulary and rules for understanding how a JSON-LD object relates to RDF data.
node objects and properties
JSON-LD uses JSON objects to capture the subject, predicate, object triples that describe resources. In a JSON-LD context, a JSON object can be referred to as a node object. The word node is used because RDF captures graph data, and a single JSON object is "about" a node in that graph.
This JSON object describes a node that has a "http://emotions.com/love" relationship with "honey":
{ "http://emotions.com/love": "honey" }
We would say that the node has the property "http://emotions.com/love", with a value of "honey". Property is synonymous with "predicate", and value is synonymous with "object".
But what is the subject? How do you identify the node being described? That's where the @id keyword comes in.
@id and keywords
The @id keyword designates the subject for a node object:
{ "@id": "http://reggies-domain.com/#person", "http://emotions.com/love": ["honey", "guitar"] }
A library that parses JSON objects as JSON-LD could read this and convert it into the following RDF triples:
http://reggies-domain.com/#person http://emotions.com/love "honey"
http://reggies-domain.com/#person http://emotions.com/love "guitar"
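To make the conversion concrete, here is a minimal Python sketch of how a flat node object could be turned into triples. The helper `node_to_triples` is hypothetical and handles only "@id" plus plain values or lists of plain values, which is far less than a real JSON-LD processor does. The two values are written as a JSON array because JSON parsers typically keep only the last of two duplicate keys.

```python
import json


def node_to_triples(node):
    """Turn one flat node object into (subject, predicate, object) tuples.

    Hypothetical helper for illustration only; it ignores contexts,
    nested nodes, and every keyword other than "@id".
    """
    subject = node["@id"]
    triples = []
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as "@id" are not properties
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, key, v))
    return triples


doc = json.loads("""
{
  "@id": "http://reggies-domain.com/#person",
  "http://emotions.com/love": ["honey", "guitar"]
}
""")

for triple in node_to_triples(doc):
    print(triple)
```

Every non-keyword key becomes a predicate, and each of its values becomes the object of one triple.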
keywords
The "@id" example above shows that the JSON format by itself does not say how to parse a JSON object as a set of RDF triples. We need special designators like "@id", along with rules for interpreting these designators, in order to treat JSON objects as RDF data.
These designators are called keywords, and their role in including RDF information in JSON objects is defined by the W3C's JSON-LD specification.
Generally, the key-value pairs of a JSON object are treated as properties and values for a node. Keywords, however, are treated differently, as described by the spec. We've already seen this with @id: it doesn't designate another property of the node; rather, it indicates what node is being described by the JSON object.
This doc also describes @context and @type, showing how they help serialize RDF data as JSON objects.
@context, aliases, metadata
The @context keyword is almost always included in JSON-LD objects, and understanding its role is essential to understanding the JSON-LD data you're working with. It serves two primary functions:
- It defines aliases
- It contains other metadata like the language being used (English, Chinese, etc)
aliases
You'll most often see @context used to define namespace aliases. Namespace aliases allow you to use compact IRIs.
IRIs take the form of http://domain.com/some/path/your/value, and including this full path every time can be very repetitive. It's harder for humans to write and parse, and it can unnecessarily consume bandwidth. Compact IRIs are a way of reducing this repetition. Here's an example:
{ "@context": { "schema": "http://schema.org/" }, "schema:familyName": "Worrel", "schema:givenName": "Ernest"}
schema:familyName and schema:givenName are compact IRIs. Compact IRIs are handled as follows:
- If a string has the form namespace:identifier, and namespace is a key under @context
- Then replace namespace: with the corresponding value under @context
In this example, instances of "schema:" are replaced with "http://schema.org/". Applications that understand JSON-LD, like Fluree, will expand namespaces into their full form when parsing JSON-LD objects.
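The two-step rule above can be sketched in a few lines of Python. The function name `expand_compact_iri` is an assumption made for this example, and the sketch assumes the context maps each prefix directly to a string; a real JSON-LD processor handles many more cases.

```python
def expand_compact_iri(value, context):
    """Expand "prefix:suffix" when the prefix is a key in the context."""
    prefix, sep, suffix = value.partition(":")
    # A suffix starting with "//" means the value is already an absolute
    # IRI such as "http://...", so it is left alone.
    if sep and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return value  # not a compact IRI we know about; leave unchanged


context = {"schema": "http://schema.org/"}

print(expand_compact_iri("schema:familyName", context))
print(expand_compact_iri("http://example.com/x", context))
```

Strings whose prefix is not defined in the context pass through untouched, which is why full IRIs are not altered.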
replacements
@context can also be used to define simple replacements:
{ "@context": { "familyName": "http://schema.org/familyName", "givenName": "http://schema.org/givenName" }, "familyName": "Worrel", "givenName": "Ernest" }
When a key in a JSON-LD object (like "familyName") has an entry in the @context, then that key will be replaced with the value in the @context.
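A minimal Python sketch of this replacement step follows. It assumes every entry in the @context is a plain string mapping a term to an IRI; `expand_keys` is a hypothetical name for illustration, not a function from a JSON-LD library.

```python
def expand_keys(doc):
    """Replace each key that has an entry in "@context" with its IRI."""
    context = doc.get("@context", {})
    return {
        context.get(key, key): value  # fall back to the key itself
        for key, value in doc.items()
        if key != "@context"
    }


doc = {
    "@context": {
        "familyName": "http://schema.org/familyName",
        "givenName": "http://schema.org/givenName",
    },
    "familyName": "Worrel",
    "givenName": "Ernest",
}

expanded = expand_keys(doc)
print(expanded)
```

The expanded object no longer needs the @context, because every key is now a full IRI.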
@base, @vocab, and relative IRIs
JSON-LD supports relative IRIs. These are analogous to relative file system paths or relative URLs, like when you use ./tutorial/introduction instead of an absolute path like https://next.developers.flur.ee/docs/learn/tutorial/introduction/.
You can use a relative IRI for @id values. These are all valid:
{ "@id": "" }
{ "@id": "SatyaNadella" }
{ "@id": "people/SatyaNadella" }
By default, the relative IRI is resolved against the IRI used to retrieve the JSON-LD document, so if you got these examples from http://microsoft.com/ then they would resolve to:
{ "@id": "http://microsoft.com/" }
{ "@id": "http://microsoft.com/SatyaNadella" }
{ "@id": "http://microsoft.com/people/SatyaNadella" }
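For simple cases like these, Python's standard `urllib.parse.urljoin` performs the same kind of reference resolution, so it can serve as a rough sketch of what a JSON-LD processor does. Treat it as an approximation rather than a full implementation of the resolution rules.

```python
from urllib.parse import urljoin

# The IRI the JSON-LD document was retrieved from.
base = "http://microsoft.com/"

relative_ids = ["", "SatyaNadella", "people/SatyaNadella"]

# Resolve each relative IRI against the base IRI.
resolved = [urljoin(base, relative_id) for relative_id in relative_ids]

for relative_id, absolute_id in zip(relative_ids, resolved):
    print(repr(relative_id), "->", absolute_id)
```

An empty relative IRI resolves to the base IRI itself.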
You can also set the base IRI for resolving relative IRIs using the @base keyword in a @context:
{ "@context": { "@base": "http://google.com/" }, "@id": "SundarPichai"}
Here, the @id would resolve to http://google.com/SundarPichai.
You can also set the base IRI for properties and @type values by setting @vocab in the @context:
{ "@context": { "@base": "http://google.com/", "@vocab": "http://schema.org/" }, "@id": "SundarPichai", "name": "Sundar Pichai" }
This JSON-LD object serializes this RDF triple:
http://google.com/SundarPichai http://schema.org/name "Sundar Pichai"
You can also define your @vocab relative to your @base:
{ "@context": { "@base": "http://google.com/", "@vocab": "properties/" }, "@id": "SundarPichai", "name": "Sundar Pichai" }
Here, the @vocab is resolved to http://google.com/properties/, so this object serializes this triple:
http://google.com/SundarPichai http://google.com/properties/name "Sundar Pichai"
If you set @base and include an empty string for @vocab, then @vocab will resolve to the same string as @base:
{ "@context": { "@base": "http://google.com/", "@vocab": "" }, "@id": "SundarPichai", "name": "Sundar Pichai" }
This object serializes this triple:
http://google.com/SundarPichai http://google.com/name "Sundar Pichai"
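The three @vocab cases above can be sketched in Python. This is a simplified illustration, assuming the property is a plain term (no colon, not a keyword) and the context contains only string values for @base and @vocab; `expand_property` is a hypothetical helper, not part of any JSON-LD library.

```python
from urllib.parse import urljoin


def expand_property(name, context):
    """Expand a plain property name using "@vocab" (and "@base")."""
    vocab = context.get("@vocab")
    if vocab is None:
        return name  # no vocabulary mapping; leave the name unchanged
    # A relative or empty @vocab is first resolved against @base.
    vocab_iri = urljoin(context.get("@base", ""), vocab)
    # The property name is then appended to the vocabulary IRI.
    return vocab_iri + name


absolute = {"@base": "http://google.com/", "@vocab": "http://schema.org/"}
relative = {"@base": "http://google.com/", "@vocab": "properties/"}
empty = {"@base": "http://google.com/", "@vocab": ""}

for context in (absolute, relative, empty):
    print(expand_property("name", context))
```

Note that the property name is concatenated to the vocabulary IRI, whereas an @id is resolved against @base the way a relative URL would be.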
metadata
The @context can also be used to set metadata like the language of string values in the JSON-LD object.
Other ways to define the context
The value of @context can be a JSON object. There are two other ways to define the @context:
- Provide a URL that points to a document containing a JSON object that defines the @context
- Provide an array of such URLs
These are both valid:
{ "@context": "http://schema.org" }
{ "@context": [ "http://schema.org", "http://www.w3.org/2004/02/skos/core#" ] }
Further reading
This document doesn't cover the full range of behavior governed by the @context object. To learn more, check out the JSON-LD specification.
@type
The @type keyword is an alias for rdf:type. rdf:type is a property in the RDF vocabulary whose meaning is described by RDF Schema, also known as RDFS.
A full explanation of RDFS and how to use it is outside the scope of these docs. The W3C specification is a good starting point, and the book Semantic Web for the Working Ontologist gives the topic a full treatment.
JSON-LD and Fluree
Fluree introduces additional concepts and tools for working with JSON-LD.
id and type
Fluree's built-in context includes:
{ "id": "@id", "type": "@type"}
Aliasing @id and @type like this is somewhat common among RDF practitioners. If you see id and type being used as keys without the preceding @ symbol, it's likely that those keys are aliases of the corresponding JSON-LD keywords. The aliases can be set in the @context.
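As a rough sketch of what keyword aliasing means, the Python snippet below rewrites aliased keys back to their JSON-LD keywords. The helper `unalias_keywords` and the sample "Person" type are assumptions made for illustration; this does not show how Fluree implements its built-in context.

```python
# Aliases matching the built-in context shown above.
KEYWORD_ALIASES = {"id": "@id", "type": "@type"}


def unalias_keywords(node, aliases=KEYWORD_ALIASES):
    """Rename aliased keys (such as "id") to their JSON-LD keywords."""
    return {aliases.get(key, key): value for key, value in node.items()}


# Hypothetical node object written with the aliases.
node = {"id": "320-1214-1231", "type": "Person", "name": "Bill Waters"}

print(unalias_keywords(node))
```

Keys without an alias, like "name" here, are left as ordinary properties.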
Fluree accepts plain JSON
Fluree allows you to transact JSON objects like the following:
{ "@id": "320-1214-1231", "name": "Bill Waters"}
If you send Fluree data like this, then it will treat "320-1214-1231" and "name" as naive identifiers, unless you have used @base and @vocab to instead treat the @id value like a relative IRI and name like a relative property name.