Analytical Query
FlureeQL Analytical Queries are used to answer more sophisticated questions about your data.
Analytical Queries can:
- Query across multiple internal and external datasets
- Query across the same Fluree at different points in time
- Calculate aggregates
- Create complicated joins
- Group results by a single or multiple variables
- Use aggregate values calculated mid-query to filter results
We utilized concepts of logic programming and variable binding to give an immense amount of query potential that can largely be designed by you.
This section covers analytical queries using the FlureeQL syntax. All code examples are shown in FlureeQL. All of these queries can be issued through the API or the user interface (select FlureeQL
in the sidebar, then make sure Query
is selected in the top-right, as well as in the dropdown).
To issue these queries using the API, see /query
.
You can also issue multiple queries at once using the /multi-query
endpoint.
Query Keys
FlureeQL Analytical Queries are also structured as a JSON object/map and may contain the following keys:
Key | Required? | Description |
---|---|---|
select, selectOne, or selectDistinct | yes | select returns all relevant results, selectOne returns one result, and selectDistinct only returns unique results. See select key |
where | yes | A collection of tuples which allow for complex filtering of data. |
block | no | Optional block specified by block number, duration, or wall-clock time as a ISO-8601 formatted string. This applies a block to every part of the query that does not have a block specified. It follows the same syntax as the block key in basic queries. |
prefixes | no | Optional map of outside sources. |
vars | no | Optional map of variable bindings. |
opts | no | Optional map where options like limit , orderBy , prettyPrint , and wikipediaOpts can be specified. |
Select Key
select
returns all relevant results, selectOne
returns one result, and selectDistinct
only returns unique results. Exactly one of the select keys is required.
Item | Example | Description |
---|---|---|
Variable | "?apple" | Variables are declared in the where clause |
Variable Select-Array | {"?var1": ["*"]} {"?var1": ["*", {"chat/person": ["*"]}]} | A map where the key is a valid variable and the value is a valid select-array. This only works when the variable is bound to subject ids. |
Aggregate variable | "(avg ?nums)" | Any valid variables can be wrapped in an aggregate function. All valid aggregate functions are listed in the table below. |
The value for the select key can be any of the items in the following table. Your select key can be multiple of these select items. In the case that it is multiple items, these items should be wrapped in [ ]
. For example [ "?var1", "(sum ?var2)", {"?var3": ["*"] }, "?var4" ]
.
If your select value only has one item, then you can either wrap it in square brackets or not (i.e. { "select": ["?var"] }
or { "select": "?var" }
). If your select key is an array, then your results will also return an array.
Valid Aggregate Functions
All the following are valid aggregate functions. These below functions can only be applied to variables- you cannot provide your own array. In addition, they cannot be nested within each other- for example (rand (sample 2 ?age))
is NOT valid.
Function | Arguments | Description |
---|---|---|
avg | variable | Returns the average of the values. |
count | variable | Returns a count of the values. |
count-distinct | variable | Returns a count of the distinct values. This is equivalent to (count (distinct ?var)) |
max | variable | Returns the largest value. |
min | variable | Returns the smallest value. |
median | variable | Returns the median of the values. |
rand | variable | Returns a random value from the specified values. |
sample | n, variable | Given a sample size and a set of values, returns an appropriately sized sample, i.e. (sample 2 ?age) returns two ages from values bound to the variable, ?age . |
stddev | variable | Returns the standard deviation of the values. |
sum | variable | Returns the sum of the values. |
variance | variable | Returns the variance of the values. |
Additional Modifiers
as
:as
can be wrapped around the ENTIRE aggregate function to rename the result for this aggregate. This renaming only applies to how the results are displayed if pretty printed. For example,(as (sum ?nums) ?sum)
.distinct
: If you want an aggregate function to only apply to the set of distinct values, you can wrapdistinct
around the variable. For example,(variance (distinct ?nums))
or(as (sum (distinct ?nums)) ?sum)
.
If using groupBy
, aggregates are applied only to the values within that group.
If selecting a mixture of aggregate and non-aggregate variables, the results of the aggregate are returned alongside every non-aggregate value. To see what this looks like, try issuing the below query against the basic schema.
{
"select": ["?nums", "(avg ?nums)"],
"where": [["?person", "person/favNums", "?nums"]]
}
Where Key
The value for the where key is an array- we'll call it a where-array in this section. Each item of the where-array acts to filter data, bind variables, or perform other complex data actions. The order of the items in a where-array is extremely important, as each item is resolved sequentially.
A where-array can be comprised of any of the following items in any order:
Item | Function |
---|---|
three-tuple | Looks for flakes (pieces of data) in the ledger that match a provided pattern, and bind variables to the results. |
four-tuple | Same as a simple three-tuple, except the first item in the tuple specifies a particular data source, such as Wikidata, another ledger, or the current ledger at a specific in time. |
two-tuple variable binding | Bind a variable to a value, including to a calculated aggregate value. These variables can be used in subsequent where-array items. |
binding map | Serves the same function as a two-tuple variable binding, except the syntax is different. |
optional map | Like, three-tuples, looks for flakes (pieces of data) that match a provided pattern. However, when joining the results of this map with the queries existing resultset, will simply bind null if there is no match (a left outer join). |
union map | The two parts of a union map are outer joined. |
filter map | Filters the results up to that point in the query. |
Each where-array item is resolved and then inner joined with the resultset up until that point. See the Analytical Query guide for more information.
Three Tuple
Every piece of data in Fluree can be expressed as [ subject, predicate, object ]
. To read more about this, see the Subject-Predicate-Object Model in the Technical Overview
document.
When you include a three-tuple in your where-array, you specify the values for one or two parts of the tuple. The parts you don't specify are either null
or variables. For example, in this three tuple, ["?person", "person/handle", "?handle"]
, only the predicate person/handle
is specified. The subject ?person
and the object ?handle
are variables. You can also look-up a data for a known id using the pattern ["?s", "_id", 351843720888320]
.
The below table lists all the possible values for each part of the three tuple.
Part | Value |
---|---|
subject | Can be an subject id, unique two-tuple, a variable (a string that begins with ? ), or null. |
predicate , predicate+[RECUR-DEPTH] , or service call | Can be either predicate name, the Fluree reserved word "_id", or a variable (a string that begins with ? ). Reverse references are NOT supported. If using a predicate name, can add + after the predicate name to signify recur. Can also specify how many times to recur, i.e. person/follows+3 .If using "_id", the object should reference a known id. The second item in the clause can also be a service call. All supported service calls are in the subsequent table. |
object or object-function | Can be a value, subject id, unique two-tuple, a variable (a string that begins with ? ), or null. The object can also be a filter function, which contains an existing or a newly declared variable, for example, (= 20 ?nums) . The filters follow the same syntax as filter maps, and you can read more in that section. If your object begins with a ? , and it is NOT a variable, for example someone's name is ?Fred , then you can use escape strings in the object, for example: "\"?Fred\"" . |
A three tuple acts as a pattern in a where-array. First it pulls all the data that matches that given pattern, and then it binds the appropriate variables. Subsequent three tuples are inner-joined. For more information and examples, see inner joins in Fluree.
Supported Service Call | Example Three-tuple | Description |
---|---|---|
Full Text Search a Collection | ["?movie", "fullText:movie", "redemption"] | This service call searches for the word redemption in any predicates in the movie collection that are enabled for full text search. See the Full Text Search guide for more examples and details. |
Full Text Search a Predicate | ["?movie", "fullText:movie/title", "redemption"] | This service call searches for the word redemption in movie/title s. This only works if the movie/title predicate is enabled for full text search. See the Full Text Search guide for more examples and details. |
Collection Select | ["?movie", "rdf:type", "movie"] | This service call binds all subject ids where the subject belongs to the movie . collection to the provided variable, ?movie . |
Four Tuple
Four tuples are exactly the same as three tuples, except that four tuples specify a data source for the subject, predicate, object
pattern. Currently, Fluree supports querying across multiple Fluree ledgers and across Wikidata. The following are the sources that can be used.
Source | Example | Description |
---|---|---|
This ledger | $fdb | Default source. The current version of a given Fluree. Can be omitted. |
This ledger at a Previous Point in Time | $fdb3 , $fdb2019-03-14T20:59:36.097Z , $fdbPT5M | This ledger at a specified block. The block is specified either by providing the block integer, ISO-8601 formatted wall clock time, or ISO-8601 formatted duration ago |
Other Fluree ledger | ftest | Other Fluree ledger must be specifed in the prefixes key. |
Other Fluree ledger | ftestPT5M | Other ledger at a specified block. The block is specified either by providing the block integer, ISO-8601 formatted wall clock time, or ISO-8601 formatted duration ago |
Wikidata | $wd | Wikidata |
For example, the three tuple ["?person", "person/handle", "?handle"]
is equivalent to ["$fdb", "?person", "person/handle", "?handle"]
, because the specified source is the current ledger. The four tuple, ["$fdbPT2H", "?person", "person/handle", "?handle"]
is matching the given tuple pattern to the subject-predicate-object triples active in the ledger as of two hours ago.
{
"select": "?nums",
"where": [
["?person", "rdf:type", "person"],
["?person", "person/handle", "?handle"],
["?person", "person/favNums", "#(> 80 ?nums)"]
]
}
Two Tuple Variable Binding
You can bind a variable to an aggregate value. As mentioned above, where-array items are resolved in the order in which they appear. This means that aggregates do NOT take into account the where-array item that comes later in the array. To bind an intermediate aggregate value, just specify a two-tuple with the first item as a variable and the second item as the aggregate function. For example ["?maxBlock", "#(max ?bNum)"]
.
Valid functions are the same as the ones listed in Valid Aggregate Functions. You can also use the aggregate modifier (distinct
) explained in the valid aggregate section (you can use the as
modifier, but this will simply by ignored). A function must be preceded with a #
.
- FlureeQL
- Curl
- GraphQL
- SPARQL
{
"select": "?hash",
"where": [
["?s", "_block/number", "?bNum"],
["?maxBlock", "#(max ?bNum)"],
["?s", "_block/number", "?maxBlock"],
["?s", "_block/hash", "?hash"]
]
}
curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FLUREE_TOKEN" \
-d '{"select": "?hash",
"where": [
["?s", "_block/number", "?bNum"],
["?maxBlock", "#(max ?bNum)"],
["?s", "_block/number", "?maxBlock"],
["?s", "_block/hash", "?hash"]
]}' \
[HOST]/api/db/query
Not supported
SELECT ?hash
WHERE {
?s fdb:_block/number ?bNum.
BIND (MAX(?bNum) AS ?maxBlock)
?s fdb:_block/number ?maxBlock.
?s fdb:_block/hash ?hash.
}
You can also simply specify values as the second item in the tuple. This binds the given variable to the value provided.
{
"select": "?person",
"where": [
["?handle", "jdoe"],
["?person", "person/handle", "?handle"]
]
}
Binding Map
Binding works the same as intermediate aggregate values, except the syntax is different.
A bind
map (or multiple maps), can be declared anywhere in the where clause. The bind
map must precede any clauses that use the variables declared in the map.
The map is comprised of keys that correspond to variables, and values. For example, {"bind": {"?handle": "dsanchez"}}
. You can bind multiple variables in the same map, as well, for example {"bind": {"?handle": "dsanchez", "?person": 351843720888324}}
.
- FlureeQL
- Curl
- GraphQL
- SPARQL
{
"select": ["?person", "?handle"],
"where": [
{ "bind": { "?handle": "dsanchez" } },
["?person", "person/handle", "?handle"]
]
}
curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FLUREE_TOKEN" \
-d '{
"select": [ "?person", "?handle"],
"where": [
{"bind": {"?handle": "dsanchez"}},
[
"?person",
"person/handle",
"?handle"
]
]
}' \
[HOST]/api/db/query
Not supported
SELECT ?person ?handle
WHERE {
BIND ("dsanchez" AS ?handle)
?person fdb:person/handle ?handle.
}
Like intermediate aggregate clauses, binds can use aggregate functions. See the Valid Aggregate Functions list above.
- FlureeQL
- Curl
- GraphQL
- SPARQL
{
"select": "?hash",
"where": [
["?s", "_block/number", "?bNum"],
{ "bind": { "?maxBlock": "#(max ?bNum)" } },
["?s", "_block/number", "?maxBlock"],
["?s", "_block/hash", "?hash"]
]
}
curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FLUREE_TOKEN" \
-d '{"select": "?hash",
"where": [
["?s", "_block/number", "?bNum"],
{"bind": {"?maxBlock": "#(max ?bNum)"}},
["?s", "_block/number", "?maxBlock"],
["?s", "_block/hash", "?hash"]
]}' \
[HOST]/api/db/query
Not supported
SELECT ?hash
WHERE {
?s fdb:_block/number ?bNum,
BIND (MAX(?bNum) AS ?maxBlock)
?s fdb:_block/number ?maxBlock.
?s fdb:_block/hash ?hash.
}
Optional Map
An optional map is a map where the key is optional
, and the value is an array of where-items (you can include any where-items within an optional map, such as three-tuples, four-tuples, two-tuple variable bindings, other optional maps, etc).
For example:
{
"optional": [
["?person", "person/fullName", "?name"],
["?favNums", 1223],
["?person", "person/favNums", "?favNums"]
]
}
Anything within an optional map is resolved and then left outer joined with previous results. In other words, optional maps do not remove any entries from the existing resultset - any rows from the original resultset that do not have a match are bound with null
. For a more in-depth explanation of optional clauses, see the optional clauses guide.
Union Map
A union map takes an two-item array as a value. The two items in this array is itself an array comprised of any number of where-items (for example, three-tuples, two-tuples, optional maps, and filter maps). The results of each array of where-items is then outer joined.
{
"select": ["?x", "?y", "?b", "?c", "?cname", "?pname"],
"where": [
{
"union": [
[
["?x", "_collection/name", "?cname"],
["?b", 2]
],
[
["?y", "_predicate/name", "?pname"],
["?c", 8]
]
]
}
]
}
Filter Map
A filter map has the key filter
, and the value is a number of filter functions. Currently, all filter functions should be written using Clojure. The following are all of the accepted filter functions:
Name | Example | Explanation |
---|---|---|
> | (> 10 ?nums) | greater than |
>= | (>= 10 ?nums) | greater than or equal to |
< | (< 10 ?nums) | less than |
<= | (<= 10 ?nums) | less than or equal to |
= | (= 10 ?nums) | equal to |
not= | (not= 10 ?nums) | not equal to |
+ | (> 11 (+ 10 ?nums)) | add |
- | (- 11 (+ 10 ?nums)) | subtract |
* | (> 11 (* 10 ?nums)) | multiply |
/ | (> 11 (/ 10 ?nums)) | divide |
and | (and (> 10 ?nums1) (< 100 ?nums2)) | and |
&& | (&& (> 10 ?nums1) (< 100 ?nums2)) | Same as and. |
or | (or (> 10 ?nums1) (< 100 ?nums2)) | or |
|| | (\|\| (> 10 ?nums1) (< 100 ?nums2)) | Same as or. |
nil? | (nil? ?nums) | nil? |
bound | (bound ?nums) | True if non-nil |
coalesce | (> ?nums (coalesce ?age 30)) | Coalesce takes the first non-nil argument, example below. |
if | (if (> ?nums 30) (> ?age 100) true) | if |
not | (not (> ?nums 30)) | not |
! | (! (> ?nums 30)) | Same as not |
now | (> ?nums (now)) | Returns the current time in epoch milliseconds. |
re-pattern | (re-pattern "\\d+") | Returns a regex pattern, for use in re-find. re-pattern |
re-find | (re-find (re-pattern \"^_collection\") ?name) | Returns the first regex match. re-find |
strStarts | (strStarts ?message \"Hi\") | Returns true if the string starts with the given substring, case sensitive. |
strEnds | (strEnds ?message \"Amy.\") | Returns true if the string ends with the given substring, case sensitive. |
An example query:
- FlureeQL
- Curl
- GraphQL
- SPARQL
{
"select": ["?handle", "?num"],
"where": [
["?person", "person/handle", "?handle"],
["?person", "person/favNums", "?num"],
{ "filter": ["(> 10 ?num)", "(= \"jdoe\" ?handle)"] }
]
}
curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FLUREE_TOKEN" \
-d '{
"select": ["?handle", "?num"],
"where": [ ["?person", "person/handle", "?handle"],
["?person", "person/favNums", "?num"],
{ "filter": [ "(> 10 ?num)", "(= \"jdoe\" ?handle)"] }]
}' \
[HOST]/api/db/query
Not supported
SELECT ?handle ?num
WHERE {
?person fdb:person/handle ?handle.
?person fdb:person/favNums ?num.
FILTER ( ?num > 10 ).
}
Prefixes Key
If we want to query across multiple Fluree ledgers we can do so by specifying the ledger in the prefixes
map.
In the prefix map, we can specify what ledger we want to query across, in this case fluree/test
, and we give that source a name, ftest
. The name given to a source must be only lowercase letters and no numbers.
Now we can use ftest
as a source in any clause.
- FlureeQL
- Curl
- GraphQL
- SPARQL
{
"prefixes": {
"ftest": "fluree/test"
},
"select": "?nums",
"where": [
["$fdb4", ["person/handle", "zsmith"], "person/favNums", "?nums"],
["ftest", ["person/handle", "zsmith"], "person/favNums", "?nums"]
]
}
curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FLUREE_TOKEN" \
-d '{
"prefixes": {
"ftest": "fluree/test"
},
"select": "?nums",
"where": [ ["$fdb4", ["person/handle", "zsmith"], "person/favNums", "?nums"],
["ftest", ["person/handle", "zsmith"], "person/favNums", "?nums"] ]
}' \
[HOST]/api/db/query
Not supported.
// SPARQL Note: When using SPARQL, omit the `$` in front-of built-in sources. In addition, if you want to specify an `ISO-8601` formatted wall clock time, replace all the `:` with `;`. For example, `"SELECT ?s ?o WHERE { ?s fdb2019-03-14T20;59;36;097Z:person/handle ?o.}`.
PREFIX ftest: <fluree/test>
SELECT ?nums
WHERE {
?person fd4:person/handle "zsmith";
fd4:person/favNums ?nums.
?personTest ftest:person/handle "zsmith".
ftest:person/favNums ?nums.
}
After declaring the source in the prefix, we can access that ledger at any block by specifying the time in the clause. You can specify a time using a block integer (ftest3
), a duration (ftestPT5M
), or an ISO-8601 formatted time-string (ftest2019-03-14T20:59:36.097Z
). The time SHOULD NOT be declared in the prefix map - only in a particular clause.
Permissions If you want to access information in different ledgers, you need to have permissions to access those ledgers, and those ledgers need to be running on the same transactor.
- If you are accessing outside ledgers in the same network as your current ledger (i.e.
fluree/one
andfluree/test
) and fdb-api-open istrue
, then you can freely query across ledgers. - In any other situation,
fdb-api-open
must befalse
, and your query must be signed. The_auth
record with which you signed your query will be the one that determines your permissions for each given ledger.
Vars Key
The vars
key takes a map where the keys are variable names and the values are the variable values. These variables will be substituted into the given query.
For example, in the below query ?handle
will be replaced with dsanchez
anywhere it appears in the query.
- FlureeQL
- Curl
- GraphQL
- SPARQL
{
"select": "?handle",
"where": [["?person", "person/handle", "?handle"]],
"vars": {
"?handle": "dsanchez"
}
}
curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $FLUREE_TOKEN" \
-d '{
"select": "?handle",
"where": [
["?person", "person/handle", "?handle"] ],
"vars": {
"?handle": "dsanchez"
}
}' \
[HOST]/api/db/query
Not supported
// Althought the SPARQL 1.1 spec supports multiple values for any given variable, currently Fluree only support a 1:1 relationship
// between variables and their values
SELECT ?handle
WHERE {
VALUES ?handle { "dsanchez" }
?person fdb:person/handle ?handle.
}
Opts Key
Key | Value | Description |
---|---|---|
limit | integer | Limit (integer) of results to include. Default is 100. |
offset | integer | Number of results to exclude (i.e for pagination). |
orderBy | predicate or [ORDER, predicate] | Predicate name or two-tuple specifying how to order results. If using a two-tuple, the first element must be "ASC" or "DESC" and the second element is the predicate name. For example, "person/chat" or ["ASC", "person/handle"] . Predicate name should match the name in the select clause. Ordering is done before taking the number of results specified in limit. ASC is the used if not specified. |
groupBy | ?var1 or ["?var1", "?var2", ... ] | You can group by any variable or variables that appears in your where clause. Grouping is specified in the top-level groupBy key. This can either be a single variable, i.e. ?person or a vector of variables, i.e. ["?person", "?handle"] . |
prettyPrint | boolean | Default false. Optional boolean. Whether to "pretty print" the results (as a map with keys) or as a vector without labels. This is only available when select is an array of values. Note that depending on the query, pretty print can significantly slow down results. |
wikidataOpts | { Wikidata Options Map } | Optional map of configurations for Wikidata queries. See all options below. |
Key | Description |
---|---|
distinct | Default is true. Boolean, which specifies whether to include only distinct Wikidata results. |
limit | Default is 100. Number of results (integer) to return for each query. |
offset | Default is 0. Number of results to skip before returning the results. |
language | Default is "en". See Wikidata language codes for other options. |