Skip to main content

Analytical Query

FlureeQL Analytical Queries are used to answer more sophisticated questions about your data.

Analytical Queries can:

  • Query across multiple internal and external datasets
  • Query across the same Fluree at different points in time
  • Calculate aggregates
  • Create complicated joins
  • Group results by a single or multiple variables
  • Use aggregate values calculated mid-query to filter results

We utilized concepts of logic programming and variable binding to give an immense amount of query potential that can largely be designed by you.

This section covers analytical queries using the FlureeQL syntax. All code examples are shown in FlureeQL. All of these queries can be issued through the API or the user interface (select FlureeQL in the sidebar, then make sure Query is selected in the top-right, as well as in the dropdown).

To issue these queries using the API, see /query. You can also issue multiple queries at once using the /multi-query endpoint.

Query Keys

FlureeQL Analytical Queries are also structured as a JSON object/map and may contain the following keys:

KeyRequired?Description
select, selectOne, or selectDistinctyesselect returns all relevant results, selectOne returns one result, and selectDistinct only returns unique results. See select key
whereyesA collection of tuples which allow for complex filtering of data.
blocknoOptional block specified by block number, duration, or wall-clock time as a ISO-8601 formatted string. This applies a block to every part of the query that does not have a block specified. It follows the same syntax as the block key in basic queries.
prefixesnoOptional map of outside sources.
varsnoOptional map of variable bindings.
optsnoOptional map where options like limit, orderBy, prettyPrint, and wikipediaOpts can be specified.

Select Key

select returns all relevant results, selectOne returns one result, and selectDistinct only returns unique results. Exactly one of the select keys is required.

ItemExampleDescription
Variable"?apple"Variables are declared in the where clause
Variable Select-Array{"?var1": ["*"]}

{"?var1": ["*", {"chat/person": ["*"]}]}
A map where the key is a valid variable and the value is a valid select-array. This only works when the variable is bound to subject ids.
Aggregate variable"(avg ?nums)"Any valid variables can be wrapped in an aggregate function. All valid aggregate functions are listed in the table below.

The value for the select key can be any of the items in the following table. Your select key can be multiple of these select items. In the case that it is multiple items, these items should be wrapped in [ ]. For example [ "?var1", "(sum ?var2)", {"?var3": ["*"] }, "?var4" ].

If your select value only has one item, then you can either wrap it in square brackets or not (i.e. { "select": ["?var"] } or { "select": "?var" }). If your select key is an array, then your results will also return an array.

Valid Aggregate Functions

All the following are valid aggregate functions. These below functions can only be applied to variables- you cannot provide your own array. In addition, they cannot be nested within each other- for example (rand (sample 2 ?age)) is NOT valid.

FunctionArgumentsDescription
avgvariableReturns the average of the values.
countvariableReturns a count of the values.
count-distinctvariableReturns a count of the distinct values. This is equivalent to (count (distinct ?var))
maxvariableReturns the largest value.
minvariableReturns the smallest value.
medianvariableReturns the median of the values.
randvariableReturns a random value from the specified values.
samplen, variableGiven a sample size and a set of values, returns an appropriately sized sample, i.e. (sample 2 ?age) returns two ages from values bound to the variable, ?age.
stddevvariableReturns the standard deviation of the values.
sumvariableReturns the sum of the values.
variancevariableReturns the variance of the values.

Additional Modifiers

  1. as: as can be wrapped around the ENTIRE aggregate function to rename the result for this aggregate. This renaming only applies to how the results are displayed if pretty printed. For example, (as (sum ?nums) ?sum).

  2. distinct: If you want an aggregate function to only apply to the set of distinct values, you can wrap distinct around the variable. For example, (variance (distinct ?nums)) or (as (sum (distinct ?nums)) ?sum).

If using groupBy, aggregates are applied only to the values within that group.

If selecting a mixture of aggregate and non-aggregate variables, the results of the aggregate are returned alongside every non-aggregate value. To see what this looks like, try issuing the below query against the basic schema.

{
"select": ["?nums", "(avg ?nums)"],
"where": [["?person", "person/favNums", "?nums"]]
}

Where Key

The value for the where key is an array- we'll call it a where-array in this section. Each item of the where-array acts to filter data, bind variables, or perform other complex data actions. The order of the items in a where-array is extremely important, as each item is resolved sequentially.

A where-array can be comprised of any of the following items in any order:

ItemFunction
three-tupleLooks for flakes (pieces of data) in the ledger that match a provided pattern, and bind variables to the results.
four-tupleSame as a simple three-tuple, except the first item in the tuple specifies a particular data source, such as Wikidata, another ledger, or the current ledger at a specific in time.
two-tuple variable bindingBind a variable to a value, including to a calculated aggregate value. These variables can be used in subsequent where-array items.
binding mapServes the same function as a two-tuple variable binding, except the syntax is different.
optional mapLike, three-tuples, looks for flakes (pieces of data) that match a provided pattern. However, when joining the results of this map with the queries existing resultset, will simply bind null if there is no match (a left outer join).
union mapThe two parts of a union map are outer joined.
filter mapFilters the results up to that point in the query.

Each where-array item is resolved and then inner joined with the resultset up until that point. See the Analytical Query guide for more information.

Three Tuple

Every piece of data in Fluree can be expressed as [ subject, predicate, object ]. To read more about this, see the Subject-Predicate-Object Model in the Technical Overview document.

When you include a three-tuple in your where-array, you specify the values for one or two parts of the tuple. The parts you don't specify are either null or variables. For example, in this three tuple, ["?person", "person/handle", "?handle"], only the predicate person/handle is specified. The subject ?person and the object ?handle are variables. You can also look-up a data for a known id using the pattern ["?s", "_id", 351843720888320].

The below table lists all the possible values for each part of the three tuple.

PartValue
subjectCan be an subject id, unique two-tuple, a variable (a string that begins with ?), or null.
predicate, predicate+[RECUR-DEPTH], or service callCan be either predicate name, the Fluree reserved word "_id", or a variable (a string that begins with ?). Reverse references are NOT supported.

If using a predicate name, can add + after the predicate name to signify recur. Can also specify how many times to recur, i.e. person/follows+3.

If using "_id", the object should reference a known id.

The second item in the clause can also be a service call. All supported service calls are in the subsequent table.
object or object-functionCan be a value, subject id, unique two-tuple, a variable (a string that begins with ?), or null.

The object can also be a filter function, which contains an existing or a newly declared variable, for example, (= 20 ?nums). The filters follow the same syntax as filter maps, and you can read more in that section.

If your object begins with a ?, and it is NOT a variable, for example someone's name is ?Fred, then you can use escape strings in the object, for example: "\"?Fred\"".

A three tuple acts as a pattern in a where-array. First it pulls all the data that matches that given pattern, and then it binds the appropriate variables. Subsequent three tuples are inner-joined. For more information and examples, see inner joins in Fluree.

Supported Service CallExample Three-tupleDescription
Full Text Search a Collection["?movie", "fullText:movie", "redemption"]This service call searches for the word redemption in any predicates in the movie collection that are enabled for full text search. See the Full Text Search guide for more examples and details.
Full Text Search a Predicate["?movie", "fullText:movie/title", "redemption"]This service call searches for the word redemption in movie/titles. This only works if the movie/title predicate is enabled for full text search. See the Full Text Search guide for more examples and details.
Collection Select["?movie", "rdf:type", "movie"]This service call binds all subject ids where the subject belongs to the movie. collection to the provided variable, ?movie.

Four Tuple

Four tuples are exactly the same as three tuples, except that four tuples specify a data source for the subject, predicate, object pattern. Currently, Fluree supports querying across multiple Fluree ledgers and across Wikidata. The following are the sources that can be used.

SourceExampleDescription
This ledger$fdbDefault source. The current version of a given Fluree. Can be omitted.
This ledger at a Previous Point in Time$fdb3, $fdb2019-03-14T20:59:36.097Z, $fdbPT5MThis ledger at a specified block. The block is specified either by providing the block integer, ISO-8601 formatted wall clock time, or ISO-8601 formatted duration ago
Other Fluree ledgerftestOther Fluree ledger must be specifed in the prefixes key.
Other Fluree ledgerftestPT5MOther ledger at a specified block. The block is specified either by providing the block integer, ISO-8601 formatted wall clock time, or ISO-8601 formatted duration ago
Wikidata$wdWikidata

For example, the three tuple ["?person", "person/handle", "?handle"] is equivalent to ["$fdb", "?person", "person/handle", "?handle"], because the specified source is the current ledger. The four tuple, ["$fdbPT2H", "?person", "person/handle", "?handle"] is matching the given tuple pattern to the subject-predicate-object triples active in the ledger as of two hours ago.

{
"select": "?nums",
"where": [
["?person", "rdf:type", "person"],
["?person", "person/handle", "?handle"],
["?person", "person/favNums", "#(> 80 ?nums)"]
]
}

Two Tuple Variable Binding

You can bind a variable to an aggregate value. As mentioned above, where-array items are resolved in the order in which they appear. This means that aggregates do NOT take into account the where-array item that comes later in the array. To bind an intermediate aggregate value, just specify a two-tuple with the first item as a variable and the second item as the aggregate function. For example ["?maxBlock", "#(max ?bNum)"].

Valid functions are the same as the ones listed in Valid Aggregate Functions. You can also use the aggregate modifier (distinct) explained in the valid aggregate section (you can use the as modifier, but this will simply by ignored). A function must be preceded with a #.

{
"select": "?hash",
"where": [
["?s", "_block/number", "?bNum"],
["?maxBlock", "#(max ?bNum)"],
["?s", "_block/number", "?maxBlock"],
["?s", "_block/hash", "?hash"]
]
}

You can also simply specify values as the second item in the tuple. This binds the given variable to the value provided.

{
"select": "?person",
"where": [
["?handle", "jdoe"],
["?person", "person/handle", "?handle"]
]
}

Binding Map

Binding works the same as intermediate aggregate values, except the syntax is different.

A bind map (or multiple maps), can be declared anywhere in the where clause. The bind map must precede any clauses that use the variables declared in the map.

The map is comprised of keys that correspond to variables, and values. For example, {"bind": {"?handle": "dsanchez"}}. You can bind multiple variables in the same map, as well, for example {"bind": {"?handle": "dsanchez", "?person": 351843720888324}}.

{
"select": ["?person", "?handle"],
"where": [
{ "bind": { "?handle": "dsanchez" } },
["?person", "person/handle", "?handle"]
]
}

Like intermediate aggregate clauses, binds can use aggregate functions. See the Valid Aggregate Functions list above.

{
"select": "?hash",
"where": [
["?s", "_block/number", "?bNum"],
{ "bind": { "?maxBlock": "#(max ?bNum)" } },
["?s", "_block/number", "?maxBlock"],
["?s", "_block/hash", "?hash"]
]
}

Optional Map

An optional map is a map where the key is optional, and the value is an array of where-items (you can include any where-items within an optional map, such as three-tuples, four-tuples, two-tuple variable bindings, other optional maps, etc).

For example:

{
"optional": [
["?person", "person/fullName", "?name"],
["?favNums", 1223],
["?person", "person/favNums", "?favNums"]
]
}

Anything within an optional map is resolved and then left outer joined with previous results. In other words, optional maps do not remove any entries from the existing resultset - any rows from the original resultset that do not have a match are bound with null. For a more in-depth explanation of optional clauses, see the optional clauses guide.

Union Map

A union map takes an two-item array as a value. The two items in this array is itself an array comprised of any number of where-items (for example, three-tuples, two-tuples, optional maps, and filter maps). The results of each array of where-items is then outer joined.

{
"select": ["?x", "?y", "?b", "?c", "?cname", "?pname"],
"where": [
{
"union": [
[
["?x", "_collection/name", "?cname"],
["?b", 2]
],

[
["?y", "_predicate/name", "?pname"],
["?c", 8]
]
]
}
]
}

Filter Map

A filter map has the key filter, and the value is a number of filter functions. Currently, all filter functions should be written using Clojure. The following are all of the accepted filter functions:

NameExampleExplanation
>(> 10 ?nums)greater than
>=(>= 10 ?nums)greater than or equal to
<(< 10 ?nums)less than
<=(<= 10 ?nums)less than or equal to
=(= 10 ?nums)equal to
not=(not= 10 ?nums)not equal to
+(> 11 (+ 10 ?nums))add
-(- 11 (+ 10 ?nums))subtract
*(> 11 (* 10 ?nums))multiply
/(> 11 (/ 10 ?nums))divide
and(and (> 10 ?nums1) (< 100 ?nums2))and
&&(&& (> 10 ?nums1) (< 100 ?nums2))Same as and.
or(or (> 10 ?nums1) (< 100 ?nums2))or
||(\|\| (> 10 ?nums1) (< 100 ?nums2))Same as or.
nil?(nil? ?nums)nil?
bound(bound ?nums)True if non-nil
coalesce(> ?nums (coalesce ?age 30))Coalesce takes the first non-nil argument, example below.
if(if (> ?nums 30) (> ?age 100) true)if
not(not (> ?nums 30))not
!(! (> ?nums 30))Same as not
now(> ?nums (now))Returns the current time in epoch milliseconds.
re-pattern(re-pattern "\\d+")Returns a regex pattern, for use in re-find. re-pattern
re-find(re-find (re-pattern \"^_collection\") ?name)Returns the first regex match. re-find
strStarts(strStarts ?message \"Hi\")Returns true if the string starts with the given substring, case sensitive.
strEnds(strEnds ?message \"Amy.\")Returns true if the string ends with the given substring, case sensitive.

An example query:

{
"select": ["?handle", "?num"],
"where": [
["?person", "person/handle", "?handle"],
["?person", "person/favNums", "?num"],
{ "filter": ["(> 10 ?num)", "(= \"jdoe\" ?handle)"] }
]
}

Prefixes Key

If we want to query across multiple Fluree ledgers we can do so by specifying the ledger in the prefixes map.

In the prefix map, we can specify what ledger we want to query across, in this case fluree/test, and we give that source a name, ftest. The name given to a source must be only lowercase letters and no numbers.

Now we can use ftest as a source in any clause.

{
"prefixes": {
"ftest": "fluree/test"
},
"select": "?nums",
"where": [
["$fdb4", ["person/handle", "zsmith"], "person/favNums", "?nums"],
["ftest", ["person/handle", "zsmith"], "person/favNums", "?nums"]
]
}

After declaring the source in the prefix, we can access that ledger at any block by specifying the time in the clause. You can specify a time using a block integer (ftest3), a duration (ftestPT5M), or an ISO-8601 formatted time-string (ftest2019-03-14T20:59:36.097Z). The time SHOULD NOT be declared in the prefix map - only in a particular clause.

Permissions If you want to access information in different ledgers, you need to have permissions to access those ledgers, and those ledgers need to be running on the same transactor.

  1. If you are accessing outside ledgers in the same network as your current ledger (i.e. fluree/one and fluree/test) and fdb-api-open is true, then you can freely query across ledgers.
  2. In any other situation, fdb-api-open must be false, and your query must be signed. The _auth record with which you signed your query will be the one that determines your permissions for each given ledger.

Vars Key

The vars key takes a map where the keys are variable names and the values are the variable values. These variables will be substituted into the given query.

For example, in the below query ?handle will be replaced with dsanchez anywhere it appears in the query.

{
"select": "?handle",
"where": [["?person", "person/handle", "?handle"]],
"vars": {
"?handle": "dsanchez"
}
}

Opts Key

KeyValueDescription
limitintegerLimit (integer) of results to include. Default is 100.
offsetintegerNumber of results to exclude (i.e for pagination).
orderBypredicate or [ORDER, predicate]Predicate name or two-tuple specifying how to order results. If using a two-tuple, the first element must be "ASC" or "DESC" and the second element is the predicate name. For example, "person/chat" or ["ASC", "person/handle"]. Predicate name should match the name in the select clause. Ordering is done before taking the number of results specified in limit. ASC is the used if not specified.
groupBy?var1 or ["?var1", "?var2", ... ]You can group by any variable or variables that appears in your where clause. Grouping is specified in the top-level groupBy key. This can either be a single variable, i.e. ?person or a vector of variables, i.e. ["?person", "?handle"].
prettyPrintbooleanDefault false. Optional boolean. Whether to "pretty print" the results (as a map with keys) or as a vector without labels. This is only available when select is an array of values. Note that depending on the query, pretty print can significantly slow down results.
wikidataOpts{ Wikidata Options Map }Optional map of configurations for Wikidata queries. See all options below.
KeyDescription
distinctDefault is true. Boolean, which specifies whether to include only distinct Wikidata results.
limitDefault is 100. Number of results (integer) to return for each query.
offsetDefault is 0. Number of results to skip before returning the results.
languageDefault is "en". See Wikidata language codes for other options.