The joinOrder SPARQL query hint
When you submit a SPARQL query, the Amazon Neptune query engine investigates the structure of the query. It reorders parts of the query and tries to minimize the amount of work required for evaluation and query response time.
For example, a sequence of connected triple patterns is typically not evaluated in the given order. It is reordered using heuristics and statistics such as the selectivity of the individual patterns and how they are connected through shared variables. Additionally, if your query contains more complex patterns such as subqueries, FILTERs, or complex OPTIONAL or MINUS blocks, the Neptune query engine reorders them where possible, aiming for an efficient evaluation order.
For more complex queries, the order in which Neptune chooses to evaluate the query might not always be optimal. For instance, Neptune might miss instance data-specific characteristics (such as hitting power nodes in the graph) that emerge during query evaluation.
If you know the exact characteristics of the data and want to manually dictate the order of the query execution, use the Neptune joinOrder query hint to specify that the query be evaluated in the given order.
joinOrder SPARQL hint syntax
The joinOrder query hint is specified as a triple pattern included in a SPARQL query.
For clarity, the following syntax uses a hint prefix defined and included in the query to specify the Neptune query-hint namespace:
PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>
scope hint:joinOrder "Ordered" .
Available Scopes
hint:Query
hint:Group
For more information about query hint scopes, see Scope of SPARQL query hints in Neptune.
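As a small illustration of the syntax above, the following Python sketch builds the hint triple for one of the two available scopes. The helper name `join_order_hint` and the constants are hypothetical and are not part of any Neptune client library; the function only formats text that you would paste into a query.

```python
# Hypothetical helper (not a Neptune API): formats the joinOrder hint triple.
HINT_PREFIX = "PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>"

# The scopes listed above as available for joinOrder.
JOIN_ORDER_SCOPES = ("hint:Query", "hint:Group")


def join_order_hint(scope="hint:Query"):
    """Return the joinOrder hint triple for the given scope."""
    if scope not in JOIN_ORDER_SCOPES:
        raise ValueError(f"joinOrder does not list scope {scope!r}")
    return f'{scope} hint:joinOrder "Ordered" .'


print(HINT_PREFIX)
print(join_order_hint("hint:Group"))
```

The prefix declaration goes at the top of the query, and the hint triple goes inside the group it should apply to.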
joinOrder SPARQL hint example
This section shows a query written with and without the joinOrder query hint and related optimizations.
For this example, assume that the dataset contains the following:
A single person named John that :likes 1,000 persons, including Jane.
A single person named Jane that :likes 10 persons, including John.
No Query Hint
The following SPARQL query extracts all the pairs of people named John and Jane who both like each other from a set of social networking data:
PREFIX : <https://example.com/>
SELECT ?john ?jane {
  ?jane :name "Jane" .
  ?jane :likes ?john .
  ?john :name "John" .
  ?john :likes ?jane .
}
The Neptune query engine might evaluate the statements in a different order than written. For example, it might choose to evaluate in the following order:
1. Find all persons named John.
2. Find all persons connected to John by a :likes edge.
3. Filter this set by persons named Jane.
4. Filter this set by those connected to John by a :likes edge.
According to the dataset, evaluating in this order results in 1,000 entities being extracted in the second step. The third step narrows this down to the single node, Jane. The final step then determines that Jane also :likes the John node.
Query Hint
It would be favorable to start with the Jane node because she has only 10 outgoing :likes edges. This reduces the amount of work during the evaluation of the query by avoiding the extraction of the 1,000 entities during the second step.
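To make the difference between the two evaluation orders concrete, here is a minimal Python sketch that models the example dataset and counts the intermediate results after each step. It is a toy model, not a description of how the Neptune engine works internally: the dictionary layout, node identifiers, and the function names `build_graph` and `evaluate` are assumptions made for illustration.

```python
# Toy model of the example dataset. Sizes come from the text above;
# the data layout and node identifiers are illustrative assumptions.
def build_graph():
    names = {"john": "John", "jane": "Jane"}
    likes = {"john": {"jane"}, "jane": {"john"}}
    # John likes 1,000 persons in total, including Jane.
    for i in range(999):
        node = f"john_friend_{i}"
        names[node] = f"PersonA{i}"
        likes["john"].add(node)
    # Jane likes 10 persons in total, including John.
    for i in range(9):
        node = f"jane_friend_{i}"
        names[node] = f"PersonB{i}"
        likes["jane"].add(node)
    return names, likes


def evaluate(names, likes, first_name, second_name):
    """Run the four steps in order and record the result count after each."""
    sizes = []
    # Step 1: find all persons with the first name.
    starts = [n for n, name in names.items() if name == first_name]
    sizes.append(len(starts))
    # Step 2: follow every outgoing :likes edge from those persons.
    pairs = [(a, b) for a in starts for b in likes.get(a, ())]
    sizes.append(len(pairs))
    # Step 3: keep only targets that have the second name.
    pairs = [(a, b) for a, b in pairs if names[b] == second_name]
    sizes.append(len(pairs))
    # Step 4: keep only pairs where the :likes edge is mutual.
    pairs = [(a, b) for a, b in pairs if a in likes.get(b, ())]
    sizes.append(len(pairs))
    return sizes


names, likes = build_graph()
john_first = evaluate(names, likes, "John", "Jane")
jane_first = evaluate(names, likes, "Jane", "John")
print(john_first)  # [1, 1000, 1, 1]
print(jane_first)  # [1, 10, 1, 1]
```

Both orders produce the same single pair, but starting from Jane keeps the largest intermediate result at 10 entries instead of 1,000.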
The following example uses the joinOrder query hint to ensure that the Jane node and its outgoing edges are processed first by disabling all automatic join reordering for the query:
PREFIX : <https://example.com/>
PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>
SELECT ?john ?jane {
  hint:Query hint:joinOrder "Ordered" .
  ?jane :name "Jane" .
  ?jane :likes ?john .
  ?john :name "John" .
  ?john :likes ?jane .
}
An applicable real-world scenario might be a social network application in which persons in the network are classified as either influencers with many connections or normal users with few connections. In such a scenario, you could ensure that the normal user (Jane) is processed before the influencer (John) in a query like the preceding example.
Query Hint and Reorder
You can take this example one step further. If you know that the :name attribute is unique to a single node, you could speed up the query by reordering and using the joinOrder query hint. This step ensures that the unique nodes are extracted first.
PREFIX : <https://example.com/>
PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>
SELECT ?john ?jane {
  hint:Query hint:joinOrder "Ordered" .
  ?jane :name "Jane" .
  ?john :name "John" .
  ?jane :likes ?john .
  ?john :likes ?jane .
}
In this case, you can reduce the query to the following single actions in each step:
1. Find the single person node with :name Jane.
2. Find the single person node with :name John.
3. Check that the first node is connected to the second with a :likes edge.
4. Check that the second node is connected to the first with a :likes edge.
Important
If you choose the wrong order, the joinOrder query hint can lead to significant performance drops. For example, the preceding example would be inefficient if the :name attributes were not unique. If 100 nodes were named Jane and 1,000 nodes were named John, then the query would end up checking 1,000 * 100 (100,000) pairs for :likes edges.
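The arithmetic behind this warning can be sketched in Python. The model below assumes that the two :name patterns, which share no variable, are combined as a cross product before the :likes checks run; that is a simplification for illustration, and the function name `reordered_plan_sizes` and the node identifiers are hypothetical.

```python
def reordered_plan_sizes(names, likes, first_name, second_name):
    """Count results for the order: name, name, likes, likes."""
    firsts = [n for n, v in names.items() if v == first_name]
    seconds = [n for n, v in names.items() if v == second_name]
    # The two name patterns share no variable, so every combination
    # becomes a candidate pair that must be checked for :likes edges.
    candidates = [(a, b) for a in firsts for b in seconds]
    mutual = [
        (a, b)
        for a, b in candidates
        if b in likes.get(a, ()) and a in likes.get(b, ())
    ]
    return len(firsts), len(seconds), len(candidates), len(mutual)


# Unique names: one candidate pair to check.
unique_names = {"jane": "Jane", "john": "John"}
unique_likes = {"jane": {"john"}, "john": {"jane"}}
print(reordered_plan_sizes(unique_names, unique_likes, "Jane", "John"))
# (1, 1, 1, 1)

# Non-unique names: 100 nodes named Jane and 1,000 nodes named John.
many_names = {f"jane_{i}": "Jane" for i in range(100)}
many_names.update({f"john_{i}": "John" for i in range(1000)})
many_likes = {"jane_0": {"john_0"}, "john_0": {"jane_0"}}
print(reordered_plan_sizes(many_names, many_likes, "Jane", "John"))
# (100, 1000, 100000, 1)
```

With unique names the forced order does almost no work, while the same order on non-unique names checks 100,000 candidate pairs to find a single match.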