Standard breadth-first search (BFS) algorithm - Neptune Analytics

Standard breadth-first search (BFS) algorithm

Standard breadth-first search (BFS) is an algorithm for finding nodes from a starting node or nodes in a graph in breadth-first order.

It returns the source node or nodes that it started from, and all of the nodes visited by each search.

Note

Because every source node passed in leads to its own execution of the algorithm, your queries should limit the number of source nodes as much as possible.

.bfs   syntax

CALL neptune.algo.bfs( [source-node list (required)], { edgeLabels: [list of edge labels for filtering (optional)], vertexLabel: a node label for filtering (optional), maxDepth: maximum number of hops to traverse from a source node (optional), traversalDirection: traversal direction (optional), concurrency: number of threads to use (optional) } ) YIELD the outputs to generate (source and/or node) RETURN the outputs to return

.bfs   inputs

  • a source node list   (required)   –   type: Node[] or NodeId[];   default: none.

    The source-node list contains the node or nodes used as the starting locations for the algorithm.

    • Each starting node triggers its own execution of the algorithm.

    • If the source-node list is empty then the query result is also empty.

    • If the algorithm is called following a MATCH clause (this is known as query-algorithm integration), the output of the MATCH clause is used as the source-node list for the algorithm.

  • a configuration object that contains:
    • edgeLabels   (optional)   –   type: a list of edge label strings;   example: ["route", ...];   default: no edge filtering.

      To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.

    • vertexLabel   (optional)   –   type: string;   example: "airport";  default: no node filtering.

      If you provide a node label to filter on then only nodes matching that label will be traversed. This does not, however, filter out any nodes in the source node list.

    • maxDepth   (optional)   –   type: positive integer or 0 or -1;   default: -1.

      The maximum number of hops to traverse from a source node. If set at -1 then there's no maximum depth limit. If set to 0, only the nodes in the source node list are returned.

    • traversalDirection   (optional)   –   type: string;   default: "outbound".

      The direction of edge to follow. Must be one of: "inbound", "oubound", or "both".

    • concurrency   (optional)   –   type: 0 or 1;   default: 0.

      Controls the number of concurrent threads used to run the algorithm.

      If set to 0, uses all available threads to complete execution of the individual algorithm invocation. If set to 1, uses a single thread. This can be useful when requiring the invocation of many algorithms concurrently.

.bfs   outputs

The .bfs algorithm returns:

  • source   –   type: Node[].

    The source nodes.

  • node   –   type: Node[].

    The nodes that the algorithm traversed from each source node.

.bfs   query examples

This is a standalone example, where the query provides an explicit source node list.

CALL neptune.algo.bfs( ["101", "102"], { edgeLabels: ["route"], vertexLabel: "airport", maxDepth: 11, traversalDirection: "both", concurrency: 2 } ) YIELD node

You can run that query using the execute-query operation in the AWS CLI like this:

aws neptune-graph execute-query \ --graph-identifier ${graphIdentifier} \ --query-string 'CALL neptune.algo.bfs(["101", "102"], {edgeLabels: ["route"], vertexLabel: "airport", maxDepth: 11, traversalDirection: "both", concurrency: 2})' \ --language open_cypher \ /tmp/out.txt

A query like this one would return an empty result because the source list is empty:

CALL neptune.algo.bfs([], {edgeLabels: ["route"]})

By default, both the source nodes ("source" output) and the visited nodes ("node" output) are returned. You can use YIELD to specify which of those outputs you would like to see. For example, to see only the "node" outputs:

CALL neptune.algo.bfs(["101"], {edgeLabels: ["route"]}) YIELD node

The examples below are query integration examples, where .bfs follows a MATCH clause and uses the output of the MATCH clause as its source node list:

MATCH (n) WITH n LIMIT 5 CALL neptune.algo.bfs(n, {edgeLabels: ["route"]}) YIELD node RETURN node

The MATCH clause can also explitly specify a starting node list using the id() function, like this:

MATCH (n) where id(n)="101" CALL neptune.algo.bfs(n, {edgeLabels: ["route"]}) YIELD node RETURN node

Also:

MATCH (n) where id(n) IN ["101", "102"] CALL neptune.algo.bfs(n, {edgeLabels: ["route"]}) YIELD node RETURN COUNT(node)
Warning

It is not good practice to use MATCH(n) without restriction in query integrations. Keep in mind that every node returned by the MATCH(n) clause invokes the algorithm once, which can result a very long-running query if a large number of nodes is returned. Use LIMIT or put conditions on the MATCH clause to restrict its output appropriately.

Sample .bfs output

Here is an example of the output returned by .bfs when run against the sample air-routes dataset [nodes], and sample air-routes dataset [edges], when using the following query:

aws neptune-graph execute-query \ --graph-identifier ${graphIdentifier} \ --query-string "CALL neptune.algo.bfs(['101'], {maxDepth: 1}) yield source, node return source, node limit 2" \ --language open_cypher \ /tmp/out.txt cat /tmp/out.txt { "results": [{ "source": { "~id": "101", "~entityType": "node", "~labels": ["airport"], "~properties": { "lat": 13.681099891662599, "elev": 5, "longest": 13123, "city": "Bangkok", "type": "airport", "region": "TH-10", "desc": "Suvarnabhumi Bangkok International Airport", "code": "BKK", "lon": 100.74700164794901, "country": "TH", "icao": "VTBS", "runways": 2 } }, "node": { "~id": "1483", "~entityType": "node", "~labels": ["airport"], "~properties": { "lat": 39.490000000000002, "elev": 4557, "longest": 9186, "city": "Ordos", "type": "airport", "region": "CN-15", "desc": "Ordos Ejin Horo Airport", "code": "DSN", "lon": 109.861388889, "country": "CN", "icao": "ZBDS", "runways": 1 } } }, { "source": { "~id": "101", "~entityType": "node", "~labels": ["airport"], "~properties": { "lat": 13.681099891662599, "elev": 5, "longest": 13123, "city": "Bangkok", "type": "airport", "region": "TH-10", "desc": "Suvarnabhumi Bangkok International Airport", "code": "BKK", "lon": 100.74700164794901, "country": "TH", "icao": "VTBS", "runways": 2 } }, "node": { "~id": "103", "~entityType": "node", "~labels": ["airport"], "~properties": { "lat": 55.972599029541001, "elev": 622, "longest": 12139, "city": "Moscow", "type": "airport", "region": "RU-MOS", "desc": "Moscow, Sheremetyevo International Airport", "code": "SVO", "lon": 37.414600372314503, "country": "RU", "icao": "UUEE", "runways": 2 } } }] }