Loading data into a Neptune Analytics graph
Neptune Analytics provides several options for loading data into a graph.
- Bulk import – Create a graph preloaded with data, either from files in Amazon S3 in a CSV-like format or from a Neptune Database cluster. You can also load data from files in Amazon S3 into an existing empty graph. This is typically the fastest way to load a large volume of initial data.
- Batch load – Add data to an existing, non-empty graph from files in Amazon S3 in a CSV-like format. This is typically the fastest way to add data or to update single-cardinality property values in a non-empty graph. A single request can ingest less data than a bulk import; to work around this limit, split the files into smaller batches and issue multiple requests.
- openCypher queries – Add data through queries when the data is not available as files in Amazon S3 or when the data volume is small. Queries are also the more general approach for conditional inserts that depend on data already in the graph, and for updating the contents of the graph.
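As an illustration of a conditional insert through an openCypher query, the following minimal sketch builds the arguments for an ExecuteQuery request that creates an edge only if an equivalent one does not already exist. The `KNOWS` edge label, the helper name `merge_edge_request`, and the vertex IDs are hypothetical; the argument names (`graphIdentifier`, `queryString`, `language`, `parameters`) are assumed to match the boto3 `neptune-graph` client.

```python
# Hypothetical helper: builds keyword arguments for an ExecuteQuery call.
# MERGE creates the edge only when no matching edge already exists,
# so running the same request twice does not insert a duplicate.
MERGE_EDGE_QUERY = (
    "MATCH (a), (b) "
    "WHERE id(a) = $src AND id(b) = $dst "
    "MERGE (a)-[r:KNOWS]->(b) "  # KNOWS is an illustrative edge label
    "RETURN r"
)


def merge_edge_request(graph_id: str, src: str, dst: str) -> dict:
    """Return ExecuteQuery arguments for a conditional edge insert."""
    return {
        "graphIdentifier": graph_id,
        "queryString": MERGE_EDGE_QUERY,
        "language": "OPEN_CYPHER",
        # Query parameters keep the vertex IDs out of the query text.
        "parameters": {"src": src, "dst": dst},
    }
```

With a client created by `boto3.client("neptune-graph")`, the result could be passed as `client.execute_query(**merge_edge_request("g-example", "person-1", "person-2"))`, where the graph and vertex identifiers are placeholders.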
Warning
Be cautious when loading a file of edges. If the same edge file is loaded twice, duplicate edges are inserted into the graph, which can lead to unintended results.
Also, when you use the SDK or CLI execute-query command to run neptune.load(), we recommend that you increase the timeout window and disable retries in the SDK or CLI. A long-running load can otherwise time out on the client and be retried automatically, which would load the same data again.
For more information about increasing the timeout and disabling retries, see ExecuteQuery.
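The following sketch shows one way to apply this recommendation with the AWS SDK for Python. It assumes the boto3 `neptune-graph` client and the botocore `Config` options `retries` and `read_timeout`; the helper names, the S3 path, and the graph identifier are hypothetical, and the `neptune.load()` argument names (`format`, `source`, `region`, `failOnError`) should be checked against the batch load documentation before use.

```python
import json

# Formats assumed to be accepted by neptune.load(); verify against the docs.
ALLOWED_FORMATS = {"csv", "opencypher", "ntriples", "parquet"}


def build_load_query(source: str, region: str, fmt: str = "csv",
                     fail_on_error: bool = True) -> str:
    """Build a CALL neptune.load(...) statement (hypothetical helper)."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    # json.dumps produces double-quoted, escaped string literals.
    return (
        "CALL neptune.load({"
        f"format: {json.dumps(fmt)}, "
        f"source: {json.dumps(source)}, "
        f"region: {json.dumps(region)}, "
        f"failOnError: {'true' if fail_on_error else 'false'}"
        "})"
    )


def run_load(graph_id: str, query: str) -> dict:
    """Run the load with no client-side read timeout and no retries."""
    # Imported here so the query builder above works without boto3 installed.
    import boto3
    from botocore.config import Config

    config = Config(
        # One attempt in total: a retry would load the same files again.
        retries={"total_max_attempts": 1, "mode": "standard"},
        # Do not give up on the HTTP response while the load is running.
        read_timeout=None,
    )
    client = boto3.client("neptune-graph", config=config)
    response = client.execute_query(
        graphIdentifier=graph_id,
        queryString=query,
        language="OPEN_CYPHER",
    )
    # The payload is a streaming body containing the JSON result.
    return json.loads(response["payload"].read())
```

A call might look like `run_load("g-example", build_load_query("s3://example-bucket/edges/", "us-east-1"))`. Disabling retries matters most for edge files, because a silently retried load is equivalent to loading the same file twice.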