Best practices: getting the most out of Neptune
The following are general recommendations for working with Amazon Neptune. Use this page as a quick reference for operating Neptune and getting the best performance from it.
Contents
- Amazon Neptune basic operational guidelines
  - Amazon Neptune security best practices
  - Avoid different instance classes in a cluster
  - Avoid repeated restarts during bulk loading
  - Enable the OSGP Index if you have a large number of predicates
  - Avoid long-running transactions where possible
  - Best practices for using Neptune metrics
  - Best practices for tuning Neptune queries
  - Load balancing across read replicas
  - Loading faster using a temporary larger instance
  - Resize your writer instance by failing over to a read-replica
  - Retry upload after data prefetch task interrupted error
- General Best Practices for Using Gremlin with Neptune
  - Structure upsert queries to take advantage of the DFE engine
  - Test Gremlin code in the context where you will deploy it
  - Creating Efficient Multithreaded Gremlin Writes
  - Pruning Records with the Creation Time Property
  - Using the datetime( ) Method for Groovy Time Data
  - Using Native Date and Time for GLV Time Data
- Best practices using the Gremlin Java client with Neptune
  - Use the latest compatible version of the Apache TinkerPop Java client
  - Re-use the client object across multiple threads
  - Create separate Gremlin Java client objects for read and write endpoints
  - Add multiple read replica endpoints to a Gremlin Java connection pool
  - Close the client to avoid the connections limit
  - Create a new connection after failover
  - Set maxInProcessPerConnection and maxSimultaneousUsagePerConnection to the same value
  - Send queries to the server as bytecode rather than as strings
  - Always completely consume the ResultSet or Iterator returned by a query
  - Bulk add vertices and edges in batches
  - Disable DNS caching in the Java Virtual Machine
  - Optionally, set timeouts at a per-query level
  - Troubleshooting java.util.concurrent.TimeoutException
- Neptune Best Practices Using openCypher and Bolt
  - Create a new connection after failover
  - Connection handling for long-lived applications
  - Connection handling for AWS Lambda
  - Prefer directed to bi-directional edges in queries
  - Neptune does not support multiple concurrent queries in a transaction
  - Close driver objects when you're done
  - Use explicit transaction modes for reading and writing
  - Retry logic for exceptions
  - Set multiple properties at once using a single SET clause
  - Use parameterized queries
  - Use flattened maps instead of nested maps in UNWIND clause
  - Place more restrictive nodes on the left side in Variable-Length Path (VLP) expressions
  - Avoid redundant node label checks by using granular relationship names
  - Specify edge labels where possible
  - Avoid using the WITH clause when possible
  - Place restrictive filters as early in the query as possible
  - Explicitly check whether properties exist
  - Do not use named path (unless it is required)
  - Avoid COLLECT(DISTINCT())
  - Prefer the properties function over individual property lookup when retrieving all property values
  - Perform static computations outside of the query
  - Batch inputs using UNWIND instead of individual statements
  - Prefer using custom IDs for node/relationship
  - Avoid doing ~id computations in the query
- Neptune Best Practices Using SPARQL