Nodestream is a framework for semantically modeling data as a graph. It is designed to be flexible and extensible,
letting you define how data is collected, processed, and modeled as a graph through a pipeline-based approach. It also
provides a way to define how the graph should be updated when the schema changes. All of this is done using a simple,
human-readable configuration file in YAML format. To accomplish this, Nodestream uses a number of core concepts,
including pipelines, extractors, transformers, filters, interpreters, interpretations, and migrations.
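The staged flow described above can be pictured with a small, library-free Python sketch. It does not use Nodestream's API or its YAML format; the function names (`extract`, `transform`, `keep`, `interpret`, `run_pipeline`) and the sample records are hypothetical and exist only to illustrate how records might pass through extractor, transformer, filter, and interpreter stages before becoming nodes and relationships.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set, Tuple

Record = Dict[str, str]
Node = Tuple[str, str]              # (label, key), e.g. ("Person", "Ada Lovelace")
Edge = Tuple[Node, str, Node]       # (source node, relationship type, target node)


def extract(rows: Iterable[Record]) -> Iterator[Record]:
    """Extractor stage: yield raw records from a source (here, an in-memory list)."""
    yield from rows


def transform(records: Iterable[Record], fn: Callable[[Record], Record]) -> Iterator[Record]:
    """Transformer stage: apply a function to every record."""
    for record in records:
        yield fn(record)


def keep(records: Iterable[Record], predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Filter stage: pass through only the records that satisfy the predicate."""
    for record in records:
        if predicate(record):
            yield record


def interpret(records: Iterable[Record]) -> Tuple[Set[Node], Set[Edge]]:
    """Interpreter stage: map each record onto nodes and one relationship."""
    nodes: Set[Node] = set()
    edges: Set[Edge] = set()
    for record in records:
        person = ("Person", record["name"])
        team = ("Team", record["team"])
        nodes.update([person, team])
        edges.add((person, "MEMBER_OF", team))
    return nodes, edges


def run_pipeline(rows: List[Record]) -> Tuple[Set[Node], Set[Edge]]:
    """Chain the stages in order, the way a pipeline definition lists its steps."""
    records = extract(rows)
    records = transform(records, lambda r: {**r, "name": r["name"].strip().title()})
    records = keep(records, lambda r: bool(r.get("team")))
    return interpret(records)


# Hypothetical sample input: the second record has no team and is filtered out.
ROWS = [
    {"name": " ada lovelace ", "team": "analytics"},
    {"name": "alan turing", "team": ""},
    {"name": "grace hopper", "team": "analytics"},
]

if __name__ == "__main__":
    nodes, edges = run_pipeline(ROWS)
    print(sorted(nodes))
    print(sorted(edges))
```

In Nodestream itself these stages are declared in the YAML configuration file rather than written as Python functions; consult the Nodestream documentation for the actual step names and arguments.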
Beginning with Nodestream 0.12, Amazon Neptune is supported for both Neptune Database and Neptune Analytics.
See the Nodestream documentation for details on how to configure and use Nodestream with Neptune:
Nodestream support for Amazon Neptune.
Nodestream with Neptune currently supports standard ETL pipelines as well as time to live (TTL) pipelines. ETL pipelines
enable bulk data ingestion into Neptune from a much broader range of data sources and formats than has previously been
possible in Neptune.
Nodestream fully supports IAM authentication when connecting to Amazon Neptune, as long as credentials are properly configured.
See the boto3 credentials guide for more information on correctly configuring credentials.
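As a quick pre-flight check before running a pipeline, the sketch below looks for two of the credential sources that boto3 documents: the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables, and the shared credentials file (by default `~/.aws/credentials`, or the path given in `AWS_SHARED_CREDENTIALS_FILE`). The helper name `find_credential_source` is hypothetical, and the check is deliberately incomplete: it ignores other sources in boto3's lookup chain, such as assumed roles, container credentials, and instance metadata, so a `None` result does not prove that no credentials are available.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_credential_source(env: Mapping[str, str], shared_file: Path) -> Optional[str]:
    """Return a label for the first credential source found, or None.

    Only environment variables and the shared credentials file are checked;
    this is a simplified subset of what boto3 actually searches.
    """
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"
    if shared_file.is_file():
        return "shared credentials file"
    return None


def default_shared_file(env: Mapping[str, str]) -> Path:
    """Resolve the shared credentials file path, honoring the documented override."""
    override = env.get("AWS_SHARED_CREDENTIALS_FILE")
    return Path(override) if override else Path.home() / ".aws" / "credentials"


if __name__ == "__main__":
    source = find_credential_source(os.environ, default_shared_file(os.environ))
    print(source or "no credentials found by this simplified check")
```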
Nodestream's TTL mechanism also enables new capabilities not previously available in Neptune. By annotating ingested
graph elements with timestamps, Nodestream can create pipelines that automatically expire and remove data that has
passed a configured lifespan.
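The idea behind timestamp-based expiry can be sketched in a few lines of Python. This is not Nodestream's implementation: the function name `expired_ids`, the use of a "last ingested" timestamp per element, and the strict less-than comparison against the cutoff are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def expired_ids(
    last_ingested: Dict[str, datetime],
    lifespan: timedelta,
    now: datetime,
) -> List[str]:
    """Return the ids of elements whose timestamp is older than the lifespan.

    An element expires when its timestamp falls strictly before now - lifespan.
    """
    cutoff = now - lifespan
    return sorted(
        element_id
        for element_id, stamp in last_ingested.items()
        if stamp < cutoff
    )


if __name__ == "__main__":
    now = datetime(2024, 6, 30, tzinfo=timezone.utc)
    # Hypothetical graph elements, each annotated with the time it was last ingested.
    stamps = {
        "node-a": datetime(2024, 6, 29, tzinfo=timezone.utc),  # 1 day old
        "node-b": datetime(2024, 6, 1, tzinfo=timezone.utc),   # 29 days old
        "node-c": datetime(2024, 5, 1, tzinfo=timezone.utc),   # 60 days old
    }
    print(expired_ids(stamps, timedelta(days=7), now))
```

A TTL pipeline would then delete the elements this kind of query identifies; re-ingesting an element refreshes its timestamp and keeps it alive.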