Introduction to entities

Amazon SageMaker AI automatically creates tracking entities for SageMaker AI jobs, models, model packages, and endpoints if the data is available. For a basic workflow, suppose you train a model using a dataset. SageMaker AI automatically generates a lineage graph with three entities:

  • Dataset: A type of artifact, which is an entity representing a URI-addressable object or data. An artifact is generally either an input or an output to a trial component or action.

  • TrainingJob: A type of trial component, which is an entity representing processing, training, and transform jobs.

  • Model: Another type of artifact. Like the Dataset artifact, a Model is a URI-addressable object. In this case, it is an output of the TrainingJob trial component.
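The three-entity graph described above can be pictured with a small in-memory model. The following is only an illustrative sketch in plain Python, not the SageMaker AI API: the `Entity` and `LineageGraph` classes and their methods are hypothetical names introduced here to show how a Dataset artifact feeds a TrainingJob trial component, which in turn produces a Model artifact.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a lineage entity (not a SageMaker AI class)."""
    name: str          # for example "Dataset", "TrainingJob", or "Model"
    lineage_type: str  # "Artifact" or "TrialComponent"


@dataclass
class LineageGraph:
    """Hypothetical directed graph: each edge points from an input to its consumer."""
    edges: list = field(default_factory=list)

    def link(self, source, destination):
        # Record that `source` flows into `destination`.
        self.edges.append((source, destination))

    def downstream(self, start):
        # Collect every entity reachable from `start` by following edges.
        found, frontier = [], [start]
        while frontier:
            current = frontier.pop()
            for src, dst in self.edges:
                if src == current and dst not in found:
                    found.append(dst)
                    frontier.append(dst)
        return found


dataset = Entity("Dataset", "Artifact")
training_job = Entity("TrainingJob", "TrialComponent")
model = Entity("Model", "Artifact")

graph = LineageGraph()
graph.link(dataset, training_job)  # the dataset is an input to the training job
graph.link(training_job, model)    # the model is an output of the training job

print([entity.name for entity in graph.downstream(dataset)])
# prints ['TrainingJob', 'Model']
```

Adding a preprocessing step, an endpoint, or a model package to this sketch would amount to further `link` calls, which is one way to see why the real graph grows quickly.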

Your model lineage graph expands quickly as you add steps to your workflow, such as data preprocessing or postprocessing, deploying your model to an endpoint, or including your model in a model package, among many other possibilities. For the complete list of SageMaker AI entities, see Amazon SageMaker ML Lineage Tracking.

Entity properties

Each node in the graph displays the entity type. To see specific details related to your workflow, choose the vertical ellipsis to the right of the entity type. In the basic lineage graph described earlier, you can choose the vertical ellipsis next to Dataset to see specific values for the following properties (common to all artifact entities):

  • Name: The name of your dataset.

  • Source URI: The Amazon S3 location of your dataset.

For the TrainingJob entity, you can see the specific values for the following properties (common to all TrialComponent entities):

  • Name: The name of the training job.

  • Job ARN: The Amazon Resource Name (ARN) of your training job.

For the Model entity, you see the same properties as listed for Dataset because they are both artifact entities. For a list of the entities and their associated properties, see Lineage Tracking Entities.

Entity queries

Amazon SageMaker AI automatically generates graphs of lineage entities as you use them. However, if you are running many iterations of an experiment and don't want to view every lineage graph, the AWS SDK can help you perform queries across all your workflows. For example, you can query your lineage entities for all the processing jobs that use an endpoint, or you can see all the downstream trials that use an artifact. For a list of all the queries you can perform, see Querying Lineage Entities.
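As a rough illustration of such a query, the sketch below uses the `query_lineage` operation of the boto3 SageMaker client to list trial components downstream of an artifact. It is a minimal example under stated assumptions, not the only or recommended approach: the helper names `build_lineage_query` and `downstream_trial_components` are hypothetical, the ARN is a placeholder, pagination through `NextToken` is ignored, and the call requires boto3 plus valid AWS credentials.

```python
def build_lineage_query(start_arn, direction="Descendants",
                        lineage_types=("TrialComponent",)):
    """Build keyword arguments for the boto3 SageMaker `query_lineage` call.

    Hypothetical helper: "Descendants" walks downstream from the starting
    entity, and the filter keeps only vertices of the given lineage types.
    """
    return {
        "StartArns": [start_arn],
        "Direction": direction,
        "IncludeEdges": False,
        "Filters": {"LineageTypes": list(lineage_types)},
    }


def downstream_trial_components(artifact_arn):
    """Return ARNs of trial components downstream of an artifact.

    Hypothetical helper: reads only the first page of results.
    """
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("sagemaker")
    response = client.query_lineage(**build_lineage_query(artifact_arn))
    return [vertex["Arn"] for vertex in response.get("Vertices", [])]


# Placeholder ARN for illustration only; substitute one from your own account.
EXAMPLE_ARTIFACT_ARN = "arn:aws:sagemaker:us-east-1:111122223333:artifact/example"

request = build_lineage_query(EXAMPLE_ARTIFACT_ARN)
print(request["Direction"])
# prints Descendants
```

Calling `downstream_trial_components(EXAMPLE_ARTIFACT_ARN)` would send the request. Changing `direction` to `"Ascendants"` would walk upstream instead, for example from an endpoint back toward the jobs that contributed to it.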
