Channel
A channel is a named input source that training algorithms can consume.
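As a rough illustration, a channel can be written as a plain Python dictionary using the field names this page documents. The channel name, bucket, and prefix are hypothetical placeholders, and the nested `S3DataSource` keys belong to the DataSource type rather than to this page, so treat their use here as an assumption.

```python
import json

# Hypothetical channel definition; the bucket and prefix are placeholders.
train_channel = {
    "ChannelName": "train",
    "DataSource": {
        # Assumed shape of the DataSource object (not defined on this page).
        "S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://example-bucket/train/",
            "S3DataDistributionType": "FullyReplicated",
        }
    },
    "ContentType": "text/csv",
    "CompressionType": "None",
    "InputMode": "File",
}

# Only ChannelName and DataSource are required; the other keys are optional.
print(json.dumps(train_channel, indent=2))
```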
Contents
- ChannelName
-
The name of the channel.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
[A-Za-z0-9\.\-_]+
Required: Yes
- DataSource
-
The location of the channel data.
Type: DataSource object
Required: Yes
- CompressionType
-
If training data is compressed, the compression type. The default value is None. CompressionType is used only in Pipe input mode. In File mode, leave this field unset or set it to None.
Type: String
Valid Values:
None | Gzip
Required: No
- ContentType
-
The MIME type of the data.
Type: String
Length Constraints: Maximum length of 256.
Pattern:
.*
Required: No
- InputMode
-
(Optional) The input mode to use for the data channel in a training job. If you don't set a value for InputMode, SageMaker uses the value set for TrainingInputMode. Use this parameter to override the TrainingInputMode setting in an AlgorithmSpecification request when you have a channel that needs a different input mode from the training job's general setting. To download the data from Amazon Simple Storage Service (Amazon S3) to the provisioned ML storage volume, and mount the directory to a Docker volume, use File input mode. To stream data directly from Amazon S3 to the container, choose Pipe input mode.
To use a model for incremental training, choose File input mode.
Type: String
Valid Values:
Pipe | File | FastFile
Required: No
- RecordWrapperType
-
Specify RecordIO as the value when input data is in raw format but the training algorithm requires the RecordIO format. In this case, SageMaker wraps each individual S3 object in a RecordIO record. If the input data is already in RecordIO format, you don't need to set this attribute. For more information, see Create a Dataset Using RecordIO. In File mode, leave this field unset or set it to None.
Type: String
Valid Values:
None | RecordIO
Required: No
- ShuffleConfig
-
A configuration for a shuffle option for input data in a channel. If you use S3Prefix for S3DataType, this shuffles the results of the S3 key prefix matches. If you use ManifestFile, the order of the S3 object references in the ManifestFile is shuffled. If you use AugmentedManifestFile, the order of the JSON lines in the AugmentedManifestFile is shuffled. The shuffling order is determined using the Seed value.
For Pipe input mode, shuffling is done at the start of every epoch. With large datasets, this ensures that the order of the training data is different for each epoch, which helps reduce bias and possible overfitting. In a multi-node training job, when ShuffleConfig is combined with an S3DataDistributionType of ShardedByS3Key, the data is shuffled across nodes so that the content sent to a particular node on the first epoch might be sent to a different node on the second epoch.
Type: ShuffleConfig object
Required: No
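The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any AWS SDK: `validate_channel` and its messages are hypothetical, and it covers only the rules stated on this page (required fields, length limits, the ChannelName pattern, valid values, and the File mode guidance).

```python
import re

# Pattern and valid values copied from the field descriptions above.
CHANNEL_NAME_PATTERN = re.compile(r"[A-Za-z0-9\.\-_]+")
VALID_VALUES = {
    "CompressionType": {"None", "Gzip"},
    "InputMode": {"Pipe", "File", "FastFile"},
    "RecordWrapperType": {"None", "RecordIO"},
}


def validate_channel(channel):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []

    name = channel.get("ChannelName")
    if not isinstance(name, str) or not 1 <= len(name) <= 64:
        problems.append("ChannelName must be a string of 1 to 64 characters")
    elif not CHANNEL_NAME_PATTERN.fullmatch(name):
        problems.append("ChannelName has characters outside A-Z, a-z, 0-9, '.', '-', '_'")

    if "DataSource" not in channel:
        problems.append("DataSource is required")

    content_type = channel.get("ContentType")
    if content_type is not None and len(content_type) > 256:
        problems.append("ContentType must be at most 256 characters")

    for field, allowed in VALID_VALUES.items():
        value = channel.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")

    # Checked only when the channel sets InputMode itself; a channel that
    # inherits TrainingInputMode from the job is not covered by this sketch.
    if channel.get("InputMode") == "File":
        for field in ("CompressionType", "RecordWrapperType"):
            if channel.get(field) not in (None, "None"):
                problems.append(f"{field} should be unset or None in File mode")

    return problems


# Usage: a channel with only the two required fields, then a faulty one.
print(validate_channel({"ChannelName": "train", "DataSource": {}}))
print(validate_channel({"ChannelName": "my channel", "InputMode": "File",
                        "CompressionType": "Gzip"}))
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in a channel at once.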
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: