Amazon Forecast is no longer available to new customers. Existing customers of
Amazon Forecast can continue to use the service as normal.
Getting Started (AWS CLI)
In this exercise, you use the AWS Command Line Interface (AWS CLI) to explore Amazon Forecast. You create an Amazon Forecast dataset, train a predictor, and use the resulting predictor to generate a forecast. Before you begin, make sure that you have an AWS account and that you've set up the AWS CLI. For more information, see Setting Up.
Note
The AWS CLI commands in this exercise were tested on Linux. For information about using the AWS CLI commands on Windows, see Specifying Parameter Values for the AWS Command Line Interface in the AWS Command Line Interface User Guide.
Begin by creating a dataset and importing the electricity usage data into it.
To create an Amazon Forecast dataset
-
Decide which domain and dataset type are appropriate.
The training data that you will import into the dataset influences your choice of dataset domain and type. So, let's review a few sample rows of the electricity usage data.
2014-01-01 01:00:00, 2.53807106598985, client_0
2014-01-01 01:00:00, 23.648648648648624, client_1
2014-01-01 02:00:00, 9.648648648612345, client_0
The data format is CSV (comma-separated values), and it's collected hourly (as shown by the timestamps). It includes these columns:
-
Column 1 – Timestamps that show when electricity usage was recorded.
-
Column 2 – Hourly electricity usage values (note how the timestamp values increase by hour).
-
Column 3 – Client ID values that identify the customers using the electricity.
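The column layout described above can be checked with a few lines of Python. This is only a sketch using the three sample rows from this exercise; the function name `parse_rows` is a hypothetical helper, not part of Amazon Forecast.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from this exercise: timestamp, usage, client ID.
SAMPLE = """\
2014-01-01 01:00:00, 2.53807106598985, client_0
2014-01-01 01:00:00, 23.648648648648624, client_1
2014-01-01 02:00:00, 9.648648648612345, client_0
"""


def parse_rows(text):
    """Return (timestamp, usage, client_id) tuples parsed from CSV text."""
    rows = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for ts, usage, client in reader:
        parsed_ts = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        rows.append((parsed_ts, float(usage), client))
    return rows


rows = parse_rows(SAMPLE)

# Gap between consecutive readings for the same client, in hours.
client_0_times = [ts for ts, _, client in rows if client == "client_0"]
gap_hours = (client_0_times[1] - client_0_times[0]).total_seconds() / 3600
print(gap_hours)
```

A gap of one hour between readings for the same client is what makes `H` the appropriate data frequency later in the exercise.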
For this data, choose the following predefined dataset domain and dataset type:
-
Custom domain – None of the dataset domains, such as METRICS, RETAIL, or WEB_TRAFFIC, applies to this data, so choose the Custom domain.
-
Target time series type – The data is a time series because it tracks electricity usage over time. It also includes the target that we want to forecast (Column 2, electricity usage). Therefore, choose the target time series dataset type.
To understand why you choose this type, see Predefined Dataset Domains and Dataset Types.
-
Decide on a dataset schema.
The target time series type for the CUSTOM domain requires these fields: timestamp, target_value, and item_id. The target_value field is the target. Amazon Forecast generates the forecast for this field.
To map the required fields to columns in your data, you create a schema. Each attribute in the schema maps to a field in the data.
Important
The order of attributes in the schema must match the order of fields in the training data.
{
  "Attributes": [
    {
      "AttributeName": "timestamp",
      "AttributeType": "timestamp"
    },
    {
      "AttributeName": "target_value",
      "AttributeType": "float"
    },
    {
      "AttributeName": "item_id",
      "AttributeType": "string"
    }
  ]
}
You now have the information necessary to create a dataset and import data into it.
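Because attribute order matters, it can help to test a data row against the schema before importing. The following is a minimal sketch, not the validation that Amazon Forecast performs. `CONVERTERS` and `row_matches_schema` are hypothetical names, and only the three attribute types used in this schema are handled.

```python
import json
from datetime import datetime

SCHEMA = json.loads("""
{"Attributes": [
  {"AttributeName": "timestamp", "AttributeType": "timestamp"},
  {"AttributeName": "target_value", "AttributeType": "float"},
  {"AttributeName": "item_id", "AttributeType": "string"}
]}
""")

# Hypothetical converters, one per attribute type used in this schema.
CONVERTERS = {
    "timestamp": lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"),
    "float": float,
    "string": str,
}


def row_matches_schema(fields, schema=SCHEMA):
    """Return True if each field parses as the type at the same schema position."""
    attrs = schema["Attributes"]
    if len(fields) != len(attrs):
        return False
    for field, attr in zip(fields, attrs):
        try:
            CONVERTERS[attr["AttributeType"]](field.strip())
        except ValueError:
            return False
    return True


good = ["2014-01-01 01:00:00", "2.53807106598985", "client_0"]
swapped = ["client_0", "2.53807106598985", "2014-01-01 01:00:00"]
print(row_matches_schema(good), row_matches_schema(swapped))
```

The second row fails because its first field is a client ID where the schema expects a timestamp.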
-
Create the dataset.
aws forecast create-dataset \
    --dataset-name electricity_demand_ds \
    --domain CUSTOM \
    --dataset-type TARGET_TIME_SERIES \
    --data-frequency H \
    --schema '{
      "Attributes": [
        {
          "AttributeName": "timestamp",
          "AttributeType": "timestamp"
        },
        {
          "AttributeName": "target_value",
          "AttributeType": "float"
        },
        {
          "AttributeName": "item_id",
          "AttributeType": "string"
        }
      ]
    }'
In the request, the data-frequency value H represents a data collection frequency of hourly. The following is an example response.
{
  "DatasetArn": "arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds"
}
For more information about this operation, see CreateDataset.
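Quoting a JSON schema by hand on the command line is error-prone. One possible approach, sketched here, is to build the same create-dataset command from a Python dictionary. The helper name `create_dataset_command` is hypothetical; the flags are the ones used in this step.

```python
import json
import shlex


def create_dataset_command(name, schema, frequency="H"):
    """Build the argument list for the create-dataset call used in this step."""
    return [
        "aws", "forecast", "create-dataset",
        "--dataset-name", name,
        "--domain", "CUSTOM",
        "--dataset-type", "TARGET_TIME_SERIES",
        "--data-frequency", frequency,
        "--schema", json.dumps(schema),  # serialize once, no manual quoting
    ]


schema = {
    "Attributes": [
        {"AttributeName": "timestamp", "AttributeType": "timestamp"},
        {"AttributeName": "target_value", "AttributeType": "float"},
        {"AttributeName": "item_id", "AttributeType": "string"},
    ]
}

cmd = create_dataset_command("electricity_demand_ds", schema)

# Print a shell-safe version of the command for copying into a terminal.
print(shlex.join(cmd))
```

The argument list could also be passed to `subprocess.run` directly, which avoids shell quoting altogether.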
-
(Optional) Get the description of the dataset.
aws forecast describe-dataset \
    --dataset-arn arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds
The following is an example response.
{
  "DatasetName": "electricity_demand_ds",
  "DatasetArn": "arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds",
  "CreationTime": 1564533087.907,
  "LastModificationTime": 1564533087.907,
  "Domain": "CUSTOM",
  "DatasetType": "TARGET_TIME_SERIES",
  "DataFrequency": "H",
  "Schema": { ... },
  "EncryptionConfig": {},
  "Status": "ACTIVE"
}
Note
The order of the key-value pairs in the response is arbitrary.
-
Create a dataset group and add the dataset to it. The value of the domain parameter must match the domain of the dataset.
aws forecast create-dataset-group \
    --dataset-group-name electricity_ds_group \
    --dataset-arns arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds \
    --domain CUSTOM
The following is an example response.
{
  "DatasetGroupArn": "arn:aws:forecast:us-west-2:acct-id:dataset-group/electricity_ds_group"
}
For more information about this operation, see CreateDatasetGroup.
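The ARNs in this exercise follow a common pattern, which makes it easy to mistype the resource type. The sketch below composes them from their parts. The pattern is inferred only from the example responses in this exercise, and `forecast_arn` is a hypothetical helper; in practice, copy the ARN from each command's response.

```python
def forecast_arn(resource_type, name, account_id="acct-id", region="us-west-2"):
    """Compose an ARN in the shape shown by the example responses."""
    return f"arn:aws:forecast:{region}:{account_id}:{resource_type}/{name}"


dataset_arn = forecast_arn("dataset", "electricity_demand_ds")
group_arn = forecast_arn("dataset-group", "electricity_ds_group")

print(dataset_arn)
print(group_arn)
```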
-
(Optional) Get the description of the dataset group.
aws forecast describe-dataset-group \
    --dataset-group-arn arn:aws:forecast:us-west-2:acct-id:dataset-group/electricity_ds_group
The following is an example response.
{
  "DatasetGroupName": "electricity_ds_group",
  "DatasetGroupArn": "arn:aws:forecast:us-west-2:acct-id:dataset-group/electricity_ds_group",
  "DatasetArns": [
    "arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds"
  ],
  "Domain": "CUSTOM",
  "CreationTime": 1564533719.852,
  "LastModificationTime": 1564533719.852,
  "Status": "ACTIVE"
}
-
Import the electricity usage training data from your Amazon S3 bucket to the dataset. The IAM role that you provide must have permission to read data from your S3 bucket. For information on how to create an IAM role, see Create an IAM Role for Amazon Forecast (AWS CLI).
aws forecast create-dataset-import-job \
    --dataset-arn arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds \
    --dataset-import-job-name electricity_ds_import_job \
    --data-source '{
      "S3Config": {
        "Path": "s3://bucket/electricityusagedata.csv",
        "RoleArn": "arn:aws:iam::acct-id:role/Role"
      }
    }'
The following is the shorthand syntax for the data-source parameter.
--data-source S3Config="{Path='s3://bucket/electricityusagedata.csv',RoleArn='arn:aws:iam::acct-id:role/Role'}"
The following is an example response.
{
  "DatasetImportJobArn": "arn:aws:forecast:us-west-2:acct-id:dataset-import-job/electricity_demand_ds/electricity_ds_import_job"
}
For more information about this operation, see CreateDatasetImportJob.
-
Check the import status.
aws forecast describe-dataset-import-job \
    --dataset-import-job-arn arn:aws:forecast:us-west-2:acct-id:dataset-import-job/electricity_demand_ds/electricity_ds_import_job
The following is an example response.
{
  "DatasetImportJobName": "electricity_ds_import_job",
  "DatasetImportJobArn": "arn:aws:forecast:us-west-2:acct-id:dataset-import-job/electricity_demand_ds/electricity_ds_import_job",
  "DatasetArn": "arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds",
  "DataSource": {
    "S3Config": {
      "Path": "s3://bucket/electricityusagedata.csv",
      "RoleArn": "arn:aws:iam::acct-id:role/ForecastRole"
    }
  },
  "DataSize": 0.14639010466635227,
  "TimeStampFormat": "yyyy-MM-dd HH:mm:ss",
  "CreationTime": 1564537011.114,
  "LastModificationTime": 1564537028.223,
  "Status": "CREATE_IN_PROGRESS"
}
When all of the data has been imported, the status changes to ACTIVE and the response includes statistics for the data, as shown in the following example.
{
  "DatasetArn": "arn:aws:forecast:us-west-2:acct-id:dataset/electricity_demand_ds",
  "Status": "ACTIVE",
  "FieldStatistics": {
    "date": {
      "Min": "2014-01-01T01:00:00Z",
      "Max": "2015-01-01T00:00:00Z",
      "Count": 3241200,
      "CountDistinct": 8760,
      "CountNull": 0
    },
    "target": {
      "Min": "0.0",
      "Max": "168200.0",
      "Avg": 606.5167610461679,
      "Stddev": 3518.405223972031,
      "Count": 3241200,
      "CountDistinct": 1196961,
      "CountNull": 0,
      "CountNan": 0
    },
    "item": {
      "Count": 3241200,
      "CountDistinct": 370,
      "CountNull": 0
    }
  },
  ...
}
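The field statistics allow a quick completeness check on the imported data. The sketch below uses only values copied from the example response and tests two things: whether every hour in the date range is present, and whether every item has a row for every hour. The variable names are illustrative.

```python
from datetime import datetime, timedelta

# Values copied from the FieldStatistics example response.
stats = {
    "date": {
        "Min": "2014-01-01T01:00:00Z",
        "Max": "2015-01-01T00:00:00Z",
        "Count": 3241200,
        "CountDistinct": 8760,
    },
    "item": {"Count": 3241200, "CountDistinct": 370},
}


def parse(ts):
    """Parse the timestamp format used in the statistics."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


# Number of hourly timestamps between Min and Max, inclusive.
first, last = parse(stats["date"]["Min"]), parse(stats["date"]["Max"])
hours_in_range = (last - first) // timedelta(hours=1) + 1

no_missing_hours = hours_in_range == stats["date"]["CountDistinct"]
full_grid = (
    stats["date"]["CountDistinct"] * stats["item"]["CountDistinct"]
    == stats["date"]["Count"]
)
print(hours_in_range, no_missing_hours, full_grid)
```

For this example both checks pass: 8,760 distinct hours times 370 distinct items equals the 3,241,200 rows reported.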
Important
You must wait until the status is ACTIVE before creating a predictor with the dataset group.
For more information about this operation, see DescribeDatasetImportJob.
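Waiting for the ACTIVE status can be automated with a simple polling loop. The following is a sketch under stated assumptions: `import_job_status` shells out to the AWS CLI using its global `--query` and `--output` options and needs configured credentials, and `wait_until_active` is a hypothetical helper. The dry run at the end uses canned statuses rather than a real AWS call.

```python
import subprocess
import time


def import_job_status(job_arn):
    """Ask the AWS CLI for the job's Status field (needs configured credentials)."""
    result = subprocess.run(
        ["aws", "forecast", "describe-dataset-import-job",
         "--dataset-import-job-arn", job_arn,
         "--query", "Status", "--output", "text"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def wait_until_active(get_status, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll get_status() until it returns ACTIVE; raise on failure or timeout."""
    for _ in range(max_polls):
        status = get_status()
        if status == "ACTIVE":
            return status
        if status.endswith("FAILED"):
            raise RuntimeError(f"job ended with status {status}")
        sleep(poll_seconds)
    raise TimeoutError("job did not become ACTIVE in time")


# Dry run with canned statuses instead of a real AWS call.
canned = iter(["CREATE_PENDING", "CREATE_IN_PROGRESS", "ACTIVE"])
final_status = wait_until_active(lambda: next(canned), sleep=lambda s: None)
print(final_status)
```

Against a real account, the call would look like `wait_until_active(lambda: import_job_status(job_arn))`, where `job_arn` is the ARN returned when the import job was created.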
To create a predictor, you use the CreateAutoPredictor operation and provide the following information.
-
Predictor name – Give the predictor a name so you can distinguish it from your other predictors.
-
Dataset group – You created the dataset group in the preceding step.
-
Forecast frequency – The granularity of your forecasts (hourly, daily, weekly, and so on).
-
Forecast horizon – The number of time steps being forecasted.
After the predictor is created, you review the accuracy metrics generated by Amazon Forecast. The metrics help you decide whether to use the predictor for generating a forecast. For more information about predictors, see Training Predictors.
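Forecast frequency and forecast horizon together determine how far ahead the forecast reaches. The following sketch makes the arithmetic explicit for three frequency codes. The `STEP` table and `forecast_span` are illustrative names and cover only a subset of the frequencies that Amazon Forecast supports.

```python
from datetime import timedelta

# Length of one time step for a few frequency codes (illustrative subset).
STEP = {
    "H": timedelta(hours=1),
    "D": timedelta(days=1),
    "W": timedelta(weeks=1),
}


def forecast_span(horizon, frequency):
    """Total time covered by `horizon` steps at the given frequency."""
    return horizon * STEP[frequency]


daily_span = forecast_span(36, "D")
hourly_span = forecast_span(36, "H")
print(daily_span, hourly_span, sep=" | ")
```

A horizon of 36 covers 36 days at a daily frequency but only a day and a half at an hourly frequency, so the two values should be chosen together.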
To create a predictor and review the accuracy metrics
-
Create the predictor.
aws forecast create-auto-predictor \
    --predictor-name electricitypredictor \
    --data-config DatasetGroupArn="arn:aws:forecast:us-west-2:acct-id:dataset-group/electricity_ds_group" \
    --forecast-horizon 36 \
    --forecast-frequency D
The following is an example response.
{
  "PredictorArn": "arn:aws:forecast:us-west-2:acct-id:predictor/electricitypredictor"
}
-
Get the predictor's status.
aws forecast describe-auto-predictor \
    --predictor-arn arn:aws:forecast:us-west-2:acct-id:predictor/electricitypredictor
The following is an example response.
{
  "PredictorArn": "arn:aws:forecast:<region>:<acct-num>:predictor/electricitypredictor",
  "PredictorName": "electricitypredictor",
  "ForecastHorizon": 36,
  "ForecastTypes": [
    "0.1",
    "0.5",
    "0.9"
  ],
  "ForecastFrequency": "D",
  "DatasetImportJobArns": [
    "arn:aws:forecast:<region>:<acct-num>:dataset-import-job/getting_started_dataset/gs_import"
  ],
  "DataConfig": {
    "DatasetGroupArn": "arn:aws:forecast:<region>:<acct-num>:dataset-group/getting_started",
    "AttributeConfigs": [
      {
        "AttributeName": "target_value",
        "Transformations": {
          "aggregation": "sum",
          "backfill": "zero",
          "frontfill": "none",
          "middlefill": "zero"
        }
      }
    ]
  },
  "EstimatedTimeRemainingInMinutes": 97,
  "Status": "CREATE_IN_PROGRESS",
  "CreationTime": "2022-02-23T09:26:24.643000-08:00",
  "LastModificationTime": "2022-02-23T09:49:26.899000-08:00",
  "ExplainabilityInfo": {
    "Status": "NOT_AVAILABLE"
  }
}
Important
Model training takes time. Don't proceed until training has completed and the status of the predictor is ACTIVE.
-
Get the accuracy metrics for the predictor.
aws forecast get-accuracy-metrics \
    --predictor-arn arn:aws:forecast:us-west-2:acct-id:predictor/electricitypredictor
The following is an example response.
{
  "PredictorEvaluationResults": [
    {
      "TestWindows": [
        {
          "EvaluationType": "SUMMARY",
          "Metrics": {
            "RMSE": 448.19602551622864,
            "WeightedQuantileLosses": [
              {
                "Quantile": 0.9,
                "LossValue": 0.11574311406253326
              },
              {
                "Quantile": 0.5,
                "LossValue": 0.1706269067283527
              },
              {
                "Quantile": 0.1,
                "LossValue": 0.11724164222477837
              }
            ]
          }
        },
        {
          "EvaluationType": "COMPUTED",
          "Metrics": {
            "RMSE": 448.19602551622864,
            "WeightedQuantileLosses": [
              {
                "Quantile": 0.9,
                "LossValue": 0.11574311406253326
              },
              {
                "Quantile": 0.5,
                "LossValue": 0.1706269067283527
              },
              {
                "Quantile": 0.1,
                "LossValue": 0.11724164222477837
              }
            ]
          },
          "TestWindowEnd": 1420070400.0,
          "TestWindowStart": 1420002000.0
        }
      ]
    }
  ]
}
The metrics show the weighted quantile loss for each quantile. For example, the loss for the 0.1 quantile was about 11.7%. The metrics also show the root-mean-square error (RMSE).
The summary metrics show the average of the computed metrics over all test windows. Because there was only one test window, the summary and computed metrics are equal.
For more information about this operation, see GetAccuracyMetrics.
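The accuracy response is deeply nested, so a small script can make the numbers easier to read. The following sketch loads a trimmed copy of the example response (quantile losses only), prints each loss as a percentage, and checks that the summary matches the single computed window. `quantile_losses` is a hypothetical helper, and the lookup by evaluation type assumes one computed window, as in this example.

```python
import json

# Trimmed copy of the example response; RMSE and window times omitted.
RESPONSE = json.loads("""
{"PredictorEvaluationResults": [{"TestWindows": [
  {"EvaluationType": "SUMMARY", "Metrics": {"WeightedQuantileLosses": [
    {"Quantile": 0.9, "LossValue": 0.11574311406253326},
    {"Quantile": 0.5, "LossValue": 0.1706269067283527},
    {"Quantile": 0.1, "LossValue": 0.11724164222477837}]}},
  {"EvaluationType": "COMPUTED", "Metrics": {"WeightedQuantileLosses": [
    {"Quantile": 0.9, "LossValue": 0.11574311406253326},
    {"Quantile": 0.5, "LossValue": 0.1706269067283527},
    {"Quantile": 0.1, "LossValue": 0.11724164222477837}]}}
]}]}
""")


def quantile_losses(window):
    """Map each quantile to its weighted quantile loss for one test window."""
    losses = window["Metrics"]["WeightedQuantileLosses"]
    return {entry["Quantile"]: entry["LossValue"] for entry in losses}


windows = RESPONSE["PredictorEvaluationResults"][0]["TestWindows"]
by_type = {w["EvaluationType"]: w for w in windows}  # one window per type here
summary = quantile_losses(by_type["SUMMARY"])
computed = quantile_losses(by_type["COMPUTED"])

for quantile in sorted(summary):
    print(f"P{round(quantile * 100)}: {summary[quantile]:.1%}")
print("summary equals computed:", summary == computed)
```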
Amazon Forecast creates a forecast for the target_value field (as determined by the dataset domain and type) for each unique item_id in the dataset. In this exercise, the target_value field provides electricity usage and the item_id provides client IDs. You get a forecast for the hourly electricity usage by customer.
After the forecast has been created, you can query for a single item or export the complete forecast.
To create, retrieve, and export a forecast
-
Create the forecast.
aws forecast create-forecast \
    --forecast-name electricityforecast \
    --predictor-arn arn:aws:forecast:us-west-2:acct-id:predictor/electricitypredictor
The operation uses the predictor to create a forecast. In the response, you get the Amazon Resource Name (ARN) of the forecast. You use this ARN to retrieve and export the forecast. The following is an example response.
{
  "ForecastArn": "arn:aws:forecast:us-west-2:acct-id:forecast/electricityforecast"
}
For more information about this operation, see CreateForecast.
-
Retrieve the first two hours of the forecast for client_1.
Note
The service name, forecastquery, is different from the service name used elsewhere.
aws forecastquery query-forecast \
    --forecast-arn arn:aws:forecast:us-west-2:acct-id:forecast/electricityforecast \
    --start-date 2015-01-01T00:00:00 \
    --end-date 2015-01-01T02:00:00 \
    --filters '{"item_id":"client_1"}'
The operation includes the following parameters.
-
start-date and end-date – Specifies an optional date range to retrieve the forecast for. If you don't specify these parameters, the operation returns the entire forecast for client_1.
-
filters – Specifies the item_id filter to retrieve the electricity forecast for client_1.
The following is the shorthand syntax for the filters parameter.
--filters item_id="client_1"
The following is an example response.
{
  "Forecast": {
    "Predictions": {
      "mean": [
        {
          "Timestamp": "2015-01-01T01:00:00",
          "Value": 20.952411651611328
        },
        {
          "Timestamp": "2015-01-01T02:00:00",
          "Value": 19.11078453063965
        }
      ],
      "p90": [
        {
          "Timestamp": "2015-01-01T01:00:00",
          "Value": 24.524038314819336
        },
        {
          "Timestamp": "2015-01-01T02:00:00",
          "Value": 22.319091796875
        }
      ],
      "p50": [
        {
          "Timestamp": "2015-01-01T01:00:00",
          "Value": 20.7841739654541
        },
        {
          "Timestamp": "2015-01-01T02:00:00",
          "Value": 19.237524032592773
        }
      ],
      "p10": [
        {
          "Timestamp": "2015-01-01T01:00:00",
          "Value": 18.507278442382812
        },
        {
          "Timestamp": "2015-01-01T02:00:00",
          "Value": 16.15062141418457
        }
      ]
    }
  }
}
Because this is an hourly forecast, the response shows hourly forecast values. In the response, note the following:
-
mean – For the specific date and time, the mean is the predicted mean electricity usage value for the customer.
-
p90, p50, and p10 – Specify the confidence level that the actual value will be below the listed value at the specified date and time. For example, at 2015-01-01T01:00:00, Amazon Forecast is 90% confident that the electricity usage will be below 24.5. Amazon Forecast is 50% confident that usage will be below 20.8, and 10% confident that usage will be below 18.5.
For more information about this operation, see QueryForecast.
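Read together, the p10 and p90 values bracket a range that, under the interpretation above, should contain the actual usage about 80% of the time. The sketch below extracts that range for one timestamp from a trimmed copy of the example response. `band` is a hypothetical helper, and the 80% reading assumes the quantiles are well calibrated.

```python
# Trimmed copy of the Predictions object from the example response.
predictions = {
    "p90": [
        {"Timestamp": "2015-01-01T01:00:00", "Value": 24.524038314819336},
        {"Timestamp": "2015-01-01T02:00:00", "Value": 22.319091796875},
    ],
    "p10": [
        {"Timestamp": "2015-01-01T01:00:00", "Value": 18.507278442382812},
        {"Timestamp": "2015-01-01T02:00:00", "Value": 16.15062141418457},
    ],
}


def band(predictions, timestamp, low="p10", high="p90"):
    """Return the (low, high) quantile values at one timestamp."""
    def value(key):
        return next(
            point["Value"]
            for point in predictions[key]
            if point["Timestamp"] == timestamp
        )
    return value(low), value(high)


low, high = band(predictions, "2015-01-01T01:00:00")
print(f"{low:.1f} to {high:.1f} (width {high - low:.1f})")
```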
-
Export the complete forecast to your Amazon S3 bucket. The IAM role that you provide must have permission to write data to your S3 bucket. For information on how to create an IAM role, see Create an IAM Role for Amazon Forecast (AWS CLI).
Create a forecast export job.
aws forecast create-forecast-export-job \
    --forecast-export-job-name electricityforecast_exportjob \
    --forecast-arn arn:aws:forecast:us-west-2:acct-id:forecast/electricityforecast \
    --destination S3Config="{Path='s3://bucket',RoleArn='arn:aws:iam::acct-id:role/Role'}"
The following is an example response.
{
  "ForecastExportJobArn": "arn:aws:forecast:us-west-2:acct-id:forecast-export/64bbc087"
}
For more information about this operation, see CreateForecastExportJob.
-
Get the status of the export job.
aws forecast describe-forecast-export-job \
    --forecast-export-job-arn arn:aws:forecast:us-west-2:acct-id:forecast-export/64bbc087
The following is an example response.
{
  "ForecastExportJobArn": "arn:aws:forecast:us-west-2:acct-id:forecast-export/64bbc087",
  "ForecastExportJobName": "electricityforecast_exportjob",
  "Status": "CREATE_IN_PROGRESS"
}
When the status is ACTIVE, you can find the forecast files in the specified S3 bucket.