Using PyTorch Elastic Inference accelerators on Amazon ECS
To use Elastic Inference accelerators with PyTorch
- From your terminal, create an Amazon ECS cluster named pytorch-eia in an AWS Region that has access to Elastic Inference.

  aws ecs create-cluster --cluster-name pytorch-eia \
      --region <region>
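
  To confirm that the cluster was created, you can describe it. This check is a suggested addition rather than part of the original procedure.

  # Optional: the cluster status should be ACTIVE
  aws ecs describe-clusters --clusters pytorch-eia --region <region>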
- Create a text file called pt_script.txt and add the following text.

  #!/bin/bash
  echo ECS_CLUSTER=pytorch-eia >> /etc/ecs/ecs.config
- Create a text file called my_mapping.txt and add the following text.

  [
    {
      "DeviceName": "/dev/xvda",
      "Ebs": {
        "VolumeSize": 100
      }
    }
  ]
- Launch an Amazon EC2 instance in the cluster that you created in Step 1 without attaching an Elastic Inference accelerator. To select an AMI, see Amazon ECS-optimized AMIs.

  aws ec2 run-instances --image-id <ECS_Optimized_AMI> \
      --count 1 \
      --instance-type <cpu_instance_type> \
      --key-name <name_of_key_pair_on_ec2_console> \
      --security-group-ids <sg_created_with_vpc> \
      --iam-instance-profile Name="ecsInstanceRole" \
      --user-data file://pt_script.txt \
      --block-device-mappings file://my_mapping.txt \
      --region <region> \
      --subnet-id <subnet_with_ei_endpoint>
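
  When the instance boots, the user data from pt_script.txt registers it with the pytorch-eia cluster. As an optional sanity check (not part of the original steps), list the cluster's container instances and verify that one appears:

  aws ecs list-container-instances --cluster pytorch-eia --region <region>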
- Create a PyTorch inference task definition named pt_task_def.json. Set "image" to any PyTorch image name. To select an image, see Prebuilt Amazon SageMaker Docker Images. For "deviceType" options, see Launching an Instance with Elastic Inference.

  {
    "requiresCompatibilities": ["EC2"],
    "containerDefinitions": [
      {
        "entryPoint": [
          "/bin/bash",
          "-c",
          "mxnet-model-server --start --foreground --mms-config /home/model-server/config.properties --models densenet-eia=https://aws-dlc-sample-models.s3.amazonaws.com/pytorch/densenet_eia/densenet_eia.mar"
        ],
        "name": "pytorch-inference-container",
        "image": "<pytorch-image-name>",
        "memory": 8111,
        "cpu": 256,
        "essential": true,
        "portMappings": [
          {
            "hostPort": 80,
            "protocol": "tcp",
            "containerPort": 8080
          },
          {
            "hostPort": 8081,
            "protocol": "tcp",
            "containerPort": 8081
          }
        ],
        "healthCheck": {
          "retries": 2,
          "command": [
            "CMD-SHELL",
            "LD_LIBRARY_PATH=/opt/ei_health_check/lib /opt/ei_health_check/bin/health_check"
          ],
          "timeout": 5,
          "interval": 30,
          "startPeriod": 60
        },
        "logConfiguration": {
          "logDriver": "awslogs",
          "options": {
            "awslogs-group": "/ecs/pytorch-inference-eia",
            "awslogs-region": "<region>",
            "awslogs-stream-prefix": "densenet-eia",
            "awslogs-create-group": "true"
          }
        },
        "resourceRequirements": [
          {
            "type": "InferenceAccelerator",
            "value": "device_1"
          }
        ]
      }
    ],
    "inferenceAccelerators": [
      {
        "deviceName": "device_1",
        "deviceType": "<EIA_instance_type>"
      }
    ],
    "volumes": [],
    "networkMode": "bridge",
    "placementConstraints": [],
    "family": "pytorch-eia"
  }
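
  Because register-task-definition rejects malformed JSON, it can help to validate the file locally before the next step. This check is a suggested addition and assumes Python 3 is available on your workstation.

  python3 -m json.tool pt_task_def.json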
- Register the PyTorch inference task definition. Note the task definition family and revision number in the output.

  aws ecs register-task-definition --cli-input-json file://pt_task_def.json --region <region>
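
  If you missed the revision number in the output, you can look it up later. The query shown here is a suggested convenience, not part of the original procedure.

  # Optional: retrieve the latest revision of the pytorch-eia family
  aws ecs describe-task-definition --task-definition pytorch-eia \
      --region <region> \
      --query 'taskDefinition.revision'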
- Create a PyTorch inference service.

  aws ecs create-service --cluster pytorch-eia \
      --service-name pt-eia1 \
      --task-definition pytorch-eia:<revision_number> \
      --desired-count 1 \
      --scheduling-strategy="REPLICA" \
      --region <region>
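
  Before you test the endpoint, you may want to confirm that the service has a running task and look up the public IP address of the container instance, which the inference request below uses as <ec2_public_ip_address>. These lookups are suggested additions, not part of the original procedure; <instance_id_from_run_instances> is a placeholder for the instance ID returned when you launched the instance earlier.

  # Optional: wait until runningCount is 1
  aws ecs describe-services --cluster pytorch-eia --services pt-eia1 \
      --region <region> \
      --query 'services[0].runningCount'

  # Optional: look up the public IP address of the container instance
  aws ec2 describe-instances --instance-ids <instance_id_from_run_instances> \
      --region <region> \
      --query 'Reservations[0].Instances[0].PublicIpAddress'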
- Download the input image for the test.

  curl -O https://s3.amazonaws.com/model-server/inputs/flower.jpg
- Begin inference using a query with the REST API.

  curl -X POST http://<ec2_public_ip_address>:80/predictions/densenet-eia -T flower.jpg

- The results should look something like the following.

  [
    ["pot, flowerpot", 14.690367698669434],
    ["sulphur butterfly, sulfur butterfly", 9.29893970489502],
    ["bee", 8.29178237915039],
    ["vase", 6.987090587615967],
    ["hummingbird", 4.341294765472412]
  ]