Using MXNet Elastic Inference accelerators on Amazon ECS

To use the Elastic Inference accelerator with MXNet
  1. Create an Amazon ECS cluster named mxnet-eia in an AWS Region that has access to Elastic Inference.

    aws ecs create-cluster --cluster-name mxnet-eia \
        --region <region>
  2. Create a text file called mx_script.txt and add the following text.

    #!/bin/bash
    echo ECS_CLUSTER=mxnet-eia >> /etc/ecs/ecs.config
  3. Create a text file called my_mapping.txt and add the following text.

    [
      {
        "DeviceName": "/dev/xvda",
        "Ebs": {
          "VolumeSize": 100
        }
      }
    ]
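If you prefer to generate the two helper files from a script instead of typing them by hand, a short Python sketch like the following would produce the same contents as Steps 2 and 3 (filenames and contents are taken from those steps; nothing else is assumed):

```python
import json

# Block device mapping from Step 3: a 100 GiB root volume on /dev/xvda.
mapping = [{"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": 100}}]

with open("my_mapping.txt", "w") as f:
    json.dump(mapping, f, indent=2)

# User-data script from Step 2: joins the instance to the mxnet-eia cluster.
with open("mx_script.txt", "w") as f:
    f.write("#!/bin/bash\necho ECS_CLUSTER=mxnet-eia >> /etc/ecs/ecs.config\n")
```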
  4. Launch an Amazon EC2 instance in the cluster that you created in Step 1 without attaching an Elastic Inference accelerator. To select an AMI, see Amazon ECS-optimized AMIs.

    aws ec2 run-instances --image-id <ECS_Optimized_AMI> \
        --count 1 \
        --instance-type <cpu_instance_type> \
        --key-name <name_of_key_pair_on_ec2_console> \
        --security-group-ids <sg_created_with_vpc> \
        --iam-instance-profile Name="ecsInstanceRole" \
        --user-data file://mx_script.txt \
        --block-device-mapping file://my_mapping.txt \
        --region <region> \
        --subnet-id <subnet_with_ei_endpoint>
  5. Create an MXNet inference task definition named mx_task_def.json. Set "image" to any MXNet image name. To select an image, see Prebuilt Amazon SageMaker Docker Images. For "deviceType" options, see Launching an Instance with Elastic Inference.

    {
      "requiresCompatibilities": [ "EC2" ],
      "containerDefinitions": [
        {
          "entryPoint": [
            "/bin/bash",
            "-c",
            "/usr/local/bin/mxnet-model-server --start --foreground --mms-config /home/model-server/config.properties --models resnet-152-eia=https://s3.amazonaws.com/model-server/model_archive_1.0/resnet-152-eia.mar"
          ],
          "name": "mxnet-inference-container",
          "image": "<mxnet-image-name>",
          "memory": 8111,
          "cpu": 256,
          "essential": true,
          "portMappings": [
            {
              "hostPort": 80,
              "protocol": "tcp",
              "containerPort": 8080
            },
            {
              "hostPort": 8081,
              "protocol": "tcp",
              "containerPort": 8081
            }
          ],
          "healthCheck": {
            "retries": 2,
            "command": [
              "CMD-SHELL",
              "LD_LIBRARY_PATH=/opt/ei_health_check/lib /opt/ei_health_check/bin/health_check"
            ],
            "timeout": 5,
            "interval": 30,
            "startPeriod": 60
          },
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": "/ecs/mxnet-inference-eia",
              "awslogs-region": "<region>",
              "awslogs-stream-prefix": "squeezenet",
              "awslogs-create-group": "true"
            }
          },
          "resourceRequirements": [
            {
              "type": "InferenceAccelerator",
              "value": "device_1"
            }
          ]
        }
      ],
      "inferenceAccelerators": [
        {
          "deviceName": "device_1",
          "deviceType": "<EIA_instance_type>"
        }
      ],
      "volumes": [],
      "networkMode": "bridge",
      "placementConstraints": [],
      "family": "mxnet-eia"
    }
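Two things in this task definition are easy to get wrong: forgetting to replace the angle-bracket placeholders, and mismatching the accelerator "value" in resourceRequirements against the "deviceName" declared in inferenceAccelerators. A small Python sanity check (not part of the official procedure; the checks are only the ones this walkthrough depends on) can catch both before you register the task definition:

```python
import json

def check_task_def(task_def):
    """Minimal local sanity checks for the task definition above.

    Flags leftover <placeholder> values and accelerator names that are
    referenced by a container but never declared. Not exhaustive.
    """
    problems = []
    declared = {a["deviceName"] for a in task_def.get("inferenceAccelerators", [])}
    for c in task_def.get("containerDefinitions", []):
        if c.get("image", "").startswith("<"):
            problems.append(f"container {c.get('name')}: 'image' is still a placeholder")
        for req in c.get("resourceRequirements", []):
            if req["type"] == "InferenceAccelerator" and req["value"] not in declared:
                problems.append(f"container {c.get('name')}: accelerator "
                                f"{req['value']} is not declared")
    for a in task_def.get("inferenceAccelerators", []):
        if a.get("deviceType", "").startswith("<"):
            problems.append(f"accelerator {a['deviceName']}: 'deviceType' "
                            f"is still a placeholder")
    return problems

if __name__ == "__main__":
    try:
        with open("mx_task_def.json") as f:
            problems = check_task_def(json.load(f))
        print("\n".join(problems) or "mx_task_def.json looks OK")
    except FileNotFoundError:
        pass  # run this from the directory containing mx_task_def.json
```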
  6. Register the MXNet inference task definition. Note the task definition family and revision number in the output.

    aws ecs register-task-definition --cli-input-json file://mx_task_def.json --region <region>
  7. Create an MXNet inference service.

    aws ecs create-service --cluster mxnet-eia \
        --service-name mx-eia1 \
        --task-definition mxnet-eia:<revision_number> \
        --desired-count 1 \
        --scheduling-strategy="REPLICA" \
        --region <region>
  8. Download the input image for the test.

    curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
  9. Begin inference using a query with the REST API.

    curl -X POST http://<ec2_public_ip_address>:80/predictions/resnet-152-eia -T kitten.jpg
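The same request can be issued from Python with only the standard library, which is handy if you want to script inference calls. This is a sketch of an equivalent to the curl command above; `<ec2_public_ip_address>` is the same placeholder, and the host port 80 maps to the container's port 8080 per the task definition:

```python
import urllib.request

def prediction_url(host, model, port=80):
    """Build the model server's REST prediction endpoint, as used by
    the curl example: http://<host>:<port>/predictions/<model>."""
    return f"http://{host}:{port}/predictions/{model}"

def predict(host, model, image_path):
    # Equivalent of: curl -X POST http://<host>:80/predictions/<model> -T <image>
    with open(image_path, "rb") as f:
        req = urllib.request.Request(prediction_url(host, model),
                                     data=f.read(), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    # Replace the placeholder before calling predict() for real.
    print(prediction_url("<ec2_public_ip_address>", "resnet-152-eia"))
```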
  10. The results should look something like the following.

    [
      {
        "probability": 0.8582226634025574,
        "class": "n02124075 Egyptian cat"
      },
      {
        "probability": 0.09160050004720688,
        "class": "n02123045 tabby, tabby cat"
      },
      {
        "probability": 0.037487514317035675,
        "class": "n02123159 tiger cat"
      },
      {
        "probability": 0.0061649843119084835,
        "class": "n02128385 leopard, Panthera pardus"
      },
      {
        "probability": 0.003171598305925727,
        "class": "n02127052 lynx, catamount"
      }
    ]
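The response body is a JSON array of class/probability pairs, so extracting the top prediction in a client script is a one-liner. A small sketch, using a sample trimmed from the expected output above:

```python
import json

# Sample trimmed from the expected response above.
response = """
[ { "probability": 0.8582226634025574, "class": "n02124075 Egyptian cat" },
  { "probability": 0.09160050004720688, "class": "n02123045 tabby, tabby cat" } ]
"""

def top_prediction(body):
    """Return the highest-probability entry from a predictions response."""
    return max(json.loads(body), key=lambda p: p["probability"])

best = top_prediction(response)
print(best["class"])  # -> n02124075 Egyptian cat
```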