Using MXNet Elastic Inference accelerators on Amazon ECS

To use the Elastic Inference accelerator with MXNet
  1. Create an Amazon ECS cluster named mxnet-eia in an AWS Region that has access to Elastic Inference.

    aws ecs create-cluster --cluster-name mxnet-eia \
        --region <region>
  2. Create a text file called mx_script.txt and add the following text.

    #!/bin/bash
    echo ECS_CLUSTER=mxnet-eia >> /etc/ecs/ecs.config
  3. Create a text file called my_mapping.txt and add the following text.

    [
      {
        "DeviceName": "/dev/xvda",
        "Ebs": {
          "VolumeSize": 100
        }
      }
    ]
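If you prefer to generate the two helper files from a script instead of typing them by hand, a short Python sketch like the following would produce the same contents as Steps 2 and 3 (filenames and contents are taken from those steps; nothing else is assumed):

```python
import json

# Block device mapping from Step 3: a 100 GiB root volume on /dev/xvda.
mapping = [{"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": 100}}]

with open("my_mapping.txt", "w") as f:
    json.dump(mapping, f, indent=2)

# User-data script from Step 2: joins the instance to the mxnet-eia cluster.
with open("mx_script.txt", "w") as f:
    f.write("#!/bin/bash\necho ECS_CLUSTER=mxnet-eia >> /etc/ecs/ecs.config\n")
```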
  4. Launch an Amazon EC2 instance in the cluster that you created in Step 1 without attaching an Elastic Inference accelerator. To select an AMI, see Amazon ECS-optimized AMIs.

    aws ec2 run-instances --image-id <ECS_Optimized_AMI> \
        --count 1 \
        --instance-type <cpu_instance_type> \
        --key-name <name_of_key_pair_on_ec2_console> \
        --security-group-ids <sg_created_with_vpc> \
        --iam-instance-profile Name="ecsInstanceRole" \
        --user-data file://mx_script.txt \
        --block-device-mapping file://my_mapping.txt \
        --region <region> \
        --subnet-id <subnet_with_ei_endpoint>
  5. Create an MXNet inference task definition named mx_task_def.json. Set "image" to any MXNet image name. To select an image, see Prebuilt Amazon SageMaker Docker Images. For "deviceType" options, see Launching an Instance with Elastic Inference.

    {
      "requiresCompatibilities": [ "EC2" ],
      "containerDefinitions": [
        {
          "entryPoint": [
            "/bin/bash",
            "-c",
            "/usr/local/bin/mxnet-model-server --start --foreground --mms-config /home/model-server/config.properties --models resnet-152-eia=https://s3.amazonaws.com/model-server/model_archive_1.0/resnet-152-eia.mar"
          ],
          "name": "mxnet-inference-container",
          "image": "<mxnet-image-name>",
          "memory": 8111,
          "cpu": 256,
          "essential": true,
          "portMappings": [
            {
              "hostPort": 80,
              "protocol": "tcp",
              "containerPort": 8080
            },
            {
              "hostPort": 8081,
              "protocol": "tcp",
              "containerPort": 8081
            }
          ],
          "healthCheck": {
            "retries": 2,
            "command": [
              "CMD-SHELL",
              "LD_LIBRARY_PATH=/opt/ei_health_check/lib /opt/ei_health_check/bin/health_check"
            ],
            "timeout": 5,
            "interval": 30,
            "startPeriod": 60
          },
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": "/ecs/mxnet-inference-eia",
              "awslogs-region": "<region>",
              "awslogs-stream-prefix": "squeezenet",
              "awslogs-create-group": "true"
            }
          },
          "resourceRequirements": [
            {
              "type": "InferenceAccelerator",
              "value": "device_1"
            }
          ]
        }
      ],
      "inferenceAccelerators": [
        {
          "deviceName": "device_1",
          "deviceType": "<EIA_instance_type>"
        }
      ],
      "volumes": [],
      "networkMode": "bridge",
      "placementConstraints": [],
      "family": "mxnet-eia"
    }
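Two things in this task definition are easy to get wrong: forgetting to replace the angle-bracket placeholders, and mismatching the accelerator "value" in resourceRequirements against the "deviceName" declared in inferenceAccelerators. A small Python sanity check (not part of the official procedure; the checks are only the ones this walkthrough depends on) can catch both before you register the task definition:

```python
import json

def check_task_def(task_def):
    """Minimal local sanity checks for the task definition above.

    Flags leftover <placeholder> values and accelerator names that are
    referenced by a container but never declared. Not exhaustive.
    """
    problems = []
    declared = {a["deviceName"] for a in task_def.get("inferenceAccelerators", [])}
    for c in task_def.get("containerDefinitions", []):
        if c.get("image", "").startswith("<"):
            problems.append(f"container {c.get('name')}: 'image' is still a placeholder")
        for req in c.get("resourceRequirements", []):
            if req["type"] == "InferenceAccelerator" and req["value"] not in declared:
                problems.append(f"container {c.get('name')}: accelerator "
                                f"{req['value']} is not declared")
    for a in task_def.get("inferenceAccelerators", []):
        if a.get("deviceType", "").startswith("<"):
            problems.append(f"accelerator {a['deviceName']}: 'deviceType' "
                            f"is still a placeholder")
    return problems

if __name__ == "__main__":
    try:
        with open("mx_task_def.json") as f:
            problems = check_task_def(json.load(f))
        print("\n".join(problems) or "mx_task_def.json looks OK")
    except FileNotFoundError:
        pass  # run this from the directory containing mx_task_def.json
```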
  6. Register the MXNet inference task definition. Note the task definition family and revision number in the output.

    aws ecs register-task-definition --cli-input-json file://mx_task_def.json --region <region>
  7. Create an MXNet inference service.

    aws ecs create-service --cluster mxnet-eia \
        --service-name mx-eia1 \
        --task-definition mxnet-eia:<revision_number> \
        --desired-count 1 \
        --scheduling-strategy="REPLICA" \
        --region <region>
  8. Download the input image for the test.

    curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
  9. Begin inference using a query with the REST API.

    curl -X POST http://<ec2_public_ip_address>:80/predictions/resnet-152-eia -T kitten.jpg
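The same request can be issued from Python with only the standard library, which is handy if you want to script inference calls. This is a sketch of an equivalent to the curl command above; `<ec2_public_ip_address>` is the same placeholder, and the host port 80 maps to the container's port 8080 per the task definition:

```python
import urllib.request

def prediction_url(host, model, port=80):
    """Build the model server's REST prediction endpoint, as used by
    the curl example: http://<host>:<port>/predictions/<model>."""
    return f"http://{host}:{port}/predictions/{model}"

def predict(host, model, image_path):
    # Equivalent of: curl -X POST http://<host>:80/predictions/<model> -T <image>
    with open(image_path, "rb") as f:
        req = urllib.request.Request(prediction_url(host, model),
                                     data=f.read(), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    # Replace the placeholder before calling predict() for real.
    print(prediction_url("<ec2_public_ip_address>", "resnet-152-eia"))
```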
  10. The results should look something like the following.

    [
      {
        "probability": 0.8582226634025574,
        "class": "n02124075 Egyptian cat"
      },
      {
        "probability": 0.09160050004720688,
        "class": "n02123045 tabby, tabby cat"
      },
      {
        "probability": 0.037487514317035675,
        "class": "n02123159 tiger cat"
      },
      {
        "probability": 0.0061649843119084835,
        "class": "n02128385 leopard, Panthera pardus"
      },
      {
        "probability": 0.003171598305925727,
        "class": "n02127052 lynx, catamount"
      }
    ]
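The response body is a JSON array of class/probability pairs, so extracting the top prediction in a client script is a one-liner. A small sketch, using a sample trimmed from the expected output above:

```python
import json

# Sample trimmed from the expected response above.
response = """
[ { "probability": 0.8582226634025574, "class": "n02124075 Egyptian cat" },
  { "probability": 0.09160050004720688, "class": "n02123045 tabby, tabby cat" } ]
"""

def top_prediction(body):
    """Return the highest-probability entry from a predictions response."""
    return max(json.loads(body), key=lambda p: p["probability"])

best = top_prediction(response)
print(best["class"])  # -> n02124075 Egyptian cat
```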