示例 AWS FIS实验模板 - AWS 故障注入服务

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

示例 AWS FIS实验模板

如果你使用的是 AWS FISAPI或者使用命令行工具来创建实验模板,则可以在 JavaScript 对象表示法 (JSON) 中构造模板。有关实验模板组件的更多信息,请参阅 AWS FIS实验模板组件

要使用其中一个示例模板创建实验,请将其保存到JSON文件中(例如,my-template.json),替换中的占位符值 italics 使用您自己的值,然后运行以下create-experiment-template命令。

aws fis create-experiment-template --cli-input-json file://my-template.json

根据过滤器停止EC2实例

以下示例使用指定标签停止指定区域中所有正在运行的 Amazon EC2 实例VPC。两分钟后重启实例。

{ "tags": { "Name": "StopEC2InstancesWithFilters" }, "description": "Stop and restart all instances in us-east-1b with the tag env=prod in the specified VPC", "targets": { "myInstances": { "resourceType": "aws:ec2:instance", "resourceTags": { "env": "prod" }, "filters": [ { "path": "Placement.AvailabilityZone", "values": ["us-east-1b"] }, { "path": "State.Name", "values": ["running"] }, { "path": "VpcId", "values": [ "vpc-aabbcc11223344556"] } ], "selectionMode": "ALL" } }, "actions": { "StopInstances": { "actionId": "aws:ec2:stop-instances", "description": "stop the instances", "parameters": { "startInstancesAfterDuration": "PT2M" }, "targets": { "Instances": "myInstances" } } }, "stopConditions": [ { "source": "aws:cloudwatch:alarm", "value": "arn:aws:cloudwatch:us-east-1:111122223333:alarm:alarm-name" } ], "roleArn": "arn:aws:iam::111122223333:role/role-name" }

停止指定数量的EC2实例

以下示例使用指定标签停止三个实例。 AWS FIS选择要随机停止的特定实例。两分钟后重启实例。

{ "tags": { "Name": "StopEC2InstancesByCount" }, "description": "Stop and restart three instances with the specified tag", "targets": { "myInstances": { "resourceType": "aws:ec2:instance", "resourceTags": { "env": "prod" }, "selectionMode": "COUNT(3)" } }, "actions": { "StopInstances": { "actionId": "aws:ec2:stop-instances", "description": "stop the instances", "parameters": { "startInstancesAfterDuration": "PT2M" }, "targets": { "Instances": "myInstances" } } }, "stopConditions": [ { "source": "aws:cloudwatch:alarm", "value": "arn:aws:cloudwatch:us-east-1:111122223333:alarm:alarm-name" } ], "roleArn": "arn:aws:iam::111122223333:role/role-name" }

运行预先配置的 AWS FISSSM文档

以下示例使用预先配置的在指定EC2实例上运行CPU故障注入 60 秒 AWS FISSSM文档,AWSFIS-Run-CPU-Stress。 AWS FIS监视实验两分钟。

{ "tags": { "Name": "CPUStress" }, "description": "Run a CPU fault injection on the specified instance", "targets": { "myInstance": { "resourceType": "aws:ec2:instance", "resourceArns": ["arn:aws:ec2:us-east-1:111122223333:instance/instance-id"], "selectionMode": "ALL" } }, "actions": { "CPUStress": { "actionId": "aws:ssm:send-command", "description": "run cpu stress using ssm", "parameters": { "duration": "PT2M", "documentArn": "arn:aws:ssm:us-east-1::document/AWSFIS-Run-CPU-Stress", "documentParameters": "{\"DurationSeconds\": \"60\", \"InstallDependencies\": \"True\", \"CPU\": \"0\"}" }, "targets": { "Instances": "myInstance" } } }, "stopConditions": [ { "source": "aws:cloudwatch:alarm", "value": "arn:aws:cloudwatch:us-east-1:111122223333:alarm:alarm-name" } ], "roleArn": "arn:aws:iam::111122223333:role/role-name" }

运行预定义的自动化运行手册

以下示例SNS使用 Systems Manager 提供的运行手册向亚马逊发布通知,AWS-ublishSNSNotification P。该角色必须具有向指定SNS主题发布通知的权限。

{ "description": "Publish event through SNS", "stopConditions": [ { "source": "none" } ], "targets": { }, "actions": { "sendToSns": { "actionId": "aws:ssm:start-automation-execution", "description": "Publish message to SNS", "parameters": { "documentArn": "arn:aws:ssm:us-east-1::document/AWS-PublishSNSNotification", "documentParameters": "{\"Message\": \"Hello, world\", \"TopicArn\": \"arn:aws:sns:us-east-1:111122223333:topic-name\"}", "maxDuration": "PT1M" }, "targets": { } } }, "roleArn": "arn:aws:iam::111122223333:role/role-name" }

限制对具有目标IAM角色的EC2实例的API操作

以下示例限制了操作定义中API为目标定义中指定的IAM角色发出的API呼叫的 100% 的呼叫。

注意

如果您想定位属于 Auto Scaling 组成员的EC2实例,请使用 aws: ec2: asg-insufficient-instance-capacity-error 操作,改用 Auto Scaling 组进行定位。有关更多信息,请参阅

{ "tags": { "Name": "ThrottleEC2APIActions" }, "description": "Throttle the specified EC2 API actions on the specified IAM role", "targets": { "myRole": { "resourceType": "aws:iam:role", "resourceArns": ["arn:aws:iam::111122223333:role/role-name"], "selectionMode": "ALL" } }, "actions": { "ThrottleAPI": { "actionId": "aws:fis:inject-api-throttle-error", "description": "Throttle APIs for 5 minutes", "parameters": { "service": "ec2", "operations": "DescribeInstances,DescribeVolumes", "percentage": "100", "duration": "PT2M" }, "targets": { "Roles": "myRole" } } }, "stopConditions": [ { "source": "aws:cloudwatch:alarm", "value": "arn:aws:cloudwatch:us-east-1:111122223333:alarm:alarm-name" } ], "roleArn": "arn:aws:iam::111122223333:role/role-name" }

对 Kubernetes 集群中的 Pod 进行压力测试 CPU

以下示例使用混沌网格对 Amazon EKS Kubernetes 集群中的 pod 进行一分钟的压力测试。CPU

{ "description": "ChaosMesh StressChaos example", "targets": { "Cluster-Target-1": { "resourceType": "aws:eks:cluster", "resourceArns": [ "arn:aws:eks:arn:aws::111122223333:cluster/cluster-id" ], "selectionMode": "ALL" } }, "actions": { "TestCPUStress": { "actionId": "aws:eks:inject-kubernetes-custom-resource", "parameters": { "maxDuration": "PT2M", "kubernetesApiVersion": "chaos-mesh.org/v1alpha1", "kubernetesKind": "StressChaos", "kubernetesNamespace": "default", "kubernetesSpec": "{\"selector\":{\"namespaces\":[\"default\"],\"labelSelectors\":{\"run\":\"nginx\"}},\"mode\":\"all\",\"stressors\": {\"cpu\":{\"workers\":1,\"load\":50}},\"duration\":\"1m\"}" }, "targets": { "Cluster": "Cluster-Target-1" } } }, "stopConditions": [{ "source": "none" }], "roleArn": "arn:aws:iam::111122223333:role/role-name", "tags": {} }

以下示例使用 Litmus 对 Amazon EKS Kubernetes 集群中的 pod 进行一分钟的压力测试。CPU

{ "description": "Litmus CPU Hog", "targets": { "MyCluster": { "resourceType": "aws:eks:cluster", "resourceArns": [ "arn:aws:eks:arn:aws::111122223333:cluster/cluster-id" ], "selectionMode": "ALL" } }, "actions": { "MyAction": { "actionId": "aws:eks:inject-kubernetes-custom-resource", "parameters": { "maxDuration": "PT2M", "kubernetesApiVersion": "litmuschaos.io/v1alpha1", "kubernetesKind": "ChaosEngine", "kubernetesNamespace": "litmus", "kubernetesSpec": "{\"engineState\":\"active\",\"appinfo\":{\"appns\":\"default\",\"applabel\":\"run=nginx\",\"appkind\":\"deployment\"},\"chaosServiceAccount\":\"litmus-admin\",\"experiments\":[{\"name\":\"pod-cpu-hog\",\"spec\":{\"components\":{\"env\":[{\"name\":\"TOTAL_CHAOS_DURATION\",\"value\":\"60\"},{\"name\":\"CPU_CORES\",\"value\":\"1\"},{\"name\":\"PODS_AFFECTED_PERC\",\"value\":\"100\"},{\"name\":\"CONTAINER_RUNTIME\",\"value\":\"docker\"},{\"name\":\"SOCKET_PATH\",\"value\":\"/var/run/docker.sock\"}]},\"probe\":[]}}],\"annotationCheck\":\"false\"}" }, "targets": { "Cluster": "MyCluster" } } }, "stopConditions": [{ "source": "none" }], "roleArn": "arn:aws:iam::111122223333:role/role-name", "tags": {} }