Perform gradual deployment of state machine versions in Step Functions
A rolling deployment is a deployment strategy that slowly replaces the previous version of an application with a new version. To perform a rolling deployment of a state machine version, gradually send an increasing amount of execution traffic to the new version. The amount of traffic and the rate of increase are parameters that you configure.
You can perform a rolling deployment of a version using one of the following options:
Step Functions console: Create an alias that points to two versions of the same state machine. For this alias, you configure the routing configuration to shift traffic between the two versions. For more information about using the console to roll out versions, see Versions and Aliases.
Scripts for AWS CLI and SDK: Create a shell script using the AWS CLI or an AWS SDK. For more information, see the following sections on using the AWS CLI and AWS SDK.
AWS CloudFormation template: Use the AWS::StepFunctions::StateMachineVersion and AWS::StepFunctions::StateMachineAlias resources to publish multiple state machine versions and create an alias that points to one or two of these versions.
The example script in this section shows how you can use the AWS CLI to gradually shift traffic from a previous state machine version to a new state machine version. You can use this example script as is or update it to suit your needs.
This script demonstrates a canary deployment that deploys a new state machine version using an alias. The following steps outline the tasks that the script performs:
1. If the publish_revision parameter is set to true, publish the most recent revision as the next version of the state machine. This version becomes the new live version if the deployment succeeds. If you set the publish_revision parameter to false, the script deploys the last published version of the state machine.
2. Create the alias if it doesn't exist yet. If the alias didn't exist, point 100 percent of traffic for this alias to the new version, and then exit the script.
3. Update the routing configuration of the alias to shift a small percentage of traffic from the previous version to the new version. You set this canary percentage with the canary_percentage parameter.
4. By default, monitor the configurable CloudWatch alarms every 60 seconds. If any of these alarms goes into the ALARM state, roll back the deployment immediately by pointing 100 percent of traffic to the previous version. After every time interval, in seconds, defined in alarm_polling_interval, check the alarms again. Continue monitoring until the time interval defined in canary_interval_seconds has passed.
5. If no alarm went into the ALARM state during canary_interval_seconds, shift 100 percent of traffic to the new version.
6. If the new version deploys successfully, delete any versions older than the number specified in the history_max parameter.
#!/bin/bash
#
# AWS StepFunctions example showing how to create a canary deployment with a
# State Machine Alias and versions.
#
# Requirements: AWS CLI installed and credentials configured.
#
# A canary deployment deploys the new version alongside the old version, while
# routing only a small fraction of the overall traffic to the new version to
# see if there are any errors. Only once the new version has cleared a testing
# period will it start receiving 100% of traffic.
#
# For a Blue/Green or All at Once style deployment, you can set the
# canary_percentage to 100. The script will immediately shift 100% of traffic
# to the new version, but keep on monitoring the alarms (if any) during the
# canary_interval_seconds time interval. If any alarms raise during this period,
# the script will automatically rollback to the previous version.
#
# Step Functions allows you to keep a maximum of 1000 versions in version history
# for a state machine. This script has a version history deletion mechanism at
# the end, where it will delete any versions older than the limit specified.
#
# For an example that also demonstrates linear (or rolling) deployments, see the following:
# https://github.com/aws-samples/aws-stepfunctions-examples/blob/main/gradual-deploy/sfndeploy.py

set -euo pipefail

# ******************************************************************************
# you can safely change the variables in this block to your values
state_machine_name="my-state-machine"
alias_name="alias-1"
region="us-east-1"

# array of cloudwatch alarms to poll during the test period.
# to disable alarm checking, set alarm_names=()
alarm_names=("alarm1" "alarm name with a space")

# true to publish the current revision as the next version before deploy.
# false to deploy the latest version from the state machine's version history.
publish_revision=true

# true to force routing configuration update even if the current routing
# for the alias does not have a 100% routing config.
# false will abandon deploy attempt if current routing config not 100% to a
# single version.
# Be careful when you combine this flag with publish_revision - if you just
# rerun the script you might deploy the newly published revision from the
# previous run.
force=false

# percentage of traffic to route to the new version during the test period
canary_percentage=10

# how many seconds the canary deployment lasts before full deploy to 100%
canary_interval_seconds=300

# how often to poll the alarms
alarm_polling_interval=60

# how many versions to keep in history. delete versions prior to this.
# set to 0 to disable old version history deletion.
history_max=0
# ******************************************************************************

#######################################
# Update alias routing configuration.
#
# If you don't specify version 2 details, will only create 1 routing entry. In
# this case the routing entry weight must be 100.
#
# Globals:
#   alias_arn
# Arguments:
#   1. version 1 arn
#   2. version 1 weight
#   3. version 2 arn (optional)
#   4. version 2 weight (optional)
#######################################
function update_routing() {
  if [[ $# -eq 2 ]]; then
    local routing_config="[{\"stateMachineVersionArn\": \"$1\", \"weight\":$2}]"
  elif [[ $# -eq 4 ]]; then
    local routing_config="[{\"stateMachineVersionArn\": \"$1\", \"weight\":$2}, {\"stateMachineVersionArn\": \"$3\", \"weight\":$4}]"
  else
    echo "You have to call update_routing with either 2 or 4 input arguments." >&2
    exit 1
  fi

  ${aws} update-state-machine-alias --state-machine-alias-arn ${alias_arn} --routing-configuration "${routing_config}"
}

# ******************************************************************************
# pre-run validation
if [[ (("${#alarm_names[@]}" -gt 0)) ]]; then
  alarm_exists_count=$(aws cloudwatch describe-alarms --alarm-names "${alarm_names[@]}" --alarm-types "CompositeAlarm" "MetricAlarm" --query "length([MetricAlarms, CompositeAlarms][])" --output text)

  if [[ (("${#alarm_names[@]}" -ne "${alarm_exists_count}")) ]]; then
    echo All of the alarms to monitor do not exist in CloudWatch: $(IFS=,; echo "${alarm_names[*]}") >&2
    echo Only the following alarm names exist in CloudWatch:
    aws cloudwatch describe-alarms --alarm-names "${alarm_names[@]}" --alarm-types "CompositeAlarm" "MetricAlarm" --query "join(', ', [MetricAlarms, CompositeAlarms][].AlarmName)" --output text
    exit 1
  fi
fi

if [[ (("${history_max}" -gt 0)) && (("${history_max}" -lt 2)) ]]; then
  echo The minimum value for history_max is 2. This is the minimum number of older state machine versions to be able to rollback in the future. >&2
  exit 1
fi

# ******************************************************************************
# main block follows
account_id=$(aws sts get-caller-identity --query Account --output text)

sm_arn="arn:aws:states:${region}:${account_id}:stateMachine:${state_machine_name}"

# the aws command we'll be invoking a lot throughout.
aws="aws stepfunctions"

# promote the latest revision to the next version
if [[ "${publish_revision}" = true ]]; then
  new_version=$(${aws} publish-state-machine-version --state-machine-arn=$sm_arn --query stateMachineVersionArn --output text)
  echo Published the current revision of state machine as the next version with arn: ${new_version}
else
  new_version=$(${aws} list-state-machine-versions --state-machine-arn ${sm_arn} --max-results 1 --query "stateMachineVersions[0].stateMachineVersionArn" --output text)
  echo "Since publish_revision is false, using the latest version from the state machine's version history: ${new_version}"
fi

# find the alias if it exists
alias_arn_expected="${sm_arn}:${alias_name}"
alias_arn=$(${aws} list-state-machine-aliases --state-machine-arn ${sm_arn} --query "stateMachineAliases[?stateMachineAliasArn==\`${alias_arn_expected}\`].stateMachineAliasArn" --output text)

if [[ "${alias_arn_expected}" == "${alias_arn}" ]]; then
  echo Found alias ${alias_arn}

  echo Current routing configuration is:
  ${aws} describe-state-machine-alias --state-machine-alias-arn "${alias_arn}" --query routingConfiguration
else
  echo Alias does not exist. Creating alias ${alias_arn_expected} and routing 100% traffic to new version ${new_version}

  ${aws} create-state-machine-alias --name "${alias_name}" --routing-configuration "[{\"stateMachineVersionArn\": \"${new_version}\", \"weight\":100}]"

  echo Done!
  exit 0
fi

# find the version to which the alias currently points (the current live version)
old_version=$(${aws} describe-state-machine-alias --state-machine-alias-arn $alias_arn --query "routingConfiguration[?weight==\`100\`].stateMachineVersionArn" --output text)

if [[ -z "${old_version}" ]]; then
  if [[ "${force}" = true ]]; then
    echo Force setting is true. Will force update to routing config for alias to point 100% to new version.
    update_routing "${new_version}" 100

    echo Alias ${alias_arn} now pointing 100% to ${new_version}.
    echo Done!
    exit 0
  else
    echo Alias ${alias_arn} does not have a routing config entry with 100% of the traffic. This means there might be a deploy in progress, so not starting another deploy at this time. >&2
    exit 1
  fi
fi

if [[ "${old_version}" == "${new_version}" ]]; then
  echo The alias already points to this version. No update necessary.
  exit 0
fi

echo Switching ${canary_percentage}% to new version ${new_version}
# plain assignment instead of (( ... )): an arithmetic command that evaluates
# to 0 (canary_percentage=100) returns a non-zero status and would abort the
# script under set -e
old_weight=$(( 100 - canary_percentage ))
update_routing "${new_version}" ${canary_percentage} "${old_version}" ${old_weight}

echo New version receiving ${canary_percentage}% of traffic.
echo Old version ${old_version} is still receiving ${old_weight}%.

if [[ ${#alarm_names[@]} -eq 0 ]]; then
  echo No alarm_names set. Skipping cloudwatch monitoring.
  echo Will sleep for ${canary_interval_seconds} seconds before routing 100% to new version.
  sleep ${canary_interval_seconds}
  echo Canary period complete. Switching 100% of traffic to new version...
else
  echo Checking if alarms fire for the next ${canary_interval_seconds} seconds.

  (( total_wait = canary_interval_seconds + $(date +%s) ))

  now=$(date +%s)
  while [[ ((${now} -lt ${total_wait})) ]]; do
    alarm_result=$(aws cloudwatch describe-alarms --alarm-names "${alarm_names[@]}" --state-value ALARM --alarm-types "CompositeAlarm" "MetricAlarm" --query "join(', ', [MetricAlarms, CompositeAlarms][].AlarmName)" --output text)

    if [[ ! -z "${alarm_result}" ]]; then
      echo The following alarms are in ALARM state: ${alarm_result}. Rolling back deploy. >&2
      update_routing "${old_version}" 100

      echo Rolled back to ${old_version}
      exit 1
    fi

    echo Monitoring alarms...no alarms have triggered.
    sleep ${alarm_polling_interval}
    now=$(date +%s)
  done

  echo No alarms detected during canary period. Switching 100% of traffic to new version...
fi

update_routing "${new_version}" 100

echo Version ${new_version} is now receiving 100% of traffic.

if [[ (("${history_max}" -eq 0 ))]]; then
  echo Version History deletion is disabled. Remember to prune your history, the default limit is 1000 versions.
  echo Done!
  exit 0
fi

echo Keep the last ${history_max} versions. Deleting any versions older than that...

# the results are sorted in descending order of the version creation time
version_history=$(${aws} list-state-machine-versions --state-machine-arn ${sm_arn} --max-results 1000 --query "join(\`\"\\n\"\`, stateMachineVersions[].stateMachineVersionArn)" --output text)

counter=0

while read -r line; do
  ((counter=${counter} + 1))

  if [[ (( ${counter} -gt ${history_max})) ]]; then
    echo Deleting old version ${line}
    ${aws} delete-state-machine-version --state-machine-version-arn ${line}
  fi
done <<< "${version_history}"

echo Done!
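The script derives the old version's weight as 100 minus canary_percentage and does not check that the value is a whole number between 0 and 100. The following is a minimal sketch of a check you could add before the routing update. The function name canary_routing_config is hypothetical and not part of the original script; it only builds the JSON routing configuration and does not call the AWS CLI.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): validate the canary
# percentage and print the two-entry routing configuration as JSON.
# Arguments: 1. new version arn  2. old version arn  3. canary percentage
function canary_routing_config() {
  local new_version="$1" old_version="$2" percentage="$3"

  # accept digits only, so values such as "abc" or "-5" are rejected
  if ! [[ "${percentage}" =~ ^[0-9]+$ ]]; then
    echo "canary percentage must be a whole number, got: ${percentage}" >&2
    return 1
  fi

  # force base 10 so a value such as "08" is not parsed as octal
  percentage=$(( 10#${percentage} ))

  if (( percentage > 100 )); then
    echo "canary percentage must be between 0 and 100, got: ${percentage}" >&2
    return 1
  fi

  # the two weights always add up to 100
  local old_weight=$(( 100 - percentage ))

  printf '[{"stateMachineVersionArn": "%s", "weight":%d}, {"stateMachineVersionArn": "%s", "weight":%d}]\n' \
    "${new_version}" "${percentage}" "${old_version}" "${old_weight}"
}
```

For example, canary_routing_config "${new_version}" "${old_version}" "${canary_percentage}" prints the same JSON shape that update_routing passes to --routing-configuration, and returns a non-zero status for a value such as 150.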
Example script in aws-stepfunctions-examples
The script demonstrates the following deployment strategies:
Canary: Shifts traffic in two increments.
In the first increment, a small percentage of traffic, for example 10 percent, is shifted to the new version. In the second increment, once the specified time interval (in seconds) has elapsed, the remaining traffic is shifted to the new version. The switch to the new version for the remaining traffic takes place only if no CloudWatch alarm goes into the ALARM state during the specified time interval.
Linear or Rolling: Shifts traffic to the new version in equal increments with an equal number of seconds between each increment.
For example, if you specify an increment percentage of 20 with an --interval of 600 seconds, this deployment increases traffic by 20 percent every 600 seconds until the new version receives 100 percent of the traffic.
This deployment immediately rolls back the new version if any CloudWatch alarm goes into the ALARM state.
All at Once or Blue/Green: Shifts 100 percent of traffic to the new version immediately. This deployment monitors the new version and rolls it back automatically to the previous version if any CloudWatch alarm goes into the ALARM state.
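To make the linear strategy concrete, the following sketch prints the traffic weight of the new version after each increment. It is pure arithmetic and makes no AWS calls. The function name linear_schedule is hypothetical, and the sketch assumes that the first increment is applied immediately (at 0 seconds) and that the final step is capped at 100 percent.

```shell
#!/bin/bash
# Hypothetical helper: print "elapsed_seconds new_version_weight" for each
# step of a linear deployment.
# Arguments: 1. increment percentage  2. seconds between increments
function linear_schedule() {
  local increment="$1" interval="$2"

  # reject non-numeric or zero increments, which would never reach 100
  if ! [[ "${increment}" =~ ^[1-9][0-9]*$ && "${interval}" =~ ^[0-9]+$ ]]; then
    echo "increment must be a positive integer and interval a whole number" >&2
    return 1
  fi

  local weight=0 elapsed=0
  while (( weight < 100 )); do
    weight=$(( weight + increment ))
    # cap the last step so the weight never exceeds 100
    if (( weight > 100 )); then
      weight=100
    fi
    echo "${elapsed} ${weight}"
    elapsed=$(( elapsed + 10#${interval} ))
  done
}
```

With the values from the example above, linear_schedule 20 600 prints five steps: 20 percent at 0 seconds, then 40, 60, 80, and 100 percent at 600, 1200, 1800, and 2400 seconds.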
The following CloudFormation template example publishes two versions of a state machine named MyStateMachine. It creates an alias named PROD, which points to both of these versions, and then deploys version 2.
In this example, 10 percent of traffic is shifted to version 2 every five minutes until this version receives 100 percent of the traffic. This example also shows how you can set CloudWatch alarms. If any of the alarms you set goes into the ALARM state, the deployment fails and rolls back immediately.
MyStateMachine:
  Type: AWS::StepFunctions::StateMachine
  Properties:
    StateMachineType: STANDARD
    StateMachineName: MyStateMachine
    RoleArn: arn:aws:iam::123456789012:role/myIamRole
    Definition:
      StartAt: PassState
      States:
        PassState:
          Type: Pass
          Result: Result
          End: true

MyStateMachineVersionA:
  Type: AWS::StepFunctions::StateMachineVersion
  Properties:
    Description: Version 1
    StateMachineArn: !Ref MyStateMachine

MyStateMachineVersionB:
  Type: AWS::StepFunctions::StateMachineVersion
  Properties:
    Description: Version 2
    StateMachineArn: !Ref MyStateMachine

PROD:
  Type: AWS::StepFunctions::StateMachineAlias
  Properties:
    Name: PROD
    Description: The PROD state machine alias taking production traffic.
    DeploymentPreference:
      StateMachineVersionArn: !Ref MyStateMachineVersionB
      Type: LINEAR
      Percentage: 10
      Interval: 5
      Alarms:
        # A list of alarms that you want to monitor. If any of these alarms
        # trigger, rollback the deployment immediately by pointing 100 percent
        # of traffic to the previous version.
        - !Ref CloudWatchAlarm1
        - !Ref CloudWatchAlarm2