Best Practice 11.3 – Define an approach to restore service availability - SAP Lens

Restoring availability assumes that for a particular failure scenario, some loss of service will occur. The restore approach should examine the amount of time needed to restore service, and the actions required to meet the availability goal.

Suggestion 11.3.1 – Enable instance recovery for EC2 instances

AWS provides two modes of instance recovery: simplified (on by default) and Amazon CloudWatch action-based (configurable). Both modes monitor an Amazon EC2 instance and automatically recover the instance if it becomes impaired due to an underlying hardware failure. This feature can remove the need for manual intervention, but startup, application restart, and load times should be factored into the recovery time objective (RTO).

CloudWatch action-based recovery alarms are customizable, which can help you control the recovery time of standalone instances.
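As a minimal sketch of how the alarm settings influence recovery time, the following Python builds the parameters for a CloudWatch `PutMetricAlarm` request that recovers an instance when the `StatusCheckFailed_System` metric reports a failure. The alarm name, instance ID, Region, period, and evaluation count are illustrative assumptions, not recommended values. The detection window it prints covers only the time until the alarm fires; instance recovery, startup, and application restart times come on top of it.

```python
# Sketch only: builds the request parameters without calling AWS.
RECOVER_ACTION = "arn:aws:automate:{region}:ec2:recover"


def recovery_alarm_params(instance_id, region, period_s=60, evaluation_periods=2):
    """Return keyword arguments for a CloudWatch PutMetricAlarm request."""
    return {
        "AlarmName": f"sap-auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": period_s,                        # seconds per evaluation period
        "EvaluationPeriods": evaluation_periods,   # consecutive failing periods
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [RECOVER_ACTION.format(region=region)],
    }


def detection_window_s(params):
    """Approximate time before the alarm fires, in seconds."""
    return params["Period"] * params["EvaluationPeriods"]


# Example values are assumptions for illustration.
params = recovery_alarm_params("i-0123456789abcdef0", "eu-central-1")
print(detection_window_s(params))

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("cloudwatch", region_name="eu-central-1").put_metric_alarm(**params)
```

Shortening the period or the number of evaluation periods reduces the detection window, at the cost of reacting to shorter transient failures.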

If you intend to use a clustering solution to protect against hardware failure, you should evaluate if instance recovery is compatible with the cluster solution.

Suggestion 11.3.2 – Have a strategy to rebuild EC2 instances using AMIs and infrastructure as code

The benefit of infrastructure as code (IaC) is the ability to build and tear down entire environments programmatically. If architected for resiliency, an environment can be implemented in minutes using AWS CloudFormation templates or AWS Systems Manager automation. Automation is critical for maintaining high availability and fast recovery.
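To illustrate the rebuild approach, the following sketch generates a small AWS CloudFormation template that launches an EC2 instance from an AMI passed in as a parameter. The resource name `SapAppServer`, the parameter names, and the default instance type are hypothetical; a real SAP template would also define storage, security groups, IAM roles, and post-launch configuration.

```python
import json


def build_template(default_instance_type="r5.2xlarge"):
    """Return a minimal CloudFormation template as a Python dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: rebuild an SAP application server from an AMI",
        "Parameters": {
            # AWS-specific parameter types let CloudFormation validate the IDs.
            "AmiId": {"Type": "AWS::EC2::Image::Id"},
            "SubnetId": {"Type": "AWS::EC2::Subnet::Id"},
            "InstanceType": {"Type": "String", "Default": default_instance_type},
        },
        "Resources": {
            "SapAppServer": {  # hypothetical logical resource name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": {"Ref": "AmiId"},
                    "InstanceType": {"Ref": "InstanceType"},
                    "SubnetId": {"Ref": "SubnetId"},
                },
            }
        },
        "Outputs": {"InstanceId": {"Value": {"Ref": "SapAppServer"}}},
    }


# Serialize to JSON so the template can be stored in version control
# and passed to CloudFormation when a stack is created.
template_body = json.dumps(build_template(), indent=2)
print(template_body)
```

Keeping the AMI ID as a parameter means the same template can rebuild the instance from the most recent validated image without editing the template itself.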

You should evaluate the following AWS services as part of your strategy:

Suggestion 11.3.3 – Understand Amazon EBS failures

Failure of one or more EBS volumes could impact the availability and durability of your SAP workload. Therefore, you should understand the Amazon EBS failure rates, notification mechanisms, and recovery options.
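One way to reason about failure rates is to estimate the chance that at least one volume in a system fails within a year. The sketch below assumes that volume failures are independent and that every volume has the same annual failure rate (AFR); both are simplifications. The AFR and volume count are placeholder inputs, so take the actual figure for your volume type from the current Amazon EBS documentation.

```python
def p_any_volume_fails(volume_count, annual_failure_rate):
    """Probability that at least one of N volumes fails within a year.

    Assumes independent failures with an identical rate per volume.
    """
    if volume_count < 0:
        raise ValueError("volume_count must not be negative")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    # P(at least one failure) = 1 - P(no volume fails)
    return 1.0 - (1.0 - annual_failure_rate) ** volume_count


# Placeholder inputs for illustration: 10 volumes, AFR of 0.2%.
probability = p_any_volume_fails(10, 0.002)
print(f"{probability:.4%}")
```

The result grows with the number of volumes attached to the workload, which is one reason to pair the failure rate with tested notification and recovery procedures.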

Suggestion 11.3.4 – Have a strategy for reacting to AWS Personal Health Dashboard notifications

You should have a strategy for receiving and acting on notifications from your AWS Personal Health Dashboard. This could include using Amazon CloudWatch Events (Amazon EventBridge) rules to send Amazon SNS notifications, or integrating with your ITSM tools through the AWS Health API.
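As an illustration of rule-based routing, the sketch below defines an event pattern that selects AWS Health events for EC2 and EBS, and checks a sample event against it. The `matches` function is a simplified stand-in that handles only exact-value lists and nested objects, not the full event pattern syntax. The sample event is trimmed and hypothetical, and the choice of services and categories is an assumption.

```python
import json

# Event pattern for a rule that forwards selected AWS Health events,
# for example to an Amazon SNS topic.
HEALTH_PATTERN = {
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"],
    "detail": {
        "service": ["EC2", "EBS"],
        "eventTypeCategory": ["issue", "scheduledChange"],
    },
}


def matches(pattern, event):
    """Simplified matcher: lists mean 'any of', dicts are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Trimmed, hypothetical sample event for a local check.
sample_event = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {
        "service": "EC2",
        "eventTypeCategory": "scheduledChange",
        "eventTypeCode": "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED",
    },
}

print(json.dumps(HEALTH_PATTERN))
print(matches(HEALTH_PATTERN, sample_event))
```

Checking sample events against the pattern before you deploy the rule helps confirm that the notifications you care about would reach your operations team.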

Suggestion 11.3.5 – Ensure that you are protected against accidental or malicious events impacting availability

You should consider the following approaches for ensuring that you are protected against accidental or malicious events that could impact the availability of your SAP workload.

Suggestion 11.3.6 – Identify dependencies beyond the SAP workload in AWS

Understand the underlying dependencies for your SAP business processes, including shared services and supporting components or systems. Some examples include Active Directory, DNS, identity providers, SaaS services, and on-premises systems. Assess the impact of failure and the required mitigations.
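To assess the impact of a failure, it can help to record dependencies as a graph and ask which components and business processes are affected when one element fails. The following is a minimal sketch; the component names and their relationships are hypothetical and would be replaced by your own inventory.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the components listed.
DEPENDS_ON = {
    "order-to-cash": ["sap-erp"],
    "sap-fiori": ["sap-erp", "identity-provider"],
    "sap-erp": ["hana-db", "dns", "active-directory"],
    "identity-provider": ["active-directory"],
    "active-directory": ["dns"],
    "hana-db": [],
    "dns": [],
}


def impacted_by(failed, depends_on):
    """Return every component that depends, directly or indirectly, on `failed`."""
    # Invert the map so each component points to the components that rely on it.
    dependents = {}
    for component, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, set()).add(component)

    # Breadth-first walk over the inverted graph.
    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted


print(sorted(impacted_by("active-directory", DEPENDS_ON)))
```

Components that appear in many impact sets are candidates for additional mitigation, such as redundancy or a documented fallback procedure.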