AI/ML for security - AWS Prescriptive Guidance

Artificial intelligence and machine learning (AI/ML) are transforming businesses. AI/ML has been a focus for Amazon for over 20 years, and many of the capabilities customers use with AWS, including security services, are driven by AI/ML. This gives you built-in, differentiated value, because you can build securely on AWS without requiring your security or application development teams to have expertise in AI/ML.

AI is an advanced technology that allows machines and systems to gain intelligence and prediction capability. AI systems learn from past experience through the data that they consume or are trained on. ML is one of the most important aspects of AI. ML is the ability of computers to learn from data without being explicitly programmed. In traditional programming, the programmer writes rules that define how the program should work on a computer or machine. In ML, the model learns the rules from data. ML models can discover hidden patterns in the data or make accurate predictions on new data that weren’t used during training. Multiple AWS services use AI/ML to learn from huge datasets and make security inferences.

  • Amazon Macie is a data security service that uses ML and pattern matching to discover and help protect your sensitive data. Macie automatically detects a large and growing list of sensitive data types, including personally identifiable information (PII) such as names and addresses, and financial information such as credit card numbers. It also gives you continuous visibility into the data that you store in Amazon Simple Storage Service (Amazon S3). Macie uses natural language processing (NLP) and ML models that are trained on different types of datasets to understand your existing data and to assign business values to prioritize business-critical data. Macie then generates sensitive data findings.

  • Amazon GuardDuty is a threat detection service that uses ML, anomaly detection, and integrated threat intelligence to continuously monitor for malicious activity and unauthorized behavior to help protect your AWS accounts, instances, serverless and container workloads, users, databases, and storage. GuardDuty incorporates ML techniques that are highly effective at discerning potentially malicious user activity from anomalous but benign operational behavior within AWS accounts. This capability continuously models API invocations within an account and incorporates probabilistic predictions to more accurately isolate and alert on highly suspicious user behavior. This approach helps identify malicious activity associated with known threat tactics, including discovery, initial access, persistence, privilege escalation, defense evasion, credential access, impact, and data exfiltration. To learn more about how GuardDuty uses machine learning, see the AWS re:Inforce 2023 breakout session Developing new findings using machine learning in Amazon GuardDuty (TDR310).
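
As a rough illustration of how a model can learn what is normal from data instead of following hand-written rules, the following Python sketch builds a frequency baseline of API calls and scores new calls by how surprising they are. It is a toy stand-in, not the model that GuardDuty actually uses; the event data, the fit_baseline and surprise helpers, and the THRESHOLD value are all hypothetical.

```python
import math
from collections import Counter

def fit_baseline(history):
    """Learn per-API call probabilities from past (user, api_name) events."""
    counts = Counter(api for _user, api in history)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for APIs never seen before

    def prob(api):
        # Laplace smoothing gives unseen APIs a small, nonzero probability.
        return (counts.get(api, 0) + 1) / (total + vocab)

    return prob

def surprise(prob, api):
    """Negative log-probability: higher means rarer under the baseline."""
    return -math.log(prob(api))

# Hypothetical history: mostly S3 reads, a few EC2 describe calls.
history = [("alice", "s3:GetObject")] * 95 + [("alice", "ec2:DescribeInstances")] * 5
prob = fit_baseline(history)
THRESHOLD = 4.0  # hypothetical cutoff chosen for this toy data

for api in ["s3:GetObject", "ec2:DescribeInstances", "iam:CreateAccessKey"]:
    score = surprise(prob, api)
    flag = "ALERT" if score > THRESHOLD else "ok"
    print(f"{api}: surprise={score:.2f} {flag}")
```

On this toy data, only iam:CreateAccessKey, which never appears in the history, exceeds the threshold. A production detector would also need additional context to separate unusual but benign activity from malicious activity, which this sketch does not attempt.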

Provable security

AWS develops automated reasoning tools that use mathematical logic to answer critical questions about your infrastructure and to detect misconfigurations that could potentially expose your data. This capability is called provable security because it provides higher assurance of both security of the cloud and security in the cloud. Provable security uses automated reasoning, which is a specific discipline of AI that applies logical deduction to computer systems. For example, automated reasoning tools can analyze policies and network architecture configurations, and prove the absence of unintended configurations that could potentially expose vulnerable data. This approach provides the highest level of assurance possible for the critical security characteristics of the cloud. For more information, see Provable Security Resources on the AWS website. The following AWS services and features currently use automated reasoning to help you achieve provable security for your applications:

  • Amazon CodeGuru Security is a static application security testing (SAST) tool that combines ML and automated reasoning to identify vulnerabilities in your code and to provide recommendations on how to fix these vulnerabilities and track their status until closure. CodeGuru Security detects the top 10 issues identified by Open Worldwide Application Security Project (OWASP), the top 25 issues identified by Common Weakness Enumeration (CWE), log injection, secrets, and insecure use of AWS APIs and SDKs. CodeGuru Security also borrows from AWS security best practices and was trained on millions of lines of code at Amazon.

    CodeGuru Security can identify code vulnerabilities with a very high true-positive rate because of its deep semantic analysis. This helps developers and security teams have confidence in the guidance, which in turn improves code quality. This service is trained by using rule mining and supervised ML models that use a combination of logistic regression and neural networks. For example, during training for sensitive data leaks, CodeGuru Security performs a full code analysis for code paths that use the resource or access sensitive data, creates a feature set that represents those code paths, and then uses the code paths as inputs for logistic regression models and convolutional neural networks (CNNs). The CodeGuru Security bug-tracking feature automatically detects when a bug is closed. The bug-tracking algorithm makes sure that you have up-to-date information on your organization's security posture without additional effort. To begin reviewing code, you can associate your existing code repositories on GitHub, GitHub Enterprise, Bitbucket, or AWS CodeCommit on the CodeGuru console. The CodeGuru Security API-based design provides integration capabilities that you can use at any stage of the development workflow.

  • Amazon Verified Permissions is a scalable permissions management and fine-grained authorization service for the applications that you build. Verified Permissions uses Cedar, which is an open-source language for access control that was built by using automated reasoning and differential testing. Cedar is a language for defining permissions as policies that describe who should have access to which resources. It is also a specification for evaluating those policies. Use Cedar policies to control what each user of your application is permitted to do and which resources they may access. Cedar policies are permit or forbid statements that determine whether a user can act on a resource. Policies are associated with resources, and you can attach multiple policies to a resource. Forbid policies override permit policies. When a user of your application attempts to perform an action on a resource, your application makes an authorization request to the Cedar policy engine. Cedar evaluates the applicable policies and returns an ALLOW or DENY decision. Cedar supports authorization rules for any type of principal and resource, allows for role-based and attribute-based access control, and supports analysis through automated reasoning tools that can help optimize your policies and validate your security model.

  • AWS Identity and Access Management (IAM) Access Analyzer helps you streamline permissions management. You can use this feature to set fine-grained permissions, verify intended permissions, and refine permissions by removing unused access. IAM Access Analyzer generates a fine-grained policy based on the access activity captured in your logs. It also provides over 100 policy checks to help you author and validate your policies. IAM Access Analyzer uses provable security to analyze access paths and provide comprehensive findings for public and cross-account access to your resources. This tool is built on Zelkova, which translates IAM policies into equivalent logical statements and runs a suite of general-purpose and specialized logical solvers (satisfiability modulo theories) against the problem. IAM Access Analyzer applies Zelkova repeatedly to a policy with increasingly specific queries to characterize classes of behaviors the policy allows, based on the content of the policy. The analyzer doesn’t examine access logs to determine whether an external entity accessed a resource within your zone of trust. It generates a finding when a resource-based policy allows access to a resource, even if the resource wasn’t accessed by the external entity. To learn more about satisfiability modulo theories, see Satisfiability Modulo Theories in Handbook of Satisfiability.*

  • Amazon S3 Block Public Access is a feature of Amazon S3 that allows you to block possible misconfigurations that could lead to public access of your buckets and objects. You can enable Amazon S3 Block Public Access at the bucket level or at the account level (which affects both existing and new buckets in the account). Public access is granted to buckets and objects through access control lists (ACLs), bucket policies, or both. The determination of whether a given policy or ACL is considered public is made by using the Zelkova automated reasoning system. Amazon S3 uses Zelkova to check each bucket policy and warns you if an unauthorized user is able to read or write to your bucket. If a bucket is flagged as public, some public requests are allowed to access the bucket. If a bucket is flagged as not public, all public requests are denied. Zelkova is able to make such determinations because it has a precise mathematical representation of IAM policies. It creates a formula for each policy and proves a theorem about that formula.

  • Amazon VPC Network Access Analyzer is a feature of Amazon VPC that helps you understand potential network paths to your resources, and identifies potential unintended network access. Network Access Analyzer helps you verify network segmentation, identify internet accessibility, and verify trusted network paths and network access. This feature uses automated reasoning algorithms to analyze the network paths that a packet can take between resources in an AWS network. It then produces findings for paths that match your Network Access Scopes, which define outbound and inbound traffic patterns. Network Access Analyzer performs a static analysis of a network configuration, meaning that no packets are transmitted in the network as part of this analysis.

  • Amazon VPC Reachability Analyzer is a feature of Amazon VPC that lets you debug, understand, and visualize connectivity in your AWS network. Reachability Analyzer is a configuration analysis tool that enables you to perform connectivity testing between a source resource and a destination resource in your virtual private clouds (VPCs). When the destination is reachable, Reachability Analyzer produces hop-by-hop details of the virtual network path between the source and the destination. When the destination isn’t reachable, Reachability Analyzer identifies the blocking component. Reachability Analyzer uses automated reasoning to identify feasible paths by building a model of the network configuration between a source and destination. It then checks for reachability based on the configuration. It doesn’t send packets or analyze the data plane.
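
The decision rule described for Cedar in the list above (forbid policies override permit policies) can be sketched in a few lines of Python. This is a simplified model for illustration only: it does not use Cedar syntax or the Verified Permissions API, and the Policy class, the authorize function, and the sample principals and resources are hypothetical. It also assumes that a request that matches no permit policy is denied, which is how Cedar behaves by default.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, str]  # keys: "principal", "action", "resource"

@dataclass
class Policy:
    effect: str                         # "permit" or "forbid"
    applies: Callable[[Request], bool]  # does this policy match the request?

def authorize(policies: List[Policy], request: Request) -> str:
    """Return "ALLOW" or "DENY" for one authorization request."""
    effects = {p.effect for p in policies if p.applies(request)}
    if "forbid" in effects:
        return "DENY"   # any matching forbid policy wins
    if "permit" in effects:
        return "ALLOW"
    return "DENY"       # assumption: nothing matched, so deny by default

# Hypothetical policy set for a photo-sharing application.
policies = [
    Policy("permit", lambda r: r["principal"] == "alice" and r["action"] == "view"),
    Policy("forbid", lambda r: r["resource"] == "photo:private"),
]

print(authorize(policies, {"principal": "alice", "action": "view", "resource": "photo:beach"}))
print(authorize(policies, {"principal": "alice", "action": "view", "resource": "photo:private"}))
print(authorize(policies, {"principal": "bob", "action": "view", "resource": "photo:beach"}))
```

The three requests print ALLOW, DENY, and DENY: the second is denied because the forbid policy matches even though a permit policy also matches, and the third is denied because no permit policy matches.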
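
The kind of static analysis described for Reachability Analyzer can be pictured as a search over a model of the network configuration, with no packets sent. The sketch below is a minimal breadth-first search over a hypothetical graph of components. The component names, the links and denies structures, and the analyze function are assumptions made for illustration and do not reflect the service's internal algorithm or API.

```python
from collections import deque

def analyze(links, denies, source, destination):
    """Search a configuration model for a path from source to destination.

    links:  dict mapping each component to its possible next hops
    denies: set of components whose configuration drops this traffic
    """
    queue = deque([[source]])
    seen = {source}
    blockers = []
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return {"reachable": True, "path": path}  # hop-by-hop details
        for nxt in links.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt in denies:
                blockers.append(nxt)  # configuration drops traffic here
                continue
            queue.append(path + [nxt])
    return {"reachable": False, "blocked_by": blockers}

# Hypothetical model: two instances separated by two network ACLs and a router.
links = {
    "instance-a": ["nacl-a"],
    "nacl-a": ["router"],
    "router": ["nacl-b"],
    "nacl-b": ["instance-b"],
}

print(analyze(links, set(), "instance-a", "instance-b"))
print(analyze(links, {"nacl-b"}, "instance-a", "instance-b"))
```

The first call finds a path and returns every hop on it. The second call models a deny rule on nacl-b, so the search ends without reaching the destination and reports nacl-b as the blocking component.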

* Biere, A., M. Heule, H. van Maaren, and T. Walsh. 2009. Handbook of Satisfiability. IOS Press, NLD.