Defining Guard queries and filtering
This topic covers writing queries and using filtering when writing Guard rule clauses.
Prerequisites
Filtering is an advanced AWS CloudFormation Guard concept. We recommend that you review the following foundational topics before you learn about filtering:
Defining queries
Query expressions are simple dot (.) separated expressions written to
traverse hierarchical data. Query expressions can include filter expressions to target a
subset of values. When queries are evaluated, they result in a collection of values,
similar to a result set returned from an SQL query.
The following example query searches an AWS CloudFormation template for AWS::IAM::Role
resources.
Resources.*[ Type == 'AWS::IAM::Role' ]
Queries follow these basic principles:
-
Each dot (.) part of the query traverses down the hierarchy when an explicit key term is used, such as Resources or Properties.Encrypted. If any part of the query doesn't match the incoming datum, Guard throws a retrieval error.
-
A dot (.) part of the query that uses a wildcard * traverses all values for the structure at that level.
-
A dot (.) part of the query that uses an array wildcard [*] traverses all indices for that array.
-
All collections can be filtered by specifying filters inside square brackets []. Collections can be encountered in the following ways:
    -
    Naturally occurring arrays in datum are collections. Following are examples:
    Ports: [20, 21, 110, 190]
    Tags: [{"Key": "Stage", "Value": "PROD"}, {"Key": "App", "Value": "MyService"}]
    -
    When traversing all values for a structure like Resources.*
    -
    Any query result is itself a collection from which values can be further filtered. See the following example.
let all_resources = Resources.*                                    # query
let iam_resources = %all_resources[ Type == /IAM/ ]                # filter from query results
let managed_policies = %iam_resources[ Type == /ManagedPolicy/ ]   # further refinements

%managed_policies {                                                # traversing each value
    # do something with each
}
The following is an example CloudFormation template snippet.
Resources:
  SampleRole:
    Type: AWS::IAM::Role
    ...
  SampleInstance:
    Type: AWS::EC2::Instance
    ...
  SampleVPC:
    Type: AWS::EC2::VPC
    ...
  SampleSubnet1:
    Type: AWS::EC2::Subnet
    ...
  SampleSubnet2:
    Type: AWS::EC2::Subnet
    ...
Based on this template, the path traversed is SampleRole and the final
value selected is Type: AWS::IAM::Role.
Resources:
  SampleRole:
    Type: AWS::IAM::Role
    ...
The resulting value of the query Resources.*[ Type == 'AWS::IAM::Role' ]
in YAML format is shown in the following example.
- Type: AWS::IAM::Role
  ...
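The traversal principles described earlier in this section can be sketched as a few variable assignments. This is a minimal illustration and not part of the original example set: the variable names are hypothetical, and the last query assumes that some resources in the template define Properties.Tags.

```guard
# Explicit key terms: walk Resources, then SampleRole, then Type.
let sample_role_type = Resources.SampleRole.Type

# Wildcard (*): visit every value under Resources, then select the Type of each.
let all_types = Resources.*.Type

# Array wildcard ([*]): visit every element of each Tags array.
# The filter keeps only resources that define Properties.Tags, so resources
# without tags are not selected in the first place.
let all_tag_keys = Resources.*[ Properties.Tags exists ].Properties.Tags[*].Key
```

Each assignment yields a collection, so any of these variables can be filtered again or compared in a clause.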
Some of the ways that you can use queries are as follows:
-
Assign a query to variables so that query results can be accessed by referencing those variables.
-
Follow the query with a block that tests against each of the selected values.
-
Compare a query directly against a basic clause.
Assigning queries to variables
Guard supports one-shot variable assignments within a given scope. For more information about variables in Guard rules, see Assigning and referencing variables in Guard rules.
You can assign queries to variables so that you can write queries once and then reference them elsewhere in your Guard rules. See the following example variable assignments that demonstrate query principles discussed later in this section.
#
# Simple query assignment
#
let resources = Resources.* # All resources

#
# A more complex query here (this will be explained below)
#
let iam_policies_allowing_log_creates = Resources.*[
    Type in [/IAM::Policy/, /IAM::ManagedPolicy/]
    some Properties.PolicyDocument.Statement[*] {
        some Action[*] == 'cloudwatch:CreateLogGroup'
        Effect == 'Allow'
    }
]
Directly looping through values from a variable assigned to a query
Guard supports running a block directly against the results of a query. In the
following example, the when block tests the Encrypted, VolumeType, and
AvailabilityZone properties of each AWS::EC2::Volume resource found in a
CloudFormation template.
let ec2_volumes = Resources.*[ Type == 'AWS::EC2::Volume' ]

when %ec2_volumes !empty {
    %ec2_volumes {
        Properties {
            Encrypted == true
            VolumeType in ['gp2', 'gp3']
            AvailabilityZone in ['us-west-2b', 'us-west-2c']
        }
    }
}
Direct clause-level comparisons
Guard also supports queries as a part of direct comparisons. For example, see the following.
let resources = Resources.*

some %resources.Properties.Tags[*].Key == /PROD$/
some %resources.Properties.Tags[*].Value == /^App/
In the preceding example, the two clauses (starting with the some
keyword) expressed in the form shown are considered independent clauses and are
evaluated separately.
Single clause and block clause form
Taken together, the two example clauses shown in the preceding section aren't equivalent to the following block.
let resources = Resources.*

some %resources.Properties.Tags[*] {
    Key == /PROD$/
    Value == /^App/
}
This block queries for each Tag value in the collection and compares
its property values to the expected property values. The combined form of the
clauses in the preceding section evaluates the two clauses independently. Consider
the following input.
Resources:
  ...
  MyResource:
    ...
    Properties:
      Tags:
        - Key: EndPROD
          Value: NotAppStart
        - Key: NotPRODEnd
          Value: AppStart
Clauses in the first form evaluate to PASS. When validating the first
clause in the first form, the following path across Resources, Properties,
Tags, and Key matches the value EndPROD, which satisfies the expected pattern
/PROD$/. The value NotPRODEnd does not match, but the some keyword requires
only one matching value.
Resources:
  ...
  MyResource:
    ...
    Properties:
      Tags:
        - Key: EndPROD
          Value: NotAppStart
        - Key: NotPRODEnd
          Value: AppStart
The same happens with the second clause of the first form. The path across
Resources, Properties, Tags, and Value matches the value AppStart, which
satisfies the expected pattern /^App/. As a result, the second clause also
passes independently.
The overall result is a PASS.
However, the block form evaluates as follows. For each Tags value, it
checks whether both the Key and the Value match. In the following example,
the first Tags value fails on its Value (NotAppStart) and the second fails
on its Key (NotPRODEnd).
Resources:
  ...
  MyResource:
    ...
    Properties:
      Tags:
        - Key: EndPROD
          Value: NotAppStart
        - Key: NotPRODEnd
          Value: AppStart
Because the block requires both Key == /PROD$/ and Value == /^App/ to hold
for the same Tags value, no value matches completely. Therefore, the result is
FAIL.
Note
When working with collections, we recommend that you use the block clause form when you want to compare multiple values for each element in the collection. Use the single clause form when the collection is a set of scalar values, or when you only intend to compare a single attribute.
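As a rough illustration of this recommendation, the following sketch applies each form to the sample Ports and Tags collections shown in Defining queries. It assumes that Ports and Tags are reachable at the top level of the input, which is a simplification for this example only.

```guard
# Single clause form: Ports is a collection of scalar values, so one
# comparison per element is enough. Every port must fall in the range.
Ports[*] in r[20, 200]

# Block clause form: each Tags element has several attributes, and both
# comparisons must hold for the same element. The some keyword requires
# at least one element to satisfy the whole block.
some Tags[*] {
    Key == 'Stage'
    Value == 'PROD'
}
```

With the sample data, all four ports are within the range and the first Tags element has both the expected Key and Value, so both clauses would be expected to pass.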
Query outcomes and associated clauses
All queries return a list of values. Any part of a traversal, such as a missing key,
empty values for an array (Tags: []) when accessing all indices, or missing
values for a map when encountering an empty map (Resources: {}), can lead
to retrieval errors.
All retrieval errors are considered failures when evaluating clauses against such queries. The only exception is when explicit filters are used in the query. When filters are used, associated clauses are skipped.
The following block failures are associated with running queries.
-
If a template does not contain resources, then the query evaluates to FAIL, and the associated block level clauses also evaluate to FAIL.
-
When a template contains an empty resources block like { "Resources": {} }, the query evaluates to FAIL, and the associated block level clauses also evaluate to FAIL.
-
If a template contains resources but none match the query, then the query returns empty results, and the block level clauses are skipped.
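The third case is the one that filters are designed for. The following sketch shows a filtered query combined with a when condition, so that the rule body runs only when the filter selected something. The rule name is hypothetical and chosen for illustration.

```guard
# The filter selects resources by type. If the template has resources but
# none are volumes, the result is empty rather than a retrieval error.
let ec2_volumes = Resources.*[ Type == 'AWS::EC2::Volume' ]

# Hypothetical rule name. The when condition skips the rule body
# when the filtered collection is empty.
rule ec2_volumes_are_encrypted when %ec2_volumes !empty {
    %ec2_volumes.Properties.Encrypted == true
}
```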
Using filters in queries
Filters in queries are effectively Guard clauses that are used as selection criteria. Following is the structure of a clause.
<query> <operator> [query|value literal] [message] [or|OR]
Keep in mind the following key points from Writing AWS CloudFormation Guard rules when you work with filters:
-
Combine clauses by using Conjunctive Normal Form (CNF).
-
Specify each conjunction (and) clause on a new line.
-
Specify disjunctions (or) by using the or keyword between two clauses.
The following example demonstrates conjunctive and disjunctive clauses.
resourceType == 'AWS::EC2::SecurityGroup'
InputParameters.TcpBlockedPorts not empty

InputParameters.TcpBlockedPorts[*] {
    this in r(100, 400] or
    this in r(4000, 65535]
}
Using clauses for selection criteria
You can apply filtering to any collection. Filtering can be applied directly on
attributes in the input that are already a collection like securityGroups: [....].
You can also apply filtering against a query, which is always a
collection of values. You can use all features of clauses, including conjunctive
normal form, for filtering.
The following common query is often used when selecting resources by type from a CloudFormation template.
Resources.*[ Type == 'AWS::IAM::Role' ]
The query Resources.* returns all values present in the
Resources section of the input. For the example template input in
Defining queries, the query returns the following.
- Type: AWS::IAM::Role
  ...
- Type: AWS::EC2::Instance
  ...
- Type: AWS::EC2::VPC
  ...
- Type: AWS::EC2::Subnet
  ...
- Type: AWS::EC2::Subnet
  ...
Now, apply the filter against this collection. The criterion to match is
Type == AWS::IAM::Role. Following is the output of the query after
the filter is applied.
- Type: AWS::IAM::Role
  ...
Next, check various clauses for AWS::IAM::Role resources.
let all_resources = Resources.*
let all_iam_roles = %all_resources[ Type == 'AWS::IAM::Role' ]
The following is an example filtering query that selects all
AWS::IAM::Policy and AWS::IAM::ManagedPolicy resources.
Resources.*[
    Type in [ /IAM::Policy/,
              /IAM::ManagedPolicy/ ]
]
The following example checks if these policy resources have a
PolicyDocument specified.
Resources.*[
    Type in [ /IAM::Policy/,
              /IAM::ManagedPolicy/ ]
    Properties.PolicyDocument exists
]
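A filtered query like this is usually assigned to a variable and then checked in a rule. The following is a minimal sketch of that pattern. The variable and rule names are hypothetical, and the check on Statement is only an illustrative clause.

```guard
# Select policy resources that specify a PolicyDocument.
let policies_with_document = Resources.*[
    Type in [ /IAM::Policy/,
              /IAM::ManagedPolicy/ ]
    Properties.PolicyDocument exists
]

# Hypothetical rule: every selected policy document must contain
# at least one statement.
rule policy_documents_have_statements when %policies_with_document !empty {
    %policies_with_document.Properties.PolicyDocument.Statement !empty
}
```

Because both filter clauses appear on separate lines, they are combined as a conjunction: a resource is selected only when its type matches and it has a PolicyDocument.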
Building out more complex filtering needs
Consider the following example of an AWS Config configuration item that contains ingress and egress information for a security group.
---
resourceType: 'AWS::EC2::SecurityGroup'
configuration:
  ipPermissions:
    - fromPort: 172
      ipProtocol: tcp
      toPort: 172
      ipv4Ranges:
        - cidrIp: 10.0.0.0/24
        - cidrIp: 0.0.0.0/0
    - fromPort: 89
      ipProtocol: tcp
      ipv6Ranges:
        - cidrIpv6: '::/0'
      toPort: 189
      userIdGroupPairs: []
      ipv4Ranges:
        - cidrIp: 1.1.1.1/32
    - fromPort: 89
      ipProtocol: '-1'
      toPort: 189
      userIdGroupPairs: []
      ipv4Ranges:
        - cidrIp: 1.1.1.1/32
  ipPermissionsEgress:
    - ipProtocol: '-1'
      ipv6Ranges: []
      prefixListIds: []
      userIdGroupPairs: []
      ipv4Ranges:
        - cidrIp: 0.0.0.0/0
      ipRanges:
        - 0.0.0.0/0
  tags:
    - key: Name
      value: good-sg-delete-me
  vpcId: vpc-0123abcd
InputParameter:
  TcpBlockedPorts:
    - 3389
    - 20
    - 21
    - 110
    - 143
Note the following:
-
ipPermissions (ingress rules) is a collection of rules inside a configuration block.
-
Each rule structure contains attributes such as ipv4Ranges and ipv6Ranges to specify a collection of CIDR blocks.
Let’s write a rule that selects any ingress rules that allow connections from any IP address, and verifies that the rules do not allow TCP blocked ports to be exposed.
Start with the query portion that covers IPv4, as shown in the following example.
configuration.ipPermissions[
    #
    # at least one ipv4Ranges equals ANY IPv4
    #
    some ipv4Ranges[*].cidrIp == '0.0.0.0/0'
]
The some keyword is useful in this context. All queries return a
collection of values that match the query. By default, Guard requires every
value returned by the query to satisfy the check. However,
this behavior might not always be what you need for checks. Consider the following
part of the input from the configuration item.
ipv4Ranges:
  - cidrIp: 10.0.0.0/24
  - cidrIp: 0.0.0.0/0 # any IP allowed
There are two values present for ipv4Ranges. Not all
ipv4Ranges values equal an IP address denoted by
0.0.0.0/0. You want to see if at least one value matches
0.0.0.0/0. You tell Guard that not all results returned from
a query need to match, but at least one result must match. The some
keyword tells Guard to ensure that one or more values from the resultant query
match the check. If no query result values match, Guard throws an
error.
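To make the difference concrete, the following sketch runs the same comparison with and without some against the first ingress rule of the sample configuration item. Selecting that rule with the filter fromPort == 172 is an assumption made for this illustration, and the outcomes in the comments follow from the semantics described above rather than from a recorded run.

```guard
# Without some: every cidrIp in the selected rule must equal 0.0.0.0/0.
# The rule also contains 10.0.0.0/24, so this clause is expected to fail.
configuration.ipPermissions[ fromPort == 172 ] {
    ipv4Ranges[*].cidrIp == '0.0.0.0/0'
}

# With some: at least one cidrIp must equal 0.0.0.0/0.
# The second range matches, so this clause is expected to pass.
configuration.ipPermissions[ fromPort == 172 ] {
    some ipv4Ranges[*].cidrIp == '0.0.0.0/0'
}
```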
Next, add IPv6, as shown in the following example.
configuration.ipPermissions[
    #
    # at-least-one ipv4Ranges equals ANY IPv4
    #
    some ipv4Ranges[*].cidrIp == '0.0.0.0/0' or
    #
    # at-least-one ipv6Ranges contains ANY IPv6
    #
    some ipv6Ranges[*].cidrIpv6 == '::/0'
]
Finally, in the following example, validate that the protocol is not udp.
configuration.ipPermissions[
    #
    # at-least-one ipv4Ranges equals ANY IPv4
    #
    some ipv4Ranges[*].cidrIp == '0.0.0.0/0' or
    #
    # at-least-one ipv6Ranges contains ANY IPv6
    #
    some ipv6Ranges[*].cidrIpv6 == '::/0'

    #
    # and ipProtocol is not udp
    #
    ipProtocol != 'udp'
]
The following is the complete rule.
rule any_ip_ingress_checks
{
    let ports = InputParameter.TcpBlockedPorts[*]

    let targets = configuration.ipPermissions[
        #
        # if either ipv4 or ipv6 that allows access from any address
        #
        some ipv4Ranges[*].cidrIp == '0.0.0.0/0' or
        some ipv6Ranges[*].cidrIpv6 == '::/0'

        #
        # the ipProtocol is not UDP
        #
        ipProtocol != 'udp' ]

    when %targets !empty
    {
        %targets {
            ipProtocol != '-1'
            <<
              result: NON_COMPLIANT
              check_id: HUB_ID_2334
              message: Any IP Protocol is allowed
            >>

            when fromPort exists
                 toPort exists
            {
                let each_target = this
                %ports {
                    this < %each_target.fromPort or
                    this > %each_target.toPort
                    <<
                        result: NON_COMPLIANT
                        check_id: HUB_ID_2340
                        message: Blocked TCP port was allowed in range
                    >>
                }
            }
        }
    }
}
Separating collections based on their contained types
When using infrastructure as code (IaC) configuration templates, you might
encounter a collection that contains references to other entities within the
configuration template. The following is an example CloudFormation template that
describes Amazon Elastic Container Service (Amazon ECS) tasks with a local reference to TaskRoleArn, a
reference to TaskArn, and a direct string reference.
Parameters:
  TaskArn:
    Type: String
Resources:
  ecsTask:
    Type: 'AWS::ECS::TaskDefinition'
    Metadata:
      SharedExectionRole: allowed
    Properties:
      TaskRoleArn: 'arn:aws:....'
      ExecutionRoleArn: 'arn:aws:...'
  ecsTask2:
    Type: 'AWS::ECS::TaskDefinition'
    Metadata:
      SharedExectionRole: allowed
    Properties:
      TaskRoleArn:
        'Fn::GetAtt':
          - iamRole
          - Arn
      ExecutionRoleArn: 'arn:aws:...2'
  ecsTask3:
    Type: 'AWS::ECS::TaskDefinition'
    Metadata:
      SharedExectionRole: allowed
    Properties:
      TaskRoleArn:
        Ref: TaskArn
      ExecutionRoleArn: 'arn:aws:...2'
  iamRole:
    Type: 'AWS::IAM::Role'
    Properties:
      PermissionsBoundary: 'arn:aws:...3'
Consider the following query.
let ecs_tasks = Resources.*[ Type == 'AWS::ECS::TaskDefinition' ]
This query returns a collection of values that contains all three
AWS::ECS::TaskDefinition resources shown in the example template.
Separate ecs_tasks that contain TaskRoleArn local
references from others, as shown in the following example.
let ecs_tasks = Resources.*[ Type == 'AWS::ECS::TaskDefinition' ]

let ecs_tasks_role_direct_strings = %ecs_tasks[
    Properties.TaskRoleArn is_string ]

let ecs_tasks_param_reference = %ecs_tasks[
    Properties.TaskRoleArn.'Ref' exists ]

rule task_role_from_parameter_or_string {
    %ecs_tasks_role_direct_strings !empty or
    %ecs_tasks_param_reference !empty
}

rule disallow_non_local_references {
    # Known issue for rule access: Custom message must start on the same line
    not task_role_from_parameter_or_string
    <<
        result: NON_COMPLIANT
        message: Task roles are not local to stack definition
    >>
}
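The same filtering pattern can be extended to the remaining case in the template, where TaskRoleArn is a local Fn::GetAtt reference (ecsTask2). The following sketch is an assumption-based extension of the preceding example: the variable and rule names are hypothetical, and the quoted key form mirrors the 'Ref' filter shown previously.

```guard
let ecs_tasks = Resources.*[ Type == 'AWS::ECS::TaskDefinition' ]

# Tasks whose TaskRoleArn is a structure that contains an Fn::GetAtt key.
# Tasks with a plain string or a Ref are not selected by this filter.
let ecs_tasks_local_role_reference = %ecs_tasks[
    Properties.TaskRoleArn.'Fn::GetAtt' exists ]

# Hypothetical rule: passes when at least one task uses a local reference.
rule task_role_has_local_reference when %ecs_tasks !empty {
    %ecs_tasks_local_role_reference !empty
}
```

Keeping each reference style in its own variable lets later rules treat strings, parameter references, and local references differently without repeating the type filter.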