Creating a parallel query DB cluster in Aurora MySQL
To create an Aurora MySQL cluster with parallel query, add new instances to it, or perform other administrative operations, you use the same AWS Management Console and AWS CLI techniques that you do with other Aurora MySQL clusters. You can create a new cluster to work with parallel query. You can also create a DB cluster to work with parallel query by restoring from a snapshot of a MySQL-compatible Aurora DB cluster. If you aren't familiar with the process for creating a new Aurora MySQL cluster, you can find background information and prerequisites in Creating an Amazon Aurora DB cluster.
When you choose an Aurora MySQL engine version, we recommend that you choose the latest one available. Currently, all available Aurora MySQL versions support parallel query. You have more flexibility to turn parallel query on and off, or use parallel query with existing clusters, if you use the latest versions.
Whether you create a new cluster or restore from a snapshot, you use the same techniques to add new DB instances that you do with other Aurora MySQL clusters.
You can create a parallel query cluster using the Amazon RDS console or the AWS CLI.
Contents
Creating a parallel query cluster using the console
You can create a new parallel query cluster with the console as described following.
To create a parallel query cluster with the AWS Management Console
-
Follow the general AWS Management Console procedure in Creating an Amazon Aurora DB cluster.
-
For Engine type, choose Aurora MySQL.
-
For Additional configuration, choose a parameter group that you created for DB cluster parameter group. Using such a custom parameter group is required for Aurora MySQL 2.09 and higher. In your DB cluster parameter group, specify the parameter settings
aurora_parallel_query=ON
andaurora_disable_hash_join=OFF
. Doing so turns on parallel query for the cluster, and turns on the hash join optimization that works in combination with parallel query.
To verify that a new cluster can use parallel query
Create a cluster using the preceding technique.
-
(For Aurora MySQL version 2 or 3) Check that the
aurora_parallel_query
configuration setting is true.mysql>
select @@aurora_parallel_query;+-------------------------+ | @@aurora_parallel_query | +-------------------------+ | 1 | +-------------------------+
-
(For Aurora MySQL version 2) Check that the
aurora_disable_hash_join
setting is false.mysql>
select @@aurora_disable_hash_join;+----------------------------+ | @@aurora_disable_hash_join | +----------------------------+ | 0 | +----------------------------+
-
With some large tables and data-intensive queries, check the query plans to confirm that some of your queries are using the parallel query optimization. To do so, follow the procedure in Verifying which statements use parallel query for Aurora MySQL.
Creating a parallel query cluster using the CLI
You can create a new parallel query cluster with the CLI as described following.
To create a parallel query cluster with the AWS CLI
-
(Optional) Check which Aurora MySQL versions are compatible with parallel query clusters. To do so, use the
describe-db-engine-versions
command and check the value of theSupportsParallelQuery
field. For an example, see Checking Aurora MySQL version compatibility for parallel query. -
(Optional) Create a custom DB cluster parameter group with the settings
aurora_parallel_query=ON
andaurora_disable_hash_join=OFF
. Use commands such as the following.aws rds create-db-cluster-parameter-group --db-parameter-group-family aurora-mysql8.0 --db-cluster-parameter-group-name pq-enabled-80-compatible aws rds modify-db-cluster-parameter-group --db-cluster-parameter-group-name pq-enabled-80-compatible \ --parameters ParameterName=aurora_parallel_query,ParameterValue=ON,ApplyMethod=pending-reboot aws rds modify-db-cluster-parameter-group --db-cluster-parameter-group-name pq-enabled-80-compatible \ --parameters ParameterName=aurora_disable_hash_join,ParameterValue=OFF,ApplyMethod=pending-reboot
If you perform this step, specify the option
--db-cluster-parameter-group-name
in the subsequentmy_cluster_parameter_group
create-db-cluster
statement. Substitute the name of your own parameter group. If you omit this step, you create the parameter group and associate it with the cluster later, as described in Turning parallel query on and off in Aurora MySQL. -
Follow the general AWS CLI procedure in Creating an Amazon Aurora DB cluster.
-
Specify the following set of options:
-
For the
--engine
option, useaurora-mysql
. These values produce parallel query clusters that are compatible with MySQL 5.7 or 8.0. -
For the
--db-cluster-parameter-group-name
option, specify the name of a DB cluster parameter group that you created and specified the parameter valueaurora_parallel_query=ON
. If you omit this option, you can create the cluster with a default parameter group and later modify it to use such a custom parameter group. -
For the
--engine-version
option, use an Aurora MySQL version that's compatible with parallel query. Use the procedure from Optimizing parallel query in Aurora MySQL to get a list of versions if necessary.The following code example shows how. Substitute your own value for each of the environment variables such as
$CLUSTER_ID
. This example also specifies the--manage-master-user-password
option to generate the master user password and manage it in Secrets Manager. For more information, see Password management with Amazon Aurora and AWS Secrets Manager. Alternatively, you can use the--master-password
option to specify and manage the password yourself.aws rds create-db-cluster --db-cluster-identifier
$CLUSTER_ID
\ --engine aurora-mysql --engine-version 8.0.mysql_aurora.3.04.1 \ --master-username$MASTER_USER_ID
--manage-master-user-password \ --db-cluster-parameter-group-name$CUSTOM_CLUSTER_PARAM_GROUP
aws rds create-db-instance --db-instance-identifier${INSTANCE_ID}-1
\ --enginesame_value_as_in_create_cluster_command
\ --db-cluster-identifier$CLUSTER_ID
--db-instance-class$INSTANCE_CLASS
-
-
Verify that a cluster you created or restored has the parallel query feature available.
Check that the
aurora_parallel_query
configuration setting exists. If this setting has the value 1, parallel query is ready for you to use. If this setting has the value 0, set it to 1 before you can use parallel query. Either way, the cluster is capable of performing parallel queries.mysql>
select @@aurora_parallel_query;+------------------------+ | @@aurora_parallel_query| +------------------------+ | 1 | +------------------------+
To restore a snapshot to a parallel query cluster with the AWS CLI
-
Check which Aurora MySQL versions are compatible with parallel query clusters. To do so, use the
describe-db-engine-versions
command and check the value of theSupportsParallelQuery
field. For an example, see Checking Aurora MySQL version compatibility for parallel query. Decide which version to use for the restored cluster. -
Locate an Aurora MySQL-compatible cluster snapshot.
-
Follow the general AWS CLI procedure in Restoring from a DB cluster snapshot.
aws rds restore-db-cluster-from-snapshot \ --db-cluster-identifier
mynewdbcluster
\ --snapshot-identifiermydbclustersnapshot
\ --engine aurora-mysql -
Verify that a cluster you created or restored has the parallel query feature available. Use the same verification procedure as in Creating a parallel query cluster using the CLI.