Trying to update a cluster
The following section provides possible troubleshooting solutions to issues that might happen while you're trying to update a cluster.
pcluster update-cluster
command fails to run locally
Check the ~/.parallelcluster/pcluster-cli.log
in your local file system for failure details.
Seeing clusterStatus
is UPDATE_FAILED
with pcluster describe-cluster
command
If the cluster stack update rolled back, check the /var/log/chef-client.logs
file for error details.
Check to see if your issue is mentioned in GitHub Known Issues
The cluster update timed out
This could be an issue related to cfn-hup
not running. If the cfn-hup
demon is terminated by an external cause,
it's not restarted automatically. If cfn-hup
isn't running, during a cluster update, the CloudFormation stack starts the update process
as expected, but the update procedure isn't activated on the head node and the stack deployment eventually times out. For more information, see
Troubleshooting a cluster update timeout when cfn-hup isn't
running to troubleshoot and recover
from the issue.