When you troubleshoot a cluster, you might want to list running processes. You might also want to stop or restart processes. For example, you can restart a process after you change a configuration, or after you notice a problem with a particular process while analyzing log files and error messages.
There are two types of processes that run on a cluster: Amazon EMR processes (for example, instance-controller and Log Pusher), and processes associated with the applications installed on the cluster (for example, hadoop-hdfs-namenode and hadoop-yarn-resourcemanager).
To work with processes directly on a cluster, you must first connect to the master node. For more information, see Connect to an Amazon EMR cluster.
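For example, if you connect to the master node over SSH, the command follows this general pattern, where the key file and public DNS name are placeholders that depend on your cluster:
ssh -i ~/mykey.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com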
Viewing running processes
The method you use to view running processes on a cluster differs according to the Amazon EMR version you use.
Example: List all running processes
The following example uses systemctl and specifies --type to view all processes.
systemctl --type=service
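As a variation on the command above, you can also pass the --state option if you only want to see units that are currently running; this is optional and not required for the examples that follow.
systemctl --type=service --state=running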
Example: List specific processes
The following example lists all processes with names that contain hadoop.
systemctl --type=service | grep -i hadoop
Example output:
hadoop-hdfs-namenode.service loaded active running Hadoop namenode
hadoop-httpfs.service loaded active running Hadoop httpfs
hadoop-kms.service loaded active running Hadoop kms
hadoop-mapreduce-historyserver.service loaded active running Hadoop historyserver
hadoop-state-pusher.service loaded active running Daemon process that processes and serves EMR metrics data.
hadoop-yarn-proxyserver.service loaded active running Hadoop proxyserver
hadoop-yarn-resourcemanager.service loaded active running Hadoop resourcemanager
hadoop-yarn-timelineserver.service loaded active running Hadoop timelineserver
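You can filter for Amazon EMR processes in the same way. For example, the following command checks for the instance-controller process, assuming the service name on your release contains instance-controller.
systemctl --type=service | grep -i instance-controller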
Example: See a detailed status report for a specific process
The following example displays a detailed status report for the hadoop-hdfs-namenode service.
sudo systemctl status hadoop-hdfs-namenode
Example output:
hadoop-hdfs-namenode.service - Hadoop namenode
Loaded: loaded (/etc/systemd/system/hadoop-hdfs-namenode.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2021-08-18 21:01:46 UTC; 26min ago
Main PID: 9733 (java)
Tasks: 0
Memory: 1.1M
CGroup: /system.slice/hadoop-hdfs-namenode.service
‣ 9733 /etc/alternatives/jre/bin/java -Dproc_namenode -Xmx1843m -server -XX:OnOutOfMemoryError=kill -9 %p ...
Aug 18 21:01:37 ip-172-31-20-123 systemd[1]: Starting Hadoop namenode...
Aug 18 21:01:37 ip-172-31-20-123 su[9715]: (to hdfs) root on none
Aug 18 21:01:37 ip-172-31-20-123 hadoop-hdfs-namenode[9683]: starting namenode, logging to /var/log/hadoop-hdfs/ha...out
Aug 18 21:01:46 ip-172-31-20-123 hadoop-hdfs-namenode[9683]: Started Hadoop namenode:[ OK ]
Aug 18 21:01:46 ip-172-31-20-123 systemd[1]: Started Hadoop namenode.
Hint: Some lines were ellipsized, use -l to show in full.
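As the hint in the output suggests, you can add the -l option to show the truncated lines in full.
sudo systemctl status -l hadoop-hdfs-namenode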
Stopping and restarting processes
After you determine which processes are running, you can stop and then restart them if necessary.
Example: Stop a process
The following example stops the hadoop-hdfs-namenode process.
sudo systemctl stop hadoop-hdfs-namenode
You can query the status to verify that the process is stopped.
sudo systemctl status hadoop-hdfs-namenode
Example output:
hadoop-hdfs-namenode.service - Hadoop namenode
Loaded: loaded (/etc/systemd/system/hadoop-hdfs-namenode.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2021-08-18 21:37:50 UTC; 8s ago
Main PID: 9733 (code=exited, status=143)
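An exit status of 143 indicates that the process exited after receiving a SIGTERM signal (128 + 15), which is expected when systemd stops a service.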
Example: Start a process
The following example starts the hadoop-hdfs-namenode process.
sudo systemctl start hadoop-hdfs-namenode
You can query the status to verify that the process is running.
sudo systemctl status hadoop-hdfs-namenode
Example output:
hadoop-hdfs-namenode.service - Hadoop namenode
Loaded: loaded (/etc/systemd/system/hadoop-hdfs-namenode.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2021-08-18 21:38:24 UTC; 2s ago
Process: 13748 ExecStart=/etc/init.d/hadoop-hdfs-namenode start (code=exited, status=0/SUCCESS)
Main PID: 13800 (java)
Tasks: 0
Memory: 1.1M
CGroup: /system.slice/hadoop-hdfs-namenode.service
‣ 13800 /etc/alternatives/jre/bin/java -Dproc_namenode -Xmx1843m -server -XX:OnOutOfMemoryError=kill -9 %p...
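To stop and then start a process in a single step, you can also use the restart command.
sudo systemctl restart hadoop-hdfs-namenode
You can then query the status again to confirm that the process is running.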