This procedure changes the YARN configuration on a live cluster, propagates the change to all the nodes, and restarts the YARN NodeManager.
Both commands list all the nodes in the cluster, filter the output down to the DNS names, and then copy a file to each node or run a remote command on it over SSH. You can adapt the sed filter to your own needs. As written, it matches DNS names in the Elastic MapReduce format (ip-xx-xx-xx-xx.eu-west-1.compute.internal).
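To see what the sed filter keeps, it can help to run it against captured output before pointing it at a live cluster. The sketch below is only an illustration: the sample text is a hypothetical approximation of `yarn node -list` output (the exact layout may differ between Hadoop versions), and the function names sample_node_list and extract_hosts are made up for this example.

```shell
# Hypothetical sample of `yarn node -list` output; real output may differ by version.
sample_node_list() {
  cat <<'EOF'
Total Nodes:2
         Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
ip-10-0-0-12.eu-west-1.compute.internal:8041  RUNNING  ip-10-0-0-12.eu-west-1.compute.internal:8042  0
ip-10-0-0-37.eu-west-1.compute.internal:8041  RUNNING  ip-10-0-0-37.eu-west-1.compute.internal:8042  0
EOF
}

# Same filter as in the procedure: keep only lines that start with "ip"
# and print the text before the first colon (the DNS name).
extract_hosts() {
  sed -n 's/^\(ip[^:]*\):.*/\1/p'
}

sample_node_list | extract_hosts
```

The summary and header lines do not start with "ip", so they are dropped. If your nodes use a different naming scheme, the `ip` prefix in the pattern is the part to change.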
1. Upload the private key (.pem) file that you use to access the master node onto the master node itself. Restrict the key's permissions to 600 or stricter (e.g. chmod 600 MyKeyName.pem).
2. Edit ~/conf/yarn-site.xml and use a command like this to propagate the change across the cluster.
yarn node -list | sed -n "s/^\(ip[^:]*\):.*/\1/p" | xargs -t -I{} -P10 scp -o StrictHostKeyChecking=no -i ~/MyKeyName.pem ~/conf/yarn-site.xml hadoop@{}:/home/hadoop/conf/
3. This command restarts the YARN NodeManager on all the nodes.
yarn node -list|sed -n "s/^\(ip[^:]*\):.*/\1/p" | xargs -t -I{} -P10 ssh -o StrictHostKeyChecking=no -i ~/MyKeyName.pem hadoop@{} "yarn nodemanager stop"
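Before running either command against a live cluster, one option is a dry run that prints each command instead of executing it. The sketch below assumes nothing beyond standard xargs and echo; the function name preview_restart, the KEY variable, and the two host names are placeholders for illustration.

```shell
# Placeholder key path; adjust to wherever the .pem file was uploaded.
KEY="$HOME/MyKeyName.pem"

# Dry run: read host names on stdin and print the ssh command that would
# be run for each one. Nothing is executed on the remote nodes.
preview_restart() {
  xargs -I{} echo ssh -o StrictHostKeyChecking=no -i "$KEY" hadoop@{} '"yarn nodemanager stop"'
}

# Placeholder host names in the Elastic MapReduce format.
printf '%s\n' \
  ip-10-0-0-12.eu-west-1.compute.internal \
  ip-10-0-0-37.eu-west-1.compute.internal | preview_restart
```

Once the printed commands look right, the real pipeline is the same one with `echo` removed and the host list supplied by `yarn node -list` and the sed filter.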
Working Script for EMR 5.x
for node in $(hadoop dfsadmin -report | grep ^Name | cut -f2 -d: | cut -f2 -d' '); do
  ssh -i ~/key.pem hadoop@$node "sudo chmod 777 /etc/hadoop/conf/*"
  scp -i ~/key.pem /etc/hadoop/conf/yarn-site.xml hadoop@$node:/etc/hadoop/conf/yarn-site.xml
  ssh -i ~/key.pem hadoop@$node "sudo chmod 674 /etc/hadoop/conf/*"
done
done
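The node list in this script comes from the grep and cut pipeline, which pulls the IP address out of each "Name:" line of the DataNode report. A minimal sketch of that extraction follows, run on a hypothetical excerpt of the report rather than a live cluster. The sample text, the port number, and the function names sample_report and datanode_ips are assumptions for illustration.

```shell
# Hypothetical excerpt of a DataNode report; real reports contain many more fields.
sample_report() {
  cat <<'EOF'
Live datanodes (2):

Name: 10.0.0.12:50010 (ip-10-0-0-12.eu-west-1.compute.internal)
Hostname: ip-10-0-0-12.eu-west-1.compute.internal
Decommission Status : Normal

Name: 10.0.0.37:50010 (ip-10-0-0-37.eu-west-1.compute.internal)
Hostname: ip-10-0-0-37.eu-west-1.compute.internal
Decommission Status : Normal
EOF
}

# Same pipeline as in the script: keep the "Name:" lines, take the field
# between the first and second colon, then drop the leading space.
datanode_ips() {
  grep '^Name' | cut -f2 -d: | cut -f2 -d' '
}

sample_report | datanode_ips
```

With this sample the pipeline yields one IP address per DataNode, which is what the for loop iterates over.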