Debugging AWS ECS Exec Command
AWS CloudShell has become one of my favorite tools. I can quickly connect to an ECS container or EC2 instance and have a full bash shell where ctrl-c and ctrl-v work as expected. But Session Manager needs to be properly configured for this to work.
Let’s say there exists an EC2 instance called EC2WithSessionManager with Session Manager already enabled. We can connect to it from CloudShell using Session Manager:
aws ssm start-session --target $(aws ec2 describe-instances \
--filters "Name=tag:Name,Values=EC2WithSessionManager" \
--query 'Reservations[0].Instances[0].InstanceId' \
--output text)
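If no instance carries that Name tag, the query prints `None` and `start-session` is called with that literal string as its target. A minimal sketch that checks the lookup first; `instance_id_by_name` is a hypothetical helper name, and the extra `instance-state-name` filter is an assumption that only running instances are of interest:

```shell
# Hypothetical helper: print the ID of the first running instance with the given Name tag.
instance_id_by_name() {
  local id
  id=$(aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=$1" "Name=instance-state-name,Values=running" \
    --query 'Reservations[0].Instances[0].InstanceId' \
    --output text) || return 1
  # With --output text, a query that matches nothing prints "None".
  if [ -z "$id" ] || [ "$id" = "None" ]; then
    echo "no running instance named $1" >&2
    return 1
  fi
  echo "$id"
}
```

It could be used as `INSTANCE_ID=$(instance_id_by_name EC2WithSessionManager) && aws ssm start-session --target "$INSTANCE_ID"`, so the session is only attempted when an instance was actually found.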
Become ec2-user, update the system and install Git (if it’s not already installed):
sudo su - ec2-user
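The line above only switches user; the update and install commands are not shown. A minimal sketch for that step, assuming an Amazon Linux instance where `yum` is available (an assumption; substitute your distribution's package manager). `missing_tools` is a hypothetical helper that lists which commands are not yet on the PATH, and `jq` is included because later commands in this post pipe through it:

```shell
# Hypothetical helper: print each named command that is not found on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || echo "$tool"
  done
}
```

On the instance, `sudo yum update -y` followed by `sudo yum install -y $(missing_tools git jq)` would then install only what is absent. Skip the install when the helper prints nothing, since `yum install` expects at least one package name.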
Clone the ECS Exec Checker:
git clone https://github.com/aws-containers/amazon-ecs-exec-checker.git && cd amazon-ecs-exec-checker
Let’s say the cluster is named reporter:
CLUSTER_NAME=reporter
Get the task ID:
TASK_ID=$(aws ecs list-tasks --cluster $CLUSTER_NAME | jq -r '.[][0]' | awk -F'/' '{ print $NF }')
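The same lookup can be done without `jq` by letting the CLI's `--query` option pick the first ARN and stripping the prefix with shell parameter expansion. A sketch under the assumption that any task listed for the cluster will do; `first_task_id` is a hypothetical helper name:

```shell
# Hypothetical helper: print the ID of the first task listed for a cluster.
first_task_id() {
  local arn
  arn=$(aws ecs list-tasks --cluster "$1" \
    --query 'taskArns[0]' --output text) || return 1
  # An empty task list comes back as "None" in text output.
  [ -n "$arn" ] && [ "$arn" != "None" ] || return 1
  # Task ARNs end in .../<task-id>; keep everything after the last slash.
  echo "${arn##*/}"
}
```

With this in place, `TASK_ID=$(first_task_id "$CLUSTER_NAME")` fails visibly when the cluster has no tasks instead of leaving a placeholder string in the variable.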
Run the checker:
./check-ecs-exec.sh $CLUSTER_NAME $TASK_ID
The output will provide a detailed report making it easy to see what is missing.
Enabling ECS Exec Command for a Service
One common issue is that the service is not enabled for ECS Exec. If it’s enabled, the following command will return true:
aws ecs describe-services \
--cluster $CLUSTER_NAME \
--services $SERVICE_NAME \
--query 'services[0].enableExecuteCommand'
If it is not enabled, it can be enabled with the update-service command:
aws ecs update-service \
--cluster $CLUSTER_NAME \
--service $SERVICE_NAME \
--enable-execute-command \
--force-new-deployment
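The check and the update can be combined so that a new deployment is only forced when it is needed. A minimal sketch, assuming `SERVICE_NAME` holds your service's name (it is not set earlier in this post); `ensure_exec_enabled` is a hypothetical helper, not part of the AWS CLI:

```shell
# Hypothetical helper: enable ECS Exec on a service only when it is not already enabled.
ensure_exec_enabled() {
  local cluster="$1" service="$2" enabled
  # JSON output renders the boolean as the literal string true or false.
  enabled=$(aws ecs describe-services \
    --cluster "$cluster" \
    --services "$service" \
    --query 'services[0].enableExecuteCommand' \
    --output json) || return 1
  if [ "$enabled" = "true" ]; then
    echo "ECS Exec already enabled on $service"
    return 0
  fi
  # update-service takes --service (singular), unlike describe-services.
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --enable-execute-command \
    --force-new-deployment > /dev/null || return 1
  echo "ECS Exec enabled on $service; new deployment started"
}
```

Called as `ensure_exec_enabled "$CLUSTER_NAME" "$SERVICE_NAME"`. The setting only applies to tasks started after the change, so once the new deployment replaces the old tasks, `TASK_ID` needs to be fetched again.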
ResourceInitializationError
Another common issue is the task not being able to pull the image from ECR, which surfaces as a ResourceInitializationError. This can happen when the subnet’s route table has no path to ECR, for example no route to an internet gateway or a NAT gateway. The solution is to identify the right subnet and security group and update the service to use them.
aws ecs update-service \
--cluster $CLUSTER_NAME \
--service $SERVICE_NAME \
--network-configuration "awsvpcConfiguration={subnets=[subnet-1234567890],securityGroups=[sg-1234567890],assignPublicIp=ENABLED}" \
--force-new-deployment
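To judge whether a subnet can reach ECR at all, it helps to look at the default route in its route table. A sketch, assuming the subnet has an explicit route table association and that image pulls go over the internet rather than through VPC endpoints; both helper names are hypothetical:

```shell
# Hypothetical helper: classify the default-route targets printed by the query below.
classify_default_route() {
  case "$1" in
    *igw-*) echo "internet-gateway" ;;
    *nat-*) echo "nat-gateway" ;;
    *)      echo "none-found" ;;
  esac
}

# Hypothetical helper: report how a subnet's 0.0.0.0/0 route leaves the VPC.
# Only route tables explicitly associated with the subnet are matched; a subnet
# that relies on the VPC's main route table yields "none-found" here.
subnet_default_route() {
  local targets
  targets=$(aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$1" \
    --query "RouteTables[].Routes[] | [?DestinationCidrBlock=='0.0.0.0/0'].[GatewayId,NatGatewayId]" \
    --output text) || return 1
  classify_default_route "$targets"
}
```

For `subnet_default_route subnet-1234567890`, an `internet-gateway` result points to a public subnet, where a Fargate task needs a public IP (`assignPublicIp=ENABLED`) to pull images. A `nat-gateway` result points to a private subnet, where `assignPublicIp=DISABLED` is the usual choice.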
Test the Connection
Try connecting to your container:
aws ecs execute-command \
--cluster $CLUSTER_NAME \
--task $TASK_ID \
--command "/bin/bash" \
--interactive
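If the command fails even though the service has ECS Exec enabled, the agent inside the task may not be running yet, for example right after a new deployment. A sketch that reads the agent status from `describe-tasks`; `exec_agent_status` is a hypothetical helper, and it only inspects the first container in the task:

```shell
# Hypothetical helper: print the ExecuteCommandAgent status of a task's first container.
exec_agent_status() {
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query "tasks[0].containers[0].managedAgents[?name=='ExecuteCommandAgent'].lastStatus | [0]" \
    --output text
}
```

Running `exec_agent_status "$CLUSTER_NAME" "$TASK_ID"` and seeing `RUNNING` suggests the container should accept `execute-command`. `None` typically indicates the task was started before ECS Exec was enabled. For tasks with more than one container, `execute-command` also needs `--container` with the container's name.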
Or, a one-liner (if there is only one service in the cluster):
aws ecs execute-command --cluster $CLUSTER_NAME --task $(aws ecs list-tasks --cluster $CLUSTER_NAME --service-name $(aws ecs list-services --cluster $CLUSTER_NAME | jq -r '.[][0]' | awk -F'/' '{ print $NF }') | jq -r '.[][0]' | awk -F'/' '{ print $NF }') --interactive --command "/bin/bash"