Node Operations#
Common operations for managing cluster nodes: shutdown, reboot, drain, add, and remove.
Apply package updates to nodes#
The update_packages role is part of the servers play — not a tag of its own.
To run package updates (e.g. after adding a new package to the role):
# All nodes
ansible-playbook pb_all.yml --tags servers
# Specific nodes only
ansible-playbook pb_all.yml --tags servers --limit node02,node03
Note
--tags update_packages will silently do nothing — the correct tag is servers,
which runs both the move_fs and update_packages roles.
Shutdown all nodes#
ansible all_nodes -a "/sbin/shutdown now" -f 10 --become
Reboot all nodes#
ansible all_nodes -m reboot -f 10 --become
Shutdown or reboot a single node#
# Shutdown
ansible node03 -a "/sbin/shutdown now" --become
# Reboot
ansible node03 -m reboot --become
Drain a node for maintenance#
Before taking a node offline for hardware maintenance:
# Drain the node (evict pods, mark unschedulable)
kubectl drain node03 --ignore-daemonsets --delete-emptydir-data
# Perform maintenance...
# Uncordon the node (allow pods to schedule again)
kubectl uncordon node03
Add extra (non-Turing Pi) nodes#
To add standalone Linux servers to the cluster:
Step 1: Add to inventory#
Edit hosts.yml and add entries under extra_nodes:
extra_nodes:
hosts:
nuc1:
nuc2:
vars:
ansible_user: "{{ ansible_account }}"
all_nodes:
children:
turingpi_nodes:
extra_nodes: # Make sure extra_nodes is listed here
Step 2: Bootstrap Ansible access#
ansible-playbook pb_add_nodes.yml
This prompts for an existing username and password on the new servers, creates the
ansible user with SSH key authentication and passwordless sudo.
Step 3: Join the cluster#
ansible-playbook pb_all.yml --limit nuc1,nuc2 --tags known_hosts,servers,k3s
The new nodes will be prepared (known_hosts, package updates) and joined to the existing K3s cluster as workers.
Remove a worker node#
Step 1: Drain the node#
kubectl drain node03 --ignore-daemonsets --delete-emptydir-data
Step 2: Delete from Kubernetes#
kubectl delete node node03
Step 3: Uninstall K3s on the node#
ssh ansible@node03 'sudo /usr/local/bin/k3s-agent-uninstall.sh'
Step 4: Remove from inventory#
Remove the node from hosts.yml and commit the change.
Run an ad-hoc command on all nodes#
# Check disk usage
ansible all_nodes -a "df -h" --become
# Check K3s agent status
ansible all_nodes -a "systemctl status k3s-agent" --become
# Run a role standalone
ansible all_nodes -m include_role -a name=known_hosts