Behind the Cloud – Network

Introduction

This third part is about networking. The cluster uses a simple L2 design: every node is connected to the same switch so that any node can reach any other directly. One node (the gateway) has two network interfaces and provides Internet access to the rest of the cluster via NAT. This keeps operations simple while I’m still bringing services online.

One network

Hardware

As shown in Part 2, all Raspberry Pi boards plug into a switch. One Raspberry Pi acts as the gateway; it uses:

  • eth0 → the cluster LAN (to the switch)
  • eth1 → a USB‑to‑Ethernet adapter that uplinks to the home router/Internet

A minimal topology diagram:

[Internet/Home Router] ←→ (eth1) Gateway Pi (eth0) ←→ [Switch] ←→ [Pi nodes]

Tip: USB Ethernet interfaces may appear as eth1, usb0, or enx<MAC> depending on the adapter and OS. Use ip link to see the exact names.

DHCP (addresses) with Kea

Why DHCP? Each node needs an IP address and the ability to renew it automatically. DHCP does that. I run the Kea DHCP server on the gateway.

Deterministic addressing: To make management and DNS easier, I use reservations by MAC address so every node always gets the same IP (no guessing which host has which address). In Kea this is called a host reservation.

Practical notes

  • Keep a small dynamic pool for temporary devices, and reservations for cluster nodes.
  • Record each node’s MAC → hostname/IP in version control alongside your Ansible inventory (see the sketch after this list).
  • Set sensible lease timers (e.g., hours, not days) so changes roll out quickly during setup.
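
The Kea template below pulls each node’s MAC address and IP from the Ansible inventory. A minimal sketch of what that inventory can look like (hostnames, addresses, and MACs here are illustrative):

inventory/hosts.yml
k3s_agents:
  hosts:
    node-red:
      ansible_host: 192.168.10.11       # fixed IP handed out via the Kea reservation
      mac_address: "dc:a6:32:00:00:01"  # NIC MAC the reservation matches on
    node-blue:
      ansible_host: 192.168.10.12
      mac_address: "dc:a6:32:00:00:02"
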
DHCP configuration (kea.conf.j2)
{
  "Dhcp4": {
    "valid-lifetime": 7200,
    "renew-timer": 600,
    "rebind-timer": 1200,
    "interfaces-config": {
      "interfaces": [ "eth0" ]
    },
    "lease-database": {
      "type": "memfile",
      "persist": true,
      "name": "/var/lib/kea/dhcp4.leases"
    },
    "dhcp-ddns": {
      "enable-updates": true,
      "server-ip": "127.0.0.1",
      "server-port": 53001,
      "max-queue-size": 2048,
      "ncr-protocol": "UDP",
      "ncr-format": "JSON"
    },
    "ddns-qualifying-suffix": "{{ domain_name }}.",
    "subnet4": [
      {
        "subnet": "{{ dhcp_subnet }}",
        "pools": [
          { "pool": "{{ dhcp_range_start }} - {{ dhcp_range_end }}" }
        ],
        "reservations": [
{% for host in groups['k3s_agents'] %}
          {
            "hw-address": "{{ hostvars[host]['mac_address'] }}",
            "ip-address": "{{ hostvars[host]['ansible_host'] }}",
            "hostname": "{{ host }}"
          }{% if not loop.last %},{% endif %}
{% endfor %}
        ],
        "option-data": [
          { "name": "routers", "data": "{{ dhcp_gateway }}" },
          { "name": "domain-name-servers", "data": "8.8.8.8, 8.8.4.4" },
          { "name": "domain-name", "data": "{{ domain_name }}" }
        ]
      }
    ],
    "loggers": [
      {
        "name": "kea-dhcp4",
        "output_options": [
          { "output": "/var/log/kea-dhcp4.log" }
        ],
        "severity": "INFO"
      }
    ]
  }
}
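
Once Ansible has rendered the template, Kea can validate the file before a restart. A quick sanity check on the gateway (the config path assumes the Debian packaging of Kea):

# validate the rendered configuration without starting the daemon
kea-dhcp4 -t /etc/kea/kea-dhcp4.conf
# after a restart, watch leases and reservations being handed out
tail -f /var/log/kea-dhcp4.log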

DNS (names) with BIND 9

Why DNS? Humans use names, and logs are far easier to read with them. I run BIND 9 on the gateway for local name resolution.

Authoritative zone: I serve an internal zone (e.g., color-cluster.local) with A/AAAA records for nodes. On each node, /etc/resolv.conf (or NetworkManager) points to the gateway DNS.

Forwarding: For external names, the gateway forwards queries to the home router/ISP DNS (or public resolvers). That keeps one resolver path for everything inside the cluster.

DNS configuration
/etc/NetworkManager/conf.d/01-dns.conf
# NetworkManager DNS configuration
# Generated by Ansible
# Prevent NetworkManager from overwriting /etc/resolv.conf
[main]
dns=none
/etc/bind/zones/db.color-cluster.local
;
; BIND data file for {{ domain_name }}
; Generated by Ansible
;
$TTL    604800
@       IN      SOA     ns1.{{ domain_name }}. admin.{{ domain_name }}. (
                             1         ; Serial
                        604800         ; Refresh
                         86400         ; Retry
                       2419200         ; Expire
                        604800 )       ; Negative Cache TTL
;
@       IN      NS      ns1.{{ domain_name }}.
ns1     IN      A       {{ dhcp_gateway }}
; K3s server (runs on the gateway)
k3s-server      IN      A       {{ dhcp_gateway }}
master          IN      CNAME   k3s-server
; One A record per agent node
{% for host in groups['k3s_agents'] %}
{{ host }}      IN      A       {{ hostvars[host]['ansible_host'] }}
{% endfor %}
; Wildcard: any other name in the domain resolves to the gateway
*.{{ domain_name }}.    IN      A       {{ dhcp_gateway }}
/etc/bind/named.conf
// This is the primary configuration file for the BIND DNS server named.
//
// Generated by Ansible
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";
/etc/bind/named.conf.local
// Zone configuration for {{ domain_name }}
// Generated by Ansible
zone "{{ domain_name }}" {
    type master;
    file "/etc/bind/zones/db.{{ domain_name }}";
    allow-update { none; };
};
/etc/bind/named.conf.options
options {
    directory "/var/cache/bind";

    // Forward DNS requests to Google DNS if not resolved locally
    forwarders {
        8.8.8.8;
        8.8.4.4;
    };
    forward first;

    // Enable recursion for local network clients
    recursion yes;
    allow-recursion { 127.0.0.1; {{ dhcp_subnet }}; };

    // Listen on all interfaces
    listen-on { any; };
    listen-on-v6 { any; };

    // Allow queries from localhost and the local network
    allow-query { localhost; {{ dhcp_subnet }}; };

    dnssec-validation no;
    auth-nxdomain no; // conform to RFC1035
};
/etc/resolv.conf
# DNS configuration for k3s server node
search {{ domain_name }}
nameserver 127.0.0.1
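
With the files in place, BIND’s own tools catch syntax mistakes before a reload, and dig verifies resolution end to end from the gateway (zone name per the color-cluster.local example above):

# check the configuration and the zone file syntax
named-checkconf /etc/bind/named.conf
named-checkzone color-cluster.local /etc/bind/zones/db.color-cluster.local
# resolve an internal name, then an external one through the forwarders
dig @127.0.0.1 k3s-server.color-cluster.local +short
dig @127.0.0.1 debian.org +short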

Routing & NAT on the gateway

Only the gateway has Internet connectivity. It forwards packets between the cluster LAN (eth0) and the uplink (eth1) and performs NAT so internal addresses can reach the Internet.

  1. Enable IPv4 forwarding using Ansible (once, and persistently):

- name: Enable IP forwarding
  ansible.posix.sysctl:
    name: net.ipv4.ip_forward
    value: '1'
    state: present
    sysctl_set: yes
    reload: yes
  2. NAT + forwarding rules (iptables):
# translate cluster source IPs to the gateway's uplink address
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
# allow replies from the Internet back to cluster nodes
iptables -A FORWARD -i eth1 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
# allow new connections from cluster to the Internet
iptables -A FORWARD -i eth0 -o eth1 -j ACCEPT
  3. Persistence. I originally used iptables-persistent with Ansible. After installing k3s (which configures networking via Flannel), I switched to a boot script that reapplies the rules at startup; that proved simpler and more predictable during cluster restarts. The systemd unit below runs it, and a sketch of the script follows.
/etc/systemd/system/set-routing.service
[Unit]
Description=Configure network routing
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/set-routing.sh
Restart=no
User=root
Group=root

[Install]
WantedBy=multi-user.target
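
The script itself isn’t shown above; a minimal sketch of what it can contain, reapplying the same rules as step 2 (interface names per the hardware section):

/usr/local/bin/set-routing.sh
#!/bin/sh
# Reapply forwarding and NAT at boot. The tables start empty after a reboot,
# so appending (-A) here does not create duplicate rules.
set -e
# ensure forwarding is on even if sysctl ran before the interfaces existed
sysctl -w net.ipv4.ip_forward=1
# translate cluster source IPs to the gateway's uplink address
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
# allow replies from the Internet back to cluster nodes
iptables -A FORWARD -i eth1 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
# allow new connections from the cluster to the Internet
iptables -A FORWARD -i eth0 -o eth1 -j ACCEPT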

Operational note: with this setup, all maintenance hops go through the gateway first.
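
For example, the SSH client can make that hop transparent with ProxyJump (a sketch; host names, addresses, and user are illustrative):

~/.ssh/config
Host gateway
    HostName 192.168.1.50   # gateway's address on the home network
    User pi

Host node-*
    ProxyJump gateway       # tunnel through the gateway to reach the cluster LAN
    User pi

After that, ssh node-red lands directly on the node, resolved by the gateway’s DNS.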

Certificates (TLS) for internal services

I need HTTPS for internal endpoints (e.g., a private container registry), but I don’t expose them publicly. For this scenario, a good pattern is a private Certificate Authority (CA).

To do so:

  • Create a private CA (once, on the gateway).
  • Generate a key and a CSR, and sign the CSR with the private CA.
  • Distribute the CA certificate (public) to every node’s trust store so the nodes trust certificates issued by that CA.

Only the public CA certificate should be copied to other machines. The CA private key must remain offline or tightly protected.

For simplicity, I’m currently reusing one key/CSR across nodes. The right approach is a unique key and certificate per node and per service, which I’ll implement later.
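
A sketch of those steps with openssl (file names, lifetimes, and subjects are illustrative; the CA private key never leaves the gateway):

# 1) private CA: key + self-signed root certificate (once)
openssl genrsa -out ca.key 4096
openssl req -x509 -new -key ca.key -sha256 -days 3650 \
  -subj "/CN=cluster internal CA" -out ca.crt

# 2) service key + CSR, signed by the CA (modern clients require a SAN)
openssl genrsa -out registry.key 2048
openssl req -new -key registry.key \
  -subj "/CN=registry.color-cluster.local" -out registry.csr
printf "subjectAltName=DNS:registry.color-cluster.local\n" > san.ext
openssl x509 -req -in registry.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -extfile san.ext -sha256 -days 825 -out registry.crt

# 3) trust the CA on every node (Debian-style trust store)
sudo cp ca.crt /usr/local/share/ca-certificates/cluster-ca.crt
sudo update-ca-certificates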

Later (optional): once the infrastructure is more stable, I can let cert-manager manage internal certificates and renewals inside the cluster, still rooted in my private CA.

Conclusion

This network configuration gives the cluster a clean foundation:

  • one simple L2 network for node‑to‑node traffic
  • deterministic addressing (Kea) and local naming (BIND 9)
  • a single egress point with NAT on the gateway
  • internal TLS via a private CA

Useful references