Using CoreDNS and MetalLB on bare-metal Kubernetes clusters

If you decide to build your own bare-metal Kubernetes cluster, one of the things you need to think about is how to reach the services running on it. You obviously need an Ingress controller, and Traefik is one option. An Ingress controller manages a reverse proxy that knows how to route traffic from the IP addresses assigned to it to internal services. Then you need a layer 2 load balancer so that those IP addresses can be resolved to the MAC addresses of the physical nodes, and MetalLB seems like a good choice for that. Finally, you need to resolve a DNS name of your choice to the IP address assigned to Traefik, and this is exactly what this small article is about.

When I was building my bare-metal mini-cluster, I found several sources, which fall into two categories according to how they approach DNS resolution:

  1. Configure dnsmasq on your router or, if your router doesn’t support dnsmasq, add the corresponding entries to /etc/hosts (see a great article from Carlos Eduardo).
  2. Run a local CoreDNS, add A record(s), and change resolv.conf to point to that instance, as described here.

All the solutions involving changes in either /etc/hosts or resolv.conf are problematic. You want any machine connected to your network to automatically find the services hosted on your cluster, and if you need to make changes, you shouldn’t have to go through every connected machine to apply them.

Configuring dnsmasq on a router obviously requires a router with dnsmasq support. It might already be available in the router’s firmware; otherwise you can flash DD-WRT onto it. I’m using a rather geeky MikroTik hAP ac as my subnet router and it does support dnsmasq (as well as a lot of other features), but I appreciate that a lot of people don’t have dnsmasq support on their routers and don’t want to, or can’t, flash DD-WRT for various reasons.

So I decided to run a CoreDNS instance on my cluster and use it to resolve DNS queries, since MetalLB now supports IP address sharing. The workflow for resolving a DNS name looks like this:

  1. A client computer receives a DNS server configuration provided by the router via DHCP.
  2. An ARP request is issued to resolve the IP address of the DNS server. MetalLB receives the request and returns the MAC address of the Kubernetes node where CoreDNS is currently running.
  3. A DNS request is sent to the resolved CoreDNS instance and the corresponding authoritative response is received.
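
Step 2 works only if MetalLB runs in layer 2 mode and the DNS server's address belongs to one of its pools. Here is a minimal sketch, assuming the ConfigMap-based configuration used by MetalLB releases before 0.13 (newer releases are configured through custom resources instead). The pool name and the address range are assumptions, chosen only to cover the example addresses used later in this article:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default            # assumed pool name
      protocol: layer2         # MetalLB answers ARP requests for these addresses
      addresses:
      # assumed range; it includes 192.168.88.50 (CoreDNS) and 192.168.88.100 (Traefik)
      - 192.168.88.50-192.168.88.100
```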

In order to set this up, you need to perform the following steps, assuming that you have a Kubernetes cluster up and running as well as Traefik with an external IP address assigned.

  1. Pick an IP address for the CoreDNS service from the pool of addresses available to MetalLB (e.g. 192.168.88.50) and decide which wildcard domain name you want to use for your internal services, for example, *.example.com.
  2. Create and apply a configuration map for CoreDNS with a wildcard A record for *.example.com that resolves all the sub-domains to the Traefik service you are already running. If your Traefik’s IP is 192.168.88.100, then the configuration map might look like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: external-dns
data:
  Corefile: |
    .:53 {
        errors
        health
        file /etc/coredns/zones/example.com example.com
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
  example.com: |
    $TTL 60
    $ORIGIN example.com.
    @                   IN SOA sns.dns.icann.org. noc.dns.icann.org. (
              2017042745 ; serial
              7200       ; refresh (2 hours)
              3600       ; retry (1 hour)
              1209600    ; expire (2 weeks)
              3600       ; minimum (1 hour)
              )
    @                   IN A     192.168.88.100
    *.example.com.     IN A     192.168.88.100
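
The same pattern can be extended to further domains, a question that comes up in the comments below. This is a sketch only: example2.com is a hypothetical second domain, the fragment shows just the data section of the ConfigMap, and it assumes the CoreDNS deployment also mounts the new key at /etc/coredns/zones/example2.com:

```yaml
data:
  Corefile: |
    .:53 {
        errors
        health
        file /etc/coredns/zones/example.com example.com
        # one additional file line per extra zone (hypothetical domain)
        file /etc/coredns/zones/example2.com example2.com
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
  # the existing example.com key stays unchanged; add one key per extra zone
  example2.com: |
    $TTL 60
    $ORIGIN example2.com.
    @   IN SOA sns.dns.icann.org. noc.dns.icann.org. (
              2017042745 ; serial
              7200       ; refresh (2 hours)
              3600       ; retry (1 hour)
              1209600    ; expire (2 weeks)
              3600       ; minimum (1 hour)
              )
    @   IN A 192.168.88.100
    *   IN A 192.168.88.100
```

Both zones point at the same Traefik address, so Traefik remains responsible for routing each host name to the right service.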
  3. If you want to use the faster and more secure forwarding DNS from Cloudflare (see 1.1.1.1 for more information), it is super easy to enable DNS over TLS (DoT): just replace proxy . /etc/resolv.conf with:
    forward . tls://1.1.1.1 tls://1.0.0.1 {
       tls_servername cloudflare-dns.com
       health_check 5s
    }
  4. Create and apply a deployment for CoreDNS. You can find the full deployment file here, but make sure you change the configuration path to match the name specified in the configuration map you applied previously.
  5. Create and apply a service definition for CoreDNS. If you assigned the IP address 192.168.88.50 to your CoreDNS, the configuration file might look like this:
apiVersion: v1
kind: Service
metadata:
  name: external-dns-udp
  annotations:
    metallb.universe.tf/allow-shared-ip: external-dns
  labels:
    app: external-dns
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.88.50
  ports:
    - name: dns-udp
      port: 53
      protocol: UDP
  selector:
    app: external-dns
---
apiVersion: v1
kind: Service
metadata:
  name: external-dns-tcp
  annotations:
    metallb.universe.tf/allow-shared-ip: external-dns
  labels:
    app: external-dns
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.88.50
  ports:
    - name: dns-tcp
      port: 53
      protocol: TCP
  selector:
    app: external-dns
---
apiVersion: v1
kind: Service
metadata:
  name: external-dns-metrics
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9153'
  labels:
    app: external-dns
spec:
  type: ClusterIP
  ports:
    - name: metrics
      port: 9153
      protocol: TCP
  selector:
    app: external-dns
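
For the deployment step, a minimal Deployment consistent with the external-dns ConfigMap and the services above might look like the sketch below. It is not the author's full deployment file: the image reference, the replica count, and the liveness probe are assumptions. Note also that newer CoreDNS releases dropped the proxy plugin in favor of forward, so with a current image the Corefile has to use forward:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  labels:
    app: external-dns
spec:
  replicas: 1                        # assumption: a single instance
  selector:
    matchLabels:
      app: external-dns              # must match the selector of the services
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      containers:
        - name: coredns
          image: coredns/coredns     # assumption: pin a specific tag in practice
          args: ["-conf", "/etc/coredns/Corefile"]
          ports:
            - name: dns-udp
              containerPort: 53
              protocol: UDP
            - name: dns-tcp
              containerPort: 53
              protocol: TCP
            - name: metrics
              containerPort: 9153
              protocol: TCP
          volumeMounts:
            - name: config
              mountPath: /etc/coredns
              readOnly: true
          livenessProbe:             # the health plugin serves /health on port 8080 by default
            httpGet:
              path: /health
              port: 8080
      volumes:
        - name: config
          configMap:
            name: external-dns       # the ConfigMap created earlier
            items:
              - key: Corefile
                path: Corefile
              - key: example.com
                path: zones/example.com   # matches the path in the file directive
```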
  6. Modify the DNS servers on your router so that the one provided through DHCP is your CoreDNS running on Kubernetes (e.g. 192.168.88.50). There is an article about configuring DNS servers on popular routers. In my case, since I’m using MikroTik’s hAP ac, I had to add it to the DHCP settings.
  7. Reconnect to your network to update the DNS servers and navigate to a service exposed in Traefik, e.g. http://dashboard.example.com. The name should resolve to 192.168.88.100, and Traefik should do the rest, forwarding traffic to where it needs to go.

That’s basically it. Enjoy your Kubernetes hosted DNS!

P.S. There is an issue I reported that is related to the shared IP mode in MetalLB, and the fix is not in a release yet. The issue shows up when Kubernetes moves the CoreDNS pod from one node to another, which can happen when you delete the pod for one reason or another.

9 Responses

  1. amir says:

    Thank you so much, this is a great article and it helped so much.
    Can I know how I can support more than one domain like example2.com, example3.com… ?
    Thanks!

    • Sergey Anisimov says:

      Hi Amir, thank you for your feedback.
      In my example above I’m resolving all sub-domains like foo.example.com, bar.example.com etc. to the same Traefik’s IP – 192.168.88.100. Traefik knows how to route traffic for different sub-domains to different services. If you need to support different high level domains (e.g. example2.com, example3.com), you just need to extend your CoreDNS configuration with additional records, similar to the one above. You can resolve additional domains to the same Traefik and it can handle the rest. I hope it make sense 🙂

  2. amir says:

    Thanks for fast reply! Actually I have tried with CoreDNS config but not successful. Is there any reference or example?
    By the way after applying new configuration even I rollback to the configuration with one address the server doesn’t work . I have to change the server IP for external-dns LoadBalancer to another IP to make it work!!

    Thanks again!

  3. Jose says:

    I have been trying to implement this with k3s on some raspberry pis but I somehow keep getting coredns errors with “read udp .. i/o timeout”. Have you had to update this since creation?

    • Sergey Anisimov says:

      Hi Jose,

      The last time I had this solution running, that was three weeks ago, it was working fine. But I haven’t updated k8s or any of the services for quite some time. I’m planning to update shortly and see if I get any issues.

  4. Joe says:

    Hi
    I’m just trying to experience with metallb and bgp using my mikrotik hap ac as well!
    However, Im suffering from routing issues and I don’t understand why the routing itself won’t work.
    I configured the correct AS values both in metallb and setup the peer in mikrotik, and I see it connection established.
    I even expose a service over some range, and I see my mikrotik creates the route (in ip routes) with distance 20.
    When I try to navigate to the service it ranges from being able to load it (happened at the beginning for once or twice) but took around 5seconds to load, to not being able to load at all. Im not sure what is the problem or how to debug it at this point honestly.

    Can you share your experience and configs related to this specific area?

  5. ll says:

    thank you for this article. It works. Although I had to change the apiVersion from extensions/v1beta1 to apps/v1 for the deployment of coredns: https://github.com/moikot/bare-metal-k8s/blob/master/coredns/deployment.yaml

    Rest works flawlessly
