Under the Hood of a Simple DNS Server

Introduction

As part of bringing up my personal infrastructure, I wanted a way to connect all my various boxes together and make them available to me from anywhere with an internet connection. The first step was to put up a VPN. For that, I use Wireguard. I ended up creating a simple management interface that lets me link peers together and access them via virtual IP addresses, no matter where they are.

One downside to this method is having to remember each individual box’s IP address. That’s too much work. Fortunately, Wireguard allows us to plug in a separate DNS server to solve this problem.

At its most basic, a DNS server maps domain names to IP addresses. This makes it much easier for us humans to remember websites: we just need to remember the name and DNS will take care of translating it to an IP address. For example, www.google.com (currently) maps to the IP address 172.217.10.228. I won’t go into the details of how DNS works here, as there are much better guides for that. I highly recommend checking out one of them if you want a high-level overview.

For this post, I will talk mostly about the details of implementing a DNS server that follows the original two RFCs that laid out the spec: 1034 and 1035.

Basic building blocks

The primitives of DNS are known as resource records. This is the data that is requested by a client and answered by a DNS server. The most familiar would be an A record, which maps a domain name to an IPv4 address. We can use a tool known as dig¹ to query for A records:

# Look for the A record of "www.google.com"
$ dig +short www.google.com A
172.217.10.228

The command returned 172.217.10.228, the IP address attached to www.google.com.

There are many types of resource records, but let’s enumerate the most relevant and discuss each of them as we build the picture:

A (AAAA)
CNAME
NS
SOA

The A record also has a sibling called AAAA, which maps a domain name to an IPv6 address. Let’s use dig again to find the IPv6 address of www.google.com:

$ dig +short www.google.com AAAA
2607:f8b0:4006:814::2004

Feel free to play around with this. At some point, you might run into something strange. You run dig against the domain for my blog and you get back unexpected results:

$ dig +short blog.aos.sh AAAA
aos.github.io.

Wait, what? Let’s try calling dig without +short:

$ dig blog.aos.sh AAAA
; Output of generated query here, skipping...

;; QUESTION SECTION:
;blog.aos.sh.             IN    AAAA

;; ANSWER SECTION:
blog.aos.sh.      300     IN    CNAME   aos.github.io.

You will see that we asked for the AAAA record, but we got back an answer with something called a CNAME record. Intuitively, we might come to the conclusion that a CNAME record is an alias, and we would be right.

CNAME stands for canonical name, and is used as a way to point one domain name to another. Contrast that to A records which point a domain name to an IP address.

To give you an example, my blog is hosted on Github Pages, which automatically generates my URL as <my-username>.github.io. Since I own the domain “aos.sh”, I created a CNAME record for blog.aos.sh with my domain provider and pointed it to aos.github.io. Eventually, aos.github.io will point to a valid IP address. We can see this process in action with dig:

$ dig blog.aos.sh A
; Skipping to relevant bits...

;; QUESTION SECTION:
;blog.aos.sh.                   IN      A

;; ANSWER SECTION:
blog.aos.sh.      300     IN    CNAME   aos.github.io.
aos.github.io.    3599    IN    A       185.199.110.153
aos.github.io.    3599    IN    A       185.199.109.153
aos.github.io.    3599    IN    A       185.199.108.153
aos.github.io.    3599    IN    A       185.199.111.153

The reason my page has multiple IP addresses assigned to it could be that there is a load balancer on Github’s backend and my site is being served from any of those addresses.²

If you try to visit aos.github.io, you will be taken to my blog anyway but you will see that it changes the URL to blog.aos.sh. That is because I told Github that I’m using a custom domain, and it sets up a redirect back to blog.aos.sh.
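
The CNAME-then-A lookup in the dig output above can be sketched in a few lines of Go. This is only an illustration of the idea, not how any real resolver is written: the record type, the lookup function, and the in-memory map are all hypothetical names I made up for this example, and the hop limit is an arbitrary guard against aliases that point at each other.

```go
package main

import "fmt"

// record is a hypothetical, simplified resource record: a type and a value.
type record struct {
	rtype string // "A", "AAAA", or "CNAME"
	data  string // an IP address, or the target name for a CNAME
}

// lookup follows CNAME records starting at name until it finds records of
// type qtype. It returns every record it saw along the way, in the order dig
// prints them. maxHops guards against CNAME loops.
func lookup(zone map[string][]record, name, qtype string, maxHops int) []record {
	var out []record
	for hop := 0; hop <= maxHops; hop++ {
		next := ""
		for _, r := range zone[name] {
			switch {
			case r.rtype == qtype:
				// Direct match, including a query for the CNAME itself
				out = append(out, r)
			case r.rtype == "CNAME":
				// Alias: keep the CNAME and continue with its target
				out = append(out, r)
				next = r.data
			}
		}
		if next == "" {
			return out
		}
		name = next
	}
	return out
}

func main() {
	zone := map[string][]record{
		"blog.aos.sh.":   {{"CNAME", "aos.github.io."}},
		"aos.github.io.": {{"A", "185.199.110.153"}, {"A", "185.199.109.153"}},
	}
	for _, r := range lookup(zone, "blog.aos.sh.", "A", 8) {
		fmt.Println(r.rtype, r.data)
	}
}
```

Asking for the CNAME type directly returns only the alias and does not follow it, which matches what dig shows when you query a CNAME explicitly.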

Tying it together

DNS servers use a file known as the zone file to hold all necessary resource records. When a client makes a request to the DNS server, the server will use the information in this file to respond. In a zone file, a resource record has the following format:

catcoffeecode.hello.     300     IN      A       157.245.253.239
      ^                 ^       ^       ^             ^
      name              TTL     class   type          data

Depending on the type of resource record, the name and data portions will contain different information. The class is the network class, and is most often IN, standing for INternet. We will get to the TTL in a bit. In the above example, we have an A-type record for catcoffeecode.hello that holds the IPv4 address.
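
To make the five columns concrete, here is a rough sketch of splitting one such line into its fields using only the standard library. The Record struct and parseRecord function are hypothetical names for this example. It assumes every column is written out, which real zone files do not require, so treat it as an illustration rather than a zone-file parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Record is a hypothetical struct mirroring the five columns shown above.
type Record struct {
	Name  string
	TTL   uint32
	Class string
	Type  string
	Data  string
}

// parseRecord splits one fully written-out zone-file line into its fields.
// It assumes all five columns are present; omitted names, TTLs, and classes
// are not handled by this sketch.
func parseRecord(line string) (Record, error) {
	f := strings.Fields(line)
	if len(f) < 5 {
		return Record{}, fmt.Errorf("expected at least 5 fields, got %d", len(f))
	}
	ttl, err := strconv.ParseUint(f[1], 10, 32)
	if err != nil {
		return Record{}, fmt.Errorf("bad TTL %q: %w", f[1], err)
	}
	return Record{
		Name:  f[0],
		TTL:   uint32(ttl),
		Class: f[2],
		Type:  f[3],
		Data:  strings.Join(f[4:], " "), // some record types have multi-word data
	}, nil
}

func main() {
	rr, err := parseRecord("catcoffeecode.hello.     300     IN      A       157.245.253.239")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", rr)
}
```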

The start of every zone file is a bit different, and looks something like this:

$ORIGIN hello.
$TTL 1750
@     IN      SOA     ns1.hello.    me.catcoffeecode.hello. (
                      2020071701  ; Serial
                      3H          ; refresh
                      1H          ; retry
                      1W          ; expire
                      1D)         ; minimum TTL

              NS      ns1.hello.

;; all of the rest of our resource records go below
ns1.hello.            A       10.56.0.1

lumpi.hello.  300     A       10.56.0.15
pi.hello.             CNAME   lumpi.hello.

; ...

We start with the name of our zone in the $ORIGIN directive, followed by a default $TTL, the time-to-live of a resource record. Eventually we get to the SOA record, which stands for Start of Authority. No zone file is valid without this record at the beginning. This will contain administrative information as well as other necessary fields.

The SOA starts with the @ symbol, shorthand for the origin directive (which we specified two lines above), followed by the class, resource type, and some other necessary fields commented above.
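
The @ shorthand is one case of a more general rule: names in a zone file are either absolute (they end in a dot) or relative to the origin. A minimal sketch of that expansion, with qualify being a hypothetical helper name of my own:

```go
package main

import (
	"fmt"
	"strings"
)

// qualify expands a name as written in a zone file into a fully qualified
// domain name. "@" means the origin itself, a name ending in "." is already
// absolute, and anything else is relative to the origin.
func qualify(name, origin string) string {
	switch {
	case name == "@":
		return origin
	case strings.HasSuffix(name, "."):
		return name
	case origin == ".":
		// Relative to the root: just add the trailing dot
		return name + "."
	default:
		return name + "." + origin
	}
}

func main() {
	origin := "hello."
	for _, n := range []string{"@", "ns1.hello.", "lumpi"} {
		fmt.Printf("%-12s -> %s\n", n, qualify(n, origin))
	}
}
```

This is why the trailing dot matters in the zone file above: writing ns1.hello without it would be read as relative to the origin.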

The other record shown above is the NS record. This is the name server that will be authoritative for this zone. This is typically the server that holds the actual zone file.³

Serving DNS records

Now that we have a basic understanding of resource records and zone files, we can put them to use in our DNS server. Here’s the idea: we start a server that listens on UDP port 53 (as defined by the DNS protocol), load our zone file, and respond to any DNS requests with what we have available. If we don’t have any answers, we send the same request to an upstream/forward DNS server and serve back its results.

Typically, DNS requests are handled by programs known as resolvers. When your browser or another program makes a request for a URL, the resolver will first resolve the hostname in that URL to an IP address. It does this by sending a DNS request to a preconfigured list of servers. Resolvers will also usually cache successful responses for a speed boost.
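
The caching part is where the TTL from earlier comes in: a cached answer is only good for as many seconds as its TTL says. Here is a small sketch of that idea. The Cache type and its methods are hypothetical, the clock is passed in so the expiry can be exercised without waiting, and there is no locking, so this is not safe for concurrent use as written.

```go
package main

import (
	"fmt"
	"time"
)

// entry is one cached answer and the Unix time (seconds) at which it expires.
type entry struct {
	addrs   []string
	expires int64
}

// Cache is a hypothetical resolver cache keyed by "name/type".
type Cache struct {
	entries map[string]entry
	now     func() int64 // current Unix time in seconds; injectable
}

func NewCache(now func() int64) *Cache {
	return &Cache{entries: make(map[string]entry), now: now}
}

// Put stores an answer for ttl seconds, mirroring the TTL on a record.
func (c *Cache) Put(key string, addrs []string, ttl uint32) {
	c.entries[key] = entry{addrs: addrs, expires: c.now() + int64(ttl)}
}

// Get returns a cached answer if it exists and has not expired yet.
func (c *Cache) Get(key string) ([]string, bool) {
	e, ok := c.entries[key]
	if !ok {
		return nil, false
	}
	if c.now() >= e.expires {
		delete(c.entries, key) // stale: drop it so the caller asks again
		return nil, false
	}
	return e.addrs, true
}

func main() {
	c := NewCache(func() int64 { return time.Now().Unix() })
	c.Put("lumpi.hello./A", []string{"10.56.0.15"}, 300)

	addrs, ok := c.Get("lumpi.hello./A")
	fmt.Println(addrs, ok)
}
```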

A simple DNS server will have a few moving parts:

Parsing the zone file into data structures that allow us to manipulate it easily.
Monitoring the zone file for changes and, when it changes, updating the data structures that hold our zone information.
Serving any DNS requests that come our way.

Having said that, let’s see what this looks like in code.⁴ A zone data structure might look something like this:

// Zone contains all resource records and helper fields
type Zone struct {
	filename        string
	fileLastModTime time.Time
	rrs             []dns.RR
	ns              []dns.RR
	mut             sync.RWMutex
}

We hold our zone file name, our resource records (rrs), our name server records (ns), and some extra fields to facilitate interacting with this data structure.

Because we need to monitor for changes to our zone file, we run an asynchronous routine to essentially check the zone file’s last modified time and compare it to what we have stored. If it has changed, we re-parse our zone file.

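// monitorZonefile polls the zone file every 30 seconds and re-parses it
// whenever its modification time moves forward.
// Note: parseRecords error handling removed for brevity
func monitorZonefile(zone *Zone) {
	t := time.NewTicker(30 * time.Second)
	defer t.Stop()

	for range t.C {
		fileInfo, err := os.Stat(zone.filename)
		if err != nil {
			// Keep serving the records we already have and try again later
			log.Printf("could not stat zone file: %v", err)
			continue
		}

		if fileInfo.ModTime().After(zone.fileLastModTime) {
			log.Printf("zone file has been modified")
			zone.fileLastModTime = fileInfo.ModTime()

			parseRecords(zone)
		}
	}
}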
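
The parsing step is not shown here, but the reason the Zone struct carries a sync.RWMutex is that the freshly parsed records have to be swapped in while requests may be reading the old ones. A rough sketch of that pattern follows. The zoneStore type is a cut-down stand-in I made up for this example, with plain strings in place of resource records so that it needs no DNS library. It is not the actual parseRecords implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// zoneStore is a hypothetical, cut-down stand-in for the Zone struct above.
type zoneStore struct {
	mut sync.RWMutex
	rrs []string
}

// replace swaps in a freshly parsed record set under the write lock, so
// readers see either the old set or the new one, never a mix.
func (z *zoneStore) replace(rrs []string) {
	z.mut.Lock()
	defer z.mut.Unlock()
	z.rrs = rrs
}

// snapshot returns a copy of the current records under the read lock.
// Many readers can hold the read lock at the same time.
func (z *zoneStore) snapshot() []string {
	z.mut.RLock()
	defer z.mut.RUnlock()
	return append([]string(nil), z.rrs...)
}

func main() {
	z := &zoneStore{}
	z.replace([]string{"lumpi.hello. 300 IN A 10.56.0.15"})

	// A few concurrent readers, as request handlers would be
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = z.snapshot()
		}()
	}

	// Meanwhile, the monitor swaps in a re-parsed zone
	z.replace([]string{"lumpi.hello. 300 IN A 10.56.0.16"})
	wg.Wait()

	fmt.Println(z.snapshot())
}
```

Parsing into a new slice first and only taking the write lock for the swap keeps the time that readers are blocked very short.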

Once we’ve parsed our zone file into resource records, we can then begin to serve them to clients. We do this by bringing up our server and listening for requests on UDP port 53. All DNS requests and responses have a specific format, as defined by RFC 1035:

Header - various helpful fields
Question - the request to the DNS server
Answer - resource records answering our question
Authority - resource records which point to an authority
Additional - any resource records holding additional information
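
To give a feel for what the Header holds, here is a sketch that decodes it by hand with the standard library. RFC 1035 defines the header as 12 bytes: a 16-bit ID, 16 bits of flags, and four 16-bit counts, one for each of the other sections. The Header struct and parseHeader function are hypothetical names for this example and only pull out a few of the flags. In practice the DNS library does all of this for us.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Header holds a subset of the fixed 12-byte DNS message header.
type Header struct {
	ID            uint16
	Response      bool  // QR bit: false for a query, true for a response
	Authoritative bool  // AA bit: the responder is an authority for the name
	RCode         uint8 // response code, 0 means no error
	QDCount       uint16 // entries in the Question section
	ANCount       uint16 // entries in the Answer section
	NSCount       uint16 // entries in the Authority section
	ARCount       uint16 // entries in the Additional section
}

// parseHeader decodes the first 12 bytes of a DNS message.
func parseHeader(msg []byte) (Header, error) {
	if len(msg) < 12 {
		return Header{}, errors.New("message is shorter than the 12-byte header")
	}
	flags := binary.BigEndian.Uint16(msg[2:4])
	return Header{
		ID:            binary.BigEndian.Uint16(msg[0:2]),
		Response:      flags&0x8000 != 0,
		Authoritative: flags&0x0400 != 0,
		RCode:         uint8(flags & 0x000F),
		QDCount:       binary.BigEndian.Uint16(msg[4:6]),
		ANCount:       binary.BigEndian.Uint16(msg[6:8]),
		NSCount:       binary.BigEndian.Uint16(msg[8:10]),
		ARCount:       binary.BigEndian.Uint16(msg[10:12]),
	}, nil
}

func main() {
	// A made-up response header: ID 0x1234, QR and AA set,
	// 1 question, 2 answers, 1 authority record, 0 additional records
	raw := []byte{0x12, 0x34, 0x84, 0x00, 0, 1, 0, 2, 0, 1, 0, 0}
	h, err := parseHeader(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", h)
}
```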

To answer a DNS request, we look in the Question section for the name of the record and check to see if we have a record that matches. The name, type, and class of the question must all match the record’s. We then create a DNS response with the same format, copy the Question over, and fill out the Answer section with the resource records that we have found. If this resource record is in our zone file, we fill out the Authority section with our name server. Finally, we respond to the request after aggregating all of our sections.

Let’s see what this looks like in code:

func HandleRequest(w dns.ResponseWriter, req *dns.Msg) {
	// Read-lock the zone so it doesn't change from under us
	zone.mut.RLock()
	defer zone.mut.RUnlock()

	// Create a response message
	m := new(dns.Msg)
	m.SetReply(req)

	// For all questions, loop through our records to check for a match
	for _, q := range req.Question {
		answers := []dns.RR{}

		for _, rr := range zone.rrs {
			rh := rr.Header()

			// The name and class must match before we look at the type
			if q.Name != rh.Name || q.Qclass != rh.Class {
				continue
			}

			// 1. Exact type match (this includes a direct CNAME query)
			if q.Qtype == rh.Rrtype {
				answers = append(answers, rr)
				continue
			}

			// 2. The name is an alias: answer with the CNAME, then ask
			// our own server for the target's records of the requested type
			if cname, ok := rr.(*dns.CNAME); ok {
				answers = append(answers, rr)

				for _, a := range resolve("127.0.0.1:53", cname.Target, q.Qtype) {
					answers = append(answers, a)
				}
			}
		}

		if len(answers) == 0 {
			// If we don't have an answer, ask a forward DNS server
			for _, a := range resolve("1.1.1.1:53", q.Name, q.Qtype) {
				answers = append(answers, a)
			}
		} else {
			// Otherwise, we are the authority on this record
			m.Ns = zone.ns
		}

		m.Answer = append(m.Answer, answers...)
	}

	// Send our response
	if err := w.WriteMsg(m); err != nil {
		log.Printf("failed to write response: %v", err)
	}
}

The resolve() function here just sends a DNS request on our behalf. Notice that if we see a matching CNAME, we make a request to our own server to find the associated A (or AAAA) record. Hopefully we added that in our zone file!
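
I have left resolve() out, and in practice the DNS library builds and sends the request for us. To show roughly what ends up on the wire, here is a standard-library sketch that builds the bytes of a single-question query as laid out in RFC 1035. The names encodeName and buildQuery are hypothetical helpers for this example and are not how resolve() is actually written.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// encodeName turns "pi.hello." into length-prefixed labels followed by a
// zero byte: 2 'p' 'i' 5 'h' 'e' 'l' 'l' 'o' 0.
func encodeName(name string) ([]byte, error) {
	if name == "." {
		return []byte{0}, nil // the root is just the terminating zero
	}
	var out []byte
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		if len(label) == 0 || len(label) > 63 {
			return nil, errors.New("each label must be 1 to 63 bytes long")
		}
		out = append(out, byte(len(label)))
		out = append(out, label...)
	}
	return append(out, 0), nil
}

// buildQuery assembles a query with one question of class IN.
// qtype is the numeric record type: 1 for A, 5 for CNAME, 28 for AAAA.
func buildQuery(id uint16, name string, qtype uint16) ([]byte, error) {
	qname, err := encodeName(name)
	if err != nil {
		return nil, err
	}
	msg := make([]byte, 12, 12+len(qname)+4)
	binary.BigEndian.PutUint16(msg[0:2], id)
	binary.BigEndian.PutUint16(msg[2:4], 0x0100) // flags: recursion desired
	binary.BigEndian.PutUint16(msg[4:6], 1)      // one question
	msg = append(msg, qname...)
	msg = append(msg, byte(qtype>>8), byte(qtype)) // QTYPE
	msg = append(msg, 0, 1)                        // QCLASS: IN
	return msg, nil
}

func main() {
	msg, err := buildQuery(0xBEEF, "pi.hello.", 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", msg)
}
```

Sending these bytes in a UDP packet to port 53 of a DNS server and decoding the reply is, at its core, what a function like resolve() does.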

We also use this function to make requests to forward DNS servers. These are servers we can reach out to for DNS information outside of our jurisdiction. The code above uses Cloudflare’s 1.1.1.1; a couple of other common examples are Google’s 8.8.8.8 and 8.8.4.4.

Final thoughts

The server we implemented is enough to serve basic DNS requests. I use it with my Wireguard VPN to have my very own internal custom domain names.

The DNS protocol is much more complex than what I have covered here. The server, as shown, is not meant for production environments. It does not encrypt DNS traffic and does not offer anonymity or privacy. It was built with a specific purpose in mind, and with a very simple use case. The intention of writing my own server was to get a better understanding of the fundamentals of DNS.

If you have time, I highly encourage you to do the same. It is a very rewarding experience.

¹ The +short option removes a lot of unnecessary verbosity for clarity. Try removing it to see exactly what query dig is generating.
² You may have noticed that querying for the AAAA record does not generate the same kind of output as the A record. It is possible that Github’s load balancers do not have an IPv6 address assigned in this case!
³ RFC 1035 mandates that we have 2 DNS servers for fault tolerance. One acts as primary, the other acts as secondary. The process of copying contents of the zone file from primary to secondary is known as a zone transfer. This is part of the reason the SOA record contains so many fields, to facilitate zone transfers.
⁴ I’ll be using Go for demonstration, as this is what I wrote my DNS server in. It makes use of this very handy dns library, especially for parsing zone files.