ubicloud/model/dns_zone/dns_server.rb
Burak Yucesoy e44328b2ec Introduce DNS Servers
With this commit, we are introducing the DNS server part of DNS zones, so it is
a good time to explain the overall architecture.

--- Architecture Components ---
DNS Zone: This is the public-facing entity that ties everything together. After
we publish DNS Zone as a public service, we will display this entity in the UI,
use this entity as the entry point of the DNS APIs and even use this entity for
generating billing records.
In the database, DNS Zone itself does not keep much data. The only notable data
we keep in the dns_zone table is the name, which will be used as a suffix for
the DNS records managed by this DNS zone. For example, if we want to create DNS
records like test-pg-db.postgres.ubicloud.com, we would create a DNS zone named
postgres.ubicloud.com (technically, we can create a DNS zone named ubicloud.com
as well, but that would be overkill).

DNS Records: These are the DNS resource records kept in the DNS zone. They each
have name, type, TTL, and data fields. The name is the part that we prepend to
the DNS zone name to form the full domain name. For the example above, the name
would be "test-pg-db". The type could be one of the valid DNS record types such
as A, AAAA, NS, MX, TXT, etc. TTL is time to live; after this much time, cached
information kept by the clients expires and clients should query the DNS server
to update their data. However, in practice, clients might not respect this and
query the DNS server more or less frequently. Finally, data is the data kept in
this DNS record. For example, for type A, it should be a valid IPv4 address and
for type AAAA, it should be a valid IPv6 address.
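
As a minimal sketch of how these fields fit together, the Ruby below joins a
record name with its zone name and checks the data for A and AAAA records. The
DnsRecordSketch struct and its methods are hypothetical names used only for
illustration; they are not the Ubicloud model, and the IP address is a
documentation address.

```ruby
require "ipaddr"

# Hypothetical illustration; not the actual Ubicloud model.
DnsRecordSketch = Struct.new(:name, :type, :ttl, :data) do
  # Full domain name: the record name prepended to the zone name.
  def fqdn(zone_name)
    "#{name}.#{zone_name}"
  end

  # For type A the data must be IPv4, for AAAA it must be IPv6.
  # Other record types are not checked in this sketch.
  def valid_data?
    case type
    when "A" then IPAddr.new(data).ipv4?
    when "AAAA" then IPAddr.new(data).ipv6?
    else true
    end
  rescue IPAddr::Error
    false
  end
end

record = DnsRecordSketch.new("test-pg-db", "A", 10, "192.0.2.10")
puts record.fqdn("postgres.ubicloud.com")
puts record.valid_data?
```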

DNS Servers: Name servers are responsible for responding to the incoming DNS
queries. Responding to DNS queries is quite fast (~50 ms), and one name server
can respond to many queries at the same time. Thus, putting multiple DNS zones
into one name server in a multi-tenant fashion is quite common. This increases
hardware utilization significantly. It is also common to assign more than one
name server to one DNS zone for the purposes of load balancing, redundancy, and
HA. That means there is a many-to-many relationship between DNS zones and name
servers, which is reflected in our data model. Finally, as an extra layer of
redundancy and load balancing, it is common to back one name server with more
than one VM. This is achieved by creating multiple A (or AAAA) records for the
name server with the same name but different data values (i.e., IP addresses).
Our data model also allows this.
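
The relationships described above can be sketched with plain Ruby objects. This
is an illustration only: the struct names, server names, and IP addresses are
made up, and the real model uses Sequel associations rather than in-memory
structs.

```ruby
# Hypothetical in-memory stand-ins for the database tables.
VmSketch = Struct.new(:ip)
NameServerSketch = Struct.new(:name, :vms)
ZoneSketch = Struct.new(:name, :name_servers)

ns1 = NameServerSketch.new("ns1.example.com", [VmSketch.new("192.0.2.1"), VmSketch.new("192.0.2.2")])
ns2 = NameServerSketch.new("ns2.example.com", [VmSketch.new("192.0.2.3")])

# Many-to-many: both zones share ns1, and the first zone uses two name servers.
zones = [
  ZoneSketch.new("a.example.com", [ns1, ns2]),
  ZoneSketch.new("b.example.com", [ns1])
]

# One A record per VM backing a name server: same name, different data.
def a_records_for(name_server)
  name_server.vms.map { |vm| [name_server.name, "A", vm.ip] }
end

# Names of the zones hosted by a given name server (the multi-tenant direction).
def zones_hosted_by(zones, name_server)
  zones.select { |zone| zone.name_servers.include?(name_server) }.map(&:name)
end

p a_records_for(ns1)
p zones_hosted_by(zones, ns1)
```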

--- Choice of DNS Server ---
To be able to pick DNS server software, we surveyed many options. In the end,
we picked Knot DNS. I'm planning to write a blog post about our process, but
it is valuable to document our reasoning here, even if briefly.

We considered the following open-source DNS server implementations (ordered by
LoC in their primary language):
Name         LoC
drink        3.3k
gdnsd        17k
MaraDNS      51k
coredns      53k
nsd          53k
trust-dns    56k
knot-dns     121k
pdns         160k
yadifa       175k
bind9        277k

These implementations use different languages from each other, so it is not an
apples-to-apples comparison, but in general a low line count is preferable as
it is a signal for many things we care about: readability, maintainability,
low bug count.

We evaluated them based on the following criteria:
- Code readability: Is it easy to follow the code? Are there good comments?
Does the directory structure have a good separation of concerns? Is the
codebase clean overall? How is the general feel after reading some code?
- Commit messages: Do they explain clearly why the change in question is being
made?
- Security: Does it provide enough levers to configure it securely?
- Proven reliability: Is it used widely for popular DDoS targets such as big
websites, TLDs, root servers etc.?
- Feature set: We care about being able to do dynamic updates (i.e. modifying
records without needing to reload the whole zone file) and an API/CLI for
modifying zones instead of manually editing the zone file.
- Performance: How long does it take to respond to a DNS query? How long does
it take to update zones? How long does it take to reload the zone file from
scratch?
- Access to support/maintainers: Are there channels where we can ask questions
if we get stuck or have a problem?
- Motivation alignment: Are the motivations/plans of developers aligned with
ours?
- Willingness to accept patches or requests: How do maintainers react to
patches, bug reports, and feature requests?
- Memory-safe language: Does the implementation use a memory-safe language?

No implementation satisfies all these criteria, but Knot DNS comes closest.
The primary reason was being able to do dynamic updates without needing to
reload the whole zone file. Also, Knot seemed to be better aligned with our
plans for multi-tenant name servers.

--- Setting Up Name Servers ---
We will set up the initial name servers manually. Below are the instructions
that we would use at that time. They are also useful for people who want to
test these changes.

Note: Steps 4 and 5 are specific to postgres.ubicloud.com DNS zone.

1. Install knot-dns
> sudo apt-get update
> sudo apt-get -y install apt-transport-https ca-certificates wget
> sudo wget -O /usr/share/keyrings/cznic-labs-pkg.gpg https://pkg.labs.nic.cz/gpg
> echo "deb [signed-by=/usr/share/keyrings/cznic-labs-pkg.gpg] https://pkg.labs.nic.cz/knot-dns jammy main" | sudo tee /etc/apt/sources.list.d/cznic-labs-knot-dns.list
> sudo apt-get update
> sudo apt-get install knot

2. Add the config database location to /etc/default/knot
> echo "KNOTD_ARGS=\"-C /var/lib/knot/confdb\"" | sudo tee /etc/default/knot

3. Populate the config database
> sudo nano /etc/knot/knot.conf

Populate it as follows:
server:
    rundir: "/run/knot"
    user: "knot:knot"
    listen: [ "<ipv4 interface address>@53", "<ipv6 interface address>@53" ]

log:
  - target: "syslog"
    any: "info"

database:
    storage: "/var/lib/knot"

acl:
  - id: "allow_dynamic_updates"
    address: "127.0.0.1/32"
    action: "update"

template:
  - id: "default"
    storage: "/var/lib/knot"
    file: "%s.zone"
    acl: "allow_dynamic_updates"
    zonefile-sync: "60"
    zonefile-load: "difference"
    journal-content: "all"

zone:
  - domain: "postgres.ubicloud.com."

4. Create /var/lib/knot/postgres.ubicloud.com.zone with the following content
> sudo nano /var/lib/knot/postgres.ubicloud.com.zone
postgres.ubicloud.com.          3600    SOA     ns.postgres.ubicloud.com. postgres.ubicloud.com. 23 86400 7200 1209600 3600
postgres.ubicloud.com.          3600    NS      ns.postgres.ubicloud.com.
ns.postgres.ubicloud.com.       3600    A       <ip-address-of-first-name-server>
ns.postgres.ubicloud.com.       3600    A       <ip-address-of-second-name-server>

5. Restart the knot server
> sudo systemctl restart knot

6. Now your DNS server can respond to queries. You can run the command below
from another machine:
> nslookup ns.postgres.ubicloud.com 88.198.87.151
Server:         88.198.87.151
Address:        88.198.87.151#53

Name:   ns.postgres.ubicloud.com
Address: 88.198.87.144
Name:   ns.postgres.ubicloud.com
Address: 88.198.87.151

7. If you want to make your name servers authoritative (from the client's
perspective, because they are already authoritative from the server's
perspective), you need to add NS records to the parent domain (wherever
that is managed). For example, for Cloudflare, go to DNS settings and
add the following records:

ns.postgres      A    88.198.87.144
ns.postgres      A    88.198.87.151
postgres        NS    ns.postgres.ubicloud.com

Now you can run nslookup without specifying the server. If you run it multiple
times, you will see that sometimes your queries are served from 88.198.87.144
and sometimes from 88.198.87.151.

> nslookup ns.postgres.ubicloud.com
Server:         88.198.87.151
Address:        88.198.87.151#53

Name:   ns.postgres.ubicloud.com
Address: 88.198.87.144
Name:   ns.postgres.ubicloud.com
Address: 88.198.87.151
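
For reference, the values after the two names in the SOA record from step 4
follow the standard SOA field order: serial, refresh, retry, expire, and
minimum. The sketch below splits that line into named fields. The parse_soa
helper is hypothetical and only handles a single-line record in the layout
used above.

```ruby
# Standard SOA RDATA field order.
SOA_FIELDS = %i[mname rname serial refresh retry expire minimum].freeze

# Hypothetical helper: parses "<owner> <ttl> SOA <rdata...>" on one line.
def parse_soa(line)
  owner, ttl, type, *rdata = line.split
  unless type == "SOA" && rdata.size == SOA_FIELDS.size
    raise ArgumentError, "not a single-line SOA record"
  end

  fields = SOA_FIELDS.zip(rdata).to_h
  # Everything after mname and rname is numeric.
  %i[serial refresh retry expire minimum].each { |key| fields[key] = Integer(fields[key]) }
  {owner: owner, ttl: Integer(ttl)}.merge(fields)
end

soa = parse_soa("postgres.ubicloud.com. 3600 SOA ns.postgres.ubicloud.com. postgres.ubicloud.com. 23 86400 7200 1209600 3600")
p soa
```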
2023-10-20 18:31:14 +02:00

31 lines
1.2 KiB
Ruby

# frozen_string_literal: true

require_relative "../../model"

class DnsServer < Sequel::Model
  many_to_many :dns_zones
  many_to_many :vms

  include ResourceMethods

  def run_commands_on_all_vms(commands)
    vms.each do |vm|
      outputs = vm.sshable.cmd("sudo -u knot knotc", stdin: commands.join("\n")).split("\n")

      # Passing multiple commands to knotc via stdin is faster than running each
      # command one by one. However, this approach has one drawback: in stdin mode,
      # knotc always exits with a 0 exit code (i.e. no errors would be raised). At
      # least errors are written to stdout, so we can search them manually to see
      # if we need to raise any errors.
      outputs.each_with_index do |output, index|
        next if output == "OK"
        next if index == 0 && output.include?("no active transaction")
        next if commands[index].include?("zone-set") && output.include?("such record already exists in zone")
        next if commands[index].include?("zone-unset") && output.include?("no such record in zone found")
        raise "Rectify failed on #{self}. Command: #{commands[index]}. Output: #{output}"
      end
    end
  end
end
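
To make the output filtering easier to follow, here is a standalone sketch of
the same checks without SSH. The unexpected_outputs helper is hypothetical, and
the command list and output lines are illustrative assumptions rather than
captured knotc output; only the matched substrings come from the code above.
Starting the batch with zone-abort is also an assumption, suggested by the
"no active transaction" check on the first line. Like the original, the sketch
assumes knotc prints exactly one line per command.

```ruby
# Hypothetical helper mirroring the filtering in run_commands_on_all_vms.
# Returns the [command, output] pairs that would be treated as errors.
def unexpected_outputs(commands, outputs)
  outputs.each_with_index.reject do |output, index|
    command = commands[index].to_s
    output == "OK" ||
      (index == 0 && output.include?("no active transaction")) ||
      (command.include?("zone-set") && output.include?("such record already exists in zone")) ||
      (command.include?("zone-unset") && output.include?("no such record in zone found"))
  end.map { |output, index| [commands[index], output] }
end

# Illustrative batch; names and addresses are made up.
commands = [
  "zone-abort postgres.ubicloud.com",
  "zone-begin postgres.ubicloud.com",
  "zone-set postgres.ubicloud.com test-pg-db 10 A 192.0.2.10",
  "zone-unset postgres.ubicloud.com old-db 10 A 192.0.2.11",
  "zone-commit postgres.ubicloud.com"
]

# Assumed output shape: one line per command.
outputs = [
  "error: (no active transaction)",
  "OK",
  "error: (such record already exists in zone)",
  "error: (no such record in zone found)",
  "OK"
]

p unexpected_outputs(commands, outputs)
```

The tolerated messages make the batch idempotent: setting a record that already
exists or removing one that is already gone is treated as success, so the same
commands can be re-sent safely.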