With this commit, we are introducing the DNS server part of DNS zones, so it
is a good time to explain the overall architecture.

--- Architecture Components ---

DNS Zone: This is the public-facing entity that ties everything together.
After we publish DNS Zone as a public service, we will display this entity in
the UI, use it as the entry point of the DNS APIs, and even use it for
generating billing records. In the database, DNS Zone itself does not keep
much data. The only notable data we keep in the dns_zone table is the name,
which is used as a suffix for the DNS records managed by this DNS zone. For
example, if we want to create DNS records like
test-pg-db.postgres.ubicloud.com, we would create a DNS zone named
postgres.ubicloud.com (technically, we could create a DNS zone named
ubicloud.com as well, but that would be overkill).

DNS Records: These are the DNS resource records kept in the DNS zone. Each has
name, type, TTL, and data fields. The name is the part that we prepend to the
DNS zone name to generate the full domain name. For the example above, the
name would be "test-pg-db". The type can be any valid DNS record type such as
A, AAAA, NS, MX, TXT, etc. TTL is time to live; after this much time, cached
information kept by clients expires and clients should query the DNS server
again to update their data. In practice, however, clients might not respect
the TTL and may query the DNS server more or less frequently. Finally, data is
the value kept in this DNS record. For example, for type A it should be a
valid IPv4 address, and for type AAAA it should be a valid IPv6 address.

DNS Servers: Name servers are responsible for responding to incoming DNS
queries. Responding to DNS queries is quite fast (~50 ms), and one name server
can respond to many queries at the same time. Thus, putting multiple DNS zones
on one name server in a multi-tenant fashion is quite common. This increases
hardware utilization significantly.
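As a small illustration of the zone and record naming described above, the
sketch below joins a record name with its zone name to get the full domain
name. The Zone and Record structs, the full_name method, and the sample TTL
and IP address are hypothetical and chosen only for this example; they are not
the models added by this commit.

```ruby
# Hypothetical, in-memory stand-ins for a DNS zone and one of its records.
Zone = Struct.new(:name)

Record = Struct.new(:name, :type, :ttl, :data) do
  # Prepend the record name to the zone name to build the full domain name.
  def full_name(zone)
    "#{name}.#{zone.name}"
  end
end

zone = Zone.new("postgres.ubicloud.com")
# 192.0.2.10 is a documentation-only placeholder address.
record = Record.new("test-pg-db", "A", 10, "192.0.2.10")

puts record.full_name(zone) # prints test-pg-db.postgres.ubicloud.com
```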
It is also common to assign more than one name server to one DNS zone for load
balancing, redundancy, and HA. That means there is a many-to-many relationship
between DNS zones and name servers, which is reflected in our data model.

Finally, as an extra layer of redundancy and load balancing, it is common to
back one name server with more than one VM. This is achieved by creating
multiple A (or AAAA) records for the name server with the same name but
different data values (i.e., IP addresses). Our data model also allows this.

--- Choice of DNS Server ---

To pick the DNS server software, we surveyed many options. In the end, we
picked Knot DNS. I'm planning to write a blog post about our process, but it
is valuable to document our reasoning here, even if briefly.

We considered the following open-source DNS server implementations (ordered by
LoC in their primary language):

  Name       LoC
  drink      3.3k
  gdnsd      17k
  MaraDNS    51k
  coredns    53k
  nsd        53k
  trust-dns  56k
  knot-dns   121k
  pdns       160k
  yadifa     175k
  bind9      277k

These implementations are written in different languages, so this is not an
apples-to-apples comparison, but in general a low line count is preferable
because it is a signal for many things we care about: readability,
maintainability, low bug count.

We evaluated them based on the following criteria:

- Code readability: Is it easy to follow the code? Are there good comments?
  Does the directory structure have good separation of concerns? Is the
  codebase clean overall? How is the general feel after reading some code?
- Commit messages: Do they explain clearly why the change in question is being
  made?
- Security: Does it provide enough levers to configure it securely?
- Proven reliability: Is it used widely for popular DDoS targets such as big
  websites, TLDs, root servers, etc.?
- Feature set: We care about being able to do dynamic updates (i.e., modifying
  records without needing to reload the whole zone file) and about having an
  API/CLI for modifying zones instead of manually editing the zone file.
- Performance: How long does it take to respond to a DNS query? How long does
  it take to update zones? How long does it take to reload the zone file from
  scratch?
- Access to support/maintainers: Are there channels where we can ask questions
  if we get stuck or have a problem?
- Motivation alignment: Are the motivations/plans of the developers aligned
  with ours?
- Willingness to accept patches or requests: How do maintainers react to
  patches, bug reports, and feature requests?
- Memory-safe language: Does the implementation use a memory-safe language?

No implementation satisfies all these criteria, but Knot DNS comes closest.
The primary reason was being able to do dynamic updates without needing to
reload the whole zone file. Also, Knot seemed to be better aligned with our
plans for multi-tenant name servers.

--- Setting Up Name Servers ---

We will set up the initial name servers manually. Below are the instructions
that we would use at that time. They are also useful for people who want to
test these changes.

Note: Steps 4 and 5 are specific to postgres.ubicloud.com DNS zone.

1. Install knot-dns

  > sudo apt-get update
  > sudo apt-get -y install apt-transport-https ca-certificates wget
  > sudo wget -O /usr/share/keyrings/cznic-labs-pkg.gpg https://pkg.labs.nic.cz/gpg
  > echo "deb [signed-by=/usr/share/keyrings/cznic-labs-pkg.gpg] https://pkg.labs.nic.cz/knot-dns jammy main" | sudo tee /etc/apt/sources.list.d/cznic-labs-knot-dns.list
  > sudo apt-get update
  > sudo apt-get install knot

2. Add the config database location to /etc/default/knot. The redirect has to
   run as root, so pipe through sudo tee instead of using "sudo echo ... >".

  > echo 'KNOTD_ARGS="-C /var/lib/knot/confdb"' | sudo tee /etc/default/knot

3. Populate the config database

  > sudo nano /etc/knot/knot.conf

  Populate it as follows:

  server:
      rundir: "/run/knot"
      user: "knot:knot"
      listen: [ "<ipv4 interface address>@53", "<ipv6 interface address>@53" ]

  log:
    - target: "syslog"
      any: "info"

  database:
      storage: "/var/lib/knot"

  acl:
    - id: "allow_dynamic_updates"
      address: "127.0.0.1/32"
      action: "update"

  template:
    - id: "default"
      storage: "/var/lib/knot"
      file: "%s.zone"
      acl: "allow_dynamic_updates"
      zonefile-sync: "60"
      zonefile-load: "difference"
      journal-content: "all"

  zone:
    - domain: "postgres.ubicloud.com."

4. Create /var/lib/knot/postgres.ubicloud.com.zone with the following content

  > sudo nano /var/lib/knot/postgres.ubicloud.com.zone

  postgres.ubicloud.com.     3600 SOA ns.postgres.ubicloud.com. postgres.ubicloud.com. 23 86400 7200 1209600 3600
  postgres.ubicloud.com.     3600 NS  ns.postgres.ubicloud.com.
  ns.postgres.ubicloud.com.  3600 A   <ip-address-of-first-name-server>
  ns.postgres.ubicloud.com.  3600 A   <ip-address-of-second-name-server>

5. Restart the knot server

  > sudo systemctl restart knot

6. Now your DNS server can respond to queries. You can run the command below
   from another machine:

  > nslookup ns.postgres.ubicloud.com 88.198.87.151
  Server:   88.198.87.151
  Address:  88.198.87.151#53

  Name:     ns.postgres.ubicloud.com
  Address:  88.198.87.144
  Name:     ns.postgres.ubicloud.com
  Address:  88.198.87.151

7. If you want to make your name servers authoritative (from the client's
   perspective, because they are already authoritative from the server's
   perspective), you need to add NS records to the parent domain (wherever
   that is managed). For example, for Cloudflare, go to DNS settings and add
   the following records:

  ns.postgres  A   88.198.87.144
  ns.postgres  A   88.198.87.151
  postgres     NS  ns.postgres.ubicloud.com

  Now you can run nslookup without specifying the server.
  If you run it multiple times, you will see that sometimes your queries are
  served from 88.198.87.144 and sometimes from 88.198.87.151.

  > nslookup ns.postgres.ubicloud.com
  Server:   88.198.87.151
  Address:  88.198.87.151#53

  Name:     ns.postgres.ubicloud.com
  Address:  88.198.87.144
  Name:     ns.postgres.ubicloud.com
  Address:  88.198.87.151
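The nslookup check above can also be scripted. The sketch below uses Ruby's
standard Resolv library to collect the A records a resolver returns for a
name. The helper name a_records is made up for this example, and the live
query is left as a comment because it only works while the servers from the
steps above are reachable.

```ruby
require "resolv"

# Hypothetical helper: returns the sorted IPv4 addresses that `resolver`
# reports for `name`. `resolver` is anything that responds to getresources
# the way Resolv::DNS does.
def a_records(resolver, name)
  resolver
    .getresources(name, Resolv::DNS::Resource::IN::A)
    .map { |resource| resource.address.to_s }
    .sort
end

# Usage against one specific name server (requires network access), similar
# to `nslookup ns.postgres.ubicloud.com 88.198.87.151`:
#
#   Resolv::DNS.open(nameserver: ["88.198.87.151"]) do |dns|
#     puts a_records(dns, "ns.postgres.ubicloud.com")
#   end
```

If both A records from step 4 are in place, the helper should return both
addresses regardless of which name server answered.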
31 lines
1.2 KiB
Ruby
# frozen_string_literal: true

require_relative "../../model"

class DnsServer < Sequel::Model
  many_to_many :dns_zones
  many_to_many :vms

  include ResourceMethods

  def run_commands_on_all_vms(commands)
    vms.each do |vm|
      outputs = vm.sshable.cmd("sudo -u knot knotc", stdin: commands.join("\n")).split("\n")

      # Passing multiple commands to knotc via stdin is faster compared to running each
      # command one by one. However, this approach has one drawback; in stdin mode knotc
      # always exits with 0 exit code (i.e. no errors would be raised). At least, errors
      # are being written to stdout, so we can search them manually to see if we need to
      # raise any errors.
      outputs.each_with_index do |output, index|
        next if output == "OK"
        next if index == 0 && output.include?("no active transaction")
        next if commands[index].include?("zone-set") && output.include?("such record already exists in zone")
        next if commands[index].include?("zone-unset") && output.include?("no such record in zone found")

        raise "Rectify failed on #{self}. Command: #{commands[index]}. Output: #{output}"
      end
    end
  end
end
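To show the output check above in isolation, the sketch below applies the same
skip rules to a list of commands and the lines printed for them, without SSH.
The function name failed_commands, the command batch, and the sample output
lines are assumptions made for this example. The knotc subcommands used
(zone-abort, zone-begin, zone-set, zone-commit) exist, but the exact batch the
caller sends is not shown in this file, and only the matched substrings come
from the code above.

```ruby
# Hypothetical helper mirroring the checks in run_commands_on_all_vms.
# Returns the [command, output] pairs that would be treated as errors.
def failed_commands(commands, outputs)
  failures = []
  outputs.each_with_index do |output, index|
    command = commands[index]
    next if output == "OK"
    # An abort sent first is harmless when no transaction is open.
    next if index == 0 && output.include?("no active transaction")
    # Setting an existing record or unsetting a missing one is a no-op.
    next if command.include?("zone-set") && output.include?("such record already exists in zone")
    next if command.include?("zone-unset") && output.include?("no such record in zone found")

    failures << [command, output]
  end
  failures
end

zone = "postgres.ubicloud.com."
# Assumed batch; 192.0.2.10 is a documentation-only placeholder address.
commands = [
  "zone-abort #{zone}",
  "zone-begin #{zone}",
  "zone-set #{zone} test-pg-db 10 A 192.0.2.10",
  "zone-commit #{zone}"
]
# Made-up output lines, one per command.
outputs = [
  "error: (no active transaction)",
  "OK",
  "error: (such record already exists in zone)",
  "OK"
]

p failed_commands(commands, outputs) # prints []
```

Because the skip rules tolerate "already exists" and "no such record", sending
the same batch twice is treated as success, which is what lets the caller
re-run it safely.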