[HN Gopher] What happens when you update your DNS
       ___________________________________________________________________
        
       What happens when you update your DNS
        
       Author : kiyanwang
       Score  : 83 points
       Date   : 2020-06-22 06:15 UTC (16 hours ago)
        
 (HTM) web link (jvns.ca)
 (TXT) w3m dump (jvns.ca)
        
       | jrockway wrote:
       | This reminds me that I wish DNS had some way to define a load
       | balancing algorithm for clients to use, so browsers could make
       | load balancing decisions. This would eliminate the need for
       | virtual IP addresses, having to pass originating subnet
       | information up recursive queries, having to remove faulty VIPs
       | (or hosts) from DNS, etc.
       | 
       | It is baffling to me that inside the datacenter, I can control
       | the balancing strategy for every service-to-service transaction,
       | but for the end user's browser, all I can do is some L3 hacks to
       | make two routers appear as one (for failover purposes). L3
       | balancing would be completely unnecessary if I could just program
       | the user agent to go to the right host, after all. The end result
       | is unnecessary cost and complexity multiplied over a billion
       | websites.
        
         | m3047 wrote:
         | > [...] I wish DNS had some way to define a load balancing
         | algorithm for clients to use, so browsers could make load
         | balancing decisions.
         | 
         | There's actually the germ of an interesting idea in that
         | statement. If I'm going to go to the trouble, let's say, of
         | running a local TCP forwarder (good for the whole device), can
         | I run a packet sniffer at the same time and watch netflows and
         | edit the responses I return to the device based on what I see
         | performance-wise concerning those flows?
         | 
         | Expert me says that web sites are loaded with too much cruft
         | and since the far end terminations are spread far and wide,
         | there's not enough opportunity to apply that learning in any
         | practical sense. But I could be wrong.
         | (https://github.com/m3047/shodohflo)
        
         | LinuxBender wrote:
          | That is a use case for SRV records [1]; however, SRV was never
          | accepted into the HTTP protocol specification. I bring it up
          | every time there is a new protocol version, but I am too lazy
          | to write an RFC addendum for it and hope that someone else
          | will. Existing protocols may not be modified in this manner
          | once ratified. Maybe HTTP/4.0? /s
          | 
          | Some applications use SRV records for load balancing. Many VoIP
          | and video conferencing apps do this. There is a better list on
          | Wikipedia. The record format is:
          | 
          |   _service._proto.name.  TTL  class  SRV  priority weight port target.
         | 
         | [1] - https://en.wikipedia.org/wiki/SRV_record
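          | 
          | A rough dnspython sketch of how a client might consume an SRV
          | record (the service name here is a placeholder, and the sort is
          | a simplification; real clients do weighted-random selection
          | within a priority):
          | 
          |   import dns.resolver
          | 
          |   # Look up SRV records for a hypothetical SIP service and order
          |   # them lowest priority first, heavier weights preferred.
          |   answers = dns.resolver.resolve("_sip._tcp.example.com", "SRV")
          |   for rr in sorted(answers, key=lambda r: (r.priority, -r.weight)):
          |       print(rr.priority, rr.weight, rr.port, rr.target)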
        
           | jrockway wrote:
           | Yeah, I always liked SRV records. It seems that they proved
           | inadequate for gRPC balancing, so there are new experiments
           | in progress (mostly xDS).
        
         | 1996 wrote:
         | How would it be better than round robin DNS with low TTL?
        
           | jrockway wrote:
           | Basically, it affords you the ability to cache for longer and
           | still end up with users able to go to your website.
           | 
           | Right now, you can try resolving common hosts, and you will
           | see that they often provide several hosts in response to a
            | lookup. What the browser does with those IPs is up to the
            | browser; the standard does not define what to do. What the
            | administrator who sets up that record wants is "send to
            | whichever one of these seems healthy", and some browsers do
            | do that. Other browsers just pick one at random and report
            | failure, so your redundancy makes the system more likely to
            | break.
           | 
           | What I want is a way to define what to do in this case. Maybe
           | you want to try them all in parallel and pick the first to
           | respond (at the TCP connection level). Maybe you want to try
           | them sequentially. Maybe you want to open a connection to all
           | of them and send 1/n requests to each. Right now, there is no
           | way to know what the service intends, so the browser has to
           | guess. And each one guesses differently.
           | 
            | (You will notice that people like Google and Cloudflare
            | skillfully respond with only one record with a 5 minute TTL.
            | That makes the behavior of the browser well defined, but a
            | single bad reply cached for 5 minutes also burns roughly an
            | entire year of 99.999% uptime, which is only about 5.3
            | minutes of error budget. Your systems had better be very
            | reliable if one DNS issue can eat a year's worth of error
            | budget.)
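            | 
            | As a sketch of the "try them all in parallel" policy I have
            | in mind (the addresses below are placeholder documentation
            | IPs, not anyone's real hosts):
            | 
            |   import socket
            |   from concurrent.futures import ThreadPoolExecutor, as_completed
            | 
            |   # Race a TCP connection to every address the lookup returned
            |   # and keep whichever one answers first.
            |   addresses = ["192.0.2.10", "192.0.2.20", "192.0.2.30"]
            | 
            |   def connect(addr):
            |       return addr, socket.create_connection((addr, 443), timeout=2)
            | 
            |   winner = None
            |   with ThreadPoolExecutor(max_workers=len(addresses)) as pool:
            |       futures = [pool.submit(connect, a) for a in addresses]
            |       for fut in as_completed(futures):
            |           try:
            |               addr, sock = fut.result()
            |           except OSError:
            |               continue          # that address did not respond
            |           if winner is None:
            |               winner = addr     # first successful connection wins
            |               print("using", addr)
            |           else:
            |               sock.close()      # close the slower connections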
        
         | aeden wrote:
         | FWIW, there is an IETF draft that may be suitable for
         | addressing this: https://datatracker.ietf.org/doc/draft-ietf-
         | dnsop-svcb-https...
        
           | jrockway wrote:
           | Aha, this sounds like exactly what I was looking for!
        
         | m3047 wrote:
         | DNS has its own load balancing at several levels (and several
         | different kinds):
         | 
         | Nameserver records (NS records) used to locate a resource are
         | served by other nameservers. NS records are chosen from among
         | those offered in response to a query (RRs), and should all be
          | tried if necessary to elicit a response. The algorithm isn't
          | strictly specified: some nameservers will shuffle the order in
          | which they return RRs in their answers, and some won't,
          | assuming the stub resolver or app will do it. The foregoing
          | also applies to A and AAAA records (returning IP addresses for
          | names), and this has long been used as a quick and easy form of
          | load balancing/failover, except that it doesn't really fail
          | over very well unless your app is coded to try all of the
          | different answers (and the stub resolver returns them to your
          | app).
         | 
         | Nameservers querying other nameservers (caching/recursive
         | resolvers) are supposed to compile metrics on response times
         | when they make upstream requests and pick the fastest upstreams
         | once they learn them.
         | 
         | Stub resolvers (running on your device) typically query
         | nameservers in the order you specified them in your network
         | config, but not always.
         | 
         | From the foregoing, you can probably see that running a
         | caching/recursive resolver close to your devices is supposed to
         | be desirable, by design.
         | 
         | So far, so far. ;-)
         | 
          | As specified, and it's never been changed, DNS tries UDP first.
          | "Ok", you think, "that must mean it will then try TCP", but
          | that's not actually true: it only tries TCP if it receives a
          | UDP response with TC=1 (flagged as truncated). If the UDP
          | response is fragmented and some fragments are lost, or if no
          | UDP response arrives at all, it /never/ tries TCP.
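          | 
          | A minimal illustration of that rule, assuming dnspython and a
          | placeholder resolver address:
          | 
          |   import dns.flags
          |   import dns.message
          |   import dns.query
          | 
          |   # UDP first; switch to TCP only if the UDP reply has TC=1.
          |   # A lost (or fragmented-and-dropped) UDP reply never triggers
          |   # the TCP retry; the query simply times out.
          |   query = dns.message.make_query("jvns.ca", "TXT")
          |   response = dns.query.udp(query, "192.0.2.53", timeout=3)
          |   if response.flags & dns.flags.TC:
          |       response = dns.query.tcp(query, "192.0.2.53", timeout=3)
          |   print(response.answer)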
         | 
         | You're mixing two very different environments above: 1) a
         | datacenter with (let's just assume) VPCs and 2) a web browser.
         | 
          | In case #2 I'll match your ante and raise you an overloaded
          | segment which is dropping UDP packets, in which case stuff may
          | fail to resolve at all. Oh look, I drew a wildcard:
          | traditionally browsers have utilized the device's stub
          | resolver, but since they've pushed ahead with DoH they've had
          | to implement their own. People think I'm a DNS expert (what do
          | they know?), and I guess the conventional wisdom amongst myself
          | and my peers is that UDP should perform better than TCP, but
          | anecdotally people are claiming that DoH and DoT perform better
          | for them than their stub resolver. "Must be your ISP messing
          | with you", says someone, "yeah right, that's gotta be it". Me:
          | "did you try running your own local resolver?" Them: "wuut?"
         | 
          | So here's where I confess that the experts aren't always right,
          | because I run my own local resolver and I have the same
          | problem: when the streaming media devices are running, DNS
          | resolution on the wifi-connected laptop sucks, and if I run a
          | TCP forwarder it starts working!
          | (https://github.com/m3047/tcp_only_forwarder).
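          | 
          | The idea fits in a few lines of dnspython if you want to play
          | with it (this is just a sketch, not the linked project; the
          | upstream and the port are arbitrary choices):
          | 
          |   import socket
          | 
          |   import dns.message
          |   import dns.query
          | 
          |   # Toy UDP-to-TCP forwarder: accept ordinary UDP queries from
          |   # local clients and relay each one upstream over TCP.
          |   UPSTREAM = "8.8.8.8"
          | 
          |   listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
          |   listener.bind(("127.0.0.1", 5300))
          |   while True:
          |       wire, client = listener.recvfrom(4096)
          |       query = dns.message.from_wire(wire)
          |       try:
          |           response = dns.query.tcp(query, UPSTREAM, timeout=5)
          |       except Exception:
          |           continue   # on upstream failure, let the client retry
          |       listener.sendto(response.to_wire(), client)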
         | 
         | Now to case #1, the datacenter. I hope you're running your own
         | authoritative and caching server, and you should read about
         | views in your server config guide; using EDNS to pass subnet
         | info is a kludge. If you're writing datacenter apps, you should
         | consider doing your own resolution and using TCP (try the
         | forwarder, I dare you), and provisioning accordingly (because
         | DNS servers assume most requests will come in via UDP).
         | 
         | If you want load balancing "you know, like nginx" I've got news
         | for you: BIND comes with instructions for configuring nginx as
         | a reverse TCP proxy. Oh! Looks like I've got a straight in a
         | single suit: nginx provides SSL termination so I've got DoT for
         | free!
        
           | jrockway wrote:
           | I am not really talking about load balancing the DNS traffic,
           | I'm talking about interpreting the response of the DNS query.
           | (The reliability at the network level seems to be handled by
           | moving everything to DNS-over-HTTPS or something, and is a
           | debate for another day.)
           | 
            | For example, consider the case where you resolve
            | ycombinator.com. You get:
            | 
            |   ycombinator.com.   59   IN   A   13.225.214.21
            |   ycombinator.com.   59   IN   A   13.225.214.51
            |   ycombinator.com.   59   IN   A   13.225.214.81
            |   ycombinator.com.   59   IN   A   13.225.214.73
           | 
            | Which of those hosts should I open a TCP connection to in
            | order to begin speaking TLS/ALPN/HTTP2? The standard doesn't
            | say. I
           | would like a standard that says what to do. (The more
           | interesting case is, say I pick 13.225.214.21 at random. It
           | doesn't respond. What do I do now? Tell the user
           | ycombinator.com is down? Try another one? All of this could
           | be defined by a standard ;)
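            | 
            | For instance, the "try them sequentially" policy I'd like to
            | be able to declare looks roughly like this (a sketch, using
            | the addresses from the lookup above):
            | 
            |   import socket
            | 
            |   # Try each address in order and use the first one that
            |   # accepts a TCP connection; only if all of them fail do we
            |   # report the site as down.
            |   addresses = ["13.225.214.21", "13.225.214.51",
            |                "13.225.214.81", "13.225.214.73"]
            |   sock = None
            |   for addr in addresses:
            |       try:
            |           sock = socket.create_connection((addr, 443), timeout=2)
            |           break      # connected; speak TLS/ALPN over this socket
            |       except OSError:
            |           continue   # this host didn't answer, try the next one
            |   if sock is None:
            |       print("ycombinator.com appears to be down")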
        
       | JoshMcguigan wrote:
       | DNS infrastructure is really interesting. I did a bit of a deep
       | dive on it a few months ago, culminating in running my own
       | authoritative name servers [0] for a while.
       | 
       | [0]: https://www.joshmcguigan.com/blog/run-your-own-dns-servers/
        
         | rhizome wrote:
         | One neat way of retaining that control is running your own
         | SOA(s), but getting robust secondaries and listing _those_ in
         | WHOIS so that they take all of the wild queries. Then you just
         | work with your little SOA and everything just propagates as
          | necessary and you don't get hammered.
        
       | ricardo81 wrote:
       | Recursive DNS servers can also throw you off the scent a bit by
       | giving you an answer that is not the same as the authoritative
       | server.
       | 
        | I've seen 8.8.8.8 return something other than NXDOMAIN for some
        | domains that do not exist.
        | 
        | Cloudflare will not honour DNS ANY requests.
       | 
       | Knowing how to query the authoritative nameservers is a handy
       | tool for debugging.
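        | 
        | With dnspython, skipping the recursive layer looks roughly like
        | this (example.com is a placeholder):
        | 
        |   import dns.message
        |   import dns.query
        |   import dns.resolver
        | 
        |   # Find the zone's authoritative servers, then query one of them
        |   # directly instead of going through a caching resolver.
        |   domain = "example.com"
        |   ns_name = dns.resolver.resolve(domain, "NS")[0].target.to_text()
        |   ns_addr = dns.resolver.resolve(ns_name, "A")[0].to_text()
        |   query = dns.message.make_query(domain, "A")
        |   print(dns.query.udp(query, ns_addr, timeout=3).answer)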
        
       | eat_veggies wrote:
       | One of my favorite revelations about the network tracing tools
       | (things like `traceroute` and `dig +trace`) that might not be
       | obvious for people like me who work higher up in the stack, is
       | that the data they provide isn't usually made available during
       | "normal" usage. Packets don't just phone home and tell you where
       | they've been. Something else is going on.
       | 
       | When you send a DNS query to a recursive server like your ISP's
       | or something like 1.1.1.1, you make a single DNS query and get
       | back a single response, because the recursive DNS server handles
       | all the different queries that Julia outlines in the post. As the
       | client, we have no idea what steps just happened in the
       | background.
       | 
        | But when you run `dig +trace`, dig is actually _pretending to be
        | a recursive name server_, and making all those queries _itself_
        | instead of letting the real recursive name servers do their work.
        | It's a fun hack but that means it's not always 100% accurate to
        | what's going on in the real world [0]
       | 
       | [0] https://serverfault.com/questions/482913/is-dig-trace-
       | always...
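        | 
        | If you want to see what that looks like in code, here is a very
        | rough dnspython sketch of the same idea (it starts at one root
        | server address and cheats by using the normal resolver to find
        | each referral's nameserver address):
        | 
        |   import dns.message
        |   import dns.query
        |   import dns.resolver
        | 
        |   # Follow referrals down from the root ourselves instead of
        |   # asking a recursive resolver to do it for us.
        |   server = "198.41.0.4"   # a.root-servers.net
        |   name = "jvns.ca."
        |   for _ in range(10):     # a real iterator also handles CNAMEs,
        |                           # glue, and error responses
        |       q = dns.message.make_query(name, "A")
        |       resp = dns.query.udp(q, server, timeout=3)
        |       if resp.answer:     # an authoritative server answered
        |           print(resp.answer)
        |           break
        |       # otherwise it's a referral: descend via one of the NS
        |       # records in the authority section
        |       ns_name = resp.authority[0][0].target.to_text()
        |       server = dns.resolver.resolve(ns_name, "A")[0].to_text()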
        
         | nijave wrote:
         | Yup, and to complicate matters more those resolvers you're
         | talking to may be talking to more caching resolvers.
         | 
         | For a given application server it might be:
         | 
         | - check local dns caching resolver
         | 
         | - check local network caching resolver
         | 
         | - if internal domain, check local authoritative resolver
         | 
         | - if public domain check isp resolver
         | 
         | - recursively resolve from there
        
         | LogicX wrote:
          | Just to add to the discussion -- 'what's happening in the
          | background' -- more specifically, is your operating system's
          | stub resolver.
         | 
         | So when you ask for www.amazon.com it ends up making multiple
         | DNS lookups, as www.amazon.com is a CNAME record.
         | 
         | Nothing about this CNAME lookup gets passed back up the stack
         | to your application; you just get that end-result: the IP
         | address.
         | 
          |   host www.amazon.com
          |   www.amazon.com is an alias for tp.47cf2c8c9-frontier.amazon.com.
          |   tp.47cf2c8c9-frontier.amazon.com is an alias for www.amazon.com.edgekey.net.
          |   www.amazon.com.edgekey.net is an alias for e15316.e22.akamaiedge.net.
          |   e15316.e22.akamaiedge.net has address 23.204.68.114
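          | 
          | If you want to see the chain from code, the full answer section
          | is there; the stub API just hides it. A small dnspython sketch:
          | 
          |   import dns.resolver
          | 
          |   # The recursive resolver returns each CNAME step plus the
          |   # final A records in the answer section, even though the
          |   # stub-style API only hands back the IP addresses.
          |   answer = dns.resolver.resolve("www.amazon.com", "A")
          |   for rrset in answer.response.answer:
          |       print(rrset)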
        
         | dgl wrote:
         | dig +trace takes one path, there's also tools like dnstrace
         | that attempt to show all the paths:
         | https://github.com/rs/dnstrace
         | 
          | Still, there can be caches that don't quite agree, as the other
          | comment mentions.
        
         | fragmede wrote:
          | In particular, one of your ISP's (or their ISP's) DNS servers
          | may be caching a record for longer than it's supposed to and
          | will return incorrect, expired data.
         | 
          | The other possibility is different IPs being returned by a DNS
          | server based on where a query is coming from, e.g. a CDN. If
         | you're in location A and your ISPs DNS server is in location B,
         | the CDN's DNS server may return a different IP based on if the
         | request is coming from A or B. ECS [0] is supposed to mitigate
         | this, but may or may not be used.
         | 
         | [0] https://en.wikipedia.org/wiki/EDNS_Client_Subnet
        
           | qes wrote:
            | > In particular, one of your ISP's (or their ISP's) DNS
            | servers may be caching a record for longer than it's supposed
            | to and will return incorrect, expired data.
           | 
           | It's disturbing how many clients we'll see hitting an old IP
           | address for 30 days after a change.
        
       | muppetman wrote:
        | Glad to see this. One of my (stupid) pet peeves is people who
        | say "You have to wait for the DNS to propagate". DNS _does not_
        | propagate. What you're actually waiting for is the cache TTL to
        | expire, so the name servers that have cached the old record have
        | to query for the real answer again and pick up the newly
        | published information. Of course it looks exactly like it "takes
        | time to propagate", which is why it's actually a pretty sound
        | description of what's happening, and thus why it's a stupid pet
        | peeve. Pointless rant ends.
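        | 
        | You can watch the countdown yourself. A small dnspython sketch,
        | using a public recursive resolver:
        | 
        |   import dns.resolver
        | 
        |   # A cache reports the *remaining* TTL, which counts down toward
        |   # zero; only when it hits zero does the resolver go back to the
        |   # authoritative servers. Re-running this shows the number
        |   # shrinking until it resets.
        |   res = dns.resolver.Resolver(configure=False)
        |   res.nameservers = ["8.8.8.8"]
        |   ans = res.resolve("jvns.ca", "A")
        |   print(ans.rrset.ttl, [r.to_text() for r in ans])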
        
         | gerdesj wrote:
          | Don't forget negative caching. Windows famously fucks up here.
          | A DNS lookup these days costs next to nothing in the grand
          | scheme of things, and yet Windows still insists on caching a
          | failed lookup for five minutes.
         | 
         | So you fire up cmd.exe and issue ifconfig /releasedns, ...,
         | ipconfig /?, ipconfig /flushdns and then you go back to pinging
         | the bloody address instead of using nslookup because you
         | learned from another idle/jaded sysadmin to use ping as a
         | shortcut to querying DNS, instead of actually querying what the
         | DNS servers respond with.
         | 
         | Obviously, a better thing to do when checking your DNS entries
         | is dig out ... dig.
         | 
         | DNS _changes_ _do_ propagate: from the one you edited to the
         | others via zone transfers and the like (primary to secondary
         | etc) and thence to caching resolvers.
        
         | rovr138 wrote:
         | The change is propagating through the network, but it's not a
         | push like most would assume based on the wording
        
         | HenryBemis wrote:
          | Old guy here: maybe people were confusing DNS with the WINS
          | service that was helping to propagate ("replicate") name/server
          | changes 20 years ago?
        
         | tialaramex wrote:
         | Yes, I'm annoyed about this too.
         | 
          | The most egregious case I've seen was an Amiga site. The site
          | went down, and for several _days_ it reported that users would
          | need to wait for the updated records to propagate, and lots of
          | loyal fans were insisting anybody who couldn't read the site
          | was just being too impatient.
         | 
         | What was actually wrong? They wrote their new IP address as a
         | DNS name in their DNS configuration rather than as an IP
         | address. Once they fixed that it began working and they acted
         | as though that was just because now it had successfully
         | propagated.
         | 
         | On the other hand propagation _is_ a thing when it comes to
         | distributing modified DNS records to multiple notionally
         | authoritative DNS servers.
         | 
         | This can be a problem for using Let's Encrypt dns-01 challenges
         | for example, especially with a third party DNS provider.
         | 
          | Suppose you write a TXT record to pass dns-01 and get a
          | wildcard certificate for your domain example.com. You submit
          | it to your provider's weird custom API and it says OK.
          | Unfortunately, all it really did was write the updated TXT
          | record to a text file on an SFTP server. Each of the provider's
          | (say) three authoritative DNS servers (mango, lime, kiwi)
          | checks this server every five minutes, downloads any updated
          | files and begins serving the new answers.
         | 
         | Still they said OK, so you call Let's Encrypt and say you're
         | ready to pass the challenge. Let's Encrypt calls authoritative
         | server kiwi, which has never seen this TXT record and you fail
         | the challenge.
         | 
         | So you check DNS - your cache infrastructure calls lime, which
         | has updated and gives the correct answer, it seems like
         | everything is fine, so you report a bug with Let's Encrypt. But
         | nothing was wrong on their side.
         | 
          | Now, unlike typical "DNS propagation" myths, the times for
          | authoritative servers are usually minutes and can be only
          | seconds for a sensible design (an SFTP server is not a sensible
          | design), so you can just add a nice generous allowance of time
          | and it'll usually work. But clearly the Right Thing(tm) is to
          | have an API that actually confirms the authoritative servers
          | are updated before returning OK.
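          | 
          | If your provider won't do that for you, you can approximate it
          | yourself by asking each authoritative server directly before
          | triggering the challenge. A dnspython sketch (names and token
          | are placeholders):
          | 
          |   import dns.message
          |   import dns.query
          |   import dns.resolver
          | 
          |   # Only tell the CA to validate once every authoritative
          |   # server is serving the new TXT record.
          |   domain = "example.com"
          |   challenge = "_acme-challenge." + domain
          |   token = "EXPECTED-VALIDATION-STRING"
          | 
          |   for ns in dns.resolver.resolve(domain, "NS"):
          |       addr = dns.resolver.resolve(ns.target.to_text(), "A")[0].to_text()
          |       q = dns.message.make_query(challenge, "TXT")
          |       resp = dns.query.udp(q, addr, timeout=3)
          |       served = [rd.to_text() for rrset in resp.answer for rd in rrset]
          |       ready = any(token in s for s in served)
          |       print(ns.target, "ready" if ready else "NOT ready")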
        
       | asciimike wrote:
       | https://howdns.works is one of my favorite educational booklets
       | on the subject. Not as in depth as many other resources, but
       | highly amusing and fairly sticky.
        
         | logikblok wrote:
         | This is brilliant thanks.
        
       ___________________________________________________________________
       (page generated 2020-06-22 23:00 UTC)