Definitive Guide to DNS TTL Settings

DNS is a foundational piece of technology. Nearly every higher-level network request (web searches, email, and most other internet traffic) relies on the ability to resolve DNS lookups, that is, to translate names like some.domain.org to IP addresses or other domains.

We wanted to write about Time To Live (TTL) because most sysadmins don’t interact with DNS configurations on a daily basis, and much of the information that’s out there is based upon half-remembered war stories handed down from the generations of sysadmins who came before us.

We asked on Twitter and there were some sysadmins who weren’t even exactly sure what TTL stood for (though thankfully most of them did).

To help with this situation, we are going to cover:

  1. DNS and TTL Basics
  2. DNS TTL Troubleshooting
  3. DNS Best Practices for Change Management
  4. DNS Tools
  5. Next Steps

1. DNS BASICS

What is a DNS Record?

Domain Name System (DNS) records specify two important things:

  • Where requests for an entry should be pointed (resolved) to.
  • How long the record can be cached before it needs to be requested again. This is ominously called the Time To Live (TTL) of the record.

Why is DNS cached?

Most organizations set up DNS records and then don’t change them for years. Since they’re often requested but infrequently updated, caching DNS records is very effective in improving network performance at the cost of increasing the complexity of reasoning about and troubleshooting DNS issues.

What’s a TTL?

A Time to Live represents how long *each step* of the DNS resolution chain will cache a record – and it’s tracked in seconds (hang on, that bit will be important later.)

This doesn’t really capture the nuance of the situation though: even though it’s definitely “Time to Live”, it might make more sense if you think of it as “Time To Lookup” or “How long to keep this DNS record in cache”.

What are typical TTL times for DNS records?

Time to Live values are always represented in seconds. Most DNS setup configuration services provide you a preset list of values to set your records to.

300 seconds = 5 minutes = “Very Short”
3600 seconds = 1 hour = “Short”
86400 seconds = 24 hours = “Long”
604800 seconds = 7 days = “Insanity”

How do DNS Lookups Work?

When you type a URL into your browser, a whole series of lookups is kicked off.

The following questions are asked at every step in this process (and there are often more steps than you’d expect):

  1. Do we have this record cached?
  2. If it is cached, is the TTL still valid?

If the answer to either of these questions is “No” then the request is moved to the next step up the chain.
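The two questions above can be sketched as a small cache class. This is a minimal illustration, not how any particular resolver is implemented: the `TtlCache` class, the `authoritative` function, and the example record are all hypothetical. The sketch also assumes a cache hands the remaining TTL to whoever asked, which is how well-behaved resolvers work, though real caches vary.

```python
import time


class TtlCache:
    """One caching step in the chain (browser, OS, resolver, ...)."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream   # callable: name -> (value, ttl_seconds)
        self.clock = clock         # injectable so the cache can be tested
        self.entries = {}          # name -> (value, expires_at)

    def resolve(self, name):
        now = self.clock()
        cached = self.entries.get(name)
        # Question 1: is it cached?  Question 2: is the TTL still valid?
        if cached is not None and cached[1] > now:
            value, expires_at = cached
            return value, expires_at - now  # pass on the remaining TTL
        # One of the answers was "No": ask the next step up the chain.
        value, ttl = self.upstream(name)
        self.entries[name] = (value, now + ttl)
        return value, ttl


def authoritative(name):
    # Hypothetical zone data: a single record with a 300 second TTL.
    records = {"www.example.com": ("192.0.2.10", 300)}
    return records[name]


resolver = TtlCache(authoritative)    # e.g. an ISP resolver
laptop = TtlCache(resolver.resolve)   # e.g. the OS cache on a laptop
value, ttl = laptop.resolve("www.example.com")
print(value)  # 192.0.2.10
```

Chaining two `TtlCache` objects mirrors the "next step up the chain" idea: the laptop cache only bothers the resolver when its own entry is missing or expired.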

Why DNS is about Network Connections not Devices

Troubleshooting DNS is difficult not just because the TTL and caching system introduce complexities, but because many modern devices connect via different networks and chains of DNS Servers.

Consider an ordinary laptop computer. Mine is more or less glued to my desk, and although it hasn’t moved in weeks, I’ve connected to:

  • My normal home wi-fi / cable network
  • My cell phone when cable didn’t work
  • Both of the above, but connected over VPN

Every time that a network is switched, a new DNS chain is brought into effect. If you happen to be in the midst of making changes, the servers and cache locations in the DNS chain may or may not have the correct information.

This often happens on corporate networks where the Active Directory domain is the same as the company’s website. An external DNS server (at the ISP level) has a DNS record that points ‘www.example.com’ to the correct web server IP address/CNAME, but on the internal DNS server used by Active Directory the records haven’t been mirrored.

So you’ll have people in a panic. “The webserver is down!” “the sky is falling” “where are my pants?”…and when you begin troubleshooting, you’ll find that what actually happened is that they left their VPN connected.


2. DNS TTL TROUBLESHOOTING

How long will it take my DNS to update?

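To calculate a conservative maximum (worst case) amount of time between when you update a DNS record and when you can feel confident that every client references the new value, multiply the number of caching steps (not counting the authoritative server) by the TTL value. This assumes every cache in the chain restarts the full TTL countdown; resolvers that pass along the remaining TTL will converge much sooner, typically within a single TTL.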

So if your TTL is 3600 seconds (1 hour) and there are 5 steps, it shouldn’t take more than 18,000 seconds (5 hours) for changes to fully propagate.
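The arithmetic in that example can be written out as a short sketch. The function names are made up for illustration, and the model is deliberately pessimistic: it treats every caching step as waiting out a full TTL, one after the other.

```python
def worst_case_propagation_seconds(ttl_seconds, caching_steps):
    """Pessimistic estimate: every caching step waits out a full TTL in turn."""
    return ttl_seconds * caching_steps


def format_duration(seconds):
    """Render a number of seconds as hours and minutes."""
    hours, remainder = divmod(seconds, 3600)
    return f"{hours}h {remainder // 60}m"


worst = worst_case_propagation_seconds(ttl_seconds=3600, caching_steps=5)
print(worst, format_duration(worst))  # 18000 5h 0m
```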

But wait!

How much does a DNS lookup cost?

When you ask how much a DNS lookup “costs” you’re not usually worrying about money. You’re worrying about time. Depending on the usual menagerie of internet network gremlins, a DNS request typically takes between 100 and 200 milliseconds to complete.

While this is a very small amount of time, consider a webpage. Every image, CSS file, and JavaScript file referenced on a page needs to have its hostname resolved. Without caching you’d have greatly increased load times.

Naive DNS Cost Calculations

This is “naive” as it’s unlikely that every asset on your website will be served from a different domain and browsers have lots of nice caching tricks and strategies to make things faster than this simplified model of how they work.

With Caching
(30 image files * 50ms to download each) + (100 ms one time lookup of the DNS which is then cached) = 1600ms

Without Caching
(30 image files * 50ms to download each) + (30 * 100ms DNS lookups) = 4500ms
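The same naive model as a few lines of Python, in case you want to plug in your own numbers. The function and parameter names are invented for this sketch; the model ignores parallel downloads and everything else a real browser does.

```python
def naive_page_load_ms(assets, download_ms, lookup_ms, dns_cached):
    """Naive model: assets load one after another; DNS is looked up once or every time."""
    lookups = 1 if dns_cached else assets
    return assets * download_ms + lookups * lookup_ms


with_caching = naive_page_load_ms(30, 50, 100, dns_cached=True)
without_caching = naive_page_load_ms(30, 50, 100, dns_cached=False)
print(with_caching)     # 1600
print(without_caching)  # 4500
```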

Why isn’t my DNS updating?

On top of that, there can be additional factors that extend the propagation time beyond the base calculation. Some examples include:

  • Web browsers internally caching DNS entries for periods of time that aren’t controlled by the TTL, in an attempt to seem “faster”. For example, modern Internet Explorer versions cache DNS for 30 minutes by default (prior to IE 4 it was 24 hours) and will ignore TTLs below that.
  • Mobile ISPs may seek to reduce overall traffic by increasing TTL times, lowering how often requests are made.
  • Complex internal networks with more DNS servers than you would anticipate will naturally take longer to update.

The above is the reason why most services state something like: “It can take days for your DNS changes to fully propagate, plan accordingly.”

Is there any way to force a client to update their DNS record remotely?

This question is typically asked in the context of: “I updated my DNS records and now a client can’t reach some site, how do I force the update to take place?”

Unfortunately the answer is “no”. There is no DNS configuration command that you can enter to force early updates from downstream clients.

There are commands you can run that will clear out DNS entries from a local cache, but typically these don’t work as effectively as you’d hope, since you still have upstream (ISP DNS caching) and downstream (Browser DNS caching) occurring.

Your best bet is to change the TTL for your records ahead of time.


3. DNS BEST PRACTICES FOR CHANGE MANAGEMENT

What’s better: a short or a long TTL?

Developers have long waged holy wars over whether code indentations should be tabs or spaces. I’ve found that network admins feel similarly about TTL length.

Typically what informs this opinion is whatever network configuration attack/debacle they were previously involved with.

A DDoS attack that disrupts root/ISP level DNS servers for 12 hours will have less of an impact on sites with a very long TTL. The long TTL lets clients keep working, even when the DNS server is offline or overwhelmed.

But, if you’re in the midst of switching web or email hosts and you happen to typo a record, the last thing you want to do is to have that change irrevocably stick around for the next 12 hours. And so you have people who advocate TTLs of a minute.

My strong personal preference is to have short (less than 1 hour/3600 seconds) DNS TTLs.

How do I know when a client will request the updated DNS record?

It’s very difficult to estimate when all clients will be updated.

See, Time To Live is *not* a freshness date. Do not consider DNS TTLs a “Best By: ” date on a stale loaf of bread – it’s not a singular time when a record goes from good to bad and needs to be replaced.

DNS is much more like an org chart, where as you make changes, the changes slowly propagate out to the entire network as time passes and clients “lower” in the chart have their caches expire and request the record from the DNS server higher in the list.

What’s the Best Practice for changing a DNS record?

For something relatively simple like modifying a single record for a domain, it might feel like overkill to have a “plan” or “strategy”, but given the very public severity of screwing up DNS, some caution is warranted. It’s like the old saying: “A packet of prevention is worth a pound of cure.”

There’s a simple way to limit your mistakes: never update both a DNS record and a TTL for that record at the same time. Ideally you’ll have a process of:

  1. Drop TTL on the DNS record to something very low a couple of days before you actually need to make the switch. Ex: 300 seconds
  2. Change the actual record on your cutover date.
  3. Several days after you’ve made the switch, up the TTL to something higher.
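One way to turn the three steps above into dates is sketched here. The function name and the defaults are assumptions made for illustration: "a couple of days" is taken as 2 days and "several days" as 3. The one piece of real logic is that the lead time is never shorter than the old TTL, since caches can keep the old TTL value until it expires.

```python
from datetime import datetime, timedelta


def plan_cutover(cutover_at, old_ttl_seconds,
                 lead_time=timedelta(days=2), settle_time=timedelta(days=3)):
    """Return the dates for the lower / change / raise steps."""
    # Caches may hold the record with its old TTL until that TTL runs out,
    # so the TTL must be lowered at least one old-TTL period before cutover.
    lead = max(lead_time, timedelta(seconds=old_ttl_seconds))
    return {
        "lower_ttl_by": cutover_at - lead,
        "change_record_at": cutover_at,
        "raise_ttl_after": cutover_at + settle_time,
    }


plan = plan_cutover(datetime(2024, 1, 10, 9, 0), old_ttl_seconds=86400)
for step, when in plan.items():
    print(step, when.isoformat())
```

With a 7 day (604800 second) TTL on the old record, the same call would push the first step back a full week before the cutover.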

What’s the Best Practice for adding a new DNS record?

Adding new records is simpler than modifying existing ones.

  1. Add the record with a low TTL.
  2. After you’re sure everything works, raise the TTL.

What’s the most common TTL setting?

There’s so much controversy around what your TTL settings *should* be that we thought we’d try to generate some hard data. The Moz Top 500 sites is a nice cross section of websites and they’ve already done the hard work of putting them all into a CSV file.

I wrote a quick script to iterate through the list and look up the current TTL of the primary record for each domain. Like any type of data project, depending on how you ask the question, this data could be wildly off base. It’s not a broad enough sample, it’s pulling the current (cached) results, etc. etc. With that disclaimer out of the way, there are still some great insights to pull from the results.

TTL Analysis of the Top 500 Moz domains

View/Modify the script or download and run it yourself at: https://gist.github.com/mbuckbee/79b2e76bd9271bea38487defd8a9138b

See the list and download the CSV at: https://moz.com/top500

Lowest TTL: 1
Highest TTL: 129,540
Domains Resolved: 485
Average TTL: 6468
Median TTL: 300
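If you collect your own list of TTLs, summary figures like these take only a few lines to compute. This is a sketch using Python’s standard library rather than the script linked above, and the sample values are made up to show the shape of the calculation; they are not the Moz data.

```python
from statistics import mean, median


def summarize_ttls(ttls):
    """Compute the same kind of summary figures for a list of TTLs (seconds)."""
    return {
        "lowest": min(ttls),
        "highest": max(ttls),
        "resolved": len(ttls),
        "average": round(mean(ttls)),
        "median": median(ttls),
    }


# Hypothetical sample, not real measurements.
sample_ttls = [1, 60, 300, 300, 3600, 86400]
print(summarize_ttls(sample_ttls))
```

A handful of very long TTLs drags the average far above the median, which is why the median is the more useful figure for "what do most sites do".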

The lowest values are coming from domains that are doing very rapid DNS changes for load balancing purposes. The highest are from domains that haven’t been updated in a long time (python.org I’m looking at you).

From the standpoint of needing to defend the decision to have a short (sub 1 hour, 3600 seconds) TTL, you can point to the median value of 300s (5 minutes) and state confidently that you have empirical evidence of what the setting should be.


4. DNS PLATFORM TOOLS

How do I check the TTL of a DNS record on Windows?

The nslookup utility (https://technet.microsoft.com/en-us/library/bb490950.aspx) is the easiest way to check DNS records on Windows.

Example: C:\>nslookup -type=cname -debug www.varonis.com

The TTL is listed at the bottom of the output. “Non-authoritative answer” indicates that this is the TTL as seen from the client (that we have 2 mins and 11 secs until the local client checks the next level up in DNS).

How do I check the TTL of a DNS record on Unix/Linux/Mac?

For Unix (and derived) systems, the dig command is used for DNS troubleshooting.

Example: dig www.varonis.com

The TTL is the number in the second column of each line in the ANSWER SECTION of the output, shown in seconds.
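If you want to pull the TTL out of dig output in a script, each answer line has the fields name, TTL, class, type, and data, separated by whitespace. Here is a minimal sketch of parsing one such line; the function name and the sample line (with a made-up address) are illustrative only.

```python
def parse_answer_line(line):
    """Split one line of dig's ANSWER SECTION into its five fields."""
    name, ttl, record_class, record_type, data = line.split(None, 4)
    return {
        "name": name,
        "ttl": int(ttl),        # seconds left before this cache re-queries
        "class": record_class,
        "type": record_type,
        "data": data.strip(),
    }


# Hypothetical answer line, formatted the way dig prints records.
sample = "www.example.com.\t131\tIN\tA\t192.0.2.10"
record = parse_answer_line(sample)
print(record["ttl"])  # 131
```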

How do I check a DNS record from the Web?

You might not always be at your computer when you need to check a DNS record. A handy (and free) version of dig is available online from Google at: https://toolbox.googleapps.com/apps/dig/

The TTL is listed with each record in the results.

How do I test for DNS TTL propagation?

If you’re trying to find out if a specific DNS server has been updated with your new DNS settings, all DNS tools (dig, nslookup, etc) will let you specify what DNS server you’d like to run your queries against instead of the default local setup.
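With dig, the server to query is given as `@server` on the command line. The sketch below builds that command for a few resolvers so you can compare their answers. It assumes dig is installed and on your PATH; the helper names are invented, and the resolver addresses in the usage comment are just well-known public examples.

```python
import subprocess


def dig_command(name, server, record_type="A"):
    """Build a dig command that asks one specific DNS server."""
    # +noall +answer trims the output down to just the answer records.
    return ["dig", f"@{server}", name, record_type, "+noall", "+answer"]


def query_server(name, server, record_type="A"):
    """Run dig against the given server and return its answer lines."""
    result = subprocess.run(
        dig_command(name, server, record_type),
        capture_output=True, text=True, check=True, timeout=10,
    )
    return result.stdout.splitlines()


# Example usage (requires dig and network access):
# for server in ["8.8.8.8", "1.1.1.1"]:
#     print(server, query_server("www.varonis.com", server))
```

If different servers return different values or very different TTLs right after a change, that usually just means some caches haven’t expired yet.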

To get a broader picture of your changes, I recommend whatsmydns.net. It checks many of the top (ISP level) DNS servers, so it will let you know if something has gone really wrong.

5. DNS TTL NEXT STEPS

Setting DNS TTLs can be tricky, but if you set them to short (less than an hour) values you’ll preserve your sanity and better prepare your network to handle changes.

If you’ve liked this article, we’d recommend you also check out our Web Security Fundamentals course, which will help you better secure the site or application that you just set up DNS for. It’s free and well worth your time.


  1. This seems to be inaccurate in its description of TTL, and it seems to imply that all caching clients would multiply the TTL. AFAIK this is both incorrect in most cases and also unlikely to have as many as 5 steps. (Normally a DNS request would only have 3 steps when thinking about TTL in isolation, of which only 1 step would be relevant.)
    If there is a DNS forwarder chain in use at our ISP or company, this might add some (read: 1) level of DNS, as it would make no sense to forward DNS to another forwarder, making the chain 4 instead of 3 steps. However, the forwarder will pick up the same TTL as the previous step, so it should not multiply along the path of servers involved.

    I cannot see any valid reason to have a TTL lower than 3600 seconds for virtually any DNS record; in fact most records would benefit from 12+ hours.
    And a “recommendation” of a TTL of 300 seconds is way too low, and gets you a lot closer to breaking DNS by overloading the systems as you add active hosts / domains.
