DNS troubleshooting

Understanding DNS Problem Types

DNS issues usually fall into a few recognizable categories:

When troubleshooting, try to classify the symptom into one of these. It helps you decide which tools and checks to use.

Basic Troubleshooting Approach

A disciplined process prevents you from chasing the wrong problem:

  1. Clarify the scope
    • A single record, a whole zone, or all DNS?
    • Only inside your network or also from the internet?
    • Only some clients / subnets / ISPs?
  2. Check from multiple vantage points
    • Local resolver (e.g., dig example.com)
    • Public resolvers (e.g., dig @1.1.1.1 example.com)
    • Authoritative servers directly (e.g., dig @ns1.example.com example.com)
  3. Work from the client outward
    • Client configuration → local resolver → upstream / recursive → authoritative
  4. Separate layers
    • Name resolution vs. connectivity:
      • Can you resolve the name?
      • Can you reach the resolved IP?

Core Tools for DNS Troubleshooting

Using `dig`

dig is the primary diagnostic tool. Key patterns:

# Basic A record query using system resolver
dig example.com
# Query a specific DNS server
dig @8.8.8.8 example.com
# Query specific record type
dig example.com MX
dig example.com TXT
dig -t AAAA example.com
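# Request all record types (many servers refuse ANY or return a minimal answer, per RFC 8482)
dig example.com ANY
# Show only the answer section, in readable multi-line form
dig +multiline +nocmd +noall +answer example.com
# Show header flags plus the answer and authority sections
dig +noall +comments +answer +authority example.com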
# Show full trace from root down (similar to a resolver’s recursion path)
dig +trace example.com
# Debugging EDNS / DNSSEC behavior
dig +dnssec example.com
dig +edns=0 example.com       # EDNS version 0; use +noedns to disable EDNS entirely

Key fields in dig output to read:
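
Most of what matters sits in the header: the status code, the flags, and the section counts. As a minimal sketch, the hypothetical helper dig_summary below (not part of dig) condenses those fields from dig's default text output. It assumes the header layout printed by the BIND version of dig.

```shell
# dig_summary: read full dig output on stdin and print one summary line.
# Hypothetical helper for illustration; assumes dig's default header layout.
dig_summary() {
  awk '
    /^;; ->>HEADER<<-/ {
      for (i = 1; i <= NF; i++)
        if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
    }
    /^;; flags:/ {
      flags = ""
      for (i = 3; i <= NF; i++) {
        tok = $i
        last = sub(/;$/, "", tok)      # the flag list ends at the first ";"
        flags = flags (flags == "" ? "" : " ") tok
        if (last) break
      }
      for (i = 1; i <= NF; i++)
        if ($i == "ANSWER:") answers = $(i + 1) + 0
    }
    END { printf "status=%s flags=%s answers=%d\n", status, flags, answers }
  '
}

# Usage sketch (needs network access, so it is left commented out):
#   dig example.com | dig_summary
```

A status of NOERROR together with answers=0 means the name exists but has no record of the requested type, which is a different problem from NXDOMAIN.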

Using `host` and `nslookup`

These are simpler / legacy alternatives:

host example.com
host -t MX example.com
nslookup example.com
nslookup -type=TXT example.com

host is cleaner and script‑friendly; nslookup is still around but dig is preferred for serious troubleshooting.

Low-level network tools

To confirm UDP/53 and TCP/53 connectivity:

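# Simple reachability (ICMP only; says nothing about DNS itself)
ping dns-server.example.com
# On the server: list listening sockets on port 53 (TCP and UDP)
ss -tulpen | grep ':53 '
# From a client: check that TCP/53 is reachable (requires nc)
nc -vz dns-server.example.com 53
# From a client: test UDP/53 and TCP/53 with real queries
dig @dns-server.example.com example.com
dig +tcp @dns-server.example.com example.com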

Troubleshooting from the Client Side

Verify local resolver configuration

Check what DNS servers the client is using:

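    cat /etc/resolv.conf
    resolvectl status                  # systemd-resolved (older releases: systemd-resolve --status)
    nmcli device show | grep IP4.DNS
    nmcli general status               # overall NetworkManager state (no DNS details)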

Common issues:

Distinguish DNS from connectivity issues

Steps:

  1. Test DNS directly:
   dig example.com
  2. Test connectivity to the resolved IP:
   ping 93.184.216.34        # example IP
   curl -v http://93.184.216.34/

If DNS resolution works but the IP is unreachable, it’s not a DNS problem.

If IP connectivity is fine but name resolution fails, focus on DNS.
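
The two-step decision above can be sketched as a small function. The helper name classify_layer and its output labels are assumptions made for this illustration, not part of any tool.

```shell
# classify_layer ADDRS CONNECT_RC
# ADDRS:      output of a lookup such as `dig +short NAME A` (may be empty)
# CONNECT_RC: exit status of a reachability test against the resolved IP
# Hypothetical helper; the category labels are this sketch's own.
classify_layer() {
  if [ -z "$1" ]; then
    echo "dns"            # nothing resolved: investigate name resolution
  elif [ "$2" -ne 0 ]; then
    echo "connectivity"   # resolved, but the address did not answer
  else
    echo "ok"
  fi
}

# Usage sketch (needs network access, so it is left commented out):
#   addrs=$(dig +short example.com A | grep -E '^[0-9.]+$')
#   ping -c 1 "$(printf '%s\n' "$addrs" | head -n 1)" >/dev/null 2>&1
#   classify_layer "$addrs" "$?"
```

A failed ping does not prove the host is unreachable if ICMP is filtered along the path; a TCP test against the actual service port (curl or nc) is the stronger check.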

Test different resolvers

Compare behavior:

dig example.com                  # default resolver
dig @8.8.8.8 example.com         # Google
dig @1.1.1.1 example.com         # Cloudflare
dig @ns1.example.com example.com # authoritative
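
To compare two vantage points mechanically rather than by eye, something like the following could be used. It is a sketch only: same_answers is a hypothetical helper, and it compares the short-form output as an unordered set of lines.

```shell
# same_answers A B: succeed if two `dig +short` outputs hold the same set of
# lines, ignoring order (servers may rotate records round-robin).
# Hypothetical helper for illustration.
same_answers() (
  a=$(printf '%s\n' "$1" | sort)
  b=$(printf '%s\n' "$2" | sort)
  [ "$a" = "$b" ]
)

# Usage sketch (needs network access, so it is left commented out):
#   local_ans=$(dig +short example.com A)
#   auth_ans=$(dig +short @ns1.example.com example.com A)
#   same_answers "$local_ans" "$auth_ans" || echo "resolver and authoritative disagree"
```

A mismatch is not always a fault: names served by CDNs or geo-aware DNS legitimately return different addresses to different resolvers.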

Patterns:

Troubleshooting Authoritative DNS (BIND / others)

Check zone loading and syntax

For BIND:

# Validate zone file
named-checkzone example.com /etc/bind/db.example.com
# Check overall configuration
named-checkconf

Typical issues named-checkzone finds:

For other DNS servers, use their equivalent validation tools or built‑in commands.

Inspect logs

Common locations (may vary by distro):

Look for:

Verify that the server is listening and reachable

On the server:

ss -tulpen | grep ':53 '

Confirm:

From a client:

dig @your-dns-ip example.com
dig @your-dns-ip example.com SOA

If there’s a timeout:

Check recursion and access control

If your server is both recursive and authoritative (or you rely on its recursion):

Symptoms:

Use:

dig @your-dns-ip external-name.com

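If you get REFUSED, the server's ACLs (e.g., allow-query or allow-recursion in BIND) are probably rejecting the client. SERVFAIL more often means the server tried to recurse or forward and the upstream lookup failed.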

Troubleshooting DNS Delegation and Public Zones

Validate NS records and delegation chain

  1. Check the domain’s NS records as seen by the world:
   dig example.com NS
  2. Check the delegation at the parent zone (e.g., the TLD):
   dig +trace example.com                               # show the delegation chain from the root
   dig +norecurse @a.gtld-servers.net example.com NS    # ask a .com TLD server directly

With +trace, inspect:

Common mistakes:

Glue records and in-zone nameservers

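If your NS records point to names inside the same zone, for example:

example.com.  IN  NS  ns1.example.com.
example.com.  IN  NS  ns2.example.com.

then the parent zone (e.g., .com) must hold glue A/AAAA records for ns1.example.com and ns2.example.com. Without glue, a resolver would need to resolve ns1.example.com before it could ask ns1.example.com anything.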

Troubleshoot:

dig +trace ns1.example.com
dig +norecurse @a.gtld-servers.net example.com NS   # ask a .com TLD server; glue appears in the ADDITIONAL section
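
Glue shows up in the ADDITIONAL section of the parent's referral. As a hedged sketch, the hypothetical helper has_glue below scans that section for an A or AAAA record owned by the nameserver. The server a.gtld-servers.net is one of the .com TLD servers; other TLDs use different servers.

```shell
# has_glue NSNAME: read `dig +noall +additional` output on stdin and succeed
# if it contains an A or AAAA record for NSNAME. Hypothetical helper.
has_glue() {
  awk -v ns="$1" '
    BEGIN { ns = tolower(ns); sub(/\.$/, "", ns) }   # compare without trailing dot
    {
      name = tolower($1); sub(/\.$/, "", name)
      if (name == ns && ($4 == "A" || $4 == "AAAA")) found = 1
    }
    END { exit !found }
  '
}

# Usage sketch (needs network access, so it is left commented out):
#   dig +noall +additional +norecurse @a.gtld-servers.net example.com NS \
#     | has_glue ns1.example.com || echo "no glue for ns1.example.com"
```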

Watch for:

Verifying SOA and serials

Check SOA:

dig example.com SOA

Important fields:

If you use secondary (slave) servers:

Check from each authoritative server:

dig @ns1.example.com example.com SOA
dig @ns2.example.com example.com SOA

Verify the serial is identical; if not, troubleshoot zone transfers.
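
The serial comparison is easy to script. The sketch below assumes the short SOA form printed by dig +short (MNAME RNAME SERIAL REFRESH RETRY EXPIRE MINIMUM); soa_serial and serials_in_sync are hypothetical helper names.

```shell
# soa_serial: read `dig +short ... SOA` output on stdin and print the serial
# (third field: MNAME RNAME SERIAL REFRESH RETRY EXPIRE MINIMUM).
soa_serial() {
  awk 'NF >= 7 { print $3; exit }'
}

# serials_in_sync: read one serial per line; succeed only if at least one
# serial was given and all of them are identical.
serials_in_sync() {
  [ "$(sort -u | grep -c .)" -eq 1 ]
}

# Usage sketch (needs network access, so it is left commented out):
#   for ns in ns1.example.com ns2.example.com; do
#     dig +short "@$ns" example.com SOA | soa_serial
#   done | serials_in_sync || echo "serials differ: check zone transfers"
```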

Troubleshooting Caching, TTL, and Stale Records

Understanding TTL behavior

Each record has a TTL. Caches keep the record up to that TTL.

Common issues:

Use:

dig example.com A
dig example.com A +trace
dig @resolver-ip example.com A

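Note the TTL column in the answer: on a caching resolver it counts down toward zero, while the authoritative server always reports the full value. Before a planned migration, lower the TTL at least one old-TTL period in advance so that the change itself propagates quickly.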
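
When planning a change, the figure that matters is the TTL the authoritative server hands out, since that is the longest a well-behaved cache may keep the old answer. A minimal sketch for reading it, using a hypothetical helper max_ttl over the answer section:

```shell
# max_ttl: read `dig +noall +answer` output on stdin and print the largest TTL
# (second column), or 0 if no record lines were seen. Hypothetical helper.
max_ttl() {
  awk '!/^;/ && NF >= 5 && $2 ~ /^[0-9]+$/ { if ($2 + 0 > m) m = $2 + 0 }
       END { print m + 0 }'
}

# Usage sketch (needs network access, so it is left commented out):
#   dig +noall +answer @ns1.example.com example.com A | max_ttl
```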

Forcing cache bypass

With plain dig you cannot force an upstream resolver to ignore its cache, but you can query the authoritative server directly, which answers from the zone itself:

  dig @ns1.example.com example.com A

If authoritative shows the new value but some resolvers still have the old one and TTL hasn’t expired yet, you must wait; there is no way to “pull back” cached answers.

Negative caching (NXDOMAIN)

Nonexistent domain responses can be cached, too.

Check:

dig nonexistent.example.com
dig nonexistent.example.com SOA

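The negative-cache lifetime is the smaller of the SOA record’s own TTL and the SOA’s minimum field (RFC 2308). The SOA returned in the authority section of the NXDOMAIN response carries that value as its TTL.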

If you later create that record, some caches will still respond NXDOMAIN until their negative cache expires.
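
To see how long a negative answer may linger, the SOA line in the authority section can be parsed. This is a sketch under the assumption that the line has the usual 11 whitespace-separated fields; negative_ttl is a hypothetical helper name.

```shell
# negative_ttl: read `dig +noall +authority` output for a nonexistent name on
# stdin and print min(SOA record TTL, SOA MINIMUM field) in seconds.
negative_ttl() {
  awk '$4 == "SOA" && NF >= 11 {
         ttl = $2 + 0; minimum = $11 + 0
         print (ttl < minimum ? ttl : minimum)
         exit
       }'
}

# Usage sketch (needs network access, so it is left commented out):
#   dig +noall +authority nonexistent.example.com | negative_ttl
```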

DNSSEC Troubleshooting Basics

Recognizing DNSSEC issues

Signs:

Check DNSSEC-specific fields:

dig +dnssec example.com
dig +trace +dnssec example.com

Look for:

Common DNSSEC misconfigurations

Troubleshooting steps:

  1. Compare DNSKEY and DS:
   dig example.com DNSKEY
   dig example.com DS      # DS lives in the parent zone; resolvers fetch it from there
  2. Validate with external online validators (useful to confirm if the chain validates).
  3. Ensure all authoritative servers serve the same, correctly signed zone.
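
A quick way to tell a validation failure from other SERVFAIL causes is to repeat the query with dig's +cd option (checking disabled), which asks a validating resolver to skip DNSSEC validation. The sketch below encodes that comparison; dig_status and dnssec_verdict are hypothetical helpers and the labels are this example's own.

```shell
# dig_status: read dig output on stdin and print the header status code.
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# dnssec_verdict NORMAL_STATUS CD_STATUS
# Compares the status of a normal query with the status of the same query sent
# with +cd (checking disabled). Labels are this sketch's own.
dnssec_verdict() {
  if [ "$1" = "SERVFAIL" ] && [ "$2" = "NOERROR" ]; then
    echo "validation-failure"     # data exists but does not validate
  elif [ "$1" = "SERVFAIL" ]; then
    echo "servfail-other"         # failing even without validation
  else
    echo "no-validation-error"
  fi
}

# Usage sketch (needs network access and a validating resolver; commented out):
#   normal=$(dig example.com A | dig_status)
#   unchecked=$(dig +cd example.com A | dig_status)
#   dnssec_verdict "$normal" "$unchecked"
```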

Fixing DNSSEC issues often involves:

Split-Horizon and Internal vs External DNS

Split-horizon (different answers depending on client location) is a common source of confusion.

Typical patterns:

Troubleshooting tips:

  dig @internal-dns example.com
  dig @external-dns example.com

Reverse DNS (PTR) Troubleshooting

Reverse lookups use PTR records in in-addr.arpa (IPv4) and ip6.arpa (IPv6) zones.

Common issues:

Check:

# Forward lookup
dig mail.example.com A
# Reverse lookup
dig -x 203.0.113.10
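
Behind the scenes, dig -x reverses the octets and appends in-addr.arpa before sending a PTR query. The hypothetical helper ptr_name below reproduces that name construction for IPv4 so you can see which reverse zone is actually being asked.

```shell
# ptr_name IPV4: print the in-addr.arpa name that `dig -x` would query.
# Hypothetical helper; handles dotted-quad IPv4 only.
ptr_name() {
  printf '%s\n' "$1" |
    awk -F. 'NF == 4 { printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1 }'
}

# Usage sketch (the dig call needs network access, so it is commented out):
#   dig "$(ptr_name 203.0.113.10)" PTR
```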

For public IPs, reverse zones are usually controlled by the ISP or hosting provider; you often must configure PTRs via their portal, not your own DNS server.

Ensure:

Performance and Load-related DNS Problems

Measuring DNS latency

Use +stats and query times:

dig example.com +stats
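
A single measurement says little, so it helps to sample several queries. The sketch below pulls the "Query time" value out of dig's statistics footer and averages a few runs; query_time_ms and average are hypothetical helper names.

```shell
# query_time_ms: read dig output on stdin and print the reported query time.
query_time_ms() {
  awk '/^;; Query time:/ { print $4; exit }'
}

# average: read one number per line and print the integer mean.
average() {
  awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Usage sketch (needs network access, so it is left commented out):
#   for i in 1 2 3 4 5; do
#     dig @ns1.example.com example.com A | query_time_ms
#   done | average
```

Against a recursive resolver, the first query is usually a cache miss and later ones are cache hits, so compare them separately rather than averaging them together.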

Look at:

If authoritative DNS is slow:

Amplification and rate limiting

DNS can be abused for amplification attacks. Countermeasures (like Response Rate Limiting, RRL) can introduce:

Look for:

If responses are frequently truncated:

Firewall and Network-related DNS Issues

Firewalls blocking or altering DNS

Symptoms:

Checks:

Use packet captures:

tcpdump -ni any port 53

Analyze:

ISP / network interception

Some ISPs:

If you see unexpected answers from a different resolver than you queried:

Systematic Checklist for DNS Troubleshooting

When facing a DNS issue for a specific name:

  1. From affected client:
    • dig name
    • Check status, answer, and server used.
    • If failure, try dig @8.8.8.8 name.
  2. From a different network or public DNS checker:
    • Confirm whether the issue is global or local.
  3. From an admin host or the DNS server itself:
    • Query authoritative server:
      • dig +norecurse @ns1.example.com name A (use the affected record type; ANY is often refused)
    • If authoritative fails → check zone files, logs, and listening ports.
  4. Check delegation:
    • dig +trace name
    • Verify NS and glue are correct and reachable.
  5. Check caching / TTL:
    • Compare authoritative vs cached answers.
    • Note TTL and whether caches still serve old data.
  6. If DNSSEC is enabled:
    • dig +dnssec name
    • Verify RRSIG, DNSKEY, DS; look for SERVFAIL on validating resolvers.
  7. Confirm network path:
    • Ensure UDP/TCP 53 is allowed end‑to‑end.
    • Use tcpdump/ss/firewall tools to verify.
  8. Document findings and changes:
    • Record serial numbers, TLD updates, and TTLs.
    • Log when changes were made, to correlate with cache expiry.

Following this structured approach helps isolate whether the problem is client configuration, local resolver, authoritative configuration, delegation, DNSSEC, or underlying network transport.
