Stop calling getaddrinfo() in your event loop

I still remember the first time I realized that DNS was the reason my “high-performance” asynchronous application was stuttering like a scratched CD. I was building a proxy service, feeling pretty good about my non-blocking I/O architecture, using epoll directly because I enjoy pain, apparently.

Everything looked perfect on the flame graphs. CPU usage was low. Memory was fine. But every few seconds, the event loop would just… stop. For 50 milliseconds here, 200 milliseconds there.

The culprit? A single blocking library call that we all take for granted: getaddrinfo.

The Deceptive Simplicity of Port 53

We teach junior devs that DNS is simple. “It’s the phonebook of the internet,” we say. “You send a UDP packet to port 53, and it gives you an IP address.”

That was true in 1998. In 2026? It’s a lie.

If you’re writing networking code today, whether it’s in Rust, C++, or even Go, and you rely on the standard libc resolver, you are pausing the calling thread every time you need to look up a hostname. If that thread runs your event loop, your entire application pauses with it. Why? Because the standard POSIX getaddrinfo function is synchronous. It blocks. It doesn’t care about your fancy io_uring setup or your carefully tuned reactor pattern. It waits until the resolver comes back with an answer.

And if that UDP packet gets dropped? It waits for the resolver timeout, which glibc sets to 5 seconds per attempt by default. Your 100k requests-per-second server just became a 0 requests-per-second brick for seconds at a time.
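You can watch this happen without touching the network. Here is a minimal sketch, assuming a made-up slow_lookup() that uses time.sleep() as a stand-in for a getaddrinfo call stuck on a lost packet. The heartbeat task, the 10 ms tick and the 200 ms delay are illustrative choices, not measurements from my proxy.

```python
import asyncio
import time


def slow_lookup(hostname, delay=0.2):
    """Hypothetical stand-in for a getaddrinfo() call stuck on a lost packet."""
    time.sleep(delay)  # blocks the calling thread, like a synchronous resolver
    return "192.0.2.1"  # documentation-only address (RFC 5737)


async def heartbeat(gaps, ticks=40, interval=0.01):
    """Record the real time between ticks that should be 10 ms apart."""
    last = time.monotonic()
    for _ in range(ticks):
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def measure_stall(blocking):
    """Return the worst heartbeat gap seen while one lookup is in flight."""
    gaps = []
    beat = asyncio.create_task(heartbeat(gaps))
    await asyncio.sleep(0.05)  # let the loop tick normally for a moment
    if blocking:
        slow_lookup("api.example.com")  # called directly on the loop thread
    else:
        # Same call pushed to a worker thread: the loop keeps ticking.
        await asyncio.to_thread(slow_lookup, "api.example.com")
    await beat
    return max(gaps)


if __name__ == "__main__":
    print(f"worst gap, call on the loop: {asyncio.run(measure_stall(True)):.3f}s")
    print(f"worst gap, call off the loop: {asyncio.run(measure_stall(False)):.3f}s")
```

The first number should land at or above the simulated 200 ms, because nothing else can run while the loop thread sleeps. Moving the call to a worker thread keeps the heartbeat steady, but it only relocates the wait.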

The “Happy Eyeballs” Nightmare

Then there’s the IPv6 situation. I love IPv6. I really do. But implementing dual-stack support correctly is enough to make you quit tech and become a goat farmer.

You can’t just resolve the AAAA record (IPv6), try to connect, fail after a timeout, and then try the A record (IPv4). That’s too slow. Users perceive that as “the internet is broken.”

You have to use an algorithm called “Happy Eyeballs” (version 2 is specified in RFC 8305). Basically, you fire off DNS requests for both IPv4 and IPv6 simultaneously. Then, you start racing the connections. You try IPv6 first, but if it doesn’t connect within a tiny window (RFC 8305 recommends 250 ms), you immediately fire off the IPv4 connection attempt without canceling the IPv6 one. Whichever wins, wins.

Writing this logic from scratch is tedious. You’re juggling multiple asynchronous state machines just to open a single TCP socket. And if you get it wrong? You either crush the network with duplicate traffic or leave users staring at a loading spinner.

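import asyncio
import socket

# What you THINK you're doing
def naive_connect(hostname, port):
    return socket.create_connection((hostname, port))

# What you actually have to do for Happy Eyeballs v2 (simplified sketch)
async def happy_connect(hostname, port, delay=0.25):
    async def attempt(family, head_start):
        await asyncio.sleep(head_start)  # the 250 ms timer before IPv4 joins
        # Resolve and connect for one address family only. asyncio still
        # resolves through getaddrinfo in a worker thread here.
        return await asyncio.open_connection(hostname, port, family=family)

    pending = {asyncio.create_task(attempt(socket.AF_INET6, 0)),
               asyncio.create_task(attempt(socket.AF_INET, delay))}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            winners = [t.result() for t in done if t.exception() is None]
            for _, writer in winners[1:]:
                writer.close()  # both finished at once: keep only one
            if winners:
                return winners[0]  # first handshake wins: (reader, writer)
        raise OSError(f"could not connect to {hostname}:{port}")
    finally:
        for task in pending:
            task.cancel()  # cancel the loser and clean up
    # Yeah, it's messy, and this still skips most of RFC 8305.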
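If you happen to be on Python, you may not have to hand-roll the connection race described above. Since Python 3.8, asyncio’s loop.create_connection() (and therefore asyncio.open_connection()) accepts happy_eyeballs_delay and interleave arguments that stagger attempts across the resolved addresses. Below is a minimal sketch. The connect_racing name and the local echo server are assumptions for illustration, so the example runs without a network. Keep in mind that asyncio still resolves the name through getaddrinfo in a thread pool, so this covers the connection racing only, not the resolution.

```python
import asyncio


async def connect_racing(host, port, delay=0.25):
    """Open a TCP connection with staggered attempts, 250 ms apart."""
    # happy_eyeballs_delay turns on staggered attempts across all resolved
    # addresses; interleave=1 alternates address families in the attempt order.
    return await asyncio.open_connection(
        host, port, happy_eyeballs_delay=delay, interleave=1
    )


async def demo():
    # Hypothetical local echo server, used only to make the sketch runnable.
    async def echo(reader, writer):
        writer.write(await reader.readline())
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        reader, writer = await connect_racing("127.0.0.1", port)
        writer.write(b"hello\n")
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
        return reply
    finally:
        server.close()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a real dual-stack hostname in place of 127.0.0.1, the same call would try the first address, wait up to the delay, and then start the next attempt while the first is still in flight.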

Encryption Changed Everything

It used to be just plaintext UDP, with the occasional fallback to TCP. Fire and forget. But security matters, and ISPs love snooping on DNS traffic. So now we have DNS over TLS (DoT) and DNS over HTTPS (DoH).

This is where things get heavy. To resolve a domain name securely, you now need to establish a full TCP connection and perform a TLS handshake just to ask where the server is. The overhead is massive compared to a single UDP packet.

If you’re building a high-performance client, you can’t afford to do a full TLS handshake for every DNS query. You need persistent connections to your resolvers. You need connection pooling. You need to handle HTTP/2 or HTTP/3 framing if you’re using DoH.

Suddenly, your “simple DNS client” needs a full-blown TCP/TLS stack inside it. It’s recursive complexity. You need a resolver to find the IP of the DoH server so you can resolve the IP of your target.
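To make the DoH part concrete, here is a minimal sketch of how a query turns into an RFC 8484 GET request: an ordinary wire-format DNS message, base64url-encoded without padding, carried in a dns parameter. The function names and the https://dns.example/dns-query endpoint are placeholders I made up, and nothing is sent over the network.

```python
import base64
import struct


def build_query(name, qtype=1, query_id=0):
    """Build a wire-format DNS query (qtype 1 = A, 28 = AAAA), class IN."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label
        for label in name.rstrip(".").encode("ascii").split(b".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(endpoint, name, qtype=1):
    """Return the GET URL for a DoH query as described in RFC 8484."""
    # RFC 8484 suggests a DNS ID of 0 so identical queries cache well over HTTP.
    wire = build_query(name, qtype, query_id=0)
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


if __name__ == "__main__":
    # Placeholder endpoint, not a real resolver.
    print(doh_get_url("https://dns.example/dns-query", "www.example.com"))
```

A real client would send that URL with an Accept: application/dns-message header over a pooled HTTP/2 or HTTP/3 connection, which is exactly where the extra machinery comes from.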

The Async Reality Check

I’ve been playing around with some of the newer networking libraries popping up lately—ones built on top of io_uring for Linux and kqueue for BSD/macOS. The difference between a blocking resolver and a truly async implementation is night and day.

A proper modern DNS implementation in 2026 looks something like this:

  • Full User-Space Implementation: It doesn’t rely on the OS’s getaddrinfo. It constructs DNS packets manually.
  • Protocol Agnostic: It handles UDP, TCP, TLS, and QUIC seamlessly.
  • Smart Caching: It respects TTLs but also serves stale records if the upstream is down (because availability usually beats strict consistency in user-facing apps).
  • Racing: It implements Happy Eyeballs v2 natively.
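The “smart caching” item is the easiest one to picture in code. Here is a minimal sketch of a serve-stale cache, the behaviour RFC 8767 describes for DNS. The ServeStaleCache class, its resolve callback returning (addresses, ttl), and the one-hour max_stale default are all assumptions for illustration, not taken from any particular library.

```python
import asyncio
import time


class ServeStaleCache:
    """TTL cache that falls back to expired answers when the upstream fails."""

    def __init__(self, resolve, max_stale=3600.0, clock=time.monotonic):
        self._resolve = resolve      # hypothetical: async name -> (addresses, ttl)
        self._max_stale = max_stale  # seconds past expiry an answer may be served
        self._clock = clock
        self._entries = {}           # name -> (addresses, expires_at)

    async def lookup(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached and now < cached[1]:
            return cached[0], "fresh"
        try:
            addresses, ttl = await self._resolve(name)
        except OSError:
            # Upstream is down: serve the stale answer if it is not too old.
            if cached and now < cached[1] + self._max_stale:
                return cached[0], "stale"
            raise
        self._entries[name] = (addresses, now + ttl)
        return addresses, "resolved"


async def demo():
    now = [0.0]
    upstream_up = [True]

    async def fake_resolve(name):  # hypothetical upstream resolver
        if not upstream_up[0]:
            raise OSError("upstream unreachable")
        return ["192.0.2.1"], 30.0

    cache = ServeStaleCache(fake_resolve, max_stale=300.0, clock=lambda: now[0])
    results = [await cache.lookup("api.example.com")]      # goes upstream
    now[0] = 10.0
    results.append(await cache.lookup("api.example.com"))  # within the TTL
    now[0], upstream_up[0] = 60.0, False
    results.append(await cache.lookup("api.example.com"))  # expired, upstream down
    return [state for _, state in results]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

This sketch does not coalesce concurrent lookups for the same name, so a production cache would also want to share one in-flight query between callers.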

This is why I roll my eyes when I see tutorials that just slap a socket.connect() in a loop and call it a day. In a controlled environment—like your laptop or a simple internal tool—that’s fine. But if you push that to production where you’re handling untrusted input or flaky networks? You’re going to have a bad time.

Why You Shouldn’t Write Your Own

I tried writing a DNS resolver once. Just a simple one. “How hard can it be to parse a few bytes?” I thought.

Three days later I was crying over DNS compression pointers (a clever but annoying way DNS packets save space by referencing previous parts of the message). Then I got stuck on handling truncated UDP packets that need to retry over TCP. Then I realized I hadn’t even touched DNSSEC validation yet.
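For the curious, this is roughly what those two chores look like. It is a minimal sketch with function names of my own choosing: read_name() follows compression pointers (a length byte with the top two bits set, followed by a 14-bit offset from the start of the message), and is_truncated() checks the TC flag that tells you to retry over TCP. It assumes a well-formed message and does no bounds checking beyond a cap on pointer jumps.

```python
import struct


def read_name(message, offset):
    """Decode a possibly compressed name; return (name, offset after the name)."""
    labels = []
    next_offset = None  # where parsing resumes in the original record
    jumps = 0
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:
            # Compression pointer: 14-bit offset from the start of the message.
            if next_offset is None:
                next_offset = offset + 2
            offset = struct.unpack_from("!H", message, offset)[0] & 0x3FFF
            jumps += 1
            if jumps > 128:
                raise ValueError("too many compression pointers (loop?)")
        elif length == 0:
            # The root label ends the name.
            if next_offset is None:
                next_offset = offset + 1
            return ".".join(labels), next_offset
        elif length > 63:
            raise ValueError("reserved label type")
        else:
            offset += 1
            labels.append(message[offset:offset + length].decode("ascii"))
            offset += length


def is_truncated(message):
    """True if the TC bit is set, meaning the answer did not fit in UDP."""
    flags = struct.unpack_from("!H", message, 2)[0]
    return bool(flags & 0x0200)


if __name__ == "__main__":
    # Hand-built response: question "example.com", answer name "www" + pointer.
    header = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 1, 0, 0)
    question = b"\x07example\x03com\x00" + struct.pack("!HH", 1, 1)
    answer_name = b"\x03www\xc0\x0c"  # 0xC00C points at offset 12
    message = header + question + answer_name
    print(read_name(message, 12), read_name(message, 29), is_truncated(message))
```

The jump cap matters: a hostile packet can point a name at itself, and a parser without a limit will spin forever.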

The complexity of the protocol has exploded. We aren’t just mapping names to numbers anymore. We’re verifying signatures, racing protocols, and encrypting queries.

If you are building network tooling today, look for libraries that handle this complexity for you. Look for “Async DNS” or “Non-blocking resolver” in the feature list. If the library just wraps the system resolver in a thread pool? Skip it. That’s a band-aid, not a fix.
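Here is one way to see why I call the thread pool a band-aid. It is a minimal sketch with simulated lookups: stuck_lookup() is a made-up stand-in that sleeps for 200 ms, and the pool sizes are arbitrary. Once every worker is waiting on a slow resolver, each further lookup queues behind them. Python’s own loop.getaddrinfo() is this kind of wrapper, since it runs the blocking call in the loop’s default executor.

```python
import concurrent.futures
import time


def stuck_lookup(hostname, timeout=0.2):
    """Hypothetical stand-in for a blocking resolver call waiting on a timeout."""
    time.sleep(timeout)
    return hostname, "192.0.2.1"  # documentation-only address (RFC 5737)


def resolve_all(hostnames, workers):
    """Resolve every name through a fixed-size pool; return elapsed seconds."""
    start = time.monotonic()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(stuck_lookup, hostnames))
    return time.monotonic() - start


if __name__ == "__main__":
    names = [f"host{i}.example" for i in range(8)]
    print(f"2 workers: {resolve_all(names, 2):.2f}s")
    print(f"8 workers: {resolve_all(names, 8):.2f}s")
```

With two workers, the eight lookups run in four batches, so the last caller waits four times as long as the first. Adding threads moves the cliff but does not remove it.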

Real performance comes from handling DNS as just another asynchronous stream of bytes, integrated directly into your event loop. Anything less is just waiting to hang.
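To show what “integrated directly into your event loop” can mean, here is a minimal sketch of a UDP DNS client built on asyncio’s datagram support. DnsClientProtocol and FakeResolver are names I made up. The fake resolver only echoes the query back with the response bit set, so the example runs locally. A real client would also need retries, TCP fallback and response validation.

```python
import asyncio
import os
import struct


class DnsClientProtocol(asyncio.DatagramProtocol):
    """Matches UDP responses to pending queries by DNS message ID."""

    def __init__(self):
        self.transport = None
        self.pending = {}  # query ID -> Future awaiting the raw response

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        if len(data) < 12:
            return  # too short to hold a DNS header
        query_id = struct.unpack_from("!H", data)[0]
        future = self.pending.pop(query_id, None)
        if future is not None and not future.done():
            future.set_result(data)

    async def query(self, wire_query, timeout=2.0):
        """Send one query and wait for the matching reply without blocking."""
        query_id = struct.unpack_from("!H", wire_query)[0]
        future = asyncio.get_running_loop().create_future()
        self.pending[query_id] = future
        self.transport.sendto(wire_query)
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            self.pending.pop(query_id, None)


class FakeResolver(asyncio.DatagramProtocol):
    """Hypothetical local responder: echoes the query with the QR bit set."""

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        reply = data[:2] + bytes([data[2] | 0x80]) + data[3:]
        self.transport.sendto(reply, addr)


async def demo():
    loop = asyncio.get_running_loop()
    server, _ = await loop.create_datagram_endpoint(
        FakeResolver, local_addr=("127.0.0.1", 0))
    resolver_addr = server.get_extra_info("sockname")
    transport, client = await loop.create_datagram_endpoint(
        DnsClientProtocol, remote_addr=resolver_addr)
    try:
        query_id = int.from_bytes(os.urandom(2), "big")
        # Header plus a hard-coded question: example.com, type A, class IN.
        wire = (struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
                + b"\x07example\x03com\x00" + struct.pack("!HH", 1, 1))
        reply = await client.query(wire)
        return reply[:2] == wire[:2] and bool(reply[2] & 0x80)
    finally:
        transport.close()
        server.close()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The lookup is just another future on the loop: a lost packet costs one timer, not a frozen thread.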
