For years, email deliverability teams have been haunted by a ghost in the machine. They meticulously implement SPF (Sender Policy Framework), DKIM (DomainKeys Identified Mail), and DMARC (Domain-based Message Authentication, Reporting, and Conformance). They maintain clean IP reputations and ensure their domains are pristine. Yet, despite these efforts, they continue to face intermittent, unexplained delivery failures. These aren’t rejections due to spam flags or configuration errors; they are mysterious "temporary failures" that leave no trace in DMARC reports.
Now, a groundbreaking paper set to be presented at the MADWeb 2026 workshop in San Diego, titled "The Fragility of DNS-Based Security Under Imperfect DNS Operation," has finally identified the culprit. Authored by Tino Hager of Mailtower.app and Professor Ronald Petrlic of the Nuremberg Institute of Technology, the research shifts the focus away from email authentication standards themselves and toward the foundational, often overlooked, DNS infrastructure that supports them.
The Chronology of a Silent Failure
The mystery began as a persistent industry grievance. Deliverability experts noted that even when all authentication "best practices" were strictly followed, large-scale email campaigns would experience erratic bounce rates.
To investigate, Hager and Petrlic conducted a four-month controlled study. They sent a total of 109,649 identical emails, dispatched hourly from six domains to a variety of Microsoft- and Google-hosted accounts. The test environment was designed to be deliberately "boring": the same servers, identical signatures, clean IPs, and domains that had been active for years. By controlling these variables, the researchers ensured that any failure could not be attributed to the sender’s reputation or content.
The findings were stark. The data revealed a sharp divide in how receiving infrastructures handle DNS requests. Microsoft-hosted recipients accounted for the overwhelming majority of unexplained temporary errors, while Google Mail recipients experienced virtually none. This disparity indicated that the problem lay not in the correctness of the DNS records themselves, but in how the receiving mail servers fetched and processed that DNS data.
The DKIM Key-Length Trap
One of the most unsettling findings of the research concerns the industry-wide push for stronger security. Over the past several years, cybersecurity experts have pressured organizations to move away from 1024-bit RSA keys for DKIM, favoring 2048-bit keys as the de facto standard. Furthermore, Germany’s Federal Office for Information Security (BSI) has advocated for 3000+ bit keys by 2025.
However, the researchers discovered a perverse side effect: stronger keys increase the likelihood of delivery failure.
As key lengths grow, the size of the corresponding DNS response grows as well. The researchers found that a 1024-bit RSA key produced zero errors when hosted on AWS Route 53 or Cloudflare. When they increased the key length to 2048-bit, the error rate on Route 53 jumped to between 8% and 9%, while Cloudflare remained highly stable at 0.1%–0.2%. When pushed to 4096-bit, both providers saw error rates climb to 11%–13%.
This creates a "security paradox." Organizations attempting to follow the highest security standards are, in effect, sabotaging their own deliverability. The study suggests that, currently, the safest choice for deliverability is the weakest key length tested, a trade-off that compromises security for the sake of basic functionality.
Why DNS Hosts Diverge: The Minimal Response Factor
The divergence between Route 53 and Cloudflare boils down to a technical formatting nuance known as "minimal responses."
Route 53 adheres strictly to classic RFC guidelines, including the authority section (NS and SOA records) in its DNS responses. Cloudflare, conversely, utilizes "minimal responses," omitting that section. This seemingly minor difference amounts to approximately 140 bytes of data.
For a 2048-bit DKIM key, that 140-byte difference is the tipping point: with the authority section included, the response exceeds the traditional 512-byte UDP ceiling, while the minimal response stays under it. Once a DNS response exceeds this threshold, it requires EDNS (Extension Mechanisms for DNS) to travel over UDP. If EDNS is not handled correctly, the response is truncated, forcing a fallback to TCP, which is more prone to failure and latency. Consequently, the exact same DKIM setup can succeed or fail entirely based on the choice of DNS hosting provider.
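The arithmetic behind that tipping point can be sketched roughly as follows. This is a back-of-the-envelope estimate, not the researchers' method: the selector name `selector1._domainkey.example.com` is hypothetical, the 140-byte authority overhead is the approximate figure quoted above, and the key sizes assume standard DER-encoded RSA public keys with a public exponent of 65537.

```python
import math

CLASSIC_UDP_LIMIT = 512          # pre-EDNS ceiling for a DNS message over UDP
AUTHORITY_SECTION_BYTES = 140    # approximate figure quoted in the article

# Size of a DER-encoded RSA SubjectPublicKeyInfo (assumes exponent 65537).
SPKI_DER_BYTES = {1024: 162, 2048: 294, 4096: 550}


def wire_name_length(name):
    """Length of a domain name in uncompressed DNS wire format."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(len(label) + 1 for label in labels) + 1  # +1 for the root label


def dkim_txt_rdata_length(key_bits):
    """RDATA length of a DKIM TXT record holding an RSA key of key_bits."""
    b64_len = 4 * math.ceil(SPKI_DER_BYTES[key_bits] / 3)
    text_len = len("v=DKIM1; k=rsa; p=") + b64_len
    # TXT data is stored as character-strings of at most 255 bytes,
    # each preceded by a one-byte length.
    chunks = math.ceil(text_len / 255)
    return text_len + chunks


def estimate_response_size(key_bits,
                           qname="selector1._domainkey.example.com",
                           with_authority=False):
    """Rough size in bytes of the DNS response carrying the DKIM key."""
    header = 12
    question = wire_name_length(qname) + 4           # QTYPE + QCLASS
    # Answer: compressed name pointer, TYPE, CLASS, TTL, RDLENGTH, RDATA.
    answer = 2 + 2 + 2 + 4 + 2 + dkim_txt_rdata_length(key_bits)
    total = header + question + answer
    if with_authority:
        total += AUTHORITY_SECTION_BYTES
    return total


if __name__ == "__main__":
    for bits in (1024, 2048, 4096):
        minimal = estimate_response_size(bits)
        full = estimate_response_size(bits, with_authority=True)
        print(bits, "minimal:", minimal, "with authority:", full)
```

Under these assumptions, the 1024-bit estimate stays below 512 bytes either way, the 2048-bit estimate crosses the limit only when the authority section is included, and the 4096-bit estimate exceeds it in both cases, which matches the pattern the researchers reported. Real responses vary with selector length, TTL-independent name compression, and any extra records a provider adds.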
The Microsoft Bottleneck
The research points specifically to a missing piece of the puzzle within Microsoft’s Exchange Online infrastructure. By tracing the mail path, researchers identified a delegated nameserver—ns1-proddns.glbdns.protection.outlook.com—that lacks support for EDNS.
When a sender’s DKIM response is large enough to necessitate EDNS, this specific Microsoft nameserver fails to process it, leading to a bounce. The researchers noted that this is particularly galling given that EDNS has been a standard feature in Windows Server since 2008.
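To make concrete what "supporting EDNS" means on the wire, the sketch below builds a DNS TXT query with and without the EDNS OPT pseudo-record, using only the Python standard library. It is an illustration of the packet format, not a reproduction of the researchers' tooling; the function names, the query name, and the advertised payload size of 1232 bytes are assumptions chosen for the example, and nothing is sent over the network.

```python
import struct

TYPE_TXT = 16
TYPE_OPT = 41   # EDNS pseudo-record type
CLASS_IN = 1


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a root byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_txt_query(name, edns_payload_size=None, query_id=0x1234):
    """Build a TXT query; add an OPT record if edns_payload_size is given."""
    arcount = 1 if edns_payload_size else 0
    # Header: ID, flags (recursion desired), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", TYPE_TXT, CLASS_IN)
    packet = header + question
    if edns_payload_size:
        # OPT record: root name, TYPE=41, CLASS carries the UDP payload size
        # the sender can accept, TTL carries extended flags (0 here), no RDATA.
        packet += b"\x00" + struct.pack("!HHIH", TYPE_OPT, edns_payload_size, 0, 0)
    return packet


if __name__ == "__main__":
    plain = build_txt_query("selector1._domainkey.example.com")
    with_edns = build_txt_query("selector1._domainkey.example.com",
                                edns_payload_size=1232)
    print("plain query bytes:", len(plain))
    print("EDNS query bytes:", len(with_edns))
```

The only difference between the two queries is the 11-byte OPT record in the additional section. A server that ignores or mishandles that record behaves as if the 512-byte ceiling still applies, which is the failure mode the article describes.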
Interestingly, Microsoft has already developed a solution. A newer delivery domain, mx.microsoft, introduced in March 2024 for DANE (DNS-based Authentication of Named Entities) support, handles EDNS correctly. The researchers’ recommendation is straightforward: Microsoft should migrate all customer traffic to this modern, EDNS-compliant infrastructure. Until that happens, Microsoft users remain vulnerable to failures caused by the very security features they are required to implement.
The SPF "TXT Sprawl" and the Adobe Case Study
While DKIM failures are driven by key size, SPF failures are often driven by "DNS clutter." Modern organizations often accumulate dozens of TXT records—verification tokens for SaaS platforms, security tools, and analytics services.
The researchers highlight adobe.com as a cautionary tale. As of late 2025, the domain carried 69 TXT records, 67 of which were verification tokens, totaling over 5,000 bytes. This "sprawl" forces every SPF lookup to occur over TCP rather than UDP, which significantly slows performance and increases the risk of temporary errors.
To prove this, the researchers padded one of their own test domains with artificial verification entries to exceed the 1232-byte "DNS Flag Day" threshold. Predictably, their SPF temporary error rate doubled. The practical takeaway is clear: DNS hygiene is now synonymous with authentication hygiene. Organizations must proactively remove obsolete verification records or, preferably, move these tokens to subdomains or CNAMEs to ensure they do not interfere with the critical SPF lookup process.
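A rough way to gauge TXT sprawl is to estimate how large the full TXT answer for a domain would be. The sketch below is a simplified estimate under stated assumptions: the helper names, the `example.com` domain, and the made-up `hypothetical-site-verification=` tokens are all illustrative, and the calculation ignores EDNS and authority-section overhead.

```python
import math

FLAG_DAY_LIMIT = 1232   # threshold cited in the article
CLASSIC_UDP_LIMIT = 512


def wire_name_length(name):
    """Length of a domain name in uncompressed DNS wire format."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(len(label) + 1 for label in labels) + 1


def txt_response_size(domain, txt_values):
    """Estimate the size of a response returning every TXT record at domain."""
    size = 12 + wire_name_length(domain) + 4   # header + question
    for value in txt_values:
        length = len(value.encode("utf-8"))
        chunks = max(1, math.ceil(length / 255))  # one length byte per chunk
        # Compressed name pointer, TYPE, CLASS, TTL, RDLENGTH = 12 bytes.
        size += 12 + length + chunks
    return size


def non_spf_records(txt_values):
    """TXT values that are not SPF policies: candidates to move or remove."""
    return [v for v in txt_values
            if not (v.lower() == "v=spf1" or v.lower().startswith("v=spf1 "))]


if __name__ == "__main__":
    spf = "v=spf1 include:_spf.example.net -all"
    # Made-up tokens standing in for SaaS verification records.
    tokens = [f"hypothetical-site-verification={i:040d}" for i in range(30)]
    print("SPF only:", txt_response_size("example.com", [spf]))
    print("with tokens:", txt_response_size("example.com", [spf] + tokens))
    print("movable records:", len(non_spf_records([spf] + tokens)))
```

Because a TXT query returns every TXT record at the name, each verification token inflates the response that an SPF check has to fetch. Moving tokens to subdomains shrinks that response without touching the SPF policy itself.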
The "Fix" Nobody Will Use
Perhaps the most discouraging finding in the report is the existence of a superior technological solution that the industry refuses to adopt. Ed25519-SHA256 is a modern signing algorithm that solves the key-size problem entirely. It offers higher security than RSA, and the public keys are small enough to fit within a single TXT record without the need for splitting or TCP fallback.
Despite being recommended by Germany’s BSI, the researchers could not find a single major email provider using it. Worse, when they tested whether Exchange Online or Google Mail would accept an Ed25519 signature, both rejected it outright with syntax errors. A robust, standardized, and superior solution is effectively locked out of the ecosystem by a lack of support from major players.
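The size advantage is easy to see in a small sketch. An Ed25519 public key is 32 bytes, and DKIM publishes it base64-encoded in the `p=` tag with `k=ed25519`. The function name below is hypothetical and the all-zero key is a placeholder for illustration only, not a usable key.

```python
import base64

ED25519_PUBLIC_KEY_BYTES = 32
TXT_STRING_LIMIT = 255   # maximum length of a single TXT character-string


def dkim_ed25519_txt(public_key):
    """Format a DKIM TXT value for a raw 32-byte Ed25519 public key."""
    if len(public_key) != ED25519_PUBLIC_KEY_BYTES:
        raise ValueError("Ed25519 public keys are exactly 32 bytes")
    encoded = base64.b64encode(public_key).decode("ascii")
    return "v=DKIM1; k=ed25519; p=" + encoded


if __name__ == "__main__":
    placeholder_key = bytes(ED25519_PUBLIC_KEY_BYTES)  # NOT a real key
    record = dkim_ed25519_txt(placeholder_key)
    print(len(record), "bytes;",
          "fits in one TXT string:", len(record) <= TXT_STRING_LIMIT)
```

The whole record is well under the 255-byte limit of a single TXT string, whereas a 2048-bit RSA key alone needs roughly 390 base64 characters and must be split across two strings.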
Implications for Enterprise Deliverability
The timing of this research is critical. Google, Yahoo, and Microsoft have transitioned from treating email authentication as a recommendation to treating it as a hard requirement. While this shift was intended to make the internet safer, the research demonstrates that strict enforcement has inadvertently exposed the fragility of the underlying DNS plumbing.
For large enterprises with complex, sprawling DNS zones and strict p=reject policies, the findings are a wake-up call. Doing everything "by the book" no longer guarantees delivery. Instead, it exposes the organization to failures that are invisible to standard monitoring tools.
The contribution of Hager and Petrlic’s work is that it turns a vague, frustrating "black box" problem into a measurable checklist. Deliverability teams now have a roadmap:
- Audit DNS response sizes: Ensure the DNS responses carrying your DKIM keys and SPF records stay within the 512-byte UDP limit whenever possible.
- Prioritize DNS hygiene: Remove dead TXT records and move verification tokens off the root domain.
- Monitor DNS host behavior: Be aware that your choice of DNS provider significantly impacts your ability to communicate with Microsoft-hosted domains.
- Demand better support: Pressure major mailbox providers to implement full support for modern standards like EDNS and, eventually, Ed25519.
As the industry moves toward even stricter enforcement of DMARC, the stability of the DNS ecosystem will become the single most important factor in email reliability. For now, the "ghost in the machine" has been named, and the path to fixing it is clear—provided the industry is willing to address the infrastructure it has ignored for far too long.
