TRUSTYEIGHT / msft deliverability intelligence
verified 21 may 2026
// canonical reference · v2.5

Microsoft 365 corporate inbox deliverability — 2026.

Eighty-three research files. Two hundred ten thousand words. Every claim source-tagged against Microsoft Learn, M3AAWG, USPTO patents, academic literature, and operator forums. This is what's actually verified.

corpus: ~210k words
files: 83
sources: 450+
refreshed: 21 may 2026
49% → 27% · outlook inbox placement YoY (Q1 2025 → Q1 2026)
3–5 · emails/inbox/day equilibrium (durable reputation)
21–42d · M365 new-tenant time to neutral reputation
60–90d · to durable reputation (vs Gmail's 14–21d)
14 · LLM threat classes (incl. "Contact establishment")
7 · BCL Junk threshold (default policy)
§01

Executive summary

Microsoft's email deliverability in 2026 is not the system you learned in 2024. The LLM content analysis layer codified in Defender for Office 365 in April 2026 is the dominant change — and it has zero header surface, no tenant-side bypass for high-confidence phishing verdicts, and a "Contact establishment" threat class that is a literal description of cold outreach.

The honest one-paragraph synthesis: authentic peer-level correspondence wins both the LLM filter and the human recipient simultaneously. There is no LLM-evasion writing style separate from authentic peer writing. The same signals that flag content as cold-outreach archetype also cause executives to delete the message. Engineer the writing, not the evasion.

tl;dr
  1. Plain text only. HTML triggers the SCL-9 MarkAsSpam*InHtml family + content fingerprinting.
  2. OAuth XOAUTH2, never SMTP basic auth. Basic auth has been dead at M365 since March 1, 2026.
  3. 3–5 sends/inbox/day equilibrium. 1/day starves Mailbox Intelligence; 30+/day burns reputation.
  4. The First-Contact Safety Tip is per-mailbox. A reply at Tenant Y's mailbox A does not pre-warm the banner at mailbox B.
  5. Cohort structure triggers snowshoe detection at any volume. Per-node volume is not the lever — registrar/NS/SPF/DKIM/timing diversity is.
  6. No tenant-side allow bypasses HPHSH. Not TABL. Not Safe Senders. Not transport rules. Not IPV:CAL. Only Advanced Delivery (SecOps phish-sim, unusable for senders).
  7. F5000 EU corporate is Microsoft-dominant (32–48% per country); zero of nine named targets sit behind Mimecast/Proofpoint. The defensive map is not the one most senders assume.
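
For item 2 above, the client response that SASL XOAUTH2 expects is simple to construct. The following is a minimal sketch, assuming an access token has already been obtained elsewhere; the address and token are hypothetical placeholders, and token acquisition is out of scope.

```python
import base64


def xoauth2_string(user: str, access_token: str) -> str:
    """Build the base64-encoded SASL XOAUTH2 initial client response.

    The raw form is: user=<user> ^A auth=Bearer <token> ^A ^A
    where ^A is the byte 0x01.
    """
    raw = f"user={user}\x01auth=Bearer {access_token}\x01\x01"
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


# Hypothetical values, for illustration only.
encoded = xoauth2_string("sender@example.com", "TOKEN")
print(encoded)
```

The encoded string is what a client would hand to the server after `AUTH XOAUTH2`; how the token is issued and refreshed depends on the tenant's app registration.
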
§02

The EOP / Defender filtering pipeline

verified: microsoft learn, april 2026 refresh
   ┌──────────────────────────────────────────────────────────────────────────┐
   │  PHASE 1 — EDGE                                                          │
   │  network throttle  →  IP reputation  →  domain reputation  →  DBEB       │
   │      ↓                                                                   │
   │  backscatter filter  →  connection filter (IPV:CAL skip)                 │
   └──────────────────────────────────────────────────────────────────────────┘
                                       ↓
   ┌──────────────────────────────────────────────────────────────────────────┐
   │  PHASE 2 — SENDER INTELLIGENCE                                           │
   │  SPF  →  DKIM  →  DMARC  →  ARC  →  composite auth (compauth=X reason=N) │
   │      ↓                                                                   │
   │  spoof intelligence  →  BCL stamp  →  impersonation (UIMP/DIMP/GIMP)     │
   └──────────────────────────────────────────────────────────────────────────┘
                                       ↓
   ┌──────────────────────────────────────────────────────────────────────────┐
   │  PHASE 3 — CONTENT                                                       │
   │  transport rules  →  anti-malware  →  attachment reputation              │
   │      ↓                                                                   │
   │  heuristic clustering  →  ML phish  →  URL reputation                    │
   │      ↓                                                                   │
   │  ★ LLM content analysis (2026 new — Phi-class fine-tune, 14 classes) ★   │
   │      ↓                                                                   │
   │  safe attachments sandbox  →  URL detonation                             │
   └──────────────────────────────────────────────────────────────────────────┘
                                       ↓
   ┌──────────────────────────────────────────────────────────────────────────┐
   │  PHASE 4 — POST-DELIVERY                                                 │
   │  safe links  →  ZAP-phish  →  ZAP-malware  →  ZAP-spam (48h window)      │
   │      ↓                                                                   │
   │  campaign views  →  AIR (P2 only) cluster purge                          │
   └──────────────────────────────────────────────────────────────────────────┘

Verdict precedence (only one wins, in order): Malware → HC-Phish → Phish → HC-Spam → Spoof → UIMP → DIMP → GIMP → Spam → Bulk.
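
The precedence rule above amounts to "first match in a fixed order wins". A minimal sketch, assuming the verdict labels are exactly the strings listed in the sentence above (the function name is illustrative, not a Microsoft API):

```python
# Order copied from the precedence line above; earlier entries win.
PRECEDENCE = [
    "Malware", "HC-Phish", "Phish", "HC-Spam", "Spoof",
    "UIMP", "DIMP", "GIMP", "Spam", "Bulk",
]


def winning_verdict(candidates):
    """Return the highest-precedence verdict present, or None if none apply."""
    for verdict in PRECEDENCE:
        if verdict in candidates:
            return verdict
    return None


print(winning_verdict({"Bulk", "Spam", "Spoof"}))
```
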

2026 changes that matter
  • Jan 6 2026 — MERR (2,000/mailbox/day) canceled. TERRL formula (500 × licenses^0.7 + 9500) is the only outbound limit. verified
  • Jan 23 2026 — Mimecast MVTP (Multi-Vector Threat Protection) GA. New correlation gate for B2B senders. verified
  • Feb 20–27 2026 — "Week Microsoft Broke Email" — S3150 permanent 550 rejections against SNDS-green IPs. PFA model misconfiguration. No retroactive remediation. verified
  • Mar 1 2026 — SMTP basic auth final deprecation. OAuth-only. verified
  • Apr 13 2026 — LLM content analysis codified as a first-class detection technology in Microsoft Learn. verified
  • Apr 22 2026 — MC1279093 Promotions folder public preview. verified
  • Jul 2026 (planned) — Promotions folder GA. "Bulk moves enabled" toggle confirmed DEFAULT OFF. verified · low risk
  • Jul 1 2026 — *.h-v1.mx.microsoft DANE+DNSSEC infrastructure becomes default for new Accepted Domains. verified · recipient-side
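
The TERRL formula quoted in the Jan 6 entry can be turned into a quick calculator. This is a sketch under stated assumptions: rounding to the nearest whole recipient is an illustrative choice, and the fixed 5,000/day trial figure is taken from the glossary entry for TERRL.

```python
def terrl_daily_limit(licenses: int, trial: bool = False) -> int:
    """Tenant external recipient limit per day: 500 * licenses^0.7 + 9500."""
    if licenses < 0:
        raise ValueError("licenses must be non-negative")
    if trial:
        return 5000  # fixed figure for trial tenants, per the glossary
    # Rounding is an assumption made for readability of the result.
    return round(500 * licenses ** 0.7 + 9500)


for n in (1, 10, 100):
    print(n, terrl_daily_limit(n))
```

Because the exponent is below 1, the limit grows sub-linearly: adding licenses raises the tenant ceiling, but each extra license adds less than the one before.
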
§03

The verdict system

BCL · SCL · CAT · SFV · OFR · IPV
BCL — Bulk Complaint Level
BCL | signal
0 | not bulk
1–3 | bulk, few complaints
4–7 | mixed complaints
8–9 | high complaints
preset | threshold | action
default | 7 | Junk
standard | 6 | Junk
strict | 5 | Quarantine
stamped in X-Microsoft-Antispam · principal input: complaint rate (Microsoft's only direct admission of weighting)
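
The preset thresholds above can be expressed as a small lookup. A minimal sketch, assuming the action fires when the stamped BCL is greater than or equal to the threshold; the function and dictionary names are illustrative.

```python
# (threshold, action) per preset, copied from the table above.
BULK_POLICY = {
    "default": (7, "Junk"),
    "standard": (6, "Junk"),
    "strict": (5, "Quarantine"),
}


def bulk_action(bcl: int, preset: str = "default"):
    """Return the bulk action for a BCL value, or None if below threshold."""
    threshold, action = BULK_POLICY[preset]
    # Assumption: the comparison is inclusive (BCL >= threshold).
    return action if bcl >= threshold else None


print(bulk_action(6, "default"), bulk_action(6, "standard"), bulk_action(6, "strict"))
```
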
SCL — Spam Confidence Level
SCL | meaning | action
−1 | skipped (internal/allowed) | deliver
0 | not spam | deliver
1 | not spam (scored clean) | deliver
5 | suspected spam | Junk / Quarantine (strict)
6 | spam | Junk / Quarantine (strict)
7–9 | high-confidence spam | Junk / Quarantine (std)
2/3/4 are never stamped by the filter (reserved for transport-rule injection)
CAT — Verdict categories (17)
AMP BIMP BULK DIMP FTBP GIMP HPHSH HSPM INTOS MALW OSPM PHSH SAP SPM SPOOF UIMP NONE
no CAT:LLM exists. LLM verdicts collapse to HPHSH / PHSH / SPM / HSPM on the wire. portal-only surface.
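
Since SCL, CAT, SFV and IPV all travel in one semicolon-delimited header, a small parser makes seed-mailbox results easier to compare. A minimal sketch; the sample header value below is hypothetical and only illustrates the `KEY:value;` layout.

```python
def parse_antispam_report(value: str) -> dict:
    """Split an X-Forefront-Antispam-Report value into a {field: value} dict."""
    fields = {}
    for part in value.split(";"):
        # Partition on the first colon only, so values may contain colons.
        key, sep, val = part.strip().partition(":")
        if sep and key:
            fields[key] = val
    return fields


# Hypothetical header value for illustration.
sample = "CIP:203.0.113.7;CTRY:DE;LANG:en;SCL:5;SRV:;IPV:NLI;SFV:SPM;CAT:SPM;DIR:INB;"
report = parse_antispam_report(sample)
print(report["SCL"], report["CAT"], report["SFV"])
```

As the note above says, a content-analysis hit would still show up here only as one of the legacy CAT values, so this parser cannot tell which detection technology produced the verdict.
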

Compauth reason codes (documented, May 2026)

code | compauth value | meaning | cold-sender context
000 | fail | explicit auth fail (SPF + DKIM fail) | broken auth — fix immediately
001 | fail | implicit fail (no DMARC published) | publish DMARC even at p=none
002 | fail | TABL block by recipient admin | recipient tenant blocked you
011 | fail | intra-org spoof via implicit auth | rare for cold senders
100–102 | pass | aligned pass (SPF/DKIM/DMARC) | the goal
109 | pass | bestguesspass (no DMARC, but SPF/DKIM aligned) | works, but p=quarantine upgrades
130 | pass | ARC trusted-sealer override of DMARC fail | only with admin-configured trust
6xx | fail | intra-org spoof scenarios | shouldn't occur for cold senders
7xx | softpass | partial alignment | recipient-config dependent
905 | none | composite auth not computed | rare edge case
8xx | | reserved · no operator sightings | likely future ML pass codes
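
To apply the table when reading seed-mailbox headers, the `compauth=X reason=NNN` pair can be pulled out of Authentication-Results with a regular expression. A minimal sketch; the descriptions simply mirror the table above, and the sample header is hypothetical.

```python
import re

COMPAUTH_RE = re.compile(r"compauth=(\w+)\s+reason=(\d{3})")

# Exact codes and code families, transcribed from the table above.
EXACT = {
    "000": "explicit auth fail",
    "001": "implicit fail (no DMARC published)",
    "002": "TABL block by recipient admin",
    "011": "intra-org spoof via implicit auth",
    "100": "aligned pass", "101": "aligned pass", "102": "aligned pass",
    "109": "bestguesspass",
    "130": "ARC trusted-sealer override",
    "905": "composite auth not computed",
}
FAMILY = {"6": "intra-org spoof scenarios", "7": "partial alignment", "8": "reserved"}


def parse_compauth(auth_results: str):
    """Return (value, reason) from an Authentication-Results string, or None."""
    match = COMPAUTH_RE.search(auth_results)
    return (match.group(1), match.group(2)) if match else None


def describe_reason(code: str) -> str:
    """Map a three-digit reason code to the table's description."""
    return EXACT.get(code) or FAMILY.get(code[0], "not in the table")


# Hypothetical header fragment for illustration.
header = "spf=pass smtp.mailfrom=example.com; dkim=pass; dmarc=pass;compauth=pass reason=100"
value, reason = parse_compauth(header)
print(value, reason, describe_reason(reason))
```
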
§04

Authentication layer

SPF · DKIM · DMARC · ARC · BIMI · MTA-STS · DANE · TLS 1.3
SPF

10-lookup limit (RFC 7208). Flattening required at scale.

Microsoft evaluates SPF then computes alignment to 5322.From. SPF alone passes compauth if domain-aligned.

survives forwarding: no
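
A rough way to see how close a record sits to the 10-lookup limit is to count the DNS-querying terms in it. This is a sketch with a stated limitation: it counts only the terms in the single record passed in and does not follow `include:` or `redirect=` targets, whose own lookups also count toward the limit. The record in the example is hypothetical.

```python
import re

# Mechanisms that cost a DNS lookup under RFC 7208 (plus the redirect modifier).
LOOKUP_MECHANISMS = ("include", "a", "mx", "ptr", "exists")


def count_spf_lookups(record: str) -> int:
    """Count lookup-consuming terms in one SPF record (top level only)."""
    count = 0
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        term = term.lstrip("+-~?").lower()
        if term.startswith("redirect="):
            count += 1
            continue
        name = re.split(r"[:/]", term, maxsplit=1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count


record = "v=spf1 include:a.example include:b.example mx a/24 ip4:192.0.2.0/24 -all"
print(count_spf_lookups(record))
```
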
DKIM

Default tenant: selector1 / selector2. CNAME pattern <tenant>.{a,r}-v1.dkim.mail.microsoft (regionally randomized).

2048-bit recommended. Rotation overlap = 4-day hard window.

Multiple signatures: any-passes model per RFC 6376. DMARC passes if any aligned DKIM signature passes.

survives forwarding: yes
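
The any-passes model described above reduces to a single `any()` over the signatures. A minimal sketch, assuming strict alignment (exact match between the `d=` domain and the 5322.From domain); relaxed alignment needs organizational-domain lookup against the Public Suffix List and is deliberately left out. The `DkimResult` type is illustrative.

```python
from dataclasses import dataclass


@dataclass
class DkimResult:
    domain: str   # the d= tag of the signature
    passed: bool  # whether the signature verified


def dmarc_dkim_pass(results, from_domain: str) -> bool:
    """True if at least one verified signature is strictly aligned."""
    target = from_domain.lower()
    return any(r.passed and r.domain.lower() == target for r in results)


signatures = [
    DkimResult("example.com", False),   # aligned but failed
    DkimResult("esp.example.net", True),  # passed but not aligned
    DkimResult("example.com", True),    # aligned and passed
]
print(dmarc_dkim_pass(signatures, "example.com"))
```
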
DMARC

Microsoft treats p=none as a weaker signal: even aligned SPF+DKIM mail routinely gets compauth=fail reason=001.

p=quarantine measurably improves placement for cold senders.

M365 generates daily aggregate reports (rua) from dmarceng@microsoft.com. No forensic (ruf) ever sent. Reports only generated when recipient MX is direct-to-M365.
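
Aggregate (rua) reports are XML, so a few lines of standard-library code are enough to total message counts by outcome. A minimal sketch, assuming the common un-namespaced `feedback/record/row` layout; reports that declare an XML namespace would need the tag names adjusted. The sample report is fabricated for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SAMPLE = """<feedback>
  <record><row><source_ip>192.0.2.1</source_ip><count>12</count>
    <policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>pass</spf></policy_evaluated>
  </row></record>
  <record><row><source_ip>192.0.2.2</source_ip><count>3</count>
    <policy_evaluated><disposition>quarantine</disposition><dkim>fail</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
</feedback>"""


def summarize_rua(xml_text: str) -> Counter:
    """Total message counts keyed by (disposition, dkim, spf)."""
    totals = Counter()
    for record in ET.fromstring(xml_text).iter("record"):
        row = record.find("row")
        evaluated = row.find("policy_evaluated")
        key = (
            evaluated.findtext("disposition"),
            evaluated.findtext("dkim"),
            evaluated.findtext("spf"),
        )
        totals[key] += int(row.findtext("count", "0"))
    return totals


print(summarize_rua(SAMPLE))
```
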

ARC

No Microsoft-curated global trusted-sealer list. No application process. Per-tenant configuration only via Set-ArcConfig -ArcTrustedSealers.

Override fires when M365 writes arc=pass oda=1 + compauth=pass reason=130. Defender retains spoof/impersonation veto on top.

MTA-STS

Microsoft publishes their own at mta-sts.microsoft.com:
version: STSv1 / mode: enforce / mx: *.mail.protection.outlook.com / max_age: 604800

New Set-OutboundConnector -MtaStsMode (Feb–Mar 2026 rollout): Opportunistic (default) or None. No Mandatory mode for outbound.

NDRs: 5.4.8 (MX mismatch), 5.7.5 (cert validation).
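
The policy fields listed above follow the RFC 8461 `key: value` file format, one pair per line. The sketch below parses such a file and checks an MX host against its `mx` patterns; the policy text reuses the values quoted above, and the host names are hypothetical.

```python
def parse_mta_sts(policy_text: str) -> dict:
    """Parse an MTA-STS policy file; 'mx' may repeat, so it is a list."""
    policy = {"mx": []}
    for line in policy_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "mx":
            policy["mx"].append(value)
        else:
            policy[key] = value
    return policy


def mx_allowed(host: str, patterns) -> bool:
    """Match a host against mx patterns; '*.' covers one leftmost label only."""
    host = host.lower().rstrip(".")
    for pattern in patterns:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            _, _, parent = host.partition(".")
            if parent == pattern[2:]:
                return True
        elif host == pattern:
            return True
    return False


text = "version: STSv1\nmode: enforce\nmx: *.mail.protection.outlook.com\nmax_age: 604800\n"
policy = parse_mta_sts(text)
print(policy["mode"], mx_allowed("contoso-com.mail.protection.outlook.com", policy["mx"]))
```
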

DANE

TLSA record requirements: Certificate Usage 3 (DANE-EE), Selector 1 (SPKI), Matching Type 1 (SHA-256). Usage 0/1 explicitly unusable.

DNSSEC mandatory prerequisite.

NDR codes: 5.7.321 no STARTTLS · 5.7.322 cert expired · 5.7.323 TLSA mismatch · 5.7.324 DNSSEC invalid.
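
The TLSA requirements above (usage 3, selector 1, matching type 1) can be checked mechanically on the record data. A minimal sketch that validates only the three parameters and the digest shape; it does not query DNS, validate DNSSEC, or compare the digest against a live certificate.

```python
import string


def tlsa_meets_requirements(rdata: str) -> bool:
    """Check TLSA rdata of the form '<usage> <selector> <mtype> <hex digest>'."""
    parts = rdata.split()
    if len(parts) < 4:
        return False
    usage, selector, mtype = parts[:3]
    # Zone files may split the digest across whitespace; join it back.
    digest = "".join(parts[3:])
    is_hex = all(c in string.hexdigits for c in digest)
    # SHA-256 is 32 bytes, so the hex digest must be 64 characters.
    return (usage, selector, mtype) == ("3", "1", "1") and len(digest) == 64 and is_hex


print(tlsa_meets_requirements("3 1 1 " + "ab" * 32))
```
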

TLS 1.3

EXO supports TLS 1.3 for all email submissions and server-to-server (per Purview encryption ref, May 14 2026 update). Supported, not mandatory.

POP3/IMAP4 TLS 1.0/1.1 cutoff: July 2026 (MC1293480). SMTP AUTH TLS 1.0/1.1 phasing out separately Sep 2025 → 2026.

BIMI

Strict DMARC enforcement required (p=quarantine or p=reject).

Verified Mark Certificate (VMC) ~$1.5K/domain/year.

Misaligned for cold outreach — VMC logo signals "broadcast marketing," destroys the 1:1 illusion. Phase 1 verdict: skip.

§05

LLM content analysis — 2026's biggest variable

codified in microsoft learn 2026-04-13 · zero header surface

The single most important deliverability change in 2026. Microsoft has deployed a purpose-built large language model as a first-class detection technology inside the EOP/Defender content-filter stack. It runs pre-delivery, synchronously, in parallel with content heuristics.

model identity (inferred ~85% confidence)
family: Phi (Microsoft SLM)
size: ~3.8B params
candidate: Phi-4-mini-flash-reasoning
architecture: decoder-hybrid + SSM + GMU
fine-tune: SFT + DPO on MS BEC corpus
quantization: FP8/INT8
languages: 23 (Phi-3.5/4-mini standard)
claimed accuracy: 99.995% (MS internal benchmark)
14 threat classes (the cold-outreach killer)
  • Payroll fraud
  • Invoice fraud
  • Gift cards
  • Contact establishment ★
  • Business intelligence
  • PII gathering
  • Task fraud
  • Phishing (generic)
  • Brand impersonation
  • BEC generic
  • Lateral movement
  • Credential phishing
  • Malware delivery
  • Reconnaissance
"Contact establishment" is a literal description of cold outreach. This is the existential 2026 threat.
on-wire surface

Zero. No CAT:LLM, no CAT:BEC, no ContentAnalysis-Verdict header. LLM hits collapse to legacy CAT:HPHSH / PHSH / SPM / HSPM.

Verdict is portal-only:

  • Email Entity Page → "Detection technology: LLM content analysis"
  • Threat Explorer
  • Advanced Hunting EmailEvents.DetectionMethods

Bounce-header forensics is useless for diagnosing LLM-driven blocks.

tenant-side bypass

None.

  • TABL Allow — does not bypass HPHSH
  • IPV:CAL Connection Filter — still scans malware + HPHSH
  • Safe Senders (SFV:SFE) — not HPHSH
  • SCL=-1 transport rule (SFV:SKN) — not HPHSH
  • Anti-spam policy allow — not HPHSH
  • Advanced Delivery Policy — only path, locked to SecOps + phish-sim

Strategy must be "score clean," never "dodge."

the headline insight (P2-9)
Authentic vs promotional is structural, not stylistic. The same surface signals trigger Microsoft's Contact-establishment classifier AND cause human recipients to recognize template outreach. The defensive copy that scores LLM-clean is the same copy that gets peer replies. There is no LLM-evasion writing style separate from authentic peer writing.
§06

Snowshoe + cohort detection

structure binds, not per-node volume

Spamhaus CSS exists specifically to list senders whose volume is too low for volume-based detection. Verbatim from Spamhaus: "modest volumes of email that do not trigger automated spam blocking filters." Lowering volume per mailbox does not avoid a snowshoe listing, because detection keys on cohort structure rather than volume.

Detector evidence (academic + vendor)

source | year | features used | min volume needed
van der Toorn (NOMS Best Paper) | 2018 | active DNS only — NS, SOA, MX, TXT | 0 — pre-send
Hao / Feamster (NDSS) | 2010 | BGP/ASN/AS cluster features | 0
PREDATOR (Hao et al. CCS) | 2016 | registration-time info only | 0
Spamhaus CSS | 2009→ | infrastructure correlation | "low volume"
Spooren et al. | 2022 | single DNS query | 0
Cisco Talos | 2018→ | registrant-email reuse | 0

The 19-variable cohort-defense matrix

  • registrar — mix 5+ (Porkbun, Namecheap, Dynadot, Hetzner, OVH)
  • registration date — stagger 4–8 months, never bulk-buy
  • registrant email — unique per domain
  • WHOIS privacy — mixed (some private, some not)
  • NS provider — mixed (Cloudflare, Hetzner, Route 53, DNSimple)
  • DKIM key — unique per tenant, 2048-bit
  • DKIM rotation timing — staggered
  • SPF include chain — structurally different
  • TLS cert issuer — mixed (Let's Encrypt, ZeroSSL, Sectigo)
  • TLS cert SAN — one cert per domain (no bundles)
  • Azure subscription parent — different per tenant
  • tenant region — varied where business reasons allow
  • admin email pattern — unique per tenant
  • tenant naming — no shared lexical prefix/suffix
  • mailbox password — unique per tenant (not single shared value)
  • sender display names — non-overlapping across tenants
  • deploy step timing — paced, not batched
  • HELO / EHLO banner — Microsoft-managed (uniform)
  • SMTP submission patterns — randomized intervals
trigger probabilities at 1,650 inboxes × 1/day with shared NS/registrar/SPF
CSS listing of at least one cohort IP within 90 days | ~0.95
CSS-driven cohort-wide degradation | ~0.85
M365 EOP cohort SCL elevation | ~0.80
Structure-only detection independent of any volume signal | ~0.60
§07

The operator playbook

cadence · copy · monitoring · what to measure

Cadence equilibrium (Phase 2 P2-3R)

per-inbox/day | regime | reputation outcome | verdict
1/day | under-floor | starves Mailbox Intelligence pair-graph; mailbox stays "unknown sender" indefinitely | avoid
3–5/day | equilibrium | reputation maturation + TAM utilization balance | recommended
8–12/day | warm steady-state | burns TAM in months, durable when paired with quality | scaling
30/day | BHW 2026 ceiling | at the operator-community max | aggressive
50+/day | vendor max | reputation damage territory per multiple vendor sources | burn risk

English peer-tone copy frameworks (Phase 2 P2-9)

Common requirement: every framework opens with a verifiable named-by-name specific in sentence 1–2.

FW-EN-1 · annual report anchor
"[Firstname] —

Saw the Q1 [filing/report] — congrats on [specific metric].

Curious how [observation tied to filing] is playing into [their stated priority].

Worth a conversation?

— [your firstname]"
FW-EN-2 · regulatory timeline hook
"[Firstname] —

[Specific regulation] hits [date]. Most teams are figuring out [angle].

Thinking [specific reasoning about their org's exposure].

How are you approaching it?

— [your firstname]"
FW-EN-3 · numbers / sector comparison
"[Firstname] —

[Specific public number] at [their company] vs [sector comparison] is interesting.

The gap usually means [specific operational reasoning].

Is that solved already, or on the roadmap?

— [your firstname]"
FW-EN-4 · executive hire / corp dev signal
"[Firstname] —

Saw [new VP/Director hire on LinkedIn].

Usually means [specific operational shift]. Curious if that's the read internally.

Worth a brief conversation?

— [your firstname]"

Word count by persona

persona | words | tone | CTA archetype
CEO / Founder | 50–80 | peer, specific, no fluff | reasoning question
CMO | 60–100 | peer, business outcome anchored | reasoning question or unilateral give
VP Sales / CRO | 70–120 | peer, ramp/pipeline metric anchored | "Worth a conversation?"
CTO / CISO | 50–90 | POV-first, NOT pitch | "Curious how your team's handling X" — perspective ask

Anti-pattern catalog (avoid)

"Quick question about [X]"
"Are you the right person for [X]?"
"Following up on my last email"
"15 minutes Thursday?"
"I noticed you're the [title]"
"Would love to connect"
"Wanted to reach out about..."
"Hope this finds you well"
"Saw your team is..."
"Just checking in"
Fake "Re:" / "Fwd:" subject prefixes
In-Reply-To threading on T2
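
The word-count ranges and the phrase catalog above lend themselves to a simple pre-send lint. A minimal sketch, assuming whitespace-delimited word counting and case-insensitive substring matching; only the fixed-phrase entries from the catalog are included, and the function name is illustrative.

```python
# Ranges copied from the "Word count by persona" table.
WORD_RANGES = {
    "CEO / Founder": (50, 80),
    "CMO": (60, 100),
    "VP Sales / CRO": (70, 120),
    "CTO / CISO": (50, 90),
}

# Fixed phrases from the anti-pattern catalog (placeholders like [X] omitted).
ANTI_PATTERNS = [
    "quick question about",
    "are you the right person",
    "following up on my last email",
    "would love to connect",
    "wanted to reach out",
    "hope this finds you well",
    "just checking in",
]


def lint_draft(body: str, persona: str) -> list:
    """Return a list of issues found in a draft; an empty list means clean."""
    issues = []
    low, high = WORD_RANGES[persona]
    words = len(body.split())
    if not low <= words <= high:
        issues.append(f"word count {words} outside {low}-{high}")
    lowered = body.lower()
    issues += [f"anti-pattern: {p}" for p in ANTI_PATTERNS if p in lowered]
    return issues


draft = "Hope this finds you well. " + "word " * 60
print(lint_draft(draft, "CEO / Founder"))
```

A phrase lint only catches the surface patterns; it says nothing about whether the opening is anchored to a verifiable specific, which the frameworks above treat as the actual requirement.
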

2-touch sequence shape

touch | day | shape | constraints
T1 | day 0 | opening anchored to specific verifiable referent | plain text · <120 words · reasoning question
T2 | day 7 | standalone — substantively new — NOT "following up" | new subject · no In-Reply-To threading · references DIFFERENT public signal
 | day 365 | no further touches for 12 months on no-reply | suppression list discipline
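
The sequence shape above maps to three dates per contact. A minimal sketch, assuming "day 365" marks the earliest date a non-replying contact leaves the suppression list; the dictionary keys are illustrative names.

```python
from datetime import date, timedelta


def touch_schedule(t1: date) -> dict:
    """Return the T1/T2 send dates and the earliest re-contact date."""
    return {
        "T1": t1,
        "T2": t1 + timedelta(days=7),
        # No further touches before this date if there was no reply.
        "earliest_recontact": t1 + timedelta(days=365),
    }


schedule = touch_schedule(date(2026, 1, 5))
for name, when in schedule.items():
    print(name, when.isoformat())
```
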

Monitoring stack

tool | cost | what it shows
Microsoft SNDS | free | per-IP reputation in Microsoft's outbound pool — opaque on shared pool but register IPs anyway
Microsoft Postmaster Tools | free | per-domain reputation (Outlook.com side)
Google Postmaster Tools | free | per-domain at Gmail — complementary even if M365-focused
DMARC aggregator | free–$200/mo | Dmarcian / EasyDMARC / PowerDMARC — parses rua XML
mxtoolbox.com Standard | $129/mo | DNS drift + blocklist monitoring across ~80 RBLs
Owned seed mailboxes | $6/seat/mo × 5–10 seats | actual placement at real industry tenants — the ONLY ground truth
§08

F5000 EU target stack — MX forensics

verified · may 2026

Pulled MX records + cross-referenced AppsRunTheWorld / 6sense / Microsoft customer stories / vendor case studies. Zero of nine named targets use Mimecast, Proofpoint, or Barracuda — the SEG distribution at this specific Fortune-5000-EU subset differs sharply from market averages.

target | stack | filter layer | confidence
Mercedes-Benz Group | self-hosted MTA (corpinter.net) → M365 + Defender | EOP behind on-prem bridge · no third-party SEG visible | HIGH
BMW Group | Cisco IronPort (hc324-48.eu.iphmx.com) | SBRS + CASE + AsyncOS 16+ | VERY HIGH
DHL / Deutsche Post | Fortinet FortiMail Cloud | FortiGuard Antispam + SRR scoring | VERY HIGH
Siemens AG | M365 + Defender P2 + DANE (h-v1.mx.microsoft) | full Defender E5 | HIGH
Siemens Energy | M365 + Defender P1 (vanilla EOP) | standard EOP | MED-HIGH
Roche Holding | Google Workspace (gene.com MTAs front Gmail) | SpamBrain · not Microsoft at all | HIGH
thyssenkrupp | M365 + Defender P1/P2 mix | standard EOP, P2 at HQ | MED-HIGH
UCB SA | M365 + Defender P2 | full Defender E5 | HIGH
Ferring Pharmaceuticals | M365 + Defender P1 | standard EOP, sp=none soft spot | MED
EU SEG market share — by country (Phase 2 P2-4-2)
country | Defender | Mimecast | Proofpoint | local anchor
UK | high-end | 22–28% | mid | Egress/KnowBe4, Clearswift
France | mid | low | low | Vade + Hornetsecurity 18–24%
Germany | high | med | pharma heavy | Hornetsecurity, Retarus, NoSpamProxy, SEPPmail
Austria | mid | low | low | Hornetsecurity 15–22%
Switzerland | mid | low | mid (pharma) | SEPPmail (FINMA banking)
Italy | mid | low | banking | Libraesva
Nordics | high | low | low | WithSecure, Heimdal
Spain | high | low | low | local channel
§09

Operator claims — verification table

cross-referenced against verified research

Extracted from 3,181 tweets across four operators (Liam Sheridan, OutboundBandit, Termsheetinator, Gat0rtheskater). 627 deliverability-substantive. 176 with quantitative claims. Each claim below carries a verdict tag and a cross-check:

claim | source | verdict | cross-check
"$4.50/mo per tenant, 10 tenants = $40.50 = 5K/day send" | @termsheetinator | verified-match | matches user's config.yaml exactly
"<2% bounce + 1–2% OOO = healthy" | @OutboundBandit | verified | Phase 1 P1-32 confirms
"cold email 1% / LinkedIn 10% / cold calls 5–10%" | @iamliamsheridan | verified | Salesloft + Phase 2 P2-6 lower band
"Question-open lift: 2.7% → 6.2%" | @OutboundBandit | verified | P2-9 CTA hierarchy confirms reasoning-question > pitch
"Conversion: 2% × 30% × 50% = 0.3% booked" | @termsheetinator | verified | math + Instantly 2026 benchmark
"Google = ~3 mailboxes/domain" | @termsheetinator | verified | Phase 1 + P2-6 consensus
"Microsoft = 99 mailboxes/domain" | @termsheetinator | contradicted | BHW 2026 consensus: 3–5/domain max even on M365
"50 emails/day per inbox, still great results" | @OutboundBandit | controversial | above BHW 30 ceiling; vendor max; P2-3R recommends 3–5
"2-week warmup is enough" | @termsheetinator | contradicted | P1-W: 21–42d to neutral, 60–90d to durable
"1,000,000 cold emails in 30 days" | @OutboundBandit | unverifiable | operator self-report, no triangulation
"$250K from cold email this year" | @OutboundBandit | unverifiable | operator revenue claim
"5M emails, $25M pipeline, 1000+ booked" | @iamliamsheridan | plausible math | 0.02% booked rate fits low-end of 1-4% reply × 5-20% pos × 5-30% positive-to-booked

Full verification table (verbatim quotes, all 176 quant claims) at MSFT_Deliverability/08_PHASE_2_RESEARCH/p2_twitter_operator_claims_verification.md.
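
Two of the "math" cross-checks in the table can be reproduced directly: the 2% × 30% × 50% conversion chain, and whether 1,000 bookings from 5M emails falls inside the band implied by the quoted stage ranges. A minimal sketch using only the figures stated in the table:

```python
import math


def funnel_rate(*stage_rates: float) -> float:
    """Multiply per-stage conversion rates into an end-to-end rate."""
    return math.prod(stage_rates)


# Claim: 2% x 30% x 50% = 0.3% booked.
booked = funnel_rate(0.02, 0.30, 0.50)

# Claim: 5M emails -> 1000+ booked, checked against the quoted stage ranges.
observed = 1000 / 5_000_000
low = funnel_rate(0.01, 0.05, 0.05)
high = funnel_rate(0.04, 0.20, 0.30)

print(f"{booked:.4%}", f"{observed:.4%}", low <= observed <= high)
```
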

§10

Glossary

AIR
Automated Investigation and Response. Defender P2 feature that auto-cascades quarantine across cohort mailboxes after a user-report.
ARC
Authenticated Received Chain. RFC 8617. Allows forwarders to attest auth state across hops.
BCL
Bulk Complaint Level. 0–9 scale stamped in X-Microsoft-Antispam. Threshold 7 default / 6 standard / 5 strict.
CAT
Category code in X-Forefront-Antispam-Report. 17 documented values (PHSH, SPM, BULK, MALW, etc.).
compauth
Composite authentication. Microsoft's auth verdict combining SPF/DKIM/DMARC/ARC. Header value: compauth=X reason=NNN.
DANE
DNS-based Authentication of Named Entities. RFC 7672. Authenticates SMTP TLS via DNSSEC-signed TLSA records.
DBEB
Directory-Based Edge Blocking. EOP feature that rejects mail at edge for unknown recipients.
DMARC
Domain-based Message Authentication, Reporting, and Conformance. RFC 7489.
EOP
Exchange Online Protection. Microsoft's baseline mail filter, runs ahead of Defender for Office 365.
HPHSH
High-confidence phishing category. Cannot be overridden by user-level or admin allow lists.
HRDP
High-Risk Delivery Pool. Microsoft outbound IP pool for tenants with poor sending behavior. Silent — no NDR.
JMRP
Junk Mail Reporting Program. Consumer Outlook.com feedback loop. No equivalent for commercial M365 tenants.
MERR
Mailbox External Recipient Rate Limit. 2K/mailbox/day. Canceled Jan 6 2026.
MX
Mail Exchanger DNS record. The receiving server for a domain.
NDR
Non-Delivery Report. Bounce message returned by SMTP.
SCL
Spam Confidence Level. −1, 0, 1, 5–9. Stamped in X-Forefront-Antispam-Report.
SFV
Spam Filtering Verdict. 11 documented values (NSPM, SPM, SKN, SKS, SKA, SKB, etc.).
SNDS
Smart Network Data Services. Microsoft's outbound IP reputation visibility tool. Free.
TABL
Tenant Allow/Block Lists. Admin-managed override surface in Defender. Block always beats Allow.
TERRL
Tenant External Recipient Rate Limit. Formula: 500 × licenses^0.7 + 9,500/day per tenant. Trial tenants fixed 5,000/day.
ZAP
Zero-Hour Auto Purge. Post-delivery retroactive filter. 48h window.