SMS Deliverability Monitoring & Alerting: KPI Playbook with Dashboard Templates

Introduction: treat deliverability like uptime, not a vanity metric

Most teams look at SMS deliverability once a month as a single percentage.

“Looks good, we’re around 95%.”

Meanwhile:

  • One US carrier silently starts filtering a new promo flow.
  • A high-value OTP sequence begins failing at 2 a.m.
  • A burner pool gets tired and error codes quietly climb.

By the time anyone notices, you’ve:

  • Lost 5–6 figures in revenue from abandoned checkouts or deposits.
  • Damaged trust (“I never got the code, your app is broken.”).
  • Trained carriers to treat your brand as noisy or risky.

In our work triaging hundreds of deliverability incidents, the pattern is clear: teams that treat deliverability as a site reliability (SRE) problem recover fast. Teams that treat it as a weekly vanity metric get blindsided.

This guide shows you how to:

  • Pick the right KPIs (and ignore misleading ones).
  • Slice data by carrier, sender pool, route, and campaign.
  • Build a dashboard and alerting system that catches issues early.
  • Use monitoring to improve deliverability, not just report it.

Section 1: The core SMS deliverability KPIs that actually matter

You don’t need 40 metrics. You need a small set of KPIs that map directly to incidents and recovery.

1. Delivered rate (by carrier, pool, campaign)

Definition:

  • Delivered rate = messages with positive “delivered” receipts ÷ total send attempts

Best practice:

  • Always slice by:
    • Carrier (Verizon, AT&T, T-Mobile, international operators)
    • Sender pool / grid
    • Campaign / flow (OTP, promos, transactional)
    • Country / region

What “good” looks like (US A2P, properly configured):

  • Core transactional flows: 99%+
  • High-volume promos: 98–99%+
  • Anything consistently under 97–98% needs investigation.
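As a minimal sketch of the KPI above, the following computes delivered rate sliced by carrier from a list of message records. The field names (`carrier`, `status`) are assumptions about your delivery-receipt log, not any specific provider's API:

```python
from collections import defaultdict

def delivered_rate_by_carrier(messages):
    """Delivered rate = positive 'delivered' receipts / total send attempts,
    computed per carrier. Field names are illustrative assumptions."""
    totals = defaultdict(int)
    delivered = defaultdict(int)
    for msg in messages:
        totals[msg["carrier"]] += 1
        if msg["status"] == "delivered":
            delivered[msg["carrier"]] += 1
    return {carrier: delivered[carrier] / totals[carrier] for carrier in totals}
```

The same grouping key extends naturally to pool, campaign, and country; carrier is just the first slice you should never skip.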

2. Hard-fail / error rate

Definition:

  • Percentage of messages with definitive failure codes:
    • Invalid number
    • Unknown subscriber
    • Permanent carrier rejection

Why it matters:

  • Rising hard-fails often mean:
    • Poor list hygiene.
    • Carrier-level blocking of specific senders or content.
    • A tired or burned-out number pool.

Watch for:

  • Sudden jumps on a single carrier.
  • Specific routes or pools with >1–2% persistent hard-fail rate.
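A sketch of the hard-fail watch above, flagging (carrier, pool) slices whose hard-fail rate crosses the 1–2% line. The error-code strings and record fields are hypothetical placeholders for whatever your gateway actually returns:

```python
from collections import defaultdict

# Illustrative failure codes; map these to your provider's actual codes.
HARD_FAIL_CODES = {"invalid_number", "unknown_subscriber", "carrier_rejected"}

def hard_fail_alerts(messages, threshold=0.02):
    """Return (carrier, pool) slices whose hard-fail rate exceeds `threshold`."""
    totals = defaultdict(int)
    fails = defaultdict(int)
    for m in messages:
        key = (m["carrier"], m["pool"])
        totals[key] += 1
        if m.get("error_code") in HARD_FAIL_CODES:
            fails[key] += 1
    return {k: fails[k] / totals[k] for k in totals if fails[k] / totals[k] > threshold}
```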

3. Soft-fail / retry rate

Definition:

  • Temporary failures:
    • Network issues
    • Congestion
    • Rate limiting / throttling

Why it matters:

  • High soft-fails = you’re pushing carriers too hard or hitting congested routes.
  • Shows whether your retry strategy is working or just hammering.
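The difference between a working retry strategy and "just hammering" is backoff. A minimal sketch, assuming a hypothetical `TransientError` that your sending layer raises on soft failures:

```python
import random
import time

class TransientError(Exception):
    """Soft failure: congestion, throttling, temporary network issue."""

def retry_soft_fail(send_fn, max_attempts=4, base_delay=1.0):
    """Retry transient failures with exponential backoff plus jitter,
    instead of immediately re-hitting the same congested route."""
    for attempt in range(max_attempts):
        try:
            return send_fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # exhausted: surface to the caller as a real failure
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

If your soft-fail rate stays high even with backoff in place, the problem is volume or route quality, not retry logic.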

4. Unknown / filtered / “fake delivered” indicators

Carriers don’t always give a “filtered” code. Some:

  • Return generic errors.
  • Claim “delivered” while devices get nothing (shadow filtering).

Proxies to monitor:

  • Drops in downstream behavior (clicks, logins) despite “OK” receipts.
  • Sampling tests: seed numbers on each carrier that you log separately.
  • Sudden performance drops on new campaigns while others stay stable.
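The first proxy can be automated: flag slices where receipts look fine but downstream behavior craters. This sketch assumes each slice record carries a current conversion rate and a trailing-7-day baseline (names are illustrative):

```python
def shadow_filter_suspects(slices, drop_factor=0.5):
    """Flag slices whose delivered rate looks healthy while conversions
    fell below `drop_factor` of baseline — a shadow-filtering signal."""
    suspects = []
    for s in slices:
        receipts_ok = s["delivered_rate"] >= 0.97
        converting = s["conversion_rate"] >= drop_factor * s["baseline_conversion_rate"]
        if receipts_ok and not converting:
            suspects.append(s["name"])
    return suspects
```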

5. Pool and grid health

If you use:

  • Burner Number Pools
  • Private Pool Grids
  • Or even simple dedicated numbers

…you should track, per pool/grid:

  • Delivered rate
  • Hard-fail rate
  • Complaint / opt-out rate
  • Daily messages per sender

Healthy patterns:

  • Steady performance over time.
  • No sender crossing:
    • >1% hard-fail in a 24-hour window.
    • >0.3–0.5% complaint / opt-out on promos.
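The thresholds above translate directly into a per-pool health check. A sketch, assuming a simple per-pool stats dict (`sent`, `hard_fails`, `complaints` over the window you care about):

```python
def pool_health(stats, hard_fail_max=0.01, complaint_max=0.005):
    """Classify each pool against the guideline thresholds: >1% hard-fails
    in the window, or >0.5% complaints/opt-outs, marks it unhealthy."""
    report = {}
    for pool, s in stats.items():
        hard_fail_rate = s["hard_fails"] / s["sent"]
        complaint_rate = s["complaints"] / s["sent"]
        healthy = hard_fail_rate <= hard_fail_max and complaint_rate <= complaint_max
        report[pool] = "healthy" if healthy else "unhealthy"
    return report
```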

Section 2: The “deliverability cube” and how to segment your data

A single global “delivery rate” hides everything.

You need a deliverability cube:

  • Carrier (Verizon, AT&T, T-Mobile, etc.)
  • Sender (pool, grid, individual number)
  • Route / product (gateway, region)
  • Campaign / flow (OTP, promos, transactional)
  • Content risk level (mainstream, high-risk, SHAFT)
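In code, the cube is just a multi-dimensional group-by over your DLR log. A sketch with assumed field names; rolling up to coarser slices (say, carrier × campaign only) is a matter of summing cells:

```python
from collections import defaultdict

# The five cube axes from the list above; field names are assumptions
# about your delivery-receipt log schema.
DIMENSIONS = ("carrier", "pool", "route", "campaign", "risk")

def deliverability_cube(messages):
    """Delivered rate for every fully-specified cell of the cube."""
    totals = defaultdict(int)
    delivered = defaultdict(int)
    for m in messages:
        key = tuple(m[d] for d in DIMENSIONS)
        totals[key] += 1
        delivered[key] += m["status"] == "delivered"
    return {k: delivered[k] / totals[k] for k in totals}
```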

Example slice that catches real issues

  1. Verizon × Promo × Grid A:

    • Delivered rate drops from 99.1% → 94.4% over 48 hours.
    • Hard-fails and soft-fails slightly up.
    • Other carriers are stable.
  2. Action:

    • Shift promos from Grid A to Grid B for Verizon.
    • Inspect recent content changes and velocity patterns.
    • Temporarily reduce volume to baseline + 20% while you test.

Without segmentation, you’d only see:

  • Global delivered: 97.8% → 96.9% (shrug).

With segmentation, you see:

  • One cell in the matrix is burning out while the others stay healthy.

Section 3: Alert thresholds and what to do when they fire

1. Carrier-specific delivered rate alerts

Recommended thresholds (adjust per baseline):

  • Alert if delivered rate on any major carrier:
    • Drops >2 points from 7‑day median.
    • Or falls below 97% for longer than 30–60 minutes on active traffic.
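Both trigger conditions fit in a few lines. A sketch of the alert predicate; the sustained-duration and minimum-volume checks are left to the caller, since those depend on your alerting stack:

```python
import statistics

def carrier_alert(recent_rate, last_7d_rates, floor=0.97, drop_pts=0.02):
    """True when a carrier's current delivered rate falls more than 2 points
    below its 7-day median, or drops under the 97% floor."""
    median = statistics.median(last_7d_rates)
    return recent_rate < floor or (median - recent_rate) > drop_pts
```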

Runbook:

  1. Confirm it’s not a data glitch (dashboards, raw logs).
  2. Check:
    • Recent deploys (content changes, routing changes).
    • New campaign launches.
    • Volume spikes.
  3. Mitigate:
    • Temporarily reduce sending velocity on that carrier.
    • Switch to alternative pool / grid if available.
    • Pause new risky campaigns for that carrier.

2. Pool / grid health alerts

Alert when:

  • Any pool or grid’s hard-fail rate exceeds 1–2% for >1 hour on meaningful volume.
  • Complaint / opt-out rates exceed 0.3–0.5% on promos.

Runbook:

  1. Stop sending new campaigns on that pool / grid.
  2. Shift some traffic to healthier pools.
  3. Investigate:
    • Did you mix higher-risk content onto a formerly clean pool?
    • Did carrier policies change (e.g., new rule on SHAFT keywords)?

3. Shadow filtering & “fake delivery” alerts

Because you won’t always see clear error codes:

  • Compare:
    • Delivered messages → expected conversions (clicks, logins, OTP uses).
  • Alert when:
    • Deliverability stays “good” but downstream conversion falls sharply for one carrier or campaign.

This is where:

  • Seed numbers per carrier are invaluable.
  • Periodic live tests (manual + automated) catch reality vs receipts.

Section 4: Designing the SMS deliverability dashboard

Your dashboard doesn’t have to be fancy. It has to be useful under pressure.

Layout 1: Executive overview

Top-level tiles:

  • Global delivered rate (last 24h, 7d)
  • Per-carrier delivered rate (Verizon, AT&T, T-Mobile, top 3–5 internationals)
  • % messages by:
    • Transactional vs marketing
    • Mainstream vs high-risk

Trends:

  • Line charts:
    • Delivered rate by carrier over time.
    • Volume by carrier.

Use this to answer: “Are we on fire, yes or no?”

Layout 2: Ops / SRE view

Tables and charts by:

  • Carrier × Pool × Campaign
  • Pool health metrics (delivered, hard-fail, soft-fail, complaints)

Examples:

  • Heatmap: delivered rate by carrier (columns) and pool/grid (rows).
  • Table with sorting:
    • “Show pools with highest hard-fail rate today.”

Use this when an alert fires.
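The heatmap doesn't need a charting library to be useful under pressure. A plain-text sketch of the carrier (columns) × pool/grid (rows) layout; a real dashboard would color the cells, but the shape is the same:

```python
def heatmap_table(cell_rates, carriers, pools):
    """Render delivered rates as a carrier-by-pool text table.
    `cell_rates` maps (carrier, pool) -> rate (assumed input shape)."""
    header = "pool".ljust(10) + "".join(c.ljust(10) for c in carriers)
    rows = [header]
    for p in pools:
        cells = "".join(
            f"{cell_rates.get((c, p), float('nan')):.1%}".ljust(10)
            for c in carriers
        )
        rows.append(p.ljust(10) + cells)
    return "\n".join(rows)
```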

Layout 3: Analytics / marketing view

Focus on:

  • Campaign performance:
    • Delivered rate vs CTR vs conversion.
  • A/B tests:
    • Content variants vs deliverability.

This view bridges deliverability and revenue, making it easier to justify infra decisions.


Section 5: Diagnosing common issues with your metrics

Scenario 1: One carrier tanks, others are stable

Likely causes:

  • Carrier‑specific filtering on:
    • Content pattern.
    • URL domain.
    • Sender pool reputation.

What to check:

  • Any recent content or template changes?
  • New URLs being used? (e.g., changed link shortener)
  • Volume ramp: did you spike too fast on that carrier?

Scenario 2: All carriers degrade at once

Likely causes:

  • Global content change (e.g., more aggressive promos).
  • Aggressive volume ramp across the board.
  • Platform-level change (routing, pool logic).

What to check:

  • Last few deployments.
  • New high-risk campaigns.
  • Whether controls (burner logic, per-carrier caps) are actually enforced.

Scenario 3: Metrics look fine, but support inbox fills with “I didn’t get it”

Likely causes:

  • Device-level filtering (spam folders).
  • Shadow filtering at carrier level with misleading receipts.
  • Regional pockets affected (e.g., specific area codes).

What to check:

  • Seed device tests on each carrier.
  • Region / area-code breakdowns.
  • Presence of sensitive keywords or patterns.

Section 6: How deliverability monitoring changes your infrastructure choices

Once you see:

  • Which pools degrade fastest
  • Which carriers are most sensitive
  • How content and volume affect outcomes

…it becomes obvious why infrastructure matters.

Teams that move to:

  • Private Pool Grids (100+ multi‑carrier SIMs per grid)
  • Carrier‑matching algorithms (Verizon→Verizon, AT&T→AT&T)
  • Burner Number Pools with automated retirement

…can use their dashboards to:

  • Proactively rotate and cool down senders.
  • A/B test routing strategies, not just content.
  • Create per‑carrier playbooks instead of generic fixes.

We regularly see:

  • 40–60% fewer incidents after deploying proper monitoring and grid‑based routing.
  • Faster RCA (root cause analysis) because logs and metrics line up.
  • Better risk conversations with compliance and legal (“here’s exactly how we’re controlling abuse and monitoring complaints”).

FAQ: SMS deliverability metrics & dashboards

1. What’s a “good” global delivery rate?

For a healthy, well‑architected program:

  • Transactional flows: 99%+
  • High-volume marketing: 98–99%

Anything under 97–98% on core flows is a red flag.

2. How often should we check deliverability?

  • Dashboards: daily (or more during launches).
  • Alerts: real-time for significant drops.
  • Deep reviews: weekly or monthly with trend analysis.

3. Do I really need per‑carrier data?

Yes. Most serious incidents are carrier-specific. Without per‑carrier slices, you’re flying blind.

4. What about small senders? Is this overkill?

If you:

  • Send low volume.
  • Operate in low‑risk verticals.
  • Don’t drive mission‑critical revenue via SMS.

…you can get away with simpler monitoring. But the moment SMS is core revenue, you’ll wish you had this in place.

5. How do I start if my current provider doesn’t expose good metrics?

Options:

  • Pull CDRs / logs and build your own aggregation.
  • Use webhooks to log DLRs into your data warehouse.
  • Consider a gateway that exposes carrier‑level data by design.
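For the webhook route, the core job is normalizing each incoming DLR into a consistent warehouse row. A sketch; the inbound field names (`msg_id`, `carrier`, `status`, `error`) are assumptions to be mapped onto whatever your provider actually sends:

```python
import json
from datetime import datetime, timezone

def normalize_dlr(raw_body):
    """Turn a raw DLR webhook body into the row shape the slicing
    examples above expect. Inbound keys are hypothetical."""
    payload = json.loads(raw_body)
    return {
        "message_id": payload["msg_id"],
        "carrier": payload.get("carrier", "unknown"),
        "status": payload["status"],
        "error_code": payload.get("error"),
        "received_at": datetime.now(timezone.utc).isoformat(),
    }
```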

6. How does this relate to A2P 10DLC registration?

10DLC compliance affects:

  • Allowed volume.
  • Scrutiny level.
  • Penalties for abuse.

Monitoring delivers the feedback loop that tells you if:

  • Your campaigns are behaving within carrier expectations.
  • You’re about to trip a threshold.

7. Can monitoring fix bad content or consent?

No. It can only tell you:

  • How bad things are.
  • Where they’re bad.

You still need clean opt-in, clear messaging, and respect for local law.

8. How do I detect device-level spam filtering?

  • Seed devices across carriers and platforms (iOS/Android).
  • Correlate “delivered” receipts with real device receipts and behavior.

9. Where does privacy fit into all this?

A privacy‑first gateway should:

  • Minimize stored PII.
  • Offer clear data retention controls.
  • Still provide aggregated metrics without leaking sensitive content.

10. Do I need a dedicated deliverability engineer?

Not necessarily. But you do need:

  • Clear ownership (someone accountable).
  • Runbooks and dashboards that non‑experts can follow in an incident.

Conclusion: make deliverability observable before it becomes expensive

You can’t fix what you can’t see.

A basic deliverability dashboard and alerting setup can:

  • Catch carrier‑specific issues before they explode.
  • Prove the ROI of better infrastructure (carrier matching, private grids).
  • Turn SMS from a black box into an operationally managed system.

If SMS is tied to revenue, treat it like an SRE problem:

  • Instrument it.
  • Alert on it.
  • Build runbooks around it.

Once you have that in place, you’re in a perfect position to evaluate whether a private, carrier-matching gateway is worth it, because you’ll have hard data showing where your current provider is leaving money on the table.

Dach SMS Lab