A high test failure rate doesn't just slow down your CI pipeline—it erodes trust in your test suite, encourages developers to ignore failures, and ultimately lets bugs slip into production. This guide shows you how to measure, diagnose, and improve your test failure rate using data.
What is test failure rate and why does it matter?
Test failure rate is the percentage of pull requests that have failing CI checks at any point during their lifecycle. A rate above 10% signals reliability problems that slow delivery across your team. Each failure adds a 20-60 minute 'fix, push, wait' cycle, causes context switching, and erodes trust in your test suite. High-performing teams maintain failure rates below 5%. CodePulse tracks this metric automatically and can alert your team when rates exceed your thresholds.
What Does Test Failure Rate Actually Measure?
Definition
Test Failure Rate is the percentage of pull requests that have failing status checks (CI failures) at any point during their lifecycle.
Test Failure Rate Formula
Test Failure Rate = (PRs with at least one CI failure ÷ total PRs) × 100
Example: if 12 of 80 pull requests had at least one failing check during their lifecycle, the failure rate is 12 ÷ 80 × 100 = 15%.
Why It Matters
Test failures have cascading effects on your delivery velocity and code quality:
- Velocity impact: Each failure adds a cycle of "fix, push, wait for CI" that can take 20-60 minutes
- Context switching: Developers move on to other work while waiting, then must reload context when CI fails
- Trust erosion: If tests fail randomly (flaky tests), developers start ignoring failures or re-running until they pass
- Review delays: Reviewers may wait for green CI before starting review, compounding delays
DORA Connection
Test failure rate directly impacts your DORA metrics:
- Lead Time: Failures add time to every PR
- Change Failure Rate: Poor test coverage lets bugs reach production
- Deployment Frequency: Unreliable CI blocks frequent deployments
🔥 Our Take
A flaky test suite is worse than no tests at all.
When tests fail randomly, developers learn to ignore them. They click "re-run" until things pass, or they merge despite red builds. You've trained your team that tests don't matter. A smaller, reliable test suite is more valuable than a large, unreliable one. Delete or quarantine flaky tests until you can fix them.
"Every 're-run CI' click is a confession that your test suite has lost the team's trust."
How Do You Read This Metric in CodePulse?
Dashboard Card
On your Dashboard, find the Test Failure Rate card in the Quality Metrics section:
- Percentage displayed: Current failure rate for the selected time period
- Trend indicator: Arrow showing if rate is improving or worsening
- Color coding: Green (<10%), Yellow (10-20%), Red (>20%)
📊 How to See This in CodePulse
Navigate to the Dashboard to track test failure rate:
- Current percentage and trend vs previous period
- Filter by repository to find your worst-performing repos
- Compare time periods to see if recent changes helped
- Set up Alerts to get notified when rates exceed thresholds
Check Awards to see which developers have the highest test pass rates.
Per-Repository Breakdown
Different repositories often have very different failure rates. Filter by repository to identify:
- Which repos have the worst test failure rates
- Whether specific repos have flaky test problems
- If newer repos have better testing practices than legacy ones
Developer-Level Insights
While we focus on team-level metrics, individual pass rates can be useful for coaching. Developers with notably lower pass rates might benefit from:
- Pairing on writing better tests
- Access to better local testing tools
- Understanding of which tests to run locally before pushing
What Are the Common Causes of High Failure Rates?
1. Flaky Tests
Tests that pass sometimes and fail sometimes without any code changes. The most frustrating type of failure.
Signs:
- Same test fails on retry without code changes
- "Re-run CI" is a common team behavior
- Failures happen more at certain times (race conditions)
Common causes:
- Race conditions in async tests
- Tests depending on external services
- Time-dependent tests
- Tests with shared state that isn't properly reset
2. Environment Issues
Tests that pass locally but fail in CI due to environment differences.
Signs:
- "Works on my machine" is a frequent phrase
- Failures only happen in CI, not locally
- Different failure patterns between CI runners
Common causes:
- Different dependency versions in CI vs local
- Missing environment variables or configs
- Different OS or architecture between local and CI
- Resource constraints in CI (memory, disk, network)
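Missing environment variables in particular are cheap to catch up front. A stdlib-only sketch that fails fast before any tests run (the variable names are placeholders for whatever your suite actually needs):

```python
import os
import sys

# Illustrative names; list your suite's real requirements here.
REQUIRED_ENV_VARS = ["DATABASE_URL", "API_BASE_URL"]

def missing_env_vars(required, env=os.environ):
    """Return the subset of `required` that is absent or empty."""
    return [name for name in required if not env.get(name)]

missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing env vars: {', '.join(missing)}", file=sys.stderr)
    # In a real CI step or conftest you would exit non-zero here:
    # sys.exit(1)
```

Running this at the start of CI (and locally) turns a confusing mid-suite failure into one clear error message.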
3. Insufficient Local Testing
Developers pushing code without running tests locally first.
Signs:
- Obvious failures that would have been caught locally
- Multiple fix-up commits after initial push
- Developers saying "CI will catch it"
4. Test Coverage Gaps
Areas of code with poor or no test coverage where bugs accumulate.
Signs:
- Bugs reach production that tests should have caught
- Regression failures when touching "untested" areas
- Low code coverage metrics
🔥 Our Take
High velocity with a broken test suite is a warning sign, not something to celebrate.
If your team is merging fast and your test failure rate is above 20%, you are shipping bugs to production faster. Velocity metrics look great on a dashboard, but they are meaningless if every third deployment causes a customer-facing issue. Fix your tests before you optimize your speed. Fast and broken is worse than slow and reliable.
How Do You Use CodePulse to Identify Failure Patterns?
Which Repos Have the Worst Failure Rates?
Filter your dashboard by repository and compare failure rates. Focus improvement efforts on the worst performers first—they'll have the biggest impact.
Trend Analysis
Compare failure rates across time periods:
- Improving trend: Recent infrastructure investments or test cleanup are paying off
- Worsening trend: Technical debt is accumulating; prioritize test reliability
- Spiky pattern: External factors (deploy days, specific features) may be causing intermittent issues
"The teams with the best test reliability didn't get there by writing more tests. They got there by deleting the bad ones and investing in infrastructure."
Correlate with Risky Changes
The Risky Changes feature flags PRs with failing checks. Review these to understand:
- Are failures concentrated in certain file types?
- Do large PRs have higher failure rates than small ones?
- Are specific types of changes (e.g., database migrations) failure-prone?
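The large-vs-small question can also be checked by hand if you can export PR records; a sketch with hypothetical field names and a made-up dataset:

```python
def failure_rate_by_size(prs, small_max=200):
    """Compare failure rates for small vs large PRs.

    Each PR dict carries 'lines_changed' and 'had_ci_failure'
    (hypothetical field names for an exported dataset).
    """
    buckets = {"small": [], "large": []}
    for pr in prs:
        key = "small" if pr["lines_changed"] <= small_max else "large"
        buckets[key].append(pr["had_ci_failure"])
    return {
        k: round(100.0 * sum(v) / len(v), 1) if v else None
        for k, v in buckets.items()
    }

prs = [
    {"lines_changed": 50, "had_ci_failure": False},
    {"lines_changed": 120, "had_ci_failure": False},
    {"lines_changed": 90, "had_ci_failure": True},
    {"lines_changed": 800, "had_ci_failure": True},
    {"lines_changed": 1500, "had_ci_failure": True},
    {"lines_changed": 600, "had_ci_failure": False},
]
print(failure_rate_by_size(prs))  # {'small': 33.3, 'large': 66.7}
```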
What Are the Best Strategies for Improving Test Reliability?
Quarantine Flaky Tests
Don't let flaky tests block the entire team. Implement a quarantine system:
- Identify tests that fail intermittently (track failure patterns)
- Move them to a non-blocking test suite
- Create tickets to fix each quarantined test
- Run quarantined tests separately and track their stability
- Graduate tests back to the main suite once fixed
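The bookkeeping behind steps 2 and 4 can start as nothing more than a tracked list of known-flaky test ids that a CI script uses to split the run. A sketch with hypothetical test names:

```python
# Hypothetical flaky tests; in practice this list lives in the repo
# so quarantining a test is a reviewable change.
QUARANTINED = {
    "tests/test_checkout.py::test_async_payment",
    "tests/test_search.py::test_reindex_timing",
}

def split_suite(test_ids, quarantined=QUARANTINED):
    """Split test ids into the blocking suite and the
    non-blocking quarantine suite."""
    blocking = [t for t in test_ids if t not in quarantined]
    flaky = [t for t in test_ids if t in quarantined]
    return blocking, flaky

all_tests = [
    "tests/test_checkout.py::test_async_payment",
    "tests/test_checkout.py::test_cart_total",
]
blocking, flaky = split_suite(all_tests)
print(len(blocking), len(flaky))  # 1 1
```

With pytest specifically, a custom marker plus `-m "not quarantine"` achieves the same split without a separate script.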
Pre-Commit Hooks
Catch obvious failures before code is pushed:
- Run linters and formatters automatically
- Run unit tests for changed files
- Type-check in typed languages
- Keep pre-commit fast (<30 seconds) so developers don't skip it
Improve Environment Parity
Make local development match CI as closely as possible:
- Use Docker for consistent environments
- Lock dependency versions in CI and development
- Document required environment setup
- Consider development containers (VS Code devcontainers)
Test Infrastructure Investment
Sometimes you need to invest in better test infrastructure:
- Faster CI runners with more resources
- Better test parallelization
- Test result caching to skip unchanged tests
- Better test data management (fixtures, factories)
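Test-result caching, for instance, is typically keyed on a hash of the file contents a test depends on: same hash, skip the run. A toy sketch using an in-memory stand-in for the filesystem:

```python
import hashlib

def content_key(paths, read=lambda p: FILES[p]):
    """Hash the contents of the files a test depends on."""
    h = hashlib.sha256()
    for path in sorted(paths):
        h.update(read(path).encode())
    return h.hexdigest()

# Toy in-memory "filesystem" for the example.
FILES = {"src/a.py": "def f(): return 1\n"}

cache = {}
cache[content_key(["src/a.py"])] = "passed"   # record after a real run
print(content_key(["src/a.py"]) in cache)     # True: unchanged, skip
FILES["src/a.py"] = "def f(): return 2\n"
print(content_key(["src/a.py"]) in cache)     # False: changed, re-run
```

Real tools (e.g. build systems with remote caches) add dependency tracking on top, but the core idea is exactly this lookup.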
Cultural Changes
Technical fixes only go so far. Build a culture that values test reliability:
- Make "green builds" a team norm, not a suggestion
- Celebrate when flaky tests are fixed
- Allocate time for test maintenance (not just feature work)
- Track and celebrate improvement in failure rate
How Do You Set Up Test Failure Rate Alerts?
Don't wait for quarterly reviews to notice test health degrading. Set up proactive alerts:
```
Alert: Test Failure Rate Warning
  Metric:      test_failure_rate_percent
  Operator:    >
  Threshold:   15
  Time Period: weekly
  Severity:    warning
  Description: "Weekly test failure rate exceeds 15%"

Alert: Test Failure Rate Critical
  Metric:      test_failure_rate_percent
  Operator:    >
  Threshold:   25
  Time Period: weekly
  Severity:    critical
  Description: "Weekly test failure rate exceeds 25% - immediate attention needed"
```
What Good Looks Like
Benchmark your failure rate against these targets:
Test Failure Rate Benchmarks

| Test Failure Rate | Assessment |
| --- | --- |
| Under 5% | Excellent |
| 5-10% | Good |
| 10-20% | Needs improvement |
| Above 20% | Critical |

If you're above 20%, make test reliability a top priority: it's likely slowing down everything else your team does.
Which Guides Should You Read Next?
- Reducing PR Cycle Time — test failures are a major cycle time contributor
- Regression Prevention Guide — prevent bugs from reaching production
- Alert Rules Guide — set up proactive quality alerts
Frequently Asked Questions
What is a good test failure rate?
Under 5% is excellent, 5-10% is good, 10-20% needs improvement, and above 20% is critical. If your failure rate is above 20%, test reliability should be a top priority because it is likely slowing down everything else. Focus improvement efforts on the worst-performing repositories first for the biggest impact.
Related Guides
The PR Pattern That Predicts 73% of Your Incidents
Learn how to identify high-risk pull requests before they cause production incidents.
DORA Metrics Explained: The 4 Keys Without the Hype
A complete breakdown of the four DORA metrics - deployment frequency, lead time, change failure rate, and MTTR - with honest benchmarks and gaming traps to avoid.
The 'Bus Factor' File That Could Kill Your Project
Use the Bus Factor Risk Matrix to identify where knowledge concentration creates hidden vulnerabilities before someone leaves.
100% Review Coverage Is a Lie (What Actually Matters)
Why 100% review coverage matters, how to track it, and practical steps to build a consistent code review culture across your team.