"When a measure becomes a target, it ceases to be a good measure." This is Goodhart's Law, and it's the most important thing to understand about engineering metrics. Every metric you track will eventually be gamed if you're not careful. This guide explains why, shows real examples from engineering teams, and offers strategies to measure performance without destroying it.
"The moment you start rewarding developers for lines of code, you'll get more lines of code—and probably worse software."
What is Goodhart's Law?
Goodhart's Law was originally stated by British economist Charles Goodhart in 1975, in the context of monetary policy. The idea is simple but profound:
"Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes."
Or, in simpler terms: when you optimize for a metric, you often destroy what the metric was supposed to measure.
This happens because metrics are proxies for things we actually care about. We care about "software quality," but we measure test coverage because it's countable. We care about "developer productivity," but we measure commits because they're trackable. The proxy is never the thing itself—and when people optimize the proxy, the underlying thing often suffers.
Real Examples in Engineering
Goodhart's Law isn't theoretical. It plays out in engineering organizations constantly:
Lines of Code
| The metric: | Lines of code (LOC) per developer |
|---|---|
| What you wanted: | High productivity, lots of features delivered |
| What you got: | Verbose code, copy-paste duplication, resistance to refactoring |
When LOC becomes a target, developers write more code than necessary. Functions that could be one-liners become twenty. Code that should be deleted stays. Refactoring that reduces LOC looks like "negative productivity."
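As a hypothetical illustration (the function names here are invented), both versions below do the same work, but only the second one scores well against an LOC target:

```python
# A concise version: one comprehension does the job.
def active_users(users):
    return [u for u in users if u.active]

# The same behavior padded out to inflate the line count. Nothing is gained,
# but a per-developer LOC metric rewards this version.
def active_users_padded(users):
    result = []
    for user in users:
        is_active = user.active
        if is_active is True:
            result.append(user)
        else:
            pass
    return result
```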
Test Coverage
| The metric: | Code coverage percentage |
|---|---|
| What you wanted: | Well-tested, reliable software |
| What you got: | Tests that execute code but don't verify behavior, assertion-free tests |
Teams chasing coverage targets write tests that technically cover lines but don't actually test anything meaningful. A test that calls a function without asserting results increases coverage while adding zero value.
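As a minimal, hypothetical sketch (pytest-style tests around an invented `apply_discount` function): the first test executes every line and counts toward coverage, yet passes even if the function is wrong; the second actually verifies the behavior.

```python
def apply_discount(price, percent):
    """Hypothetical function under test."""
    return round(price * (1 - percent / 100), 2)

# Executes every line, so coverage goes up, but there is no assertion:
# this test passes even if apply_discount returns the wrong value.
def test_apply_discount_coverage_only():
    apply_discount(100, 10)

# Same coverage, but the assertion makes the test worth having.
def test_apply_discount_behavior():
    assert apply_discount(100, 10) == 90.0
```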
Velocity (Story Points)
| The metric: | Story points completed per sprint |
|---|---|
| What you wanted: | Predictable delivery, understanding of team capacity |
| What you got: | Point inflation, gaming of estimates, velocity as performance metric |
When velocity becomes a performance target, teams inflate their estimates. What was a "3" becomes a "5" to look more productive. Cross-team comparisons become meaningless. The metric stops tracking actual capacity.
Number of PRs
| The metric: | Pull requests merged per week |
|---|---|
| What you wanted: | Active development, regular integration |
| What you got: | Tiny PRs, unnecessarily split work, gaming for higher counts |
When PR count becomes a goal, developers split work into artificially small pieces. A feature that should be one PR becomes five. Review overhead increases. The metric goes up while actual throughput goes down.
Cycle Time
| The metric: | Time from first commit to merge |
|---|---|
| What you wanted: | Fast feedback, efficient process |
| What you got: | PRs opened before work is ready, rubber-stamp reviews to hit targets |
When cycle time has targets, teams find shortcuts. Opening a PR before the work is ready starts review in parallel with the remaining coding, and rubber-stamp approvals get it merged sooner, so the measured window from first commit to merge shrinks. Quality suffers while the metric improves.
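As a rough sketch of the definition above (the timestamps are invented; a real tool would pull them from the Git or GitHub API), the metric is just the elapsed window between two events, which is exactly why rubber-stamp merges compress it so easily:

```python
from datetime import datetime

# Hypothetical timestamps for one pull request.
first_commit = datetime(2024, 3, 4, 9, 0)    # first commit on the branch
merged_at = datetime(2024, 3, 6, 15, 0)      # pull request merged

cycle_time_hours = (merged_at - first_commit).total_seconds() / 3600
print(f"Cycle time: {cycle_time_hours:.0f}h")  # Cycle time: 54h

# A rushed, rubber-stamped review pulls merged_at earlier: the number improves
# even though nothing was delivered any faster.
```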
/// Our Take
Every metric we show in CodePulse can be gamed. We know this. The question isn't whether metrics can be gamed—they can. It's whether you create a culture where gaming them is more attractive than improving genuinely.
That's why we focus on team-level metrics rather than individual leaderboards, and why we emphasize understanding bottlenecks rather than hitting arbitrary targets. Metrics should inform, not evaluate.
Why Does Gaming Happen?
Understanding why Goodhart's Law operates helps you design better measurement systems:
1. Incentive Alignment
People optimize what they're rewarded for. If bonuses are tied to metrics, people will optimize those metrics—even at the expense of what the metrics were supposed to measure.
2. Metric Simplification
Reality is complex; metrics are simple. "Good software" involves quality, maintainability, performance, user satisfaction, and more. Any single metric captures only part of that—and optimizing the part can harm the whole.
3. Campbell's Law
Related to Goodhart's Law, Campbell's Law states: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."
4. Local Optimization
Individuals optimizing their own metrics can create system-wide dysfunction. A developer optimizing PR count might create more work for reviewers. The developer's metric improves; the team's throughput decreases.
"You get what you measure—which is why you should be very careful what you measure."
Strategies to Avoid Metric Gaming
1. Measure for Understanding, Not Evaluation
Use metrics to understand what's happening, not to judge people. When cycle time increases, ask "what changed?" rather than "who's to blame?" When you remove the evaluation pressure, you remove the incentive to game.
2. Use Multiple Metrics Together
Balance metrics against each other. If you track cycle time, also track quality metrics like change failure rate, so optimizing one at the expense of the other becomes visible. This is why DORA uses four metrics together, not one. The table below lists useful pairings, and the sketch after it shows how a pairing can surface an imbalance.
| If You Track... | Also Track... | To Prevent... |
|---|---|---|
| Velocity | Bug rate, tech debt | Shipping fast but low quality |
| Test coverage | Test effectiveness (bugs caught) | Coverage without value |
| Cycle time | Change failure rate | Speed without stability |
| PR count | PR size, merge frequency | Artificial splitting |
| Deployment frequency | MTTR, customer incidents | Deploying for the sake of deploying |
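A minimal sketch of the pairing idea, assuming two invented metric fields rather than any particular tool's API: when the speed metric improves while its paired quality metric degrades, the imbalance is surfaced instead of hidden.

```python
from dataclasses import dataclass

@dataclass
class SprintMetrics:
    cycle_time_hours: float      # median time from first commit to merge
    change_failure_rate: float   # fraction of deployments causing incidents

def flag_unbalanced(previous: SprintMetrics, current: SprintMetrics) -> list[str]:
    """Warn when speed improves while the paired stability metric degrades."""
    warnings = []
    if (current.cycle_time_hours < previous.cycle_time_hours
            and current.change_failure_rate > previous.change_failure_rate):
        warnings.append(
            "Cycle time improved but change failure rate worsened: "
            "speed may be coming at the cost of stability."
        )
    return warnings

# Example: cycle time dropped 48h -> 36h, but failures rose from 5% to 12%.
print(flag_unbalanced(SprintMetrics(48, 0.05), SprintMetrics(36, 0.12)))
```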
3. Focus on Team Metrics, Not Individual
Individual metrics create competition; team metrics create collaboration. "Our team's cycle time" encourages helping each other. "My cycle time" encourages gaming.
4. Let Teams Choose Their Own Metrics
When teams pick what to measure, they have ownership. Imposed metrics feel like surveillance; chosen metrics feel like tools. Self-selected metrics are also harder to game because the team knows why they chose them.
5. Rotate or Sunset Metrics
Don't measure the same things forever. Once a metric has served its purpose (identified a problem, tracked an improvement), consider retiring it. Long-lived metrics develop long-evolved gaming strategies.
6. Look at Trends, Not Absolutes
"Cycle time decreased 20%" is more useful than "cycle time is 48 hours." Trends show improvement; absolutes invite comparison and competition. Focus on direction, not position.
📊 How CodePulse Addresses This
We designed CodePulse with Goodhart's Law in mind:
- Team-level focus: Default views show team metrics, not individual rankings
- Balanced metrics: Executive Summary combines speed, quality, and collaboration metrics
- Trend emphasis: Charts show change over time, not just current state
- No arbitrary targets: We show benchmarks for context, not goals to hit
Building a Healthy Metrics Culture
The solution to Goodhart's Law isn't to stop measuring—it's to measure thoughtfully:
What a Healthy Metrics Culture Looks Like
- Metrics are discussed in retrospectives, not performance reviews
- Teams ask "what does this tell us?" not "how do we improve this number?"
- Bad metrics are questioned and changed, not blindly optimized
- Leaders use metrics to understand, not to judge
- Gaming is discussed openly—and addressed by changing the metric, not punishing the gamer
Red Flags of Unhealthy Metrics Culture
- Metrics tied directly to compensation or performance reviews
- Individual leaderboards visible to management
- Targets set without team input
- Consistent "green" dashboards that don't match reality
- Fear of reporting bad numbers
- Metrics never change despite changing circumstances
"The goal of engineering metrics isn't to prove teams are productive. It's to help teams become more productive. Those are very different goals with very different measurement approaches."
Related Guides
- Engineering Metrics Without Surveillance — Building trust with metrics
- Measure Team Performance Without Micromanaging — Balanced measurement approaches
- DORA Metrics Guide — A well-designed multi-metric framework
- Data Quality in Engineering Metrics — Ensuring metrics are accurate and meaningful
Conclusion
Goodhart's Law isn't a reason to avoid metrics—it's a reason to use them wisely. Every metric you track will be gamed if the incentives are wrong. The solution is:
- Measure for understanding, not evaluation
- Use balanced metric sets, not single indicators
- Focus on team outcomes, not individual numbers
- Track trends, not absolute targets
- Rotate metrics as circumstances change
- Create psychological safety to report bad numbers
Remember: the map is not the territory. Metrics are maps of engineering performance, not the performance itself. Use them to navigate, not to judge.
"When a measure becomes a target, it ceases to be a good measure. When a team uses measures to improve, they become better measures—and a better team."
Start by auditing your current metrics. Are any tied to compensation? Are teams gaming them? Use CodePulse to understand your delivery flow, not to evaluate your developers. The metrics are there to serve you—not the other way around.