
Engineering Metrics Look Wrong? Fix These 7 Data Quality Problems

Your engineering metrics look off. Cycle time is impossibly high. GitHub Insights shows different numbers. This guide covers the 7 most common data quality problems and how to fix each one.

12 min read · Updated March 25, 2026 · By CodePulse Team

Your engineering metrics look wrong. Cycle time is impossibly high. GitHub Insights shows different numbers. Bots are inflating your commit count. This guide covers the 7 most common data quality problems in engineering analytics and how to fix each one.

Quick Answer

Why do my engineering metrics look wrong?

The most common causes of misleading engineering metrics are: bot activity inflating counts (filter Dependabot, Renovate), draft PRs skewing cycle time (exclude drafts), stale PRs from months ago being merged (filter by date range), branch configuration including release/staging branches in metrics (set up branch exclusions), and force pushes rewriting commit timestamps. Start by enabling bot filtering and setting a 90-day window.

Why Your Numbers Don't Match GitHub Insights

This is the #1 question we hear from new users. You connect an analytics tool and the numbers differ from what GitHub shows. This is expected: the two systems count different things, and the filtered numbers are usually more useful for decision-making.

| Difference | GitHub Insights | Analytics Tools (CodePulse) |
| --- | --- | --- |
| Bot commits | Included | Excluded by default |
| Draft PRs | Counted in activity | Excluded from cycle time |
| Self-merges | Counted as merges | Flagged separately |
| Time calculation | Calendar time | Configurable (working hours option) |
| Branch filtering | All branches | Configurable exclusions |
| Private repos | Limited in free tier | Full access via GitHub App |

If you need the numbers to match exactly, disable bot filtering and include all branches. But you probably do not want that. Filtered metrics are more useful for decision-making.

🔥 Our Take

Precision is the enemy of useful metrics. An 80% accurate metric that you act on weekly beats a 99% accurate metric that takes a month to calculate.

Stop trying to make your analytics tool match GitHub Insights exactly. The goal is consistent, actionable trends, not accounting-grade precision. If cycle time is trending up, it does not matter whether the absolute number is 18 hours or 22 hours. What matters is the direction.

Problem 1: Bots Inflating Your Metrics

Dependabot, Renovate, GitHub Actions bots, and other automated accounts can generate hundreds of PRs per month. If these are included in your metrics, they distort everything: cycle time (bot PRs merge instantly or sit forever), PR volume (inflated by dependency updates), and review load (if someone reviews bot PRs manually).

Fix:

  • Enable bot filtering in your analytics tool (CodePulse excludes bots by default)
  • Check for accounts with [bot] in their login name
  • Add custom bot accounts specific to your org (deploy bots, CI bots)
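The filtering above can be sketched in a few lines. This is a minimal illustration, not CodePulse's implementation; the PR dict shape and the `CUSTOM_BOTS` names are invented for the example. GitHub App accounts do end their login with `[bot]`, which is the main signal to key on.

```python
# Sketch: drop bot-authored PRs before computing metrics.
# CUSTOM_BOTS holds org-specific automation accounts (illustrative names).
CUSTOM_BOTS = {"deploy-bot", "ci-runner"}

def is_bot(login: str) -> bool:
    """GitHub App accounts end in '[bot]'; also check an org-specific list."""
    return login.endswith("[bot]") or login in CUSTOM_BOTS

def filter_bots(prs):
    return [pr for pr in prs if not is_bot(pr["author"])]

prs = [
    {"author": "dependabot[bot]", "title": "Bump lodash"},
    {"author": "alice", "title": "Add checkout flow"},
    {"author": "deploy-bot", "title": "Release 1.4.2"},
]
human_prs = filter_bots(prs)  # only alice's PR remains
```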

Problem 2: Stale PRs Skewing Cycle Time

A PR opened 6 months ago and merged yesterday has a cycle time of 180 days. If that is included in your team's median, it massively distorts the picture. Stale PRs are usually abandoned work that someone merged to clean up, not representative of normal delivery.

Fix:

  • Filter metrics to PRs opened within the last 90 days
  • Use median instead of mean (medians are resistant to outliers)
  • Set up alerts for PRs open longer than 7 days to prevent staleness
  • Review and close abandoned PRs monthly
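A quick numeric example shows why the median matters here. The cycle times below are invented; the 4320-hour entry stands in for a 180-day stale PR merged during cleanup.

```python
# Sketch: one stale PR wrecks the mean but barely moves the median.
from statistics import mean, median

cycle_times = [18, 22, 15, 30, 4320]  # hours; last entry is a 180-day stale PR

print(mean(cycle_times))    # 881.0 -- dominated by the outlier
print(median(cycle_times))  # 22    -- close to typical delivery
```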

Problem 3: Wrong Branch Configuration

PRs to release branches, staging branches, and hotfix branches have different lifecycle patterns than feature PRs. Including them in the same metrics pool creates noise.

Fix:

  • Exclude PRs that target release/*, staging, and hotfix/* branches from your feature-delivery metrics
  • CodePulse automatically excludes PRs whose source branch is main, master, develop, or staging
  • Configure additional exclusion patterns for your branching strategy
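Branch exclusion patterns like these can be matched with simple glob rules. A minimal sketch, assuming an illustrative pattern list and a function that tests a PR's target branch; real tools expose this as configuration rather than code.

```python
# Sketch: glob-style branch exclusion for PR metrics.
from fnmatch import fnmatch

EXCLUDED_TARGETS = ["release/*", "staging", "hotfix/*"]  # illustrative patterns

def include_pr(target_branch: str) -> bool:
    """Keep a PR in the metrics pool only if its target matches no pattern."""
    return not any(fnmatch(target_branch, pat) for pat in EXCLUDED_TARGETS)

print(include_pr("main"))         # True  -- normal feature PR target
print(include_pr("release/2.3"))  # False
print(include_pr("staging"))      # False
```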

Problem 4: Force Pushes Breaking History

Force pushes (git push --force) rewrite commit history. This can cause analytics tools to lose track of original commit timestamps, making coding time calculations unreliable.

Fix:

  • Prefer git push --force-with-lease (safer but still rewrites history)
  • Use squash merges instead of rebase-and-force-push workflows for cleaner history
  • Most analytics tools use PR events (immutable) rather than commit timestamps for key metrics
  • CodePulse uses PR lifecycle events (created, reviewed, merged) which are not affected by force pushes
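Computing cycle time from PR events rather than commit timestamps looks like this. The timestamps are invented, but the ISO 8601 format matches what the GitHub API returns for `created_at` and `merged_at`.

```python
# Sketch: cycle time from immutable PR lifecycle events, so force pushes
# rewriting commit history cannot distort the number.
from datetime import datetime

def cycle_time_hours(created_at: str, merged_at: str) -> float:
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601, as the GitHub API returns it
    delta = datetime.strptime(merged_at, fmt) - datetime.strptime(created_at, fmt)
    return delta.total_seconds() / 3600

print(cycle_time_hours("2026-03-20T09:00:00Z", "2026-03-21T15:30:00Z"))  # 30.5
```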

Problem 5: Weekend and Holiday Hours in Cycle Time

A PR opened Friday at 5 PM and merged Monday at 9 AM shows 64 hours of cycle time, even though only minutes of actual work may have happened. Calendar-time cycle time can be misleading for teams that do not work weekends.

Fix:

  • Configure working hours in your analytics tool (CodePulse supports working days configuration per org)
  • Use "working hours only" mode if your team has consistent work schedules
  • For distributed teams across time zones, calendar time may be more appropriate since someone is always working
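A "working hours only" calculation can be sketched as a simple hour-by-hour walk. This is deliberately naive (slow, and it assumes the start time is aligned to the hour); production tools use interval arithmetic, and the 9-to-5 Monday-to-Friday window is just an example schedule.

```python
# Sketch: count only Mon-Fri, 9 AM - 5 PM hours between two datetimes.
from datetime import datetime, timedelta

def working_hours(start: datetime, end: datetime) -> int:
    hours = 0
    t = start
    while t < end:
        if t.weekday() < 5 and 9 <= t.hour < 17:  # Mon=0..Fri=4, 09:00-17:00
            hours += 1
        t += timedelta(hours=1)
    return hours

# The Friday-5-PM-to-Monday-9-AM PR: 64 calendar hours, 0 working hours.
opened = datetime(2026, 3, 20, 17)  # a Friday
merged = datetime(2026, 3, 23, 9)   # the following Monday
print(working_hours(opened, merged))  # 0
```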

"The best metric configuration is the one your team agrees represents reality. Consistency matters more than precision."

Problem 6: Archived and Forked Repos in Metrics

Archived repositories contain historical PRs with old cycle times that skew team averages. Forked repos may include upstream PRs that your team did not create.

Fix:

  • Only include repositories your team actively commits to
  • Remove archived repos from your analytics tool configuration
  • For forks, filter to only PRs authored by your team members
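Those three rules combine naturally into one filter. The repo and PR shapes and the `TEAM` set below are illustrative stand-ins, not a real API response.

```python
# Sketch: skip archived repos; in forks, keep only PRs authored by your team.
TEAM = {"alice", "bob"}  # illustrative team roster

def relevant_prs(repos):
    prs = []
    for repo in repos:
        if repo["archived"]:
            continue  # historical PRs would skew current averages
        for pr in repo["prs"]:
            if repo["fork"] and pr["author"] not in TEAM:
                continue  # upstream PRs your team did not create
            prs.append(pr)
    return prs

repos = [
    {"archived": True,  "fork": False, "prs": [{"author": "alice"}]},
    {"archived": False, "fork": True,  "prs": [{"author": "upstream-dev"}, {"author": "bob"}]},
    {"archived": False, "fork": False, "prs": [{"author": "alice"}]},
]
print(len(relevant_prs(repos)))  # 2
```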

Problem 7: Delayed or Missing Webhook Events

If your analytics rely on webhooks, delayed or dropped events create gaps in your data. GitHub webhooks are reliable in practice, but delivery is not guaranteed and events can arrive late or out of order.

Fix:

  • Use polling-based tools (like CodePulse, which syncs every 15 minutes via API) instead of webhook-only approaches
  • Check GitHub's webhook delivery log: Settings → Webhooks → Recent Deliveries
  • Implement webhook retry logic with exponential backoff
  • Run a daily reconciliation job that compares webhook data against API data
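The reconciliation job in the last bullet reduces to a set difference. The PR numbers below are invented stand-ins for what your webhook handler recorded versus what a fresh API listing returns.

```python
# Sketch: daily reconciliation of webhook-recorded PRs against the API.
webhook_prs = {101, 102, 104}        # PR numbers our webhook handler stored
api_prs = {101, 102, 103, 104, 105}  # PR numbers a fresh API listing returns

missing = api_prs - webhook_prs      # dropped webhook events to backfill
print(sorted(missing))               # [103, 105]
```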

Data Quality Checklist

Run through this checklist when setting up engineering analytics for the first time:

  • Bot filtering enabled? Check that Dependabot, Renovate, and custom bots are excluded.
  • Branch exclusions configured? Exclude PRs from main/master/release/staging branches.
  • Time range appropriate? Start with 90 days. Include historical backfill only after verifying current data quality.
  • Working hours configured? Decide whether to use calendar time or working hours for cycle time.
  • Archived repos excluded? Remove repos that are no longer actively developed.
  • Using median, not mean? Medians are resistant to outliers and give a more accurate picture of typical performance.

Getting Started

  1. Connect CodePulse with bot filtering enabled (the default).
  2. Review the initial sync data. If numbers look off, check the problems above in order.
  3. Configure branch exclusions and working hours in Settings.
  4. Compare a sample of 10 PRs manually against the tool's calculations to build confidence.

For more on data quality, see our data quality in engineering metrics guide and GitHub metrics guide.

Frequently Asked Questions

Why don't my numbers match GitHub Insights?

GitHub Insights counts all activity including bots, draft PRs, and self-merges. Engineering analytics tools like CodePulse filter bots by default, exclude draft PRs from cycle time calculations, and may use different time boundaries. The numbers will differ, and the analytics tool numbers are usually more useful because they reflect actual human engineering work.

See these insights for your team

CodePulse connects to your GitHub and shows you actionable engineering metrics in minutes. No complex setup required.

Free tier available. No credit card required.