A/B Test Statistical Significance
Statistical significance indicates whether the difference observed in your A/B test is likely to reflect a real effect or could be explained by random variation alone.
Key Metrics
Interpreting Results
Z-Score ≥ 1.96: Statistically significant at 95% confidence (two-sided). Reasonable to declare a winner and implement the variant.
Z-Score 1.65-1.96: Marginal significance (90% confidence). Consider running the test longer, or treat the result as a directional insight only.
Z-Score less than 1.65: Not statistically significant. The difference could be due to random chance. Don't implement the variant yet.
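The thresholds above can be sketched as a small helper. This is a minimal illustration, assuming a two-sided test and using only the Python standard library; the function names `two_sided_p_value` and `interpret_z` are hypothetical, not part of any library.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability of seeing a z-score at least this extreme if there is no real effect."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def interpret_z(z: float) -> str:
    """Map a z-score onto the three bands described in the text (two-sided)."""
    magnitude = abs(z)
    if magnitude >= 1.96:
        return "significant at 95%"
    if magnitude >= 1.65:
        return "marginal (90%)"
    return "not significant"


if __name__ == "__main__":
    for z in (2.4, 1.8, 1.0):
        print(z, interpret_z(z), round(two_sided_p_value(z), 3))
```

The sign of the z-score is ignored here, so a strongly negative score (variant worse than control) also counts as significant; check the direction separately before declaring a winner.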
Example Calculation
Landing page A/B test:
- Control: 10,000 visitors, 200 conversions = 2.0% rate
- Variant: 10,000 visitors, 250 conversions = 2.5% rate
- Relative Uplift: ((2.5 - 2.0) / 2.0) × 100 = 25%
- Absolute Uplift: 2.5% - 2.0% = 0.5 percentage points
- Z-Score: ~2.38 (statistically significant at 95%)
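- Decision: Implement variant. The observed relative uplift is 25%, but the 95% confidence interval for the absolute difference is wide (roughly 0.1 to 0.9 percentage points), so the long-run improvement may be noticeably smaller or larger.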
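The z-score in this example can be reproduced with a two-proportion z-test. The sketch below assumes the common pooled-standard-error form of the test and a normal approximation; the function names are illustrative. With the figures above it yields a z-score of roughly 2.4.

```python
import math
from statistics import NormalDist


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Z-score for the difference in conversion rate (B minus A), pooled standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Confidence interval for the absolute difference (B minus A), unpooled standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


# Figures from the landing page example: control 200/10,000, variant 250/10,000.
z = two_proportion_z(200, 10_000, 250, 10_000)
lo, hi = diff_confidence_interval(200, 10_000, 250, 10_000)
print(f"z = {z:.2f}")
print(f"95% CI for absolute uplift: {lo * 100:.2f} to {hi * 100:.2f} percentage points")
```

The interval is a reminder that the 25% relative uplift is a point estimate: the data are consistent with a much smaller or larger true effect.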
Common Mistakes
Stopping too early: Declaring a winner on an early uplift, before the planned sample size and significance are reached, carries a high risk of a false positive.
Not accounting for variance: Conversion rates fluctuate by day of week; Tuesday might differ from Saturday. Run the test over full weeks.
P-hacking: Running multiple tests and reporting only the winners leads to false discoveries.
Too small sample: 10 conversions per variant is too few to draw any conclusion. As a rule of thumb, aim for at least 100 conversions per variant, ideally 250 or more.
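The p-hacking point above can be made concrete by computing how quickly the chance of at least one false positive grows with the number of tests. This sketch assumes the tests are independent and that none of them has a real effect. The Bonferroni correction shown is one common, conservative remedy and is not prescribed by this guide.

```python
def family_wise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests with no real effect."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(alpha: float, num_tests: int) -> float:
    """Per-test significance level that keeps the overall false positive rate near alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        rate = family_wise_error_rate(0.05, k)
        print(f"{k:>2} tests: {rate:.1%} chance of a false winner, "
              f"Bonferroni per-test alpha = {bonferroni_alpha(0.05, k):.4f}")
```

With ten such tests at the 5% level, the chance of at least one spurious winner is about 40%, which is why reporting only the winners is misleading.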
Best Practices
- Set the target sample size and confidence level before starting
- Run for a minimum of 1-2 weeks regardless of early results
- Test one variable at a time for clear attribution
- Use 95% confidence as standard (99% for critical changes)
- Document everything: hypothesis, variants, results, decision
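Setting the target sample size up front, as the first practice above recommends, can be sketched with the standard normal-approximation formula for comparing two proportions. The 80% power default is an assumption (a common convention, not something this guide specifies), and `sample_size_per_variant` is an illustrative name.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float, expected_rate: float,
                            confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect the given change (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(0.5 + confidence / 2)  # 1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)                  # about 0.84 for 80% power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Detecting a lift from a 2.0% to a 2.5% conversion rate.
n = sample_size_per_variant(0.020, 0.025)
print(n)
```

Under these assumptions, detecting a lift from 2.0% to 2.5% takes roughly 13,800 visitors per variant. Smaller expected effects require sharply larger samples, because the required size scales with the inverse square of the difference.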