Typical test duration: 2-4 weeks for most sites. Low-traffic sites (around 1,000 visitors/week) usually need 8-12 weeks. High-traffic sites (around 100,000 visitors/week) can collect enough data in days, but should still run for at least 1 full business cycle (usually 1 week) to account for day-of-week effects.
Why test duration is harder than it looks
People ask “How long should an A/B test run?” because duration is the visible constraint. But duration is actually a proxy for sample size: you’re waiting for enough data to separate signal from noise.
Duration is not a goal
The goal is a decision you can trust. Duration is just how long it takes to get there.
Rules of thumb are incomplete
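“Run 2 weeks” helps with day-of-week effects, but it doesn’t fix an underpowered test: if two weeks of traffic is less than the sample size you need, the result is still inconclusive.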
Duration by Traffic Level
| Weekly Traffic | Typical Duration | Minimum Duration |
|---|---|---|
| 1,000/week | 8-12 weeks | 4 weeks |
| 5,000/week | 3-4 weeks | 2 weeks |
| 25,000/week | 1-2 weeks | 1 week |
| 100,000/week | 3-7 days | 2 days |
What actually drives test duration
Traffic
More visitors per day = faster accumulation of evidence.
Baseline conversion rate
Lower baseline rates need more samples to detect the same relative lift.
Minimum detectable effect (MDE)
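If you want to detect a 2% lift, you need far more traffic than for 15%. Required sample size grows roughly with the inverse square of the effect, so the 2% test needs on the order of 50 times more visitors.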
Noise/variance
Noisy metrics (revenue, LTV) require more data than binary conversions.
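To make these four drivers concrete, here is a minimal sketch of a duration estimate for a binary conversion metric. It uses the standard normal-approximation formula for comparing two proportions; the function names, the 50/50 traffic split, the 5% significance level, and the 80% power are assumptions chosen for illustration, and the lift is treated as relative to the baseline.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided two-proportion z-test).

    baseline     -- control conversion rate, e.g. 0.03 for 3%
    relative_mde -- smallest relative lift worth detecting, e.g. 0.15 for 15%
    alpha, power -- illustrative defaults, not values taken from the article
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def weeks_needed(baseline, relative_mde, weekly_visitors, variants=2, **kwargs):
    """Weeks of traffic needed, assuming visitors are split evenly across variants."""
    n = sample_size_per_variant(baseline, relative_mde, **kwargs)
    return n * variants / weekly_visitors


# Example: 3% baseline, 15% relative lift, 25,000 visitors per week.
print(round(weeks_needed(0.03, 0.15, 25_000), 1), "weeks")
```

Changing one input at a time shows the pattern described above: halving the traffic doubles the estimate, while shrinking the detectable lift or lowering the baseline rate increases it much faster. Noisy continuous metrics such as revenue need a different variance term and are not covered by this sketch.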
Duration by Industry
| Industry | Typical Duration | Notes |
|---|---|---|
| E-commerce | 2-3 weeks | Higher traffic, faster results |
| SaaS | 3-4 weeks | Lower conversion rates, need more data |
| B2B | 4-8 weeks | Low traffic, long sales cycles |
| Media/Content | 1-2 weeks | High traffic, quick iterations |
Factors Affecting Duration
- Traffic volume: More traffic = faster results
- Baseline conversion rate: Higher rates need less data
- Effect size: Bigger changes are detected faster
- Confidence level: A higher confidence level (95% vs 90%) requires a larger sample
Common duration mistakes (and fixes)
Stopping when you first see significance
Fix: Commit to a stopping rule up-front (fixed horizon or true sequential).
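A small simulation can show why stopping at the first significant result is risky. The sketch below runs A/A tests (both arms share the same true conversion rate, so every "win" is a false positive) and compares checking for significance every day against checking once at the planned end. The traffic numbers, the 10% conversion rate, and the function names are hypothetical values picked for illustration.

```python
import math
import random
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(sims=400, days=14, daily_per_arm=200,
                         rate=0.10, alpha=0.05, seed=1):
    """Return (rate with daily peeking, rate with a single final check)."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _ in range(sims):
        conv_a = conv_b = n = 0
        would_have_stopped = False
        for _day in range(days):
            conv_a += sum(rng.random() < rate for _ in range(daily_per_arm))
            conv_b += sum(rng.random() < rate for _ in range(daily_per_arm))
            n += daily_per_arm
            p = p_value(conv_a, n, conv_b, n)
            if p < alpha:
                would_have_stopped = True  # a daily peeker declares a winner here
        peeking_hits += would_have_stopped
        fixed_hits += p < alpha  # fixed horizon: only the last day counts
    return peeking_hits / sims, fixed_hits / sims


peeking, fixed = false_positive_rates()
print(f"daily peeking: {peeking:.1%}, fixed horizon: {fixed:.1%}")
```

With a fixed horizon the false positive rate should land near the chosen 5%, while daily peeking typically pushes it well above that. A true sequential method is designed to correct for the repeated looks; this sketch does not implement one.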
Not running through a full weekly cycle
Fix: Run at least 7 days (often 14) to cover weekday/weekend shifts.
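One way to apply this rule is to round any raw duration estimate up to whole weeks, so that every weekday appears the same number of times in the data. The helper below is a hypothetical sketch; the name `planned_days` and the 7-day default minimum are assumptions for illustration.

```python
import math


def planned_days(days_for_sample_size, min_days=7):
    """Round a raw estimate up to whole weeks, never going below min_days."""
    weeks = max(math.ceil(days_for_sample_size / 7), math.ceil(min_days / 7))
    return weeks * 7


print(planned_days(3))               # 7: enough data in 3 days, still run a full week
print(planned_days(9))               # 14: 9 days rounds up to two full weeks
print(planned_days(3, min_days=14))  # 14: a stricter two-week minimum
```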
Testing for tiny lifts with low traffic
Fix: Test bolder changes that can plausibly produce a larger effect, combine pages to pool traffic, or change the metric/segment.
Changing variants mid-test
Fix: Treat changes as a new experiment (restart) to avoid invalid inference.
Important Rules
- Minimum 1 week: Always run for at least 1 full business cycle
- Don't stop early: Reaching significance early doesn't mean you should stop
- Calculate upfront: Use a sample size calculator before starting
Calculate Your Duration
Use our free calculator to estimate how long your test needs to run.