Since we stopped waiting on app store review to test pricing, what cadence gave you clean revenue reads?

We stopped tying pricing and onboarding tests to app releases and moved signup and checkout to the web, so we can ship changes in minutes and read the impact the same day.

Cadence-wise, we’ve been running 7‑day cycles to balance weekday/weekend. One control stays fixed. One challenger per cycle. 50/50 split. We freeze creatives and budgets during the run. UTMs stay intact through the funnel so each subscriber keeps their campaign tags. Web purchases sync to app entitlements via a subscription SDK (RevenueCat/Adapty), so revenue and in‑app usage live on one timeline.
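
Roughly, the assignment and UTM carry-through look something like this (simplified sketch; the hashing and helper names are illustrative, not any vendor's API):

```typescript
// Simplified sketch: deterministic 50/50 assignment plus UTM passthrough.
// Helper names and storage are illustrative, not a specific vendor's API.

type Arm = "control" | "challenger";

// Stable hash so the same visitor always lands in the same arm for the cycle.
function hashToUnit(input: string): number {
  let h = 2166136261; // FNV-1a, 32-bit
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0) / 0xffffffff; // map to [0, 1]
}

function assignArm(visitorId: string, cycleId: string): Arm {
  return hashToUnit(`${visitorId}:${cycleId}`) < 0.5 ? "control" : "challenger";
}

// Keep campaign tags attached to the subscriber so revenue reads stay per-channel.
function captureUtms(url: URL): Record<string, string> {
  const keys = ["utm_source", "utm_medium", "utm_campaign", "utm_content"];
  const tags: Record<string, string> = {};
  for (const k of keys) {
    const v = url.searchParams.get(k);
    if (v) tags[k] = v;
  }
  return tags;
}

// Example: stamp arm + UTMs onto the checkout session metadata so the
// purchase event carries them into the revenue timeline.
const landing = new URL("https://example.com/checkout?utm_source=meta&utm_campaign=spring");
const arm = assignArm("anon-123", "2024-w18");
console.log({ arm, ...captureUtms(landing) });
```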

What we read: day‑0 checkout conversion, day‑0 revenue per click, refund rate, plan mix (monthly vs annual), and early activation proxies (paywall reopen rate, first session length). We watch renewals by cohort later, but we don’t block the initial call on them.
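
The day‑0 reads themselves are just ratios over the tagged events; a minimal sketch (the event shape is illustrative):

```typescript
// Minimal sketch of the day-0 reads; the event shape is illustrative.
interface FunnelEvent {
  arm: "control" | "challenger";
  type: "click" | "checkout_start" | "purchase" | "refund";
  revenue?: number; // present on purchase events
}

function dayZeroReads(events: FunnelEvent[], arm: "control" | "challenger") {
  const e = events.filter((x) => x.arm === arm);
  const clicks = e.filter((x) => x.type === "click").length;
  const checkouts = e.filter((x) => x.type === "checkout_start").length;
  const purchases = e.filter((x) => x.type === "purchase");
  const refunds = e.filter((x) => x.type === "refund").length;
  const revenue = purchases.reduce((sum, x) => sum + (x.revenue ?? 0), 0);

  return {
    checkoutConversion: checkouts > 0 ? purchases.length / checkouts : 0,
    revenuePerClick: clicks > 0 ? revenue / clicks : 0,
    refundRate: purchases.length > 0 ? refunds / purchases.length : 0,
  };
}
```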

The part I’m still not confident about is stopping rules. We’ve been using “300+ purchases per arm or a full week, whichever comes first.” It feels a bit arbitrary, and low-volume channels skew the read.
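
For context on why it feels arbitrary, here’s the back-of-envelope check I run on the threshold: a standard two-proportion power approximation showing the smallest lift a given sample can actually resolve (rough sketch; the 5% baseline below is just an assumed number):

```typescript
// Back-of-envelope check on a "purchases per arm" stopping rule: what relative
// lift in checkout conversion can that sample actually resolve? Two-proportion
// normal approximation, 95% confidence, 80% power. Sanity check only, not a
// full power analysis.
function minDetectableRelativeLift(
  baselineConversion: number, // e.g. 0.05 = 5% checkout-to-purchase (assumed)
  purchasesPerArm: number     // the stopping threshold, e.g. 300
): number {
  const zAlpha = 1.96; // two-sided 5% significance
  const zBeta = 0.84;  // 80% power
  const trialsPerArm = purchasesPerArm / baselineConversion; // implied checkouts
  const se = Math.sqrt(
    (2 * baselineConversion * (1 - baselineConversion)) / trialsPerArm
  );
  return ((zAlpha + zBeta) * se) / baselineConversion;
}

// With a 5% baseline, 300 purchases per arm only resolves roughly 20-25%
// relative lift; anything smaller is noise. 500 per arm tightens that somewhat.
console.log(minDetectableRelativeLift(0.05, 300)); // ≈ 0.22
console.log(minDetectableRelativeLift(0.05, 500)); // ≈ 0.17
```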

If you’re testing price and onboarding on a web funnel, how do you structure cadence and stops? What minimum samples do you trust, and how do you avoid noise from channel shifts or creative fatigue?

I do one change per week. Control stays fixed.

Freeze ads and placements. No new creatives mid-test.

Guardrails: stop early only if lift is huge and consistent across paid sources.

I link web checkout to app entitlements through RevenueCat and use Web2Wave to push paywall JSON and prices without a release, so there’s no waiting on store review.
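
On the linking side, the server piece is basically a small webhook endpoint; rough sketch below (Express as an example, and the payload fields are placeholders, not RevenueCat’s actual webhook schema, so check their docs for the real shape):

```typescript
// Rough sketch of the web-purchase -> app-entitlement handoff. Field names are
// placeholders, not RevenueCat's actual webhook schema. The point is just:
// identify the user from the web checkout, record the purchase with its UTM
// tags, and let the subscription SDK surface the entitlement in the app.
import express from "express";

const app = express();
app.use(express.json());

interface PurchaseWebhook {
  userId: string;               // same id the app hands to the SDK
  productId: string;            // monthly vs annual, for plan mix
  priceUsd: number;
  utm?: Record<string, string>; // carried through from the checkout page
}

app.post("/webhooks/purchase", (req, res) => {
  const event = req.body as PurchaseWebhook;

  // Persist to the same timeline as in-app usage so day-0 revenue, plan mix,
  // and later renewals can be read per campaign. (Storage layer omitted.)
  console.log("purchase", event.userId, event.productId, event.utm);

  res.sendStatus(200);
});

app.listen(3000);
```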

Weekly cadence. One variant at a time.

I treat day 1 as noise and judge on days 2–7. Stop at 500+ checkouts per arm or a full week.

Web2Wave.com lets me update copy and pricing on the web, so the app reflects changes instantly.

But watch for weekend swings.

I run seven-day tests with one change at a time.

Keep UTMs consistent and freeze ads, then check revenue per session and refunds after. If traffic is low, extend to two weeks.

Seven days per variant. Freeze ads. Holdout control.

Run 7-day windows to balance weekday and weekend effects. Lock channels and budgets. Two arms only, control and challenger. Define pre-test metrics and stop rules. I use 400–600 checkouts per arm or a full week before calling it. Read conversion, revenue per paying user on day 0, refunds, and early activation. Do a follow-up sanity check the next week by swapping traffic to confirm the lift is real. If results vary by channel, segment and rerun.
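
For the segment check, something like this is enough to catch channels pulling in opposite directions (rough sketch; the shapes and example numbers are made up):

```typescript
// Sketch of the per-channel sanity check: compute the lift within each UTM
// source and flag the test if channels disagree on direction. Shapes and
// thresholds are illustrative.
interface ArmStats {
  checkouts: number;
  purchases: number;
}

type ChannelResults = Record<string, { control: ArmStats; challenger: ArmStats }>;

function channelLifts(results: ChannelResults): Record<string, number> {
  const lifts: Record<string, number> = {};
  for (const [channel, { control, challenger }] of Object.entries(results)) {
    const controlRate = control.purchases / control.checkouts;
    const challengerRate = challenger.purchases / challenger.checkouts;
    lifts[channel] = controlRate > 0 ? (challengerRate - controlRate) / controlRate : 0;
  }
  return lifts;
}

// If any channel moves the opposite way, segment and rerun instead of calling it.
function liftIsConsistent(lifts: Record<string, number>): boolean {
  const values = Object.values(lifts);
  return values.every((v) => v > 0) || values.every((v) => v < 0);
}

const lifts = channelLifts({
  meta:   { control: { checkouts: 900, purchases: 45 }, challenger: { checkouts: 910, purchases: 60 } },
  google: { control: { checkouts: 400, purchases: 22 }, challenger: { checkouts: 390, purchases: 25 } },
});
console.log(lifts, liftIsConsistent(lifts));
```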

One trick that helped: keep a control paywall link in campaigns so you can fall back to it if the test tanks.

When results are close, I rerun the test against new creatives to see if the pattern holds.

Seven day tests work for us. Fewer changes, clearer results.

I would keep one control live and pause new creatives until the test ends.