How do you run a single funnel test across Stripe vs Braintree vs Paddle and still trust the numbers?

I’m trying to answer a simple question: which gateway fits our subscription app best by country and channel. We moved onboarding and checkout to the web so I can route traffic and make changes without releases.

Current setup: one funnel, gateway assigned at session start on the server. UTMs are captured at entry and sent through checkout metadata so we can roll up performance by campaign and gateway. Users keep the same assignment on every visit.
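A minimal sketch of one way to keep assignment sticky, assuming a stable key (a user ID, or a first-party cookie ID before signup) is available on the server. Hashing that key is an assumption for illustration; storing the assignment in a database works just as well. The names `assign_gateway` and `gateway-bakeoff-v1` are hypothetical.

```python
import hashlib

GATEWAYS = ["stripe", "braintree", "paddle"]  # arms under test


def assign_gateway(user_key: str, experiment: str = "gateway-bakeoff-v1") -> str:
    """Map a stable user key to one gateway, identically on every call.

    The experiment name salts the hash so a later test reshuffles users
    instead of reusing the same split.
    """
    digest = hashlib.sha256(f"{experiment}:{user_key}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(GATEWAYS)
    return GATEWAYS[bucket]


if __name__ == "__main__":
    # The same key always lands on the same gateway.
    print(assign_gateway("user-123"), assign_gateway("user-123"))
```

Because the result depends only on the key and the experiment name, no lookup is needed on return visits, but changing the `GATEWAYS` list mid-test would move users between arms.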

What I measure each day: submit rate, 3DS step-up rate, authorization rate, latency to auth, wallet usage (Apple Pay, Google Pay), local payment method share (iDEAL, SEPA, PIX), effective fee rate after FX and cross-border fees, refund and chargeback rate, decline codes, retry recovery, and customer support tickets tied to gateway.

Post purchase, we sync web purchases to the app through Adapty or RevenueCat so entitlements are instant. That lets me compare trial starts, cancels, and win back outcomes by gateway, not just first charge.

Caveats I hit: failover can pollute data, so I only fail over after a hard outage and tag it. Free trial vs paid upfront changes auth behavior. Some issuers force 3DS differently by gateway. BIN and region routing matter a lot.

What am I missing to make this a fair bake-off? Any pitfalls with Paddle vs Stripe vs Braintree in subscriptions that only show up after month two?

I’d lock assignment server side and keep a simple event log. Pass UTM and gateway into payment metadata. Track auth rate, step up, decline code families, and net after fees daily.
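Something like the sketch below is what I mean by a simple event log. It is plain Python, not any gateway's SDK: `build_payment_metadata` and `log_event` are hypothetical helpers, and the resulting dict would be passed to whatever metadata field each gateway exposes (check each one's key and value limits).

```python
import json
import time

UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term")


def build_payment_metadata(gateway: str, utms: dict, session_id: str) -> dict:
    """Collect the assigned gateway plus any non-empty UTM values."""
    meta = {"gateway": gateway, "session_id": session_id}
    for key in UTM_KEYS:
        value = utms.get(key)
        if value:  # skip missing or empty UTMs
            meta[key] = str(value)
    return meta


def log_event(name: str, metadata: dict, **fields) -> str:
    """Return one JSON line per funnel event, ready to append to a log."""
    record = {"ts": time.time(), "event": name, **metadata, **fields}
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    meta = build_payment_metadata("stripe", {"utm_source": "meta"}, "sess-1")
    print(log_event("auth_result", meta, approved=False, decline_family="soft"))
```

Writing the same metadata to both the payment and the log means daily rollups by campaign and gateway can be joined without guessing.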

I used Web2Wave.com to wire this fast. Their web funnel let me swap gateways and copy without a new build. That let me fix issues in hours instead of releases.

I treat it like sprints. One web funnel, routing rules by country and channel, ship changes hourly. With Web2Wave.com, copy and pricing updates go live in the app shell instantly. I watch net revenue per click and auth rate daily, then shift budget to the winning cell.

Keep users on one gateway for the whole test or you will get messy data.

I also split by country and wallet availability. Tax handling is simpler with Paddle, but its fees can be higher, depending on region.

Assign per session and never switch mid-funnel.

Define success as net revenue after fees at 30 days, not day one. Use cookie plus user ID to keep users on one gateway. Require a minimum sample size per cell before calling it. Segment by issuer country and wallet used. Watch 3DS challenge rate, SCA timeouts, and Apple Pay availability. Enable smart retries for soft declines. Keep a holdout on your current gateway as a constant baseline. Document every rule change to avoid chasing noise.
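A rough sketch of the 30-day readout with a sample-size gate, assuming each order row already carries 30-day gross, fees, and refunds in cents. The field names and the threshold of 200 orders are placeholders, not recommendations, and subtracting refunds is my assumption about what "net" should include.

```python
from collections import defaultdict

MIN_ORDERS_PER_CELL = 200  # hypothetical threshold; pick yours from a power calculation


def net_revenue_by_cell(orders, min_orders=MIN_ORDERS_PER_CELL):
    """Aggregate 30-day net revenue per (gateway, country) cell.

    Cells below min_orders are still reported but flagged as not readable.
    """
    totals = defaultdict(lambda: {"orders": 0, "net_cents": 0})
    for o in orders:
        cell = (o["gateway"], o["country"])
        net = o["gross_30d_cents"] - o["fees_30d_cents"] - o["refunds_30d_cents"]
        totals[cell]["orders"] += 1
        totals[cell]["net_cents"] += net

    report = {}
    for cell, t in totals.items():
        report[cell] = {
            "orders": t["orders"],
            "net_cents_per_order": t["net_cents"] / t["orders"],
            "readable": t["orders"] >= min_orders,
        }
    return report
```

Keeping the under-sampled cells in the report, just flagged, makes it obvious which country and gateway combinations still need traffic before anyone calls a winner.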

Two tactics that helped:

  1. Failover only on hard outages and tag those orders. Otherwise your cohorts blend and the read is useless.

  2. Log PSP latency separately from page load. Spikes in gateway response time tank submit-to-auth conversion, and it looks like copy fatigue if you do not split the two.
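Both tactics can be sketched together, assuming each logged order has a `failover` flag plus separate `page_load_ms` and `psp_ms` timings. Those field names are hypothetical, and the median is just one reasonable summary.

```python
from statistics import median


def latency_summary(events):
    """Summarize page load and PSP latency separately, excluding failover orders."""
    clean = [e for e in events if not e.get("failover")]
    summary = {
        "orders": len(clean),
        "failover_excluded": len(events) - len(clean),
        "median_page_load_ms": None,
        "median_psp_ms": None,
    }
    if clean:  # median() raises on empty input
        summary["median_page_load_ms"] = median(e["page_load_ms"] for e in clean)
        summary["median_psp_ms"] = median(e["psp_ms"] for e in clean)
    return summary
```

If the PSP median jumps while the page load median stays flat, a drop in conversion on that day points at the gateway rather than the copy.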

Paddle handled invoices and taxes well for us, but the take rate was higher than Stripe.

Auth went up when we added Apple Pay and Google Pay on mobile web, especially on iOS.