

A/B tests for developers

Adrian B.G. on January 04, 2019

This article is only the first part of the main story from my blog: A/B tests developers manual. Best case scenario: Your [product owner,boss,pr...
 
Scott Tadman

Don't forget basic statistics here. If you're developing an A/B test for a case that will be used millions of times per day then a hundred tests is not going to be conclusive. You need to test a statistically significant number of times relative to your actual use case.
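To put rough numbers on "a statistically significant number of times": a common back-of-the-envelope calculation is the sample size needed to detect a given lift in a conversion rate with a two-proportion z-test. The baseline rate and lift below are made-up values, just to show the order of magnitude involved.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(p_baseline, min_detectable_lift,
                            alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect an absolute lift
    in a conversion rate (two-sided two-proportion z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p1, p2 = p_baseline, p_baseline + min_detectable_lift
    p_bar = (p1 + p2) / 2
    num = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p2 - p1) ** 2)

# Detecting a 1% absolute lift on a 10% baseline already needs roughly
# 14,750 users *per variant* at alpha = 0.05 and 80% power.
print(sample_size_per_variant(0.10, 0.01))
```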

Adrian B.G.

Yes, of course. Besides statistical relevance there are other factors, like user acquisition; the cohorts have to be very similar, and there is also an error margin of at least 5% that should be taken into consideration when comparing the stats.

That is the business side of doing tests; I tried to cover only the technical details.

The web is full of articles on how to do proper testing, but they were lacking in implementation details, so I wrote this story.
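As a rough sketch of that comparison step (hypothetical numbers, not from any real test): a two-proportion z-test gives you the p-value to hold the observed lift against a 5% significance level.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for 'variant B converts differently than variant A'."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)              # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# 1,000 vs 1,080 conversions out of 10,000 users each looks like an 8% relative
# lift, but the p-value is ~0.06, so it does not clear the 5% significance bar.
print(two_proportion_p_value(1000, 10_000, 1080, 10_000))
```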

Ben Halpern

Yep. This is a conversation I've definitely had numerous times.

Timkor • Edited

We use load balancing to distribute users over different Git branches. That actually works quite nicely, for both clean and spaghetti code!

The nicest advantage here is that you can change anything in a variant, including backend parts, provided, of course, that your project contains both a frontend and a backend.

It's not possible with cross-variant testing, though.
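A minimal sketch of that routing idea, assuming two deployments of the app (one per Git branch) sitting behind the load balancer; the branch names and internal URLs are made up.

```python
import random

# Hypothetical deployments, one per Git branch, behind the load balancer.
UPSTREAMS = {
    "main":         "http://app-main.internal:8080",
    "new-checkout": "http://app-new-checkout.internal:8080",
}

def pick_upstream() -> str:
    """Send each incoming request to one of the branch deployments, 50/50."""
    branch = random.choice(list(UPSTREAMS))
    return UPSTREAMS[branch]

print(pick_upstream())
```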

Adrian B.G.

We use load balancing to distribute users over different Git branches. That actually works quite nicely.

I don't see how you can achieve a good user experience and a relevant result based on that; I mean, you cannot guarantee that the user will end up in the same variant for their entire lifetime (across sessions) with just a LB.

Also, you can introduce technical bias: for example, one backend or set of servers can have different latency, which will skew your business results. The owners will think that feature A is better, but the users actually responded better because of the lower latency.

So I would not recommend this approach for a complex project. For small stuff or landing pages where the users have only one visit, sure, nothing can go wrong.

Timkor • Edited

We use cookies to set the variant within the load balancer. The instances run on the same server, so the only latency difference should be a result of changes to the code, which is exactly what you want to test.

Also, how are cookies different from your approach?
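A sketch of the cookie variant of the same idea (cookie name and upstream URLs are hypothetical): the proxy reuses whatever variant is already in the cookie and only assigns a random one to brand-new visitors.

```python
import random

# Hypothetical variant -> deployment mapping and a made-up cookie name.
UPSTREAMS = {
    "control": "http://app-main.internal:8080",
    "variant": "http://app-new-checkout.internal:8080",
}

def pick_upstream(cookies: dict) -> tuple[str, str]:
    """Reuse the variant stored in the ab_variant cookie if it is valid;
    otherwise assign one at random (the proxy then sets the cookie)."""
    choice = cookies.get("ab_variant")
    if choice not in UPSTREAMS:
        choice = random.choice(list(UPSTREAMS))
    return choice, UPSTREAMS[choice]

print(pick_upstream({}))                         # new visitor: random split
print(pick_upstream({"ab_variant": "variant"}))  # returning visitor: sticky
```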

Adrian B.G.

I said the user's lifetime (across sessions). Cookies are session-based/volatile: if users switch to another browser or clear their cookies they will see a different version, which results in the issues I mentioned earlier.

I did not present my approach because I don't know all the details, but all the tests we did in gaming used a database for persistence; we had the luxury of having all our visitors be authenticated users, so we knew who was in which test.

Depending on what is tested, persistence may or may not be required. Usually our features require a few weeks of measuring their impact on user behavior, and without authentication this cannot be done properly.
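A minimal sketch of that kind of database-backed persistence, assuming authenticated users and using SQLite purely as a stand-in (table, column, and test names are made up).

```python
import random
import sqlite3

conn = sqlite3.connect("ab_tests.db")
conn.execute("""CREATE TABLE IF NOT EXISTS assignments (
                    user_id TEXT NOT NULL,
                    test_name TEXT NOT NULL,
                    variant TEXT NOT NULL,
                    PRIMARY KEY (user_id, test_name))""")

def get_variant(user_id: str, test_name: str, variants=("A", "B")) -> str:
    """Return the stored variant for this user, assigning one on first contact.
    Because the row is keyed by user_id, the user sees the same variant on
    every device and in every session for the whole lifetime of the test."""
    row = conn.execute(
        "SELECT variant FROM assignments WHERE user_id = ? AND test_name = ?",
        (user_id, test_name)).fetchone()
    if row:
        return row[0]
    variant = random.choice(variants)
    conn.execute("INSERT INTO assignments VALUES (?, ?, ?)",
                 (user_id, test_name, variant))
    conn.commit()
    return variant

print(get_variant("player-42", "new-shop-ui"))  # same answer on every call
```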

Timkor

Nevertheless, this would also be possible using load balancing, although it might require some customization.

Anyway, I use it for webshops, targeting new users. Most e-commerce websites do not have the luxury of requiring authentication before any actual conversion, and in that case this approach works perfectly.

Good article though.

Adrian B.G.

When testing an action like conversion it sounds great; you do not even care about the user's lifetime, but rather about which version they reacted better to and converted on.

Besides round robin, the LB/proxy can also be used to build the cohorts, for example based on country, or to limit an entire test based on a property (e.g. country, region, language, device, mobile vs web).
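A sketch of that gating step with made-up targeting rules: the proxy (or the app behind it) first checks whether the request is even eligible for the test, and only then buckets it into a cohort.

```python
import random

# Hypothetical targeting rules: only mobile users from these countries enter the test.
ELIGIBLE_COUNTRIES = {"DE", "FR", "RO"}
ELIGIBLE_DEVICES = {"mobile"}

def assign(request: dict) -> str:
    """Return 'A', 'B', or 'excluded' based on request properties
    (country, device) that a LB/proxy typically already knows."""
    if request.get("country") not in ELIGIBLE_COUNTRIES:
        return "excluded"
    if request.get("device") not in ELIGIBLE_DEVICES:
        return "excluded"
    return random.choice(["A", "B"])

print(assign({"country": "DE", "device": "mobile"}))   # enters the test
print(assign({"country": "US", "device": "desktop"}))  # excluded
```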

David J Eddy

Nice article, especially the A/B vs Blue/Green comparison.

Could you share where you found the 'Journey Prep' image?

Looking forward to the next article, Adrian. Thanks.

Adrian B.G.

Thanks!

I think it is from canva.com's free images, but I'd have to search again; it's been more than a year since I wrote this.

PavanDixit

Hi, great information! Can anyone help me with tips, as I have an interview for the role of A/B testing developer?