Introduction
In today’s practice we will look at how to perform A/B testing. A/B testing is used in numerous ways: to compare different versions of web pages, UX, surveys and questionnaires, changes in policies, marketing campaigns, emails and so on.
Broadly speaking, A/B tests are run mostly on two types of data:
1. Continuous or discrete numbers, for example the average number of clicks or the time spent on a page
2. Proportions or percentages, for example conversion rates
For the first data type, the t-test is most frequently used. For the second, Pearson’s Chi-squared test is the usual choice. Let’s take a look at both cases.
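Before turning to the real data, here is a minimal sketch of what the two tests look like in base R. All numbers are simulated or made up purely for illustration (the names time_a, time_b, conversions and visitors are hypothetical and do not come from the dataset used in this practice).

```r
# Hypothetical data: none of these values come from AB_clicks.csv.
set.seed(42)

# Case 1: a numeric metric (e.g. time on page, in seconds) for versions A and B.
time_a <- rnorm(200, mean = 60, sd = 15)
time_b <- rnorm(200, mean = 64, sd = 15)

# t.test() runs Welch's two-sample t-test by default (var.equal = FALSE).
t_res <- t.test(time_a, time_b)
t_res$p.value

# Case 2: a proportion (e.g. conversions out of visitors) for versions A and B.
conversions <- c(A = 48, B = 67)
visitors    <- c(A = 1000, B = 1000)

# Build the 2x2 contingency table: converted vs. not converted, per version.
tab <- rbind(converted     = conversions,
             not_converted = visitors - conversions)

# chisq.test() applies Yates' continuity correction to 2x2 tables by default.
chi_res <- chisq.test(tab)
chi_res$p.value
```

In both cases a small p-value would suggest that the two versions differ; how small is small enough depends on the significance level chosen before the experiment.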
Libraries
library(data.table)
library(dplyr)
library(ggplot2)
library(nortest) # install.packages("nortest")
library(pwr) # install.packages("pwr")
Loading the data
Let us load the data. A file dialog opens; choose the file AB_clicks.csv:
dt <- fread(file.choose()) # AB_clicks.csv
We can now open the data in the viewer:
View(dt)
The second column of our dataset contains HTML tag names. HTML is the markup language used to build web pages.
Let’s look at the unique values in the data:
Number of different HTML elements:
length(unique(dt$Element_ID))
## [1] 108
What different tags are there:
unique(dt$Tag_name)
## [1] "area" "a" "input" "button" "span" "p" "div"
## [8] "form" "img" "ul" "li" "object" "center" "strong"
## [15] "font"
Values of the feature “Visible”:
unique(dt$Visible)
## [1] FALSE TRUE
Values of the feature “Version”:
unique(dt$Version)
## [1] "Interact" "Connect" "Learn" "Help" "Services"
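The separate unique() calls above can also be condensed with data.table. The sketch below uses a small toy table (toy, with made-up rows) that only reuses the column names seen above, so it runs without the CSV file; the same expressions should work on dt, assuming it has these columns.

```r
library(data.table)

# Toy stand-in for dt: the rows are hypothetical, only the column names
# (Element_ID, Tag_name, Visible, Version) are taken from the text above.
toy <- data.table(
  Element_ID = c(1, 2, 3, 1, 2, 4),
  Tag_name   = c("a", "a", "img", "a", "a", "button"),
  Visible    = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  Version    = c("Interact", "Interact", "Interact",
                 "Connect", "Connect", "Connect")
)

# Number of distinct values in every column at once.
n_unique <- toy[, lapply(.SD, uniqueN)]
n_unique

# Number of rows recorded for each page version.
rows_per_version <- toy[, .N, by = Version]
rows_per_version
```

Counting rows per version is a quick sanity check that every variant of the page is actually represented in the data before any test is run.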
This is a cleaned version of the data from https://scholarworks.montana.edu/xmlui/handle/1/3507. Montana State University found that the Interact button on its page was heavily underused. They investigated the problem with questionnaires and concluded that the name itself, possibly too intimidating, might be one of the reasons. They came up with several other versions: