The ‘science’ of game analytics is becoming increasingly sophisticated. However, there is an ever-present nuisance that can stop an analysis in its tracks: very few players spend. Considering that much of our focus is on understanding payer behaviors, it is frustrating that the typical payer fraction in F2P games is only around 1-2%. This means that even a moderately successful F2P game, with tens of thousands of monthly users, will yield a sample of only a few hundred payers to analyze. Our analyses are therefore often subject to significant amounts of shot noise.
What is Shot Noise?
Shot noise (or Poisson noise) exists whenever we are looking at discrete measurements. It was first considered in the context of individual electrons travelling in vacuum tubes, but equally applies in our scenario of aggregating the behaviors of discrete players.
Fortunately, shot noise is described by a very simple equation: σ=√(N), i.e. the standard deviation (or error) in a measured count N is equal to the square root of that count. To illustrate how this works, let’s consider a common application in games: measuring the payer fraction. Let’s say we wish to measure the number of paying players in a sample. Our sample is 10,000 players and the true payer fraction is 2%. We would therefore expect to see 200 payers in our sample, but because we are subject to shot noise, with σ=√(200)≈14, the number we measure would typically fall anywhere between about 185 and 215.
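To make the arithmetic concrete, here is a minimal Python sketch of the square root rule applied to the example above. The function name `shot_noise_range` is our own choice for illustration, and the range it returns is a one-sigma band, not a hard limit.

```python
import math

def shot_noise_range(expected_count):
    """Return the (low, high) one-sigma band for a count subject to shot noise."""
    sigma = math.sqrt(expected_count)  # sigma = sqrt(N)
    return expected_count - sigma, expected_count + sigma

players = 10_000
payer_fraction = 0.02
expected_payers = players * payer_fraction  # 200 payers expected

low, high = shot_noise_range(expected_payers)
print(f"Expected {expected_payers:.0f} payers, typically between {low:.0f} and {high:.0f}")
```

Running this gives a band of roughly 186 to 214 payers, which is where the rounded figures of 185 and 215 come from.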
How Sample Size can Affect Measuring Payer Fraction in F2P Games
There is also a shot noise error in the total player count of 10,000, so when measuring the payer fraction, the smallest possible error on our calculation (i.e. ignoring any other biases or data issues) is σ=(P/N)√(1/P+1/N), where P is the number of payers and N is the total number of players. So, in our example of 10,000 players with a true payer fraction of 2%, we would typically observe values between 1.85% and 2.15%. This is only a small range, so for this scenario shot noise would not be a problem. If we only had 1,000 players, however, our measurement could typically fall anywhere between 1.55% and 2.45%.
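The formula above is easy to evaluate for any cohort size. The sketch below, with a function name (`payer_fraction_error`) chosen purely for illustration, reproduces the two ranges quoted for 10,000 and 1,000 players.

```python
import math

def payer_fraction_error(payers, players):
    """Shot noise error on a payer fraction: (P/N) * sqrt(1/P + 1/N)."""
    fraction = payers / players
    return fraction * math.sqrt(1 / payers + 1 / players)

true_fraction = 0.02
for players in (10_000, 1_000):
    payers = round(players * true_fraction)
    sigma = payer_fraction_error(payers, players)
    low = true_fraction - sigma
    high = true_fraction + sigma
    print(f"N={players}: sigma={sigma:.2%}, range {low:.2%} to {high:.2%}")
```

Note how the 1/P term dominates: with only 20 payers, the error is almost entirely set by the payer count, not by the size of the cohort.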
It is easy to demonstrate that this concept applies to F2P games by performing a simple Monte Carlo simulation. Let’s take 1 million players and randomly assign 2% of them as payers. We can then split our 1 million players into 1,000 samples of 1,000 players and measure the payer fraction within each sample. Below is a histogram of the resulting payer fractions.
The red line is a Gaussian distribution with mean = 0.02 and standard deviation = 0.0045, so we can see that the result agrees closely with the prediction from shot noise. You may argue that in real life you would never take 1 million players and split them into samples of 1,000 each, but this is directly comparable to cohorting players by install date, acquisition campaign, or region.
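For anyone who wants to repeat the experiment, the following is a rough sketch of the simulation described above, written with only the Python standard library. It is not the code used to produce the histogram; the function name, the seed, and the use of a shuffled list of 0s and 1s are our own assumptions for illustration.

```python
import math
import random
import statistics

def simulate_payer_fractions(total_players=1_000_000, payer_fraction=0.02,
                             sample_size=1_000, seed=1):
    """Randomly mark a fixed share of players as payers, then split into samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n_payers = round(total_players * payer_fraction)
    # 1 marks a payer, 0 marks a non-payer
    players = [1] * n_payers + [0] * (total_players - n_payers)
    rng.shuffle(players)
    return [
        sum(players[start:start + sample_size]) / sample_size
        for start in range(0, total_players, sample_size)
    ]

fractions = simulate_payer_fractions()
measured_mean = statistics.mean(fractions)
measured_sigma = statistics.stdev(fractions)

# Shot noise prediction for a sample of 1,000 players containing about 20 payers
predicted_sigma = 0.02 * math.sqrt(1 / 20 + 1 / 1_000)

print(f"samples: {len(fractions)}")
print(f"measured mean:   {measured_mean:.4f}")
print(f"measured sigma:  {measured_sigma:.4f}")
print(f"predicted sigma: {predicted_sigma:.4f}")
```

The measured spread will differ slightly from run to run and from seed to seed, but it should sit close to the predicted value of about 0.0045.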
Consider Sample Size when Defining Comparison Tests
It is obvious that sample size is an important factor when performing analysis; being able to quantify it in this simple way is vital to the effective planning of AB tests and other comparisons that may drive decision making.
The simple square root rule means that the relative error, √(N)/N, is roughly 30% for 10 participants, 10% for 100 participants and 3% for 1,000 participants. For AB tests, for example, this means that unless you expect the variants to drive exceptionally different conversion rates, you need to consider how many players will experience the test to ensure that meaningful results will be obtained.
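The same rule can be turned around to estimate how large a test needs to be. The sketch below is a back-of-the-envelope planning aid, not a full power calculation. It assumes that the count that matters is the number of payers (since the 1/P term dominates the payer fraction error) and that you already have a rough idea of the payer fraction. The function names are hypothetical.

```python
import math

def relative_error(count):
    """Relative shot noise error on a count: sqrt(N) / N = 1 / sqrt(N)."""
    return 1 / math.sqrt(count)

def payers_needed(target_relative_error):
    """Smallest payer count whose relative shot noise error meets the target."""
    return math.ceil(1 / target_relative_error ** 2)

def players_needed(target_relative_error, payer_fraction):
    """Approximate cohort size needed, assuming a known payer fraction."""
    return math.ceil(payers_needed(target_relative_error) / payer_fraction)

for count in (10, 100, 1_000):
    print(f"{count} -> {relative_error(count):.0%} error")

# Example: how many players for a 10% error on the payer count at a 2% payer fraction?
print(players_needed(0.10, 0.02))
```

Because the error falls only with the square root of the count, halving the error requires four times as many players.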
Is it Significant?
As a final example, let’s say I have an AB test where 1.5% of players in variant A convert and 2.1% of players in variant B convert. Cohorts A and B have 1,500 players each. Is it significant? Tweet us your answer @deltaDNA.
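If you would like to check your answer, one possible approach is to apply the payer fraction error from earlier to each variant, combine the two errors in quadrature, and express the difference in units of that combined error. This is only a sketch under those assumptions, with hypothetical function names. A dedicated statistical test may give a slightly different figure.

```python
import math

def fraction_error(payers, players):
    """Shot noise error on a conversion rate: (P/N) * sqrt(1/P + 1/N)."""
    return (payers / players) * math.sqrt(1 / payers + 1 / players)

def difference_in_sigma(rate_a, rate_b, players_a, players_b):
    """Difference between two conversion rates, in units of the combined error."""
    # Payer counts implied by the quoted rates (may not be whole numbers)
    payers_a = rate_a * players_a
    payers_b = rate_b * players_b
    error_a = fraction_error(payers_a, players_a)
    error_b = fraction_error(payers_b, players_b)
    # Independent errors add in quadrature
    combined_error = math.sqrt(error_a ** 2 + error_b ** 2)
    return (rate_b - rate_a) / combined_error

n_sigma = difference_in_sigma(0.015, 0.021, 1_500, 1_500)
print(f"Variant B differs from variant A by {n_sigma:.2f} sigma")
```

As a rule of thumb, a difference of about 2 sigma corresponds to roughly 95% confidence, so compare the printed value against that threshold.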