Data visualization is an important tool because it reveals new insights into player behavior that can be used to improve gameplay. Visualization can turn intricate, complex data into easy-to-digest pictures of patterns, parallels and trends. Collecting vast amounts of player data is effortless with tools such as deltaDNA, but without killer analytics and visualizations it can be hard to construct a meaningful player narrative and, consequently, to influence game design.
In our earlier player data visualization post here, our Insight team shared their thoughts on simple but powerful visualization techniques. However, most of these techniques are limited to a few features, while modern games have multiple features that can impact gameplay and performance.
In this article, we have outlined more advanced visualization techniques that you can use when considering multiple game features at once.
1. Stacked plots
A simple way to look at three features at once is a stacked chart. These are particularly effective when you want to split a simple result (e.g. a histogram of players by days since install) by a low-cardinality feature (e.g. user level). An example of this kind of plot from deltaDNA’s data mining tool is shown below.
This type of visualization is great for comparing real-world time with in-game progression. In the example above we can see that players are leveling up very slowly, with the majority of players stuck on Level 1 even on Day 9. An average of user level by day might hint at this, but it would be skewed upwards by the minority of players who make it to high levels.
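To make the point about averages concrete, here is a minimal sketch of the counts behind a stacked chart. The `players` sample and the helper names `stacked_counts` and `levels_on_day` are hypothetical, invented for illustration; they are not deltaDNA output or API.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical sample: one (days_since_install, user_level) pair per player.
players = [
    (1, 1), (1, 1), (1, 1), (1, 2),
    (5, 1), (5, 1), (5, 2), (5, 3),
    (9, 1), (9, 1), (9, 1), (9, 2), (9, 12),
]

def stacked_counts(rows):
    """Return {day: Counter({level: n_players})}, the series behind a stacked chart."""
    counts = defaultdict(Counter)
    for day, level in rows:
        counts[day][level] += 1
    return dict(counts)

def levels_on_day(rows, day):
    """Return the list of user levels for players on a given day since install."""
    return [level for d, level in rows if d == day]

counts = stacked_counts(players)

for day in sorted(counts):
    levels = levels_on_day(players, day)
    share_l1 = counts[day][1] / len(levels)
    stack = dict(sorted(counts[day].items()))
    print(
        f"Day {day}: mean level {mean(levels):.1f}, median {median(levels)}, "
        f"{share_l1:.0%} on Level 1, stack {stack}"
    )
```

In this made-up sample, the Day 9 mean level is 3.4 even though 60% of players are still on Level 1, because a single Level 12 player pulls the average up. The per-day counts show the split directly, and each level's counts can be passed as one series to the stacked bar function of whichever plotting library you use.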
2. Treemaps
While the stacked plot is great for three features, to do the same with more features we can use a treemap. Treemaps are useful for understanding how different player groups are progressing through the game relative to real-world time. deltaDNA provides direct connections to complementary tools for more advanced analytics, such as treemaps. The chart below is an evolution of the example above, produced in Tableau:
The size of each cell represents the number of players, while the color represents the user level (as before). Now, in addition to the days since install (D), we can also populate the cells with other meaningful statistics such as the ARPU (A$) and the Win Rate (W). This visualization allows meaningful trends to be revealed quickly. For instance, in this example we can see that players who are stuck (e.g. Level 1 on D6) have a significantly lower win rate than those who are progressing nicely.
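As a rough sketch of the aggregation behind such a treemap, the snippet below groups players into cells keyed by days since install and level, then computes the cell size, ARPU and win rate. The `records` data and the `treemap_cells` helper are hypothetical, and pooling wins over all matches in a cell is an assumed definition of win rate; a tool such as Tableau would do this grouping for you.

```python
from collections import defaultdict

# Hypothetical per-player records:
# (days_since_install, level, revenue, wins, matches)
records = [
    (6, 1, 0.0, 2, 10),
    (6, 1, 0.0, 1, 10),
    (6, 4, 3.0, 6, 10),
    (6, 4, 1.0, 8, 10),
]

def treemap_cells(rows):
    """Aggregate players into treemap cells keyed by (day, level)."""
    totals = defaultdict(
        lambda: {"players": 0, "revenue": 0.0, "wins": 0, "matches": 0}
    )
    for day, level, revenue, wins, matches in rows:
        cell = totals[(day, level)]
        cell["players"] += 1
        cell["revenue"] += revenue
        cell["wins"] += wins
        cell["matches"] += matches

    cells = {}
    for key, t in totals.items():
        cells[key] = {
            "size": t["players"],                 # drives the cell area
            "arpu": t["revenue"] / t["players"],  # average revenue per user
            # Assumed definition: wins pooled over all matches in the cell.
            "win_rate": t["wins"] / t["matches"] if t["matches"] else 0.0,
        }
    return cells

cells = treemap_cells(records)
for (day, level), stats in sorted(cells.items()):
    print(
        f"D{day} Level {level}: players={stats['size']}, "
        f"A${stats['arpu']:.2f}, W={stats['win_rate']:.0%}"
    )
```

With this invented data, the Level 1 cell on D6 has a 15% win rate against 70% for the Level 4 cell, which is the kind of contrast the treemap labels are meant to surface.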
3. Mapping to 2D
While the methods above are great for a small number of features, higher dimensional data (e.g. 10+ features) requires more advanced techniques. At this point, the goal of visualization is to identify correlations and clustering amongst the player base. A good way to achieve this is to ‘map’ the high dimensional space back to a 2D or 3D map that can be more easily grasped. Rather than using the absolute values of each feature, mapping methods work from the distance, or dissimilarity, between pairs of players, calculated across all of their features.
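The idea of a pairwise dissimilarity can be sketched in a few lines. The example below assumes Euclidean distance on standardized features, which is one common choice and not necessarily what any particular mapping method uses internally; the three players and the helper names `standardize` and `distance_matrix` are hypothetical.

```python
import math
from statistics import mean, pstdev

# Hypothetical players with four features each:
# days since install, level reached, win rate, currency spent.
players = {
    "a": [1, 1, 0.20, 0.0],
    "b": [2, 1, 0.25, 0.0],
    "c": [9, 12, 0.70, 40.0],
}

def standardize(rows):
    """Rescale each feature to zero mean and unit variance so no feature dominates."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

def distance_matrix(rows):
    """Return the matrix of Euclidean distances between standardized rows."""
    scaled = standardize(rows)
    return [[math.dist(p, q) for q in scaled] for p in scaled]

names = list(players)
dist = distance_matrix([players[name] for name in names])

for name, row in zip(names, dist):
    print(name, [round(d, 2) for d in row])
```

Players "a" and "b" end up much closer to each other than either is to "c". A mapping method then tries to place points in 2D so that these relationships are preserved as well as possible.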
Methods which use this approach include Multidimensional Scaling (MDS), Locally Linear Embedding (LLE) and t-distributed Stochastic Neighbor Embedding (t-SNE). The last of these, t-SNE, is particularly good for visualizing player metrics, as it copes well with data that has important structure at many scales. An example of a t-SNE visualization for a sample game with the same features as before (Days Since Install, Level Reached, Win Rate and Currency Spent) is shown below.
The X and Y axes in this class of visualization have no intrinsic meaning; points are plotted so that similar players sit close together and dissimilar players sit far apart. While such plots can be difficult to interpret, using tracers such as color, size and shape for the points can reveal important trends. For example, in the plot above, the points are color-coded by Days Since Install, and the size of each point represents Currency Spent. We can see that the small points are spread out in a fan with individual days isolated; for these players, who don’t spend, the features are dominated by how long they have had to play (i.e. to grind). The larger the points (i.e. the more the player has spent), the closer together they are, suggesting that metrics for players who spend a lot are similar.
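For readers who want to try this kind of mapping, here is a minimal sketch using scikit-learn's `TSNE`. This is an assumed toolchain, not necessarily the one used for the plot described above, and the player data is randomly generated purely for illustration. The perplexity of 30 is an arbitrary starting value that usually needs tuning.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic, made-up player base; none of these numbers come from a real game.
rng = np.random.default_rng(0)
n_players = 200

days = rng.integers(1, 10, size=n_players)           # days since install
level = np.minimum(1 + rng.poisson(days / 3), 20)    # level reached
win_rate = rng.uniform(0.1, 0.9, size=n_players)     # win rate
is_spender = rng.random(n_players) < 0.2             # roughly 20% spend
spend = rng.exponential(2.0, size=n_players) * is_spender

features = np.column_stack([days, level, win_rate, spend]).astype(float)

# Standardize first so that no single feature dominates the distances.
scaled = StandardScaler().fit_transform(features)

# Map the four-dimensional feature space down to two dimensions.
embedding = TSNE(
    n_components=2,
    perplexity=30,
    init="pca",
    random_state=0,
).fit_transform(scaled)

print(embedding.shape)
```

The resulting `embedding` has one 2D coordinate per player. To reproduce the tracers described above, the points could be drawn with a scatter plot colored by `days` and sized by `spend`, for example with matplotlib's `plt.scatter(embedding[:, 0], embedding[:, 1], c=days, s=10 + 20 * spend)`.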
Understanding how players interact with your game is an important step towards optimizing both the player experience and game performance. The best player data visualization tools tell a detailed story: simple visualizations are great for quantitative balancing of a game, but going a step further and visualizing the interplay between many complicated player features can lead to a deeper understanding of the player base and in-game behavior.
If you liked this article, you can read ‘How first session length impacts game performance’ here.