There are many things to consider when deciding what to track in your data schema on a game analytics platform, and if you don’t pick the right approach, you can find that it’s impossible to do anything with the data.
Setting up your data schema is an important process to ensure that your analytics team can generate actionable insights. But if you want to set it up correctly, you need to think hard about the needs of the business.
What not to do
A ‘less is more’ approach can be very tempting.
You only have to think about the really big picture – installs, retention, conversion, and revenue. All this can be great stuff to know and will give you a good idea of the health of a game. Without any supporting data to give some context, however, there’s no way of knowing whether these measurements are good or bad.
Another option is to take the scattergun approach, where you track everything and figure it out later.
There are a multitude of problems with ‘tracking everything’, not least when it comes to accessing and storing the data. Beyond the obvious costs of having ‘everything’ in a database, there then comes the trouble of trying to understand all that data. It can be tempting to think that the data will appear like a map and it will be easy to work out where the problems lie. Sadly, this is not the case.
Even if ‘everything’ is tracked, there still has to be some plan for what to look for. Otherwise, the data is just rows and rows of letters and numbers, much of it irrelevant and all of it needing to be sifted through.
It’s far more efficient to work from the other direction. First think about which insights you expect or want, then work out what you need to know in order to test each hypothesis.
What you need to know
Even if there are no specific questions before starting, having some idea of the areas that should be looked at helps a lot in narrowing down what to track and how to set up your data schema. Some games have hundreds of items, characters, moves, and countless other miscellany that could severely bloat a database; in such cases, you should be able to discount what you don’t need to know.
If the player can have a huge inventory in the game, it would be inadvisable to track the entire contents at every moment. Instead, consider tracking items only when they’re gained, used, or lost, with contextual parameters carrying additional information such as the item name, type, amount, and how it was gained, used, or lost.
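As a rough illustration of the item-event idea above, here is a minimal Python sketch. The class name `ItemEvent`, the action names, and the payload layout are all assumptions made for this example; they are not taken from any particular analytics SDK, and a real platform will have its own event format.

```python
from dataclasses import dataclass

# Hypothetical action names; not taken from any specific analytics SDK.
ITEM_ACTIONS = ("gained", "used", "lost")


@dataclass(frozen=True)
class ItemEvent:
    """One inventory change, recorded at the moment it happens."""
    action: str     # one of ITEM_ACTIONS
    item_name: str
    item_type: str
    amount: int
    reason: str     # how the item was gained, used, or lost

    def __post_init__(self):
        # Reject events that would be hard to interpret later.
        if self.action not in ITEM_ACTIONS:
            raise ValueError(f"unknown item action: {self.action!r}")
        if self.amount <= 0:
            raise ValueError("amount must be a positive integer")

    def to_payload(self) -> dict:
        """Build the dictionary that would be sent to the analytics platform."""
        return {
            "event_name": f"item_{self.action}",
            "params": {
                "item_name": self.item_name,
                "item_type": self.item_type,
                "amount": self.amount,
                "reason": self.reason,
            },
        }


event = ItemEvent("gained", "health_potion", "consumable", 3, "mission_reward")
print(event.to_payload())
```

Each event describes a single change, so the inventory at any point can still be reconstructed from the event history without ever sending the full contents.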
Another common example is games where players can collect a number of characters to use. It’s often not important to know the status of every character at a given time. Instead, try narrowing the data down to significant events surrounding a character, or even summaries at key moments. Is it important to know every time a character performed a special move, or would it be just as useful to know at the end of a mission that the character performed a special move X times? Does every hit on an enemy need to be tracked, or could the number of hits, hit rate, damage done, etc. all be sent in an attack summary event?
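The summary approach might look something like the following Python sketch, which counts attacks locally during a mission and produces a single event at the end. The class `MissionCombatSummary`, its fields, and the event name `attack_summary` are hypothetical names chosen for this example only.

```python
class MissionCombatSummary:
    """Accumulates combat stats in memory and emits one summary event."""

    def __init__(self, character_id: str):
        self.character_id = character_id
        self.attacks = 0
        self.hits = 0
        self.damage_done = 0
        self.special_moves = 0

    def record_attack(self, hit: bool, damage: int = 0, special: bool = False) -> None:
        # Nothing is sent here; the counters are only updated locally.
        self.attacks += 1
        if special:
            self.special_moves += 1
        if hit:
            self.hits += 1
            self.damage_done += damage

    def to_event(self) -> dict:
        """Build the single event to send when the mission ends."""
        hit_rate = self.hits / self.attacks if self.attacks else 0.0
        return {
            "event_name": "attack_summary",  # hypothetical event name
            "params": {
                "character_id": self.character_id,
                "attacks": self.attacks,
                "hits": self.hits,
                "hit_rate": hit_rate,
                "damage_done": self.damage_done,
                "special_moves": self.special_moves,
            },
        }


summary = MissionCombatSummary("knight_01")
summary.record_attack(hit=True, damage=10)
summary.record_attack(hit=True, damage=20, special=True)
summary.record_attack(hit=False)
summary.record_attack(hit=True, damage=30)
print(summary.to_event())
```

Four attacks produce one row of data instead of four, while still answering the questions about hit rate, damage, and special move usage.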
A final but very important consideration is to think about how the data is going to be handled.
It can be illuminating to involve the analyst who’ll be going through all the data, if that isn’t the schema designer, and ask whether they can picture how they’ll look at it. If the person doing the investigative work can’t think of a way to query what seems like a very simple but important piece of information, it’s probably a good sign that the schema needs to be rethought. If the schema designer and analyst are the same person, try looking at the design from a purely analytic standpoint, or get another analyst to give feedback on the schema.
At deltaDNA we’ve designed hundreds of data schemas and have seen all manner of variations from different developers, so get in touch if you have any questions about getting set up with analytics.
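One way to run this kind of check before committing to a schema is to load a few made-up rows into a throwaway database and see whether a simple question can be answered with a simple query. The sketch below uses Python’s built-in `sqlite3` module; the table `item_events`, its columns, and the sample rows are invented for illustration and are not a recommended schema.

```python
import sqlite3


def items_gained_by_source(rows):
    """Answer 'how many items were gained from each source?' on sample data."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE item_events ("
        "player_id TEXT, action TEXT, item_name TEXT, amount INTEGER, source TEXT)"
    )
    conn.executemany("INSERT INTO item_events VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT source, SUM(amount) FROM item_events "
        "WHERE action = 'gained' GROUP BY source ORDER BY source"
    ).fetchall()
    conn.close()
    return result


# Made-up sample rows, standing in for events the game would send.
sample_rows = [
    ("p1", "gained", "potion", 2, "shop"),
    ("p2", "gained", "potion", 1, "mission"),
    ("p1", "used", "potion", 1, "combat"),
    ("p2", "gained", "sword", 1, "shop"),
]
print(items_gained_by_source(sample_rows))
```

If a question this basic needs several joins or string parsing to answer, that is a hint the event parameters should be restructured before the game ships.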