First-class analytics can only happen with quality data. As the old saying goes, ‘garbage in, garbage out’, and it still holds true: incorrect data is of very little use. Data quality is therefore vital to ensure accuracy and reliability. Some analytics systems allow you to query your data without validating it; however, we only analyze validated data.
Laurie McCulloch, head of operations at deltaDNA, has over 20 years’ experience of working in game content and services. He outlines the importance and the process of data validation.
Data validation is vital to ensure the data is clean, correct and useful. If you are sending billions of events from millions of players, you will not want to clean your data before you can run any analysis. Running validation on your data as it is ingested means you can be confident in the results.
We validate all the events that players send, checking that the types of events sent match what you have specified in your event schema. Within those events, we check every parameter to ensure it is the correct type and that its value falls within the range you expect. We also validate transaction receipts to exclude any hacked transactions from revenue reporting. This is crucial because validated data is something you can trust and use to make informed decisions and take decisive action.
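The checks described above (event type, parameter type, value range) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not deltaDNA's implementation: the schema layout, the event and parameter names, and the `ParamSpec` and `validate_event` helpers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ParamSpec:
    """Expected type and optional inclusive numeric range for one parameter."""
    type_: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None


# Hypothetical event schema: event name -> parameter name -> rule.
EVENT_SCHEMA: Dict[str, Dict[str, ParamSpec]] = {
    "levelCompleted": {
        "levelNumber": ParamSpec(int, 1, 500),
        "difficulty": ParamSpec(str),
    },
}


def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the event is valid."""
    name = event.get("eventName")
    rules = EVENT_SCHEMA.get(name)
    if rules is None:
        return [f"unknown event type: {name!r}"]

    params = event.get("params", {})
    errors: List[str] = []
    for key, rule in rules.items():
        if key not in params:
            errors.append(f"{key}: missing")
            continue
        value = params[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, rule.type_):
            errors.append(
                f"{key}: expected {rule.type_.__name__}, got {type(value).__name__}"
            )
            continue
        if rule.min_value is not None and value < rule.min_value:
            errors.append(f"{key}: {value} is below {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            errors.append(f"{key}: {value} is above {rule.max_value}")
    # Parameters that the schema does not define are also rejected.
    errors.extend(f"{key}: not in schema" for key in params if key not in rules)
    return errors
```

An ingestion step built this way would store only the events for which `validate_event` returns an empty list, and route the rest to a rejection log for review.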
To find out whether your analytics system validates data, check whether it asks you to define an event schema before you send in any data. If it does, there is a good chance that it validates data. If it doesn’t, any errors in your data are going to flow right through into the data warehouse.
If your data is not validated, the impact can be huge. Imagine a scenario where your game is hacked and receives invalid transactions from fake app stores. This would distort your revenue numbers, your average revenue per user and your lifetime value; ultimately, all your economy reporting would be completely out, which is why it’s immensely important to validate your data! When transaction receipts are validated, we can also identify which users are potentially abusing the system and are unlikely to make genuine in-app purchases. You could then target campaigns at those users, for example by increasing their ad frequency.
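To make the scenario above concrete, here is a small sketch of how unvalidated receipts inflate average revenue per user, and how the same validation flag can be used to pick out the affected users. The transactions are made-up illustrative figures, and the `receipt_valid` field is an assumption: it stands in for the result of whatever receipt verification happens upstream.

```python
from typing import Dict, List, Set

# Illustrative, made-up transactions; amounts are in cents.
# "receipt_valid" is assumed to be set by an upstream receipt check.
TRANSACTIONS: List[Dict] = [
    {"user_id": "a", "amount_cents": 499, "receipt_valid": True},
    {"user_id": "b", "amount_cents": 999, "receipt_valid": True},
    {"user_id": "c", "amount_cents": 9999, "receipt_valid": False},
    {"user_id": "c", "amount_cents": 9999, "receipt_valid": False},
]


def arpu_cents(
    transactions: List[Dict], active_users: int, validated_only: bool = True
) -> float:
    """Average revenue per active user, optionally counting invalid receipts."""
    revenue = sum(
        t["amount_cents"]
        for t in transactions
        if t["receipt_valid"] or not validated_only
    )
    return revenue / active_users


def users_with_invalid_receipts(transactions: List[Dict]) -> Set[str]:
    """Users who sent at least one transaction that failed validation."""
    return {t["user_id"] for t in transactions if not t["receipt_valid"]}
```

With 100 active users, counting every transaction gives an average of 214.96 cents per user, while counting only validated receipts gives 14.98 cents, so two fake purchases from a single user inflate the figure roughly fourteenfold. `users_with_invalid_receipts(TRANSACTIONS)` returns the set containing user `c`, which is the kind of segment you might target with a different campaign.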
You may be wondering whether there is ever a circumstance when it’s OK not to validate data. We believe that it is never worth the risk. A little bit of thought and planning upfront goes a long way to ensure your data is accurate and clean, and that you are capturing the right sort of data to run your analysis with.