Embrace the dark data matter

Companies, big and small, all have the same problem when it comes to data analytics. Despite the best efforts of IT and senior management, business users end up building shadow BI/data systems of their own. Talk to anyone who has been part of a large-scale BI implementation and they will readily admit that the end result usually boils down to this: take data from the right sources, clean it and provide it in Excel format. So much so that, in many places, the busiest time for the mail exchange is the 30 minutes before official business hours. That is when hundreds, if not thousands, of reports get delivered in Excel format to business users across the organisation. These Excel datasets form the basis of spreadmarts, or dark data matter: no one has a clue that it exists, yet it invariably forms the core fabric of the decision-making process in an organisation.

Data is gold, not the technology itself

Business users don’t care about technology. They don’t participate in the NoSQL vs SQL debate, and they don’t care which visualisation tool offers in-memory capabilities or write-back features into OLAP cubes. Don’t get me wrong: as a total Data Geek, this is what gets my creative juices flowing. But end business users care about one thing only: access to the right data. Once they have the right data, they can do whatever they want with it.

They can plug in their human intelligence (move over AI bots, nothing beats clever clogs when it comes to modelling decisions) to understand what is impacting their business and why.

Flexibility is the key

This modelling culture is neither inefficient nor a waste of time and money. Let’s take a step back: isn’t this what a modern-day enterprise should be doing in the first place? Democratise information, make it available to every stakeholder and let them decide what they want to do with it.

This is where an analytics sandbox comes into play. Click here to read more about it; it is definitely one of the finest articles I have seen online on this topic. As it rightly points out, what fascinates business users is the flexibility of Excel (and spreadsheets) to plug in any data and create business rules with it. But that flexibility is also the key frustration of IT, in that they don’t have a clue what is going on in individual PCs. There could be really valuable insights lurking in those spreadmarts, which could be distributed to other users for decision making. You need to find the right platform to enable these spreadmarts.

Just sharing these spreadsheets on SharePoint or Dropbox is not going to solve the problem. You need an area where business users can create models without restraint, clone them, add rules, create outcomes, slice and dice across business attributes and build dashboards full of KPIs based on those outcomes. You want them to do this without sitting in umpteen meetings or sprint planning sessions. And you want the main source of data for that area to be a clean dataset that has undergone transformations deemed fit for analysis by the gatekeepers of the data. That main source can then be mashed up with other minor data sources as needed by the business users for their specific situational analysis.

Business language only please

Business users have functional responsibilities in an organisation, be it finance, marketing, operations or sales. Building decision models is only a sub-task, one they shouldn’t have to spend too much time on. Building in natural business language also means that any consumer of insights from these models can easily understand them, without email chains and PowerPoint presentations to explain the business logic behind the model.

Giving them read-only access to a copy of the data through a querying tool doesn’t help. You can’t expect them to learn SQL syntax or coding to build analytical models in the sandbox area. What you really need is a tool as simple as a spreadsheet when it comes to maintaining rules and building outcomes. To underpin that simplicity, you need a framework that stores data in a format that enables flexibility.
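To make the idea of spreadsheet-like rules a little more concrete, here is a minimal sketch in Python. It is purely illustrative and not how any particular product works: the rule table, the field names (revenue, days_since_order) and the apply_rules helper are all hypothetical. The point is only that each rule is a row of data a business user could maintain, rather than code they have to write.

```python
import operator

# Comparison operators a business user could pick from a drop-down.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

# Hypothetical rule table: every row reads like a line in a spreadsheet.
RULES = [
    {"outcome": "High value", "field": "revenue", "op": ">=", "value": 10000},
    {"outcome": "At risk", "field": "days_since_order", "op": ">", "value": 90},
]


def apply_rules(row, rules):
    """Return the outcome of the first rule the row satisfies."""
    for rule in rules:
        compare = OPS[rule["op"]]
        if compare(row[rule["field"]], rule["value"]):
            return rule["outcome"]
    return "Unclassified"


# Hypothetical clean dataset handed over by the gatekeepers of the data.
customers = [
    {"name": "Acme", "revenue": 12000, "days_since_order": 10},
    {"name": "Globex", "revenue": 3000, "days_since_order": 120},
    {"name": "Initech", "revenue": 500, "days_since_order": 5},
]

outcomes = {c["name"]: apply_rules(c, RULES) for c in customers}
print(outcomes)
```

Because the rules live in a table rather than in the code, cloning a model or adding a rule means copying or appending rows, which is much closer to how people already work in Excel.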

P.S. At DataQuarks, we are building a platform that can be deployed as an analytics sandbox. All of the points above form the core of our design principles and vision, and they drive our roadmap too.



Image Source: Mapping Dark Matter in Galaxies by European Southern Observatory on Flickr