The Age of the Citizen Data Scientist and Why You Need to Consider Data Democratisation
Updated: Jan 13, 2022
Every day huge amounts of data are produced. This was true in 2010 during the boom of big data technologies, and it has only escalated since then. Many companies are now realising that to gain the competitive edge they need, they must de-silo that data and the insights hidden within. The problem? This data is currently locked behind a skill barrier. It is for this reason that companies call on the services of data scientists, but unfortunately data scientists don’t grow on trees. It takes a great deal of training and experience to become one, so there are never enough to go around for the number of companies that need them. Not only this, but as the digital world becomes increasingly complex, so do the use cases they’re required to solve, and those use cases are growing in number as well. So how do you remove the bottlenecks caused by too much data and not enough data scientists, in order to raise efficiency and give yourself more agility in today’s highly competitive digital world?
The answer? Data democratisation. In case you don’t know, data democratisation is the process of moving access to data from the hands of a few data scientists into the hands of a wider group of employees, empowering individuals at all levels and in all departments of the company to use data in their decision making. The value? Time, productivity and ultimately money. Making a business data driven shortens the decision-making episode, allowing your business to be more agile when responding to opportunities or potential risks. However, data democratisation needs to be done properly in order to avoid the mistakes of the past and to avoid becoming a pitfall of its own.
When we talk about democratising data, what we’re really talking about is the creation of citizen data scientists. The term ‘citizen data scientist’ is becoming pretty popular these days; here at Massive Analytic we’d like to take credit, but I suspect its continued use by Gartner has more to do with it. They define it as “a person who creates or generates models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics.” That’s a little wordy in my opinion; I prefer to define it as someone who is an expert in their field and who, through the use of technology, has been empowered to fill in the gaps left by data scientists. The goal of creating citizen data scientists isn’t to replace current data scientists but to let citizen data scientists pick up where data scientists leave off and fill in any skill gaps, for example by deploying machine learning models, wrangling data or managing databases. This then allows data scientists to focus on the jobs only they can do, such as building the predictive models and algorithms, essentially splitting the traditional role of the data scientist into two piles.
Now there is a school of thought that says data should be left to the experts and that data democratisation is almost synonymous with poor data governance. I’ve heard horror stories of companies where even multiple teams of data scientists, working off the same data sources and building models iteratively, ended up with different versions of the truth and no idea how they got there. It’s not hard to imagine the type of logistical nightmare that can arise if you give everyone a data science platform and tell them to get to work. This is why data democratisation needs to be a managed process with a strong infrastructure in place to support it. The promotion of business users to citizen data scientists needs to be planned with careful consideration of the roles within teams and a dotted line to a central analyst hub. Once you have citizen data scientists, lower value insights work can be pushed to self-service capabilities instead of being deprioritised, and these insights can be more readily filtered through the business. The data scientist’s role now becomes building the models rather than deploying them, with the citizen data scientists focusing on the smaller day-to-day tasks.
So now that you’ve bought into the idea, the question becomes: how do you create a citizen data scientist? Well, the simplest way is through the use of AI platforms, like our Oscar Data Science platform. By automating machine learning, you lower the level of technical expertise required to interrogate data and find insights, and by adding in automated processes and other code-free features, you enable these citizen data scientists to deploy the more complex models that the data scientists have built. At Massive Analytic, we have three different editions of our Oscar platform, designed to support the business and decision episode. The Studio Edition aims to upskill business users by providing decades of data scientist know-how in an easy-to-use platform. It combines machine learning, a UI to manage and wrangle data on Spark, and visualisation in a single code-free platform. We also have an edition designed for data scientists and one for senior execs, but I won’t be covering those in detail here. In short, the Data Scientist edition is something of a data science sandbox, built to allow data scientists to import, build and export models company-wide. The Viewer edition is designed to be a hub for viewing the various reports built in Oscar without the distracting bells and whistles. By democratising data in this tiered fashion, data scientists can maintain quality control while also empowering citizen data scientists to solve lower priority business cases.
Data democratisation is vitally important for the future of your business, but of equal importance is ensuring it’s done in the right way. By promoting the best suited members of your team to citizen data scientists, you let them bring their individual skill sets to the problems at hand and help move your business forward. Careful attention must be paid to governance in order to ensure that the quality of insights remains high. However, companies must not be afraid to embrace the age of the citizen data scientist, lest they be overtaken by more agile competitors.
Today’s more complex digital world requires a new approach to data, and I hope I have left you with some food for thought. If you would like to find out more information about Oscar or any of our other AI Platforms, contact us here: https://www.massiveanalytic.com/contact