Updated: Jan 21
Here are the 5 data trends that I believe will be the most significant in 2020.
#1: Data Warehousing as a service (DWaaS)
Enterprises are building modern industry-specific and shared Data Warehousing services to enable clients and partners to consume data within data warehouse environments. According to Eckerson Group, a Fortune 500 consulting firm is rolling out a DWaaS platform on which clients in the healthcare industry can run real-time reports on member enrolment, claims processing and other operations. Also, a major farm equipment manufacturer is preparing a similar offering for dealerships across the United States, as is a major financial services firm for its clients globally. Using cloud and data warehouse automation technologies, companies are converting data warehousing from traditionally time-consuming and bottlenecked back-office operations into predictable and repeatable revenue engines.
#2: Data Vault is going mainstream
The use of Data Vault modelling techniques to build and rebuild the "hub" layer of a 3-tier data warehouse architecture will continue to rise. In 2019, we heard from many companies and partners across Europe and around the world that they are adopting this technique. Because it supports building a scalable and flexible data warehouse, Data Vault is becoming an alternative to 3NF and dimensional models for modernizing data warehouses.
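To make the modelling idea more concrete, here is a minimal sketch of how a Data Vault style hub and satellite might be loaded. It assumes hashed business keys and insert-only satellite history, which are common Data Vault conventions but are not described in this article. All names (`hash_key`, `HubRow`, `SatelliteRow`, `load_customer`) are hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from normalized business keys (assumed convention)."""
    normalized = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass
class HubRow:
    hub_key: str          # hashed business key
    business_key: str     # business key as first seen in the source
    load_ts: datetime
    record_source: str


@dataclass
class SatelliteRow:
    hub_key: str          # points back to the hub row
    load_ts: datetime
    record_source: str
    attributes: dict      # descriptive attributes valid from load_ts


def load_customer(hub: dict, satellites: dict, customer_id: str,
                  attributes: dict, source: str) -> str:
    """Insert the hub row once and append a satellite row only when attributes change."""
    now = datetime.now(timezone.utc)
    key = hash_key(customer_id)
    if key not in hub:
        hub[key] = HubRow(key, customer_id, now, source)
    history = satellites.setdefault(key, [])
    if not history or history[-1].attributes != attributes:
        history.append(SatelliteRow(key, now, source, dict(attributes)))
    return key
```

Because the hub only holds keys and the satellite only appends, new sources or attributes can be added without restructuring what is already loaded, which is one way to read the "scalable and flexible" claim above.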
#3: Multi-cloud and Hybrid are here to stay
Companies increasingly want to move to the cloud; however, moving data, data integration and data preparation from an on-premises solution to the cloud is not easy. In addition, migrating existing data to the cloud can take from weeks to months. The trend that is emerging is therefore the use of hybrid deployments. Early adopters of the cloud are using their cloud storage for dynamic workloads, while on-premises platforms remain highly useful for stable workloads. Another complexity is that most enterprises already have a multi-cloud footprint. In 2020, hybrid and multi-cloud approaches are expected to feature in data ecosystem strategies.
#4: Data Fabric
Data Fabric is a trend that supports agile data at scale. The goal is to have all the data available in a single logical data warehouse. A data fabric virtually integrates data silos, enabling a logical data warehouse architecture that allows easy access to, and integration of, data across heterogeneous and disparate storage. Data virtualization technologies such as Denodo, Tibco DV and Dremio enable companies to implement this trend in an agile and easy way.
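The idea of virtual integration can be illustrated with a small sketch that answers one question from two different stores at query time, without first copying either into a common warehouse. This is only an illustration using the Python standard library: it does not reflect the APIs of Denodo, Tibco DV or Dremio, and the sample data and the names `make_orders_db`, `CUSTOMERS_CSV` and `revenue_by_customer` are hypothetical.

```python
import csv
import io
import sqlite3


def make_orders_db() -> sqlite3.Connection:
    """Stand-in for a relational silo (for example, an order system)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("C1", 120.0), ("C2", 80.0), ("C1", 30.0)],
    )
    return conn


# Stand-in for a file-based silo (for example, a CRM export).
CUSTOMERS_CSV = "customer_id,name\nC1,Acme\nC2,Globex\n"


def revenue_by_customer(conn: sqlite3.Connection, customers_csv: str) -> list:
    """Virtual view: join both sources on demand and return (name, total) pairs."""
    names = {
        row["customer_id"]: row["name"]
        for row in csv.DictReader(io.StringIO(customers_csv))
    }
    totals = conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return [(names.get(cid, "unknown"), total) for cid, total in totals]
```

The consumer calls a single function and sees one combined result, while each source stays where it is. Data virtualization platforms apply the same principle across many more source types and add concerns such as query optimization, caching and security.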
#5: Explainable and Augmented AI in data-driven organizations
The use of ML and AI in data-driven organizations is being accelerated by 2 main trends:
a) The “citizen data scientist”, who should be able to use some basic ML and AI algorithms within their data pipelines as those capabilities begin to show up in more traditional BI and data integration platforms;
b) The ability of data scientists to use more automated tools to put advanced ML and AI algorithms into production.
In 2020, automation frameworks will allow data scientists to create their own data pipelines and easily put them into production.
Another trend for 2020 is Explainable AI. One of the big problems is that, in many cases, AI models have severe limitations when a decision or prescribed action has to be “justified” with reasoning. One great application of explainable AI may be forensic science, where decisions based on evidence have to be justified. Automated tools like DataRobot, Dataiku or FICO open up black boxes with explainable AI models and make these models easier for humans to trust, because the justification logic is built into the models, so that the models can “explain” the reasons behind their decisions.
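As a rough illustration of what "justification logic built into the model" can mean, the sketch below uses a simple linear scoring model that returns its decision together with the contribution of each input. It is a toy example under stated assumptions, not how DataRobot, Dataiku or FICO work: the weights, the bias, the feature names and the `decide` function are all hypothetical.

```python
# Hypothetical weights for a toy credit-style decision (illustrative values only).
WEIGHTS = {"income": 0.5, "debt": -0.75, "years_employed": 0.25}
BIAS = -1.0


def decide(features: dict) -> dict:
    """Score the input and report how much each feature pushed the decision."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    # Rank features by the size of their effect, largest first.
    reasons = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return {"approved": score >= 0, "score": score, "reasons": reasons}
```

Calling `decide({"income": 6, "debt": 2, "years_employed": 2})` returns the decision plus a ranked list of reasons, so a reviewer can see which inputs drove the outcome and in which direction. More complex models need dedicated explanation techniques, but the aim is the same: a decision that comes with its reasons.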
by Paula de Oliveira
Passio Consulting co-founder and Managing Partner