Table of Contents
- Summary
- Data Connectors
- Virtualized Data Layers
- Data Integration
- In-Memory Database/Grid Platforms
- Data Warehouse Platforms
- Business Intelligence (BI)
- Business Intelligence on Big Data/Data Lakes
- Big Data/Data Lake Platforms
- Data Management and Governance
- Conclusion
- About Andrew Brust
- About GigaOm
- Copyright
1. Summary
Enterprise information workers have more data analytics choices than ever, in terms of paradigms, technologies, vendors, and products. From one perspective, we are presently in a golden age of data analytics: there’s innovation everywhere, and intense competition that is beneficial to enterprise customers. But from another point of view, we’re in the data analytics dark ages, as the technology stack is volatile, vendors are prone to consolidation and shakeout, and the overwhelming variety and choice in technologies and products is an obstacle to analytics adoption, progress, and success.
This is a stressful time for enterprise IT and lines of business alike. No one wants to make the wrong decision and be held responsible for leading their organization down the wrong data analytics path. Yet the responsibility to act, and act soon, is palpable. The pressure is great, and opportunities for evasion and procrastination are receding. It’s a perfect technology storm.
What should you do? Stick with “the devil you know”: old-school data warehousing and business intelligence (BI), running on-premises? Or go the no guts/no glory route and dive head-first into open source big data technologies and run them in the cloud? Most people won’t want to go to either extreme and would instead prefer a middle-ground strategy, but there are a lot of options within that middle range.
Even understanding all your choices, let alone making a decision, is daunting and can bring about a serious paralysis, right at the dawn of real digital transformation in business, where we realize data is a competitive asset. What’s needed is an organizing principle with which to understand the crucial differences among products and technologies. Such a framework is the only way organizations can understand their options and make the right choice.
Consider this: one of the biggest challenges in analytics is the issue of data “silos,” wherein the data you need for successful insights is scattered across different databases, applications, file systems, and individual files. And while that may only seem to add to the pressure, there is an opportunity in this circumstance. The structural challenge of siloed data, the many ways it manifests, and the various ways to mitigate and resolve it can act as the organizing principle with which to understand vendors, technologies, and products in the data analytics space. This principle will help you understand your own requirements and criteria in evaluating products, making a buying decision, and proceeding with implementation.
Data silos aren’t really a “defect,” but rather a steady state for operational data. That is to say that data, in its equilibrium state, is siloed. But analytics is ideally executed over a fully integrated data environment. As such, a big chunk of the analytics effort is to coalesce siloed data, and a careful investigation of vendors, product categories, and products in the analytics arena will show all of them to be addressing this task.
They don’t all do it the same way though. In fact, what distinguishes each analytics product category is the point along the data lifecycle where it causes data silos to coalesce. And that’s why the concept of removing the silos in the data landscape is such an integral part of an organizing principle for the analytics market.
Whether under the heading of data ingestion; data integration; data blending; data harmonization; data virtualization; extract, transform, and load (ETL) or extract, load, and transform (ELT); data modeling; data curation; data discovery; or even just plain old data analysis, every single vendor is in some way, shape, or form focused on melding data into a unified whole.
Surely, analysis and insights, technology aside, are about putting pieces of the puzzle together, from different parts of the business and from different activities in which the business is engaged. Each part of the business and each activity involves its own data and often its own software and database. Viewed this way, we can start to see that removing silos shouldn’t be viewed as an inconvenience; rather, it’s an activity inextricable from the very process of understanding the data.
Once this concept is acknowledged, understood, accepted, and embraced, it can bring about insights of its own. How a product brings disparate data together tells us a lot about the product’s philosophy, approach, and value. And, again, it can also tell us a lot about how the product aligns with a buyer’s requirements, as types of silos and degrees of data segregation will be different for different users and organizations.
In this report, we will look at an array of analytics product categories from this point of view. We will name and describe each product category, identify several companies within it, then explore the way in which they approach the union of data and the elimination of the silos within it.
The product categories we analyze in this report are:
- Data Connectors
- Virtualized Data Layers
- Data Integration
- In-Memory Database/Grid Platforms
- Data Warehouse Platforms
- Business Intelligence
- Business Intelligence on Big Data/Data Lakes
- Big Data/Data Lake Platforms
- Data Management and Governance
For some product categories, the characterization as a silo-removal technology will be obvious, or at least logical. For others, the silo removal analysis may be subtle or seem a bit of a stretch. By the end of the report, though, we hope to convince the reader that each category has a legitimate silo elimination mission, and that viewing the market in this way will provide buyers with an intuitive framework for understanding the myriad products in that market.
Contrast this with the approach of viewing so many products in a one-at-a-time, brute-force manner, a rote method that is doomed to failure. Understanding a market and a group of technologies requires a schema, much like data itself. And with a set of organizing principles in place, that schema becomes strong and intuitive. In fact, such a taxonomy itself eliminates silos and contributes to comprehensive understanding, by connecting the categories across a continuum, instead of leaving them as mutually exclusive islands of products.
With all that said, let’s proceed with our first product category and how products within it address and eliminate data silos.
Perhaps the best way to progress through the analytics product categories is to go bottom-up, starting with products that mostly deal in the nuts and bolts of data analysis, and then work our way up to broader platforms that provide higher layers of abstraction.