Unstructured Data Management 2017

Table of Contents

  1. Summary
  2. Introduction
  3. Usage Scenarios
  4. Disruption Vectors
  5. Tools Analysis
  6. Key Takeaways
  7. About William McKnight

1. Summary

The majority of the information in your organization that is not under management is unstructured data. Unstructured data has always been a valuable asset to organizations, but it can be difficult to manage. Emails, documents, medical records, contracts, design specifications, legal agreements, advertisements, delivery instructions, and other text-based sources of information do not fit neatly into tabular relational databases. Even many NoSQL databases and Hadoop do not adequately address the specialized pre-processing, query, and organization requirements of pure unstructured data, and instead rely on techniques that essentially structure the data first.

There is no one set of tools that will solve everything. However, if you have a heavy unstructured data workload and wish to optimize your results, a new breed of powerful search and data management tools is changing the game. These tools, which are poised to expand dramatically within organizations that realize the gravity of the challenge, have been chosen for this report.

This Sector Roadmap is focused on unstructured data management tool selection for multiple uses across the enterprise. We eliminated any products that may have been well-positioned and viable for limited or non-analytical uses, such as log file management, but deficient in other areas. Our selected use cases are designed to remain highly relevant for years to come, so the products we chose needed to match all of these uses. In general, we recommend that an enterprise only pursue an unstructured data management tool capable of addressing a majority or all of that enterprise’s use cases.

Organizations today need to take advantage of the numerous relevant data platforms, while maintaining a central repository where governance can be enacted and quality can be assured. Managing the data effectively is a key indicator of success in analytics. Progressive organizations have more data platforms than ever before, and there is a clear need to bring key data together for the entire company. However, with hybrid and cloud architectures, in which key data from sources and for target systems is distributed among on-premises, cloud, and third-party systems, the data management challenge is growing exponentially.

An enterprise’s success in data, in analytics, and even in its business as a whole depends on its ability to manage and glean insight from unstructured data with a modern platform and mature process.

In this Sector Roadmap, vendor solutions are evaluated over five Disruption Vectors: query operations, search capabilities, deployment options, data management features, and schema requirements.

Key findings in our analysis include:

Key: The number indicates a company’s relative strength across all vectors; the size of the ball indicates a company’s relative strength along an individual vector.
