Paul Miller, Author at Gigaom

Survey: What the Enterprise Cloud Needs to Become Business-Critical
https://gigaom.com/report/survey-what-the-enterprise-cloud-needs-to-become-business-critical/
Mon, 09 Mar 2015

Enterprise IT decision makers are moving beyond pilots and proofs of concept to strategically embrace the cloud for business-critical applications and workloads.

Among those widely adopting cloud computing today are startups without existing infrastructure investment and small teams or individuals within larger organizations. There is also growing enthusiasm for both public and private clouds as a formal component of the enterprise IT strategy, sitting alongside existing data center investment.

At least initially, enterprise adopters tended to use the cloud for non-critical workloads. Typical uses include (but are not limited to):

  • Proof of concept
  • Development and test environments
  • Temporary workspaces
  • Short data processing jobs focused on non-sensitive data
  • Seasonal workload spikes

Many organizations are now progressing beyond these early uses, putting cloud computing to work in support of business-critical applications and workloads. Gigaom Research surveyed more than 300 IT decision makers at companies with at least 1,000 employees to understand enterprises’ shifting attitudes toward the cloud as a viable piece of their business-critical infrastructure.

Key findings from this report include:

  • Security remains the principal consideration. Sixty-five percent of respondents agreed that the security of network connections to their cloud-based applications was a cause for concern. A variety of methods is employed to mitigate this risk, from requiring employees to connect from company devices or over virtual private networks (VPNs) to deploying private networks between data centers that bypass the public internet altogether (a brief example follows this list). Respondents tend not to favor a single approach, and use a range of technical and organizational procedures in an effort to balance cost and complexity with organizational requirements for security or regulatory compliance.
  • Sixty-six percent of respondents consider one or more Software-as-a-Service (SaaS) applications to be business-critical today, and a significant number also support critical workloads with public Database-as-a-Service (DBaaS) or Infrastructure-as-a-Service (IaaS) compute and storage offerings.
  • Future growth expectations in these areas are low over the next two years, and survey respondents identify a number of pressing technical and legal hurdles standing in the way of further adoption in support of critical workloads. These include concerns around data security, regulatory issues, the quality of network connectivity, and the cost of moving off existing hardware investments.
  • These barriers are not expected to diminish in the near term, and there is some concern that they will remain significant for five years or more.
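
One of the network-level controls mentioned in the first finding can be sketched in a few lines. The example below uses Amazon’s boto3 library to limit inbound HTTPS traffic to an address range reachable only over a corporate VPN; the security group ID and the VPN address pool are assumptions made purely for illustration.

```python
import boto3

# Sketch: allow HTTPS only from the corporate VPN address pool, so employees
# must connect over the VPN to reach the cloud-hosted application.
# The security group ID and CIDR range below are hypothetical.
ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        # Only the VPN's address range is permitted; other sources stay blocked.
        "IpRanges": [{"CidrIp": "10.20.0.0/16", "Description": "Corporate VPN pool"}],
    }],
)
```

A rule like this addresses only one layer of the problem, which is why, as the findings note, respondents combine several technical and organizational controls rather than relying on any single measure.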

Thumbnail image courtesy of Bim/iStock.

Ensuring a Successful OpenStack Deployment
https://gigaom.com/report/ensuring-a-successful-openstack-deployment/
Fri, 27 Feb 2015

Knowing the different ways in which an OpenStack cloud can be deployed helps ensure a successful and sustainable solution to local business requirements.

The OpenStack cloud-management framework has come a long way since its launch four and a half years ago. Backed by several hundred of the IT industry’s biggest players, and with significant code releases every six months, the project consistently figures prominently when CIOs consider the ways in which their IT capabilities can best adapt and evolve to changing demands.

With all of OpenStack’s code freely available under an open-source license, it can appear straightforward to download the code and deploy an OpenStack cloud for internal use. Some, such as LivePerson, have done this and successfully run an internal cloud deployed across hundreds of hosts in multiple data centers.
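
Whichever route a deployment takes, the resulting cloud exposes the standard OpenStack APIs, so a basic health check can be scripted against it from day one. A minimal sketch using the official openstacksdk Python library follows; the clouds.yaml entry name and the image, flavor, and network names are assumptions.

```python
import openstack

# Sketch: a post-deployment smoke test for a homegrown OpenStack cloud.
# The cloud name, image, flavor, and network names are all hypothetical.
conn = openstack.connect(cloud="internal-openstack")

# Confirm the image service answers at all.
print([image.name for image in conn.image.images()])

# Boot a throwaway instance, wait for it to become active, then clean up.
image = conn.image.find_image("cirros-0.3.4-x86_64")
flavor = conn.compute.find_flavor("m1.tiny")
network = conn.network.find_network("internal")

server = conn.compute.create_server(
    name="smoke-test",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.status)  # expect "ACTIVE" on a healthy deployment

conn.compute.delete_server(server)
```

Passing a check like this says nothing, of course, about how sustainable the deployment will prove over time.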

But for many, the early cost savings and flexibility of homegrown deployments ultimately prove to be a false economy. Local configuration choices do not always benefit from the latest best practice in the broader OpenStack community. Local customizations and tweaks to the code gradually move the local installation further and further from mainstream OpenStack, and these problems usually grow with each subsequent release from the OpenStack Foundation. Although affordable to launch, homegrown OpenStack deployments may end up isolated from improvements to the mainstream code, increasingly expensive to patch and maintain, and potentially unsuccessful.

This report discusses some of the ways in which OpenStack projects are deployed. It explores lessons from across the industry in order to highlight emerging best practices that help ensure a successful and sustainable solution to local business requirements.

Thumbnail image courtesy of mgkaya/iStock.

Extending Hadoop Towards the Data Lake
https://gigaom.com/report/extending-hadoop-towards-the-data-lake/
Thu, 19 Feb 2015

Early adopters of the data lake are integrating Hadoop into current workflows and addressing challenges around the cleanliness, validity, and protection of their data.

The data lake has become an increasingly important aspect of Hadoop’s appeal. Referred to in some contexts as an “enterprise data hub,” it now garners interest not only from Hadoop’s existing adopters but also from a far broader set of potential beneficiaries. It is the vision of a single, comprehensive pool of data, managed by Hadoop and accessed as required by diverse applications such as Spark, Storm, and Hive, that offers opportunities to reduce duplication of data, increase efficiency, and create an environment in which data from very different sources can meaningfully be analyzed together.
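
A minimal sketch of that shared-pool idea, assuming a PySpark environment with a Hive-compatible metastore, might look like the following; the HDFS path, database, and table names are invented for illustration.

```python
from pyspark.sql import SparkSession

# Sketch: data lands in the lake once and is described once, after which very
# different engines can consume it. Paths and names are hypothetical.
spark = (SparkSession.builder
         .appName("data-lake-sketch")
         .enableHiveSupport()  # share table definitions via the Hive metastore
         .getOrCreate())

spark.sql("CREATE DATABASE IF NOT EXISTS lake")

# Raw events are written to HDFS a single time...
events = spark.read.json("hdfs:///lake/raw/events/2015/02/")

# ...and registered as a table that Hive, Spark SQL, or BI tools can all query
# without making their own copies of the data.
events.write.mode("overwrite").saveAsTable("lake.events")

summary = spark.sql(
    "SELECT event_type, count(*) AS n FROM lake.events GROUP BY event_type"
)
summary.show()
```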

Fully embracing the opportunity promised by a comprehensive data lake requires a shift in attitude and careful integration with the existing systems and workflows that Hadoop often augments rather than replaces. Existing enterprise concerns about governance and security will certainly not disappear, so suitable workflows must be developed to safeguard data while making it available for newly feasible forms of analysis.

Early adopters in a range of industries are already finding ways to exploit the potential of their data lakes, operationalizing internal analytic processes and integrating rich real-time analyses with more established batch processing tasks. They are integrating Hadoop into existing organizational workflows and addressing challenges around the completeness, cleanliness, validity, and protection of their data.

In this report, we explore a number of the key issues frequently identified as significant in these successful implementations of a data lake.

Key findings in this report include:

  • As Hadoop continues to move beyond its MapReduce-based origin, its potential as a source of data for multiple applications and workloads—a data lake—grows more persuasive.
  • Operational workloads, which are an important aspect of most large organizations’ data processing requirements, place very different requirements on an IT infrastructure than the analytical batch processing duties traditionally associated with Hadoop.
  • Even when fully implemented, a Hadoop-based data lake augments rather than replaces existing IT systems of record such as the enterprise data warehouse.
  • Hadoop’s code is being hardened and enhanced in order to cope with the increasingly stringent requirements associated with security, compliance, and audit functions. Progress in all of these areas was required before commercial adopters—especially in heavily regulated sectors such as finance and health care—were comfortable deploying Hadoop for key workloads.

Thumbnail image courtesy of nadia/iStock.

Cloud Computing Market Trends in 2015
https://gigaom.com/report/cloud-computing-market-trends-in-2015/
Mon, 02 Feb 2015

Despite multibillion-dollar earnings and the continued birth of new startups, both buyers and suppliers still grapple with what the cloud computing industry will become and what their place within it should be.

Cloud computing didn’t disappoint expectations in 2014, and it continued to grow in its many forms over the year. Leaders such as Amazon Web Services set a punishing pace, as usual, but it was pursued with increasing conviction by the likes of Microsoft and Google. New entrants brought products to market regularly, and established stalwarts of the enterprise IT market like IBM and HP continued their sometimes-painful process of adaptation. Each of these pieces forms part of a broad and evolving whole, which sometimes becomes obscured by the endless minutiae of deals and acquisitions, profits and losses.

In this report, we identify a small set of overarching trends shaping the cloud-computing market today, and offer a perspective on the ways in which they will shape that market in 2015 and beyond.

Key findings in this report include:

  • The adoption of cloud computing remains in a relatively early stage, and customers are keeping their options open.
  • Hybrid-cloud implementations dominate the enterprise-cloud landscape today, but hybrid will become less important over the next few years.
  • Providers of pure platform-as-a-service solutions are being squeezed out by competing approaches to achieving platform-like capabilities. The rise of containers offers an alternative way to achieve some of the same promise offered by PaaS.
  • The adoption of cloud continues to threaten established enterprise IT providers as they struggle to adapt. However, ignoring those providers would be unwise at this point.

Thumbnail image courtesy of byryo/iStock.

Apache Hadoop: Is one cluster enough?
https://gigaom.com/report/apache-hadoop-is-one-cluster-enough/
Mon, 15 Dec 2014

Projects like Apache YARN expand the types of workloads for which Hadoop is a viable and compelling solution, leading practitioners to think more creatively about managing data.

The open-source Apache Hadoop project continues its rapid evolution and is now capable of far more than its traditional use case of running a single MapReduce job on a single large volume of data. Projects like Apache YARN expand the types of workloads for which Hadoop is a viable and compelling solution, leading practitioners to think more creatively about the ways data is stored, processed, and made available for analysis.

Enthusiasm is growing in some quarters for the concept of a “data lake” — a single repository of data stored in the Hadoop Distributed File System (HDFS) and accessed by a number of applications for different purposes. Most of the prominent Hadoop vendors provide persuasive examples of this model at work but, unsurprisingly, the complexities of real-world deployment do not always neatly fit the idealized model of a single (huge) cluster working with a single (huge) data lake.

In this report we discuss some of the circumstances in which more complex requirements may exist, and explore a set of solutions emerging to address them.

Key findings from this report include:

  • YARN has been important in extending the range of suitable use cases for Hadoop.
  • Although mainstream Hadoop deployments still largely favor a single cluster, that simple model does not make sense for a range of technical, practical, and regulatory situations.
  • In these cases, deploying a number of independent clusters is more appealing, but this fragments the data lake and risks reducing the value of the whole approach. Techniques are now emerging to address this challenge by virtually recreating a seamless view across data stored in different physical locations, an idea sketched in the example after this list.
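
Hadoop’s ViewFS mount tables and several vendor-specific federation layers implement this in configuration rather than code, but the underlying idea is simple enough to sketch: a single logical namespace is resolved, path by path, onto more than one physical cluster. The cluster addresses and mount points below are entirely made up.

```python
# Toy sketch of a federated namespace in the spirit of ViewFS mount tables:
# one logical path space resolved onto several physical clusters.
# All cluster addresses and mount points are hypothetical.
MOUNT_TABLE = {
    "/lake/eu":      "hdfs://frankfurt-cluster:8020/lake",   # EU data stays in the EU
    "/lake/us":      "hdfs://virginia-cluster:8020/lake",
    "/lake/archive": "hdfs://archive-cluster:8020/cold",
}

def resolve(logical_path):
    """Map a logical lake path onto the physical cluster that holds the data."""
    # The longest matching mount point wins, so nested mounts behave predictably.
    for mount in sorted(MOUNT_TABLE, key=len, reverse=True):
        if logical_path.startswith(mount):
            return MOUNT_TABLE[mount] + logical_path[len(mount):]
    raise ValueError("No mount point covers " + logical_path)

print(resolve("/lake/eu/events/2014/12/"))
# hdfs://frankfurt-cluster:8020/lake/events/2014/12/
```

Applications keep addressing one logical lake, while the data itself stays wherever technical, practical, or regulatory constraints require it to live.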

Thumbnail image courtesy of JimmyAnderson/iStock.

Beyond MapReduce: How the new Hadoop works
https://gigaom.com/report/beyond-mapreduce-how-the-new-hadoop-works/
Tue, 12 Aug 2014

New features included in Hadoop’s latest releases go some way towards freeing an increasingly capable data platform from the constraints of its early dependence on one specific technical approach: MapReduce.

In only a few years, Hadoop has graduated from a personal side project to become the poster child of the nascent multibillion-dollar big-data industry. Leading providers of technical solutions based on Apache Hadoop attract large investments, and Hadoop-powered success stories continue to spread beyond the Silicon Valley giants in which these technologies were initially nurtured.

New features included in Hadoop’s latest releases go some way towards freeing an increasingly capable data platform from the constraints of its early dependence on one specific technical approach: MapReduce. Those same advances are also powering a new drive to embrace the complex and diverse enterprise workloads for which MapReduce was not necessarily the most appropriate data-processing tool, and where Hadoop’s early reputation for complexity and an apparent disregard for established enterprise processes around security, audit, and governance hindered adoption.

At the same time, the big-data landscape is becoming more complex. New tools like Apache Spark were quick to integrate with Hadoop but today also function increasingly well without it. Established enterprise IT firms co-opt the Hadoop name where they can while also pushing refreshes to their own tried and tested products.

In this report, we explain what Hadoop is and how it has recently transformed, discuss what it’s good for, and consider how it might evolve as technology, expectations, requirements, and the broader competitive landscape shift around it.

Moving Hadoop beyond MapReduce
https://gigaom.com/report/moving-hadoop-beyond-mapreduce/
Wed, 30 Jul 2014

Hadoop is far more than the one-trick pony it was once characterized as, and with its new capabilities, businesses are using it to gain real insight and add real value.

Backed by an extensive open-source community and significant investment from startups and more established technology businesses, Hadoop has evolved into a credible platform for supporting enterprise-class analytics at scale. Originally designed to excel at running batch MapReduce jobs over a large static data set on clusters of commodity hardware, Apache Hadoop, together with a growing collection of associated projects and products, is increasingly capable of far more. With Apache Hadoop 2.0, released in 2013, the project introduced a clear split between management of cluster resources and processing of data. The newly introduced YARN handles resource management across the cluster, and MapReduce has become just one of several tools with which a Hadoop cluster might process and analyze data. Alongside batch processing of static data with MapReduce, Hadoop is increasingly being used to process streaming data with tools like Apache Storm, to explore data interactively with applications that incorporate Apache Tez, or in conjunction with powerful in-memory frameworks like Apache Spark.
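
As a small illustration of that split, the sketch below asks YARN to schedule a Spark application on a shared Hadoop cluster rather than routing the work through MapReduce. The queue name, executor count, and input path are assumptions made for the example.

```python
from pyspark.sql import SparkSession

# Sketch: a Spark application whose containers are allocated by YARN on a
# shared Hadoop cluster. Queue, sizing, and the input path are hypothetical.
spark = (SparkSession.builder
         .appName("yarn-scheduled-analysis")
         .master("yarn")                           # let YARN place the containers
         .config("spark.yarn.queue", "analytics")  # coexist with other workloads
         .config("spark.executor.instances", "4")
         .getOrCreate())

logs = spark.read.text("hdfs:///data/weblogs/")
errors = logs.filter(logs.value.contains("ERROR"))
print(errors.count())

spark.stop()
```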

These technical advances make Hadoop far more than the one-trick pony it might once have been characterized to be. Parallel innovations around data governance, security, and integration are transforming the Hadoop silo of old into an effective and integral piece of the enterprise IT estate.

This report discusses the capabilities of today’s Hadoop platform and explores ways in which businesses are using it to gain real insight and add real value.

Key findings include:

  • In 2013, with the release of Hadoop 2.x, the Apache project separated MapReduce-based data processing from the generic management of cluster resources. Afterwards, the new YARN module in Hadoop enabled a set of data processing options independent of MapReduce.
  • YARN begins to position Hadoop as a viable tool for storing and processing a growing proportion of an organization’s data assets.
  • Issues around data governance and security are key to Hadoop’s broader adoption.
  • Companies like TrueCar and Neustar are embracing Hadoop to enable and accelerate their transformations into profitable data-based organizations.

 

Bringing Hadoop to the mainframe
https://gigaom.com/report/bringing-hadoop-to-the-mainframe/
Tue, 17 Jun 2014

Mainframes still account for 60 percent or more of global enterprise transactions.

According to market leader IBM, there is still plenty of work for mainframe computers to do. Indeed, the company frequently cites figures indicating that 60 percent or more of global enterprise transactions are currently undertaken on mainframes built by IBM and remaining competitors such as Bull, Fujitsu, Hitachi, and Unisys. The figures suggest that a wealth of data is stored and processed on these machines, but as businesses around the world increasingly turn to clusters of commodity servers running Hadoop to analyze the bulk of their data, the cost and time typically involved in extracting data from mainframe-based applications becomes a cause for concern.

By finding more-effective ways to bring mainframe-hosted data and Hadoop-powered analysis closer together, the mainframe-using enterprise stands to benefit from both its existing investment in mainframe infrastructure and the speed and cost-effectiveness of modern data analytics, without necessarily resorting to relatively slow and resource-expensive extract, transform, and load (ETL) processes that endlessly move data back and forth between discrete systems.
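
One such pattern, not tied to any particular case study in this report, is to pull a mainframe-hosted DB2 table directly into the Hadoop cluster over JDBC and keep the copy in HDFS for analysis. The sketch below assumes the IBM DB2 JDBC driver is on the cluster’s classpath; the host, credentials, table, and output path are hypothetical.

```python
from pyspark.sql import SparkSession

# Sketch: read a DB2 for z/OS table into the Hadoop cluster over JDBC and land
# it in HDFS, so downstream jobs need not touch the mainframe again.
# Connection details, table name, and paths are hypothetical.
spark = SparkSession.builder.appName("mainframe-pull-sketch").getOrCreate()

transactions = (spark.read.format("jdbc")
    .option("url", "jdbc:db2://mainframe.example.com:446/PRODDB")
    .option("driver", "com.ibm.db2.jcc.DB2Driver")
    .option("dbtable", "FINANCE.TRANSACTIONS")
    .option("user", "batchuser")
    .option("password", "not-a-real-password")
    .load())

transactions.write.mode("append").parquet("hdfs:///lake/mainframe/transactions/")
```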

Key findings include:

  • Mainframes still account for 60 percent or more of global enterprise transactions.
  • Traditional ETL processes can make it slow and expensive to move mainframe data into the commodity Hadoop clusters where enterprise data analytics processes are increasingly being run.
  • In some cases, it may prove cost-effective to run specific Hadoop jobs on the mainframe itself.
  • In other cases, advances in Hadoop’s stream-processing capabilities can offer a more cost-effective way to push mainframe data to a commodity Hadoop cluster than traditional ETL.
  • The skills, outlook, and attitudes of typical mainframe system administrators and typical data scientists are quite different, creating challenges for organizations wishing to encourage closer cooperation between the two groups.

Feature image courtesy Flickr user Steve Jurvetson

Understanding the Power of Hadoop as a Service
https://gigaom.com/report/understanding-the-power-of-hadoop-as-a-service/
Thu, 12 Jun 2014

Hadoop as a Service could make it much easier for you to build a big data capability inside your organization and run it at scale.

Across a wide range of industries from health care and financial services to manufacturing and retail, companies are realizing the value of analyzing data with Hadoop. With access to a Hadoop cluster, organizations are able to collect, analyze, and act on data at a scale and price point that earlier data-analysis solutions typically cannot match.

While some have the skill, the will, and the need to build, operate, and maintain large Hadoop clusters of their own, a growing number of Hadoop’s prospective users are choosing not to make sustained investments in developing an in-house capability. An almost bewildering range of hosted solutions is now available to them, all described in some quarters as Hadoop as a Service (HaaS). These range from relatively simple cloud-based Hadoop offerings by Infrastructure-as-a-Service (IaaS) cloud providers including Amazon, Microsoft, and Rackspace through to highly customized solutions managed on an ongoing basis by service providers like CSC and CenturyLink. Startups such as Altiscale are completely focused on running Hadoop for their customers. As they do not need to worry about the impact on other applications, they are able to optimize hardware, software, and processes in order to get the best performance from Hadoop.
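
At the simpler, cloud-hosted end of that spectrum, standing up a managed cluster can amount to a single API call. The sketch below uses the boto3 library to ask Amazon EMR for a small Hadoop and Hive cluster; the region, release label, instance sizes, and IAM role names are assumptions rather than recommendations.

```python
import boto3

# Sketch: request a small managed Hadoop cluster from Amazon EMR.
# Region, release label, instance sizes, and role names are hypothetical.
emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="analytics-poc",
    ReleaseLabel="emr-4.2.0",
    Applications=[{"Name": "Hadoop"}, {"Name": "Hive"}],
    Instances={
        "MasterInstanceType": "m3.xlarge",
        "SlaveInstanceType": "m3.xlarge",
        "InstanceCount": 4,
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)

print("Cluster requested:", response["JobFlowId"])
```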

In this report we explore a number of the ways in which Hadoop can be deployed, and we discuss the choices to be made in selecting the best approach for meeting different sets of requirements.

Key findings include:

  • Hadoop is designed to perform at scale, and large Hadoop clusters behave differently from the small groups of machines developers typically use to learn.
  • There is a range of models for running a Hadoop cluster, from building in-house talent and infrastructure to adopting one of several Hadoop-as-a-Service solutions.
  • Competing HaaS products bring different costs and benefits, making it important to understand your requirements and their strengths and weaknesses. Some offer an environment in which a customer can run — and manage — Hadoop while others take responsibility for ensuring that Hadoop is available, maintained, patched, scaled, and actively monitored.

Feature image courtesy Flickr user Pattys-photos

The importance of benchmarking clouds
https://gigaom.com/report/the-importance-of-benchmarking-clouds/
Thu, 12 Jun 2014

Before you choose a cloud provider, evaluate factors like geography, certification, service level agreements, price, and performance.

For most businesses, the debate about whether to embrace the cloud is over. It is now a question of tactics — how, when, and what kind? Cloud computing increasingly forms an integral part of enterprise IT strategy, but the wide variation in enterprise requirements ensures plenty of scope for very different cloud services to coexist.

Today’s enterprise cloud deployments will typically be hybridized, with applications and workloads running in a mix of different cloud environments. The rationale for those deployment decisions is based on a number of different considerations, including geography, certification, service level agreements, price, and performance.

Price and performance are often seen as closely – almost inextricably – linked, and this connection is only likely to grow stronger as basic cloud infrastructure becomes increasingly commoditized. For some workloads, cost may be an overriding factor in selecting a cloud provider, and for others, performance is all that matters. For most, a complex combination of these and other factors will lie behind any deployment decision, making it important to ensure that buyers are fully informed about their options.
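
A worked example makes the trade-off concrete. Suppose the same web workload has been benchmarked on three providers; the provider names, hourly prices, and throughput figures below are invented purely to show the arithmetic.

```python
# Sketch: turn raw price and benchmark numbers into a comparable figure, here
# the cost of serving one million requests. All figures are invented.
offerings = {
    "cloud-a": {"usd_per_hour": 0.48, "requests_per_second": 1900},
    "cloud-b": {"usd_per_hour": 0.35, "requests_per_second": 1200},
    "cloud-c": {"usd_per_hour": 0.62, "requests_per_second": 2600},
}

for name, o in sorted(offerings.items()):
    hours_per_million = 1_000_000 / (o["requests_per_second"] * 3600)
    cost = hours_per_million * o["usd_per_hour"]
    print(f"{name}: ${cost:.3f} per million requests")

# On these invented numbers, the provider with the lowest hourly price
# (cloud-b) is the most expensive per unit of useful work.
```

That kind of inversion is exactly why measured performance, not list price alone, should inform deployment decisions.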

Frequent high-profile price reductions, equally numerous but less visible infrastructure upgrades, and the less rapidly evolving landscape of certifications, accreditations, and legal considerations combine to ensure that even the best and most informed selection processes require regular review; the best cloud provider for a particular set of requirements today may be surpassed by a competitor tomorrow.

Key findings include:

  • There is significant variation in the price and performance of competing cloud solutions, both public and private.
  • Lack of consistency in specification or description makes it challenging to accurately compare the capabilities of competing solutions.
  • The characteristics of specific applications (web servers, e-commerce applications, Hadoop clusters, etc.) mean that their performance will change from one cloud to another.
  • The most effective way to really understand the best place to run a given application is therefore to test how that application performs in different clouds and to build a comprehensive view of the complex balance between price and performance for any given workload.

Feature image courtesy Flickr user Scott Akerman
