Table of Contents
- Summary
- The Problem
- A Deeper Look at the Problem
- The Storage System Solution
- How It Works
- Why It Is Important
- An Example of Performance at Scale
- Final Notes
- About Enrico Signoretti
- About GigaOm
- Copyright
1. Summary
Today’s huge storage systems, on the order of many petabytes, are associated more with capacity than with performance, but that perception is changing. Until recently, the most requested storage feature was active archiving, but the cloud, new technologies, and the proliferation of mobile applications now demand performance as well as capacity.
The three usual measurements of storage-system performance are input/output operations per second (IOPS), throughput, and latency. Delivering all three at a reasonable price is challenging, especially at high capacity. Even more demanding is the number of clients, applications, and workloads that contend for the resources of a multi-petabyte storage infrastructure. Adding to these demands is the challenge of achieving high performance from a distributed storage system that spans a geographically large, often global, area.
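To make these metrics concrete, the following sketch (using illustrative numbers, not figures from this report) shows how the three relate: throughput is approximately IOPS multiplied by I/O size, and Little's law ties IOPS and latency to the concurrency a system must sustain.

```python
# Illustrative relationships between the three storage metrics.
# The figures below are hypothetical, not measurements from this report.

def throughput_mb_s(iops: float, io_size_kb: float) -> float:
    """Throughput is roughly IOPS multiplied by the I/O size."""
    return iops * io_size_kb / 1024

def outstanding_ios(iops: float, latency_ms: float) -> float:
    """Little's law: concurrent (in-flight) I/Os = IOPS * latency."""
    return iops * latency_ms / 1000

# Example: 100,000 IOPS at a 64 KB I/O size and 2 ms average latency.
print(f"Throughput:       {throughput_mb_s(100_000, 64):,.0f} MB/s")  # 6,250 MB/s
print(f"Outstanding I/Os: {outstanding_ios(100_000, 2):,.0f}")        # 200
```

The example shows why the metrics trade off against each other: sustaining high IOPS at low latency requires only modest concurrency, but large I/O sizes push throughput demands up quickly.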
The first report in this four-part series describes how a traditional network-attached storage (NAS) system can scale to a few hundred terabytes and, in some cases, a few petabytes. But some scale-out NAS systems, though remarkably fast, are still not sufficient for web-scale and large-organization infrastructures, which must reach new levels of scalability and deliver consistently high performance while serving tens of thousands of local and remote clients at massive throughput. An additional challenge is coping with long-distance data communication.
A deeper look at local and distributed performance helps illustrate the problem. For local performance, the clients are traditional servers and PCs, and connections are almost always reliable. For distributed performance, a wide variety of connections, protocols, and devices produce and consume data at blistering speeds, demanding efficiency from the storage infrastructure.
Some next-generation multi-petabyte scale-out storage infrastructures have the feature set needed to serve performance and capacity workloads simultaneously, whether data is stored locally or distributed globally. Techniques such as load balancing, smart caching, scale-out file-system interfaces, and clever use of flash memory work together to scale capacity at the back end while delivering the needed performance at the front end.
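To ground the caching idea, here is a minimal, hypothetical sketch of one technique named above: a small flash tier acting as a least-recently-used (LRU) read cache in front of a large-capacity back end. The FlashCache and DiskBackend names are illustrative, not part of any product covered in this report.

```python
# Hypothetical sketch: a fast flash tier caching reads for a slower
# capacity tier. Names and sizes are illustrative, not a real product API.

from collections import OrderedDict

class DiskBackend:
    """Stands in for the large, slower capacity tier."""
    def __init__(self):
        self._blocks = {}
    def read(self, key):
        return self._blocks.get(key)
    def write(self, key, data):
        self._blocks[key] = data

class FlashCache:
    """LRU read cache: hot blocks are served from flash, cold ones from disk."""
    def __init__(self, backend, capacity=1024):
        self.backend = backend
        self.capacity = capacity              # number of cached blocks
        self._cache = OrderedDict()
    def read(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as recently used
            return self._cache[key]
        data = self.backend.read(key)         # cache miss: go to capacity tier
        if data is not None:
            self._cache[key] = data
            if len(self._cache) > self.capacity:
                self._cache.popitem(last=False)  # evict least recently used
        return data
    def write(self, key, data):
        self.backend.write(key, data)         # write through to the back end
        self._cache[key] = data
        self._cache.move_to_end(key)

store = FlashCache(DiskBackend(), capacity=2)
store.write("a", b"...")
print(store.read("a"))  # served from flash on subsequent reads
```

The design choice the sketch illustrates is the front-end/back-end split described above: the capacity tier grows independently at the back end, while the small flash tier absorbs the hot portion of the workload at the front end.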