Cassandra Total Cost of Ownership Study v1.2022

Product Evaluation: Serverless Cassandra and Self-Managed OSS Cassandra: 3-year Total Cost of Ownership (TCO)

1. Executive Summary

This study examines the full cost and true value of self-managed OSS Apache Cassandra® vs DataStax Astra DB fully managed DBaaS in Google Cloud. Our three-year total cost of ownership (TCO) calculations account for dedicated compute hardware (for self-managed Cassandra), cost per read and write operation (on Astra DB), storage growth (each write operation adds new data), and people costs. People costs consider that certain capabilities in Astra DB needed for the workload were not available in self-managed Cassandra, requiring workarounds. We used market rates and typical splits of full-time equivalent (FTE) and consulting to determine our people costs.

A realistic performance test, based on a usage pattern relatable to a modern enterprise, was established using NoSQLBench, an open-source benchmarking tool for the NoSQL ecosystem. This set the basis for our Astra DB pricing and helped determine the configuration of the self-managed Cassandra platform.

For our test, we selected a 12-node, self-managed Cassandra cluster of n2-highmem-16 virtual machines (VM) using the Ubuntu 20.04 operating system and installed the latest version of OSS Apache Cassandra 4.0.4 available at the time of testing. Since Astra DB is fully managed, there was no configuration to select.

The use case was built on a consumer application, which typically experiences a peak and a trough of activity each day. In our workload, on average, the peak is around midday and the trough around midnight. For simplicity, and given the ongoing nature of online commerce and business activity today, we also assumed this usage pattern continued seven days a week, 365 days a year.

For Astra DB, we used the pay-as-you-go pricing on GCP (without any discounts) available at the time of this testing. For self-managed Cassandra on GCP, we considered the cost of an enterprise-grade deployment, a production cluster with 12 nodes. We also provisioned enough storage to cover data growth, replication, and re-indexing.

Our final three-year TCO figures for the study showed a cost of $353,346 for Astra DB and $2,797,123 for self-managed Cassandra. That makes Astra DB approximately one-eighth the cost of self-managed Cassandra in our test. Astra DB came out with 95% lower staffing costs, one-third the complexity, and 80% lower infrastructure costs than self-managed Cassandra.

This test shows the immense value of choosing an Astra DB database as a service to deploy Cassandra for an enterprise project.

2. From Open Source to Serverless

There have been explosive developments in enterprise open-source software in the past few years. There has also been a corresponding growth of commercial vendors that take open-source solutions and close the code for capabilities that are all but required for enterprise applications. These capabilities include performance boosters, administrative task management such as backup and recovery, clustering capabilities, and so forth. In addition, there has been growth in training, jump starts, documentation, a community of serious users, and more.

Some organizations require a commercial vendor behind every piece of software in the shop. Others decide on a case-by-case basis, often leaning toward open-source options that boast low up-front costs and the opportunity to prove software, albeit without the safety net of a commercial vendor arrangement.

At the same time, NoSQL adoption has skyrocketed, with many organizations deploying a blend of open-source and commercial-vendor solutions. The realities of the pandemic exposed the need for reliable, agile technology infrastructure like Cassandra, which is crucial for business continuity, mining intelligence, and effectively supporting increased traffic. Demand for data at all companies will only increase in the coming years, and much of this is hidden cost.

Understanding the cost implications of Cassandra deployments in this environment is important, especially as IT shops adopt serverless compute approaches that enable them to build and run applications and services without having to manage infrastructure. The emergence of serverless databases brings these benefits to the data tier, empowering users to pay only for the data they use, and to scale up or down on demand.

Serverless approaches promise to bring the same efficiency, cost, and agility gains to databases. This report evaluates Cassandra as a self-managed database on GCP and as a fully managed serverless Cassandra Astra DB service.

3. Cassandra TCO Test Setup

Calculating the total cost of ownership is something enterprises increasingly undertake, formally or informally, before embarking on a new program. Sometimes, baseline TCO calculations are used to justify a program, but the measurement of the actual TCO can be a daunting experience, especially if the baseline TCO wasn’t adequately gauged.

This review assesses the costs of self-managed, open-source Cassandra on Google Cloud (GCP) and on DataStax’s fully managed Astra DB offering, also on GCP.

The categories or components in this NoSQL stack used in our TCO calculations include:

  • Dedicated Compute Hardware (for self-managed Cassandra)
  • Cost per Read and Write Operation (on Astra DB)
  • Storage Growth (each Write operation adds new data) (on Astra DB)

Note that while it’s not part of our calculations, DataStax also charges for data transfer. In addition to these components, other cost factors were considered. We realize it takes more than simply buying cloud platforms and tools to achieve a production-ready, enterprise NoSQL stack. It takes people. They are required to build and maintain the platform to deliver business insight and meet the needs of a data-driven organization.

Additional cost factors include:

  • People cost for the build phase (integration and migration)
  • People cost for the production phase (ongoing maintenance and continuous improvement)

We used the NoSQLBench performance test and its baselines v2 tabular workload to establish the basis for our Astra DB pricing and determine the configuration of the self-managed Cassandra platform.

We combined all of these costs to reach our final three-year total cost of ownership figures for the study.

Configuration

To configure the self-managed Cassandra 4.0.4 cluster, we needed a cluster that could handle 250,000 operations per second with 99% of latencies less than 50 milliseconds.

High-performance Cassandra workloads need plenty of CPU and memory. We first created a 12-node cluster of n2-highmem-16 Google Compute Engine virtual machines (with 16 CPUs and 128GB RAM). We tested all three of the NoSQLBench baselines v2 workloads (keyvalue, timeseries, and tabular), but ultimately used the results of the tabular workload to check the upper bounds of our OSS Cassandra cluster. This test gave us an upper limit of approximately 261,000 operations per second.

This was close to our upper limit for a production-grade configuration. Thus, we selected a 12-node cluster of the same n2-highmem-16 virtual machines for our pricing scenario, which had equivalent ops/s, p95, and p99 latencies.

Since Astra DB is fully managed, there was no configuration to select. We tested it with the same NoSQLBench baselines v2 workloads (keyvalue, timeseries, and tabular), which it handled without errors and with millisecond latencies.

For OSS Cassandra and Astra DB, we sampled latencies at the 95th and 99th percentiles from the Grafana dashboard for the tabular workload. In our slice of data, OSS Cassandra showed a p95 latency of 2.3ms and a p99 latency of 3.7ms. Astra DB showed a p95 latency of 1.8ms and a p99 latency of 1.9ms. These values were used to confirm our configuration choices, but they do not impact TCO.

Performance Use Case Test

To create a use-case scenario giving us a basis for our pricing calculations, we used the benchmarking tool, NoSQLBench, to develop a workload similar to a realistic use pattern in a modern enterprise.

The use case was built on a consumer application, which typically experiences a peak and a trough of activity each day. In our workload, on average, the peak is around midday and the trough around midnight. For simplicity, and given the ongoing nature of online commerce and business activity today, we also assumed this usage pattern continued seven days a week, 365 days a year.

To generate this usage pattern on NoSQLBench, we used a sine wave as the base pattern for usage and operations per second. To produce a more realistic, randomized pattern, we created a normally distributed random multiplier with a mean of 1 and a standard deviation of 0.5 (shown as R in the equation below) to deviate up and down from the sinusoidal pattern. If the multiplier resulted in an operations-per-second value below zero, we omitted it and generated a new target (a read or write operation). We also used a mixed workload of reads and writes. We used a 30% writes to 70% reads ratio on average, shown as 0.3 in the target write formula:

In this equation, A is the amplitude of the wave, b is the base or floor, and f is the number of iterations at which the target changes. For our experiments, we used an amplitude of 25,000 operations per second, a base of 1,000 ops per second, and 1,440 iterations (to change the target every minute).

Our target read operation was:

Additionally, we wanted to add random operations per second “spikes.” These spikes are meant to simulate outlier events when there is a massive influx of transactions. These events should be considered part of a production scenario and could represent any number of real-world situations, including:

  • Unexpectedly high demand for a new product release indicated by a high number of logins, purchases, views, API calls, etc.
  • Network outage that causes a flood of backlogged streaming messages all at once.
  • Cyber, malware, or ransomware attack
  • Greedy/degraded/runaway application or process

We simulated these outlier spikes by random chance:

  • There was a 1 in 240 chance of the read/write operations per second being 3x more than normal.
  • There was a 1 in 480 chance of the read/write operations per second being 6x more than normal.
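
The pattern described above can be sketched in Python. This is only an illustration under stated assumptions, not the NoSQLBench configuration used in the study: the exact target formulas are not reproduced in this text, so the wave shape (trough at the first iteration, peak at the midpoint), the way the two spike chances combine, and all function names are hypothetical. The amplitude, base, iteration count, write share, and multiplier distribution come from the description above.

```python
import math
import random

AMPLITUDE = 25_000   # A: amplitude in ops/s (from the text)
BASE = 1_000         # b: base or floor in ops/s (from the text)
ITERATIONS = 1_440   # f: one target per minute over a day (from the text)
WRITE_SHARE = 0.3    # 30% writes, 70% reads (from the text)


def draw_multiplier(rng):
    """Draw R ~ Normal(mean=1, sd=0.5), redrawing whenever it is negative."""
    while True:
        r = rng.gauss(1.0, 0.5)
        if r >= 0:
            return r


def spike_factor(rng):
    """ASSUMPTION: the two chances are checked independently; 6x wins."""
    if rng.random() < 1 / 480:
        return 6.0
    if rng.random() < 1 / 240:
        return 3.0
    return 1.0


def daily_targets(seed=0):
    """Return one (reads/s, writes/s) target pair per iteration of a day."""
    rng = random.Random(seed)
    targets = []
    for i in range(ITERATIONS):
        # ASSUMED wave shape: BASE at i = 0 (midnight), BASE + AMPLITUDE at midday.
        wave = BASE + AMPLITUDE * (1 - math.cos(2 * math.pi * i / ITERATIONS)) / 2
        total = wave * draw_multiplier(rng) * spike_factor(rng)
        writes = WRITE_SHARE * total
        targets.append((total - writes, writes))
    return targets


if __name__ == "__main__":
    day = daily_targets(seed=42)
    print(f"{len(day)} targets; first: {day[0]}, midday: {day[720]}")
```

Redrawing the multiplier, rather than clamping it to zero, mirrors the description of omitting a negative result and generating a new target.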

Running the workload resulted in the following operations per second over a seven-day period, where green represents reads per second and blue represents writes per second. (Figure 1)

Figure 1. Use Case Workload (Green = Reads/Sec, Blue = Writes/Sec)

This usage pattern resulted in the infrastructure statistics shown in Table 1. The table shows the number of reads and writes per day, week, month, and year.

Table 1. Usage Pattern Statistics

         Per Day   Per Week   Per Month   Per Year
Reads    107M      747M       3.2B        38.4B
Writes   35M       244M       1.0B        12.6B
TOTAL    142M      992M       4.2B        51.0B
Source: GigaOm 2022

In terms of data storage, we assumed that each write added 186 bytes of compressed data to the NoSQL database. To get this figure, we sampled 100,000 rows of output data from the NoSQLBench baselines v2 tabular workload, which had an average uncompressed row size of 453 bytes. We compressed the output with the LZ4 compression algorithm (one of the most common compression algorithms used by Cassandra), which reduced it to 40.95% of its original size. This gave us 186 bytes per write.

This means at the end of the first month, the database size would be 183GB. At the end of three years (our TCO period), it would be 6.4TB.

With self-managed Cassandra, we also multiplied the storage requirements by a replication factor of 3 (three copies of the data) and a compaction factor of 2 (giving the disks enough room for compaction to remove old data). That is 6.4TB x 3 x 2 for a total of 38.4TB.
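
The storage sizing above can be traced with a short Python sketch. It assumes the weekly write count of 244,191,471 that this report uses in its Astra DB storage calculation, and it follows the report's convention of rounding the three-year size to 6.4TB before applying the replication and compaction factors. Variable names are illustrative only.

```python
UNCOMPRESSED_ROW_BYTES = 453     # average sampled row size (from the text)
COMPRESSED_FRACTION = 0.4095     # LZ4 output as a share of the original size
WRITES_PER_WEEK = 244_191_471    # weekly writes used elsewhere in this report
BYTES_PER_GB = 1024 ** 3
REPLICATION_FACTOR = 3
COMPACTION_FACTOR = 2

# Compressed bytes added by each write, rounded to a whole byte.
bytes_per_write = round(UNCOMPRESSED_ROW_BYTES * COMPRESSED_FRACTION)

# Monthly growth: weekly writes scaled to a year, then split over 12 months.
gb_per_month = WRITES_PER_WEEK * bytes_per_write * 52 / 12 / BYTES_PER_GB

# Size after 36 months, rounded to one decimal place as in the text.
three_year_tb = round(gb_per_month * 36 / 1024, 1)

# Disk to provision for the self-managed cluster.
provisioned_tb = three_year_tb * REPLICATION_FACTOR * COMPACTION_FACTOR

if __name__ == "__main__":
    print(f"bytes per write: {bytes_per_write}")
    print(f"growth per month: {gb_per_month:.1f} GB")
    print(f"size after three years: {three_year_tb:.1f} TB")
    print(f"provisioned for self-managed: {provisioned_tb:.1f} TB")
```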

4. Test Results

Platform Costs

We used the following method to determine the costs of running both the fully managed Astra DB and self-managed Cassandra on Google Cloud.

Astra DB
For Astra DB, we used the pay-as-you-go pricing on GCP available at the time of this testing:

  • Reads (per 1M): $0.33
  • Writes (per 1M): $0.49
  • Data storage (per GB-month): $0.25

Using this simple pricing model, and using the usage statistics from Table 1, we determined the three-year cost of operating Astra DB, as shown in Table 2:

Table 2. Three-Year Operating Cost for Astra DB

               Year 1      Year 1    Year 2      Year 2    Year 3      Year 3    3-Year
               Total Ops   Price     Total Ops   Price     Total Ops   Price     Total
Reads          38.4B       $12,824   38.4B       $12,824   38.4B       $12,824   $38,472
Writes         12.6B       $6,222    12.6B       $6,222    12.6B       $6,222    $18,666
Data Storage   -           $3,574    -           $10,173   -           $16,772   $30,520
Total          -           $22,621   -           $29,219   -           $35,818   $87,658
Source: GigaOm 2022

NOTE: Data storage is a running total calculation. It grows 183.3GB every month, so for month 1, you pay for 183.3GB. For month 2, you pay for month 1 + month 2, or 366.6GB, and so on. Then we added all those up. The per-month cost increase is calculated as follows:

244,191,471 writes/week x 186 bytes/write x 52 weeks/year ÷ 12 months/year ÷ 1024^3 bytes/GB = 183.3 GB/month
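
As a check on the running-total method, the following Python sketch recomputes the yearly write and storage charges from the list prices and the 183.3GB monthly growth stated above. It is a minimal illustration with hypothetical names. Reads are left out because only a rounded weekly read count is given in this report.

```python
WRITES_PER_WEEK = 244_191_471
WRITE_PRICE_PER_MILLION = 0.49       # USD per 1M writes
STORAGE_PRICE_PER_GB_MONTH = 0.25    # USD per GB-month
GROWTH_GB_PER_MONTH = 183.3

# Writes are billed per operation, so the charge is the same every year.
write_cost_per_year = WRITES_PER_WEEK * 52 / 1_000_000 * WRITE_PRICE_PER_MILLION


def storage_cost_for_year(year):
    """Sum the monthly storage bills for the given year (1, 2, or 3).

    In month m the database holds m months of growth, so the bill for
    that month is GROWTH_GB_PER_MONTH * m * price.
    """
    first_month = 12 * (year - 1) + 1
    return sum(
        GROWTH_GB_PER_MONTH * month * STORAGE_PRICE_PER_GB_MONTH
        for month in range(first_month, first_month + 12)
    )


if __name__ == "__main__":
    print(f"writes per year: ${write_cost_per_year:,.0f}")
    for year in (1, 2, 3):
        print(f"storage, year {year}: ${storage_cost_for_year(year):,.0f}")
```

The storage charge rises each year even though the write volume is flat, because every month is billed on the data accumulated so far.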

For development/test serverless environments, DataStax Astra DB includes a $25 per month tier, which covers the cost of a non-production environment.

Self-Managed Cassandra
For self-managed Cassandra on Google Cloud, we computed the following infrastructure costs, shown in Table 3. We considered the cost of an enterprise-grade deployment that included a 12-node cluster for production. This does not include the costs for a development/test cluster. We also provisioned storage for the full period up front, assuming we wanted to retain three years of data. Over the three-year period, 6.4TB of data would accumulate. Thus, 6.4TB multiplied by a replication factor of 3 and by a compaction factor of 2 (to provide room to re-index) gives us 38.4TB of provisioned disk space.

Table 3. GCP Infrastructure Costs (based on GCP Compute Engine Persistent Disk standard provisioned cost)

                          Year 1          Year 2          Year 3
GCP Compute Engine VM     n2-highmem-16   n2-highmem-16   n2-highmem-16
Price per Node per Hour   $1.048          $1.048          $1.048
Nodes                     12              12              12
Storage per GB-Month      $0.04           $0.04           $0.04
Storage Allocated         38.4TB          38.4TB          38.4TB
Source: GigaOm 2022

We used the current (as of the time of this writing) GCP Compute Engine Persistent Disk pricing.

These costs resulted in a one-year total for operating self-managed Cassandra for the workload on Google Cloud, shown in Table 4.

Table 4. Yearly Operating Costs

                   Year 1     Year 2     Year 3     3-Year
                   Price      Price      Price      Total
Compute per Year   $110,178   $110,178   $110,178
Storage per Year   $38,009    $38,009    $38,009
1-Year Cost        $148,187   $148,187   $148,187   $444,561
Source: GigaOm 2022
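
The roll-up in Table 4 can be sketched in Python as follows. The compute estimate assumes 8,760 billable hours per year (24 x 365), which is our assumption and not a figure from the study. It lands within about $12 of the table value, so the study presumably used a slightly different hours-per-year convention. The storage line is taken from Table 4 as given and is not re-derived here.

```python
NODES = 12
PRICE_PER_NODE_HOUR = 1.048    # USD, n2-highmem-16 (Table 3)
HOURS_PER_YEAR = 24 * 365      # ASSUMPTION: not stated in the study

# Approximate yearly compute cost for the cluster under that assumption.
compute_estimate = NODES * PRICE_PER_NODE_HOUR * HOURS_PER_YEAR

# Figures as reported in Table 4.
TABLE4_COMPUTE_PER_YEAR = 110_178
TABLE4_STORAGE_PER_YEAR = 38_009

one_year_cost = TABLE4_COMPUTE_PER_YEAR + TABLE4_STORAGE_PER_YEAR
three_year_cost = 3 * one_year_cost

if __name__ == "__main__":
    print(f"compute estimate: ${compute_estimate:,.0f} per year")
    print(f"one-year cost: ${one_year_cost:,}")
    print(f"three-year cost: ${three_year_cost:,}")
```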

Other Costs

In addition to these components, we considered other cost factors. We understood that it takes more than simply buying cloud platforms and tools to achieve a production-ready enterprise NoSQL stack—it takes people, who are essential to build and maintain the platform to deliver business insight and meet the needs of a data-driven organization.

The additional cost factors include:

  • Labor
  • Build (integration and migration)
  • Production staffing (ongoing maintenance and continuous improvement)

Labor Costs
To figure labor costs for our TCO calculations, we used a blended rate for both internal staff and external professional services. We estimate labor costs based on our industry experience as follows:

  • Internal Staff: $73 per hour
  • External Professional Services: $150 per hour
  • Work Hours Per Year: 2,080

For internal staff, we used an average annual cash compensation of $125,000 and a 22% burden rate, bringing the fully burdened annual cost to $152,500. We also estimated the year to have 2,080 working hours, resulting in an effective hourly rate of $73. For external professional services, we chose a nominal blended rate of $150 per hour.

To support both the integration and migration to the NoSQL platform (build phase) and the ongoing maintenance and continuous improvement (production phase), we estimated a mixture of internal and external resources (Table 5):

Table 5. Internal and External Resource Costs

                    Build Phase     Production Phase
Internal Staff      50%             100%
External Services   50%             0%
Blended Rate        $112 per hour   $73 per hour
Source: GigaOm 2022
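
The hourly rates in Table 5 follow from the compensation figures above. The Python sketch below shows the arithmetic, with hypothetical names chosen for illustration.

```python
ANNUAL_CASH_COMPENSATION = 125_000   # USD, internal staff
BURDEN_RATE = 0.22                   # benefits and overhead on top of salary
WORK_HOURS_PER_YEAR = 2_080
EXTERNAL_RATE = 150.0                # USD per hour, professional services

# Fully burdened annual cost and the hourly rate it implies.
burdened_annual_cost = ANNUAL_CASH_COMPENSATION * (1 + BURDEN_RATE)
internal_rate = burdened_annual_cost / WORK_HOURS_PER_YEAR


def blended_rate(internal_share):
    """Weight the internal and external hourly rates by staffing share."""
    return internal_share * internal_rate + (1 - internal_share) * EXTERNAL_RATE


build_rate = blended_rate(0.5)        # 50% internal, 50% external
production_rate = blended_rate(1.0)   # 100% internal

if __name__ == "__main__":
    print(f"burdened annual cost: ${burdened_annual_cost:,.0f}")
    print(f"internal rate: ${internal_rate:.2f} per hour")
    print(f"build-phase rate: ${build_rate:.2f} per hour")
    print(f"production-phase rate: ${production_rate:.2f} per hour")
```

The table shows these rates rounded to whole dollars.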

Build Phase
To estimate the build costs for migrating from an existing platform to the NoSQL solution, we used the figures shown in Table 6:

Table 6. Estimated Build Costs

                                       Astra DB   Self-Managed Cassandra
Source Objects                         50         50
Effort – Hours per Object – Typical    24         24
Complexity Multiplier                  1          3
Effort – Hours per Object – Expected   24         72
Total Hours                            1,200      3,600
Reduction Due to POC Work              0%         0%
Contingency Effort %                   30%        30%
Total Effort (Hours)                   1,560      4,680
Average per Object                     1.3        3.9
Labor Costs per Hour                   $112       $112
Total                                  $174,188   $522,563
Source: GigaOm 2022
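
The build estimate in Table 6 can be reproduced with the sketch below. It assumes the unrounded build-phase rate (about $111.66 per hour, derived from the $152,500 burdened cost and the $150 external rate) in place of the displayed $112, since that is what matches the table totals to within a dollar. The function name is hypothetical.

```python
BURDENED_ANNUAL_COST = 152_500
WORK_HOURS_PER_YEAR = 2_080
EXTERNAL_RATE = 150.0

# ASSUMPTION: the study used the unrounded 50/50 blended rate.
BUILD_RATE = 0.5 * (BURDENED_ANNUAL_COST / WORK_HOURS_PER_YEAR) + 0.5 * EXTERNAL_RATE


def build_estimate(objects, hours_per_object, complexity, contingency):
    """Return (total effort in hours, total cost in USD) for the build phase."""
    base_hours = objects * hours_per_object * complexity
    total_hours = base_hours * (1 + contingency)
    return total_hours, total_hours * BUILD_RATE


astra_hours, astra_cost = build_estimate(50, 24, complexity=1, contingency=0.30)
oss_hours, oss_cost = build_estimate(50, 24, complexity=3, contingency=0.30)

if __name__ == "__main__":
    print(f"Astra DB: {astra_hours:,.0f} hours, about ${astra_cost:,.0f}")
    print(f"Self-managed: {oss_hours:,.0f} hours, about ${oss_cost:,.0f}")
```

Because the complexity multiplier is the only input that differs, the self-managed build cost is three times the Astra DB figure.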

Data migration included a significant number of data objects that needed to be exported from the old platform and loaded onto the new platform. Because Astra DB is a fully managed platform, there is little management overhead. Thus, we used a complexity factor of 1.0. For the self-managed, build-your-own OSS Cassandra platform, we considered all the configuration and tuning required (see below), and gave it a complexity factor of 3.0 to demonstrate the time and effort saved by using a completely managed platform. We arrived at 3.0 based on our own experience of setting up the testing environment—it took three times longer to set up OSS Cassandra for testing than it did to set up Astra DB.

We also assume that a net new OSS Cassandra user may require more time setting up, configuring, tuning, and migrating their data operations compared to the fully managed, serverless Astra DB, which requires little setup and no configuration or tuning. The additional tuning and configuration usually required for self-managed but not needed for Astra DB includes the following:

Considering developers:

  • When creating a table, they do not need to decide on, test, and configure a compaction strategy, as Astra DB uses a unified compaction strategy.
  • When creating a table, they do not need to configure gc_grace_seconds, which must be matched to the frequency of repairs.
  • When creating a table, they do not need to configure settings that impact memory usage per table, such as min_index_interval, max_index_interval, and bloom_filter_fp_chance.
  • When creating a table, they do not need to configure and test the speculative_retry setting, which impacts p99 latencies and resource usage.

Considering operators:

  • They do not need to install, configure, operate, and monitor a repair process for the cluster.
  • They do not need to install, configure, operate, and monitor a backup process for the cluster.
  • They do not need to configure CPU, memory, and disk settings such as concurrent_reads, file_cache_size_in_mb, and compaction_throughput_mb_per_sec.
  • They do not need to configure data distribution using the num_tokens setting, which impacts features such as repair and scaling the cluster.
  • They do not need to configure internode and client-server encryption and key management.

For the overall three-year TCO calculation, build phase costs are only applied once.

Production Phase
The cost of maintaining and improving the NoSQL platform has to be considered as well. This includes support work, such as database administration, disaster recovery, and security. However, no platform is (or should be) static. The environment needs constant improvement and enhancement to grow with business needs, so the work of project management and CI/CD integration is considered here as well.

The labor costs shown in Table 7 were used in our calculations and are based on the internal labor rate of $73 per hour.

Table 7. Labor Cost Estimates

                     Astra DB   Self-Managed Cassandra
DBA FTE              0.2        2
Infrastructure FTE   0          2
Total FTE            0.2        4
Total per Year       $30,500    $610,000
Three-Year Total     $91,500    $1,830,000
Source: GigaOm 2022
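
The staffing totals in Table 7 are the FTE counts multiplied by the fully burdened annual cost of $152,500 per internal employee. A brief Python sketch of that calculation, with illustrative names:

```python
BURDENED_ANNUAL_COST = 152_500   # USD per internal FTE per year
YEARS = 3


def production_labor(dba_fte, infrastructure_fte):
    """Return (cost per year, three-year cost) for the given staffing level."""
    per_year = (dba_fte + infrastructure_fte) * BURDENED_ANNUAL_COST
    return per_year, per_year * YEARS


astra_per_year, astra_three_year = production_labor(0.2, 0)
oss_per_year, oss_three_year = production_labor(2, 2)

if __name__ == "__main__":
    print(f"Astra DB: ${astra_per_year:,.0f} per year, ${astra_three_year:,.0f} total")
    print(f"Self-managed: ${oss_per_year:,.0f} per year, ${oss_three_year:,.0f} total")
```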

Administration tasks are reduced and automated with a fully managed Astra DB platform.

This analysis assumes an organization is just starting its journey with distributed database technology, either migrating from traditional relational databases, or is a new company adopting new technology as it builds the product. A learning curve is associated with ramping up infrastructure and DBA expertise, which is accounted for with the number of FTEs selected.

Three-Year TCO

Ultimately we arrived at our final three-year total cost of ownership figures for the study. Figure 2 shows the grand total for both NoSQL deployments—Astra DB and open-source Cassandra—over a three-year span.

Figure 2. Deployment Costs Compared
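
The grand totals combine the three-year platform cost, the one-time build cost, and the three-year production labor cost from Tables 2, 4, 6, and 7. The Python sketch below adds them up. The self-managed sum comes out one dollar above the reported $2,797,123, presumably because the component tables are rounded individually.

```python
# Three-year components in USD, as reported in Tables 2, 4, 6, and 7.
ASTRA_DB = {"platform": 87_658, "build": 174_188, "production": 91_500}
SELF_MANAGED = {"platform": 444_561, "build": 522_563, "production": 1_830_000}

astra_tco = sum(ASTRA_DB.values())
self_managed_tco = sum(SELF_MANAGED.values())

# Astra DB cost as a share of the self-managed cost.
cost_ratio = astra_tco / self_managed_tco

if __name__ == "__main__":
    print(f"Astra DB three-year TCO: ${astra_tco:,}")
    print(f"Self-managed three-year TCO: ${self_managed_tco:,}")
    print(f"Astra DB share of self-managed cost: {cost_ratio:.1%}")
```

Production labor is the largest component of the self-managed total, which is why the staffing assumptions in Table 7 drive most of the gap.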

Non-Quantifiable Benefits

While it’s relatively easy to quantify the benefits of a serverless and a hosted Cassandra offering, it is much harder to quantify features that offer intangible benefits. Integrated into the Astra DB serverless Cassandra offering from DataStax, used for this evaluation, are storage-attached indexes and an open-source data API gateway.

One of the challenges of using Cassandra is the esoteric nature of querying data, including the inability to do joins. This requires multiple workarounds, some of which reduce Cassandra’s performance or add complexity. Storage-attached indexes (SAI) allow developers to query any column without using the primary key, removing a source of developer friction from Cassandra.

Modern developers want to use APIs; they don’t want to install drivers or learn Cassandra Query Language. With a data API gateway, a developer with no prior knowledge of Apache Cassandra can start in minutes by using familiar REST, GraphQL, gRPC or Document (JSON) APIs. Storage-attached indexes and the data API gateway are difficult to quantify but both increase developer productivity and accelerate software velocity.

5. Conclusion

Cassandra is crucial for business continuity, low latency, and effectively supporting increased traffic. We compared open-source Cassandra on Google Cloud with a fully managed Astra DB service.

We included dedicated compute hardware (for self-managed Cassandra), cost per read and write operation (on Astra DB), storage growth (each write operation adds new data), and people costs in our three-year total cost of ownership calculations.

A realistic performance test based on a use pattern relatable to modern enterprises was established using NoSQLBench. The use case was built on the ebb and flow of transaction volumes on a given day and assumed a peak and a trough of activity each day.

Our final three-year total cost of ownership figures for the study showed $353,346 for Astra DB and $2,797,123 for self-managed Cassandra. That makes Astra DB approximately one-eighth the cost of self-managed Cassandra in our test. Astra DB came out with 95% lower staffing costs, one-third the complexity, and 80% lower infrastructure costs than self-managed Cassandra.

This test shows the immense value of choosing Astra DB when deploying Cassandra for an enterprise project. Astra DB reduces deployment time and is the elegant serverless, multi-cloud Cassandra option.

6. Disclaimer

Cost is important, but it is only one criterion for selecting a Cassandra platform. This test is a point-in-time check of specific costs. There are numerous other factors to consider in a selection, including performance, administration, features and functionality, workload management, user interface, scalability, vendor, and reliability. It is our experience that costs change over time and are competitively different for different workloads. Also, a cost leader can hit up against the point of diminishing returns, and viable contenders can quickly close the gap.

GigaOm runs all of its tests to strict ethical standards. The report’s results are the objective results of the application of tests to the simulations described. The report clearly defines the selected criteria and process used to establish the field test. It also clearly states the tools and workloads used. The reader is left to determine how to qualify the information for individual needs. The report does not make any claim regarding the third-party certification and presents the objective results received from the application of the process to the criteria as described in the report. The report strictly measures TCO and does not purport to evaluate other factors that potential customers may find relevant when making a purchase decision.

This is a sponsored report. DataStax chose the competitors and the test, and the DataStax serverless configuration was the default provisioned by DataStax. GigaOm chose the most compatible configurations for self-managed Cassandra and ran the models. Choosing compatible configurations is subject to judgment. We have attempted to describe our decisions in this paper.

7. About DataStax

DataStax is the open, multi-cloud stack for modern data apps. DataStax gives enterprises the freedom of choice, simplicity, and true cloud economics to deploy massive data, delivered via APIs, powering rich interactions on multi-cloud, open source, and Kubernetes.

DataStax is built on proven Apache Cassandra™, Apache Pulsar™ streaming, and the Stargate open-source API platform. DataStax Astra is the new stack for modern data apps as-a-service, built on the scale-out, cloud-native, open-source K8ssandra.

8. About William McKnight

William McKnight is a former Fortune 50 technology executive and database engineer. An Ernst & Young Entrepreneur of the Year finalist and frequent best practices judge, he helps enterprise clients with action plans, architectures, strategies, and technology tools to manage information.

Currently, William is an analyst for GigaOm Research who takes corporate information and turns it into a bottom-line-enhancing asset. He has worked with Dong Energy, France Telecom, Pfizer, Samba Bank, ScotiaBank, Teva Pharmaceuticals, and Verizon, among many others. William focuses on delivering business value and solving business problems utilizing proven approaches in information management.

9. About Jake Dolezal

Jake Dolezal is a contributing analyst at GigaOm. He has two decades of experience in the information management field, with expertise in analytics, data warehousing, master data management, data governance, business intelligence, statistics, data modeling and integration, and visualization. Jake has solved technical problems across a broad range of industries, including healthcare, education, government, manufacturing, engineering, hospitality, and restaurants. He has a doctorate in information management from Syracuse University.

10. About GigaOm

GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.

GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.

GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.

11. Copyright

© Knowingly, Inc. 2022 "Cassandra Total Cost of Ownership Study" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.