Enterprise Readiness of Cloud MLOps v1.0

1. Summary

MLOps is a practice for collaboration between data science and operations teams to manage the production machine learning (ML) lifecycle. An amalgamation of “machine learning” and “operations,” MLOps applies DevOps principles to ML delivery, enabling ML-based innovation at scale and resulting in:

  • Faster time to market of ML-based solutions
  • More rapid rate of experimentation, driving innovation
  • Assurance of quality, trustworthiness, and ethical AI

MLOps is essential for scaling ML. Without it, enterprises risk struggling with costly overhead and stalled progress. Several vendors have emerged with offerings to support MLOps; the major offerings include Microsoft Azure ML, Amazon SageMaker, and Google Vertex AI. We looked at these offerings from the perspective of enterprise features and time-to-value.

For the analysis, we used the categories of Total Cost of Ownership (TCO), time-to-value, and enterprise capabilities. Our assessment resulted in a score of 2.9 (out of 3) for Azure ML using managed endpoints, 1.9 for Google Vertex AI, and TK for AWS SageMaker. The assessment and scoring rubric and methodology are detailed in an annex to this report.

We hope this report is informative and helpful in uncovering some of the challenges and nuances of platform selection: we leave the issue of fairness for the reader to determine. We strongly encourage you, as the reader, to look past marketing messages and discern for yourself what is of value in technical terms, according to the goals you are looking to achieve. You are encouraged to compile your own representative MLOps use cases and workflows, and review these platforms in a way that is applicable to your requirements.

2. MLOps Enterprise Readiness

The MLOps process primarily revolves around data scientists, ML engineers, and application developers creating, training, and deploying models on prepared data sets. Once trained and validated, models are deployed into an application environment that can deal with large quantities of (often streamed) data, enabling users to derive insights.

For modern enterprises, use of ML goes to the heart of digital transformation, enabling organizations to harness the power of their data and deliver new and differentiated services to their customers. Achieving this goal is predicated on three pillars:

  • Development of such models requires an iterative approach, so the domain can be better understood and the models improved over time as new learnings emerge from data and inference.
  • Automated tools and repositories are needed to store and track models, code, and data lineage, and to provide a target environment for deploying ML-enabled applications at speed without undermining governance.
  • Developers and data scientists need to work collaboratively to ensure ML initiatives are aligned with broader software delivery and, beyond that, with overall IT-business strategy.

To a large extent, these goals can be addressed using MLOps platforms; a lack of functionality (and performance) in a chosen MLOps solution can lead to workarounds, more personnel effort, and missed opportunities. In our assessment, there are three main ways the platforms can be differentiated from one another: TCO, time-to-value, and enterprise readiness capabilities.

  • TCO dimensions incorporate overall platform costs and costs of processing
  • Time-to-value dimensions are ease of setup and use, and MLOps workflow
  • Enterprise capability dimensions are security, governance, and automation

TCO Dimensions

The pure financial cost of using an MLOps platform clearly has a bearing on the ROI of the ML program. Several benefits accrue to an organization from the effective use of ML, including many that bear directly on profitability. Successful MLOps improves the cost equation of ML by keeping those costs in check. MLOps product selection matters immensely to MLOps costs, and there are two main dimensions to the cost.

The two dimensions of MLOps costs are:

  • Overall Platform Costs
  • Costs of Processing

The following table estimates the costs of processing we calculated for a small to medium-sized organization for one year.



For this calculation, we made the following assumptions:

  • The amount of model prediction compute for Azure ML and Google Vertex AI stays fixed at 16 compute nodes running 24/7/365.
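
To make the arithmetic behind such an estimate concrete, the following minimal sketch shows how an annual cost of processing can be computed from the assumption above. The hourly rates are purely hypothetical placeholders, not vendor list prices; substitute current pricing for each platform.

```python
# Minimal sketch of an annual cost-of-processing estimate.
# Hourly rates below are HYPOTHETICAL placeholders, not vendor quotes.
HOURS_PER_YEAR = 24 * 365   # nodes run 24/7/365
NODES = 16                  # fixed prediction compute, per the assumption above

hypothetical_hourly_rate_per_node = {
    "Azure ML (managed endpoints)": 0.90,
    "AWS SageMaker": 1.10,
    "Google Vertex AI": 1.25,
}

for platform, rate in hypothetical_hourly_rate_per_node.items():
    annual_cost = rate * NODES * HOURS_PER_YEAR
    print(f"{platform}: ${annual_cost:,.0f} per year")
```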

Enterprise Time-to-Value

Time-to-value is the measure of how much the cloud MLOps platform can shorten the amount of time from installation to when the machine learning lifecycle is at full, daily operating capacity and delivering value to the business.

This is a critical factor in the race to leverage machine learning to improve business outcomes. Enterprises are clamoring to employ machine learning and artificial intelligence to make their businesses more efficient, customer-oriented, profitable, and competitive. The less time it takes to get a machine learning platform up and running and operating effectively, the sooner a company can go to market with its newfound insights and capabilities.

Enabling this are:

  • Ease of Setup and Use. This greatly shortens the time-to-value, especially when an organization is just beginning its machine learning journey on a fully managed cloud service. Ease of setup and use has a multiplicative effect. As new data science teams are onboarded to the solution, a platform that is easy to use shortens the amount of time it takes for them to set up their environment, establish their workflows, and adopt and tailor the platform to meet their specific needs and requirements.
  • MLOps Workflow. Assessing the MLOps workflow capabilities of each platform measures how well the cloud platform improves the efficiency of ongoing, day-to-day operations as machine learning models traverse a workflow through orchestration.

Enterprise Capability Assessment

Enterprise capability readiness is determined by the features enabled by the cloud platform; these create a robust, feature-rich environment that makes the platform easier to manage and control.

These features ensure the MLOps platform service can operate, and be secured, governed, and monitored, within a modern IT infrastructure and governance/compliance structure. They also include automation and integration capabilities, such as code management and continuous integration and continuous delivery (CI/CD) tasks.

These features drive security, governance, and automation as follows:

  • Security. In any IT discipline, a best practice is to restrict access to only the users who need it. Each cloud platform differs in the authentication and authorization model used by the service. Some organizations might also want to restrict network access or securely join resources in their on-premises network with the cloud. Data encryption is also vital, both at rest and while data moves between services. Finally, you need to be able to monitor the service and produce an audit log of all activity. To fully secure the MLOps platform, network perimeters must be put in place to thwart potential attackers and prevent data exfiltration. IT administrators need to configure the platform and other services, like storage, key vault, container registry, and compute resources (virtual machines), in a network-secure way, such as using virtual networks to enable end-to-end machine learning lifecycle security. A virtual network acts as a security boundary, isolating your resources from the public internet. You must also be able to join a cloud virtual network to an on-premises network; by joining networks, you can securely train models and access deployed models for inference. (A minimal sketch of creating a network-isolated workspace appears after this list.)
  • Governance. This can be a critical requirement and should not be overlooked. IT administration, data governance, and corporate compliance need the ability to ensure users are creating machine learning workspaces and other services that remain compliant with corporate standards or government regulations. Governance capabilities of the MLOps platform should allow users to set up network and data protection policies that, for example, ensure users are not able to create workspaces with public IPs or without customer-managed keys. Additionally, monitoring is key to maintaining effective governance. The cloud platform should offer full stack monitoring, whether it is embedded in the MLOps platform GUI or part of the cloud vendor’s overall monitoring toolset.
  • Automation. This is a key differentiator in MLOps platforms because of the efficiency gains and saved effort that can be achieved through well-developed features. Generally speaking, automation falls into the following categories:
    • Automated experiments, i.e., the ability to automatically pick an algorithm and generate a deployment-ready model
    • Automated workflows, i.e., the ability to automate workflows by automating time-consuming and iterative tasks
    • Code, application, and CI/CD orchestration, e.g., using GitHub or Team Foundation Server (TFS) for versioning, approvals, gate phasing, and deployments
    • Event-driven workflows, i.e., the ability to trigger a workflow activity when an event occurs
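
As a concrete illustration of the network-isolation point in the Security item above, the sketch below creates an Azure ML workspace with public network access disabled using the Azure ML Python SDK v2 (azure-ai-ml). The subscription, resource group, and workspace names are placeholders, and comparable isolation is configured differently on other platforms; treat this as a sketch under those assumptions, not a complete hardening recipe.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Workspace
from azure.identity import DefaultAzureCredential

# Placeholder identifiers; private endpoints and hardening of storage, key vault,
# and container registry are omitted for brevity.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
)

workspace = Workspace(
    name="secure-mlops-ws",
    location="eastus",
    public_network_access="Disabled",  # reachable only via private endpoints / VNet
)
ml_client.workspaces.begin_create(workspace).result()
```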

3. Competitive Platforms

Azure Machine Learning

Azure Machine Learning (Azure ML) is a fully managed platform-as-a-service (PaaS) offering for ML. Azure ML provides developers and data scientists the ability to not only build, train, and deploy ML models, but also accelerate time-to-value with end-to-end, fully featured MLOps.

Azure ML provides end-to-end lifecycle management, keeping track of all experiments. It stores the code, settings, and environment details to facilitate experiment replication. Models can be packaged into containers and deployed like any other container running on Kubernetes.

There are several ways to carry out machine learning on Microsoft Azure’s cloud computing platform. A popular choice is to leverage the Azure Machine Learning service, a collaborative environment that enables developers and data scientists to rapidly build, train, deploy and manage machine learning models.

Solution Architecture

  • Compute instance is used as a managed workstation by data scientists to build models. An IT admin can create a compute instance behind a VNet if restrictions are in place that prohibit public IPs.
  • Compute cluster is used as training compute to train ML models. An IT admin can create a compute cluster behind a VNet or enable a private link if restrictions are in place that prohibit public IPs.
  • Once a model is created, it can be deployed on an AKS cluster. A private AKS cluster with no public IP can be attached to the Azure ML workspace, and an internal load balancer can be used so that the deployed scoring endpoint is not visible outside of the virtual network. All scoring requests to the deployed model are made over TLS/SSL.

With flexible compute options in Azure, Azure ML makes it easy to start locally and scale as needed.
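
The summary earlier scores “Azure ML using managed endpoints,” so as one illustration of the deployment step, here is a minimal sketch of publishing a registered model to a managed online endpoint with the Azure ML Python SDK v2. The model name, endpoint name, and VM size are assumptions for illustration.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Create the endpoint, then a deployment that serves a registered model version.
# An MLflow-format model needs no scoring script; other models also require a
# code_configuration and environment (omitted here).
endpoint = ManagedOnlineEndpoint(name="attrition-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="attrition-endpoint",
    model="azureml:attrition-model:1",   # hypothetical registered model
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Route all traffic to the new deployment
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```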

Amazon SageMaker

Amazon SageMaker is an accessible Amazon Web Services (AWS) offering for fully managed machine learning as a service.

SageMaker is a machine learning environment that provides tools such as Jupyter for model building and deployment. SageMaker comes with an impressive set of algorithms. These include Linear Learner (a supervised method for classification and regression), Latent Dirichlet Allocation (an unsupervised method for finding document categories), and many more.

Solution Architecture

Image credit: https://aws.amazon.com/blogs/architecture/field-notes-accelerate-research-with-managed-jupyter-on-amazon-sagemaker/

SageMaker is composed of many services connected via an API that coordinates the ML lifecycle. SageMaker uses Docker to execute ML logic. You can download a library that lets you easily create Docker images. SageMaker retrieves a specific Docker image from ECR and then uses this image to run containers to execute the job.

The artifacts from model training are stored in S3. Whenever developers create a job, SageMaker launches EC2 instances to perform the work. SageMaker relies on Identity and Access Management (IAM) users for authentication and access control and HTTPS for requests to the API.

SageMaker works extensively with the open-source SageMaker Python SDK for model training using prebuilt algorithms and Docker images, as well as for deploying custom models and code.
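
To illustrate that pattern, here is a minimal sketch using the SageMaker Python SDK to train the built-in Linear Learner algorithm on CSV data. The IAM role ARN, S3 paths, and feature count are placeholders; run it from an environment with SageMaker permissions.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# Resolve the ECR image for the built-in Linear Learner algorithm
image = image_uris.retrieve(framework="linear-learner", region=session.boto_region_name)

estimator = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/attrition/output",  # placeholder bucket
    sagemaker_session=session,
)
# Linear Learner expects the label in the first CSV column (no header row)
estimator.set_hyperparameters(predictor_type="binary_classifier", feature_dim=34)

estimator.fit(
    {"train": TrainingInput("s3://my-bucket/attrition/train/", content_type="text/csv")}
)
```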

You can also add your own methods and run models, leveraging SageMaker’s deployment mechanisms, or integrate SageMaker with a machine learning library. SageMaker’s design supports ML applications from model data creation to model execution. The solution architecture also makes it versatile and modular. You can use SageMaker for only model construction, training, or deployment.

Google Cloud AI Platform

Google Cloud (GCP) offers AI Platform as a fully managed, end-to-end platform for data science and machine learning.

AI Platform is designed to make Google AI accessible to enterprise ML workflows. AI Platform has a number of services for MLOps: Pipelines provides support for creating ML pipelines, Continuous Evaluation helps you monitor model performance, and Deep Learning Containers provide preconfigured and optimized containers for deep learning environments.

AI Platform is a suite of services on Google Cloud specifically targeted at building, deploying, and managing ML models in the cloud. It is used with AutoML (Google’s automated model selection and training engine), as well as with models built with TensorFlow and scikit-learn.

AI Platform offers a suite of services designed to support the activities seen in a typical ML workflow: prepare, build, run, and validate. AutoML provides a GUI for model selection, delivering faster performance and more accurate predictions, while AI Platform Vizier offers a complete ML black-box optimization service.

In addition, AI Platform provides data labeling for tasks such as classification, object detection, and entity extraction.
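
As a brief sketch of this workflow on Google Cloud, the snippet below uses the google-cloud-aiplatform client (the Vertex AI successor to AI Platform) to create a tabular dataset from a CSV in Cloud Storage, run an AutoML training job, and deploy the resulting model. Project, bucket, and column names are assumptions.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder project

# Tabular dataset from a CSV uploaded to Cloud Storage
dataset = aiplatform.TabularDataset.create(
    display_name="attrition",
    gcs_source=["gs://my-bucket/attrition.csv"],  # placeholder bucket
)

# AutoML handles algorithm selection and hyperparameter tuning
job = aiplatform.AutoMLTabularTrainingJob(
    display_name="attrition-automl",
    optimization_prediction_type="classification",
)
model = job.run(
    dataset=dataset,
    target_column="Attrition",
    budget_milli_node_hours=1000,
)

# Deploy the trained model to an online prediction endpoint
endpoint = model.deploy(machine_type="n1-standard-4")
```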

4. Field Test Setup

The field test was designed to assess the capabilities, features, ease-of-use, and documentation of the three platforms. We strove to eliminate as much subjectivity as possible from the test plan, methodology, and measurement. However, we concede that assessing MLOps for enterprise readiness in the cloud is challenging. Certain use cases may favor one vendor over another in terms of feature availability, environments, established workflows, and requirements. Our assessment demonstrates a slice of potential configurations and workloads.

GigaOm partnered with Microsoft, the sponsor of this report, to select competitive platforms that offer comparable features and capabilities to address organizations’ MLOps use cases. GigaOm selected the test scenario, methodology, and configuration of the environments. The parameters used to replicate this test are provided throughout this document. We have provided enough information in the report for anyone to reproduce this test.

Test Scenario

For the MLOps platforms, we selected a simple and straightforward, but very common, use case for our testing. A company has an attrition dataset and would like to build a model to uncover the factors that lead to employee attrition and explore important questions such as:

  • Show me a breakdown of distance from home by job role and attrition
  • Compare average monthly income by education and attrition

The dataset is in CSV format, which we uploaded to the respective cloud storage for each platform. We then used each platform to build, train, and deploy the model.
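
For reference, the two exploratory questions above can be expressed in a few lines of pandas against the CSV. The column names here assume the widely used IBM HR attrition schema (JobRole, Attrition, DistanceFromHome, Education, MonthlyIncome); adjust them to your own dataset.

```python
import pandas as pd

df = pd.read_csv("attrition.csv")

# Breakdown of distance from home by job role and attrition
print(df.groupby(["JobRole", "Attrition"])["DistanceFromHome"].describe())

# Average monthly income by education and attrition
print(df.groupby(["Education", "Attrition"])["MonthlyIncome"].mean())
```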

The field test then consisted of five separate tests:

  • Test 1 – Ease of Setup and Use
  • Test 2 – MLOps Workflow
  • Test 3 – Security
  • Test 4 – Governance
  • Test 5 – Automation

We document our approach to each test, together with how we scored each set of steps, in the annex below.

5. Field Test Results

This section analyzes the levels of differentiation between the three MLOps cloud vendor platforms described in the previous section.

Test 1: Ease of Setup and Use

The overall results are as shown in the following table.

Beginning with the initial setup and the creation of the ML workspace, all three platforms have an intuitive point-and-click interface. The documentation supports the activity, but it was barely needed as the interface walked us through the configuration of the environment, networking, security, and storage. Within minutes, we had a working environment ready. Each platform received full marks.

Next, we tested the creation of compute resources—both a single compute instance (or virtual machine) and a production-grade cluster. Azure had the quickest and easiest set of steps with just a few clicks. Google Vertex AI was also simple and straightforward.
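
Our testing used each vendor’s point-and-click interface, but for readers who prefer code, the following sketch shows an equivalent Azure ML Python SDK v2 call for creating a small autoscaling compute cluster. The cluster name, VM size, and scaling limits are illustrative assumptions.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Small autoscaling training cluster; scales to zero when idle
cluster = AmlCompute(
    name="cpu-cluster",
    size="STANDARD_DS3_V2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=1800,  # seconds
)
ml_client.compute.begin_create_or_update(cluster).result()
```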

The following workflow diagrams outline what we uncovered.

Figure 2. Ease of Setup and Use Test Workflows

For resource management tasks, such as delete operations, all three platforms did well. Google Vertex AI was simple but required many more steps than Azure ML. SageMaker received a slight deduction because it did not have an option to SSH into its compute instance from the interface.

Test 2: MLOps Workflow

The overall results are as shown in the following table.


To assess these capabilities, we compared both model and data orchestration across the three platforms. Azure ML provided one-click access for us to build our models in Jupyter, JupyterLab, and RStudio. We also could easily distribute and reuse the models by sharing them with our team members.

For data orchestration, we took each platform through the paces of importing new data, configuring the data source, defining its schema, validating, cleansing, transforming, normalizing, and finally staging the data set for further integrations and downstream use. Using the Import Data module in Azure ML, this was all a straightforward and simple process. We appreciated the ability to view descriptive statistics of columns of data, as well as the handy “clean missing data” module. Transformations, normalizations, and all modules were drag-and-drop and code-free. SageMaker was nearly as robust and intuitive, offering all the same functionality. Unfortunately, at the time of our evaluation, Google Vertex AI was painfully behind in this arena. All of these data orchestration operations involved multistep, extra-large coding tasks without a visual, drag-and-drop interface.

Test 3: Security

The overall results are as shown in the following table.


Both Azure ML and AWS SageMaker offered the fully realized security features we sought by isolating both workspaces and training environments within a virtual network (a VNet on Azure, a VPC on AWS). Google Vertex AI missed on these requirements by not offering the ability to isolate either workspaces or the training environment out of the box, which left our resources accessible to our team over the public internet. We gave it partial credit because there is a method in beta to create a service perimeter and place resources behind it.

In addition to network security, user security is of extreme importance. Identity and Access Management (IAM) determines who should have access to the service and what operations they are authorized to perform. The cloud MLOps platform should provide built-in role-based access controls (RBAC) for common management scenarios. Azure ML received full marks for exactly this capability. Azure Active Directory (Azure AD) can assign these RBAC roles to users, groups, service principals, or managed identities to grant or deny access to resources and operations, which those familiar with Active Directory and Microsoft Single Sign-On (SSO) will appreciate. AWS SageMaker also showed well with its ability to set up user security through its uniform IAM service, and it can also handle managed identity access through AWS Single Sign-On. The only downside is that this would require some adoption and migration work unless your enterprise already has infrastructure security managed under AWS IAM and SSO. Google Vertex AI satisfied our requirements much less fully through its IAM service, and we were unable to find managed identity support, either natively or through a third-party solution.

Finally, an MLOps platform requires security mechanisms companies can leverage to protect data and maintain its confidentiality and integrity. A fully capable platform must support encryption at rest, which includes encrypting data residing in persistent storage on physical media, in any digital format. The platform should support encryption using either cloud vendor managed keys or customer-managed keys. It should also support encryption in transit using the Transport Layer Security (TLS) protocol to protect data when it is traveling between the service and on-premises or remote users. Both platforms passed our tests by offering all of these capabilities.

Test 4: Governance

The overall results are as shown in the following table.


Only Azure ML offered the ability to allow users to set up network and data protection policies that, for example, ensure users are not able to create workspaces with public IPs or without customer-managed keys.

In terms of monitoring, Azure ML has a built-in monitoring capability that allowed us to track key pipeline metrics. For AWS, Model Monitor continuously monitored the quality of our SageMaker machine learning models in production. However, SageMaker lacked a built-in overall monitoring capability, and we had to rely on Amazon CloudWatch logs for our other metrics and monitoring requirements. Google Vertex AI has improved in this area over the former AI Platform. In addition to its audit log monitoring, we could now track pipeline metrics, as well as job activity, in the interface.

Test 5: Automation

The overall results are as shown in the following table.


To perform automated experiments, Azure ML offered Automated ML, which iterated over many combinations of algorithms and hyperparameters and helped us find the best model based on a success metric of our choosing. SageMaker had Autopilot, which explored our data, selected the algorithms relevant to our problem type, and prepared the data to facilitate model training and tuning. Google Vertex AI offered the black-box Vertex Vizier service to perform automated optimizations.
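
To give a sense of what the automated-experiment step looks like in code, here is a minimal Azure ML SDK v2 sketch that submits an Automated ML classification job against a registered tabular data asset. The compute name, data asset, and target column are assumptions; SageMaker Autopilot and Vertex AI expose comparable APIs.

```python
from azure.ai.ml import MLClient, Input, automl
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Automated ML searches algorithms and hyperparameters for the best model
classification_job = automl.classification(
    compute="cpu-cluster",                                                  # assumed existing cluster
    experiment_name="attrition-automl",
    training_data=Input(type="mltable", path="azureml:attrition-train:1"),  # assumed data asset
    target_column_name="Attrition",
    primary_metric="AUC_weighted",
    n_cross_validations=5,
)
classification_job.set_limits(timeout_minutes=60, max_trials=20)

returned_job = ml_client.jobs.create_or_update(classification_job)
print(returned_job.name)  # job name for tracking in the studio UI
```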

We also tested workflow automation features. For example, Azure ML allowed us to join a data preparation step to an automated ML step. SageMaker provided some MLOps templates that automated some of the model building and deployment pipelines. Google Vertex AI offered us some no-code capabilities with their built-in algorithms. We used it by submitting some training data, selecting an algorithm, and then allowing Google Vertex AI Training to handle the preprocessing and training.
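
As an example of what such workflow automation looks like programmatically, here is a hedged sketch of an Azure ML SDK v2 pipeline that chains a data-preparation command step into a training step (we show a plain training command rather than an automated ML step to keep the sketch short). The scripts, curated environment, and data asset names are assumptions.

```python
from azure.ai.ml import MLClient, Input, Output, command, dsl
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

env = "azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest"  # assumed curated environment

prep = command(
    command="python prep.py --raw ${{inputs.raw}} --out ${{outputs.prepared}}",
    code="./src",                    # assumed folder containing prep.py and train.py
    environment=env,
    inputs={"raw": Input(type="uri_file")},
    outputs={"prepared": Output(type="uri_folder")},
    compute="cpu-cluster",
)

train = command(
    command="python train.py --data ${{inputs.data}}",
    code="./src",
    environment=env,
    inputs={"data": Input(type="uri_folder")},
    compute="cpu-cluster",
)

@dsl.pipeline(description="Data preparation followed by model training")
def attrition_pipeline(raw_data):
    prep_step = prep(raw=raw_data)
    train(data=prep_step.outputs.prepared)  # training consumes the prepared output

pipeline_job = attrition_pipeline(
    raw_data=Input(type="uri_file", path="azureml:attrition-csv:1")  # assumed data asset
)
ml_client.jobs.create_or_update(pipeline_job, experiment_name="attrition-pipeline")
```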

The code and application orchestration tests we performed also revealed some differences in the platforms. Azure ML offered versioning through GitHub Actions and direct deployment; however, we found the approvals and gate-phasing options through GitHub Environments to be a bit cumbersome and non-intuitive. SageMaker allowed us to register a model by creating a model version that specifies the model group to which it belongs. We were able to perform approvals and gate-phasing in the SageMaker UI and publish from Jupyter, which earned the platform full marks. Google Vertex AI allowed us to version our code using their REST API (projects.models.version) and publishing from the GCP Console, but we found no approval or gate-phasing capabilities.

Finally, Azure allowed us to create an event-driven application based on Azure ML events, such as failure notification emails or pipeline runs, when certain conditions were detected by Azure Event Grid. With SageMaker, we created actions on rules using CloudWatch and AWS Lambda. We also set up S3 Bucket event notifications. With Google Vertex AI, it was a bit harder. We set up a workflow using the SDK with Cloud Functions to set up event-triggered Pipeline calls. Unfortunately, this required some additional setup and Kubeflow, which complicated the solution.
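
For the SageMaker event-driven case, the S3 bucket notification we describe can be expressed with a short boto3 call like the one below. The bucket name, prefix, and Lambda ARN are placeholders, and the Lambda itself (plus the permission allowing S3 to invoke it) is assumed to already exist.

```python
import boto3

s3 = boto3.client("s3")

# Invoke a pre-existing Lambda whenever new training data lands under attrition/
s3.put_bucket_notification_configuration(
    Bucket="my-mlops-bucket",  # placeholder bucket
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:trigger-training",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": "attrition/"}]}
                },
            }
        ]
    },
)
```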

Overall Scoring Rubric

The following table reveals the complete scores we determined during our assessment tests.

Table 1. Cloud MLOps Scoring

6. TCO Analysis

Our assessment from a total cost of ownership perspective shows Azure ML to be 17% cheaper than Amazon SageMaker and 64% cheaper than Google Vertex AI.

To test Ease of Setup and Use and MLOps Workflow tasks, we used two variables: the total number of steps needed to complete the task, and the average size of each step. For all tests, the total number of steps was quantified by the completion of a unit of work.

Once the number of steps and the size of each step was determined, an average step size was calculated by adding up the task size (where XS=1 and XL=5), dividing by the number of steps, and rounding to the nearest task size (a number between 1 and 5).
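
The averaging works as in this small sketch, which maps T-shirt sizes to points, averages them, and rounds back to a size score between 1 and 5.

```python
# Average step size: map T-shirt sizes to points (XS=1 ... XL=5), average, round
SIZE_POINTS = {"XS": 1, "S": 2, "M": 3, "L": 4, "XL": 5}

def average_step_size(step_sizes):
    points = [SIZE_POINTS[s] for s in step_sizes]
    return round(sum(points) / len(points))

# Example: a four-step task sized S, M, M, L averages to 3 (Medium)
print(average_step_size(["S", "M", "M", "L"]))
```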

To assess MLOps Workflow, we simulated day-to-day MLOps including model orchestration and data orchestration.

To assess Security as an enterprise readiness capability, we tested network, user and data security.

Finally, to assess Governance as an enterprise readiness capability, we tested monitoring and control. For our Automation tests, we tested experiments, workflows, code and app orchestration, and event-driven workflows.

The appendix contains more detail.

7. Conclusion

Overall, we found Azure ML to be the easiest platform to set up and use, to orchestrate models and data, and to secure.

Azure ML really shone in governance monitoring and control, a critical requirement that should not be overlooked. IT administration, data governance, and corporate compliance need the ability to ensure that data scientists and other users are creating machine learning workspaces and other services that remain compliant with corporate standards or government regulations. Governance capabilities of the MLOps platform should allow users to set up network and data protection policies that, for example, ensure users are not able to create workspaces with public IPs or without customer-managed keys. Only Azure ML offers the ability to do this.

SageMaker was the best in our judgment for automation.

While the MLOps selection will generally coordinate with the cloud platform decision, it is important to know the relative strengths of the various MLOps solutions and incorporate MLOps into plans for delivering the benefits of ML to the organization. Once you have made some discrete progress and need to consolidate and coordinate ML efforts at scale, you are ready for MLOps. The opportunity exists to start to drive MLOps as a practice, assuring a framework of governance and tooling that can minimize bottlenecks as efforts progress.

8. Appendix: Assessment Methodology and Scoring

In this appendix, we explain the assessment methodology we employed and how we scored each of the respective cloud vendor platforms. The methodology and scoring used a rubric that we developed to test the key criteria for MLOps enterprise readiness. The time-to-value and enterprise capability dimensions are covered in Section 4 in the above discussion.

To test Ease of Setup and Use and MLOps Workflow tasks, and to remove as much subjectivity as possible, we used two variables: the total number of steps needed to complete the task, and the average size of each step. We fed the total number of steps needed to complete each task into a scoring matrix.

The total number of steps was quantified by the completion of a unit of work—either filling out the form on a single screen or completing a set of sub-tasks as defined by the cloud vendors’ public-facing documentation. Each major step was counted once. To compute the average size of each task, we used T-shirt sizes:

  • Extra-small (XS): A task that took less than a minute to complete with only point-and-click effort on one screen of the vendors’ GUI web interface
  • Small (S): A task that took 1-2 minutes with only point-and-click effort on one screen of the vendors’ GUI web interface
  • Medium (M): A task that took 2-4 minutes and required either a small amount of coding or point-and-click effort that required work across more than one screen apart from the main MLOps interface (e.g., opening a new browser tab and working in the IAM interface)
  • Large (L): A task that took 5-10 minutes and required several lines of code or use of a command-line interface (CLI), possibly in conjunction with point-and-click effort across multiple screens
  • Extra-large (XL): A task that took more than 10 minutes, required multiple lines of coding, CLI commands, and point-and-click effort, and also required the user to piece together the appropriate actions from multiple sources within the documentation

Once the number of steps and the size of each step was determined, an average step size was calculated by adding up the task size (where XS=1 and XL=5), dividing by the number of steps, and rounding to the nearest task size (a number between 1 and 5). This score was applied to the following matrix.

Table 2: Methodology for scoring step complexity


Test 1 – Ease of Setup and Use

To assess Ease of Setup and Use, we performed the following tests by simulating the setup of our own development environment as if we were operations and data science teams using the platform for the very first time.

Test 1a – Create ML workspace
We assessed how quickly and easily a data persona can create, configure (networking and security), and connect to a workspace using the vendor portal.

Test 1b – Create compute resources
We assessed how quickly and easily a data persona can deploy and attach compute resources to a workspace. In this case, we created a compute instance/cluster with startup scripts and an auto-shutdown policy, provisioned by an admin persona but assigned to a data scientist, and placed it behind a VNet with no public IP. We deployed the following resource types:

  • A single instance
  • Production-grade cluster

Test 1c – Manage compute resources
We assessed the following capabilities:

  • Delete resource

Test 2 – MLOps Workflow

To assess MLOps Workflow, we performed the following tests by simulating the same operations in our own development environment as if we were carrying out the operations on a day-to-day basis.

Test 2a – Model orchestration
We assessed how quickly and easily a data scientist persona can perform the following:

  • Build models, i.e., the one-click ability for a data scientist to launch a Jupyter, RStudio, or terminal interface to build models
  • Distribute models
  • Reuse models

To score Test 2a, we gave each platform 1 point each for being able to build, distribute, and reuse models with one click.

Test 2b – Data orchestration
We assessed how quickly and easily a data engineer could perform the following data operations for both model development and automated ML:

  • Import new data
  • Validate and cleanse data
  • Transform (data munging) and normalize data
  • Stage data

To score Test 2b, we used the same number of steps and task size scoring matrix we used for Test 1.

Test 3 – Security

To assess Security as an enterprise readiness capability, we performed the following tests:

Test 3a – Network security
We assessed the ability to put network perimeters in place to thwart potential attackers and prevent data exfiltration, including the ability to isolate the following resources in a virtual network:

  • Workspaces
  • Training environment

Test 3b – User security
We assessed the ability to determine which users should have access to resources and what operations they are authorized to perform, including:

  • Identity and access management (IAM)
  • Managed identities: the ability to use managed identities to access resources without embedding credentials inside code

Test 3c – Data security
We assessed the availability of mechanisms to protect data and maintain its confidentiality and integrity, including:

  • Encryption of data at rest and managed keys
  • Encryption of data in transit (TLS)

To score the Security features, we used the following scale. Each component is given a score and the results are totaled and averaged.

  • Fully capable – 3
  • Partially capable – 2
  • Capable only with external tool – 1
  • Missing – 0

NOTE: Beta features were given an automatic score of 2.

Test 4 – Governance

To assess Governance as an enterprise readiness capability, we performed the following tests:

Test 4a – Monitoring
We assessed the capabilities to monitor logs and activities and to track metrics from pipeline experiments.

Test 4b – Control
We assessed ability to enforce compliance with corporate standards.

To score the Governance tests, we used the same scale as the Security tests described above.

Test 5 – Automation

To assess Automation as an enterprise readiness capability, we performed the following tests:

Test 5a – Experiments
We assessed the ability to automatically pick an algorithm and generate a deployment-ready model.

Test 5b – Workflows
We assessed the ability to automate workflows by automating time-consuming and iterative tasks.

Test 5c – Code and app orchestration
We assessed how quickly and easily an MLOps persona could support the following CI/CD activities (e.g., we used GitHub Actions for these tasks in Azure ML):

  • Versioning
  • Approvals
  • Gate phasing
  • Deploying/publishing

Test 5d – Event-driven
We assessed the ability to trigger an event-based workflow.

To score the Automation tests, we used the same scale as the Security tests described above.

 

9. About William McKnight

William McKnight is a former Fortune 50 technology executive and database engineer. An Ernst & Young Entrepreneur of the Year finalist and frequent best practices judge, he helps enterprise clients with action plans, architectures, strategies, and technology tools to manage information.

Currently, William is an analyst for GigaOm Research who takes corporate information and turns it into a bottom-line-enhancing asset. He has worked with Dong Energy, France Telecom, Pfizer, Samba Bank, ScotiaBank, Teva Pharmaceuticals, and Verizon, among many others. William focuses on delivering business value and solving business problems utilizing proven approaches in information management.

10. About Jake Dolezal

Jake Dolezal is a contributing analyst at GigaOm. He has two decades of experience in the information management field, with expertise in analytics, data warehousing, master data management, data governance, business intelligence, statistics, data modeling and integration, and visualization. Jake has solved technical problems across a broad range of industries, including healthcare, education, government, manufacturing, engineering, hospitality, and restaurants. He has a doctorate in information management from Syracuse University.

11. About GigaOm

GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.

GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.

GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.

12. Copyright

© Knowingly, Inc. 2021 "Enterprise Readiness of Cloud MLOps" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.