Enrico speaks with Matthew Wallace about cloud, cloud storage and how enterprises move their data across different environments.
Guest
As Chief Technology Officer, Matt is responsible for product development, managed and professional services, and architecting Faction’s cloud infrastructure offerings. Prior to Faction, Matt spent 20 years in technology in roles at both startups and Fortune 500 companies, including leadership roles at Level 3 Communications, ViaWest, Exodus Communications, and others.
Matt is the co-author of “Securing the Virtual Environment: How to Defend the Enterprise Against Attack,” one of the first books to holistically address cloud security concerns. Matt attended the University of California at Santa Cruz and is an official member of the Forbes Technology Council.
Transcript
Enrico Signoretti: Welcome, everyone, to a new episode of Voices in Data Storage, brought to you by GigaOm. I’m Enrico Signoretti. Today, we will talk about cloud, cloud storage and how enterprises move their data across different clouds and in hybrid environments. To help me with this topic, I invited Matt Wallace from Faction. He’s the CTO of Faction. Hi Matt! How are you?
Matt Wallace: Hey, I’m doing good. Enrico, thank you for the invite.
Usually, I ask my guests for a little introduction about themselves and their company. Maybe we can start with that.
Sure. I’d be happy to. Like you said, I’m Matt Wallace. I’m the Chief Technology Officer at Faction. Faction is based in Denver, in the United States. We’ve been in business now for ten years and change. We’ve spent a lot of that time doing private cloud environments, but we cut our teeth in the early days doing really interesting things with data center networking, which led to us having a portfolio of patents – intellectual property around some really complex networking.
In the past few years, we’ve applied this to building out what we refer to as a Multi-Cloud Platform as a Service (PaaS). We’re starting to do things where we take things like disaster recovery use cases and data storage use cases and apply our networking technology to turn those into multi-cloud products. I myself have been in technology for over 25 years now, at companies like VMware, Level 3, and Exodus, doing a lot of deep engineering work and now product work.
Lately, every time we talk about storage, it’s about networking. We are entering a phase of storage where networking – it has been an important part of storage for a long time, but now how you move data [is] becoming really, really important to everybody.
Back to our story today. I wanted to start this conversation by talking a little bit about how storage works in large cloud providers. Everything in the cloud is shiny, wonderful, etc., but I think there are a few issues if you come from a traditional environment. If you build something new and architect it for the cloud, everything looks great. You have object storage, you have block storage. Most of the block storage is a little bit different if we compare it to an enterprise storage array, but you have a lot of tools to work around that.
Then, lately, you have hybrid cloud storage. It came later in the game, and it is not always done properly, in my opinion. For some vendors, it looks a little bit like an afterthought. What’s your opinion about cloud storage? What do you think about cloud computing, cloud-native applications, and the interaction between cloud-native and legacy applications?
That’s a great question. It actually reminds me of my first experiences with AWS in 2007 – I almost said 2017. It’s been a long time! The instances that you turned up back then didn’t have any option for persistent disks, because that was before EBS volumes even existed. Coming to your point, cloud native originally depended, essentially, on object storage. In those days, you would boot off an image that was stored in S3, out of object storage. You would have to retrieve your data from there and you needed to really build an architecture around that. Of course, nowadays you can get block storage in the form of an EBS volume, you can get file storage from something like EFS, and you can still blend that with object storage, obviously. We’re seeing every single cloud provider offer those sorts of options.
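A minimal boto3 sketch of the three storage types mentioned here – object (S3), block (EBS), and file (EFS); the bucket, volume, and file system parameters are placeholders, not anything from the conversation:

```python
# Object, block, and file storage on AWS, side by side (illustrative only).
import boto3

# Object storage: put an object into an existing S3 bucket.
s3 = boto3.client("s3")
s3.put_object(Bucket="example-data-bucket", Key="datasets/sample.csv",
              Body=b"id,value\n1,42\n")

# Block storage: create a 100 GiB gp3 EBS volume, attachable to an instance.
ec2 = boto3.client("ec2")
volume = ec2.create_volume(AvailabilityZone="us-east-1a",
                           Size=100, VolumeType="gp3")
# ec2.attach_volume(VolumeId=volume["VolumeId"],
#                   InstanceId="i-0123456789abcdef0", Device="/dev/sdf")

# File storage: create an EFS file system, later mounted over NFS.
efs = boto3.client("efs")
fs = efs.create_file_system(CreationToken="shared-data-token",
                            PerformanceMode="generalPurpose")
```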
I think what we’re really wrestling with now, though, is that everybody wants to be able to take every application and deal with it from several perspectives: ‘How do I do this in the first cloud I was in? How do I do it in other clouds, because my teams want to use every cloud for different reasons?’ – larger enterprises [are] in every cloud today – and ‘How do I do that on prem, because I want a consistent operational model going forward between what I do cloud native in a public cloud and what I do on prem?’ I think that’s only getting more important.
Obviously, with use cases like edge computing, with 5G coming out, and with products like Outposts and Azure Stack, they’re bringing some of the cloud operating models on prem. People are definitely asking, “How do I have this consistent model?”
Historically speaking, it’s been tough to do a couple of those things. It’s been tough to do file in a clean way, especially if you wanted to use more than one protocol, like NFS and CIFS, with the same data set. It’s certainly been difficult to solve the multi-cloud challenge and really answer the question of, “How do I have this same data without having to make copies of it on prem and in each cloud all at the same time?” Those have been tricky to solve.
On one side, we have these enterprises wanting to go to the cloud, but they don’t want to end up with silos again. They experienced data center silos in the past and they are a little bit afraid of recreating cloud silos now: full stacks and applications that can’t move from one stack to another. This is probably the same thing that you explained.
On the other side, building this foundation – a storage foundation that is somehow the same between the clouds and on premises – is very, very difficult. There are several challenges: latency, bandwidth, [and] different products. Another thing is that you usually don’t have your array in the cloud.
That’s true. In fact, I think one of the things that we’ve really been tackling is this: there are things like Outposts and Azure Stack, where the cloud providers are trying to take that cloud operating model and cloud-native applications, penetrate on prem, and give you hardware on prem that runs like the cloud hardware. But the enterprises that we interact with have significant investments in their platforms. It might be Dell EMC, it might be NetApp, but they’ve spent a lot of time and energy building applications that leverage those stacks.
We’re helping to extend that same model and those same platforms up into the cloud for cases like DR and analytics. That’s where we come into play: we don’t make storage hardware, but we are interested in taking the platforms that people have already invested in and enabling them to connect those platforms easily to multiple clouds. A good example: we have one customer with 3.5 petabytes of investment in Isilon on prem, and they replicate that Isilon footprint to us because, first, they want a second copy. They just want the resiliency of having an offsite copy that is not subject to data center failure, etc.
From a DR perspective, that second copy helps them with assurance and IT resilience. Then, once it’s there, we’re able to connect that data into all of the public clouds. Now they have that same data set, working in the same way that it works in their on-premises environment, and because it’s tied into all the clouds, they can leverage the tool sets from those clouds to do things with that data. It’s a neat data set, where having access to each of those platform services actually matters to this client.
Let me understand better. You provide this networking layer, and customers can have the same identical platform they have on premises in your cloud – in your data center – no matter if it’s an Isilon or a NetApp, as you mentioned, so potentially anything. You manage it for them, and you provide the connectivity to their other cloud providers. Did I get it right?
Yeah, that’s true. It’s important to think about this too, because we’re not just trying to manage customer equipment. The other thing we find is that folks who are pursuing their cloud strategy tend to do it because they’ve got good reasons for going ‘cloud native’ in the first place. They want to reduce the number of people. They want to simplify their environments, manage [fewer] data centers, manage [fewer] heterogeneous environments. They’re looking to offload the work of maintaining infrastructure so they can keep their teams focused on how they bring business value through their IT efforts.
We’re doing this as a cloud service. Whether it’s Isilon under the hood, NetApp under the hood, or Dell EMC Unity XT under the hood – regardless of what those platforms are – we’re running it as a service and applying a uniform layer from a network standpoint to plug it all in.
We operate all of that and remove the complexity of the cloud. For example, when it came time to take that Isilon storage and connect it to AWS, all the customer had to do was go into their Amazon console and accept our virtual interface request, and that’s it. Their VPC is talking to the storage. Nothing to buy. Nothing to configure. They don’t need a network engineer to understand it. They don’t need to worry about IP overlap. We just take care of that.
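In AWS terms, that acceptance step maps to confirming a hosted Direct Connect virtual interface. A minimal sketch with boto3, assuming the provider has already provisioned the hosted VIF; the virtual private gateway ID is a placeholder:

```python
# Accept a hosted Direct Connect virtual interface that a provider has
# shared with this account, attaching it to our virtual private gateway.
import boto3

dx = boto3.client("directconnect")

# Find virtual interfaces waiting for confirmation in this account.
pending = [vi for vi in dx.describe_virtual_interfaces()["virtualInterfaces"]
           if vi["virtualInterfaceState"] == "confirming"]

for vi in pending:
    dx.confirm_private_virtual_interface(
        virtualInterfaceId=vi["virtualInterfaceId"],
        virtualGatewayId="vgw-0123456789abcdef0",  # placeholder VGW
    )
```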
On the other hand, we have vendors that now provide virtual appliances of their arrays, and others that don’t. Do you support both of them? Does the virtual appliance make any sense in your model, or is it more something for the public clouds?
We have had some limited involvement with helping folks with appliances, and those tend to be POC or evaluation phases. I think that’s because most people, if they’re satisfied with what an appliance does, are perfectly happy to just go and do that themselves. That’s a marketplace thing: they can go to the cloud provider of their choice, choose the appliance they want, and deploy it. Some of those people may get a little bit of value out of having someone administer that virtual platform for them, but it’s not as big of an uplift as the physical side.
If you want to do a real array that’s connected, that’s physical, that has all the attributes of a real system, now you need a colo contract. You’ve got to put it in a colocation facility, because you want it to be in a facility that’s really low latency, very tightly connected to the cloud. You need network engineering expertise to understand how to stand up and maintain that network. Of course you still need storage administration, but you have to layer on a lot of these other things. Doing that cloud-adjacent model yourself adds a lot of complexity.
There are some really big advantages as well, obviously. If you use an appliance like you mentioned, one that’s in the cloud, there’s a bunch of limitations that may not apply to the physical side. Our platform is based on real hardware, so in a lot of cases it can scale up as high as the hardware can. The virtual appliances are often limited. The virtual appliances, of course, are totally dependent on the cloud provider’s hardware, so between clouds you’re going to get different performance, different scaling limits, and that sort of thing. Finally, there’s the fact that if you use a virtual appliance in a cloud, you risk taking your data and getting it stuck in that one cloud.
To your point, that lock-in – that portability – is a material thing. It’s significant for customers, especially if they know that they’re going to need, or expect they might need in the future, access to that data from multiple clouds. Now they need to store two or three copies, whereas our service can inherently do that multi-cloud connectivity. There’s one copy that sits in between the clouds, like a hub-and-spoke model, where our data platform serves as that hub, with spokes to each cloud. That means if you read it, you can read the same copy from all of them. If you’re writing to it, you can write from one and then immediately read it from the other, which is pretty cool, and it actually opens up a whole set of use cases that you can’t do if you’re stuck inside a single cloud.
In a certain way, you’re sort of a gateway between the public clouds and on-premises environments. You enable your customers to bring their data to the cloud while maintaining the same identical environment. They keep the same level of access, performance, and every other characteristic of the array.
What I’m curious about is: do you also provide some disaster recovery services? I think that the next step is to go a little bit further and also provide hardware, and probably the VMware environment, to do that.
I think one of the reasons we’ve seen significant adoption is because it’s not enough just to say, “I have this storage that you can connect to multiple clouds.” Enterprises are, frankly, still in that transition period in their cloud journey, figuring out how they’re going to use all these services. DR is a problem everybody wrestles with. Either we find they don’t have a plan, in which case they realize they really need a DR plan and it’s something they’ve been looking to do in an economical way; or they have a DR plan, but it involves another data center. They frankly just don’t want to carry a whole data center footprint just to deal with DR, because it’s a very expensive, very operationally intense way to handle that.
Where we’ve done a lot is in helping people do disaster recovery, where we replicate their data into our platform and use VMware Cloud on AWS as a recovery environment. There’s this turnkey, on-demand environment where we can turn up really large compute workloads. Then, since we have the data all the time, we can do things to make that data available.
The 3.5 petabytes I mentioned before – that big data footprint is great for that customer because they’ve got it connected to multiple clouds. But the tent pole, the thing that really got them through the door in the first place from a budget perspective, was the fact that they also have another couple of hundred terabytes of virtual machines.
We were able to provide a DR home that would [land] that virtual machine workload. They can do a recovery into VMware Cloud on AWS and have the same access, during a DR event, to that same data footprint. Now they’re really killing two birds – or three birds – with one stone, by having that same data footprint serve the DR use case as well as the analytics use case across multiple clouds.
We find that that kill-two-birds-with-one-stone approach is really what makes this easy for folks to adopt. They’re immediately solving their DR challenge, but they’re also making that data available for future exploration with public cloud services.
What is the profile of your customer, your usual customer?
It’s an interesting question. Historically, we’ve done a lot of business with small and medium businesses, because we work through a lot of channel partners, system integrators, and value-added resellers that would go out and use our platform to, essentially, solve these problems for their end customers. We were cloud providers to cloud providers, if you will, which is how we got to be a very large, cloud-verified provider – one of the largest VMware service providers in the United States.
Since we rolled out this multi-cloud storage platform and our disaster recovery services, we’ve been doing a lot of selling directly to end users, together with partners, and we’ve found that our customer profile has shifted much more towards large enterprise. We are solving the challenges that large enterprises are most aware of and wrestle most with; they know all the inefficiencies of doing this themselves. They usually have a fairly well-formulated cloud strategy, but they’re wrestling with the multi-cloud aspect. Sometimes they’re really wrestling with DR.
Our typical customer now, we have found, did DR themselves, essentially, and is looking to leverage the cloud. We’re actually finding that our typical disaster recovery customer is much larger than what you used to see in the Disaster Recovery as a Service market.
We have large enterprises that are looking to VMware Cloud on AWS as a recovery environment. Now, instead of being stuck at the more ‘mom and pop’ level of private cloud provider, they can recover into, essentially, public cloud scale, leveraging our services backed by VMware Cloud on AWS. Lately, it’s been much more Fortune 500 and other large enterprises that we’ve seen traction with.
This is very interesting, for two reasons. One, you mentioned that you have customers that are other providers. I don’t know [about] the US, but in Europe there are more and more resellers that are becoming MSPs, managed service providers. They act as brokers: they find a solution in the cloud and they sell it, adding some value on top of it. At the beginning, I thought you were referring to this kind of cloud provider.
I agree. And looking at what you’re saying, it sounds [like] this kind of service is in fact very, very good for large enterprises. You mentioned that you are growing in the US, but what about Europe?
I’m glad you asked. We actually expanded last year into London, our first international location, driven partially by those large enterprises. We had a Fortune 500 financial firm, for example, who first engaged with us about DR here in the United States and wanted to replicate that same model, so we opened up in London. We also had one of our largest traditional customers – from back when we were doing service provider to service provider, with tens of thousands of end customers that they service on top of our platform – who also wanted to expand into Europe.
And because we have something really unique that we do with VMware Cloud on AWS – we’re actually the only way to add additional external storage to a VMC environment – we’re also opening up in Frankfurt now. There’s plenty of demand in Europe. We’re additionally looking at some locations in the Asia-Pacific region. This whole strategy has definitely been driving some international expansion for us.
That’s great. Still, those are US organizations asking you to go there. Do you think you will also get traction from European companies wanting to take advantage of the same model?
I think so. We definitely have fielded quite a few inquiries, especially in advance of opening up this Frankfurt location, from European organizations that are really interested in adopting the same model. Another funny thing is: we’ve actually had some US customers who, even though they are US-headquartered and US-based, are adopting us first overseas. Our first Frankfurt customer, for example, is a United States enterprise that is not yet a customer of ours here in the US, but they want to leverage our services in Frankfurt as part of their digital transformation. I thought that was really interesting.
Indeed. Also because this is a different model. For them, it means they can avoid [opening] a data center there while, at the same time, getting a platform that they know, as a service. They limit the initial investment and they have somebody they can trust overseas.
Yeah, you got it, absolutely. Obviously, we’ve been investing in this platform for a long time, so we’ve got a lot of experience there. Just knowing that somebody has been operating a service like this successfully elsewhere makes it an easy decision, obviously, to go with somebody who has that experience.
I have another question, then. In this case, by taking advantage of your services, consolidating data in a single location, and using it as a large repository that can be accessed by multiple clouds, you somehow solve the data gravity problem. Is all the data accessible from all the clouds at a decent latency?
Yeah, there are really two kinds of data gravity considerations that I think we’re solving. One is the question of on-premises data versus cloud data. We have a lot of enterprise customers that have collected significant data sets on prem. Their on-premises applications work just fine with those, but if they want to use that data in cloud-native applications, they wrestle with the questions of ‘How do I move data to the cloud? How do I move it back? How do I keep them in sync?’ In a lot of cases, leveraging our multi-cloud platform, they’re able to use the actual storage array technology to stitch those together. For example, like we talked about with Isilon: if you have Isilon on prem and Isilon in the cloud, you can actually move data back and forth pretty easily.
The other thing, of course – and this is even more significant – is that you go not just from ‘how do I get on-prem working with the cloud,’ but to ‘how do I get on-prem working with multiple clouds?’ Now, if you have applications that are running on Amazon and Azure and Google and you’re trying to worry about data synchronization, there’s a certain amount of messiness involved. How do I get data out of one and into the other? How do I worry about: ‘if an application in this cloud wrote to that data set and then I read it from a different cloud, how is that going to work?’ I think what we solve for is being the hub in the middle, where you can write data and, the moment it’s written, you can read it out.
We went to Dell Technologies World earlier this year and showed this demo live, where we had instances from Azure and instances from Amazon connecting to the same storage; one was writing and the other was immediately reading the data that was being written. Because these are all very low latency, in-the-metro-area connections, it’s almost like our storage platform helps you turn Amazon and Azure into two data centers you have across the street from each other, where you get to interconnect them in a way that makes sense for your enterprise. That’s something that, historically, wasn’t possible with public cloud. It really unlocks the things you can do with all the cloud-native services, once you have that really low latency storage linking the two together.
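A minimal sketch of what that demo boils down to, assuming the shared volume is NFS-mounted at a hypothetical path on both the EC2 instance and the Azure VM; the writer half runs in one cloud, the reader half in the other:

```python
# Cross-cloud write-then-read over a single shared volume (illustrative).
# /mnt/faction/shared is an assumed NFS mount point, present on both VMs.
from pathlib import Path
import time

SHARED = Path("/mnt/faction/shared")

def write_events():
    """Run in cloud A: append timestamped records to the shared file."""
    with open(SHARED / "events.log", "a") as f:
        for i in range(10):
            f.write(f"event {i} at {time.time()}\n")
            f.flush()
            time.sleep(1)

def tail_events():
    """Run in cloud B: watch the same file appear, record by record."""
    log = SHARED / "events.log"
    while not log.exists():  # wait for the writer to create the file
        time.sleep(0.2)
    with open(log) as f:
        while True:
            line = f.readline()
            if line:
                print("seen from the other cloud:", line, end="")
            else:
                time.sleep(0.2)
```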
I totally agree with you. Having this kind of platform, which helps you consolidate the data and actually move data around between applications and between different clouds, solves a lot of problems that are difficult to solve today with other technologies.
It also obviously helps you normalize, to a certain extent, the way that you deal with things. This is somewhat dependent on which cloud service you’re getting. Some of our clients that are leveraging Isilon like it because not only can they access the data from multiple clouds, but they can access it with multiple protocols. One client can be reading and writing NFS, another Windows client can be reading and writing CIFS traffic, and those can be spread across multiple clouds. It’s all just one workflow and it’s all consistent data that doesn’t require some extra technology for syncing, and of course, it doesn’t require you to store and pay for two copies.
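To illustrate that multi-protocol access, a small sketch from a single Linux host that has the same share mounted twice – once over NFS and once over CIFS; the mount points and share names are hypothetical:

```python
# Verify that a file written through the NFS mount is readable, byte for
# byte, through the CIFS mount of the same share. Assumed mounts, e.g.:
#   mount -t nfs  filer:/ifs/data  /mnt/data-nfs
#   mount -t cifs //filer/data     /mnt/data-cifs -o username=svc
from pathlib import Path

nfs = Path("/mnt/data-nfs")
cifs = Path("/mnt/data-cifs")

(nfs / "hello.txt").write_text("written over NFS\n")
assert (cifs / "hello.txt").read_text() == "written over NFS\n"
print("same bytes visible over both protocols")
```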
What can you share with us about the future of Faction?
There are a few things we’re continuing to work on here. We’re taking the same technology that we developed for DR to the public cloud – which we call Hybrid Disaster Recovery as a Service – and trying to broaden its availability so it works for everyone. I mentioned that our adoption so far has come from large enterprises, and part of that is because it’s a very cost-effective way of doing disaster recovery – really ‘best of breed’ in terms of what you get for the dollar – but there is a high minimum footprint. It’s not a small and medium business type of product. We’ve been evolving it so that we can provide those same advantages to enterprises of every size. That’s [some]thing we’re really looking to do.
I think the other thing, of course – and this is where things start to get really fun – is that we’re working on transformative data services. You start off with this idea of: ‘I’ve got block storage, I’ve got file storage, I can plug that into multiple clouds.’ That’s just Faction’s strike zone. What we want to do is add value on top of that storage. In other words, you’ve already got it there – what else can I do with it? I’ll give you an example. We have customers that are choosing to replicate file volumes to us for resiliency, and some of those happen to have VMware data on them.
We have a tool built into our portal that will index virtual machines that are being replicated and give you access, through our portal, to export a specific virtual machine and automatically migrate it into a cloud provider. We have this high-speed, low-latency pipe into the public clouds, so you can take a copy of a virtual machine from our replicated index, choose where you want to move it, and we’ll drop it right into your Amazon account. It turns into an Amazon Machine Image that you can turn up, and you can batch those together. That gives your team an easy workflow, essentially, for doing test/dev creation or a cloud migration. Things like that become possible just in the portal.
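Under the hood, that last step resembles AWS’s VM Import/Export flow. A sketch with boto3, assuming the exported VMDK has already been staged in an S3 bucket; the bucket, key, and role name are placeholders, not Faction’s actual pipeline:

```python
# Turn an exported VMDK into an AMI via EC2 VM Import/Export (illustrative).
import time
import boto3

ec2 = boto3.client("ec2")

task = ec2.import_image(
    Description="replicated VM from the on-prem index",
    DiskContainers=[{
        "Description": "boot disk",
        "Format": "VMDK",
        "UserBucket": {"S3Bucket": "example-export-bucket",
                       "S3Key": "exports/app-server.vmdk"},
    }],
    RoleName="vmimport",  # service role required by VM Import/Export
)

# Poll until the import task finishes and an AMI ID is available.
while True:
    status = ec2.describe_import_image_tasks(
        ImportTaskIds=[task["ImportTaskId"]])["ImportImageTasks"][0]
    if status["Status"] in ("completed", "deleted"):
        break
    time.sleep(30)

print("AMI:", status.get("ImageId"))
```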
We have a lot of ambition to extend those data services, to let you do more with the data in terms of transformation. That could be things like ETL on large data sets, or it could be migrations. It could be things like security checks – we could be monitoring the data that you’re replicating for things like virus infections. That is, I think, an ambition we have.
That sounds really cool. Offering services on top of the data that you replicate anyway is fantastic, because sometimes it’s very expensive to implement these kinds of services at home. Also, if you are collecting data from several of your customers’ on-premises locations, you become a consolidation point before the cloud. From this point of view, you have better access to more data.
I think in the long run, our ambition here is to really make this a platform. We say “platform,” and today we’re talking about the fact that we’ve got compute, storage, and network all integrated in a very specific way to make this a turnkey experience for customers. But the platform means more to us. We want to take a lot of these things and expose more APIs to end customers.
I foresee a day where it’s not just about what we build, but about us providing access to that data, where others can integrate products that tap into that multi-cloud data set – unlocking many different sorts of innovation from different organizations, leveraging our compute, network, and storage, to multi-cloudify that marketplace concept.
That’s really, really interesting. Great vision. Okay, fantastic. It’s probably time to wrap up this episode, but I’d like to get a few links where we can find Faction on the internet. And if you have a Twitter account or are on any other social network, share your handle so maybe we can continue the conversation online.
Awesome, yeah, that’d be great. Everybody will find me on Twitter. I’m old school: @mattwallace is my Twitter handle and @factioninc is the company Twitter handle. We’d be more than happy to connect – we’re always doing other things, webinars and presentations at major conferences and so on, where we also love to have conversations with folks.
Fantastic. Thank you again, Matt, for joining me today, and talk to you soon. Bye-bye.
- Subscribe to Voices in Data Storage
- iTunes
- Google Play
- Spotify
- Stitcher
- RSS