Voices in Data Storage Archives - Gigaom
https://gigaom.com/show/voices-in-data-storage/

Voices in Data Storage – Episode 39: A Conversation with Russell Reeder
https://gigaom.com/episode/voices-in-data-storage-episode-39-a-conversation-with-russell-reeder/
Wed, 29 Apr 2020

In this episode, Enrico speaks with guest Russell Reeder from Infrascale.

Guest

Russ is a 25+ year tech, sales, product and branding exec. His hi-tech background ranges from start-up ventures to Fortune 500 giants like Oracle and his first programming job at Mobil Oil. Russ has managed high-growth global organizations that have transformed industries and consistently drives customer-centric performance and product innovation at scale. Leveraging his technical background combined with his successful sales and cloud industry knowledge, Russ has the unique ability to drive global growth while maintaining a diverse, fun, and strong work culture.

Throughout his career and his life beyond business, Russ has been an attentive student of innovation that drives change. He continues to hone and apply a leadership philosophy first inspired by his grandfather and then his professional mentors.

Read about his leadership philosophy on his blog, http://russreeder.com/h-e-a-r-leadership.

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 38: A Conversation with Molly Presley of Qumulo
https://gigaom.com/episode/voices-in-data-storage-episode-38-a-conversation-with-molly-presley-of-qumulo/
Wed, 15 Apr 2020

Enrico speaks to Molly Presley about hybrid cloud, data mobility, and data gravity.

Guest

Molly Presley joined Qumulo in 2018 and leads worldwide product marketing. Molly brings over 15 years of file system and archive technology leadership experience to the role. Prior to Qumulo, Molly held executive product and marketing leadership roles at Quantum, DataDirect Networks (DDN) and Spectra Logic. Presley also created the term “Active Archive”, founded the Active Archive Alliance and has served on the Board of the Storage Networking Industry Association (SNIA).

Find Molly on LinkedIn or on Twitter.

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 37: A Conversation with Rob Lee
https://gigaom.com/episode/voices-in-data-storage-episode-37-a-conversation-with-rob-lee/
Tue, 07 Apr 2020

Enrico speaks with Rob Lee, Vice President and Chief Architect at Pure Storage.

Guest

Rob Lee is Vice President and Chief Architect at Pure Storage, focused on product architecture and strategy and identifying new innovation opportunities. He joined Pure in 2013 as part of the FlashBlade founding team and led the software architecture and development for the product from concept to launch. Prior to Pure, Rob was an Architect at Oracle, where he was particularly focused on building JIT compiler and language runtime technologies for the Oracle RDBMS, as well as high-performance distributed transaction processing for Oracle Coherence. Rob has been granted over 40 patents in the areas of distributed systems, language runtimes and storage systems. Rob holds a Bachelor of Science and a Master of Engineering in electrical engineering and computer science from the Massachusetts Institute of Technology.

Find Rob on LinkedIn or at Pure Storage.

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 36: A Conversation with Anand Babu Periasamy and Jonathon Symonds on MinIO
https://gigaom.com/episode/voices-in-data-storage-episode-36-a-conversation-with-anand-babu-and-jonathon-symonds-on-minio/
Wed, 18 Mar 2020

On this episode, Enrico brings on Anand and Jonathon from MinIO to talk about object storage.

Guests

Anand Babu Periasamy

AB Periasamy is the CEO and co-founder of MinIO. One of the leading thinkers and technologists in the open-source software movement, AB was a co-founder and CTO of GlusterFS, which was acquired by Red Hat in 2011. Following the acquisition, he served in the office of the CTO at Red Hat prior to founding MinIO in late 2015. AB is an active angel investor and serves on the board of H2O.ai and the Free Software Foundation of India. He earned his BE in Computer Science and Engineering from Annamalai University.

Jonathan Symonds

Jonathan Symonds is the CMO of MinIO, where he is responsible for corporate, brand, field and product marketing. Prior to MinIO, he was the CMO of Ayasdi, a pioneer in unsupervised machine learning, and held senior marketing roles at OmniSci, Ace Metrix and 2Wire. He earned his BA from Washington and Lee University and his MBA from Cornell University.

Find MinIO at @MinIO

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 35: A Conversation with Krishna Subramanian of Komprise
https://gigaom.com/episode/voices-in-data-storage-episode-35-a-conversation-with-krishna-subramanian-of-komprise/
Wed, 04 Mar 2020

In this episode, Enrico discusses how to face the challenge of data growth in the enterprise.

Guest

Krishna Subramanian has built three successful venture-backed IT businesses and held senior leadership positions at major companies such as Sun Microsystems and Citrix. She has over three decades of industry experience in cloud computing and data management.

She is currently the co-founder and COO of Komprise, the leader in analytics-driven data management software. Komprise enables enterprises to address the two biggest problems they face with data – how to manage today’s massive data growth and how to unlock the business value of data. The software is used by leading enterprises across the globe, and IBM, HPE and AWS are strategic worldwide resellers. Prior to Komprise, Ms. Subramanian was the VP of Marketing for Cloud Platforms at Citrix, where she led desktop and cloud computing marketing and business development for three years. Prior to Citrix, she was co-founder and COO of Kaviza, which was acquired by Citrix.

Before founding Kaviza, Ms. Subramanian led the cloud computing strategy and acquisitions for Sun Microsystems as its Sr. Director Corporate Development and Cloud. She led five acquisitions for Sun during her tenure which contributed to over half a billion dollars in revenues. The acquisitions were in the areas of security, identity management, data warehousing, and cloud computing. Prior to Sun, Ms. Subramanian was Co-Founder and CEO of Kovair, a venture-backed startup that was acquired. Subramanian holds a Master’s degree in computer science from the University of Illinois, Urbana-Champaign.

You can find her on Twitter and LinkedIn, as well as on Komprise’s website, Twitter and LinkedIn.

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 34: A Conversation with John Goodacre of Bamboo
https://gigaom.com/episode/voices-in-data-storage-episode-34-a-conversation-with-john-goodacre-of-bamboo/
Wed, 19 Feb 2020

In this episode, Enrico speaks to John Goodacre of Bamboo about ARM systems and their place, both historic and future, in the data center.

Guest

John Goodacre, Professor, Chief Scientific Officer, and Co-founder, Bamboo Systems

Professor John Goodacre founded Bamboo Systems to bring the benefits of an innovative architecture, and the resulting ground-breaking, energy-efficient servers, to market. He holds a Professorship in Computer Architectures in the School of Computer Science at the University of Manchester, having recently transitioned from being the Director of Technology and Systems at ARM Ltd.

John’s career lists multiple firsts, including the realization of the first scalable commodity telephony platform, the introduction of the first real-time, online collaboration tools that shipped in Microsoft Exchange 2000, and, more recently, the design and introduction of the ARM MPCore multicore processor and associated technologies that moved ARM from the feature phone to smartphones and beyond.

Having successfully bid for a UK Industrial Strategy Challenge Fund, John has helped the UK government funding agency, UKRI, as the Challenge Director responsible for the £200M program of activities and research to fundamentally increase the security of the world’s digital infrastructure. His roles today extend across the academic, industrial and government sectors. His research interests include web-scale servers, the application of ARM technologies, exascale efficient high-performance systems and secure, ubiquitous computing. John sits on various advisory boards at both the national and European level and is a frequent conference speaker.

You can find Bamboo Systems on both Twitter and LinkedIn.

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his most recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 33: A Conversation with Vitess
https://gigaom.com/episode/voices-in-data-storage-episode-33-a-conversation-with-vitess/
Wed, 05 Feb 2020

Enrico discusses Kubernetes, databases, and persistent applications within data storage with Jiten Vaidya and Sugu Sougoumarane of Vitess.

Guests

Jiten Vaidya (@yaempiricist) is the co-founder and CEO at PlanetScale. Prior to starting PlanetScale, Jiten held various backend and infrastructure roles at US Digital Service, Dropbox, YouTube, and Google. At YouTube, he managed the teams that operated YouTube’s MySQL databases at massive scale using vitess.io.

Sugu Sougoumarane (@ssougou) is the CTO at PlanetScale and the co-creator of the Vitess project, which he’s been working on since 2010. The project has helped multiple companies run MySQL at massive scale. Prior to Vitess, he worked on various scaling and infrastructure projects at YouTube and in the early days of PayPal. Sugu is passionate about distributed systems and consensus algorithms and is proud of his contributions in this area.

Find Enrico Online

You can find Enrico on Twitter and LinkedIn. If you enjoyed this podcast, please check out his most recent report for GigaOm, “Key Criteria for Evaluating Hybrid Cloud Data Protection”.

Voices in Data Storage – Episode 32: A Conversation with Veeam
https://gigaom.com/episode/voices-in-data-storage-episode-32-a-conversation-with-veeam/
Wed, 22 Jan 2020

In this episode, Enrico speaks to Anthony Spiteri and Michael Cade of Veeam about data protection across physical and virtual data storage in the age of ransomware.

Guests

Michael Cade is a Senior Global Technologist for Veeam Software. Based in the UK, Michael is a dynamic speaker and experienced IT Professional who meets with customers and partners around the world. As an active blogger and social media persona, Michael is influential throughout the industry. His expertise and advice are sought after on Data Center technologies including Virtualisation and Storage. Michael is a leading member of multiple technology organizations, including VMware vExpert, NetApp A-Team, Veeam Vanguards, and Cisco Champions.

Anthony Spiteri is a Senior Global Technologist, vExpert, VCIX-NV and VCAP-DCV working in the Product Strategy group at Veeam. He currently focuses on Veeam’s Service Provider products and partners. He previously held Architectural Lead roles at some of Australia’s leading Cloud Providers. He is responsible for generating content, evangelism, collecting product feedback, and presenting at events.

Transcript

Enrico: Welcome, everyone. This is Voices in Data Storage brought to you by GigaOm. I am your host, Enrico Signoretti. Today we will talk about data protection. Of course, data protection is something that evolved tremendously in the last few years. We started with a physical system, moved to virtualized infrastructure, and we saw a big change there. Actually, the cloud really changed everything.

Now we have an environment that spans the public cloud, hybrid cloud, private cloud, SaaS applications, and mobile devices. It’s crazy, right? We need to protect it all, especially now with ransomware and malware of any sort. Solutions that are on the market today are somehow changing as well to cover all this and maybe more. To talk about this, I invited today Anthony Spiteri and Michael Cade. Both of them are senior technologists in the product strategy group at Veeam. Hi, guys. How are you?

Anthony Spiteri: Hello, how are you?

Before getting into the details, maybe we can start with a short introduction about yourself and what you do in the company.

Anthony: Yeah, no worries. I think I’ll start, Michael. I’ve been with the company just over three years now. I actually started almost at the same time as Michael. We both work in the product strategy group in Veeam. What that means effectively is that we’re the conduits between our research and development team and the field.

When we talk about the field, we mean internally with our sales organization and our SE organization, but also, more importantly, customers and partners. The team is responsible for product feedback directly between those two sides. It’s a pretty cool job because not only do we get to interact with our customers and partners, but we also attend most of the major conferences around the world and present at them; and locally in the region we attend lots of functions and lots of conventions as well. We’re lucky to be able to create the content and present in front of people. That’s effectively what we do. It’s a pretty good job and good to be working with Michael most of the time.

Personally with myself, I’ve been working in IT all my career. The majority [of time] I’ve been working in the service provider space. That’s my area of expertise. I’ve been lucky to work at ISPs and ASPs and hosting providers and infrastructure providers pretty much my whole career before I started at Veeam. My focus is certainly around the cloud, working with our Veeam cloud and service providers, and just seeing what’s happening in that area as well. That’s me. Michael, do you want to explain what you do?

Michael Cade: Yeah. I’m almost exactly the same as Anthony. I think another focus that really has grown as we’ve grown in our IT careers is around community. We’re members of the vExperts, Cisco Champions, all of the awesome community programs out there that are really vocal on Twitter as well with our wider community people out there. For my personal background, I’ve also been in IT since day one where I started building computers for Cambridge University in a small little village out there and then moved up into more of an infrastructure guy, so virtualization, storage – storage being probably my biggest background. As you move into these new worlds of hybrid cloud and where that data resides, that’s a big shift and a big change for me and also for the whole community.

I didn’t ask you about Veeam. I think that everybody in the IT industry knows about Veeam. Maybe a short update also about Veeam could be of help.

Michael: Like you said, Enrico, everybody has probably heard of Veeam. We’ve just ticked over 13 years. We’ve got 365,000 customers across the globe. We really started as a focus around virtualization backup specifically around VMware. Then over the last 18 months we’ve really brought in that data protection angle, built out a platform where it allows us to protect SaaS-based work within Office 365, agents for both Windows and Linux and some other older legacy-type infrastructure. We’re still focused on that backup piece, but broadening more into the other aspects of where that data can reside.

What I wanted to discuss with you guys today, as I said, is that now the world is hybrid; at least, this is the direction that many companies are taking. It’s a little bit more difficult than in the past, taking care of all these environments when we have different silos at the end. It’s not a single huge data center. Actually, we have data that is now created and consumed everywhere.

What do you see in the field from your customers asking for this kind of data protection? Are they still looking for point solutions? Do they want to protect their Office 365 in a different way than they do with their virtual machines, or are they looking for something that is more integrated on a single platform?

Anthony: I’ll take this one. I think it’s interesting. You talked about this whole new world where people are consuming different types of platforms for their applications and where they store their data. That means not only is there data sprawl, but there’s tremendous data growth as well. They both pose particular challenges.

In terms of the sprawl and the ability for organizations to back up their critical data across all platforms, what we’re seeing is people are only really reaching an inflection point now in terms of them understanding that they need to look at this from a holistic point of view. Before, they understood they had to have some VMware virtual machines, and they had to back that up. Then if they had a little bit of information in Google somewhere on Google Drive, they might think about backing that up. What normally happens is the pieces of data that were offsite, not on-premises, but maybe in cloud-based situations and platforms, they weren’t really considered as data that needed to be backed up because a lot of people thought because it was in the cloud, it’s backed up natively.

What we’ve seen specifically in the industry in the past 12 to 18 months is that organizations are becoming more aware that even if you’ve got your data sitting in a SaaS-based platform in a drive somewhere that’s cloud-based, you still need to consider backup. Just because it’s in the cloud, it doesn’t mean that it’s going to always be available. That’s one of the big things we’ve seen in the past 12 to 18 months: the realization that these different workloads still need to be backed up, in a similar way to what you would have considered your on-premises workloads.

That’s why we’ve seen tremendous growth in our Backup for Office 365 product. That’s actually become our fastest growing product of all time. The growth of that is tremendous, but it’s only been able to grow because people are more aware that they do need to back up their Office 365 data. It’s definitely a different world, Enrico, than what it was even two to three years ago.

What I am trying to understand here is more around their strategy. Do they think about the cloud, for example Office 365, as a separate silo or as a part of their entire infrastructure? I mean this because sometimes the guys that manage Office 365 are part of a totally different team or organization.

Anthony: Yeah, that’s an interesting one. My personal view on it is that at the moment it’s still relatively siloed. That’s more from an operational perspective, like you say, because there are different teams typically. The infrastructure team might not be the ones that are looking after the software-as-a-service Exchange, as an example. I think to a certain extent they are siloed, but what they do want, and what we’re finding, obviously, is a particular vendor to give them the flexibility to back up all those siloed platforms.

Holistically they’re siloed. They’re all separate. I think what organizations do want is a single platform that is going to have all the ability to back up all of their data across multiple platforms. Michael, what do you think on that?

Michael: This is a trend that’s kind of come to fruition now. People are now understanding a little bit more about what that data set is and where it needs to reside to get more efficiency with less cost, whether it’s performance or whether it’s there to provide a better business reason for it being there, a better business outcome. People need to really understand what that data is. That then determines where that data needs to reside in things like the SaaS-based products.

Really the way forward, in terms of being able to remove that headache of looking after the underlying Exchange environment and the infrastructure behind that – the operating system that runs on it, the clustering, the HA – is being able to just migrate into Exchange Online, Office 365; you take away that headache. Anthony, I know you were an Exchange admin back in the day as well. You know what that’s like.

Anthony: Yeah, for sure. That’s the change in dynamic of what it is to operate something like Exchange. It’s effectively been offloaded to the cloud now. That just changes the way in which the IT ops need to think about backup as well.

You guys come from two different parts of the world. Michael is still in Europe, and Anthony is in Australia. What do you see in these different regions? Are the adoption of these tools and these methodologies similar? There are some regions that are way ahead. We know that everything happens first in the US usually from the cloud perspective.

Anthony: It’s interesting, actually. If you talk to even VMware, us as well, they will actually say that the majority of innovation or a thirst to try new things actually happens in New Zealand first. Ironically, New Zealand seems to be leading the world not only in the fact that it gets the day quicker than everyone [else] because it’s first in the time zones, but they seem to adopt technology [more] quick[ly]. That’s reflective in the whole of Australia and New Zealand. Australia and New Zealand was always traditionally the most highly virtualized region of the world. Probably now everyone else has caught up.

Certainly in Australia and New Zealand, for some reason, virtualization took hold quickly. Then the percentage of virtualized workloads took hold even quicker. That actually was a similar pattern to the adoption of software as a service as well. It’s quite interesting here.

What’s really interesting about that though is that in the wider APJ region, which we kind of look after a little bit, the uptake of virtualization and cloud and software as a service is actually probably five years behind the rest of the world. It’s really interesting that ANZ has always been first to adopt, almost like a test bed. Then the rest of Asia has been a little bit slow, almost the last in the world. Michael, what do you think about England, the UK, and Europe?

Michael: Europe is a funny one. We can literally get across the whole of Europe in four or five hours’ worth of flight. You hit X amount of countries over the route. One of the biggest things for us is the location of where that data has resided, especially over the last couple of years, 18 months or so. It’s been around regulation of data, GDPR compliance from an EU perspective.

We have lots of little or smaller countries where we potentially have to keep data locked within the four walls of that country for regulation purposes. I think that’s a big challenge that I’ve seen especially this year. People need to understand what their data is, why we need to keep it, and then they need to understand where they’re keeping it and make sure that it’s also regulated from that perspective.

From this point of view, we can say that the US comes first. Most of the technology starts there. Then we have regions like APAC, where maybe there are some early adopters, very advanced, but mostly they fall short in wide adoption, while European regulations force us to look into new things and in general to look at data protection more closely. We have a lot of small countries, and everybody at the end of the day has to think about their local regulations, or sometimes also about the idea that they don’t want to keep data too far from their business. It’s a little bit of a patchwork that maybe we can see in every segment of IT, not only data protection. You were talking about regulations.

The advantage of being a data protection vendor is that you’re collecting data from a lot of sources in the company. What I’m seeing more and more often now is that on top of the traditional data protection tools, we are adding more. Many vendors are working on mechanisms to dig into the data and try to find information useful for business and for compliance. I know you launched something really cool around this at your corporate event in May. Maybe we can talk a little about that announcement and, in general, about what can be done when the source is backup data.

Michael: I’ll take this one. I think one of our key focuses starts around backup and recovery, but now it needs to span further. We’ve just spoken about how clouds are moving into these various different pockets of infrastructure platforms that we can leverage as a production point and be able to use those workloads. Obviously, we need to be able to back that data up. Regardless of where it is, it needs to be backed up.

It needs to be made available so that if anything does happen, if there was a problem, a failure scenario, then we need to be able to recover that. That is potentially the first bullet point in what we believe is cloud data management. Just because those workloads today sit in Azure, AWS, or on-premises in vSphere doesn’t mean that in 90 days that will still be the most relevant place or most efficient place for that workload to reside.

We’ve got to then look after the mobility of that data or that function of being able to take workloads from on-premises and push them into the cloud or back again and put them into a different cloud just based on a level of reasoning and why that is, whether it’s efficiency, like I said before, whether it’s speed, whether it’s cost, whether it’s just being able to provide more agile infrastructure for that because of Black Friday. We’re recording this actually on Black Friday.

Being able to burst that workload into the public cloud where we have ultimately infinite resources to be able to use and then on Monday or Tuesday because we’ve got Cyber Monday coming up as well, we can pull up that data, we can pull that workload back and put it back on-premises where we know we’ve got this constrained infrastructure that we’ve already paid for and we don’t have that burstable cost as well that we have to consider.

Then we get around to the areas that you mentioned, Enrico. How do we provide some sort of analytics to that data? How do we provide the ability for our end users to go into that data in an isolated fashion and be able to actually pull some insight out of that and be able to provide a better way of – we know it’s costing X amount here or it’s going to be more efficient if we run it on-premises?

Also on that same vein is the governance and compliance that we’ve also mentioned. How do we push that into an isolated environment? How can we enable people to do more with that data to then really extend or make better business choices around that as well? All of that really ties into orchestration automation in terms of being able to make those decisions or act upon those decisions or that insight we’ve gathered from the leveraging of that data.

Also, it’s interesting to see, if I look at the market now with several vendors working on how to leverage the data that you actually have, understand the data better, understand the value of all the data that you have in your company, that it comes down to two different approaches. One is to build an application to do some data management, data analytics, or whatever directly on top of the data protection platform. The other one is to give others the mechanisms, APIs, or whatever to access this data from specialized applications that are built by third parties or by the community, so that it’s not the data protection vendor that has full control. By giving it to others, maybe you can find solutions that are more focused on a market segment, because sometimes if you’re an SME, you don’t have the same needs, even for analytics, as a larger enterprise.

On the other side, by having everything integrated, maybe it’s easier to do other things. I know your stance on this because you announced that the product is in preview. I’m talking about the Data Integration API, which you announced first in May and then again last week at Tech Field Day. What is the status of this product and your expectation? What are your customers asking from it?

Anthony: This is interesting because this goes to the whole question of: once you have backup data, what do you do with it? How do you make it more valuable for the customer? The Data Integration API is the first step for us. We’ve, to be fair, made that data work before. We’ve had things like DataLabs, which allows us to instantly mount backup data for validation, just to make sure the backups have gone through. We’ve done stuff in the past to actually make the data that is backed up valuable.

Where this is going is more about activating the data. You have these massive data lakes that hold your backup files; what do you then do with them? The Data Integration API works by effectively exposing the backup data and mounting that data to some system or some platform, which can then be accessed by a third-party application to run analytics over. It’s interesting because traditionally we’re very agnostic. That’s one of our core pillars as a company.

We want to be agnostic for our customers. We don’t want to dictate what particular piece of software or hardware our customers use. We give them flexibility of choice. The Data Integration API is very much built with that in mind, because what we’re doing is mounting the data and then letting our customers choose their tool to run a check against it, to try to find ransomware or credit card information within that data. That’s what this has all been laid out for.

That’s going to be part of our v10 release, which is due very early in the new year. It’s going to be a feature of that release. Effectively you’ll get that when v10 comes out. We’re quite excited about that because we’re keen to see what our customers are going to be doing with this functionality and what it’s going to offer them.

As far as I know, there are already a few integrations coming from the community that are pretty cool. You have built this very large community across the world over time. Sometimes these little pieces of code that just save your day are pretty good, and they come for free because usually it’s open source.

Anthony: That’s right. If you look, we’ve got what we call VeeamHub, which is a GitHub page where community members can basically put their code for everything. That’s actually been around for a couple of years. It’s very mature. It’s got lots of different code examples already in there. The Data Integration API is certainly one area where we hope our customers will create new bits of code and new solutions, which hopefully we can share with the community. Like Michael said, that’s kind of where we’re at and what we’re thinking.

While we are talking about data integration API, you mentioned ransomware. This is one of the hottest topics in the industry. It looks like data protection is becoming the tool – not to prevent, but to fight ransomware. Usually when we talk about security, we think about firewalls, we think about attack surface on infrastructure. Ransomware is a very sneaky thing. You discover you’re attacked when it’s usually too late. Everything is encrypted and everything stops. It’s like a major disaster.

Anthony: It’s a huge issue. It’s got lots of public attention. Specifically, where the Data Integration API can help is that typically – you know this, Enrico, as well – ransomware basically lies dormant on systems for months, even years potentially, before it actually gets activated by some trigger in the system.

One of the good things about the Data Integration API is that you can mount your backup data from yesterday, two weeks ago, three weeks ago and effectively run it against antivirus protection or a ransomware checker to try to detect ransomware. Therefore, what you’re actually doing is detecting this dormant ransomware before it actually impacts you. That’s just a really good example of being able to activate the backup data before you get hit by ransomware. I think Michael even showed this a couple of months ago at an event that we had.

Michael: Yeah, it was actually the Cloud Field Day back in April. Whether you’ve been attacked, you think you’ve been attacked, or you’re generally just going to restore some data back into the live production system, we’ve got the ability to do what Anthony just said about triggering the antivirus scan. What we’ve also done with that same API is expose it so that when we do things like Direct Restore to an AWS EC2 instance or an Azure VM, or whether we’re taking a Hyper-V VM and moving it to vSphere, or a physical machine and moving that to vSphere, just from a recovery point of view, we can also trigger that antivirus scan.

We’re just telling the antivirus software to perform that scan. We’re not the security guys. We don’t have our own antivirus definitions. We just expose this feature out to any antivirus software that has a command line support function. We can trigger that antivirus to perform that scan and make sure that workload is not compromised before it goes back into wherever it needs to be.
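To make the pattern Michael describes a little more concrete, here is a minimal sketch in Python. It is not Veeam’s actual Data Integration API: it simply assumes a restore point has already been exposed to the operating system at a hypothetical mount path, and points a command-line antivirus scanner at it (ClamAV’s clamscan is used purely as an example of such a tool).

# Minimal illustrative sketch, not Veeam's Data Integration API: once a restore
# point has been exposed to the operating system as a mounted file system, any
# antivirus tool with a command-line interface can be run against it.
import subprocess
import sys

# Hypothetical path where the backup restore point has been published/mounted.
MOUNT_POINT = "/mnt/restore-point"

def scan_restore_point(mount_point: str) -> bool:
    """Return True if the command-line scanner reports the content as clean."""
    # clamscan exits with 0 when no threats are found and 1 when infected
    # files are detected; any CLI scanner with a similar contract would work.
    result = subprocess.run(
        ["clamscan", "--recursive", "--infected", mount_point],
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    return result.returncode == 0

if __name__ == "__main__":
    sys.exit(0 if scan_restore_point(MOUNT_POINT) else 1)

The point of the sketch is simply that the scan runs against the mounted backup content rather than against the production system, so dormant threats can be found in older restore points before anything is recovered.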

Sometimes it’s not only about the data management piece, but also the fact that you have some sort of air gap between the backup repository and the rest of the infrastructure. Even if you can’t analyze the data and something very bad happens, you can still restore your data.

One of the major problems is that ransomware is literally very smart in how it attacks the infrastructure. The first thing it tries to do is encrypt the backups so you can’t retrieve your data. It looks like this is changing. More and more vendors are trying to build immutable backup repositories to prevent malicious changes in the backup files, for example. How are you implementing these kinds of techniques?

Anthony: Enrico, it’s almost like you were there at our Tech Field Day seeing the presentation that we gave. Ahead of v10, in version 9.5 of Backup and Replication Update 4, which we released earlier this year on the 22nd of January, we released a feature called Cloud Tier. Cloud Tier effectively leverages object storage as an extension of our scale-out backup repository, to allow you to offload data from a local repository into an object storage repository. That object storage repository can be Amazon S3, it can be Azure Blob, it can be S3-compatible. Effectively what we did with the existing version that’s out is move data from the local on-premises location to the object storage. As part of v10, we’re enhancing that to add a couple of things.

The first thing is a copy mode. We’re going to basically have the ability to instantly copy the data from the local performance tier when it’s created by the backup engine and effectively copy that into object storage. What that does is it creates a whole new copy of that data in the object storage. You’ve got two distinct copies, one being offsite in object storage.

In addition to that, we’re also introducing an immutability feature that works with Amazon S3, and with S3-compatible storage that supports S3 Object Lock and versioning as well. What that does is effectively put an immutability lock on the most recent backup files. You set that as part of a policy when you create the object storage repository. If you set it for 30 days, it means that as soon as the backup files get created locally and are then copied into the object storage, they’re immutable for that set period of time, which effectively means that ransomware has no way to actually change those files that are up there. That’s one way to protect it.

It’s not ransomware protection per se in that it’s not going to stop something happening locally. You’ll still get hit locally. You’ll still have the ability to recover. Fundamentally what’s different is that once those files get in object storage, they’re locked. They cannot be altered. They can’t be deleted. They can’t be changed.

You can basically remount them, re-inject the backups into a backup server anywhere and effectively restore from that point forward. The immutability is something that’s coming in, and we’re really looking forward to that. We think it’s going to be a very big feature.
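The underlying primitive here is S3 Object Lock. As a rough illustration of that primitive, and not Veeam’s own implementation, the Python/boto3 sketch below uploads a backup file with a 30-day compliance-mode retention. The bucket and object names are hypothetical, and the bucket is assumed to have been created with Object Lock (and therefore versioning) enabled.

# Illustrative sketch of S3 Object Lock, not Veeam's implementation: upload a
# backup file with a 30-day compliance-mode retention so that object version
# cannot be altered or deleted until the retention date passes.
import datetime
import boto3

s3 = boto3.client("s3")

BUCKET = "example-immutable-backups"   # hypothetical bucket, created with Object Lock enabled
KEY = "backups/job1/2020-01-22.vbk"    # hypothetical object key

retain_until = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=30)

with open("2020-01-22.vbk", "rb") as backup_file:
    s3.put_object(
        Bucket=BUCKET,
        Key=KEY,
        Body=backup_file,
        ObjectLockMode="COMPLIANCE",             # retention cannot be shortened or removed
        ObjectLockRetainUntilDate=retain_until,  # locked until this timestamp
    )

Until the retention date passes, S3 will not allow that object version to be deleted or overwritten, which is what makes the offloaded copy useless to ransomware that manages to reach the backup environment.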

We are talking about a simple approach to problems like ransomware, without spending a huge amount of money. Creating complex infrastructure is easy for everybody; then you have to manage it, you have to pay for licenses or hardware, etc. With this kind of feature, you can get ransomware protection with just a bunch of best practices and some cloud storage. From my point of view, it’s not the large enterprise that is at risk with ransomware as much as a small company. They don’t have the budget. They don’t have the tools to do sophisticated scanning on –

Anthony: Exactly, yeah. That’s correct, Enrico. That’s why the fact that we’re simple, reliable, flexible, but also agnostic – not locked to any particular hardware – means we’re going to work with anything. We’ve made it very simple for this feature to be effectively just a checkbox: you set the policy, and then your backups are protected and immutable. You’re right; it’s going to be a very simple solution for those smaller companies.

We talked about a little bit of everything here: data protection, SaaS, data protection being at the core, and also some data management stuff being important for security and ransomware. It looks like the more we go on, the more data protection becomes a critical component of every infrastructure, even more than in the past. What do you think about the next steps? Where are we going as an industry, and what are users asking for in the next step of their data protection strategy?

Michael: I’ll take this one. I think at the moment we’re in this adoption phase where we’re seeing a lot of customers really heading into that hybrid cloud mentality. They’ve made a huge investment for that, the systems, their infrastructure, their platform that they have on-premises, yet they can see absolutely the benefit of the public cloud, the hyperscalers, and you can see a lot of the conversations that we’re having that they’re absolutely trying to leverage that platform within the hyperscalers, whether it’s Azure or Google or AWS or any of the others that are out there. I think that has to be a focus from a data protection point of view.

Next week at AWS re:Invent, we announce our ability to agentlessly protect AWS EC2 instances in a very easy, simple fashion. There will be a lot of news around that next week. Two weeks ago at Microsoft Ignite we announced exactly the same for Azure. If you look at both of the interfaces, they look exactly the same. They’re just leveraging APIs underneath that perform different functions at the relevant public clouds. That’s really going to allow us to protect those workloads natively within AWS and Azure.

The biggest point to that is yes, that allows us to protect those workloads that customers have moved up, but the format of those backups is still going to be that Veeam portable data format, the VBK format. That format can still be read by Veeam Backup and Replication on-premises, so all of the recovery options we have still apply, whether it’s Guest File Level Restore, application item restore, or even the Cloud Mobility capability we have around being able to take that data, take that workload, and convert it into an Azure VM or an AWS EC2 instance. That whole flexibility that Anthony briefly touched on is expanding more into the platform to allow us to protect those cloud-based workloads.

It feels like there’s a toe in the water around platform as a service, with people actually migrating from on-premises databases and pushing them natively into something like RDS, or SQL as a PaaS solution. We’re looking at how we protect those workloads from our customers’ point of view and researching how that needs to look; just because we did it on-premises with SQL doesn’t mean it won’t be a potentially different approach. We can’t just ‘lift and shift’. To put that into perspective, if you’ve got the traditional infrastructure guys that Anthony and I are, but we’re moving more into this cloud and cloud native with our workloads, it’s very much like the time when virtualization hit. A lot of the data protection vendors out there just took the agent that they were using to protect from a physical point of view and put that onto a virtual machine.

Veeam came along and changed the way we protect those virtual machines, in an agentless fashion. Then it’s really about how efficient and how performant we can be in those areas. I think we’re at an inflection point now where whatever we do next is not going to be the same as what we’ve done before. It has to be a different approach. Then as we move into that PaaS-type workload, you’ve got more cloud native, changing the way the actual application even looks and feels, so more containerization. We’re absolutely looking into those areas as well and into how people adopt them.

We speak quite closely to VMware’s engineers. VMware’s show this year, and I know you were there, Enrico, was very much around Kubernetes and how they’re going to provide their infrastructure operations guys, who have known vSphere for the last 10 or 15 years, with an easy button to start getting into Kubernetes and really focusing on that area. From a data protection point of view, it’s a very different function. It’s a very different look and feel. We have to be very mindful of how we protect the stateful data that comes with that.

Something that you said is really interesting. When you start looking at protecting data in different types of environments and then converting virtual machines, for example, into AMIs or whatever, that’s pretty cool. Data protection platforms are also becoming, in a way, data migration tools. In the long term, I can see this especially because of the Data Integration API, for example. You can start backing up Microsoft SQL and then convert it into something else that is available in the cloud, this kind of thing.

Again, at first sight, if you think about backup as a liability, as a way to protect your data only, that’s a little bit of a boring thing. Sorry, guys. I know you work in that field. If you think about different aspects that data protection brings to the table, we’re talking about many of them, many potential use cases on top of the data you’re protecting, then it’s fantastic. We have a lot of potential there that we can just explore with a little bit of integration and some features that maybe will become one click and go. From my point of view, it’s amazing. From this point of view, the world is really changing.

Michael: Yeah, it really is. I think you’re absolutely right, Enrico. Backing up is kind of table stakes. We need to understand what the data is and how we can back it up as fast as possible. It’s kind of table stakes.

All of our competition, we can all do it. We can all back up. The first cool part is how quickly can you recover, and where are you recovering that from? Then I think you get into: what else can we do with that?

Can we leverage that data to do something more? Can we leverage that data to migrate or potentially offer a self-service-type sandbox environment to our developers or to our security team and be able to do something along those lines? Can we offer that data out for data classification-type compliance or even just reporting so you understand what that data is and just have a little bit more focus around what is that data? Then you’ve got the whole monitoring analytics.

I think that’s really where the coolness for me is: one, not knowing what’s coming from a roadmap point of view because there’s so many different angles that we’re exploring and going down. The Cloud Mobility is exactly that. It’s workload migration, workload mobility, being able to move data wherever it needs to be based on insight that we’ve got from a company or a monitoring point of view. Then also being able to orchestrate that workload; where does it need to go because of cost, because of performance, because of XYZ? That’s the exciting part from a data management point of view. It’s not just about backup. It’s about all of those other different areas and all of these new areas that are available to our customers as well.

Anthony: Also, what makes me excited – it’s funny, Enrico, because if you had asked me three years ago whether I’d be willing to work for a backup company, I would have said “No.” The reality is that Michael and I keep very busy on a number of different angles. When I look at our technology and what it enables, it’s quite amazing. It’s not just to back up. Backup is the start. That’s what Michael has been saying with the table stakes.

We obviously have a great service provider community that offers Cloud Connect backups. They offer a cloud repository that’s very easy to back up into. They also offer replication services, so being able to replicate more Tier 1 workloads with the press of a button for DR purposes. I think that’s more than just backup.

I think one really good example of the coolness of the technology and innovation that we have is in that Cloud Tier I talked about; you mentioned migration. When the new copy mode comes out in v10, that Cloud Tier can be used for migration purposes, migrating with really small RPOs as well. Like I said, as soon as you create the backup file, a copy gets written into the object storage. From there you can recover it anywhere as well. Then using Instant VM Recovery, which is our patented technology, you can bring up those workloads instantly on vSphere or Hyper-V within seconds.

Backup in and of itself – if you’re looking at what backup is traditionally – can be looked upon as a bit bland. Like you said, it’s what you do to activate the data, it’s how you take the different technologies that a company like Veeam offers and is innovating around, and how you make them work in ways you probably wouldn’t have thought they would work for you before. That’s a really exciting part of being in data protection today, I think.

That’s good. I think we had a great conversation, but it’s, unfortunately, time to wrap up this episode. Maybe we can finish it with a few links about where we can find information about the Veeam community, you guys on Twitter, and maybe Veeam itself.

Anthony: My Twitter is @AnthonySpiteri. I also blog at Virtualization is Life!, which is AnthonySpiteri.net. Like I said, a lot of the content we put up is also on the Veeam.com/blog website. From a community perspective, I’ve got GitHub going as well. That all leads into our Veeam GitHub page, which is called VeeamHub. From a community perspective, that’s what you want to be looking at. Michael, do you want to talk about your particular –

Michael: From a Twitter point of view, you’ve got @MichaelCade1. I’m pretty active. We’re both pretty active on Twitter. Any questions about what you’ve heard today, we’d be more than happy to answer. I blog a bit over at vzilla.co.uk.

The only other resources that I’d add in there are the Cloud Field Day and Tech Field Day sessions we did. The Cloud Field Day session really went into that Cloud Mobility and the portable data format; I’m really interested in terms of how we do it different[ly] than the others. More recently, at the Tech Field Day that we’ve just done, Tech Field Day 20, we got to go into a little bit more detail around Cloud Tier in particular and the new features coming in there [that] we’ve already mentioned today, as well as our NAS backup that’s coming with the v10 release and a few other new things that are coming later on down the line that we got to mention in there.

Fantastic. I think we can call it an episode with this. Thank you again for your time today, guys. Bye-bye.

Voices in Data Storage – Episode 31: A Conversation with Eyal David of Kaminario
https://gigaom.com/episode/voices-in-data-storage-episode-31-a-conversation-with-eyal-david-of-kaminario/
Wed, 08 Jan 2020

Enrico and Eyal David go beyond data storage to discuss embracing hybrid cloud strategies and new solutions to persistent issues.

Guest

As CTO at Kaminario, Eyal David is responsible for charting Kaminario’s technical vision and is the company’s lead technology evangelist. Eyal joined Kaminario in 2009 as a developer, then software team leader. He has also held the positions of VP of Product Management and VP of Business Development at Kaminario. Prior to Kaminario, Eyal led the advanced algorithms group for a large enterprise. Eyal holds a B.Sc. in Mathematics, Computer Science and Physics from the Hebrew University in Jerusalem, and an M.Sc. in Computer Science from The Interdisciplinary Center, Herzliya.

Transcript

Enrico Signoretti: Welcome, everybody. This is GigaOm, Voices in Data Storage, and I am Enrico Signoretti, your host. Today, we will talk about something that is beyond data storage. A lot of organizations are embracing hybrid cloud strategies and they are looking for solutions that are totally different from the traditional storage array. The vision is more cloud-driven and they need new solutions to face these new challenges. To help me with this, I have invited Eyal David, CTO of Kaminario today. Hey, Eyal! How are you?

Eyal David: Hi, Enrico, thank you for having me. I’m well.

Kaminario is an upstart in the all-flash realm, but actually we will discover during the episode that they are changing and evolving into more of a cloud and data company than they were in the past. Maybe we can start with a little bit of background about you, Eyal, and the company.

Thank you, Enrico. Yes, Kaminario has been around for about ten years. I’ve been with the company since its inception. I would say that over the last ten years, the company and its vision have evolved to mirror the changes we’ve seen in the market. As time goes by, it’s clear that the impact of cloud on modern deployment models, and the expectations customers have [of] how they can transition to the cloud, are central to any IT strategy. We’ve been in the process of reshaping the company and addressing the major problems that companies see today in transitioning to the cloud.

Well, in fact, I have to say that I’ve followed the company since the very beginning, and the first product you launched was called K2. It was a record-breaking all-flash array at the time. Then you evolved a lot, not only with data services and functionalities; actually, a couple of years ago, you made a major change. You presented a new solution that was software-only.

Even more than that, the other day I was browsing your website and it’s incredible. On the home page of your website, there is no mention of ‘data storage’; it’s all data, applications, cloud, virtualization. Some of those terms are connected with storage, but none of them is storage itself. Can you give me an idea of what is happening?

Certainly, what you’ve seen on the website is reflective of the changes in the market. The challenges customers face today in building their hybrid cloud strategy are around data mobility, application performance, refactoring, and re-platforming. The ability to have true flexibility in deploying workloads where it makes sense from a business aspect requires a completely different approach from traditional storage. As much as Kaminario started its way in storage, we now focus on data virtualization. Kaminario, in essence, allows customers to encapsulate business-critical data in what we call data pods and deliver extreme performance at scale, with competitive cost efficiency, across any type of cloud, private or public.

It’s an interesting proposition. With all these companies implementing hybrid cloud strategies now, they want to think more about data and not data storage. At the same time, they want a single platform that can span different environments, so data mobility becomes a critical aspect of it, but it’s not just data mobility.

The other important aspect that I find in the field is that many enterprises, especially traditional enterprises, want the same identical platform on different clouds, so that they can find something familiar. They don’t have to retrain their system administrators, and they can migrate applications from one platform to the other and find the same identical features.

I don’t know if your platform works this way or if there is a difference, but what do you find in the field from your customers, and how does your platform help them achieve this goal?

Yes, definitely. As we talk to customers, every organization today is somewhere on their cloud journey. They’re either planning for it, in the middle of that process, or they have already done some transition into the cloud. Some have tried and not succeeded in moving some of their workloads to the cloud and are not able to benefit from the flexibility and scale of a public cloud deployment. This is where we come in. Usually, what we see is that when it comes to core mission-critical applications, that is where customers are most challenged to transition to a public cloud model.

It is often required of them, if they want to use native cloud resources, to go through a significant process of refactoring and re-platforming their applications to adhere to more cloud-like architectures. That is usually a lengthy process. We often see customers projecting two to three years of work until they can actually leverage and deploy their mission-critical workloads in the cloud.

We offer a different proposition. We offer the ability to deploy the Kaminario data plane virtualization platform. That is the underlying technology driving the ability to move the data pods I mentioned earlier across multiple clouds, without compromising on the level of data services or performance at scale; we can provide an order of magnitude better performance than the native cloud resources at a fraction of the cost.

We basically allow customers to significantly accelerate their adoption of cloud infrastructure by taking their mission-critical applications ‘as is,’ without refactoring, and moving them to the cloud; in a sense, significantly reducing the risk they run in these cloud migration projects. That also gives them access to mobility between clouds. They can choose whatever public cloud provider makes sense for their workload at any given time. They can also combine that with workloads still deployed in their private cloud. This gives a uniform set of data services that deliver performance at scale, risk reduction, and a uniform experience across any type of cloud.

This is somewhat aligned with what I see. I also find that it’s very, very complicated to maintain a consistent level of performance in the cloud. When you are used to having an all-flash array on your premises, getting the same level of performance on AWS, Azure, or whichever cloud you choose, while keeping the same platform, is very complicated. And it’s not just peak performance; most of the time, it’s about consistency.

Another problem that I see often is that, yes, we have a beautiful appliance, but the cloud is not really designed for traditional software architectures and traditional storage architectures. The way it manages failures is totally different. How do you approach these kinds of challenges?

These are definitely challenges that our customers are facing as they move to the cloud. How can they achieve their performance SLAs with native cloud resources? How do they deal with potentially having to take on data resiliency and data availability themselves, where the cloud doesn’t give the same assurances that on-prem local resources used to? In this case, we lean back on our mature stack of data services, which delivers solutions to the problems you just mentioned.

While it is only over the last few years that we separated ourselves from the hardware from a business-model perspective, from its inception our technology has always separated software from hardware and capacity from compute. This disaggregation between the ability to scale performance and the ability to scale capacity is a perfect fit when trying to solve the issues you discussed in the cloud.

If a customer wants to take a high performance, mission-critical workload that they run on their private cloud and run it on the public cloud, we can give them access to a shared data resource that can scale performance infinitely, scale capacity as you go, as needed, without hitting any of the cloud limitations. Moreover, it is also significantly more cost efficient because these days, high performance native cloud resources are extremely costly. Leveraging our data virtualization platform, they gain access to this performance at scale at a significantly lower price point.

At the same time, we also give them access to dealing with resiliency and availability. By deploying our stack across multiple zones, multiple regions, or even multiple clouds, they can design to their needed resiliency level. Leveraging a uniform set of data services, they can choose to have cross-cloud, cross-region, and cross-zone resiliency in their applications, without the need to re-architect the actual workload layer.

Well yes, but another issue that I usually see is that by expanding this kind of architecture, so having something that runs on your premises and something that is in the cloud, maybe more than one cloud, another major problem surfaces. You need to control and monitor it all, and maybe have some detailed information [about] what is really happening, especially to spot new trends or to act on an issue; that becomes a real problem. Do you have something to deal with these kinds of problems?

Yes, definitely. First, whether running in a private cloud or in the public cloud, we integrate with the local orchestration and monitoring frameworks. At the same time, we have the Kaminario Clarity platform, which is the Kaminario AIOps, analytics, and machine-learning platform that collects ongoing telemetry from across the entire install base, regardless of whether it’s deployed in a private or public cloud. It delivers, for us and for our customers, a very strong suite of reporting tools, predictive analytics, and preventative analytics to spot trends and potential future failures, direct resource management, optimize resource deployment, maintain the SLAs they need, and keep their costs from going through the roof.

I see, but again, let me play the devil’s advocate here and talk about the complexity also from a financial point of view. It’s not unusual now to find solutions from major vendors that have some components in the cloud and some components on premises. Actually, it’s very complicated to migrate these workloads, for several reasons. One is simply financial: I made a huge investment on premises; now I’m moving some of my stuff into the cloud and I have to buy new licenses or subscriptions or whatever. It’s not really convenient at the end of the day.

We’ve definitely addressed that issue as well because you are correct: a cloud migration project is complex, not only from a development perspective but also from an operational perspective, and certainly from a financing and cost perspective. We’ve gone to great lengths to simplify that process. We’ve covered how we simplify it from a development and operational perspective, by allowing customers to avoid expensive and complex refactoring of the application.

From a financial and business perspective, leveraging our ability to deliver a subscription-based software license that is tied to the ongoing usage of the customers, we actually completely decouple the licenses from the underlying infrastructure where our stack is deployed, which means that customers can enjoy full mobility of their licenses between any underlying infrastructure.

If a customer actually is going through some form of consolidation for their private cloud, and at the same time also, moving some workloads to the public cloud, they don’t actually have to waste all that investment. We leverage commodity components on-prem and we leverage “commodity components in the cloud” to deliver the robust set of data services that we provide.

The software license to run our stack is actually 100% mobile, so there’s 100% investment protection for somebody going through a consolidation and migration process with us. As they move data to the cloud, they can leverage the same set of licenses that they were leveraging on the private cloud to do that, without any need to rebuy anything.

This is very nice. You give your customers the ability to buy the license or the subscription once and then move it according to their needs, not just the needs of the day, but the needs they have as the infrastructure evolves.

Yes, I would say an additional aspect of improving the financials of such a solution is that, both in a private cloud and in a public cloud, through our analytics platform we give customers access to ongoing infrastructure optimization. If we see a Kaminario data pod running on a public cloud that is underutilizing its performance capabilities, we recommend to the customer: “Why don’t you shut down a few of the performance components of this data pod and reduce your cost without compromising your SLAs?” If the need arises and the workload requires additional performance, we can immediately spin up new compute elements, increase the performance in a linear fashion on the spot, and maintain the updated SLA. In the meantime, you can reduce your cost significantly on an ongoing basis.
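To make that rightsizing idea concrete, here is a minimal sketch of that kind of utilization check. The `DataPod` structure, thresholds, and field names are illustrative assumptions, not Kaminario’s actual Clarity API.

```python
# Minimal sketch of utilization-based rightsizing for a "data pod".
# The DataPod fields and thresholds are illustrative assumptions,
# not Kaminario's actual Clarity implementation.
from dataclasses import dataclass

@dataclass
class DataPod:
    name: str
    compute_nodes: int              # performance (compute) elements
    iops_used: float                # observed IOPS over the sampling window
    iops_capacity_per_node: float   # rated IOPS per compute element

def recommend_scale(pod: DataPod, low: float = 0.35, high: float = 0.85) -> str:
    """Suggest adding or removing compute elements based on utilization."""
    capacity = pod.compute_nodes * pod.iops_capacity_per_node
    utilization = pod.iops_used / capacity if capacity else 0.0
    if utilization < low and pod.compute_nodes > 1:
        return f"{pod.name}: {utilization:.0%} utilized; consider removing a compute element"
    if utilization > high:
        return f"{pod.name}: {utilization:.0%} utilized; consider adding a compute element"
    return f"{pod.name}: {utilization:.0%} utilized; within the target range"

print(recommend_scale(DataPod("analytics-pod", compute_nodes=4,
                              iops_used=110_000, iops_capacity_per_node=100_000)))
```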

Yes, this sounds interesting not only for enterprises but also for other kinds of organizations: MSPs, for example, or service providers in general that can borrow resources from other clouds, or resellers.

I’m pretty curious now to know a little bit more about your customers. Is it 100% enterprises? Do you sell to other markets?

Definitely. We focus on the large SaaS providers, usually focusing on data-intensive applications. At the same time, we also work closely with the managed service providers that deliver services to these types of companies. It is a common use case for us to work with a managed service provider that aggregates services to multiple enterprises.

With our data plane virtualization platform, they can themselves achieve a significantly more cost efficient model for their operations. They can run some parts of the infrastructure on-prem, some parts of it in the cloud, and they can very quickly adapt to the changing needs of their customers and spin-up new customers because they’re not tied to physical resources. If they have a new customer coming in, they can spin them up in the cloud in a matter of minutes.

I see. This leads me to another couple of questions actually. One is you mentioned SaaS providers. These guys are moving very quickly to new forms of virtualization or application development and deployment, meaning with these containers and Kubernetes. Do you have a strategy for Kubernetes?

Yes, of course. As deployment models move to cloud-like architectures and at the same time, adopt containerized workloads, usually driven by a Kubernetes orchestration, they’re still faced with similar challenges when trying to deliver resiliency and performance at scale. This is where our data plane virtualization platform can come into play. We can provide that persistent backend to any type of containerized deployment, either in a private cloud or a public cloud.

There is also the added benefit of providing cross-cloud mobility for containerized workloads while maintaining this uniform set of APIs and data services. Containerized workloads can now enjoy a higher performance SLA, improved resiliency and durability, and cross-cloud mobility.
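For readers who want to picture how a containerized workload consumes such a shared persistent backend, here is a hedged sketch of requesting a volume through a CSI-backed StorageClass with the official Kubernetes Python client. The StorageClass name `data-pod-flash` and the claim name are assumptions for illustration, not names documented by Kaminario.

```python
# Hedged sketch: a containerized workload asks Kubernetes for persistent
# storage through a CSI-backed StorageClass. The class name "data-pod-flash"
# is an illustrative assumption, not a documented Kaminario name.
from kubernetes import client, config

config.load_kube_config()  # use the local kubeconfig for cluster access

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="orders-db-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="data-pod-flash",          # assumed CSI class
        resources=client.V1ResourceRequirements(
            requests={"storage": "500Gi"}             # capacity request only
        ),
    ),
)

# The claim is satisfied by whichever provisioner backs the StorageClass,
# on-prem or in a public cloud, without the pod spec changing.
client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)
```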

Interesting. At the same time, on the other side of the spectrum, when we talk about this data platform that spans clouds and on-premises locations, it’s software at the end of the day, and there is a hardware component that you need to run your software on premises. Do you have a hardware compatibility list or best practices to help customers build their infrastructure, depending on their needs, the number of IOPS they expect, and so on?

Yes. Kaminario works to qualify and certify the underlying stack where our platform runs on. Our customers can gain access to a certified stack through our partners for on-prem deployment or through the marketplace on the public cloud providers, to deploy the needed underlying cloud resources. We do all the testing and qualification and configuration and the customers can get it delivered as a qualified solution; they do not need to build it on their own. That is a strong expectation when you’re dealing with mission-critical applications and data.

Yes, indeed, it’s a very clever way to operate. That leads to another question, about the size of the customers you can serve. This is the kind of question that I ask almost every vendor. I’m curious to understand where your solution can be applied: very large organizations, every type of organization, or also small organizations? It would be awesome if you could give me an idea of the smallest possible installation and the biggest one that you can get from Kaminario.

As we discussed earlier, the Kaminario platform delivers its data services in a way that decouples data capacity from data performance. Our solutions can scale from the low tens and hundreds of terabytes to the multi-petabyte scale and tens of petabytes. From a performance perspective, they range from a few hundred thousand IOPS to multi-million IOPS configurations. It all depends on the level of performance at scale that is needed and the performance density that a specific workload requires. We give that ultimate agility to our customers based on their workloads.

The IP we’ve developed and built over the years allows us to deliver this consistent performance at scale, which is a key requirement for these types of applications. Applications that want to deliver real-time analytics, transaction processing, for their core workloads, require performance at scale but more importantly, persistent performance at scale.

How are your customers distributed across the world?

We have presence in North America, Europe, Israel, and [are] building our business in the Far East.

Is your business based on a channel model or on direct sales?

We work in a combined model, both with channels and with a direct sales force for a certain part of our customers. Especially working with the larger customers on their hybrid cloud strategies, you see a lot of variance if they’re working directly with the vendors or through a solution provider.

How many customers do you have?

We have thousands of deployments in mission-critical environments, which is the core of our install base.

I think this conversation was very good and we got a very, very nice profile about Kaminario, especially if we think about the alignment of this company with the trends in the market. I’m also curious to know a little bit more about the future. What can we expect from Kaminario in the next 12/18 months?

First of all, thank you for having me. It’s always fun having a conversation, Enrico. Looking into the future, I think we will continue to follow the market trend of giving a cloud-like experience to any deployment location or deployment model. The Kaminario data plane virtualization platform will continue to deliver improved orchestration capabilities, specifically focusing on workload mobility and resource management, and how you deliver the improved uniform, seamless experience across any type of cloud. As time goes by, data management and data orchestration becomes a key factor in any cloud strategy and we’re building the foundations to deliver on that need.

Yes that sounds very cool. Maybe we can keep the channel open here. If you have a Twitter handle that we can share with our listeners, maybe we can continue the conversation online, if somebody has questions. You can also share the website link for Kaminario, so if somebody wants to read more about the solution, maybe ask for a demo, they can go there.

Yes, Enrico, definitely. You can find us on Twitter on @kaminarioflash and our website is www.kaminario.com. You’re welcome to come in, browse and reach out if you have any questions.

Yeah, thank you very much for the time you dedicated to me today. If anybody wants to contact me on Twitter, you can find me on @esignoretti or on gigaom.com. Bye-bye!

The post Voices in Data Storage – Episode 31: A Conversation with Eyal David of Kaminario appeared first on Gigaom.

Voices in Data Storage – Episode 30: A Conversation with Andres Rodriguez https://gigaom.com/episode/voices-in-data-storage-episode-30-a-conversation-with-andres-rodriguez/ Wed, 25 Dec 2019 13:00:41 +0000

The post Voices in Data Storage – Episode 30: A Conversation with Andres Rodriguez appeared first on Gigaom.

Enrico Signoretti and Andres Rodriguez discuss how data storage is evolving in the cloud computing era.

Guest

Andres Rodriguez is CTO and cofounder of Nasuni, where he is responsible for refining and communicating Nasuni’s technology strategy.

Andres was previously Founder and CEO at Archivas, creator of the first enterprise-class cloud storage system. Acquired by Hitachi Data Systems, Archivas is now the basis for the Hitachi Content Platform (HCP). After supporting the worldwide rollout of HCP as Hitachi’s CTO of File Services and seeing the Archivas team and technology successfully integrated, Andres turned his attention to his next venture, Nasuni (NAS Unified). Delivering value-added enterprise file services on top of cloud object storage was the natural progression of Andres’ cloud storage vision.
Before founding Archivas, Andres was CTO at the New York Times, where his ideas for digital content storage, protection, and access were formed. He joined The Times through its acquisition of Abuzz, the pioneering social networking company Andres co-founded.

Andres has a Bachelor of Science degree in Engineering and a Master of Physics degree from Boston University. He holds numerous patents and is an avid swimmer.

Transcript

Enrico Signoretti: Ciao, everybody. Welcome to a new episode of Voices in Data Storage brought to you by GigaOm. I am Enrico Signoretti and today we will talk about how data storage is evolving in the cloud computing era. I mean, there are a lot of things happening, not just in the cloud per se, but with distributed enterprises and with the fact that we don’t have the same kind of organization that we had a few years ago. Branch offices are becoming more like edge locations, and things like that. [Fewer] people [are] working in IT even though IT has more purposes than ever. Today, with me, I have Andres Rodriguez, CTO at Nasuni and co-founder of the company. Hi, Andres. How are you?

Andres Rodriguez: Very well, Enrico. Very good to be on your show.

Andres, thank you very much for joining me today. Maybe we can start with a little bit of background about you and your company.

Sure thing. I’m technical; I have a background in distributed systems. I started my career as CTO of The New York Times, then was a CTO at Hitachi Data Systems. I started my own company in object storage, which today we call ‘cloud storage,’ and then decided we needed a file system for object storage. I created Nasuni around the idea of building a global file system that would be native to object storage, which gives you some pretty formidable capabilities; specifically around the stuff you were talking about, which is how organizations have changed, are now global, and so need infrastructure that is also global.

Before the cloud, or even at the beginning of cloud, we have to remember that the first service launched by Amazon in 2006, if I remember well, was object storage. Object storage was totally different from any other storage that we had in the past, I mean block and file: block is very good for your databases, and file systems, NAS systems, are good if you access them locally, on a local area network, because those protocols are not optimized for long distances.

Object storage, on the other hand, is accessible from everywhere, especially through the S3 protocol, which is now considered the standard in this industry: it’s something that you can access with HTTP, so from everywhere. It’s easy; you don’t need specific rules for firewalls and things like that. Theoretically, it’s the perfect storage for the internet, but you know better than me, Andres, that there are some limitations to object storage.
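As a small illustration of that point, writing and reading an object is just an authenticated HTTPS call to the S3 endpoint; a minimal sketch with boto3, where the bucket and key names are placeholders:

```python
# Minimal sketch: object storage is reachable over plain HTTPS via the S3 API.
# The bucket and key names are placeholders for illustration.
import boto3

s3 = boto3.client("s3")  # credentials come from the environment or ~/.aws

# Write an object and read it back: no SAN zoning, NFS mounts, or special
# firewall rules, just HTTPS requests to the S3 endpoint.
s3.put_object(Bucket="example-bucket", Key="reports/q1.csv",
              Body=b"region,revenue\nemea,100\n")
obj = s3.get_object(Bucket="example-bucket", Key="reports/q1.csv")
print(obj["Body"].read().decode())
```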

Yes, absolutely, Enrico. I’ll tell you a good story. I built an object storage company; Hitachi first [took it as an] OEM and then Hitachi Data Systems eventually bought the company. When they made me the CTO there, they said, what do you want to do? I said, “Look, object storage is the future of storage. There is no question in my mind that, given its scale and its ability to protect the data in a distributed way, all enterprise data will eventually end up in object storage, but without file systems, IT can really do very little with it.” It’s great if you’re a developer. It’s great if you’re writing a website, supporting a website like Facebook; all those pictures of cats are stored in object storage. But it’s not good if you need access control, versions, consistency, performance. NAS serves all that stuff. When Amazon launched S3, we were all sitting there trying to convince a very advanced engineering but traditional storage company that the future was: ‘Let’s build giant object storage data centers, build a file system around that, and deliver the whole thing as a service to our clients.’

Once we saw Amazon launch S3, we pretty much all looked at each other. We read the Dynamo papers at the time and we said, “This is pretty much identical to the stuff we just built, because if you give engineers the same problem constraints, they’ll come up with some pretty similar stuff.” We had a REST API; we had many of the things that S3 broadcast into the world at that time. We said, “More big players are going to follow suit, and they’re unlikely to be the traditional storage guys,” and it’s proven to be that way.

If you look at the leaders in this market, they’re really Amazon, Microsoft, and Google. None of them were in the storage or infrastructure market before. We said, “We are going to build the enterprise-class, global file system that is portable across those three systems, so that we can bring the benefits of object storage into the gnarly world of file systems in the enterprise.”

Yeah. There is another interesting fact. You mentioned the major service providers, and what is really interesting to me is that we only started seeing file services happening in the cloud two or three years ago, almost like an afterthought. Many of these providers thought that object storage was going to rule the world. Yes, they added block storage because it was needed for their virtual machines in the cloud, but something was missing.

It’s true that when enterprises started to adopt the cloud, they started bringing their workloads, and most of their workloads are still based on files. Not only office workloads, but also application workloads. This created a few issues. Everybody started building file systems, but many of them look like something bolted on top of the object store: something with an object storage interface on the back-end but actually with a lot of limitations in scalability, performance, everything.

Absolutely. We actually tried that when I was doing my company that we ended up selling to Hitachi. You can put NFS or CIFS as a protocol on top of object storage and get by being able to put files through those protocols, but the resulting file system is going to be pretty lousy. Like you mentioned, it’s going to be slow. It’s not going to have real versioning. It’s not going to have the atomic high performance that you expect from real file systems.

The concept for our design is really to start with the object store and build a file system inside the object store, where the inodes, the metadata structures that hold the file system, are native objects in the object storage. Once you have that image in the object store, synchronizing changes back and forth to a separate edge appliance, which does both a protocol conversion and a transformation back and forth when the synchronization happens, becomes a much more elegant, much more streamlined process. At that point, you get to match the performance levels of traditional file systems in data centers, what we call NAS, NAS arrays, but you get the benefits of object storage. Those benefits are what you mentioned before.

If you have a file system built into object storage, it will scale forever. You can build protection into that file system as well. In the end, what you’re trying to do is remove the biggest limitation in file systems, which has been a limited pool of inodes. If you look at what happened when we went from monolithic arrays to distributed file systems, it was all really about trying to bring more capacity and more inodes into the file system.

Object storage can give you an infinite pool of objects, so if you map the inodes to the objects, you get infinite inodes. You get every scale constraint removed from the file system. That means you can go from millions to billions of files. You can go from terabytes to petabytes in capacity. Just as importantly, you can have an infinite number of snapshots or versions of the file system. That means the file system can protect itself, which means you can get rid of all that junk backup that really can’t support file systems once they get to a certain size.
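A toy sketch of that inode-to-object mapping may help: each inode is written as its own versioned object, so the number of files and the number of snapshots are bounded only by the object store. A dict stands in for the bucket here; the layout is an illustrative assumption, not Nasuni’s actual format.

```python
# Toy sketch: file-system metadata (inodes) stored as versioned objects.
# A dict stands in for the object store; this is not Nasuni's real layout.
import json
import time

object_store = {}  # key -> bytes, as an S3-style bucket would hold them

def put_inode(fs_id: str, inode_id: int, entry: dict) -> str:
    """Write one inode as its own object; every write creates a new version."""
    version = time.time_ns()
    key = f"{fs_id}/inodes/{inode_id}/{version}"
    object_store[key] = json.dumps(entry).encode()
    return key

def snapshot(fs_id: str) -> dict:
    """A 'snapshot' is just the latest version of each inode at this moment."""
    latest = {}
    for key in object_store:
        prefix, _, inode_id, version = key.split("/")
        if prefix == fs_id and int(version) > latest.get(inode_id, -1):
            latest[inode_id] = int(version)
    return latest

put_inode("fs1", 2, {"name": "projects", "type": "dir", "children": [3]})
put_inode("fs1", 3, {"name": "design.mov", "type": "file", "chunks": ["c0", "c1"]})
print(snapshot("fs1"))
```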

From my point of view, it’s not only scalability though. I mean, we have changed the way we create, consume, and distribute data. The teams are now distributed across huge distances. Sometimes you have teams working on the same project on different continents. It’s not just the file system because if you think about the file system, even if you have an object storage back-end that is theoretically accessible from everywhere, without a file system that is accessible from everywhere, you miss out on the story.

Very good, very good. That’s actually the third property, which is brand new. That is the difference between a distributed file system which is just a very scalable file system, something like Isilon and a global file system. A global file system is not only scalable, but you can distribute it geographically.

That changes the whole equation for DR and business continuity, because all of a sudden you can fail over from any data center or branch office to any other geographic location and you have the same synchronized file system. It also enables collaboration with end users by offering global shares, CIFS shares that are just like ordinary CIFS shares except that they behave globally; they exist everywhere. That goes from shares on edge appliances all the way to very heavy workloads like media workloads or game development: things that require hundreds of terabytes or petabytes of data to be synchronized around the world through this global file system.

What’s exciting is that not only have we resolved some of the issues that pester IT around file systems, management, backup, and all that stuff, but with the global file system we’re enabling a whole new array of capabilities that are important to the line of business.

If you can collaborate geographically with heavy file payloads, it means you can move videos around. It means you can move very large data sets around so that multiple groups can work on it. Now you have infrastructure that’s global which is what the companies are. I can’t tell you the number of very large global companies that come to us with their heads of infrastructure, the CIOs essentially saying, “Look, we’ve become global. We’re very successful, but our infrastructure still feels provincial. It works really well in one or two places around the world, but we have dozens of locations around the world where we have important projects going on and they just get the fumes of the infrastructure.” It’s slow infrastructure. It’s hard to get resources. It’s hard to plan, hard to scale. The idea of the cloud is that you can make every location around the world equally resourced without a huge amount of effort and cost.

Also, as you mentioned, this kind of file system, with everything synched in the back-end, in the object store, means that the front-end is no longer important. I mean, you can lose it. It becomes easier to back up the system or plan disaster recovery strategies, these kinds of things; especially now that teams are very distributed. It’s not just two teams working on the same project at a distance from each other; sometimes they are in very small offices in the middle of nowhere, and you don’t have IT people managing their infrastructure.

That’s exactly right. That is, in general, the benefit of SaaS. That’s one of the reasons why companies love to deploy SaaS across their vast organizations: it’s the same level of service for everyone. In the past, SaaS has been limited to just software applications that don’t require a whole lot of interaction with the end users. The cloud is changing all of that.

It’s not just Workday and Salesforce that are now SaaS offerings. Infrastructure is now a SaaS offering, and applications, full application stacks, can be SaaS offerings as well. That gives you a lot more flexibility. There are a couple of important trends that typically go along with a cloud architecture when organizations are beginning to change.

We’ve seen in the last ten years a real evolution: from ‘cloud, what? We just want to be educated, we don’t really want to do much with it, maybe put some backups in it,’ to ‘give me a cloud option for everything I want to do,’ to making the cloud the first option we consider before buying more hardware, which is where a lot of organizations are today, all the way to ‘cloud only.’ In other words: we want to take ourselves out of the data center business altogether.

Yeah. I totally agree with you. In all this dream of the cloud and having a file system distributed all across the world, there is a small issue. For example, I live in Europe. We have a lot of regulation around GDPR for example [on] data privacy, but not only that, in some countries you can’t move the data outside the country, so data sovereignty, these kinds of things. I think sometimes that everything is cool, but if you don’t have the tools to manage all of this, with policies, with the necessary rules to manage your data, then it becomes a nightmare.

You’re absolutely right that you need tools to be able to manage the compliance. There are two misunderstandings about cloud that are important to clarify when you’re trying to set up your cloud strategy, because I think a lot of people understand the benefits: scale, cost, global reach. What I think is misunderstood is the cloud is not like an ethereal magical place that exists in the actual clouds. The cloud is in data centers all around the world, physically in countries, in locations that are just being run as services by these giant service providers.

Let me give you an example when it comes to compliance. We have many clients that will have a private object store and deploy the large majority of their data there. They’ll have petabytes in this object store. They’ll be running all their file systems happily with private object stores that happen to be in North America.

Then they go to Europe and they want to be able to provision the storage and be compliant in Europe, where say in Germany, you can’t get the data out of Germany. Rather than set up a complete object storage stack in Germany themselves, they’ll go to Azure or they’ll go to Amazon AWS and use a local resident data center from those providers in those countries and be able to meet the requirements, the country requirements with public cloud.

It’s very important to understand that public cloud just gives you a very large menu to choose from in terms of how you actually localize the data. Like you say, Enrico, it’s very important to have the tools that allow you to do the management of being able to do that, but the physical plant is there and you don’t have to build it. As an organization, as a customer, you don’t have to build it.

The other misunderstanding about cloud is that you need incredible network infrastructure to get to the cloud. The opposite is true. You typically need a lot of network infrastructure when you’re doing, say, traditional backups. That’s why you see a lot of customers consume their MPLS networks with backup streams. When you go to the cloud, the cloud systems are designed to work in streaming modes, which is far more efficient than the batch-type transfers that typical backup or replication attempts to do. Or take WAN acceleration, say; CIFS or NFS over the WAN is just a nightmare for end users. By adopting a cloud architecture, you can actually go much farther in places that have much smaller pipes, even just public pipes, because the cloud is designed to be secure from the edge to the cloud without needing all these additional pieces of infrastructure to provide security.

We see a lot of clients that are going from MPLS to SD-WAN and benefiting from it. They want it because they want to get to the cloud faster from any location in the world, and there’s a lot more availability of straight internet pipes than there is of these complicated, expensive, hard-to-manage MPLS networks.

We have clients that are mining for aluminum and they will be in the Amazon jungle with one- or two-megabit pipes. As you can imagine, with that kind of infrastructure, the file server backups were crushing them and were impossible to actually achieve. With a cloud infrastructure, you’re just synchronizing all day long. Even though it’s infrastructure, it feels a lot more like the way Dropbox feels on your laptop. As soon as it’s got a little bit of air, it just reaches out and synchronizes and synchronizes, and everything is always up-to-date at your edge location.

Those are the two concepts that I think are very important. The cloud is everywhere, really physically almost anywhere in the world you need it, and the cloud can work with whatever degree of network infrastructure can reach it.
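A minimal sketch of that streaming style of synchronization: only chunks the remote side has not seen are sent, so a thin pipe stays busy with small increments instead of choking on bulk backups. Fixed-size chunks and SHA-256 digests are illustrative choices here, not any vendor’s actual protocol.

```python
# Minimal sketch of streaming synchronization: only chunks whose digest the
# remote index does not already contain are uploaded. The chunk size and
# hashing scheme are illustrative, not any vendor's actual protocol.
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB

def iter_chunks(path: str):
    """Yield (sha256_digest, chunk_bytes) for each fixed-size chunk of a file."""
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            yield hashlib.sha256(chunk).hexdigest(), chunk

def sync_file(path: str, remote_index: set, upload) -> int:
    """Upload only unseen chunks (e.g. as PUTs to object storage); return bytes sent."""
    sent = 0
    for digest, chunk in iter_chunks(path):
        if digest not in remote_index:
            upload(digest, chunk)
            remote_index.add(digest)
            sent += len(chunk)
    return sent
```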

Let me play the devil’s advocate here for a moment. We’re talking about file systems, and we know that you have a global file system, but why not use sync and share then? Some of the features look similar. At the end of the day, you have your users accessing files remotely and sharing the same volume somehow.

Right. There are two reasons for it. One is that there are tremendous architectural advantages to having a local cache distributed out to the sites where a lot of users are accessing the same files. By being able to localize the storage closer to where the users are having all their interactions, you’re gaining a tremendous performance advantage. That’s the benefit of bringing the cloud into the on-premises sites where the actual end users are sitting and doing their work.

The other benefit, and this is the one that organizations realize, is this: they’ll move their home directories to OneDrive or to Box or to something like that and everyone is super happy. Then they’ll try to do what they do with shares with those SaaS applications and everyone immediately gets very cranky. What happens is, there is a ton of glue, a ton of infrastructure in terms of how applications talk to each other. Links in Excel documents are all predicated on links through the file system, through a shared file system that all the users can see as the same shared file system. All that breaks.

There is a need, when you’re in the enterprise, to scale well beyond the capacity of what you can put in any one user’s workstation or laptop. The moment you get into hundreds of terabytes, you’re not going to have that accessible to any end user and be able to do that at scale. What happens is, you really benefit from having file servers: things where all of that power, all of that scale is being aggregated so that the users can consume it in a shared mode.

It’s very important not to confuse a SaaS product, which is what Dropbox, Box, and OneDrive essentially are, [with] infrastructure. Infrastructure, for everything we’ve come to hate about it, is something we depend on. Why do people hate infrastructure? Because it’s hard to plan for and because it’s complex to run. There are many things that are difficult about running NAS or SAN storage, or anything at the infrastructure level. However, we depend on it.

There is no way that you can run an organization without an organization-wide file system or NAS, any more than you can run virtual machines without a SAN infrastructure of some sort, a block infrastructure of some sort. Infrastructure is necessary and is not to be confused with SaaS.

Let’s take a few moments to talk a little bit more about Nasuni then. So far, we’re talking about a global file system with an object storage back-end. We have already an idea on what you do. Maybe you can go a little bit more in the details and explain how it works actually.

Sure thing. Let’s start at the edge. At the edge, you deploy these Nasuni edge appliances and they are, for all intents and purposes, very similar to any enterprise-class NAS. They have CIFS, they have NFS, they plug into AD. The goal of those edge appliances is, on the front-end, to not change anything about the way IT delivers file services into the organization; again, because of that need to keep the links, that need to keep infrastructure the same, so that everything that plugs into it today can continue to plug into it.

There [are] two massive differences, though, with these appliances. First of all, they are compact. Even if you are handling a file system that’s in the hundreds of terabytes or even petabytes, the appliance itself can be just a handful of terabytes. Now, it needs to be high-performance terabytes, because that’s how you deliver the high level of IOPS that your end users have come to expect. These IOPS can be delivered from a virtual machine, because people know how to run high-performance storage from VMs and deliver it out to VMs. That means everything you have come to depend on for virtualization, you can leverage with this model at the NAS layer.

The second piece is that the appliances themselves are all integrated into a common control plane that is actually giving you central monitoring, central configuration to all of them. As I mentioned, they can be deployed anywhere you have a hypervisor. You can deploy them on all your on-premises locations. We have many clients that will deploy on UCS around the world, but they can also be deployed in the cloud, which means you can have say, a disaster recovery site somewhere far away from your headquarters.

Say you want a recovery site for Bangladesh. You can have a local recovery site provided by AWS or Azure because, like I said before, the cloud is physically in many, many places around the world, which makes it very convenient to deploy access to your file systems there. The result of all this is that, as the appliances are all working away, the file system is being formed in the object storage layer. If it’s Amazon, it’s in S3, as we mentioned before. If it’s Microsoft, it’s going to be in Azure Blob Storage. You’re going to get access to those same files, that same file system, from all of the locations.

Just like you do today, you’re not going to have one file system; you’re going to have many file systems, to cater to compliance needs or just for management reasons. You want to partition it, but unlike today, you won’t have to back it up. You’re never going to run out of space with it and you’re never going to have to deploy another file server because you’re out of inodes or out of resources for what one file server can handle, because you’re consuming the unlimited resources of the cloud: the Amazon, Microsoft, or Google object store core.

That’s what we like to call, in the market, an edge-core architecture, because you’ve basically taken all of the problems of management, scale, and availability into the core, and you’re leaving the edge to deliver just high performance and edge availability, nothing else.
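A simplified sketch of the edge read path described above: hot data is served from a small local cache, and a miss falls through to the object-store core. The LRU cache and the `fetch_from_core` callable are illustrative stand-ins, not Nasuni’s implementation.

```python
# Simplified sketch of an edge-cache read path: serve hot files from a small
# local cache and fall back to the object-store core on a miss. The LRU cache
# and fetch callable are illustrative stand-ins, not Nasuni's implementation.
from collections import OrderedDict

class EdgeCache:
    def __init__(self, fetch_from_core, capacity_bytes: int):
        self.fetch_from_core = fetch_from_core   # callable: key -> bytes
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()             # key -> bytes, in LRU order

    def read(self, key: str) -> bytes:
        if key in self.entries:                  # cache hit: local speed
            self.entries.move_to_end(key)
            return self.entries[key]
        data = self.fetch_from_core(key)         # cache miss: go to the core
        self.entries[key] = data
        self.used += len(data)
        while self.used > self.capacity:         # evict least-recently-used
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
        return data
```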

In this architecture, I mean, you remove all the complexities at the edge, but actually these devices become expendable in the sense that if you…

Exactly. Yes. We’d like to think of it almost like a smartphone. You lose your iPhone, it’s not – yes, you’re sorry you lost it. In terms of the data…

With what iPhones cost today, that’s a big plus.

That’s an interesting analogy. Think about how we used to think about phones, especially when phones started having data in [them]. Every time you got a new phone, it was a hassle because you had to get the data somehow from the old phone to the new phone. That’s the situation most organizations find themselves in when they’re trying to migrate from that NetApp array that was bought three or five years ago to this year’s version. You have to do this bulk migration and professional services and all this nonsense.

In the cloud model, the moment you want to replace that appliance at the edge, you want to spin up a new appliance, you just resynchronize like you do with your phone today to the core services. Everything is handled behind the scenes. There is no ‘state’ in the edge appliance. There is nothing to go get from the edge appliance. It’s actually the same thing that allows the global file system to exist because the state of the global file system, it’s in the cloud core object store. The appliances are just constantly synchronizing against that shared common image of their file system.

Another advantage here is that it’s much easier to migrate data, or enable data mobility between on-prem and the cloud, because you can just deploy one of these appliances in the cloud and migrate data there to do whatever you need.

Yes. I’ll tell you, one of the very cool things that we’re doing is… remember the file system ends up in the cloud. That happens because that’s a place where you can scale and you can protect the data and you can distribute the data geographically from, but once the file system is in the cloud, it’s logically centralized in the cloud. That is, you have a logical handle on your corporate file systems in the cloud.

You had mentioned GDPR. We have a new series of services coming out from Nasuni that basically allow you to plug in, for instance, GDPR engines that look at the file system in the cloud or with your own encryption keys, your own access control, and plug directly into say, the AWS GDPR compliance engine. What that’s doing is that’s basically scanning. It could be scanning billions of files, hundreds of terabytes, petabytes of data in the cloud outside your infrastructure at a speed that is unimaginable in traditional array infrastructure. It’s because you’re looking directly at a cloud file system.

This is one of our goals as a company. For years, we’ve sold this basic infrastructure to companies that are basically doing backup and moving their files around for DR and business continuity. Our customers are now asking for better insights into their data. Instead of trying to give them some kind of analytics tool within the Nasuni infrastructure, we’re plumbers, we’re a file system company, we are giving them connectors so that they can bring their data to ‘best in class’ analytics tools; essentially transforming what has been traditional NAS and file systems in the enterprise into big data that you can access with cloud analytics tools.
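As a rough illustration of what such a connector-driven scan over a cloud-resident file system could look like, here is a hedged sketch that lists objects under a prefix and flags ones containing email addresses. The bucket, prefix, and regex are placeholder assumptions; real compliance engines like the ones mentioned above are far more sophisticated.

```python
# Illustrative sketch: scan a cloud-resident file system for personal data by
# listing objects and applying a simple pattern check. The bucket, prefix, and
# regex are placeholder assumptions, not Nasuni's or AWS's actual engine.
import re
import boto3

EMAIL = re.compile(rb"[\w.+-]+@[\w-]+\.[\w.]+")

def scan_prefix(bucket: str, prefix: str):
    """Yield object keys under a prefix that appear to contain email addresses."""
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for item in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=item["Key"])["Body"].read()
            if EMAIL.search(body):
                yield item["Key"]

for key in scan_prefix("example-filesystem-bucket", "fs1/"):
    print("possible personal data in", key)
```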

Yeah. That’s very clever. The last question I have is around licensing, because we’re talking about everything as a service, a subscription. How [does] it work for Nasuni? You have two components, the software part and the hardware appliance.

More and more we are just in the software appliance business, but our clients have always wanted to have an option. By the way, we support every hypervisor in the market, and the other trend, alongside the one I talked about before, the other change that companies are typically undergoing when they bring in Nasuni, is hyper-converged.

If you’re thinking about how to deploy a simple stack across all your locations where you just want to run VMs, the last thing you want is a very large VM full of files. They’ll deploy a compact Nasuni virtual machine on top of their Nutanix or UCS and run their NAS, their file services, that way. However, in some situations, you don’t have IT staff and you don’t have any way to support a hypervisor that’s far away. We have a special OEM program that allows our clients to access an appliance that has no hypervisor; it’s just bare-metal Nasuni code and runs that way. The virtual machine gives you a lot of advantages, a lot of things for free, including dynamic resizing and high availability.

All of our advanced features are really meant to run as software only. We’ve hardly touched on this, Enrico, but I think one of the major trends on what’s happening that the cloud is a big part of, is: ‘software is eating the data center.’ It’s eating the world and it’s eating the data center. Every single part of the stack in infrastructure is being transformed into just software with no hardware dependencies. The cloud is the ultimate ocean of resources for being able to deploy software tools because you basically have no limitation on the hardware. That’s all being managed behind the scenes. Our clients all want to go do software. They all want to do orchestration. They want to automate. They want to access everything through APIs.

The last thing they want is to run into any physical limitation or dependency when it comes to their infrastructure. You can see what’s happening: the more conservative side of the house is doing private hyper-converged, very large, infrastructure deployments. Then the people thinking five years into the future are already going to the cloud and deploying cloud only, deploying their entire data center, virtually, in the cloud, which is very aggressive by today’s standards, but it’s absolutely the way things are going to go. You want to pick tools that make software infrastructure possible. We are a software-defined NAS and, as such, we give you a strategy that allows you to go all the way from that virtual machine on-premises today to that virtual machine in a pool of virtual machines in the cloud tomorrow.

This conversation was very useful I think and thank you again for the time that you spent with me. Last thing that I want to ask you to wrap up the episode is, where we can find more about Nasuni, find some documentation, and maybe follow-up on Twitter or other social media to continue this conversation.

Absolutely. Nasuni.com has everything you want to learn about us. By the way, the name comes from NAS, N-A-S, and UNI, which means unify: bring all those headaches and all the potential of object storage into one integrated system. That’s where the name comes from. Yes, Nasuni.com. You can find everything there: technical papers and our blog, because I blog there. There’s lots of content there that’s very informative.

Okay. Thank you very much, Andres. Bye-bye.

The post Voices in Data Storage – Episode 30: A Conversation with Andres Rodriguez appeared first on Gigaom.
