
Is Scale-Out File Storage the New Black?

Storage vendors, especially startups, are finally supporting new media types in their systems. In this article I want to talk about the impact that quad-level cell (QLC) NAND memory and Intel Optane have on the development of scale-out systems, and how vendors like VAST Data are taking full advantage of them in their product design.

In my recent GigaOm report about scale-out file storage (Key Criteria for Evaluating Scale-Out File Storage), I mentioned several key criteria that can help users evaluate this type of storage system. Today I want to explore specific solution functionality that can make a difference in a scale-out storage deployment, and how it should impact your thinking.

Key Criteria, Considered

Before I go further, however, I want to dive into the structure of our GigaOm Key Criteria reports and how they help inform decision making. For each sector we assess, the Key Criteria report explores three sets of criteria specific to that sector. They are:

Table stakes are solution features or characteristics that you should be able to take for granted in a product class. For example, if the class were cars, things like infotainment systems and air conditioning are now standard in every car, and therefore are not generally significant in driving a decision.

Emerging technologies are all about what’s next—what features or capabilities can you expect to emerge in the near future (12-18 months). For cars, that might be the first implementation of autonomous driving for highways. I honestly don’t know—I’m not an expert in car technology.

Key criteria are the core of the report and address features and characteristics that really make a difference in assessing solutions. In a car, this might be the use of an electric powertrain or advanced safety devices to protect you and pedestrians from accidents.

In addition, the Key Criteria report breaks down a number of evaluation metrics, which are broad, top-line characteristics of solutions being evaluated, and are helpful in comparing the relative value of solutions in specific areas. For cars, these top-line characteristics might be performance, comfort, fuel efficiency, range, and so on.

These reports finish with an analysis of the impact that each of the key criteria has on the high-level evaluation metrics. This indicates whether a particular solution might meet your needs. For example, an electric powertrain may have a significant, positive effect on metrics like comfort, performance, and fuel efficiency, while its impact on vehicle range is less beneficial.

In the Key Criteria report for scale-out storage, the important differentiating key criteria we analyzed were:

  • Integration with object storage
  • Integration with the public cloud
  • New flash-memory devices
  • System management
  • Data management

Now I want to focus on one of these key criteria, new flash-memory devices, and give you an example of a vendor that has interpreted this important criterion in a very innovative and efficient way.

Flash Memory and Scale-Out

In the report, I explore the impact of flash memory storage on scale-out systems in depth. The following excerpt comes directly from my GigaOm report, “Key Criteria for Evaluating Scale-Out File Storage.”

Modern scale-out file storage is used for a growing number of workloads that need both capacity and high performance. The data stored in these systems can be anything from very large media files down to files a few kilobytes in size. Finding the right combination of $/GB and $/IOPS is not easy, but in general, active data is only a small percentage of the total capacity. As a consequence, flash memory is becoming the standard for active workloads, while object storage (either an on-premises, HDD-based store or a cloud object store) houses inactive data. This two-tier approach is explained in this GigaOm report.

Hard drives still provide the best $/GB ratio, but to stay performance-competitive with solid state flash media solutions, HDD vendors are developing new technologies and access methods to close the gap. Unfortunately, these approaches add complexity that can actually make HDD utilization in standard enterprise arrays more and more difficult.

On the other hand, even though MLC and TLC flash memory prices have been falling steadily for quite some time, the $/GB ratio remains too high to satisfy the capacity requirements of some applications. Meanwhile, performance of standard flash memory falls short for applications that require accessing data at memory speed without the cost, persistence and capacity limitations imposed by RAM. To address this price/performance conundrum, the memory industry has come up with two solutions:

Quad-level cell (QLC) NAND memory is the next iterative leap in the evolution of flash memory, thanks to new manufacturing techniques that stack up to 96 layers of cells in the same chip (and even more for the next generation). This density enables vendors to build very cheap and high-capacity storage devices.

Unfortunately, these devices have a few important drawbacks. They have very low write endurance (up to just 500 write cycles) and a much lower write speed than MLC SSDs. In comparison, an MLC NAND device can endure between ten and twenty thousand write cycles, and its write speeds can be up to four times higher than those of QLC. Even so, the combination of lower $/GB ratios and higher media density and efficiency can significantly impact the overall TCO of a solution. Not all architectures are ready for QLC 3D NAND, however: how it is implemented makes the difference in terms of efficiency, TCA, and TCO.
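To make the endurance gap concrete, here is a minimal back-of-the-envelope sketch in Python. The cycle counts come from the figures above; the drive capacity, daily write volume, and write amplification factors are hypothetical values chosen only for illustration. The model itself (capacity times rated cycles, divided by write amplification) is a simplification that ignores over-provisioning and other vendor-specific details.

```python
def host_writes_tb(capacity_tb: float, pe_cycles: int,
                   write_amplification: float = 1.0) -> float:
    """Approximate host data (in TB) a device can absorb before its rated
    write cycles are used up. Simplified: wear is assumed to be perfectly
    even and over-provisioning is ignored."""
    if capacity_tb <= 0 or pe_cycles <= 0 or write_amplification <= 0:
        raise ValueError("all inputs must be positive")
    return capacity_tb * pe_cycles / write_amplification


def lifetime_years(capacity_tb: float, pe_cycles: int, daily_writes_tb: float,
                   write_amplification: float = 1.0) -> float:
    """Years until the rated cycles are exhausted at a constant daily write load."""
    if daily_writes_tb <= 0:
        raise ValueError("daily_writes_tb must be positive")
    total = host_writes_tb(capacity_tb, pe_cycles, write_amplification)
    return total / daily_writes_tb / 365.0


if __name__ == "__main__":
    CAPACITY_TB = 10.0      # hypothetical drive size
    DAILY_WRITES_TB = 2.0   # hypothetical host write load
    scenarios = [
        # (label, rated write cycles, assumed write amplification)
        ("QLC, unoptimized writes", 500, 4.0),
        ("QLC, well-shaped writes", 500, 1.0),
        ("MLC, unoptimized writes", 10_000, 4.0),
    ]
    for label, cycles, waf in scenarios:
        years = lifetime_years(CAPACITY_TB, cycles, DAILY_WRITES_TB, waf)
        print(f"{label}: about {years:.1f} years")
```

Under these assumptions, the write pattern matters as much as the media: lowering write amplification stretches QLC lifetime by the same factor, which is why the way QLC is implemented has such a large effect on TCO.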

Memory-class storage (e.g., Intel Optane) is gaining traction in the enterprise storage market. This new kind of device bridges the gap between DRAM and NAND memory in terms of latency, cost, and features. It is not as fast as RAM, but it reduces the need to access slower storage and, when used in persistent mode, improves overall system response and availability.

In addition, storage vendors can use this media for caching and as a landing zone for hot data for performing operations like compression, optimization, and erasure coding. It is also useful for fast handling of metadata. In any case, the user can expect a general improvement in performance, at reasonable cost.

These new classes of memory do not remove HDDs from the game. Spinning media remains viable as near-line tier storage in modern systems and can be effective in large systems that consolidate multiple workloads on a single storage system, providing a capacity tier at low cost. That said, organizations are moving away from hybrid configurations in favor of all-flash storage systems associated with an external object store, or they take advantage of the cloud to develop the necessary capacity.
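The economics of the two-tier approach described in this excerpt can be sketched with a simple weighted average. This is only an illustration: the function name and every price and percentage below are hypothetical, not figures from the report.

```python
def blended_cost_per_gb(active_fraction: float,
                        flash_cost_per_gb: float,
                        object_cost_per_gb: float) -> float:
    """Average $/GB when the active share of data sits on flash and the
    rest sits in an object store (on-premises or cloud)."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    inactive_fraction = 1.0 - active_fraction
    return (active_fraction * flash_cost_per_gb
            + inactive_fraction * object_cost_per_gb)


if __name__ == "__main__":
    # Hypothetical inputs: 10% of the data is active, flash costs $0.50/GB,
    # and the object tier costs $0.02/GB.
    cost = blended_cost_per_gb(0.10, 0.50, 0.02)
    print(f"Blended cost: ${cost:.3f}/GB")
```

Because active data is usually a small share of total capacity, the blended figure stays much closer to the object-store price than to the flash price, which is the rationale for pairing an all-flash front end with an external capacity tier.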

Impact Analysis: VAST Data

VAST Data, a startup that is doing really well with its scale-out file storage solution, has implemented Intel Optane and QLC NAND synergistically in its system with great results. Intel Optane is used as a landing area for all data. This really isn’t a cache but more a staging area where data is prepared, chunked, and organized to minimize write amplification and optimize operations in the QLC back end.
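The general idea of a staging area that reshapes writes can be illustrated with a toy Python sketch. This is not VAST Data's implementation: the class name, the stripe size, and the in-memory list standing in for the QLC back end are all assumptions made for illustration. It shows only the basic principle of absorbing small writes on fast media and passing them on in large, fixed-size units.

```python
class StagingBuffer:
    """Toy model of a fast landing area in front of a slower capacity tier.

    Small incoming writes accumulate in the buffer; only full,
    fixed-size stripes are handed to the back end."""

    def __init__(self, stripe_bytes: int) -> None:
        if stripe_bytes <= 0:
            raise ValueError("stripe_bytes must be positive")
        self.stripe_bytes = stripe_bytes
        self._pending = bytearray()             # data not yet flushed
        self.flushed_stripes: list[bytes] = []  # stand-in for the back end

    @property
    def pending_bytes(self) -> int:
        return len(self._pending)

    def write(self, data: bytes) -> None:
        """Accept a write of any size and flush every complete stripe."""
        self._pending += data
        while len(self._pending) >= self.stripe_bytes:
            stripe = bytes(self._pending[:self.stripe_bytes])
            del self._pending[:self.stripe_bytes]
            self.flushed_stripes.append(stripe)


if __name__ == "__main__":
    buf = StagingBuffer(stripe_bytes=8)  # tiny stripe, for readability only
    for _ in range(5):
        buf.write(b"abc")                # five small 3-byte writes
    print(len(buf.flushed_stripes), "stripe(s) flushed,",
          buf.pending_bytes, "byte(s) still staged")
```

In this sketch the back end never sees a partial write. That property is what lets a real system limit write amplification on media with few write cycles to spare.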

By implementing this data path, together with innovative data compaction and protection techniques, the system can achieve an impressive $/GB while maintaining throughput at the front end. This makes the solution a good fit for workloads that need both capacity and performance at a reasonable cost. This is an oversimplification, perhaps, but I encourage you to check out the videos recorded during Storage Field Day 20 to get a good grasp of the VAST Data architecture.

As mentioned earlier, the impact that key criteria have on evaluation metrics provides a better understanding of the solution and, ultimately, how well the product might fit your needs.

Table 2 shows the high impact that new flash memory devices have on performance, scalability, system lifespan, TCO and ROI, and flexibility. Only usability is minimally impacted. This means that if you place value on the metrics that are highly impacted by this key criterion, you should put VAST on your short list for a scale-out storage evaluation.

Figure 1. The Impact of Key Criteria on Evaluation Metrics

Closing the Circle

The research at GigaOm extends beyond the Key Criteria report, which sets the table for the detailed market sector analysis in our GigaOm Radar reports. The Key Criteria report is essentially a structured overview of a product sector, while the Radar report presents a market landscape that summarizes all the key criteria and metrics evaluations for each vendor and positions them together on a chart.

One of the most important characteristics of the Radar report is that it is technology focused and doesn’t really take into account market share. This may sound odd when compared to analysis available from other sources, but our commitment is to put the technology first, and to help IT decision makers understand both what a solution can do for their organizations and where its development is going. As I often say, it is a forward-looking approach rather than a conservative, backward-looking one. Which approach would you prefer to inform your IT organization?

As you can see in the diagram below (Figure 2), VAST is well-positioned in the Leaders circle of the GigaOm Radar report chart and was graded as an Outperformer. This reflects the company’s aggressive forward movement, not just in its flash-memory implementation but also in shaping its solution to solve challenges with unstructured data. VAST is a storage vendor that is worth a look.

Figure 2. GigaOm Radar Chart for Scale-Out File Systems



from Gigaom https://gigaom.com/2020/09/14/is-scale-out-file-storage-the-new-black/
