VMUG (2024)

This article is about vSAN, its storage policies, and why they matter. In my time working with vSAN, I have seen those who have newly adopted this software-defined storage solution often run into issues caused by storage policies. Why do these issues arise? Generally, people don’t know that policy directly affects IOPS, throughput, and latency on vSAN clusters. Hopefully this article gives readers a better understanding of the matter and can serve as a reference in future discussions.

First, I’m going to establish some acronyms so that the data is easier to read.

FTT: Failures to Tolerate. The number of node failures within a cluster that a VM can sustain and still be functional. The available values depend on the number of nodes in the cluster: 1 (3-node minimum), 2 (5-node minimum), and 3 (7-node minimum).

FTM: Fault Tolerance Method. FTM takes one of three values. R1 is simply replication of an object. R5, although referred to as RAID, is in fact erasure coding, with data striped across 3 different nodes (not across disks inside a node) and a parity stripe on a fourth node. R6 is again erasure coding, with 4 data stripes and 2 parity stripes across six nodes.

SW: Stripe Width. Unrelated to FTT or FTM, SW dictates how many disks an object is striped across, in 1 MB stripes. The maximum value is 12.

You can learn more about these vSAN policy settings here: https://storagehub.vmware.com/t/vmware-vsan-demonstrations/create-a-vsan-storage-policy/ and here: https://storagehub.vmware.com/t/vmware-vsan-demonstrations/vsan-resilience-and-data-placement/

H: Hosts. The number of hosts per cluster.

DD&C: Deduplication and Compression
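
The host minimums and layouts in the definitions above can be collected into a small lookup. What follows is a minimal Python sketch rather than anything vSAN itself exposes: the function names are hypothetical, the pairing of R5 with FTT 1 and R6 with FTT 2 is taken from the policies tested later in this article, and the raw-capacity multiplier is simple arithmetic on the replica and stripe counts that ignores witness components and other metadata overhead.

```python
def min_hosts(ftt: int, ftm: str) -> int:
    """Minimum hosts for a policy, using the figures from the definitions above."""
    if ftm == "R1":
        if ftt not in (1, 2, 3):
            raise ValueError("R1 is described here for FTT 1, 2, or 3")
        return 2 * ftt + 1          # 3, 5, and 7 hosts
    if ftm == "R5" and ftt == 1:
        return 4                    # 3 data stripes + 1 parity stripe
    if ftm == "R6" and ftt == 2:
        return 6                    # 4 data stripes + 2 parity stripes
    raise ValueError(f"unsupported combination: FTT {ftt} FTM {ftm}")


def raw_capacity_multiplier(ftt: int, ftm: str) -> float:
    """Raw capacity consumed per unit of VM data (assumption: metadata ignored)."""
    min_hosts(ftt, ftm)             # reuse the validation above
    if ftm == "R1":
        return float(ftt + 1)       # FTT + 1 full replicas of the object
    if ftm == "R5":
        return 4 / 3                # 3 data + 1 parity
    return 6 / 4                    # R6: 4 data + 2 parity


if __name__ == "__main__":
    for ftt, ftm in [(1, "R1"), (2, "R1"), (3, "R1"), (1, "R5"), (2, "R6")]:
        print(f"FTT {ftt} FTM {ftm}: "
              f"min hosts {min_hosts(ftt, ftm)}, "
              f"raw capacity x{raw_capacity_multiplier(ftt, ftm):.2f}")
```

The multiplier makes the capacity side of the trade-off visible: erasure coding tolerates the same number of failures as the matching R1 policy while consuming less raw capacity.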

The story behind this first set of tests: a customer in the financial sector wanted to see how policies affected IOPS, throughput, and latency in vSAN. After running the first two tests, we were able to add two hosts to the cluster. Here are the tests I ran and their HCI Bench information:

HCI Bench Case Name | HCI Run Definition
vdb-8vmdk-100ws-4k-70rdpct-50randompct-2threads-1538686968 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=28800 warmup=600; For loops: None
vdb-8vmdk-100ws-4k-70rdpct-50randompct-2threads-1538719096 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=28800 warmup=600; For loops: None
vdb-8vmdk-100ws-4k-70rdpct-50randompct-2threads-1539131626 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=28800 warmup=600; For loops: None
vdb-4vmdk-100ws-4k-100rdpct-100randompct-4threads-1539299433 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=14400 warmup=600; For loops: None
vdb-4vmdk-100ws-4k-100rdpct-100randompct-4threads-1539319933 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=14400 warmup=600; For loops: None
vdb-4vmdk-100ws-4k-100rdpct-100randompct-4threads-1539220221 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=14400 warmup=600; For loops: None
vdb-4vmdk-100ws-4k-100rdpct-100randompct-4threads-1539237266 | RD=run1; I/O rate: Uncontrolled MAX; elapsed=14400 warmup=600; For loops: None
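
The case names above encode the workload parameters, so they can be unpacked when comparing runs. Here is a minimal Python sketch of that, with some assumptions spelled out: the field meanings (`ws` as working-set percentage, `rdpct` as read percentage, `randompct` as random percentage) are inferred from the names themselves, the trailing number is assumed to be a Unix timestamp, and `parse_case_name` is a hypothetical helper, not part of HCI Bench.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the case names in the table above (an assumption,
# not an official naming specification).
CASE_NAME = re.compile(
    r"vdb-(?P<vmdks>\d+)vmdk-(?P<working_set_pct>\d+)ws-"
    r"(?P<block_size>\d+[a-z]+)-(?P<read_pct>\d+)rdpct-"
    r"(?P<random_pct>\d+)randompct-(?P<threads>\d+)threads-"
    r"(?P<timestamp>\d+)$"
)

INT_FIELDS = ("vmdks", "working_set_pct", "read_pct",
              "random_pct", "threads", "timestamp")


def parse_case_name(name: str) -> dict:
    """Split a case name into its workload parameters."""
    match = CASE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised case name: {name!r}")
    fields = match.groupdict()
    for key in INT_FIELDS:
        fields[key] = int(fields[key])
    # Derived values: the write share, and the run time if the trailing
    # number really is a Unix timestamp (assumption).
    fields["write_pct"] = 100 - fields["read_pct"]
    fields["started_utc"] = datetime.fromtimestamp(
        fields["timestamp"], tz=timezone.utc
    )
    return fields


if __name__ == "__main__":
    run = parse_case_name(
        "vdb-8vmdk-100ws-4k-70rdpct-50randompct-2threads-1538686968"
    )
    for key, value in run.items():
        print(f"{key}: {value}")
```

Read this way, the first three runs are a 70/30 read/write, 50% random, 4k workload on 8 VMDKs with 2 threads, while the remaining four are 100% read, 100% random, 4k on 4 VMDKs with 4 threads.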

And here are the results:

VMUG (1)

I want to point out that in this chart, even though FTT 2 FTM R1 provides more copies to read from, it doesn’t produce more IOPS than FTT 1 FTM R1. Additionally, even though a write load was added, there is a rather marked difference in the FTM R5 and R6 workloads, as the throughput chart below also shows. Note that IOPS and throughput follow the same trend: higher numbers (which are good for both) at FTM R1, and lower numbers at R5/R6.

I’d like to state for the record that there is absolutely nothing wrong with R5/R6 erasure coding. If your workload can operate within the IOPS and throughput available with erasure coding, it’s a great way to use capacity efficiently. However, if you find that a VM’s performance with that FTM is not acceptable, I suggest FTM R1.

It’s important to draw attention to the SW setting as well. The effect is more apparent with FTT 2, and although it’s difficult to tell from these graphs, I assure you there is a very slight difference with FTT 1 as well: a stripe width equal to the number of disks in a disk group is minutely more productive than one equal to the number of disks in a host. In my opinion, if one really needs to eke out the very last bit of performance for a VM or stateful storage, the way to do it is to keep the stripe width at or under the number of disks in a disk group while using FTT 1 FTM R1.
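
As a rough illustration of that stripe-width rule of thumb, the Python sketch below clamps a requested stripe width to the size of a disk group and to the policy maximum of 12. The function and parameter names are hypothetical, and this is only a planning aid under the assumptions stated in the comments, not a vSAN API.

```python
MAX_STRIPE_WIDTH = 12  # policy maximum noted in the SW definition


def suggested_stripe_width(disks_per_disk_group: int, requested: int) -> int:
    """Keep SW at or under the disks in one disk group and the policy maximum.

    Assumption: disks_per_disk_group is whatever disk count the reader
    considers stripeable within a single disk group.
    """
    if disks_per_disk_group < 1 or requested < 1:
        raise ValueError("disk count and requested stripe width must be >= 1")
    return min(requested, disks_per_disk_group, MAX_STRIPE_WIDTH)


if __name__ == "__main__":
    # A request larger than the disk group is pulled back to the group size.
    print(suggested_stripe_width(disks_per_disk_group=5, requested=8))
```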

VMUG (2)

VMUG (3)

For Latency, we see the opposite trend, which is what we want to see, as lower is better in this case.

VMUG (4)

Those results aren’t much proof by themselves. However, I was also asked by a large agricultural supply chain company to do some testing around SPBM (Storage Policy-Based Management), with the additional tasks of checking policies with and without deduplication and compression (DD&C) and measuring rebuild times under different policies. For these tests I used the following workload at the customer’s direction: a block size of 16k, a read/write ratio of 50/50, 50% random I/O, and 24 VMs per node with a 200 GB working set size and 4 VMDKs per VM. The cluster had 8 nodes throughout testing, and the default SW of 1 was used. Tests ran for four hours. The policies tested are in the table below:

POLICY
FTT1 FTMR1 NO DD&C
FTT2 FTMR1 NO DD&C
FTT1 FTMR5 NO DD&C
FTT2 FTMR6 NO DD&C
FTT1 FTMR1 DD&C
FTT2 FTMR1 DD&C
FTT1 FTMR5 DD&C
FTT2 FTMR6 DD&C
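
When charting results like these, it can help to turn the policy labels in the table above into structured fields, so runs can be grouped by FTT, by FTM, or by whether DD&C is on. The Python sketch below does that; `Policy` and `parse_policy` are hypothetical names introduced for illustration, and the pattern only covers the label style used in this article.

```python
import re
from typing import NamedTuple


class Policy(NamedTuple):
    ftt: int      # failures to tolerate
    ftm: str      # "R1", "R5", or "R6"
    ddc: bool     # True when deduplication and compression is on


# Accepts both "FTMR5" and "FTM R5", with an optional "NO" before "DD&C".
LABEL = re.compile(r"FTT(\d)\s+FTM\s?(R[156])\s+(NO\s+)?DD&C$")


def parse_policy(label: str) -> Policy:
    """Parse a label such as 'FTT1 FTMR5 NO DD&C' into a Policy."""
    match = LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised policy label: {label!r}")
    return Policy(int(match.group(1)), match.group(2), match.group(3) is None)


TESTED = [
    "FTT1 FTMR1 NO DD&C", "FTT2 FTMR1 NO DD&C",
    "FTT1 FTMR5 NO DD&C", "FTT2 FTMR6 NO DD&C",
    "FTT1 FTMR1 DD&C", "FTT2 FTMR1 DD&C",
    "FTT1 FTMR5 DD&C", "FTT2 FTMR6 DD&C",
]

if __name__ == "__main__":
    policies = [parse_policy(label) for label in TESTED]
    # Example grouping: erasure-coding policies that also have DD&C enabled.
    both = [p for p in policies if p.ftm in ("R5", "R6") and p.ddc]
    print(both)
```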

Here are the results of those tests. Specifically, I want you to look at the trends based on the policy used. Notice that the same curve pattern appears from policy to policy. Also notice what occurs when you combine FTM R5/R6 with deduplication and compression. Personally, I would either use FTM R5/R6 on a per-VM basis or turn on deduplication and compression, but not both.

VMUG (5)

VMUG (6)

VMUG (7)

We also see the same trends in latency and the impact of utilizing DD&C with FTM R5/6.

I also observed what effect policy has on rebuild times for an 800 GB VM. FTT and FTM both altered rebuild time. The policies tested were:

FTT1 FTM R1 DD&C
FTT1 FTM R5 DD&C
FTT2 FTM R1 DD&C

And the results:

VMUG (8)

I’m the first to admit it’s not a very large data set. It would be great if someone could prove my viewpoint incorrect. However, in the multiple HCI, vSAN, and VxRail projects I’ve been involved with, this approach to adjusting policy has been successful for me. So please, do your own testing with policies and share your results. I eagerly await them.


Author: Rueben Jacobs
