Reducing latency on Equallogic storage with VMware vSphere
I had an older Equallogic PS6000E SAN, configured for RAID 6, attached to a couple of vSphere hosts. Made up of 1TB 7200 RPM SATA disks, it wasn't exactly built for performance, and I would often see it top out on IOPS for long periods in SAN HQ. After a bit of shuffling in our other datacenter, I freed up a PS6000XV SAN (600GB 15,000 RPM disks in RAID 10) and added it to the same pool to take advantage of the auto-tiering capabilities and boost the performance of the SATA SAN. My problems with IOPS were solved, but read latency remained stubbornly high. As I spent more time looking at the graphs, I realized that, strangely, latency was highest when IOPS were lowest, which is the opposite of what you'd expect. Shouldn't requests be answered faster when there is less work to do?
I did a bit of Googling and decided to re-read Dell's Best Practices for VMware guide for Equallogic storage. Buried inside are two very helpful tips that I don't remember being there years ago, when I set up those SANs for the first time.
The important bits are on pages 9-11. The section on Delayed ACK describes EXACTLY what I was seeing, so I disabled it, along with Large Receive Offload (LRO) for good measure. Note that these changes require a reboot of your hosts, but that's what we have vMotion for, right?
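For anyone who would rather script the change than click through the vSphere Client, here is a minimal sketch of what the two settings could look like as esxcli calls. It is not taken from the Dell guide. The adapter name vmhba33 is a placeholder, and the DelayedAck key name and the LRO option path are assumptions that should be checked against your ESXi version (for example with "esxcli iscsi adapter param get -A <adapter>") before running anything. By default the script only prints the commands.

```python
import subprocess


def build_commands(iscsi_adapter):
    """Return the esxcli invocations as argument lists.

    iscsi_adapter is the software iSCSI adapter name, e.g. "vmhba33"
    (a placeholder; look up the real name on each host).
    """
    return [
        # Disable LRO for the VMkernel TCP/IP stack (0 = disabled).
        ["esxcli", "system", "settings", "advanced", "set",
         "-o", "/Net/TcpipDefLROEnabled", "-i", "0"],
        # Disable Delayed ACK on the iSCSI adapter.
        # Assumption: the parameter key is "DelayedAck" and is settable here.
        ["esxcli", "iscsi", "adapter", "param", "set",
         "-A", iscsi_adapter, "-k", "DelayedAck", "-v", "false"],
    ]


def apply(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: shows what would be executed on the host.
    apply(build_commands("vmhba33"))
```

Put the host in maintenance mode first, run with dry_run=False only after confirming the option names, and reboot afterwards as noted above.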
As you can see in the graphs below, the improvement in my read latency was immediate and pretty stunning. If you are experiencing high latency during periods of relatively low IOPS with your Equallogic SANs, then definitely give this a try.