The Case Against SAN

Despite the inflammatory post title, I believe that SAN (Storage Area Network) is a great technology with numerous scenarios where it is exactly the right choice, and several scenarios exist only because of SAN’s availability.  That being said, many enterprises today use SAN without any proper strategy, architecture or engineering.  It is being chosen not because of its appropriateness to the task at hand but simply because technology managers see it as easier, or more popular, to use it broadly than to carefully evaluate each system in question on technical and financial factors.

SAN is an amazing technology that wonderfully complements virtualization, clustering and other advanced use cases.  But not every machine participates in those scenarios, and SAN has many downsides that need to be carefully considered before implementing it blindly.

SAN is Complex. Simply by choosing to use SAN we introduce another layer of complexity into the server equation.  (I am assuming server use cases here, as SAN is nearly unheard of in the desktop space.  That being said, I use SAN on my own desktop.)  Having SAN means that either your system administrators need to wear yet another hat or you need to hire and maintain a dedicated storage administration, and possibly engineering, staff.

It also means that you will probably need to source and manage a Fibre Channel network along with the associated HBAs, fiber optics and so on.  Servers that would otherwise have just three simple Ethernet connections (I’m generalizing horribly here) are suddenly up to five or more connections, making your datacenter folks oh so happy.

SAN is Expensive. Unless you opt for a shared-network SAN technology like iSCSI (or Z-SAN), SAN introduces an expensive array of proprietary networking hardware, cabling and host bus adapters.  Only after all of those expenses must we consider the cost of the SAN itself.  SAN systems are generally quite expensive and only begin to approach cost effectiveness when utilization rates are extremely high and the systems are very large.  Heavy up-front investments can make SAN difficult to cost justify even if long-term utilization rates might be high.
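
To see why utilization dominates the cost argument, here is a minimal back-of-the-envelope sketch in Python.  Every figure in it is a hypothetical placeholder, not real pricing; the point is only the shape of the math, namely that a large up-front investment makes the cost per consumed gigabyte painfully sensitive to utilization.

```python
# Hypothetical, illustrative numbers only; not vendor pricing.
SAN_TOTAL_COST = 100_000.0    # array, fabric and HBAs, up front (assumed)
SAN_CAPACITY_GB = 10_000.0    # usable capacity of the shared array (assumed)
DAS_COST = 2_000.0            # local drives plus RAID card, one server (assumed)
DAS_CAPACITY_GB = 1_000.0     # local capacity of that server (assumed)

def cost_per_used_gb(total_cost: float, capacity_gb: float, utilization: float) -> float:
    """Cost per gigabyte actually consumed at a given utilization rate."""
    return total_cost / (capacity_gb * utilization)

for utilization in (0.2, 0.5, 0.9):
    san = cost_per_used_gb(SAN_TOTAL_COST, SAN_CAPACITY_GB, utilization)
    das = cost_per_used_gb(DAS_COST, DAS_CAPACITY_GB, utilization)
    print(f"{utilization:.0%} utilized: SAN ${san:.2f}/GB, DAS ${das:.2f}/GB")
```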

SAN is Not Performant. High-speed SAN networks, massive switching fabrics and huge drive arrays all play into an expensive and mostly futile attempt to get SAN technologies to perform at or near the level of traditional direct attached storage.  During the Parallel SCSI and PATA drive era, Fibre Channel SAN had an advantage over most local drives simply because of the high performance of its networking infrastructure.  Today this is no longer the case.

Unlike shared-bandwidth technologies such as Parallel SCSI and Parallel ATA (PATA), SAS and SATA drives have dedicated, full-duplex bandwidth per device, providing greatly increased transfer rates while lowering latency.  Only the largest, most expensive high-performance SAN systems could hope to overcome this gap in technology.
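
A quick sketch makes the shared-versus-dedicated difference concrete.  The figures below are the nominal rates (an Ultra-320 SCSI bus moves 320 MB/s shared among every drive on it; a first-generation 3 Gb/s SAS lane delivers roughly 300 MB/s to a single drive); the drive counts are illustrative.

```python
# Best-case per-drive bandwidth: one shared parallel bus vs dedicated serial links.
ULTRA320_BUS_MBPS = 320.0   # one Ultra-320 SCSI bus, shared by every drive on it
SAS_LINK_MBPS = 300.0       # one 3 Gb/s SAS lane (~300 MB/s after 8b/10b encoding)

def per_drive_bandwidth(drives: int):
    """MB/s available to each drive when all drives transfer at once."""
    shared = ULTRA320_BUS_MBPS / drives   # parallel SCSI: drives contend for one bus
    dedicated = SAS_LINK_MBPS             # SAS: each drive keeps its own full-duplex link
    return shared, dedicated

for n in (1, 4, 8):
    shared, dedicated = per_drive_bandwidth(n)
    print(f"{n} drives: parallel SCSI {shared:.0f} MB/s each, SAS {dedicated:.0f} MB/s each")
```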

Typical SAN systems tend to use, in my experience, SATA devices traditionally running at 7,200 RPM, while local drives are often SAS drives running at 15,000 RPM.  Often, especially in the AMD and Intel server worlds, local drives are handled by high-powered RAID controller cards with dedicated processors and their own cache.  These cards move the cache closer to system memory, making their burstable throughput far greater than can normally be achieved in a SAN situation.

SAN is Not Easily Tunable. In most situations, SAN is managed as a single, giant storage entity.  Tuning is performed on an entire array, but little thought is generally given to small segments within an array.

Tuning for individual workloads is made nearly impossible, and definitely impractical, by the simple fact that physical drive resources are often shared and the concerns of each accessing system must be considered.  The obvious solution is to tune for “average” use, giving no special consideration to any particular system.  But if drive resources are not dedicated then we must question where the value of the SAN comes into play.

Drives located on a local machine can easily be tuned for cost and performance as needed.  Careful consideration of high-speed SAS versus large-volume SATA can be made on a volume-by-volume basis by the system engineer.  Drives can be grouped as needed into carefully chosen RAID levels: 0 for raw performance, 5 for high-speed random access with some additional reliability, 1 for good sequential access with full redundancy, 6 for additional redundancy over 5, and so on.
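
For a sense of what the engineer is trading off among those levels, here is a minimal Python sketch of usable capacity versus redundancy.  The six-drive, 1 TB-per-drive array is an assumed example, and RAID 1 is modeled as an n-way mirror.

```python
# Usable capacity and failure tolerance for the RAID levels mentioned above.
def raid_summary(level: int, drives: int, drive_tb: float):
    """Return (usable capacity in TB, drive failures survivable)."""
    if level == 0:                        # striping: all capacity, no redundancy
        return drives * drive_tb, 0
    if level == 1:                        # n-way mirror: one drive's worth of capacity
        return drive_tb, drives - 1
    if level == 5:                        # striping with single parity
        return (drives - 1) * drive_tb, 1
    if level == 6:                        # striping with double parity
        return (drives - 2) * drive_tb, 2
    raise ValueError(f"unhandled RAID level: {level}")

for level in (0, 1, 5, 6):
    usable, survives = raid_summary(level, drives=6, drive_tb=1.0)
    print(f"RAID {level}: {usable:.0f} TB usable of 6 TB raw, survives {survives} failure(s)")
```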

Drive volumes can also be isolated so that drive systems often accessed simultaneously do not share command paths.  Careful filesystem design can greatly reduce drive contention and minimize drive head movement for increased performance and reliability.

SAN is Often Political. Simply by introducing SAN to a large organization we risk introducing new management, new skill sets, new job descriptions and, inevitably, confusion and paperwork.  By separating the storage from the server we create another point of coordination, keeping the system administrator from being a single point of contact and troubleshooting for system issues.

Any time we introduce a separation of duties we introduce company politics and a chain of communication.  Instead of troubleshooting a single system when a server goes down we must, in the case of SAN, now consider the server, the SAN box and the connecting network, plus peripheral pieces like the host bus adapters and the local configuration.  What might otherwise be a simple, almost meaningless change, like adding another drive to expand a server’s capacity by a terabyte, can suddenly scale into a major enterprise issue requiring significant lead time, planning and expenditure.  And, of course, a system outage that used to take minutes to repair can easily become hours as departments seek shelter rather than simply fixing the issue at hand.

SAN uses Additional Datacenter Footprint. Because almost any server already comes with internal storage capacity, the datacenter space needed by SAN equipment is generally redundant.  Until storage needs grow beyond what fits inside the existing server chassis, the SAN storage is purely additional space consumed within what are generally cramped and overutilized datacenters.  Even when a server does need additional drive capacity, SAN is still not necessarily a good option from a footprint perspective, as many external drive array systems can be locally attached and use very little datacenter space.

SAN systems require more than simply physical space within the datacenter for their switching and storage pieces; they also require additional power and cooling.  In an era when we are fighting to make our datacenters as green as possible, SAN needs to be considered carefully with respect to its overall power draw.

SAN does not address Solid State Drives. Solid State Drive (SSD) technology poses yet another obstacle for SAN in the enterprise.  SSDs are currently much smaller in capacity than traditional spindle-based hard drives but often provide better performance at a fraction of the power consumption.  A traditional hard drive generally draws roughly fifteen watts while a standard SSD generally draws around one watt, a very significant power reduction indeed.
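
Scaled up across a drive bay, that per-drive wattage difference adds up quickly.  A trivial sketch, using the rough figures above and an assumed eight-drive server:

```python
# Rough power comparison using the per-drive figures from the text.
HDD_WATTS = 15.0   # traditional spinning drive, approximate
SSD_WATTS = 1.0    # standard SSD, approximate
DRIVES = 8         # one server's worth of bays, assumed for illustration

hdd_total = DRIVES * HDD_WATTS
ssd_total = DRIVES * SSD_WATTS
print(f"{DRIVES} drives: {hdd_total:.0f} W spinning vs {ssd_total:.0f} W solid state "
      f"({hdd_total / ssd_total:.0f}x less power)")
```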

SSDs also often have very high burstable transfer rates, which swing the performance balance even further in favor of locally attached storage.  For example, a standard Hewlett-Packard DL385 G5 server, a very popular model, has eight 3Gb/s SAS channels available to it for an aggregate of 24Gb/s, six times the throughput of the common 4Gb/s Fibre Channel SAN connection.
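
The arithmetic behind that comparison, spelled out (the 4 Gb/s Fibre Channel figure is the common link speed implied by the six-to-one ratio):

```python
# The aggregate-bandwidth arithmetic from the example above.
SAS_CHANNEL_GBPS = 3    # one 3 Gb/s SAS channel
SAS_CHANNELS = 8        # channels on the DL385 G5 in the example
FC_LINK_GBPS = 4        # a common Fibre Channel SAN connection of the era

local_aggregate = SAS_CHANNELS * SAS_CHANNEL_GBPS   # 24 Gb/s of local SAS bandwidth
ratio = local_aggregate / FC_LINK_GBPS              # six times one FC link
print(f"Local SAS: {local_aggregate} Gb/s aggregate, "
      f"{ratio:.0f}x a {FC_LINK_GBPS} Gb/s Fibre Channel link")
```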

SANs that choose to use SSDs, which is likely to take quite some time because SANs generally lean towards large capacity over performance, will suffer from a lack of available throughput, but will have the benefit of eliminating almost all of the drive contention issues mentioned earlier that stem from shared drive resources.

SAN is Confusing. While this factor comes into play less often, it still holds true that a majority of server “customers”, those people who use servers but are not the server or storage administrators, have a very poor understanding of SAN, NAS, DAS or filesystems in general, and by introducing SAN we can inadvertently introduce forms of complexity that cause communication and support issues.  While not a fault of SAN itself, in some cases this technical confusion can impede adoption even when the technology is appropriate.

Bottom Line. SAN suffers from issues of performance, organization, cost and complexity, while local storage is well understood, extremely inexpensive, simple to manage and offers extreme performance.  With rare exception, SAN, in my opinion, has little place competing with traditional direct attached storage until DAS is unable to deliver necessary features such as resource sharing, certain types of replication, distance or capacity.