Man Bites Dog – Why IBM (and everyone else) should fear EMC’s acquisition of Isilon Systems

Firstly, apologies for the length of time since my last post.  A combination of family, work and late vacation hasn’t left much time for blogging in the last few months.

The recent fun and games with Dell and HP nearly brought me back to the keyboard – okay, so the technology was wrong (3PAR, really?), but I did say HP needed to do SOMETHING about the EVA, so half points for trying, maybe? 🙂

I have to say though, I was at a Dell partner event recently, and the Dell guys seemed pretty relaxed about the whole thing – consensus seemed to be that they dodged a bullet (and a repeat of EMC’s black eye when they bought RSA for what the market thought was “too high” a price a few years ago) by not staying in to the bitter end.  The market seems to agree with them (Dell shares have been rising steadily since they stepped aside), which I suppose in the end is the important thing really.

But while I was away, I had the chance to evaluate a number of up-and-coming technologies, one of the best of which came from a small company named Isilon Systems.  These guys have an excellent story but seem to have been ignored by large sectors of the market (up until this week, that is).

So I saw the product, went away with my head full of the potential, started talking to my sales people and getting things lined up, and had half an article drafted which would have made me look like some kind of prophet if I’d published it last week (or maybe got me arrested on suspicion of espionage – but hey, them’s the breaks :-))

Because now, from being a complete unknown, Isilon is suddenly in the spotlight as EMC’s next acquisition target, with a number of tech journals and media outlets twittering on about the “perfect fit” with EMC’s product set and how “complimentary” it would be.

So let me politely respond:

I don’t think so.

Isilon is in no way complementary to EMC’s current product line.  It’s a competitor to many of EMC’s core products, and it has the capability to do to EMC’s lineup what XIV has done inside IBM (think hot knife through butter).

To back up this assertion, let’s consider a few things:

  1. EMC’s mid-range technology is based on variations of the Clariion platform – a clunky dual-processor architecture from last century, which more forward-looking vendors are replacing with scalable grid architectures that look spookily similar to Isilon’s.
  2. The Celerra NAS platform, the closest thing EMC currently has to Isilon, has spent 10 years developing from a “dire box of misery” (actual engineer quote) to its current, actually pretty usable form – but let’s be honest, it’s never going to move NetApp from the top spot.  EMC need a revolution in this space, and the Celerra/Clariion combo is never going to be that revolution, no matter how integrated it gets.  Isilon is built from the ground up for competitive NetApp takeout and forms a far more credible proposition.
  3. Data Domain, EMC’s other recent acquisition, uses a variation on the same Xyratex-manufactured storage server as its basic hardware.  Rather than moving Data Domain onto the Clariion platform, it would surely be less painful to integrate the Data Domain architecture with the Isilon stack (this would also plug one of the more painful holes in the Isilon array itself – more on this later).
  4. Centera, EMC’s archiving product, is being left further and further behind – software vendors provide better resilience and compression/deduplication, and it looks old and tired compared to products such as HDS’ HCP family.  Replacing it with a product combining Centera’s compliance capabilities with big disks would get EMC back in this game (similar to what NetApp have done with SnapLock – provide Tier 4 storage on an existing array, rather than force a new architecture).

If you strip away the Isilon File System, what is left looks suspiciously like a better version of my favourite product of the last year, the IBM XIV.  Based on the same Xyratex hardware platform, Isilon has the same scaling of processor, cache and disk capacity that makes XIV such fun to design with, but at the same time, Isilon has some excellent unique points:

  • 8Gb/s InfiniBand back-end network
  • Scale to 144 modules (nodes) in a single manageable environment – slightly more than XIV’s current limit of 15 modules per manageable system.  Multi-rack scalability is something XIV has promised since the original rollout presentation, but has yet to deliver.
  • Virtualised equivalent to RAID 5 (N+1 protection level)
  • Virtualised equivalent to RAID 6 (N+2 protection level)
  • Virtualised equivalent to protection levels the market doesn’t have a name for yet – N+3 and N+4 (a rough capacity sketch follows this list).
  • Seriously, if there’s a snappy marketing term for these out there (other than “holy sh*t that’s a lot of protection”) then I want to know about it.
  • Choice of disks and enclosure types – choose from SSD (SSD!!!!), SAS (SAS!!) and SATA (oh well) to give a choice of performance and capacity
  • Diskless enclosure gives a performance boost without having to increase capacity – one of my big issues with XIV (#4, I think) was the need to scale capacity to get performance: if you wanted 50,000 IOPS but only 5TB of capacity (an actual customer of mine had this), you bought 79TB and liked it (or else).
  • Starting point of 10TB (3 modules) – technically you could start with less, but you don’t get parity protection till you have 3 modules, so this is the best place to start – still less than XIV’s 6-module, 27TB starting point.
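For the curious, the capacity cost of those N+M protection levels works out with simple parity maths.  A rough sketch follows – a generic Reed-Solomon-style approximation of my own, not Isilon’s actual OneFS per-file layout, and the stripe width and node capacity are illustrative assumptions:

```python
# Generic N+M parity maths: M parity units per N data units per stripe.
# OneFS's real per-file protection is cleverer than this approximation.

def usable_fraction(n_data: int, m_parity: int) -> float:
    """Fraction of raw capacity left after parity overhead."""
    return n_data / (n_data + m_parity)

RAW_TB = 144 * 3.3   # hypothetical: 144 nodes at ~3.3TB apiece

for m in (1, 2, 3, 4):          # N+1 through N+4
    n = 16 - m                  # assume 16-wide stripes for illustration
    frac = usable_fraction(n, m)
    print(f"N+{m}: survives {m} failure(s), "
          f"~{frac:.0%} usable (~{RAW_TB * frac:.0f}TB of {RAW_TB:.0f}TB raw)")
```

The point being that even the “holy sh*t” tiers only cost a few extra percent of raw capacity – a very different trade-off from adding whole extra mirror copies.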

Of course, it’s not perfect yet:

  • One of my more backup-focused colleagues poked a few holes in Isilon’s NDMP-based backup solution.  Okay, not so much poked holes as picked up the argument-based equivalent of a cannon and started blowing it away.  End result: NDMP is unlikely to be fast enough to back up a moderate-sized Isilon.  The fix?  A conversion appliance that takes the IP traffic and converts it to FC (for backup only).
  • This is where Data Domain comes in – combining the DD backup architecture with Isilon would remove backup issues.
  • Isilon is currently file-only – EMC would have to develop a block-level equivalent which can run in parallel with Isilon’s OneFS file system, unless they wanted to do as NetApp do and keep everything in the file space.
  • To an extent this is feasible – with more enterprises moving to virtualised environments (courtesy of EMC again) the block vs file argument is becoming a little less important.
  • Still, at least we know it would be possible to develop a grid-based block system to run on the Isilon hardware (thanks for the pointers, Mr Yanai!!)
  • Isilon is not Fibre Channel capable for host traffic, but being based on Xyratex hardware, FC-capable HBAs can be fitted.
  • Of course, by the time Isilon is integrated into EMC’s portfolio, FCoE may be the standard (he says hopefully) – again, the Xyratex servers can be retrofitted with CNAs as required.
  • Most important – please please please EMC, please hire someone to design a proper user interface?  Maybe the guys who did the XIV interface could do something tasteful for you in blue?

A couple of years ago, Moshe Yanai told me that EMC would never develop something like XIV, as they were too wedded to the Symmetrix and Clariion architectures – too much development has gone into these platforms for EMC to contemplate spending money on a competing architecture.

As that position was based on Yanai’s belief that nothing like XIV existed to be acquired, I’d argue that a successful acquisition and integration of Isilon by EMC would blow it away completely.  Without spending time and money on the basic architecture, EMC would be in a position to develop a grid-based replacement, first for the Celerra, then the Clariion, Centera and beyond (Isilon-MAX anyone?  Just kidding – like the DS8000 range, Symmetrix has a number of features that Isilon can’t and probably never will copy).

In fact, if I were an EMC employee working in anything starting with a “C” I’d be looking for a quick change of career if this goes through.

The Competition

Of course, as Dell proved, it doesn’t matter who makes the first bid, so much as who makes the last.  I’ve listed potential counter-bidders in order of who I think is most likely to try and take this away from EMC:

  1. IBM – at the conclusion of the original presentation from Isilon, my first comment was “you guys are so getting bought by IBM” (seriously, this was my actual comment).  I hadn’t actually considered EMC as a buyer (for reasons, see above) so IBM appeared the perfect choice – I don’t have a lot of faith in the SONAS stack yet, and Isilon would give IBM a properly developed scale-out file platform, with some of the features they’d like to have in iteration 3 of the XIV platform (Infiniband, parity protection, multi-rack scalability, choices).  Plus it keeps EMC stuck with last-generation architectures.
  2. Dell – flush with cash from the 3PAR tussle, Dell could make a try for Isilon.  Let’s face it, they’ve already hacked EMC off once this year by showing they want their own storage product, so why not go the whole hog just to see the look on Joe Tucci’s face?  There are some synergies with their existing storage lines – the EqualLogic iSCSI array is based on a similar Xyratex storage server (seriously, is there anything out there that isn’t made by Xyratex these days?).  At the very least they could do what they achieved with HP/3PAR and skyrocket the price, damaging EMC at least a little.
  3. NetApp – can’t really see the synergies here; NetApp are unlikely to abandon WAFL, and could probably develop their own Xyratex-based or similar grid platform for an expanded WAFL file system in the future without hitting ideological barriers (of the kind that protect the Symmetrix architecture at EMC).  All the same, they can’t be pleased at the thought of EMC acquiring a “NetApp killer” and may be looking for payback after the Data Domain debacle.  As with Dell, NetApp may enter the fight just to push the price up and make things hot for EMC.
  4. Cisco – perfect chance to acquire a storage product to complement their UCS servers.  Unlikely as they have a lot invested in the different storage collaborations, but possible.

Or there’s Oracle – who, from the rumours, may wait until EMC acquire Isilon, and then acquire EMC.  Not my favourite vision of the future from an independence point of view, but possible.

Conclusion

I really hope EMC goes through with this (the Isilon acquisition that is, not the Oracle one).  I’ve always enjoyed working with EMC (products and as an organisation), but it’s become harder to justify in the last year, when you know there are better products on the market – better service only takes you so far against better/cheaper/faster products such as XIV.  Assuming EMC make a proper job of integration (and they’ve done reasonably well with some of their other acquisitions) this may be the thing to get them back on track.

Here’s hoping.


June Q&A

When looking at the blog stats for reader numbers (never more than 5 times a day), one area that often catches my eye is the “search terms used to find your blog” column.  Many times I get the feeling that whoever clicked on this blog after typing a particular search term probably hasn’t got the answer they were looking for.  So in the interests of fairness (and since it’s Friday), I thought I’d answer a few of the more common ones:

Q: Gorilla Box Storage

A: Sorry, this isn’t me

Q: Gorilla Case Storage

A: Still isn’t me (sorry)

Q: Gorilla Glue

A: Nosir, none of that here

Q:  Variations on the Theme “IBM XIV Problems”:

A:  Yup, XIV has problems.  Whether you think they are insurmountable will very much depend on your need for the product (if you’re a small business committed to running AIX 5.2 and wanting iSCSI for less than 20TB, then I suggest looking elsewhere).  Personally I don’t feel that any one item is a killer, and even the sum of the parts is still infinitely less than the crap I have to put up with from other vendors – the trade-off in ease of use and performance is worth it for a bunch of minor niggles.

Q:  NetApp or Lefthand?

A:  Depends what you’re looking for.  If you need small to medium scale iSCSI for applications, databases and other block requirements (VMware), and don’t mind a proliferation of small storage arrays (or maybe have a number of distributed sites), go for HP LeftHand or Dell EqualLogic.  If you’re looking for market-leading file server environments (of the same approximate size), look more at NetApp’s 2000 range.  I have a real issue with running applications on a file system, so don’t have much time for NetApp as a database or VM platform.

Q:  AIX not seeing IBM XIV FC LUNs

A:  Have you tried (commands sketched below):

  • Turning it off, then on (sorry, couldn’t resist the AOL solution 🙂 )
  • Installing all relevant hotfixes
  • Installing the host attachment kit from the IBM website (Solaris shows up fine in the XIV without the HAK installed, AIX tends to sulk until it has its MPIO)
  • Attaching to XIV via a SAN (AIX is the only OS that insists on switch attachment, and has dragged all the others down with it)
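If none of the above rings a bell, the quick sanity checks are only a couple of commands – wrapped in Python here for consistency with the other sketches on this blog.  cfgmgr, lsdev and lspath are stock AIX admin commands; run this as root on the host that can’t see its LUNs:

```python
# Quick checks for the "AIX can't see XIV FC LUNs" case.
import subprocess

def run(cmd):
    print("#", " ".join(cmd))
    subprocess.run(cmd, check=False)

run(["cfgmgr"])                 # rescan the bus for newly zoned devices
run(["lsdev", "-Cc", "disk"])   # did any new hdisks appear?
run(["lspath"])                 # MPIO paths -- these tend to stay sad until the HAK is on
```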

Q:  Variations on “XIV or Clariion?”

A:  Had an interesting discussion over email on this a week or so ago (reposted in the comments section of my original article by kind permission of the sender).

Overall my feeling is that both are good technologies in the market as it stands.  Over time, I think the Clariion-type architecture will be superseded by variations on the parallel grid used by XIV, but right now, both serve a purpose.  In the sub-20TB range, Clariion would probably be more economical, but above 20TB and 20,000 IOPS it becomes a matter of pricing and discounting, particularly once EMC get Unisphere out to the public.  Above 40,000 IOPS I don’t see a contest – I’d put in an XIV anytime (unless you have V-Max money to spend, in which case it would be 4 XIVs 🙂 ).

Q:  XIV or EVA?

A:  To the two people who asked this – XIV

(or just about any other array on the market, really.  Sorry HP, I love your servers and buy your printers, but isn’t it time you took these abominations out back of the barn and shot them in the head?)

Q:  XIV Async Minimum

A:  Minimum what?

Q:  Does XIV support RAID 6?

A:  Not yet, and if/when it comes, it probably won’t resemble the RAID 6 implemented elsewhere.  Just as the current layout isn’t restricted to using a specific set of disks (and so isn’t actually RAID at all in the strictest sense), protecting against triple failures on an XIV would involve a sort of 1+1+1 set of three copies spread across all disks.  Given the impact on capacity, this would probably have to be limited to 2TB disk models and above, as a 1TB model would end up with only around 40TB usable.

Q:  IBM XIV worth a look?

A:  If not evident by now, I think so.  Whatever the use, it’s worth at least an evaluation, even if the eventual answer is no.

Q:  IBM XIV 12TB

A:  Not yet, maybe not ever if IBM keep it up

Hope this helps – if I’ve missed anything, please let me know at the usual address.

SG.

XIV Issues & Fixes

One of the areas I’ve found frustrating about working with IBM is the lack of any way to track down technical issues and fixes – even to work out what issues have been resolved and what haven’t tends to involve a bit of detective work.

Almost every gathering of XIV-focused professionals I’ve been to, whether customers, business partners or IBM themselves, invariably has a point where the subject turns to discovered issues.  And every time, I hear the same issues being discussed – usually known issues which were resolved in the last firmware iteration but one – and each time voices from all around say:

“I’d never heard of that one”

Contrast this with other vendors, where there is at least an attempt to keep up to date with known and resolved issues – any time I need to know why a Clariion is acting weird, I can go onto Powerlink and chances are, the issue and a fix or workaround will be documented in their internal knowledgebase.  Coming from an airline background, where every change to the diameter of a rivet is documented, stamped and approved, I find the mere existence of a central change log weirdly comforting, so this ad hoc approach is seriously beginning to bend my head.

I’ve asked questions on this subject at several IBM Q&A sessions – each time I get told that it’s a “good idea” and “someone will look into it”.

(So why do I get the feeling that the chap “looking into it” rides a flying pig to work each day?)

So anyway, all of this is leading up to the reason for a new tab in the top right corner of the front page.  Titled “XIV Issues & Fixes”, it’s my stab at an unofficial issue log until IBM get their collective finger out.  At the very least I’ll be posting my own experiences and findings in the table, but consider this an open invitation:  if you’re an XIV user and/or business partner (or even IBM if they’ll let you) feel free to contribute to the list.

I’m not interested in anything under NDA, competitor FUD or maybes and might-bes, but if you’ve come across a solid issue, workaround, undocumented process or best practice, let me know and I’ll write it up (and make sure you get full credit 🙂 ).  The list is in text-only form at the moment, but when I get time I’m planning to add links to PDF documents with the full text, so be as detailed as you feel necessary.

By doing so, you’ll not only be helping others, you’ll be helping keep the alarm bells in my head to a reasonable minimum 🙂

Many thanks in advance

The Gorilla.

DataDirect Networks – the “other XIV”?

A year or two ago, I got a call from one of our sales guys, who was looking for a solution to compete with a technology I’d never heard of, for a pretty unusual requirement.  The customer was doing (very) high-performance video effects work, and the sales guy was looking at potential replacements for an existing solution, the customer having mentioned in passing that while the equipment worked extremely well, she had some concerns around support for what was a non-mainstream product.

Sales guy, having jumped on this opening like a starving weasel, wanted to understand which of the vendors would be the best solution to compete with the incumbent, a company called DataDirect Networks.

Having sat down and looked at the set of requirements provided and done a bit of research into DDN, my response was that nothing currently on the market would do what the customer wanted, at the price they were prepared to pay.  Sure, a DMX or USP-V could be configured to provide what were, quite frankly, outlandish performance requirements, but even preliminary configurations were pricing at around double what the customer claimed to have paid for the DDN system they already had.

“Are you sure they’re actually getting this level of performance?”

The above became my catchphrase for the next few days until the sales guy, disheartened by conversations around solid state disks, huge cache requirements and the need to mortgage a kidney to pay for it all, went off to look for some slightly lower hanging fruit and I was left with the feeling that I should find out more about this DDN bunch.


So who are DDN?

DataDirect Networks (DDN) can be characterised as the most successful unknown niche storage vendor on the market.  A privately held company founded in 1998 and based in Chatsworth, California, DDN are the manufacturers of an expanding product range based around a parallel processing architecture.

Basically, imagine if we took the XIV capabilities and, rather than using SATA disk, attached higher-speed SAS disks.  At this point, we get something which can power “40 of the 100 fastest computing environments in the world”.  At the time I looked at DDN, their architecture was capable of a sustained 2.8GB/s (bytes, not bits) – this has since been expanded to 6GB/s.

That’s.  21.  Terabytes.  Per.  Hour.

Ouch

Put simply, this is about what I’d expect to get practically out of an enterprise system costing twice as much and smothered in solid state technology.  DDN achieve this with spinning disks and a Clariion-style footprint.  And, as I keep coming back to, a lower price tag.  With the implementation of new architectures and solid state disks, DDN are claiming that 10GB/s is now possible.
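If you want to check the arithmetic on those headline figures (decimal units, as marketing numbers always are):

```python
# GB/s to TB/hour, decimal units throughout.
for gb_per_s in (2.8, 6.0, 10.0):
    print(f"{gb_per_s} GB/s = {gb_per_s * 3600 / 1000:.1f} TB/hour")
# 2.8 GB/s = 10.1 TB/hour
# 6.0 GB/s = 21.6 TB/hour   <- the "21 terabytes per hour" above
# 10.0 GB/s = 36.0 TB/hour
```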


So what’s the secret?

The DDN product range is based around the Silicon Storage Appliance (S2A) and Storage Fusion architectures.  As with the IBM XIV, these arrays use a massively parallel processing architecture to ensure that sequential data is processed at effectively wire speeds.  Unlike mid-range arrays such as the EMC Clariion or Hitachi AMS, data is not funnelled to a single cache to speed up disk access – all error checking is done by the individual S2A processing complexes and passed directly to disk.

So what can DDN do for me?

Basically, DDN have opted for more flexibility than IBM have with XIV.  Rather than a single disk type, DDN provide a range of disk types and shelf configurations.  The performance option is similar to XIV’s front-loaded, 12 disks per shelf design, providing a small capacity footprint, but achieving the promised high-performance.  This is the configuration which is used in the supercomputing environments which make up DDN’s more interesting case studies.

The high-capacity solution on the other hand, uses a top-loading, densely packed shelf, allowing 60 disks per enclosure – with 2TB SATA drives, this would be capable of around 2 petabytes in a single rack.  This is marketed as a high-performance VTL and archiving product, rather than as a compute platform.  More recently, the company has moved from building block storage arrays, to also providing scalable file-system storage, based on their own NAS designs.

So where can I buy one?

In the UK at least, DDN arrays tend to turn up as part of larger vendors’ solutions.  IBM have long sold the S2A arrays as part of their supercomputing portfolio and have integrated the capacity versions into their SONAS product. 

Update – I’m told on good authority that the DDN products are also resold outside of the supercomputing area as the DCS9900.  I’d be interested to hear from any IBM customers (or anyone else) who are using DDN, particularly in this area, as I’ve not seen a case study or use case for non-BlueGene implementations of DDN.

HP have recently announced that they will sell DDN as their top-end NAS product.  It appears from the announcements that, rather than bolt the storage array onto their own NAS offerings as IBM have done, HP will sell DDN’s own NAS offering.


Conclusion – so is this a competitor to XIV?

Strangely no, not really.  At least not yet.

At least at the present time, DDN tend to sell to pretty specific use-cases.  While XIV is now sold as a general computing platform, the DDN arrays tend to be sold for big number-crunching and media-serving environments.  The head of the company was quoted as saying that, in non-media serving environments, the S2A architecture would be “horrible” compared to competitors such as the EMC Clariion. 

But then, as IBM showed by marketing the XIV in the early days as a web serving platform only, things can change (P45 for the head of marketing, hey?).  It may be that DDN forms the upper and lower tiers of a storage environment, with the middle taken up by a general computing platform such as Clariion or XIV. 

[Diagram: a tiered solution of this type, with DDN arrays at the performance and archive tiers and a general-purpose platform such as Clariion or XIV in the middle.]

My own feeling is that the only thing stopping DataDirect Networks from becoming the next generation of EMC or HP storage is the fact that it is still a private company.  If DDN ever take the IPO route, I predict they will last about 5 minutes before the first takeover attempts (hostile or not) begin. 

This might not be a bad thing.  XIV is currently gaining IBM market share based on a combination of usability, cost and performance which I’ve not seen traditional storage architectures begin to touch.  DDN for their part, would get the same kind of boost that XIV received from IBM – a properly global reach, greater R&D spend and far greater brand awareness, at the cost of identity.

Depressed by 10 years of reselling LSI seconds, IBM went for a complete makeover, and stole a march on the competition by purchasing the technology most likely to shake up the market.  Other storage vendors will have difficulty developing an answer to the XIV architecture in a reasonable time – purchasing the DDN architectures would give an immediate route to replace existing dual-processor architectures (and an existing user-base of weird and wonderful customers). 

The attempts may already have begun.  Commentators have pointed out that the HP reseller deal fills no gaps in the HP product range – they already have large scale NAS products so DDN is effectively redundant within their portfolio.  I’d suggest that this is an attempt by HP to get an understanding of the practical reality of the DDN products, with a view to understanding whether it’s a credible product set for HP to buy up – possibly as a replacement for the rebadged HDS arrays currently sold in the enterprise space. 

Either way, I’m keeping an eye on DDN – in any case, reading their customer case studies is always going to be entertaining.  A selection below gives a sample of customers:

  • TimeWarner Cable – Video on Demand
  • Pacific Title & Art – The post-production effects house for Batman – The Dark Knight movie
  • Microsoft Studios – Xbox Live Media & Data Serving
  • CCTV – Chinese State Television, Media Serving for Coverage of the Beijing Olympics
  • Shutterfly – Photo sharing site serving up to 1 billion photos at present (growth from 100 million in 2 years)
  • Slide.com
  • Kodak Easyshare
  • National Center for Data Mining (UIC)
  • Northwestern University & Johns Hopkins University – Winner, 7th Annual Bandwidth Challenge (2006)
  • CalTech, CERN, University of Florida and the University of Michigan – 2nd, 7th Annual Bandwidth Challenge
  • Indiana University – 3rd place, Bandwidth Challenge at Supercomputing 2006
  • Lawrence Livermore National Laboratories
  • Sandia National Laboratories
  • NASA and NASA Ames
  • Argonne National Laboratory

7 Reasons why IBM’s XIV isn’t Perfect

Let me just start by saying that I’m not biased – I’m really not.

No, really, I’m not.

I promise I’m not biased, cross my heart.

Honestly I get no more out of recommending IBM than I do from anyone else.

Really, I’m working with NetApp this week, EMC next week, and HDS the week after that.

I’m not biased at all.

If I seem to be labouring the point, it’s because over the last year and a half, I’ve found that every time I talk about how good IBM’s XIV storage array is, after a few minutes people start giving me funny looks (funnier than usual) and asking “what’s in it for you?”. 

Given that I spent the first years of my IT career designing solutions around EMC Symmetrix and in the years since have spent my time designing storage solutions for every major player in the market (and a good number of minor ones) I really don’t see myself as having any one favourite.  With no exceptions, the organisations I’ve worked for have been vendor-neutral, and I’ve never really got the hang of the cordial hatred that vendors seem to have for each other’s products.

Lately, I’ve found that many people are primed with the idea that anyone who even mentions XIV as a possible solution must be in the pocket of the IBM sales mafia.  I’ve no sooner begun talking about the benefits of the architecture than people begin to question my impartiality.

I’ve come to the conclusion that the problem is, people are used to storage technologies (and technology in general) letting them down.  No storage technology is ever perfect – there are always hidden flaws and gotchas which surface only after the array has your organisation’s most precious data stored in its belly.

So anyone who comes along talking enthusiastically about an array which “just works” is automatically suspect.  It’s big, it’s expensive – so it must have problems.  So in this article I’m going to look briefly at the benefits of the array but concentrate mainly on the issues I’ve experienced in the last year and a half of working with XIV, thus finally demonstrating my sceptical side.

The benefits

Anyone who’s read a marketing slide from IBM knows the benefits of XIV – it’s easy to manage, stores up to 79TB in one rack space, is highly resilient and performs well at a low price.  With an increasing number of my customers running happily on XIV, I have no reason to disagree – in the large, well publicised (important) areas covered by the marketing brochures, the XIV really does “just work”.

To me then, the XIV has earned its place at the top table.  At 79TB of capacity and 50–70,000 IOPS each, it’s never going to compete on a 1:1 basis with the largest Symmetrix or Tagmastore arrays (200,000 IOPS and 600TB of Tier 1 storage, anyone?), but then it costs around a tenth as much, and I’ve found that several XIV arrays will work as well as one large array (a Tier 1 array with the capabilities discussed above will take up 9–10 rack spaces in the datacentre, compared to 4 for the equivalent XIV).

The old chestnut

 “Double drive failure on an XIV will lose data!” scream competing vendors, somehow managing to imply that in a similar situation, their own systems would operate untouched.

Really?

Maybe one day this will happen to one of my XIV customers and I’ll know for sure – in the meantime I have to go off “interpretations of the architecture” and “assumptions of how the array will work”.

I’ll use my own interpretation, thanks 🙂

My reading of the system is that data loss is at least statistically possible in the case of simultaneous double drive failure.  Data entering XIV is split into 1MB chunks (XIV confusingly calls them “partitions”).  Each partition is copied, and the two copies are spread semi-evenly across the array.  Distribution is not random – much work goes into keeping the copies on separate drives, in separate modules, at opposite ends of the array – but at some point the two copies have to sit on two physical disks, and if those particular two disks are lost, the data is gone.  Once you have a million of those partitions floating around on any given disk, the counterparts of some will inevitably be on every other disk in the array.

But how likely is the situation to occur?  In 10 years I experienced exactly two incidents of double disk failure – in both cases within the same RAID Group, a good number of hours apart.  Fortunately, in the first incident the second casualty was the hot-spare drive during rebuild – embarrassing and time-consuming to fix, but no data loss.  In the second incident the RAID Group itself was lost and had to be recovered from backups.  In both cases, the explanation given for the second failure was age, coupled with the stress put on the disks by the rebuild process (72+ hours of sustained rebuild writes on top of normal operation cannot be good for a disk).

                                      RAID 5 146GB (4+1)   RAID 6 146GB (12+2)   XIV 1TB (RAID-X)
Number of copies of data              2                    3                     2
Rebuild time after disk loss          72+ hours            72+ hours             30 minutes
How many drives can be lost?          1                    2                     1
Disks involved in the rebuild         4                    12                    160
Increase in full load during rebuild  20%                  7%                    0.63%
Estimated lost data if 2 disks lost   526 GB               0 GB                  9 GB

The point for me is that the examples above don’t make me stop discussing RAID 5 solutions with customers – if a customer wants to survive double drive failure they put in RAID 6, accept that they need many more drives to get performance (and start worrying about triple drive failure).  Implementing RAID 5 involves a risk that two drives may go at some point and they may lose data – this may also be a risk with the XIV.  If our data is this important, isn’t this what we have backup systems, snapshots, and cross-site replication for?
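Incidentally, the more striking numbers in that table fall out of simple ratios.  Here’s the back-of-envelope version – my own arithmetic, with the partner-disk spread factor an assumption rather than published IBM maths:

```python
# Back-of-envelope numbers behind the table above.

def rebuild_load_increase(disks_in_rebuild: int) -> float:
    """Extra load per disk if rebuild work is spread evenly across the set."""
    return 1.0 / disks_in_rebuild

def xiv_worst_case_loss_gb(data_per_disk_gb: float, partner_disks: int) -> float:
    """Data whose two mirror copies sat on exactly the two failed disks,
    assuming the failed disk's partitions have their mirrors spread evenly
    over `partner_disks` other drives (the real layout is module-aware)."""
    return data_per_disk_gb / partner_disks

for label, disks in [("RAID 5 (4+1)", 5), ("RAID 6 (12+2)", 14), ("XIV", 160)]:
    print(f"{label}: rebuild load +{rebuild_load_increase(disks):.2%}")

# ~880GB of data per 1TB drive (79TB usable x 2 copies / 180 drives), with
# mirrors spread across roughly 100 partner drives, lands at single-digit GB:
print(f"XIV worst case: ~{xiv_worst_case_loss_gb(878, 100):.0f} GB lost")
```

Which is why a single-digit-GB worst case on a 79TB array doesn’t keep me awake at night.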

A little history

To me, it’s one of those issues that gets blown out of proportion – back in the early 2000s, I kept hearing tales of EMC sales people who went into meetings with customers, only to be asked probing questions (rather obviously planted by competitors) of the type “why is Symmetrix global cache a single point of failure?”

This was a problem for EMC, as the answer is “it may look like that’s the case, but actually it’s not an issue – here’s why… [continued for the next 3 weeks]”.  At the time, what the customer saw was EMC sales teams descending into jargon and complex technobabble rather than giving a simple (to them) yes or no answer.  To me this is the same issue – the explanation of why there is no real issue is so involved that customers lose patience.

EMC veteran blogger Chuck Hollis says it better than I could in his discussion of reasons why EMC delayed implementing RAID 6.  The full text can be found here:

http://chucksblog.emc.com/chucks_blog/2007/01/to_raid_6_or_no.html

Comments that I find particularly relevant to this discussion are:

“And way, way, way down the list – almost statistically insignificant – was dual disk failure in a single LUN group.”

“There’s a certain part of the storage market that is obsessed with specific marketing features, rather than results claimed”

“As a result of our decision [delaying RAID 6 implementation], I’m sure that every day someone somewhere is being pounded for the fact that EMC doesn’t offer RAID 6 like some of the other guys”

The issues

So, having gone through all of that, the IBM XIV is perfect?

No chance.

Over the last year and a half, a number of issues have become obvious in the operation of the XIV.  None are fundamental to the technology itself, but have formed a barrier to customer take-up of the array.

1. High capacity entry point: 

XIV can be sold with a minimum of 6 modules, or 27TB.  This is way down on the launch configuration of 79TB only, but is still too high an entry point for many customers.  This high start point pretty much assures the survival, in some form, of XIV’s internal IBM competitors, the DS3000 and DS5000 – these arrays can scale from much smaller volumes of storage, so will need to be kept alive to provide for the low end of the market, at least until XIV can be sold in single-module configurations.

2. Upgrade Step: 

Once customers reach 27TB, the next place they can go from 6 modules is 9 modules – 43TB.  Again, this is a tremendous jump and has put off some customers who prefer a smooth upgrade path.

3. Lack of iSCSI in the low-end configurations:  

So your small customer has spent more than he needed to on his very performant, very easy to manage storage array, but at least he can save money by using iSCSI and avoiding the cost of dedicated fibre switches and HBAs?  Not a chance – in the 27TB config, XIV has no iSCSI capability.  Until you upgrade to 43TB you don’t even get the physical iSCSI ports.  So the segment of the market that could make best use of iSCSI doesn’t get to use it.

4. Rigid linkage of performance to capacity: 

Traditional storage tends to have a central processor, with capacity added by adding disk trays.  For increased performance the central processor is upgraded and faster disks (or solid state disks) can be added.  With XIV, the growth model is fixed – each module adds both disk capacity and increased performance, as cache and processors are built into the module itself.  This is an immensely simple way of doing things and ensures that customers always know how much performance headroom is available, but it’s a double-edged sword.

I have found there are times when a customer has a tiny storage volume but massive performance requirements (in a recent case, 5TB of volume with an average 50,000 IOPS requirement).  At this point, the customer has two choices:

A)  Provision the largest storage processor going on a traditional-model storage array and pack it with SSD drives, or

B)  Provision a full 79TB XIV to get the benefits of the entire 15 modules of cache and processor.

Both of these end up roughly the same price, but with option B, the purchaser has to explain to his superiors why he has 74TB of capacity that no one else can use (he then has to explain again next week when finance decide they want some of his unused space, and the week after to purchasing….).

Some might say that this is a situation that capacity on demand models were made for, but in the situation above, rare as it may be, CoD will only stretch so far.  If a customer is demanding that 90% of the technology delivered will never be used (or paid for) it may not be a commercially advantageous deal to make.
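To put numbers on that lock-step, using the rough figures from earlier in the piece (79TB and ~50,000 IOPS at 15 modules – approximations, not IBM specs):

```python
import math

MODULES_FULL, TB_FULL, IOPS_FULL = 15, 79, 50_000   # conservative IOPS figure

iops_per_module = IOPS_FULL / MODULES_FULL           # ~3,300 IOPS
tb_per_module = TB_FULL / MODULES_FULL               # ~5.3 TB

need_iops, need_tb = 50_000, 5                       # the customer above

modules = max(math.ceil(need_iops / iops_per_module),
              math.ceil(need_tb / tb_per_module))
print(f"Modules needed: {modules}, capacity bought: "
      f"{modules * tb_per_module:.0f}TB for a {need_tb}TB requirement")
# -> 15 modules and the full 79TB, to serve 5TB of actual data
```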

5. Lack of Control:

This is one which has only become apparent as the first IBM-badged XIV arrays have come to the end of their original support contracts.

IBM restricts access to a number of key technical functions.  Manually phasing out (removing from service) and phasing back in of modules and disks can only be performed by an IBM technician – access to these functions is locked, and guarded by wolves.

So you replace a disk – and until IBM support remotely phase your disk back in, that disk will just sit there, glowing a friendly yellow colour in your display and not taking on data.

The upside of this is that IBM support will probably call you immediately to tell you that the disk is awaiting phase in, and ask would you like them to do it (this has happened to me on a number of occasions).

But now hey – you’re out of support!  IBM will no longer phone you; when you call, you’ve no support contract to draw on to get them to do the work for you; and to top it all, you don’t have access to do it yourself.  A number of the underlying controls are locked away and IBM appears to have no plans to give out access, precluding any form of “break/fix” maintenance option.

This sort of IBM control-freakery is what keeps me awake at night – a decision by an IBM suit on the other side of the world may make perfect sense at the time, but 3am in a cold datacentre, when IBM are calling to tell me that my “issue can’t be resolved” due to that policy, is really not when I want to find out I have an “insurmountable opportunity” on my hands.

6. Scheduling of Snapshots:

XIV makes copying data incredibly easy.  Snapshots are created at the click of a button, can be made read/write and mounted as a development volume.  Up to 16,000 can be created and maintained at any one time (eat that, “8 snaps maximum and 20% drop in performance” traditional storage!).

So why, IBM, couldn’t you have included a built-in scheduler in the XIV interface, to let me make a new copy of a snapshot at regular intervals?

Oh, you did?  You included built-in scheduling for the snapshots used by the asynchronous replication process?  So new replication snapshots can be created and overwritten on a regular basis to ensure that the replication stays on target?  But for my own application snapshots, I still have to buy an external replication manager application and set it up outside of XIV?  Thanks a bunch, IBM.
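In the meantime, the workaround is exactly the kind of external scripting the XIV otherwise saves you from.  A minimal sketch of the idea – the snapshot_create syntax is from memory, and the management-IP and credential flags the XCLI needs are omitted, so treat the exact invocation as an assumption and check your XCLI documentation:

```python
# Poor man's snapshot scheduler: call the XCLI on a timer.
import subprocess
import time
from datetime import datetime

VOLUME = "prod_vol_01"    # hypothetical volume name
INTERVAL_S = 3600         # one snapshot an hour

while True:
    snap_name = f"{VOLUME}.snap_{datetime.now():%Y%m%d_%H%M}"
    subprocess.run(
        ["xcli", "snapshot_create", f"vol={VOLUME}", f"name={snap_name}"],
        check=True,
    )
    # real code would also prune old snapshots -- 16,000 is a lot, but finite
    time.sleep(INTERVAL_S)
```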

7. Support for older AIX versions:

Of all the operating systems supported by XIV, none has given as much trouble as AIX.  From direct fibre connection support (it doesn’t, end of story) to load balancing (it does, but only on newer releases and needs to be set manually) the AIX/XIV combination has taken some time to get into a usable state.  Recently though, as long as you’re on 5.3.10 or 6.1, you’re sorted.

If you happen to have applications which need to remain on a version older than 5.3.10, you may find you have issues – starting with no automated load balancing and low queue depths.  This in turn leads to low performance, complaints, heartburn, indigestion and generally bad stuff.

I’m not an AIX expert, but from the workarounds described in various places on the web, the cure is worse than the disease – manual scripting and complex processes which take all the fun out of managing an XIV environment.
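To give a flavour of what that scripting looks like – queue_depth and algorithm are the standard AIX MPIO attributes, but whether round_robin is even selectable at your pre-5.3.10 level is precisely the problem, so verify the values against your own AIX level before trying anything like this:

```python
# Walk the hdisks and set queue depth and path algorithm by hand.
# This naive version hits every disk; in real life, filter for XIV devices first.
import subprocess

disks = subprocess.run(["lsdev", "-Cc", "disk", "-F", "name"],
                       capture_output=True, text=True, check=True)

for hdisk in disks.stdout.split():
    # -P defers the change until the next reboot if the device is busy
    subprocess.run(["chdev", "-l", hdisk, "-P",
                    "-a", "queue_depth=64",
                    "-a", "algorithm=round_robin"],
                   check=False)
```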

IBM’s response so far has been:

“All you Luddites join the 21st century and upgrade to 6.1 – and can we sell you Power 7 while we’re at it….”

In Conclusion

In the last year and a half, I’ve gone from disbelief in the claims made by IBM, to grudging acceptance to a genuine liking for the product.  It is simple, powerful and cost effective, and I can see why there is a concerted effort by competing vendors to remove it from the field by throwing up a FUD screen around it. 

The point for me is that most of the negative comments I’ve seen tend to be of the “there is no possible way that XIV can do what it claims!!!” school of thought.  With a growing customer base running everything from Tier 1 Oracle, to MS Exchange, to disk backup systems on the arrays, I beg to differ.

It does no single thing massively better than the competition, but just does everything very well:

  • It’s easy to manage (but so is an HP EVA or Sun 7000)
  • It’s very fast for its size and cost (but an EMC Symmetrix is faster)
  • It contains a large volume in a small area (but an HDS USP holds more and can virtualise)

But notice there’s no overlap – XIV appears in every category, while each competitor tends to lead in only one or two areas.

As a solutions designer, I see the acceptance of XIV’s simple way of working leading to an end of complex LUN and RAID Group maps, the end of pre-allocation of storage months in advance, the end of pen and paper resizing exercises, and so on.

That said, it can be seen above that the array does have some nagging problems, most of them soluble if the will exists.  In most of these decisions (e.g. no iSCSI in low-end arrays) I see the corporate hand of IBM – “let’s not make the box too convenient for small customers, or they’ll never buy upgrades”.

I see the IBM connection as a two-edged sword.  On the one hand, without IBM’s name, support, and R&D spend, XIV would still be languishing down in the challengers’ space with the likes of Compellent and Pillar Data.  Having XIV harnessed to the IBM machine leap-frogged it into a mainstream position, bypassing potentially years or decades of effort.

But on the other hand, fitting XIV into IBM’s corporate strategy causes decisions that are hard to stomach.  The XIV modules that are added as part of the 6 to 9 upgrade have only one difference – the addition of a dedicated card for iSCSI connection.  There is no reason I can see why these cards could not have been added to the first set of modules to allow iSCSI at 27TB.

In the last year, I’ve probably spent as much time designing solutions around other vendors’ products as I have IBM’s.  This is because in the real world a buying decision takes in many more factors than just “which disk array is the fastest/best/cheapest”.

EMC, for example, have a completeness of offering in the storage area which IBM can only aspire to; having spent the last 10 years developing an ecosystem of complementary products, EMC are fixing problems that IBM hasn’t even really started to address, beyond a recent spate of acquisitions of the kind EMC began before 2000 and has continued ever since.

I’m pretty sure that while the other vendors are attacking the XIV way of doing things, in the background they’ll also be coming up with their own ways to match it – IBM has a head-start, but that’s all it is – a temporary advantage in one field of storage.  To capitalise, they need to stop thinking that the IBM way is the only way and look at some of the engineering decisions that are not working for customers.

All of the issues discussed above have been experienced by me personally, during design or implementation.  Whether they are an impediment to a customer considering XIV will very much depend on the situation.  Personally I find few occasions when XIV is not at least worth a look, even if it’s not the be-all and end-all.

But maybe I’m just biased.

Update – a few days after I posted my article, Tony Pearson at IBM posted an article making some similar points regarding double drive failure, but providing an additional piece of key information.

https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/entry/ddf-debunked-xiv-two-years-later?lang=en#comments

Tony is careful to point out that no customer has ever experienced double drive failure, but his calculations for worst-case data loss match mine at 9GB (always nice when the professionals agree with you) 😉

The additional info has to do with the “Union List” – this is something we’ve known must exist, but up till now it’s not something I’ve seen published confirmation of (secretive bunch, IBM).  Basically, the Union List will tell you which 9GB of data has been lost, in the form of a logical block address list, allowing targeted recovery of the lost data.

I’ve not seen this in action so have no idea how well it works in practice, but I’m going to have much fun pursuing it with IBM over the next few months…

Update (2) – A couple of other articles have linked to this one.  Thanks to both Simon Sharwood at Techtarget ANZ  and Ianhf at Grumpy Storage for their favourable reviews and the redirect – I wondered where all the traffic was coming from!  Will try to return the favour sometime 🙂