7 Reasons why IBM’s XIV isn’t Perfect

Let me just start by saying that I’m not biased – I’m really not.

No, really, I’m not.

I promise I’m not biased, cross my heart.

Honestly I get no more out of recommending IBM than I do from anyone else.

Really, I’m working with Netapp this week, EMC next week, and HDS the week after that.

I’m not biased at all.

If I seem to be labouring the point, it’s because over the last year and a half, I’ve found that every time I talk about how good IBM’s XIV storage array is, after a few minutes people start giving me funny looks (funnier than usual) and asking “what’s in it for you?”. 

Given that I spent the first years of my IT career designing solutions around EMC Symmetrix and in the years since have spent my time designing storage solutions for every major player in the market (and a good number of minor ones) I really don’t see myself as having any one favourite.  With no exceptions, the organisations I’ve worked for have been vendor-neutral, and I’ve never really got the hang of the cordial hatred that vendors seem to have for each other’s products.

Lately, I’ve found that many people are primed with the idea that anyone who even mentions XIV as a possible solution must be in the pocket of the IBM sales mafia.  I’ve no sooner begun talking about the benefits of the architecture than people begin to question my impartiality.

I’ve come to the conclusion that the problem is, people are used to storage technologies (and technology in general) letting them down.  No storage technology is ever perfect – there are always hidden flaws and gotchas which surface only after the array has your organisation’s most precious data stored in its belly.

So anyone who comes along talking enthusiastically about an array which “just works” is automatically suspect.  It’s big, it’s expensive – so it must have problems.  So in this article I’m going to look briefly at the benefits of the array but concentrate mainly on the issues I’ve experienced in the last year and a half of working with XIV, thus finally demonstrating my sceptical side.

The benefits

Anyone who’s read a marketing slide from IBM knows the benefits of XIV – it’s easy to manage, stores up to 79TB in one rack space, is highly resilient and performs well at a low price.  With an increasing number of my customers running happily on XIV, I have no reason to disagree – in the large, well publicised (important) areas covered by the marketing brochures, the XIV really does “just work”.

To me then, the XIV has earned its place at the top table.  At 79TB of capacity and 50,000 to 70,000 IOPS each, it’s never going to compete on a 1:1 basis with the largest Symmetrix or Tagmastore arrays (200,000 IOPS and 600TB of Tier 1 storage, anyone?), but then it costs around a tenth of what those arrays do, and I’ve found that several XIV arrays will work as well as one large array (a Tier 1 array with the capabilities discussed above will take up 9-10 rack spaces in the datacentre, compared to 4 for the equivalent XIV).

The old chestnut

 “Double drive failure on an XIV will lose data!” scream competing vendors, somehow managing to imply that in a similar situation, their own systems would operate untouched.

Really?

Maybe one day this will happen to one of my XIV customers and I’ll know for sure – in the meantime I have to go off “interpretations of the architecture” and “assumptions of how the array will work”.

I’ll use my own interpretation, thanks 🙂

My reading of the system is that data loss is at least statistically possible in the case of simultaneous double drive failure.  Data entering XIV is split into 1MB chunks (XIV confusingly calls them “partitions”).  These partitions are copied and both copies are spread semi-evenly across the array.  Distribution is not random: much work goes into keeping the two copies on separate drives, in separate modules, at opposite ends of the array.  But at some point the two copies have to sit on two particular disks – and if those two disks are lost, the data is gone.  Once you have a million of those partitions sitting on any given disk, the counterparts of some of them will have to be on each of the other disks in the array.
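
To make that pigeonhole argument concrete, here’s a toy simulation of my interpretation above.  The layout it assumes (a full 15-module frame with 12 drives per module, 9 data modules and 6 interface modules, 1MB partitions, a completely full disk) is my reading of the architecture, not anything IBM has published:

```python
import random
from collections import Counter

# My interpretation only, not IBM's published algorithm: take one disk in an
# interface module, assume it is completely full (~1,000,000 x 1MB partitions),
# and assume each partition's mirror copy lands pseudo-randomly on one of the
# data-module disks (9 data modules x 12 drives = 108 disks in a full frame).
random.seed(1)
DATA_MODULE_DISKS = 9 * 12
PARTITIONS_ON_DISK = 1_000_000

mirrors = Counter(random.randrange(DATA_MODULE_DISKS)
                  for _ in range(PARTITIONS_ON_DISK))

print("every data-module disk holds some of the mirrors:",
      len(mirrors) == DATA_MODULE_DISKS)                 # True
print("most partitions mirrored on any single disk:",
      max(mirrors.values()))                             # ~9,500, i.e. roughly 9GB
```

Run it and every one of the 108 data-module disks ends up holding a small slice of that disk’s partitions – which is why a simultaneous second failure always costs something, but never very much.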

But how likely is the situation to occur?  In 10 years I experienced exactly two incidents of double disk failure – in both cases these were within the same RAID Group, a good number of hours apart.  In the first incident it was, fortunately, the hot-spare drive that failed during the rebuild – embarrassing and time-consuming to fix, but with no data loss.  In the second incident the RAID Group itself was lost and had to be recovered from backups.  In both cases, the explanation given for the second failure was age, coupled with the stress put on the disks by the rebuild process (72+ hours of sustained writes for the rebuild, on top of trying to keep normal operation going, cannot be good for a disk).

|                                       | RAID 5 146GB (4+1) | RAID 6 146GB (12+2) | XIV 1TB RAID x |
| Number of copies of data              | 2                  | 3                   | 2              |
| Rebuild time after disk loss          | 72+ hours          | 72+ hours           | 30 minutes     |
| How many drives can be lost?          | 1                  | 2                   | 1              |
| Disks involved in the rebuild         | 4                  | 12                  | 160            |
| Increase in full load during rebuild  | 20%                | 7%                  | 0.63%          |
| Estimated lost data if 2 disks lost   | 526 GB             | 0 GB                | 9 GB           |

 

The point for me is that the examples above don’t make me stop discussing RAID 5 solutions with customers – if a customer wants to survive a double drive failure they put in RAID 6, accept that they need many more drives to get the same performance (and start worrying about triple drive failure).  Implementing RAID 5 involves a risk that two drives may go at some point and data may be lost – and that may also be a risk with the XIV.  If our data is this important, isn’t this what we have backup systems, snapshots, and cross-site replication for?
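
For what it’s worth, here’s my reading of where the table’s rebuild-load percentages come from – an assumption about the arithmetic, not a vendor-published formula:

```python
# My assumption about the "increase in full load" row: one failed drive's
# rebuild work, shared across the drives that take part in the rebuild.
configs = {
    "RAID 5 146GB (4+1)":  5,    # the whole 5-drive group grinds through the rebuild
    "RAID 6 146GB (12+2)": 14,   # likewise the whole 14-drive group
    "XIV 1TB":             160,  # the table's figure for disks involved in an XIV rebuild
}
for name, drives in configs.items():
    print(f"{name:20s} ~{100 / drives:.2f}% extra load per surviving drive")
# -> 20.00%, 7.14% and 0.63% - the 20% / 7% / 0.63% row in the table above
```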

A little history

To me, it’s one of those issues that gets blown out of proportion – back in the early 2000s, I kept hearing tales of EMC sales people who went into meetings with customers only to be asked probing questions (rather obviously planted by competitors) of the type “why is Symmetrix global cache a single point of failure?”

This was a problem for EMC, as the answer is “it may look like that’s the case, but actually it’s not an issue – here’s why…………… [Continued for the next 3 weeks]”.  At the time, what the customer saw was EMC sales teams descending into jargon and complex technobabble rather than just giving a simple (to them) yes or no answer.  To me this is the same issue – the explanation of why there is no real problem is so involved that customers lose patience.

EMC veteran blogger Chuck Hollis says it better than I could in his discussion of reasons why EMC delayed implementing RAID 6.  The full text can be found here:

http://chucksblog.emc.com/chucks_blog/2007/01/to_raid_6_or_no.html

Comments that I find particularly relevant to this discussion are:

“And way, way, way down the list – almost statistically insignificant – was dual disk failure in a single LUN group.”

“There’s a certain part of the storage market that is obsessed with specific marketing features, rather than results claimed”

“As a result of our decision [delaying RAID 6 implementation], I’m sure that every day someone somewhere is being pounded for the fact that EMC doesn’t offer RAID 6 like some of the other guys”

The issues

So, having gone through all of that, the IBM XIV is perfect?

No chance.

Over the last year and a half, a number of issues have become obvious in the operation of the XIV.  None are fundamental to the technology itself, but have formed a barrier to customer take-up of the array.

1. High capacity entry point: 

XIV can be sold with a minimum of 6 modules, or 27TB.  This is way down from the “at launch” configuration of 79TB only, but is still too high an entry point for many customers.  This high starting point pretty much assures the survival of XIV’s internal IBM competitors, the DS3000 and DS5000 – those arrays can scale from much smaller volumes of storage, so they will need to be kept alive in some form to serve the low end of the market, at least until XIV can be sold in single-module configurations.

2. Upgrade Step: 

Once customers reach 27TB, the next place they can go from 6 modules is 9 modules – 43TB.  Again, this is a tremendous jump and has put off some customers who prefer a smooth upgrade path.

3. Lack of iSCSI in the low-end configurations:  

So your small customer has spent more than he needs to on getting his very performant, very easy to manage storage array, but at least he can save money by using iSCSI, and avoiding the cost of dedicated fibre switches and HBAs?  Not a chance – in the 27TB config, XIV has no iSCSI capability.  Until you upgrade to 43TB you don’t even get the physical iSCSI ports at all.  So the segment of the market that could make best use of iSCSI, doesn’t get to use it.

4. Rigid linkage of performance to capacity: 

Traditional storage tends to have a central processor, with capacity added by adding disk trays.  For increased performance the central processor is upgraded and faster disks (or solid state disks) can be added.  With XIV, the growth model is fixed – each module adds both disk capacity and increased performance, as cache and processors are built into the module itself.  This is an immensely simple way of doing things and ensures that customers always know how much performance headroom is available, but it’s a double edged sword.

I have found there are times when a customer has a tiny storage volume, but massive performance requirements (in a recent case, 5TB of volume, average 50,000 IOPS performance requirement).  At this point, the customer has two choices:

A)     Provision the largest storage processor going on a traditional-model storage array, packed with a number of SSD drives; or,

B)      Provision a full 79TB XIV to get the benefits of the entire 15 modules of cache and processor.

Both of these end up roughly the same price, but with option B, the purchaser has to explain to his superiors why he has 74TB of capacity that no one else can use (he then has to explain again next week when finance decide they want some of his unused space, and the week after to purchasing….).
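
To put the capacity/performance linkage into concrete terms, here’s a sketch of the sizing logic using the design figures quoted elsewhere in this article – conservative numbers I use for design, not IBM specifications:

```python
# Indicative figures pulled from elsewhere in this article - my conservative
# design numbers, not IBM specifications.
XIV_CONFIGS = {
    # modules: (usable TB, design IOPS)
    6:  (27, 20_000),
    9:  (43, 30_000),
    15: (79, 50_000),
}

def smallest_config(capacity_tb, iops):
    """Return the smallest module count that meets BOTH requirements."""
    for modules, (tb, perf) in sorted(XIV_CONFIGS.items()):
        if tb >= capacity_tb and perf >= iops:
            return modules
    return None

# the awkward case above: tiny capacity, huge performance requirement
print(smallest_config(capacity_tb=5, iops=50_000))   # -> 15, i.e. a full 79TB frame
```

Feed it the 5TB / 50,000 IOPS case above and out pops a full 15-module frame – hence the awkward conversation with finance.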

Some might say that this is a situation that capacity on demand models were made for, but in the situation above, rare as it may be, CoD will only stretch so far.  If a customer is demanding that 90% of the technology delivered will never be used (or paid for) it may not be a commercially advantageous deal to make.

5. Lack of Control:

This is one which has only become apparent as the first IBM-badged XIV arrays have come to the end of their original support contracts.

IBM restricts access to a number of key technical functions.  Manually phasing out (removing from service) and phasing back in of modules and disks can only be performed by an IBM technician – access to these functions is locked, and guarded by wolves.

So you replace a disk – and until IBM support remotely phase your disk back in, that disk will just sit there, glowing a friendly yellow colour in your display and not taking on data.

The upside of this is that IBM support will probably call you immediately to tell you that the disk is awaiting phase in, and ask whether you’d like them to do it (this has happened to me on a number of occasions).

But now hey – you’re out of support!  IBM will no longer phone you; when you call, you’ve no support contract to draw on to get them to do the work for you; and to top it all, you don’t have access to do it yourself.  A number of the underlying controls are locked away and IBM appears to have no plans to give out access, precluding any form of “break/fix” maintenance option.

This sort of IBM control-freakery is what keeps me awake at night – a decision by an IBM suit on the other side of the world may make perfect sense at the time, but 3am in a cold datacentre, with IBM on the phone telling me that my “issue can’t be resolved” because of that policy, is really not when I want to find out I have an “insurmountable opportunity” on my hands.

6. Scheduling of Snapshots:

XIV makes copying data incredibly easy.  Snapshots are created at the click of a button, can be made read/write and mounted as a development volume.  Up to 16,000 can be created and maintained at any one time (eat that, “8 snaps maximum and 20% drop in performance” traditional storage!).

So why, IBM, couldn’t you have included a built-in scheduler in the XIV interface to let me make a new snapshot copy at regular intervals?

Oh, you did?  You included built-in scheduling for the snapshots used by the asynchronous replication process?  So new replication snapshots can be created and overwritten on a regular basis to ensure that the replication stays on target?  But for my own application snapshots, I still have to buy an external replication manager application and set it up outside of XIV?  Thanks a bunch, IBM.
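
Until IBM relents, the workaround is the sort of trivial scheduler you end up scripting yourself.  The sketch below is deliberately generic – create_snapshot() is a placeholder for whatever CLI or API call you actually drive the array with, not a real XIV command:

```python
import time
from datetime import datetime

def create_snapshot(volume: str) -> None:
    # Placeholder only - wire this up to whatever CLI or API you drive the
    # array with. It is NOT a real XIV command.
    print(f"{datetime.now():%Y-%m-%d %H:%M} snapshot requested for {volume}")

def run_schedule(volumes, interval_hours=4):
    """Fire a snapshot of each volume every interval_hours, forever."""
    while True:
        for vol in volumes:
            create_snapshot(vol)
        time.sleep(interval_hours * 3600)

# run_schedule(["dev_oracle_01", "dev_oracle_02"])   # hypothetical volume names
```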

7. Support for older AIX versions:

Of all the operating systems supported by XIV, none has given as much trouble as AIX.  From direct fibre connection support (it doesn’t, end of story) to load balancing (it does, but only on newer releases and needs to be set manually) the AIX/XIV combination has taken some time to get into a usable state.  Recently though, as long as you’re on 5.3.10 or 6.1, you’re sorted.

If you happen to have applications which need to remain on a version older than 5.3.10, you may find you have issues – starting with no automated load balancing and low queue depths.  This in turn leads to low performance, complaints, heart-burn, indigestion and generally bad stuff.

I’m not an AIX expert, but from the work-arounds described in various places on the web, the cure is worse than the disease – manual scripting and complex processes which take all the fun out of managing an XIV environment.

IBM’s response so far has been:

“All you Luddites join the 21st century and upgrade to 6.1 – and can we sell you Power 7 while we’re at it….”

In Conclusion

In the last year and a half, I’ve gone from disbelief in the claims made by IBM, to grudging acceptance, to a genuine liking for the product.  It is simple, powerful and cost-effective, and I can see why there is a concerted effort by competing vendors to remove it from the field by throwing up a FUD screen around it.

The point for me is that most of the negative comments I’ve seen tend to be of the “there is no possible way that XIV can do what it claims!!!” school of thought.  With a growing customer base running everything from Tier 1 Oracle, to MS Exchange, to disk backup systems on the arrays, I beg to differ.

It does no single thing massively better than the competition, but just does everything very well:

–          It’s easy to manage (but so is an HP EVA or SUN 7000)

–          It’s very fast for its size and cost (but an EMC Symmetrix is faster)

–          It contains a large volume in a small area, (but an HDS USP holds more and can virtualise)

But you can see the pattern – XIV appears in every category, while each competitor tends to focus on only one or two areas.

As a solutions designer, I see the acceptance of XIV’s simple way of working leading to an end of complex LUN and RAID Group maps, the end of pre-allocation of storage months in advance, the end of pen and paper resizing exercises, and so on.

That said, it can be seen above that the array does have some nagging problems, most of them soluble if the will exists.  In most of these decisions (e.g. no iSCSI in low-end arrays) I see the corporate hand of IBM – “let’s not make the box too convenient for small customers, or they’ll never buy upgrades”.

I see the IBM connection as a two-edged sword.  On the one hand, without IBM’s name, support, and R&D spend, XIV would still be languishing down in the challengers’ space with the likes of Compellent and Pillar data.  Having XIV harnessed to the IBM machine leap-frogged it into a mainstream position, bypassing potentially years or decades of effort.

But on the other hand, fitting XIV into IBM’s corporate strategy causes decisions that are hard to stomach.  The XIV modules that are added as part of the 6 to 9 upgrade have only one difference – the addition of a dedicated card for iSCSI connection.  There is no reason I can see why these cards could not have been added to the first set of modules to allow iSCSI at 27TB.

In the last year, I’ve probably spent as much time designing solutions around other vendors’ products as I have IBM’s.  This is because in the real world a buying decision takes in many more factors than just “which disk array is the fastest/best/cheapest”.

EMC, for example, have a completeness of offering in the storage area which IBM can only aspire to; having spent the last 10 years developing an ecosystem of complementary products, EMC are fixing problems that IBM hasn’t even really started to address, beyond a spate of acquisitions of the type EMC started before 2000 and has continued ever since.

I’m pretty sure that while the other vendors are attacking the XIV way of doing things, in the background they’ll also be coming up with their own ways to match it – IBM has a head-start, but that’s all it is – a temporary advantage in one field of storage.  To capitalise, they need to stop thinking that the IBM way is the only way and look at some of the engineering decisions that are not working for customers.

All of the issues discussed above are ones I have experienced myself, either during design or during implementation.  Whether they are an impediment to a customer considering XIV will very much depend on the situation.  Personally I find few occasions when XIV is not at least worth a look, even if it’s not the be-all and end-all.

But maybe I’m just biased.

Update – a few days after I posted my article, Tony Pearson at IBM posted an article making some similar points regarding double drive failure, but providing one additional and very key piece of information.

https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/entry/ddf-debunked-xiv-two-years-later?lang=en#comments

Tony is careful to point out that no customer has ever experienced double drive failure, but his calculations for worst case data loss match mine at 9GB  (always nice when the professionals agree with you) 😉

The additional info has to do with the “Union List” – something we’ve known must exist, but which until now I’ve not seen published confirmation of (secretive bunch, IBM).  Basically, the Union List tells you, in the form of a logical block address list, which 9GB of data has been lost, allowing targeted recovery of that data.

I’ve not seen this in action so I’ve no idea how well it works in practice, but I’m going to have much fun pursuing it with IBM over the next few months…….

Update (2) – A couple of other articles have linked to this one.  Thanks to both Simon Sharwood at Techtarget ANZ  and Ianhf at Grumpy Storage for their favourable reviews and the redirect – I wondered where all the traffic was coming from!  Will try to return the favour sometime 🙂


28 Responses to “7 Reasons why IBM’s XIV isn’t Perfect”

  1. Michael Rogers Says:

    This is why I think you are biased. In the table you list the drive type for RAID5 and RAID6 as 146GB. You didn’t specify the XIV was SATA 1TB drives. This is the issue I have with the XIV. SATA drives are not built for this type of sustained workload.

    • storagegorilla Says:

      Actually, I put the 146GB in the RAID 5 and 6 columns to make it clear where the data loss figure was coming from. As the XIV data loss calculation isn’t based strictly on drive size I didn’t think it was relevant at the time.
      Still – easily fixed 🙂 Updated – added “1TB” to column header.
      Note I haven’t added “SATA”, as I didn’t qualify the RAID 5 and 6 columns with an interface type either. Obviously these 146GB disks could be “Fibre Channel” (as used by EMC) or “SAS” (as used by Hitachi) – either way, it won’t change the calculations in the table.
      Regarding SATA reliability and whether XIV can sustain the workload, I’m afraid I couldn’t disagree with you more. The XIV solutions I’ve put in are being heavily used in production by my customers and have been for some time, with no reliability issues or drive failures.
      (As far as I know this is also the case with my EMC/IBM DS/Netapp/HDS customers from the same period – drives are pretty reliable these days).
      But hey! If there are any of the original XIV (1st Generation) Nextra users out there reading this, please let us know – after 5 years of use, how reliable are your drives?

  2. dravene Says:

    I wanted to thank you for all this information. I was trying to find someone who actually has experience with the XIV. We are currently trying to decide if this product is good for us. We bought a Netapp FAS3040 two years ago (we are using 20TB of data and getting 11,000 IOPS) and we have been so disappointed with the performance of such a system. Maybe XIV is better adapted to our needs.

    • storagegorilla Says:

      Hey Dravene

      Glad that the article was of help to you 🙂

      11,000 IOPS sounds very (very) low for a FAS3040. If possible, would you be able to send me the configuration for your current Netapp setup, as it sounds like your current system may be misconfigured? If you could let me know:
      – Is the FAS being used as a NAS (file server) or for applications
      – If files, what type (general data, or specific such as CAD or video)
      – If applications, which applications (e.g. exchange, SQL, Oracle etc)
      – Is your system configured with one or two controllers? Do you have 4GB/8GB RAM installed?
      – Is the 20TB figure usable or raw storage capacity?
      – Can you tell me the RAID configuration used (e.g. RAID 5 (4+1) )
      – Can you tell me how much of the 20TB is currently being used?
      – What type of disks are installed, and in what numbers (e.g. 66 x 300GB Fibre Channel)
      – Are you using Fibre Channel or IP connect? Can you let me know port types and numbers (e.g. 4 x 4Gb/s FC Ports)
      – Do you attach servers to the FAS3040?
      – If yes, do you attach servers directly to the FAS, or do you connect via a SAN?
      – Do you replicate between two FAS units? If so, do you use synchronous or asynchronous replication?

      Depending on the information above, it may be possible to save your FAS, at least for a year or two :).

      From the figures you’ve given, I’d guess you maybe have between 66 and 85 drives, possibly 300GB Fibre Channel drives? If these have been configured on a single set of drive loops, this would explain the low levels of performance you are achieving and could possibly be remedied
      (or I may be completely wrong and it’s something else entirely 😉 ).

      If you were looking at XIV, you would probably look to start with a 6 module unit. This is the smallest XIV available, providing 27TB usable. As discussed in the article, the performance of XIV is tied unchangingly to the number of modules used – with a 6 module system, from 20,000 to 30,000 IOPS is a reasonable expectation for performance, though there may be a variation based on the type of applications (I can give a better answer using the information about your current configuration if you’re able to send it). Given your current limit of 11,000 IOPS, this would probably provide a good increase in performance.

      (note that the 50,000 IOPS figure referred to in the article is the design figure that I use for a full 15 module, 79TB unit. I use this as a figure to cover any eventuality regardless of application type and so it is rather conservative – I’ve seen the full arrays test at higher figures than this. The equivalent for the 6 module would be to design for 20,000 IOPS, though my own demo system, also a 6 module unit, has tested far higher than this! ).

      Hope this helps
      SG

  3. dravene Says:

    Hi Storagegorilla,

    Thanks a lot for taking some time to take a look at my problem. I’ll try to give you as much information about my configuration as you need.
    We are a small web hosting company and we basically put every server we use/sell on our FAS3040. We are hosting over 250 hosts on it. At first, we were planning to use the FAS3040 as a NAS with a lot of NFS shares, but it turns out that it really was a bad choice and NFS clearly didn’t have the features customers were expecting. We are using Fibre Channel disks and SATA disks. As we had reached an average disk busy > 70% on the SATA disks (a bottleneck according to Netapp and to my experience) we had to migrate most of the servers to Fibre disks.
    To answer your questions:
    – We do not use the FAS as a NAS anymore , besides 1 or 2 NFS share, all the data are stored in luns
    – We have a variety of servers accessing the data most of them are apache web servers (80), qmail mail servers(40), few sql server (<10), few exchange (<5), few CPU intensive servers (antispam, antivirus….) (<5). The performance issues were felt with the mail servers. The disks were so slow that the server wasn’t able to deliver the mail fast enough and we had to stop accepting the incoming traffic for the server to be able to deliver them.
    – We have 2 controllers (one is dedicated to SATA disks and the other to the fiber) with 4GB of ram
    – 20TB is the used capacity. The raw capacity consists of 1 sata aggregate of 12TB and 3 fiber channel aggregates of 6TB,3TB and 4TB
    – The current RAID configuration is RAID DP (RAID6)
    – we have 3 kind of disks: 42 * 274GB fiber channel, 14*410GB fiber channel, 28*621GB SATA
    – we use fiber channel to connect to the Luns we have on the FAS3040. We have 4 interfaces 4GB and we are only using 2 of them
    – The servers (we are using diskless blades) are directly attached to the FAS3040
    – There is no replication between the 2 FAS units

    The limitation we are facing is the average "disk busy". Netapp says that over 70% the performance gets poorer. With a better distribution of our servers we managed to achieve the following: on our SATA aggregate the average disk busy across all disks is 64% with peaks at 80%. The performance is way better with the Fibre Channel: average 40%, peaks of 70%.
    We are now looking for a cheaper solution per TB. Our customers don't understand why the storage is so expensive when the price of a 1TB disk is so cheap. Moreover, we really don't want to face the same performance issues we already faced. It took us too much time and money to reach a solution which is just OK. (We cannot put anything more on the SATA array, otherwise the whole array will be unusable, even though there is more than 3TB of free space on it.) I want to be sure XIV won't suffer from the same problems.

    Once more, I want to thank you for your help!

    • storagegorilla Says:

      Glad to help Dravene

      What I can provide may be a little limited, as I’d normally do this kind of solutioning face to face to iron out any misunderstandings I may have about your current environment, but I’ll do what I can with what’s to hand 🙂

      From the specs you’ve sent, the root cause of the performance issue you currently have appears to be as expected – your disk numbers are slightly different from my guess (I didn’t expect SATA drives in a performance configuration at all) but the maximum performance I would expect from the disk config you’ve got is around 12,800 IOPS, so your actual figure of 11,000 isn’t too far off the mark (use of RAID 6 probably isn’t helping either). A point to note is that, of that 12,800 IOPS, only 2,800 would come from your SATA drives – in a traditional RAID-based storage array (e.g. Netapp, EMC, HDS etc.) these drives are not recommended for application-level performance. The only reason XIV is able to make use of SATA drives is by putting as much load as possible on the processor/memory architecture at the front end of the array, and by writing to many disks at once (if you’re using RAID 6, chances are you’re writing to around 14 disks at any one time – a full XIV writes to more than 10 times that).
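
      For anyone wondering where the 12,800 comes from, here’s the back-of-envelope version – the per-spindle figures are rules of thumb I use for sizing, not vendor specifications:

```python
# Back-of-envelope only - rule-of-thumb per-spindle figures I use for sizing
# (roughly 180 IOPS for a 15K FC drive, 100 for SATA), not vendor specs.
PER_DRIVE_IOPS = {"15K FC": 180, "SATA": 100}

drive_counts = {                 # the drive counts from the comment above
    "15K FC": 42 + 14,           # 42 x 274GB + 14 x 410GB Fibre Channel
    "SATA":   28,                # 28 x 621GB SATA
}

for drive_type, count in drive_counts.items():
    print(f"{drive_type}: {count} drives -> ~{PER_DRIVE_IOPS[drive_type] * count:,} IOPS")

total = sum(PER_DRIVE_IOPS[t] * n for t, n in drive_counts.items())
print(f"ballpark back-end ceiling: ~{total:,} IOPS")   # ~12,880
```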

      The way I see it, you have two issues:
      -Technical: Fix the performance issue
      -Commercial: Provide customers the extra performance at low cost

      The technical issue can probably be fixed using your Netapp array, but the cost of the solution may not meet the commercial requirement. I would recommend looking at both Netapp and XIV solutions – this will provide a sense-check to ensure that you know whether the XIV solution will actually be more cost effective. The advantage with the Netapp solution is that you’ve already paid for the array itself – everything from now on is just upgrades. With XIV, the cost per TB is lower, but you have the cost of the chassis (and the cost to re-implement a new storage architecture) to take into account.

      Netapp Upgrade
      In the case of the Netapp solution there are a number of areas which can be addressed. If going this path, I would address the areas in order – if the first solution fixes your problem then stop there, if not, move onto the next fix and so on until performance matches that required.

      The FAS 3040 is capable of approximately 20,000 IOPS while still providing a reasonable response time (though this will drop slightly if you are using snapshots). The order of solution would be:

      -Disk Upgrade: The FAS 3040 is capable of holding 252 drives so has plenty of capacity left – an upgrade of around 60-70 x 300GB 15K Fibre Channel drives should provide an additional 10,000 IOPS. I’d recommend switching to RAID 5 where possible. This part of the solution will also involve activating the back-end drive loops (as many as possible) to spread the load and reduce any congestion. This would also give another 12TB of capacity, and as you’re using small FC drives, most of it should be usable, so you can take some of the load off the existing drives, and still have some capacity to expand.

      -Additional Fibre Ports: The additional disks will increase back-end performance, but above around 15,000 IOPS you will need to use the additional 2 x FC ports in the FAS array. If your blade servers have in-built switching (physical or virtual connect) the extra lines can be run directly into the blade chassis – if not, you will need to look at SAN switching to allow fan-out of the connections. Given you’re using two FC lines and direct attaching to the array, I’m assuming a single blade chassis per FAS array? If so, what will you do when you run out of blades and need to add a second chassis? I’m not a fan of direct attaching, much prefer SAN switching for this type of application.

      -Performance Accelerator Modules (PAM): The previous two fixes should have boosted performance, but if you are still seeing performance top out at a lower level than expected, it will probably be that increasing the disk numbers and front end ports has moved the bottleneck to the processing complex. If this is the case, it is possible to boost the available memory by the addition of PAM modules – these add up to 32GB to the existing 8GB of cache and will clear any front-end bottleneck. Rather than adding disk, it is also possible to start the upgrade by adding PAM modules, as these will provide a boost to your existing configuration. The reason I leave them to last is that they are (incredibly) expensive, and so are something I tend to leave until all else has failed.

      But of course, none of this is cheap – your customers are looking at the price of SATA disks as a guide to what you should be charging, but you need to use more expensive, smaller Fibre Channel disks to provide the performance they demand. (You could use SATA, but you’d require twice the number of drives, and would not be able to make use of more than a third of the capacity on each drive.) The issue for you is that, as a hosting provider, the mechanism by which you provide that performance is essentially invisible to the customer, hence the disconnect. In the past I’ve dealt with this by introducing tiered service models – by involving the customer in the process (usually driven by a service “menu”) they get to make their own choice of performance over capacity (or vice versa) – backing this up with a little education on why storage “isn’t just about capacity” may silence some of the flak you’re getting.

      XIV Solution
      With XIV, as per my previous reply, I’d expect that the 27TB unit would be your starting point. As you don’t need NFS or CIFS, the native XIV with FC connection will work for you. 20,000 IOPS is easily achievable with the array itself, but there are a few things to consider in order to get the best out of an XIV solution:

      -Make everything parallel: XIV works best when handling as many streams as possible at once. The fact that you are a hosting provider, and so running a wide number of applications simultaneously, is a benefit, but if your blade chassis doesn’t have internal switching, I’d recommend putting a SAN infrastructure (a couple of small switches – doesn’t need to be massive) underneath your blade servers in order to fan the load out across as many of the array ports as possible.

      -Queue depth: Maximise queue depths wherever your applications make it possible – XIV works best where it can take those queued transactions and spread them across as many modules as possible. Keeping queue depth low works for other arrays, but on XIV it stunts performance (I’ve seen a few performance tests fail due to this, particularly on XIV).

      -Proof of Concept: Get IBM or your business partner to do a PoC for your requirements. Depending on your geography, IBM can either supply eval units, or do a test at a demo centre (IBM XIV-certified partners are required to have a working XIV demo unit – if your supplier doesn’t have one, walk away and find one that does).

      – Buy-back: As your Netapp unit should still be in warranty, IBM may offer a reduced price for the array based on a buy-back of your current arrays, thus offsetting the cost of the re-implementation.

      I can’t promise that you’ll have no problems, only that my own experience and most of the customers I’ve spoken to (not necessarily my customers) have been relatively trouble free – worst issue I’ve found has been a couple of modules failing on installation (seems to be shipping issues). In each case, these were replaced the next day, and provided quite a nice demonstration for the customers of how seamless the re-integration of new capacity actually is.

      Let me know if you have any further questions.
      SG

      • dravene Says:

        Thanks a million for all this information. I was aware of some of the solutions you suggested for improving the NetApp. You really know what you’re talking about! I’ll let you know what we finally decide. I’m currently looking at a solution which would keep our NetApp and add an HP P4000 G2 (formerly LeftHand). This seems to be a good option for us because we wouldn’t have to pay the cost of a migration and we would be able to get access to cheaper storage than NetApp.

      • storagegorilla Says:

        Glad I was of assistance 🙂

        Lefthand seems a bit of a departure from your original requirements. I have a number of customers who are happily on Lefthand for the flexibility of the iSCSI connections (not XIV’s strong suit) but generally, even in SAS configuration they require a number of units distributed across their business to make the solution work – that doesn’t strike me as being the way you’re planning to do things. Are you planning to siphon off unused data onto the Lefthand unit? That I could see being a possibility, but I’d be wary of expecting performance from a single lefthand unit.

        Have to say, in the iSCSI space, my personal preference is the SUN 7410. I find them easier to manage than LeftHand, they have a good set of features (can also do CIFS/NFS and other IP protocols). And best of all, when it gets slow, you can whack in Solid State Disks of various sizes – the unit uses them as cache boosters similar to the PAM concept we discussed in the NetApp arrays (but quite a bit cheaper).

        My first projects with Accenture were around implementing big SUN servers on top of EMC Symmetrix 8300 arrays, so I’ve always had a soft spot for SUN kit – never really liked their storage products though until the 7000 range. It’s a bit of a shame, SUN finally developed a good storage product of their own, just in time to lose it to Oracle.

        Probably not what you (or your HP salesman) wanted to hear :), but if you’ve not signed the deal yet, they’re worth a look for comparison.

        SG

  4. Dror Says:

    Hi,
    As one of the XIV founders I found your article very interesting. You’ve got some very good and correct insights/guesses. It’s very pleasing to see that someone unbiased appreciates the work we’ve done. Since I haven’t been an IBMer for more than a year and a half I can only guess why decisions were made the way they were – but your assumptions are quite logical. About the iSCSI issue: when XIV started we had only iSCSI customers for quite some time, so it really saddened me to hear that the support for it has gone down the drain.
    Cheers,
    Dror

    • storagegorilla Says:

      Dror

      Wow! Seriously, that one of the original Talpiot team thinks I have the right idea, that really means a lot to me – thank you.

      I’m told that at some point iSCSI will be introduced on the smallest array – hopefully this signals that, now they have the FC support developed to a good level, they are going to return and fix iSCSI. Nice to hope, anyway.

      Really good to hear from you.

      SG

  5. Stuie Says:

    Hi SG,

    Just finished reading your blog on the 7 reasons XIV isn’t perfect and found it an excellent informative article. I am management level not technical.

    We are currently in the process of tendering for our storage and are down to comparing IBM XIV and EMC Clariion CX4 and having a hard time deciding where to go.

    Our current storage is around 12Tb and we are using a CX500 in production and replicate to a newer (at DR) CX3-20.

    We are considering the 6 module (27Tb) XIV and with the Clariion we will (haven’t yet) be working with EMC to “tailor” our disk requirements but at a minimum will include FC-SCSI and SATA and possibly ESSD.

    Our goal is 100% virtualisation with VMware vSphere 4, including Citrix terminal servers and MS SQL database servers.

    I can get more information on current workloads and the like (we don’t have everything virtualised currently and have around 7 hosts connected to the array) but we do know that we can cause performance issues when we initiate replication (we use Mirrorview/A). We have finetuned the current environment as much as possible and have largely reached the limit of the storage processors.

    I guess what I’d like to get is some independent view on how the XIV vs. Clariion might stack up for what we are doing and while I know that different disks and spindles etc. on the Clariion can cause different performance results but I note you said that with the XIV the “target IOPS” should be 20,000… what about for a Clariion?

    Happy to provide more info if it helps get us pointed in the right direction?

    Thanks in advance.

    • storagegorilla Says:

      Stuie
      Thanks for your question – depending on your planned purpose, there is a use-case for either of the technologies; I’ve used Clariion and XIV extensively and they are both excellent. A technical recommendation may come down to cost considerations caused by the limitations of things like replication methods, which will change depending on the answers below.

      Couple of questions come to mind immediately:

      Can you let me know the current workload figures and CPU/cache usage if you have them.
      Can you let me know the breakdown of disks (or let me see the config guides if you can)
      How many FC ports are currently in use for host attachment? Do you use a SAN currently?
      Do you use FC or IP ports for replication? If so, how many of each?
      What volume of usable data do you expect to reach in the next 3 years – do you expect to reach over 20TB within a 3 year period?
      What kind of link do you use between sites (e.g. dark fibre, MPLS etc) and what is the site to site distance?

      Do you have any figures for the planned future performance? for example:
      If exchange, do you have user numbers and mailbox sizes? How many blackberry or mobile users?
      If SQL, do you have any estimate of number of users?

      and so on for other applications on the array – I’m not looking for anything too detailed, just enough to calculate a rough total IOPS figure. If you have a number in mind for total IOPS or MB/s planned for the array, that would work in place of looking at individual applications.

      Regards
      SG

      • Stuie Says:

        Hi SG,

        See inline below, which is what my tech guys gave me, hope it’s useful.

        Probably worth mentioning from a business perspective one key driver is the ability to provide snapshots etc. for our developers/testers when doing their application work, the XIV looks to have this “licked” but we know of performance overheads on the EMC?

        Can you let me know the current workload figures and CPU/cache usage if you have them.

        -Cache is 8KB page size. 2GB memory on each storage processor. Read cache is 190MB and Write cache is 1247MB

        -CPU usage averages around 45% – 65% and can peak higher depending on what is happening; during backups and increased use of snapshots it can go higher.

        -I/O workload – We have varying data types all with different I/O patterns, however to average it across all storage it is something like 70% read and 30% writes and for short periods can be 60% read and 40% writes.

        Can you let me know the breakdown of disks (or let me see the config guides if you can)
        [SAB] Without going into too much detail:
        12 x 146GB 15K FC – SQL databases
        5 x 500GB SATA – Email archiving
        4 x 500GB SATA – QA SQL Databases
        5 x 500GB SATA – Low I/O SQL and File systems
        8 x 300GB 15K FC – VMFS datastores for operating systems
        7 x 750GB SATA – Low I/O SQL database, Exchange, File Systems
        4 x 146GB 15K FC – SQL transaction logs

        How many FC ports are currently in use for host attachment? Do you use a SAN currently?
        [SAB] There are currently 8 dual port FC hosts, 4 are SQL servers and 4 are ESX servers, currently use a CX500

        Do you use FC or IP ports for replication? If so, how many of each?
        [SAB] FC ports for replication out of the CX500 connected into a fibre to IP router (Qlogic 6142, 2 ports used). Mirrorview/A is used for replication

        What volume of usable data do you expect to reach in the next 3 years – do you expect to reach over 20TB within a 3 year period?
        [SAB] Not really sure on this one but I would say “yes” there is a good chance based on what we know.

        What kind of link do you use between sites (e.g. dark fibre, MPLS etc) and what is the site to site distance?
        [SAB] Point to Point Ethernet

        Do you have any figures for the planned future performance? for example:
        If exchange, do you have user numbers and mailbox sizes?[SAB] About 400 max. How many blackberry or mobile users?[SAB] Few… about 30
        If SQL, do you have any estimate of number of users?[SAB] about 400 max

        and so on for other applications on the array – I’m not looking for anything too detailed, just enough to calculate a rough total IOPS figure. If you have a number in mind for total IOPS or MB/s planned for the array, that would work in place of looking at individual applications.
        [SAB] After looking at performance stats out of the CX500 we average 1000-2000 IOPS during business hours, and 2000 – 5000 IOPS during backups. MB/s averages out at 50-100 MB/s during business hours and up to 180MB/s during backups.

        Thanks

  6. storagegorilla Says:

    Hi Stuie

    I’ve been taking a look at the information you sent – as expected, both vendors have technology to meet your requirements. In the end it will probably come down to a cost/benefit decision rather than a straight technology yes/no.

    I’ve produced a few models to look at the best solution – based on what you’ve given me, the metrics are:

    – 20000-25000 IOPS Overall Storage Requirement
    – Minimum 20TB Usable Storage Requirement
    – Mix of FC, SATA (possibly SSD) on the EMC
    – Asynchronous replication required between sites
    – Use of Snapshots for data protection

    Performance:
    Based on a minimum performance requirement of 20,000 IOPS, you’re at the performance limits of a CX4-240, particularly when using Mirrorview/A and/or Snapshots. If performance is to go above this point, I’d look at choosing a CX4-480 instead of a 240. The CX4-480 is around 25-30% more expensive for the same disk configuration, but should take you beyond 30,000 IOPS easily.

    I design XIV in the 6 module configuration for 20,000 IOPS, but I have seen it go to 25,000 IOPS and beyond while maintaining low enough latency to keep applications functioning well. Asynchronous replication and snapshots do not have the effect on an XIV that they do on other arrays (Netapp excluded) as the XIV implementation of Snapshots uses a less performance-impacting method (Clariion and others such as HDS must migrate the original data when writing new to maintain the snapshot – XIV writes the new data while leaving the old in place).

    Flexibility:
    At the low end, the CX range is more flexible than the XIV on the number of ports available. The full XIV gets 24 FC ports and 6 iSCSI ports; however, the 6 module version gets 8 x 4Gb/s and no iSCSI ports. If performing full bi-directional replication, 4 of the available 8 ports need to be reserved for replication (2 per direction), leaving 4 x 4Gb/s for host traffic. For this reason direct attachment of the XIV is not recommended – you will need to get a Fibre Channel SAN.

    By contrast, the CX4-240 comes with 4 x 4Gb/s FC and 4 x 1Gb/s IP, and can be upgraded to a maximum of 12 x 4Gb/s FC and 8 x 1Gb/s IP (or 4 x 1Gb/s and 4 x 10Gb/s). While the 4 x 4Gb/s is obviously too low to do FC-based replication and host access at the same time, the maximum configuration will take you to a connectivity that you would need a 9 or 12 module XIV to achieve (it’s all in making the right choice when you buy the array in the first place).

    Capacity:
    In the model I produced, reaching 20,000 IOPS would require approximately 100 x 15K FC drives and 25 SATA drives (the performance is split 18,000 IOPS on FC and 2,000 IOPS on SATA, as I’d normally look to SATA for low-performance applications only). This would provide a performant solution of around 30-32TB of usable capacity. That level of capacity comes whether you need it or not, because of the number of spindles required to reach the IOPS target – with Clariion, once the processor is removed as a bottleneck, the disk numbers directly determine the performance of the array. As a result, you would need the same minimum number of disks whether a CX4-240 or CX4-480 is used, though of course the eventual maximum would be greater with the CX4-480.
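
    The 100 FC / 25 SATA figures fall straight out of the same rule-of-thumb spindle numbers I used earlier – my assumptions, not EMC sizing data:

```python
import math

# Same rule-of-thumb spindle figures as earlier - my assumptions, not EMC data:
# ~180 IOPS per 15K FC drive, ~80 IOPS per large SATA drive on the low tier.
FC_IOPS, SATA_IOPS = 180, 80
fc_target, sata_target = 18_000, 2_000    # the FC/SATA split described above

print(math.ceil(fc_target / FC_IOPS))      # -> 100 x 300GB 15K FC drives
print(math.ceil(sata_target / SATA_IOPS))  # -> 25 x 1TB SATA drives
```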

    The XIV does not provide a choice in this matter – the 6 module array provides 27TB of usable capacity and 20-25,000 IOPS usable performance. The next stage would be a 9 module array, capable of around 30,000 IOPS (which would give you iSCSI and some extra FC ports).

    Management:
    XIV is better to manage – nothing I’ve worked with gives the level of visibility of what I’m managing or makes provisioning so straight-forward.

    Replication:
    As said above, using Mirrorview/A on the Clariion will affect performance, as the same processes underlying Snapshots are in effect during asynch replication, with the same performance impacts. To boost performance either a) get a bigger controller (480) or b) switch off both replication and snapshots.

    Switching to EMC Recoverpoint is more expensive than Mirrorview/A, but may allow you to use a CX4-240 rather than a 480, which might offset the extra expense. Recoverpoint does most of the asynch processing on-board, allowing the array to function more effectively. Of course, still using snapshots means the performance gain of RP is lost, so you would either have to look at using Recoverpoint’s Local CDP function, or just buy the larger array after all.

    XIV provides asynch replication with a lower performance impact (it’s not completely transparent, but it’s not the heavy impact you get with the Clariion or similar). The main stumbling block with the XIV is the lack of IP-based replication at present (obviously the 6 module box cannot as it has no IP ports, but even with the 15 module box, iSCSI replication is not yet recommended). This normally causes an issue as customers have to buy an FC to IP conversion infrastructure. In your case this won’t apply, as you already have that infrastructure and would use it for either the EMC or XIV arrays.

    Snapshots:
    As per your understanding, snapshots on the XIV are effectively free of the performance impact that the CX will suffer. The Copy on First Write snapshots used in Clariion and other arrays will effectively lose around 20% of the array’s performance – the Redirect on First Write used by the XIV does not.
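
    If it helps to see why the two approaches behave so differently under write load, here’s a toy model of the bookkeeping – a simplification of the two general techniques, not either vendor’s actual implementation:

```python
# Toy illustration of why the two snapshot styles behave differently under
# write load - not any vendor's implementation, just the basic bookkeeping.

class CopyOnFirstWrite:
    """Preserve the old block in a snap area before overwriting in place."""
    def __init__(self, volume):
        self.volume = dict(volume)      # live data, updated in place
        self.snap_area = {}             # old blocks copied out on first write
        self.backend_ios = 0

    def write(self, block, data):
        if block not in self.snap_area:
            self.backend_ios += 1                     # read the old block...
            self.snap_area[block] = self.volume[block]
            self.backend_ios += 1                     # ...copy it to the snap area
        self.volume[block] = data
        self.backend_ios += 1                         # then write the new data

class RedirectOnWrite:
    """Leave the old block alone, write the new data somewhere else."""
    def __init__(self, volume):
        self.blocks = dict(volume)      # the snapshot keeps pointing at these
        self.live = dict(volume)        # live view, redirected block by block
        self.backend_ios = 0

    def write(self, block, data):
        self.live[block] = data         # new location; old block untouched
        self.backend_ios += 1

base = {i: b"old" for i in range(100)}
cofw, row = CopyOnFirstWrite(base), RedirectOnWrite(base)
for i in range(100):                    # overwrite every block once
    cofw.write(i, b"new")
    row.write(i, b"new")
print(cofw.backend_ios, row.backend_ios)   # 300 vs 100 back-end I/Os
```

    Three back-end operations per first overwrite versus one – that difference is what the CX has to absorb and the XIV doesn’t.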

    Services:
    My own experience of services provided is that EMC provides a more flexible offering than IBM (though not the XIV unit, who are semi-independent in the UK, and highly proactive). One of my next articles will be on my own demonstration XIV which experienced a full module failure a few weeks ago. The array lost a full tray of disks, 6TB usable data and half its connectivity, then rebuilt itself and carried on without any of my users noticing a thing. First I knew was when I logged in in the evening to check the array – the users were running test at the time of failure and it didn’t even hitch.

    So the technology was perfect – I was less happy with IBM’s call centre service when I tried to get someone in to look at the array. Issue was solved by the XIV-specific team in the UK, but I wasn’t massively impressed with the attitude of the muppet on the phone when I logged the support call.

    Overall, the solutions I have looked at will provide your requirements – my model was:
    Clariion CX4-240 / 480 (recommended)
    Minimum 12 x 4Gb/s FC Ports
    100 x 300GB FC Disks
    25 x 1TB SATA
    Mirrorview/Asynchronous
    Snapview

    Will give equivalent performance, slightly greater usable capacity and greater connectivity to the IBM XIV configuration:
    6 Module XIV
    8 x 4Gb/s FC Ports
    72 x SATA Disks
    Full Volume Copy
    Snapshot
    Synchronous/Asynchronous Replication

    Options:
    I looked at 400GB Enterprise Flash Drives for the Clariion – these provided a large performance kick (the performance of 40 – 50 of the FC disks could be replaced by a RAID Group containing 5 x EFD drives) – this however only gave 1.4 TB of capacity vs the 5TB of performant storage available from the FC disks, and actually increased the purchase price of the Clariion.

    Recoverpoint would replace Mirrorview/A and Snapview and would allow a CX4-240 to be used instead of a CX4-480, but has a cost of infrastructure of its own, and must be implemented.

    Hope this helps – let me know if you have any questions.

    Regards
    SG

    • Stuie Says:

      Hi SG,

      Thanks for this, really fantastic and provides us with great information. One thing is that one of my team just “bowled a googly” at me and said… what about NetApp? It seems to offer a very good middle ground between the two of these? Do you have much experience in this regard?

      Thanks again

      • storagegorilla Says:

        Hey Stuie, glad I can help

        Interesting you characterise NetApp as being middle ground, as that’s where I would put it too 🙂

        The NetApp FAS is the best NAS array on the market, with EMC’s Celerra a reasonable second. Where I’m mainly looking for a large volume of file serving, with possibly a small amount of block storage, I’d go for the NetApp. The problem with doing a primarily block storage solution on NetApp is that, as it was originally designed as a file server, even the block volumes have to sit on an underlying (WAFL) file system. WAFL has a 4K fixed block size, which is fine for dealing with files, but is less efficient at block storage (I prefer to design SANs for storage which can handle an average of 64K blocks). It’s possible to tweak the NetApp arrays to give a good benchmark, but I find that, real world, they’re just not as fast for Exchange, SQL etc.

        Your requirement appears to be more an application block storage requirement, so I saw the Clariion and XIV as being better choices (incidentally if you wanted to look at the Hitachi AMS, consider everything I’ve said about Clariion as being valid for AMS as they use essentially the same architecture).

        For your requirements, if you wanted to do NAS file serving off either the EMC or IBM platforms, currently the options are:

        EMC – the EMC Celerra (what EMC calls “Unified Storage”) is essentially a Clariion CX4 array with a Celerra NAS gateway on top. You can do file serving by IP attachment to the Celerra, and block-level storage by fibre or iSCSI attachment to the underlying Clariion.

        IBM XIV – similarly, IBM offer the option of attaching an N-Series gateway to the XIV. Even though the N-Series is rebadged NetApp, it’s a gateway unit, so you can do file serving by IP attachment to the N-Series, and FC/iSCSI to the XIV itself. This gives the benefit of the NetApp for file, without the issues of block-attachment.

        IBM may release their own in-house developed NAS implementation for XIV in the future, but at present, those are the two options.

        Many thanks
        SG

  7. Stuie Says:

    Thanks again SG,

    We had a really good discussion with an IBM guy today and covered off the N series and N series gateway, you are right our requirement seems to suit the block type storage more than file (nfs).

    We have one main challenge left and that is being able to snapshot a database (when contained in a VM) to present to a different server. Can be done using raw mapping but our challenge is being able to present it to a server in a different network/subnet for our developers to troubleshoot problems if we end up with not using (as we do now) raw mapping.

    I will of course let you and all know what we decide.

    Cheers
    Stuie

    • storagegorilla Says:

      Good stuff – always glad to know when IBM agrees with me!

      Happens less than you’d think……… 🙂

      I’m not an Oracle expert, but if I understand your requirements, there are a couple of options for snapshots:

      If you have a single VMDK in a volume, the entire volume can be snapped and presented to the other server – as long as it is attached to the SAN, the other server can be on a different subnet (at least with XIV, which doesn’t require an IP control path). The XIV local snapshots are writable, which makes things easier.

      If you have multiple VMs in a volume, VCB or a 3rd party VM-snap product would perhaps be a better option.

      Option 3 depends on the type of database. Oracle has its own snapshot capability, but it’s pretty new and as far as I’m aware, has no integration yet to XIV or Clariion. SQL can be hooked into VSS – both Clariion and XIV have VSS providers.

      That what you’re after?
      SG

      • Stuie Says:

        Hi again SG,

        Well we have made a decision and taking all things into account from a pure “bang for bucks” and significant discounts etc. we have gone with the XIV. We are now planning for and looking forward to delivery and implementation.

        Thanks for your advice, it certainly helped us in terms of another “view” and not just our own.

        Fundamentally I believe that we thought both could meet our needs, but for the price – the XIV was as you have said able to do many things well.

      • storagegorilla Says:

        Hey Stuie

        Good to hear from you! I’m glad I helped – this was a difficult one as, for your requirements, I’d consider these the two best arrays on the market.

        I wouldn’t say EMC would have been a wrong answer, but the XIV was definitely a right one 🙂

        Let me know how you get on.

        Take Care
        SG

  8. June Q&A « The Storage Gorilla Says:

    […] Yup, XIV has problems.  Whether you think they are insurmountable will very much depend on your need for the product (if […]

  9. Stuie Says:

    Hey SG,

    Just thought I’d drop a note to say that we have commenced the migration process and so far so good. We have seen some excellent performance improvements, and the admins’ feedback on the management interface has been very positive. I will post again once we have migrated and provide some more information. Also worth noting that so far IBM have been excellent with regard to the project – nothing seems to be too much trouble to get a good result.

    • storagegorilla Says:

      Hey Stuie

      Apologies for not posting this sooner – between work, family and vacation I’ve not been on here recently (this will hopefully change for the better now I’m back).

      I’m glad it’s all going well – given your comment was a while ago, have you completed the migration?

      Thanks
      SG

  10. RK Says:

    Hey SG!

    Great info… I loved the unbiased analysis. I’m tired of the FUD about DDF. I read the blog from IBM about XIV which agrees with much of what you said:

    https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/entry/ddf-debunked-xiv-two-years-later?lang=en

    Help me with the math, though: how did you get the 526GB of estimated data loss for a DDF on 4+1 with 146GB HDDs, and how did you arrive at the IBM-matching 9GB of data loss for a DDF on XIV? That was awesome. I’m just trying to understand the math and can’t seem to figure it out… maybe I’m a bit slow. 🙂

    Thanks!
    RK

    • storagegorilla Says:

      Hey RK

      Glad you liked the post – are you a user of XIV? It’s been interesting watching the level of acceptance of XIV grow over the past year – the first year was a bit like banging your head against a wall, with the same arguments (“too good to be true”, “but HDS/EMC/another competitor says…”) over and over. This year has been more about selling an accepted product – and, now that IBM has made grid storage into a “cool” product, watching the other vendors struggle to emulate it (the 3PAR and Isilon acquisitions, etc.). I’ll be interested to see how HDS deals with this – they seem to be the only vendor not at least making inroads into this space.

      The math on the 4+1 RAID group assumes that all disks are full, so the loss of 2 disks means 100% loss of the data in the RAID group. In RAID5 (4+1), effectively 4 disks hold live data, with one as parity (not literally true, I know, as parity is actually distributed across all 5 disks, but if you separated the parity sectors out, you would have the equivalent of one disk’s worth of parity and four of live data). 4 disks of 146GB provide 584GB of raw space – adjusted for formatting, this is approximately 526GB of usable space – and as the parity holds no additional user data, 526GB is the data that can be lost. An approximation, I know, but as close as you can get.

      With XIV it’s a bit more complicated, as you have to make assumptions about where the pseudo-random distribution will place the data. We start by assuming that the data to be lost has one copy on an interface module and one on a data module (as losing two data modules, or two interface modules, together would not result in data loss). This means that if a disk is lost in an interface module, then immediately afterwards a disk is lost in a data module (the highly unlikely part), there are only 108 disks – the 9 data modules of 12 disks each – where that second failure can cause loss. Assuming that all disks are 100% full and that the pseudo-random distribution has spread the second copies of the interface disk’s data evenly across those 108 disks, the second failed disk holds roughly 1/108th of the failed 1TB disk’s data, so the data lost would be about 9.25GB.
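      If it helps, here’s the same arithmetic as a quick script – a rough sketch only, and the formatted capacity (131.5GB) and drive size (1TB) are my assumptions rather than anything the array reports:

# Rough sketch of the data-loss arithmetic above.
# Assumptions: 146GB FC drives format to ~131.5GB usable; a full 15-module
# XIV rack with 1TB drives, 6 interface modules and 9 data modules of 12
# disks each.

# RAID5 (4+1): losing any 2 of the 5 disks loses the whole group.
data_disks = 4
usable_per_disk_gb = 131.5                  # 146GB raw, adjusted for formatting
raid5_loss_gb = data_disks * usable_per_disk_gb
print("RAID5 4+1 worst-case loss: ~%.0fGB" % raid5_loss_gb)        # ~526GB

# XIV: a disk fails in an interface module, then a disk fails in a data
# module before the rebuild completes. The mirror copies of the first disk's
# data are spread pseudo-randomly across the 108 data-module disks, so only
# ~1/108th of that disk has its second copy on the second failed disk.
xiv_disk_gb = 1000.0                        # 1TB drives (assumption)
data_module_disks = 9 * 12                  # 108 disks across the data modules
xiv_loss_gb = xiv_disk_gb / data_module_disks
print("XIV double-drive worst-case loss: ~%.2fGB" % xiv_loss_gb)   # ~9.26GB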

  11. Jim Says:

    SG,
    There are two major problems with the XIV and large OLAP workloads (known from experience).

    1. The XIV does not handle OLAP workloads as efficiently as a DS8300 does, but the XIV is positioned by IBM as Gold tier storage. The average IOPS possible with the current workload is 31,000 at 13ms, as opposed to a previous average of 43,000 at 4ms.

    2. Snapshots on the XIV are very bulky (a current snapshot of a 14TB database consumes 3TB of space). The fact that everything inside the XIV is based on a 1MB block size and a 17GB extent size causes very bloated volume sizes.

    The overall performance of the XIV is not bad, and for use as an Exchange or VMware storage device the XIV will work fine, but for a production SAP environment it does not make a good candidate.

    A redesign of the block size and the extent provisioning would bring the XIV into a better category, but for the price per terabyte it is very acceptable for SMB production or large enterprise DR/lab/dev systems.

    • storagegorilla Says:

      Hey Jim

      Thanks for this – it’s really good to get informed opinion on any areas where products fall down. Do you mind if I add the section on OLAP performance to the “issues and fixes” section? Even though there isn’t a fix for it, it’s good to highlight this as it may help someone making a decision (or may spur IBM off their butts to rectify it).

      Have IBM looked at the environment to understand where the performance limitation is coming from? At 43,000 IOPS it sounds like a monster of a data warehouse you’re running, so it would be useful to know – the closest I’ve seen is an SAP environment where the total requirement was 60,000 IOPS, but BI was only part of that load, so it’s not really a good comparison.

      On the snapshot side, 3TB isn’t that bad – that’s about 21.4% of the data size. Since the usual rule of thumb is 20%, varying with how long the snaps are kept, you’re pretty much on the money (though if you tell me you’re making 2 snaps a day and only holding them for 24 hours, I might change my opinion :). My main concern with snaps on the XIV is the requirement to reserve space for them – if you’re consuming 3TB, are you having to reserve 5TB? That’s wasteful, and I’d like to see some form of dynamic management of this space appear at some point.
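      As a back-of-the-envelope check on those numbers (a sketch only – the 20% rule of thumb is just a guide, and the 5TB reserve is the hypothetical figure from my question above, not something reported by the array):

# Back-of-the-envelope snapshot sizing, using the figures in the thread.
db_size_tb = 14.0      # source database size
snap_used_tb = 3.0     # observed snapshot consumption
rule_of_thumb = 0.20   # ~20% of source size per snapshot, as a rough guide

print("Observed snapshot overhead: %.1f%% of the source" % (100 * snap_used_tb / db_size_tb))  # ~21.4%
print("Rule-of-thumb estimate: %.1fTB" % (db_size_tb * rule_of_thumb))                         # 2.8TB

# The pain point is the reserve: whatever you pre-allocate beyond what the
# snapshots actually consume is simply idle capacity.
reserve_tb = 5.0       # hypothetical reserve
print("Idle reserve: %.1fTB" % (reserve_tb - snap_used_tb))                                    # 2.0TB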

      Generally I’m seeing that the customers I have on XIV are still happy with the technology, even up to large volumes and performance rates (a mix of VMware, Windows, AIX and SAP users). Their main issue seems to be that, as XIV becomes more accepted within IBM, the organisation is becoming more like IBM and is picking up the worst of those characteristics. I’ve been meaning to post for a while on my own experience last year when a tray on my own demo box failed – the technology performed flawlessly; the IBM call-centre response, not so much.

      Regards
      SG

      • Jim Says:

        To your first question, about using the performance issue – I have no problem with that, and you can add a fix as well. The performance issue has now been identified as an Oracle/SAP issue, caused by a statistics job that is not tuned correctly (another long story). The point of the IOPS issue is that the XIV is not a “Gold” tier storage device; the DS8300 was enough of a beast to hide the issue (this is an SAP R/3 database – our BW database is 20TB and has not experienced any performance issues to this point).

        The real issue with the 3TB snapshot size is the fact that there is not 3TB of changed data within this database. Tracked changes accumulate to ~1TB per day, so with a snapshot growth rate of 3TB there is 2TB of wasted overhead. IBM recommends allowing sufficient growth space for the snapshots, and best practice is to allocate the same amount of snapshot space as the volume being snapshotted until the growth patterns can be determined. To date I have been the lucky candidate to undersize the volume and snapshot space and find that the snapshot(s) get deleted – of course this was before this behaviour was documented (similar to the way the documentation changed for how the FC adapters should be configured on the back of the XIV, as in your fix section). Our current snapshot space consumes 15TB (1 snapshot x 5 days x 3TB), so space is used very quickly (good for XIV sales).

        I agree that as XIV support has matured it has taken on the bad characteristics of the rest of IBM storage support, with the exception of the N Series support.

        I will give the XIV team credit for being very proactive – something the DS Storage team is not.

