During his opening keynote at Oracle OpenWorld 2012, Larry Ellison launched the new Exadata X3. The new version appears to have some nice new capabilities, including caching writes to EFD, which are likely to improve the usability of Exadata for OLTP workloads. And he was nice enough to include the EMC Symmetrix VMAX 40K in detail on 30% of his slides as he announced the new Exadata. And for that, I give thanks. I am sure that Salesforce.com were similarly thankful when Larry focused so much of his time on their product in his keynote last year.
For those who have seen the presentation, Larry’s focus with respect to the VMAX was how much faster the new Exadata system was (which you can find starting about 31 minutes into his keynote). And I am sure that message got through. However, he also seems to have delivered another message that he may not have intended: if you have a serious Oracle workload, your only real option other than Exadata is VMAX. He even said that “EMC announced the latest and greatest of all the disk arrays.” I am not sure we could ask for much more.
But Larry is Competing with VMAX
Some will note that since all of Larry’s references were there to point out how much faster the new Exadata is, EMC should be concerned. So why am I so thankful? First, Larry is well known (as most vendors are) for putting up big numbers. But with Exadata, the level of exaggeration has its own reputation in the market. Since it is not possible to run a normal I/O test against an Exadata, the tests all have to be done with an Oracle database. And how that database is tuned makes all the difference in the world as to whether any customer can ever get the same results. The VMAX numbers, on the other hand, are pretty simple to recreate. Connect up some servers. Have them all do large-block sequential reads. Measure the results. After all, that is what EMC does to build the numbers in the first place.
And when replication is turned on, the EMC numbers are primarily limited by the bandwidth of the connections and the latency between the sites. Judging by the feedback from customers at the OOW sessions this week, it sounds like they are seeing a much larger impact on their Exadata performance with Data Guard enabled than they had been led to expect by their Oracle sales teams.
Customers are looking for a solution, not just great performance. While the performance numbers are being evaluated, customers will look at the rest of what they need. In general, high-end customers are looking for:
- Data integrity
- Availability
- Performance
- Cost
- Ease of implementation/use
And the priorities run roughly in that order. The business needs a certain level of data integrity and system availability. Without those, performance is meaningless. And unless the performance needs are also met, it really doesn’t matter what the price is. Once the first three have been met, the business looks at how to make the solution as cost-effective as possible, to get the best return on the investment. And finally, anything that makes the implementation or operation easier to handle is useful (and may have a cost trade-off).
Data Integrity
The data integrity side is pretty easy – either the system consistently produces the right results, or it doesn’t. And when it doesn’t, customers go elsewhere. EMC Symmetrix systems have a long history of excellent data integrity. In addition to all of the other safeguards taken by most systems, a checksum for every block is kept in cache at all times. When data is read from disk, the checksum is used to verify that the data has not changed, no matter which RAID protection may be used on disk. This sort of paranoia about data integrity has paid off for customers, and their loyalty pays off for EMC.
Exadata offers the ability to use triple mirrors (yes, 3 copies of every byte of data, on separate disks) for the best protection. Given the failure modes, this is a reasonable way to protect most business-critical information. The copies are not compared with each other on a read that does not report an error, but the drive seek failure rates are low enough these days that the risk of those errors in a given system in a year is fairly small. However, some customers have reported that using HCC to reduce the size of their data has resulted in ‘unexpected’ answers to some queries. I am sure Oracle has fixed those issues in the new release.
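To make the verify-on-read idea concrete, here is a minimal sketch in Python. It is not how a Symmetrix implements it: the `ChecksummedStore` class, its two dictionaries, and the use of CRC32 are all assumptions chosen for illustration. The only point is that the checksum lives apart from the data, so a block that changes underneath the system is caught when it is read back.

```python
import zlib


class ChecksummedStore:
    """Toy block store (hypothetical) that keeps a checksum apart from the data."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes; stands in for the disks
        self.checksums = {}  # block id -> CRC32; stands in for metadata held in cache

    def write(self, block_id, data):
        # Store the data and record its checksum separately.
        self.blocks[block_id] = bytes(data)
        self.checksums[block_id] = zlib.crc32(data)

    def read(self, block_id):
        # Recompute the checksum on every read and compare it to the stored one.
        data = self.blocks[block_id]
        if zlib.crc32(data) != self.checksums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data
```

In this sketch a read of an untouched block returns the data, while a block that was altered behind the store's back raises an error instead of quietly returning the wrong answer.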
General Availability
Customers are looking for a system that can stay up 24x7. They want to offload the backup workload from their production systems, preventing that workload from being a drag on production. They want to quickly make copies for development and testing, with the ability to isolate these from disrupting production. These are table stakes for high-end system operations.
With Exadata, no storage-based copies are possible. So there are no clones or snaps. Larry announced that Oracle 12c will include the ability to present point-in-time images of a database. But that will still be within the same database, and sharing all of the same resources. Customers will need to decide if that rises to the level of what they expect from even the most basic storage systems.
Customers are also looking for more. They want their systems to minimize or eliminate downtime for planned and unplanned events, even those that take out an entire data center. Eliminating downtime may include options like using Oracle RAC on Extended Clusters. EMC makes this easy with VPLEX, eliminating the need for ASM mirroring between the systems and supporting distances of up to 100 km, which gives customers options on how far apart these systems need to be to meet their availability needs. The only Exadata mirroring option (with RAC support) is between Exadata racks over InfiniBand, with a distance limitation of about 100 m – that will not even get to the next building.
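For a rough feel of what those distances mean for a synchronously mirrored write, here is a back-of-the-envelope sketch in Python. It assumes light travels through fiber at about 200,000 km per second and ignores switch, protocol, and array overhead, as well as the fact that the cable run is usually longer than the straight-line distance. The function name and constant are mine, not anything from EMC or Oracle.

```python
# Assumption: signal speed in optical fiber is roughly 200,000 km/s,
# about two thirds of the speed of light in a vacuum.
FIBER_KM_PER_SECOND = 200_000.0


def round_trip_ms(distance_km):
    """Propagation delay only, out and back, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1000.0


for km in (0.1, 10, 100):
    print(f"{km:>6} km -> {round_trip_ms(km):.3f} ms added per round trip")
```

At 100 km the propagation delay alone comes to about 1 ms per round trip, while at 100 m it is about a microsecond. So the distance a customer chooses is a trade-off between how large a failure they want to survive and how much latency their writes can tolerate.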
Disaster Recovery
For disaster recovery, the real challenge comes from the dependent nature of the complex systems that most customers have. Data comes from many sources, internally and externally. There are many different internal systems that feed each other information, and in many cases some of the data that goes out of one system will go through a series of others and come back with updates or new information attached. So the biggest challenge in DR becomes not recovering the data for a single system, but having these dependent systems recovered to the same point in time. And if you cannot be sure which systems are dependent, then they all need to be recovered to the same point in time.
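A small sketch in Python shows why this matters. It assumes each system is replicated independently, that each can be rolled back to any earlier moment, and that times are plain seconds on a shared clock. The system names and function names are made up for illustration.

```python
def common_recovery_point(last_recoverable):
    """Latest time to which every listed system can be recovered.

    last_recoverable maps a (hypothetical) system name to the most recent
    time, in seconds, for which that system has a usable remote copy.
    """
    if not last_recoverable:
        raise ValueError("no systems given")
    return min(last_recoverable.values())


def work_rolled_back(last_recoverable):
    """Seconds of replicated work each system gives up to match the others."""
    point = common_recovery_point(last_recoverable)
    return {name: t - point for name, t in last_recoverable.items()}


# Hypothetical example: three dependent systems, replicated separately.
copies = {"orders": 1000, "billing": 970, "warehouse": 995}
print(common_recovery_point(copies))
print(work_rolled_back(copies))
```

Under these assumptions, the whole group can only be recovered to the point reached by the system that is furthest behind, and every other system has to discard work to match it. Replication that keeps all of the systems at one consistent point avoids having to work that out after the disaster.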
Symmetrix VMAX systems offer the Symmetrix Remote Data Facility (SRDF), which provides synchronous and/or asynchronous copies of data to remote locations. This works for Oracle data from multiple databases, and for SQL Server and any other database, and for flat files, and for Mainframe and iSeries – it just works. And for our very large customers where one array is not enough, SRDF Multi-Session Consistency (MSC) allows the same consistency to scale across multiple arrays (source and target), a solution that no other vendor currently offers.
Oracle Data Guard does a fine job of replicating Oracle databases. It does not currently offer a method of providing consistent point-in-time recovery across multiple databases, though some of the container enhancements planned for the future 12c release may help. But Oracle has no plans to offer replication consistency with other databases, or with flat files, or…. They do a fine job with Oracle databases. And if those were all the customer had, it might be enough.
Customer Choices
In the end, customers will continue to make their choices. Which systems best let them meet their data integrity, availability, and performance goals, with the best return on investment, and in the simplest way? The past record makes it pretty clear that most large Oracle customers have been choosing VMAX over Exadata. Will a faster Exadata change that? Maybe, but it does not change what customers need.