Introduction to Oracle Exadata

The Oracle Exadata Database Machine is a computing platform optimized for running Oracle Database. Exadata is a combined hardware and software platform that includes scale-out Intel x86-64 compute and storage servers, RoCE or InfiniBand networking, persistent memory, NVMe flash, and specialized software.

Exadata was introduced in 2008, and, since October 2015, is available either as an on-premises product or via the Oracle Cloud as a subscription service, known as the Exadata Cloud Service. Oracle databases deployed in the Exadata Cloud Service are 100% compatible with databases deployed on Exadata on-premises, which enables customers to transition to the Oracle Cloud with no application changes. Oracle Corporation manages this service, including hardware, network, Linux software and Exadata software, while customers have complete ownership of their databases.

Exadata is designed to run Oracle Database workloads, such as an OLTP application running simultaneously with Analytics processing. Historically, specialized database computing platforms were designed for a particular workload, such as Data Warehousing, and were poor or unusable for other workloads, such as OLTP. Exadata allows mixed workloads to share system resources fairly, with resource management features that allow prioritized allocation, such as always favoring workloads servicing interactive users over reporting and batch, even if they are accessing the same data. Long-running requests, characteristic of Data Warehouses, reports, batch jobs and Analytics, are reputed to run many times faster than on a conventional, non-Exadata database server.

The Exadata Database Machine uses a scale-out architecture for both database servers and storage servers. As workloads grow, database CPUs, storage, and networking can be added to an Exadata Database Machine to scale without bottlenecks. The architecture expands from small to extremely large configurations to accommodate workloads of any size.

A brand new high-bandwidth low-latency 100 Gb/sec RDMA over Converged Ethernet (RoCE) Network Fabric connects all the components inside an Exadata Database Machine. Specialized database networking protocols deliver much lower latency and higher bandwidth than is possible with generic communication protocols for faster response time for OLTP operations and higher throughput for analytic workloads. External connectivity to the Exadata Database Machine is via standard 10 Gb/sec or 25 Gb/sec Ethernet. 

The Exadata X8M-2 Database Machine uses powerful database servers, each with two 24-core x86 processors and 384 GB of memory (expandable up to 1.5 TB). Exadata also uses scale-out, intelligent storage servers available in two configurations – High Capacity (HC) or Extreme Flash (EF). HC Storage Servers have four NVMe PCI Flash cards each with 6.4 TB (raw) Exadata Smart Flash Cache and twelve 14 TB 7,200 RPM disks. EF Storage Servers have an all-flash configuration with eight NVMe PCI Flash drives, each with 6.4 TB (raw) storage capacity. Exadata X8M HC and EF Storage now include persistent memory, further boosting capacity and performance. HC and EF Servers receive twelve 128 GB Intel® Optane™ DC Persistent Memory modules as a new tier between DRAM and flash. Exadata combines persistent memory with innovative RDMA algorithms that bypass the network and I/O stack, eliminating expensive CPU interrupts and context switches, reducing latency by 10x, from 200µs to less than 19µs. 

The minimal configuration of an Exadata Database Machine consists of two database servers and three storage servers, which can be expanded into elastic configurations adding more database and/or storage servers within the same rack. Elastic configurations provide a flexible and efficient mechanism to meet any size business need.

Shared Persistent Memory Acceleration

New with Exadata X8M, storage servers include persistent memory (PMEM) data and commit accelerators in front of flash cache, enabling orders of magnitude lower latency when accessing remotely stored data. Persistent memory is a new silicon technology, adding a distinct storage tier of performance, capacity, and price between DRAM and Flash. As the persistent memory is physically present on the memory bus of the storage server, reads perform at memory speed, much faster than flash. Writes are persistent, surviving power cycles, unlike DRAM. By utilizing RDMA to access persistent memory remotely, Exadata Smart PMEM Cache is able to bypass the network, I/O software, interrupts and context switches, achieving more than 10x lower latency than previous Exadata generations, down to less than 19 microseconds. Exadata System Software also ensures data is mirrored across storage servers, which provides additional fault-tolerance. Exadata’s unique end-to-end integration between Oracle Database and Exadata Storage automatically identifies the hottest data blocks to store, while ensuring that the database, persistent memory, and flash cache do not hold the same block multiple times, which increases efficiency across the storage tiers. Adding persistent memory to the storage tier means the aggregate performance of this new cache tier can be dynamically used by any database on any server. This is a significant advantage over general-purpose storage architectures, which preclude sharing across servers.
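The idea that a block is held in only one cache tier at a time can be illustrated with a small sketch. This is a toy model, not Exadata's actual algorithm: the class name `ExclusiveTiers`, the LRU eviction policy, and the slot counts are all assumptions made for illustration.

```python
from collections import OrderedDict


class ExclusiveTiers:
    """Toy two-tier cache in which a block lives in at most one tier.

    Hypothetical model: 'pmem' is the small, fast tier and 'flash' the
    larger, slower one. LRU order stands in for "hottest blocks".
    """

    def __init__(self, pmem_slots, flash_slots):
        self.pmem = OrderedDict()
        self.flash = OrderedDict()
        self.pmem_slots = pmem_slots
        self.flash_slots = flash_slots

    def read(self, block_id, load_from_disk):
        """Return (data, source) and promote the block to the pmem tier."""
        if block_id in self.pmem:
            self.pmem.move_to_end(block_id)  # mark as most recently used
            return self.pmem[block_id], "pmem"
        if block_id in self.flash:
            # Remove from flash so the block is never held in both tiers.
            data = self.flash.pop(block_id)
            source = "flash"
        else:
            data = load_from_disk(block_id)
            source = "disk"
        self._insert_pmem(block_id, data)
        return data, source

    def _insert_pmem(self, block_id, data):
        self.pmem[block_id] = data
        if len(self.pmem) > self.pmem_slots:
            # Demote the coldest pmem block to flash instead of dropping it.
            victim, victim_data = self.pmem.popitem(last=False)
            self.flash[victim] = victim_data
            if len(self.flash) > self.flash_slots:
                self.flash.popitem(last=False)  # coldest flash block falls out
```

Reading blocks 1, 2, 3 and then 1 again with `pmem_slots=2` demotes block 1 to flash on the third read and promotes it back on the fourth, so at no point do the two tiers hold the same block.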

Another smart new Exadata System Software feature boosts log write performance. Log write latency is critical for OLTP performance: a faster log write means faster commit times. Conversely, any slowdown of log writes can cause the database to stall. Unique to Exadata X8M, Exadata Smart PMEM Log automatically enables the database to issue a one-way RDMA log write to persistent memory. RDMA and persistent memory technologies allow the log write to occur without acknowledgement, and smart software places the write across multiple servers for resilience. This leads to an 8x performance increase in log writes.

Security and management of this new tier are also automated. Persistent memory is configured automatically at installation time, with no user interaction required. Hardware monitoring is configured out of the box. Persistent memory is only accessible to databases using database access controls, ensuring end to end security of data. Deploying persistent memory in Exadata X8M is so simple, it’s transparent.

Extreme Flash Storage Server: Record-Breaking I/O Performance

Exadata Extreme Flash (EF) Storage Server, first introduced with Exadata X5, is the foundation of a database-optimized all-flash Exadata Database Machine. Each EF Storage Server contains eight 6.4 TB Flash Accelerator F640v2 NVMe PCI Flash drives, offering 51.2 TB raw flash capacity per EF Storage Server. This state-of-the-art flash memory improves speed and power efficiency, and provides an expected endurance of 8 years or more for typical database workloads. In addition, Exadata delivers ultra-high performance by placing these flash devices directly on the high speed PCI bus rather than behind slow disk controllers and directors. With the shared persistent memory accelerator added, Exadata X8M can achieve up to 16 million random 8K database read and 5.17 million random 8K flash write I/O operations per second (IOPS), which is an industry record for database workloads.

These are real-world end-to-end performance figures measured running SQL workloads with standard 8K database I/O sizes inside a single rack Exadata system. Storage vendor performance figures, by contrast, are typically based on small I/O sizes and low-level I/O tools, and are therefore many times higher than can be achieved from realistic SQL workloads. Exadata’s performance on real database workloads is orders of magnitude faster than traditional storage array architectures, and is also much faster than current all-flash storage arrays, whose architecture bottlenecks flash throughput.

High Capacity Storage Server: Tiered Disk Flash And Persistent Memory Deliver Cost Of Disk With Shared Memory Performance

The second Exadata storage option is the High Capacity (HC) Storage Server. This server includes twelve 14 TB SAS disk drives with 168 TB total raw disk capacity. It also has four Flash Accelerator F640v2 NVMe PCIe cards with a total raw capacity of 25.6 TB of flash memory. Exadata X8M adds the shared persistent memory acceleration tier, twelve 128 GB Intel® Optane™ DC Persistent Memory modules in front of flash, to boost performance even more. This tier is deployed using smart software, Exadata Smart PMEM Cache, so that only the hottest database blocks are automatically cached in it. Because the tier is accessible over RDMA directly from the database, it delivers the highest I/O rates at an extremely low latency.

Flash in the HC Storage Server can be used directly as flash disks, but is almost always configured as a flash cache (Exadata Smart Flash Cache) in front of disk storage and behind the PMEM Cache to deliver the best performance. Exadata Smart Flash Cache is used in sync with PMEM Cache to automatically cache frequently accessed data while keeping infrequently accessed data on disk, delivering the high I/O rates and fast response times of flash with the large capacity and low cost of disk. Exadata uniquely understands database workloads and knows when to avoid caching data that will negatively affect overall performance. For example, if large write I/Os caused by backups or large table scans are likely to disrupt higher priority OLTP or scan operations, those large I/Os will bypass the flash cache and go straight to disk. Otherwise, Exadata System Software will utilize additional spare flash capacity and I/O bandwidth to optimize performance by caching these I/Os. In addition to automatic caching, administrators can optionally provide SQL directives to ensure that specific tables, indexes, or partitions are preferentially retained in the flash cache.
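The caching decision described above can be sketched as a small admission policy. This is a hedged illustration only: the size threshold, the `flash_busy_fraction` measure, the `busy_limit` default, and the `pinned` flag (standing in for an administrator's retention directive) are hypothetical and are not Exadata parameters.

```python
from dataclasses import dataclass

# Assumed cut-off between "small" OLTP-style I/O and "large" scan/backup I/O.
LARGE_IO_BYTES = 128 * 1024


@dataclass
class IoRequest:
    size_bytes: int
    pinned: bool = False  # stand-in for an administrator's "keep in cache" directive


def should_cache(io: IoRequest, flash_busy_fraction: float,
                 busy_limit: float = 0.8) -> bool:
    """Toy policy: decide whether an I/O goes through the flash cache."""
    if io.pinned:
        return True  # objects the administrator asked to retain
    if io.size_bytes < LARGE_IO_BYTES:
        return True  # small random I/O benefits most from flash
    # Large I/O is cached only while flash has spare capacity and bandwidth;
    # otherwise it bypasses the cache and goes straight to disk.
    return flash_busy_fraction < busy_limit
```

For example, `should_cache(IoRequest(1 << 20), 0.95)` returns `False` (a 1 MiB I/O bypasses a busy cache), while the same request with a busy fraction of `0.2` returns `True`.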

It is common for hit rates in the Exadata Smart Flash Cache to be over 95%, or even 99% in real-world database workloads, yielding an effective flash capacity many times larger than the physical flash. For arrays with thousands of disk drives.

The automatic data tiering between RAM, persistent memory, flash and disk in Exadata provides tremendous advantages over other flash-based solutions. Many storage vendors have developed flash-only arrays to achieve higher performance than traditional arrays. These flash-only arrays deliver better performance but cannot match the cost advantages of Exadata’s smart tiering of data between disk and flash, as the overall size of data that can benefit from flash is limited to the size of expensive flash. And these flash arrays are unable to benefit from Exadata’s unique database-aware storage optimization technologies. Generic data deduplication provided by some flash arrays is effective for Virtual Desktop Infrastructure environments, but not for databases.

Exadata not only delivers much more capacity than generic all-flash arrays, it also delivers better performance. Flash-only storage arrays cannot match the throughput of Exadata’s integrated and optimized architecture, with its full 100 Gb/sec RDMA over Converged Ethernet (RoCE) scale-out network, fast PCI Flash, offload of data intensive operations to storage, and algorithms throughout that are specifically optimized for databases.

Extended Capacity Storage Server: Much Lower Cost Exadata Storage For Low Use Data

A third Exadata storage option, the Extended (XT) Storage Server, was introduced with Exadata X8. Each Exadata XT Storage Server includes twelve 14 TB SAS disk drives with 168 TB total raw disk capacity. To achieve a lower cost, flash is not included, and storage software is optional in this storage server. For the X8M generation, the XT Storage Server benefits from the addition of the 100 Gb/sec network.

This storage option extends the operational and management benefits of Exadata to rarely accessed data that must be kept online. Exadata’s Extended (XT) Storage Server is:

Efficient – The XT server offers the same high capacity as the HC Storage server, including Hybrid Columnar Compression

Simple – The XT server adds capacity to Exadata while remaining transparent to applications, transparent to SQL, and retains the same operational model

Secure – The XT server enables customers to extend to low-use data the same Exadata security model and encryption used for online data

Fast and Scalable – Unlike other low-use data storage solutions, the XT server is integrated into the Exadata fabric, for fast access and easy scale-out

Compatible – The XT server is just another flavor of Exadata Storage server – you can just add XT servers to any Exadata rack

With Exadata Extended (XT) Storage Server, enterprises can meet their long-term data retention compliance requirements with the same trusted and continually validated Exadata solution, avoiding the operational risks and costs of managing information lifecycle across multiple platforms.
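The raw capacities quoted for the three storage server types follow from simple arithmetic on the component counts given in the text. The sketch below reproduces that arithmetic for an arbitrary mix of servers. Sizes are kept in decimal GB to avoid floating-point rounding; the function name `raw_capacity_gb` is hypothetical, and the assumption that XT servers carry no persistent memory is inferred from the text listing PMEM only for HC and EF servers.

```python
# Per-server (device count, device size in GB), taken from the figures above.
SERVER_SPECS = {
    "HC": {"disk": (12, 14_000), "flash": (4, 6_400), "pmem": (12, 128)},
    "EF": {"disk": (0, 0), "flash": (8, 6_400), "pmem": (12, 128)},
    # Assumption: XT has no flash (stated) and no PMEM (not listed in the text).
    "XT": {"disk": (12, 14_000), "flash": (0, 0), "pmem": (0, 0)},
}


def raw_capacity_gb(server_counts):
    """Sum raw capacity per tier for a mix such as {"HC": 3, "XT": 1}."""
    totals = {"disk": 0, "flash": 0, "pmem": 0}
    for kind, n_servers in server_counts.items():
        for tier, (count, size_gb) in SERVER_SPECS[kind].items():
            totals[tier] += n_servers * count * size_gb
    return totals
```

A single HC server gives 168,000 GB of disk, 25,600 GB of flash and 1,536 GB of persistent memory, matching the 168 TB, 25.6 TB and twelve 128 GB modules quoted earlier; a single EF server gives 51,200 GB of flash.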

An Exadata Database Machine can have up to 576 processor cores in its storage servers, available to offload work from the database servers. The CPUs in the storage servers do not replace database CPUs. Instead, they accelerate database-intensive workloads, similar to how graphics cards accelerate image-intensive workloads.

One of the many unique features of Exadata System software is Smart Scan technology, which offloads data intensive SQL operations from the database servers directly into the storage servers. By pushing SQL processing to the storage servers, data filtering and processing occur immediately and in parallel across all storage servers, as data is read from disk and flash. Only the rows and columns that are directly relevant to a query are sent to the database servers. 

For example, if a query is executed to identify the customers who placed sales orders over $1000 in the month of March, an Exadata system will offload the scanning of the table to the Exadata storage, filter out all sales orders that are less than $1000, filter out sales orders not in March, and extract just the relevant customer information. This reduces the data transferred to the database servers by orders of magnitude. Smart Scan greatly accelerates query execution, eliminates bottlenecks, and significantly reduces the CPU usage of the database servers. 
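The offload in this example can be mimicked in a few lines. The sketch below is a simulation under stated assumptions, not Exadata code: the function `storage_smart_scan`, the sample `orders` rows, and the split across two simulated storage servers are all invented for illustration. It shows that filtering and column projection happen where the data lives, so only matching customer names travel back.

```python
from datetime import date


def storage_smart_scan(rows, predicate, columns):
    """Runs on a simulated storage server: filter rows, then project columns."""
    return [{c: r[c] for c in columns} for r in rows if predicate(r)]


# Hypothetical sales orders, spread across two simulated storage servers.
orders = [
    {"order_id": 1, "customer": "Acme", "amount": 1500, "order_date": date(2020, 3, 5)},
    {"order_id": 2, "customer": "Bolt", "amount": 200, "order_date": date(2020, 3, 9)},
    {"order_id": 3, "customer": "Cato", "amount": 4000, "order_date": date(2020, 4, 1)},
    {"order_id": 4, "customer": "Dune", "amount": 2500, "order_date": date(2020, 3, 30)},
]
storage_servers = [orders[:2], orders[2:]]


def over_1000_in_march(row):
    return row["amount"] > 1000 and row["order_date"].month == 3


# The "database server" only receives the already filtered, projected rows.
result = []
for local_rows in storage_servers:
    result.extend(storage_smart_scan(local_rows, over_1000_in_march, ["customer"]))

print(result)  # [{'customer': 'Acme'}, {'customer': 'Dune'}]
```

Of the four rows (sixteen column values) stored, only two single-column rows are returned, which is the data reduction the paragraph describes, at toy scale.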

Storage Index is another powerful unique capability of Oracle Exadata System software that helps avoid unnecessary I/O operations and improves overall performance. The storage index, maintained in-memory at the storage server, tracks summary information for table columns contained in a storage region on that storage server. When a query specifies a WHERE clause, Exadata System software examines the storage index using a Bloom filter to determine if rows with the specified column value might exist in a region of disk on the storage server. If the column value doesn’t exist in the Bloom filter, then scan I/O in that region for that query is skipped. Storage Indexes make many SQL operations run dramatically faster because large numbers of I/O operations are automatically replaced by a few in-memory lookups.
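The region-skipping behaviour described above can be sketched with a minimal Bloom filter per storage region. This is an illustrative model only: the filter size, the number of hash functions, the use of SHA-256, and the helper names `build_region` and `scan_with_storage_index` are assumptions, and the real storage index structure is not described in this text.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_region(rows, column):
    """Pair a region's rows with a filter over one column's values."""
    bloom = BloomFilter()
    for row in rows:
        bloom.add(row[column])
    return rows, bloom


def scan_with_storage_index(regions, column, wanted):
    """Read only the regions whose filter says the value might be present."""
    hits, regions_read = [], 0
    for rows, bloom in regions:
        if not bloom.might_contain(wanted):
            continue  # value is definitely absent: skip this region's I/O
        regions_read += 1
        hits.extend(r for r in rows if r[column] == wanted)
    return hits, regions_read
```

Because a Bloom filter can return false positives, a region is occasionally read without yielding rows, but a region that does contain the value is never skipped, so query results stay correct.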

Besides the intrinsic capabilities of Exadata System software, the combination of Oracle Database software, Exadata System software and Exadata infrastructure enables several additional unique capabilities that offer unparalleled performance levels for OLTP workloads. For example, Exafusion Direct-to-Wire Protocol uniquely allows database processes to read and send Oracle Real Application Clusters (Oracle RAC) messages directly over the ultra-fast RoCE network using Remote Direct Memory Access (RDMA), bypassing the OS kernel and networking software overhead. This improves the response time and scalability of Oracle RAC OLTP configurations on Oracle Exadata Database Machine, especially for workloads with high-contention updates.

In some OLTP workloads, more than half of remote reads are for undo blocks to satisfy read consistency. Exadata uniquely leverages ultra-fast RDMA to read undo blocks from other database instances, further improving OLTP performance.