IBM Research

Storage Systems - Projects - SARC

IBM Almaden Research Center


Overview

Sequential Prefetching in Adaptive Replacement Cache (SARC) is an efficient, adaptive algorithm for managing read caches that hold both demand-paged and prefetched data.

  • Products: SARC is now available in IBM's flagship storage controllers: DS6000/DS8000.

  • Papers: SARC was published at the USENIX Annual Technical Conference in 2005 (USENIX '05).

  • Press: (Oct 12, 2004) "New caching technology from IBM Research called (SARC) is designed to help clients achieve dramatically greater throughput and faster response times than previous IBM TotalStorage Enterprise Storage Server 800 systems. (SARC) incorporates autonomic, self-optimizing technology and a more efficient and effective method for the widely used process of replacing data pages in computer cache memories. The breakthrough technology, available in both the DS6000 and DS8000 series, dynamically optimizes the storage system's performance for both sequential and randomly accessed workloads."

  • More Press: (Dec 14, 2005) "The DS8000 series products feature three IBM Research-developed software innovations in caching that are designed to work together to deliver dramatically greater throughput and faster response times for a wide range of real-life workloads. A new prefetching feature preloads and manages sequential data in the cache so it always contains the needed data. This prefetching feature also enhances the previously announced Adaptive Replacement Cache technology that integrates and balances both of the critical caching and prefetching functions. The third innovation is designed to eliminate undesirable interactions between the read- and write-cache management while still allowing both caches to beneficially share memory resources."

  • Prestige: IBM Outstanding Innovation Award, IBM Outstanding Technical Achievement Award, Research Accomplishment 2005.

Detailed Description

Caching is a fundamental technique for hiding I/O latency and is widely used in storage controllers, databases, file systems, and operating systems. A modern cache typically contains volatile memory used as a read cache and non-volatile memory used as a write cache. The effectiveness of a read cache depends on its hit ratio: the fraction of requests served from the cache without a read from disk (a miss). In demand paging, a page is stored in the cache only after it has been requested. Making room for the new page may require evicting another page, and numerous algorithms, such as LRU, ARC, and MQ, provide policies for choosing the page to evict.
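As an illustration of demand paging and the hit ratio, here is a minimal sketch of an LRU read cache in Python. The class name `LRUCache`, its methods, and the request trace are hypothetical and chosen only for this example; a real storage controller cache is far more involved.

```python
from collections import OrderedDict


class LRUCache:
    """Demand-paged read cache with least-recently-used (LRU) eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # least recently used first, most recent last
        self.hits = 0
        self.misses = 0

    def read(self, page):
        """Return True on a hit, False on a miss."""
        if page in self.pages:
            self.hits += 1
            self.pages.move_to_end(page)  # mark as most recently used
            return True
        self.misses += 1  # a miss stands in for a read from disk
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page] = True  # demand paging: cache only after a request
        return False

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = LRUCache(capacity=3)
for page in [1, 2, 3, 1, 2, 4, 1, 3]:
    cache.read(page)
print(cache.hits, cache.misses, cache.hit_ratio())  # 3 5 0.375
```

In this trace, the request for page 4 evicts page 3, so the final request for page 3 is a miss even though the page was cached earlier.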

Another way to improve the hit ratio is to speculatively prefetch pages into the cache before they are requested. The most common and reliable prefetch technique is sequential prefetching, in which a number of pages beyond the currently requested page are fetched from the slower media (disks) and placed into the cache. Subsequent requests for the prefetched pages are then hits, resulting in lower response times.
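The effect of sequential prefetching on a sequential scan can be sketched as follows. The function `run` and its `prefetch_degree` parameter are hypothetical names for this example. As a simplification, the sketch prefetches on every miss, whereas a real system would first detect that the access pattern is sequential.

```python
from collections import OrderedDict


def run(requests, capacity, prefetch_degree):
    """Replay requests against an LRU cache and return the hit ratio.

    prefetch_degree is an illustrative knob: on a miss for page p,
    pages p+1 .. p+prefetch_degree are also brought into the cache.
    """
    cache = OrderedDict()  # least recently used first
    hits = 0

    def insert(page):
        if page in cache:
            cache.move_to_end(page)
            return
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the least recently used page
        cache[page] = True

    for page in requests:
        if page in cache:
            hits += 1
            cache.move_to_end(page)
        else:
            insert(page)  # demand fetch of the requested page
            for ahead in range(1, prefetch_degree + 1):
                insert(page + ahead)  # speculative fetch of the next pages
    return hits / len(requests)


sequential_scan = list(range(100))
print(run(sequential_scan, capacity=16, prefetch_degree=0))  # 0.0
print(run(sequential_scan, capacity=16, prefetch_degree=3))  # 0.75
```

Without prefetching, a pure sequential scan never hits, because no page is requested twice. With three pages of read-ahead, only every fourth request misses.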

Most modern caches leverage both demand-paging and prefetching techniques in an attempt to maximize the cache hit ratio. However, the demand-paged data and the prefetched data share the same cache, and it is not trivial to decide how to divide the cache between these two classes of data.

SARC is an adaptive, low-overhead, simple-to-implement cache management policy that dynamically partitions the cache space between sequential data (including prefetched data) and random (demand-paged) data so as to maximize the overall hit ratio. To achieve this, SARC dynamically equalizes the marginal utility of the two classes of data and adapts the cache space allocated to each class accordingly. It is extremely simple to convert an existing LRU implementation into SARC.
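To give a feel for the idea of two LRU lists with an adaptive split, here is a toy sketch. It is not the published SARC algorithm: the class `TwoListCache`, the `bottom_fraction` parameter, and the one-page adaptation step are all assumptions made for illustration. The only point it tries to capture is that a hit near the eviction end of a list suggests that list would benefit from more space.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy cache split between a SEQ list and a RANDOM list, each in LRU order."""

    def __init__(self, capacity, bottom_fraction=0.25):
        self.capacity = capacity
        self.seq = OrderedDict()     # sequential and prefetched pages
        self.random = OrderedDict()  # demand-paged random pages
        self.seq_target = capacity // 2  # desired size of the SEQ list
        self.bottom_fraction = bottom_fraction  # assumed "near eviction" zone

    def _in_bottom(self, lst, page):
        # True if page lies in the least recently used part of lst.
        bottom = max(1, int(len(lst) * self.bottom_fraction))
        return page in list(lst)[:bottom]

    def _evict(self):
        # Evict from SEQ if it exceeds its target (or RANDOM is empty),
        # otherwise evict from RANDOM.
        if self.seq and (len(self.seq) > self.seq_target or not self.random):
            self.seq.popitem(last=False)
        else:
            self.random.popitem(last=False)

    def access(self, page, sequential):
        """Return True on a hit. `sequential` is the caller's classification."""
        for lst, step in ((self.seq, +1), (self.random, -1)):
            if page in lst:
                if self._in_bottom(lst, page):
                    # Assumed rule: a hit near the eviction end moves the
                    # target one page in favour of the list that was hit.
                    self.seq_target = min(
                        self.capacity, max(0, self.seq_target + step)
                    )
                lst.move_to_end(page)
                return True
        if len(self.seq) + len(self.random) >= self.capacity:
            self._evict()
        (self.seq if sequential else self.random)[page] = True
        return False


cache = TwoListCache(capacity=4)
for page in (1, 2, 3):
    cache.access(page, sequential=False)
cache.access(100, sequential=True)
cache.access(1, sequential=False)   # hit at the bottom of RANDOM
cache.access(101, sequential=True)
cache.access(102, sequential=True)
print(cache.seq_target, list(cache.seq), list(cache.random))
```

In this trace, the hit on page 1 at the bottom of the RANDOM list lowers the SEQ target, so later sequential insertions evict older sequential pages rather than taking more space from the random data.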

SARC is available in IBM's flagship storage controllers, the IBM System Storage DS6000 and DS8000 series. On the SPC-1 performance benchmark, SARC reduces the miss ratio by 11% and increases peak throughput by 12.5%, an effect equivalent to increasing the cache space by 33%.

IBM Almaden Research - Advanced Storage Systems

Figure (SARC lists): SARC adaptively divides the cache between the random and sequential (including prefetched) classes of data by equalizing the marginal utility of the two classes.
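Why equalizing marginal utility is the right target can be seen from a short derivation. The notation here is an assumption introduced for illustration and does not come from this page: c is the total cache size, s is the space given to sequential data, and h_seq and h_rand are the hit rates of the two classes as functions of the space each receives, both taken to be differentiable with diminishing returns.

```latex
% Total hit rate when s pages go to sequential data and c - s to random data
H(s) = h_{\mathrm{seq}}(s) + h_{\mathrm{rand}}(c - s)

% At an interior maximum the derivative with respect to s vanishes
\frac{dH}{ds} = h_{\mathrm{seq}}'(s) - h_{\mathrm{rand}}'(c - s) = 0
\quad\Longrightarrow\quad
h_{\mathrm{seq}}'(s) = h_{\mathrm{rand}}'(c - s)
```

Under these assumptions, if one more page of cache would gain more hits for sequential data than it would cost random data, s should grow; in the opposite case it should shrink. The split stops moving when the two marginal utilities are equal.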
Technical Papers
Binny Gill and Dharmendra S. Modha, "SARC: Sequential Prefetching in Adaptive Replacement Cache," USENIX Annual Technical Conference (USENIX '05), Anaheim, CA, April 10-15, 2005.

