IBM Research

Storage Systems - Serverless Backup

IBM Almaden Research Center


Overview

Regular data backup is an essential -- and expensive -- requirement for business computing. IBM researchers are making backup operations "serverless": host computers do not read or write the backup data. Instead, the data is copied directly from disk to tape, or from disk to disk, over a Storage Area Network (SAN) using the new SCSI Extended Copy command. Freed of the routine data-transport burden, server resources can be used for real work. We are adding serverless backup to Tivoli Storage Manager while keeping its winning features: incremental backup, centralized management, and broad platform support.

Using a SAN to move backup data is just one step. The connectivity and speed of a SAN enable storage devices to carry the load of bulk data movement, long-distance replication, data snapshots, and even routine processing. The challenge: how can these raw capabilities be turned into real tools that make systems work better?

How does it work?

Serverless backup is just one example of delegating data-movement jobs to a copy engine. In today's Storage Area Networks, the copy engine is typically an auxiliary service running on a bridge/router device that connects parallel-SCSI devices to a Fibre Channel SAN. The management application tells the copy engine where to find the data and where to put it. With SCSI Extended Copy, the instructions consist mainly of network addresses for the disk and tape devices, lists of logical block addresses (LBAs) on the disks, transfer lengths, and literal data, such as file attributes, for inclusion in the backup data set. The copy engine simply follows the instructions and moves the data -- with no understanding of the data content.
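As a rough illustration of what such instructions might carry, the sketch below models a copy request as plain data. It is not the SCSI Extended Copy wire format; the class names (`BlockRange`, `InlineData`, `CopyRequest`), the address strings, and the 512-byte block size are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Union

BLOCK_SIZE = 512  # assumed block size in bytes, for illustration only


@dataclass(frozen=True)
class BlockRange:
    """Copy `block_count` blocks starting at `start_lba` from a source disk."""
    source_address: str   # network address of the disk (hypothetical format)
    start_lba: int
    block_count: int


@dataclass(frozen=True)
class InlineData:
    """Literal bytes supplied by the server, such as file attributes."""
    payload: bytes


@dataclass
class CopyRequest:
    """Everything the copy engine needs; it never interprets the data."""
    destination_address: str                    # address of the tape device
    steps: List[Union[BlockRange, InlineData]]  # executed in order

    def total_bytes(self) -> int:
        """Number of bytes the engine would write to the destination."""
        total = 0
        for step in self.steps:
            if isinstance(step, BlockRange):
                total += step.block_count * BLOCK_SIZE
            else:
                total += len(step.payload)
        return total


# A request that writes a file's attributes, then its two disk extents.
request = CopyRequest(
    destination_address="tape-0",
    steps=[
        InlineData(b"name=report.doc;mode=0644"),
        BlockRange("disk-3", start_lba=2048, block_count=16),
        BlockRange("disk-3", start_lba=9000, block_count=4),
    ],
)
print(request.total_bytes())
```

Note that the request names only devices, block ranges, and literal bytes. Nothing in it tells the copy engine what a file is, which matches the description above.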

Diagram: How serverless backup works
  1. Server mounts and positions tape, determines source block list
  2. Server issues Extended Copy to copy engine
  3. Copy engine issues reads from RAID controller disks (3a) and writes to tape (3b)
  4. Copy engine reports completion
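The four steps in the diagram can be walked through with a small in-memory simulation. This is a sketch under stated assumptions, not the Tivoli Storage Manager implementation: the `Disk`, `Tape`, and `CopyEngine` classes, the `file_map` lookup, the 4-byte block size, and the `"GOOD"` status string are all hypothetical stand-ins.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # tiny, assumed block size so the example stays readable

Extent = Tuple[int, int]  # (starting LBA, number of blocks)


class Disk:
    def __init__(self, data: bytes) -> None:
        self.data = data

    def read(self, lba: int, count: int) -> bytes:
        start = lba * BLOCK_SIZE
        return self.data[start:start + count * BLOCK_SIZE]


class Tape:
    def __init__(self) -> None:
        self.mounted = False
        self.records: List[bytes] = []

    def write(self, record: bytes) -> None:
        if not self.mounted:
            raise RuntimeError("tape is not mounted")
        self.records.append(record)


class CopyEngine:
    """Moves blocks between devices without interpreting their content."""

    def extended_copy(self, disk: Disk, tape: Tape, extents: List[Extent]) -> str:
        for lba, count in extents:
            data = disk.read(lba, count)   # step 3a: read from disk
            tape.write(data)               # step 3b: write to tape
        return "GOOD"                      # step 4: report completion


def serverless_backup(
    file_name: str,
    file_map: Dict[str, List[Extent]],
    disk: Disk,
    tape: Tape,
    engine: CopyEngine,
) -> str:
    tape.mounted = True                    # step 1: mount and position tape
    extents = file_map[file_name]          # step 1: determine source block list
    return engine.extended_copy(disk, tape, extents)  # step 2: issue the command


disk = Disk(b"AAAABBBBCCCCDDDDEEEE")
tape = Tape()
file_map = {"report.doc": [(1, 1), (3, 2)]}  # the file occupies blocks 1, 3 and 4
status = serverless_backup("report.doc", file_map, disk, tape, CopyEngine())
print(status, tape.records)
```

The server function only looks up block addresses and issues the command. All calls to `disk.read` and `tape.write` happen inside the copy engine, which is the point of the serverless design.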

Where is the Research work?

Building an application that offloads copy operations in specific backup contexts is a development challenge. The research fun comes in scouting out the growth directions -- into large, complex SANs, into new SAN technologies, and into applications beyond backup/recovery. Even in the nearer term, data integrity cannot be assured unless some complexities are addressed -- and a backup without guaranteed data integrity is not a backup at all.

Access control: A large SAN is practical only when data sets owned by different computers are protected from unauthorized modification. This requires management of static access rights and use of dynamic reservations. SCSI standards have not caught up with this need. We are developing protocols and advancing standards so that applications on a SAN can delegate operations to copy engines without exposing their data to "accidents".
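To make the distinction between static access rights and dynamic reservations concrete, here is a minimal sketch of a volume that checks both before accepting a delegated write. It does not represent the protocols under development or the semantics of SCSI reservations; `ProtectedVolume`, `AccessDenied`, and the host names are invented for illustration.

```python
from typing import Dict, Optional, Set


class AccessDenied(Exception):
    """Raised when a write is not permitted."""


class ProtectedVolume:
    def __init__(self, writers: Set[str]) -> None:
        self.writers = set(writers)              # static access rights
        self.reserved_by: Optional[str] = None   # dynamic reservation holder
        self.blocks: Dict[int, bytes] = {}

    def reserve(self, owner: str) -> None:
        if owner not in self.writers:
            raise AccessDenied(f"{owner} has no write access")
        if self.reserved_by not in (None, owner):
            raise AccessDenied(f"volume is reserved by {self.reserved_by}")
        self.reserved_by = owner

    def release(self, owner: str) -> None:
        if self.reserved_by == owner:
            self.reserved_by = None

    def write(self, on_behalf_of: str, lba: int, data: bytes) -> None:
        """A copy engine writes on behalf of the application that delegated."""
        if on_behalf_of not in self.writers:
            raise AccessDenied(f"{on_behalf_of} has no write access")
        if self.reserved_by != on_behalf_of:
            raise AccessDenied(f"{on_behalf_of} does not hold the reservation")
        self.blocks[lba] = data


volume = ProtectedVolume(writers={"host-a", "host-b"})
volume.reserve("host-a")
volume.write("host-a", lba=0, data=b"backup")      # allowed
try:
    volume.write("host-b", lba=0, data=b"oops")    # has rights, no reservation
except AccessDenied as err:
    print("denied:", err)
```

In this sketch the copy engine inherits no authority of its own: each write is checked against the rights and the reservation of the application that delegated it.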

Distributed delegation: Copy delegation using SCSI Extended Copy is straightforward when a single program knows all the sources and destinations. What if sources and destinations are managed by independent applications without full mutual trust? Or merely by components owned by different development teams -- who also don't fully trust each other? We are inventing and testing mechanisms for safe joint delegation of copy operations that exploit the efficiency of the SAN.

 
