Chapter: Database Management Systems : Trends In Database Technology

RAID: Redundant Arrays of Independent Disks


Disk organization techniques that manage a large number of disks, providing the view of a single disk with high reliability by storing data redundantly, so that data can be recovered even if a disk fails. The chance that some disk out of a set of N disks will fail is much higher than the chance that a specific single disk will fail.

E.g., a system with 100 disks, each with an MTTF of 100,000 hours (approx. 11 years), will have a system MTTF of 100,000 / 100 = 1000 hours (approx. 41 days).

o Techniques for using redundancy to avoid data loss are critical with large numbers of disks
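The 100-disk figure above can be reproduced with a short calculation. This is only a sketch: it assumes disks fail independently at a constant rate, so that the expected time until the first failure among N disks is the single-disk MTTF divided by N. The function name system_mttf is illustrative and not from the original notes.

```python
def system_mttf(disk_mttf_hours: float, num_disks: int) -> float:
    """Expected time until the first failure among num_disks disks.

    Assumes independent failures at a constant rate, so the failure
    rates of the individual disks simply add up.
    """
    return disk_mttf_hours / num_disks


hours = system_mttf(100_000, 100)
print(f"System MTTF: {hours:.0f} hours (about {hours / 24:.1f} days)")
```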

Originally a cost-effective alternative to large, expensive disks.

o The "I" in RAID originally stood for "inexpensive".

o Today RAIDs are used for their higher reliability and bandwidth. The "I" is interpreted as "independent".

Improvement of Reliability via Redundancy

Redundancy – store extra information that can be used to rebuild information lost in a disk failure.

E.g., Mirroring (or shadowing)

o Duplicate every disk. A logical disk consists of two physical disks.

o Every write is carried out on both disks.

Reads can take place from either disk

o If one disk in a pair fails, data still available in the other

Data loss would occur only if a disk fails, and its mirror disk also fails before the system is repaired.

Probability of the combined event is very small.

o Except for dependent failure modes such as fire, building collapse, or electrical power surges.

Mean time to data loss depends on mean time to failure and mean time to repair.

o E.g., an MTTF of 100,000 hours and a mean time to repair of 10 hours give a mean time to data loss of 500 × 10^6 hours (or 57,000 years) for a mirrored pair of disks (ignoring dependent failure modes).
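The mirrored-pair figure is consistent with the usual approximation MTTF^2 / (2 × MTTR): either of the two disks fails first (rate 2 / MTTF), and data is lost only if the surviving disk also fails during the repair window (probability roughly MTTR / MTTF). The sketch below assumes independent failures and an MTTR much smaller than the MTTF; the function name is illustrative.

```python
def mirrored_pair_mttdl(mttf_hours: float, mttr_hours: float) -> float:
    """Approximate mean time to data loss for one mirrored pair.

    Assumes independent disk failures and mttr_hours << mttf_hours.
    """
    return mttf_hours ** 2 / (2 * mttr_hours)


mttdl = mirrored_pair_mttdl(100_000, 10)
years = mttdl / (24 * 365)
print(f"MTTDL: {mttdl:.3g} hours (about {years:,.0f} years)")
```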

Improvement in Performance via Parallelism

o Two main goals of parallelism in a disk system:

1. Load balance multiple small accesses to increase throughput

2. Parallelize large accesses to reduce response time.

o Improve transfer rate by striping data across multiple disks.

o Bit-level striping – split the bits of each byte across multiple disks

In an array of eight disks, write bit i of each byte to disk i.

Each access can read data at eight times the rate of a single disk.

But seek/access time is worse than for a single disk, since every disk must take part in every access.

Bit-level striping is not used much anymore.

o Block-level striping – with n disks, block i of a file goes to disk (i mod n) + 1

Requests for different blocks can run in parallel if the blocks reside on different disks. A request for a long sequence of blocks can utilize all disks in parallel.
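The block-to-disk mapping can be sketched as follows. The notes do not say where block numbering starts; this example assumes blocks are numbered from 0 and disks from 1, and the helper names are illustrative.

```python
def disk_for_block(i: int, n: int) -> int:
    """Return the 1-based disk number holding block i under block-level striping."""
    return (i % n) + 1


def stripe_layout(num_blocks: int, n: int) -> dict[int, list[int]]:
    """Group block numbers 0..num_blocks-1 by the disk that stores them."""
    layout: dict[int, list[int]] = {disk: [] for disk in range(1, n + 1)}
    for i in range(num_blocks):
        layout[disk_for_block(i, n)].append(i)
    return layout


# Eight consecutive blocks over four disks: each disk holds two of them,
# so a long sequential read can keep all four disks busy at once.
print(stripe_layout(8, 4))
```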

RAID Levels

o Schemes to provide redundancy at lower cost by using disk striping combined with parity bits. Different RAID organizations, or RAID levels, have differing cost, performance, and reliability characteristics.

RAID Level 0: Block striping; non-redundant.

o Used in high-performance applications where data loss is not critical.

RAID Level 1: Mirrored disks with block striping.

o Offers best write performance.

o Popular for applications such as storing log files in a database system.

RAID Level 2: Memory-Style Error-Correcting-Codes (ECC) with bit striping.

RAID Level 5: Block-Interleaved Distributed Parity; partitions data and parity among all N + 1 disks, rather than storing data in N disks and parity in 1 disk.

o E.g., with 5 disks, the parity block for the nth set of blocks is stored on disk (n mod 5) + 1, with the data blocks stored on the other 4 disks.

o Higher I/O rates than Level 4. Block writes occur in parallel if the blocks and their parity blocks are on different disks.

o Subsumes Level 4: provides the same benefits, but avoids the bottleneck of a single parity disk.

RAID Level 6: P+Q Redundancy scheme; similar to Level 5, but stores extra redundant information to guard against multiple disk failures.

o Better reliability than Level 5 at a higher cost; not used as widely.
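A minimal sketch of the Level 5 idea follows. The parity placement uses the (n mod 5) + 1 rule from the example above. The notes do not spell out how parity is computed; the sketch assumes the usual bytewise XOR of the data blocks in a set, which allows any one lost block to be rebuilt from the survivors. The function names and sample data are illustrative.

```python
from functools import reduce


def parity_disk(n: int, num_disks: int = 5) -> int:
    """1-based disk holding the parity block for the nth set of blocks."""
    return (n % num_disks) + 1


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks of one set, plus their parity block on the fifth disk.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[2]; rebuild it by XOR-ing
# the three surviving data blocks with the parity block.
survivors = data[:2] + data[3:] + [parity]
recovered = xor_blocks(survivors)
print(recovered == data[2])
```

Because the parity disk rotates with n, parity updates for different sets of blocks land on different disks instead of queuing on a single one.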
