Characterizing Schedules Based on Recoverability
When transactions are executing concurrently in an interleaved fashion, then the order of execution of operations from all the various transactions is known as a schedule (or history). In this section, first we define the concept of schedules, and then we characterize the types of schedules that facilitate recovery when failures occur. In Section 21.5, we characterize schedules in terms of the interference of par-ticipating transactions, leading to the concepts of serializability and serializable schedules.
1. Schedules (Histories) of Transactions
A schedule (or history) S of n transactions T1, T2, ..., Tn is an ordering of the operations of the transactions. Operations from different transactions can be interleaved in the schedule S. However, for each transaction Ti that participates in the schedule S, the operations of Ti in S must appear in the same order in which they occur in Ti. The order of operations in S is considered to be a total ordering, meaning that for any two operations in the schedule, one must occur before the other. It is possible theoretically to deal with schedules whose operations form partial orders (as we discuss later), but we will assume for now total ordering of the operations in a schedule.
For the purpose of recovery and concurrency control, we are mainly interested in the read_item and write_item operations of the transactions, as well as the commit and abort operations. A shorthand notation for describing a schedule uses the symbols b, r, w, e, c, and a for the operations begin_transaction, read_item, write_item, end_transaction, commit, and abort, respectively, and appends as a subscript the transaction id (transaction number) to each operation in the schedule. In this notation, the data-base item X that is read or written follows the r and w operations in parentheses. In some schedules, we will only show the read and write operations, whereas in other schedules, we will show all the operations. For example, the schedule in Figure 21.3(a), which we shall call Sa, can be written as follows in this notation:
Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Similarly, the schedule for Figure 21.3(b), which we call Sb, can be written as follows, if we assume that transaction T1 aborted after its read_item(Y) operation:
Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1;
Two operations in a schedule are said to conflict if they satisfy all three of the fol-lowing conditions: (1) they belong to different transactions; (2) they access the same item X; and (3) at least one of the operations is a write_item(X). For example, in schedule Sa, the operations r1(X) and w2(X) conflict, as do the operations r2(X) and w1(X), and the operations w1(X) and w2(X). However, the operations r1(X) and r2(X) do not conflict, since they are both read operations; the operations w2(X) and w1(Y) do not conflict because they operate on distinct data items X and Y; and the operations r1(X) and w1(X) do not conflict because they belong to the same transaction.
Intuitively, two operations are conflicting if changing their order can result in a dif-ferent outcome. For example, if we change the order of the two operations r1(X); w2(X) to w2(X); r1(X), then the value of X that is read by transaction T1 changes, because in the second order the value of X is changed by w2(X) before it is read by r1(X), whereas in the first order the value is read before it is changed. This is called a read-write conflict. The other type is called a write-write conflict, and is illustrated by the case where we change the order of two operations such as w1(X); w2(X) to w2(X); w1(X). For a write-write conflict, the last value of X will differ because in one case it is written by T2 and in the other case by T1. Notice that two read operations are not conflicting because changing their order makes no difference in outcome.
The rest of this section covers some theoretical definitions concerning schedules. A schedule S of n transactions T1, T2, ..., Tn is said to be a complete schedule if the following conditions hold:
1. The operations in S are exactly those operations in T1, T2, ..., Tn, including a commit or abort operation as the last operation for each transaction in the schedule.
2. For any pair of operations from the same transaction Ti, their relative order of appearance in S is the same as their order of appearance in Ti.
3. For any two conflicting operations, one of the two must occur before the other in the schedule.
The preceding condition (3) allows for two nonconflicting operations to occur in the schedule without defining which occurs first, thus leading to the definition of a schedule as a partial order of the operations in the n transactions.11 However, a total order must be specified in the schedule for any pair of conflicting operations (condition 3) and for any pair of operations from the same transaction (condition 2). Condition 1 simply states that all operations in the transactions must appear in the complete schedule. Since every transaction has either committed or aborted, a complete schedule will not contain any active transactions at the end of the schedule.
In general, it is difficult to encounter complete schedules in a transaction processing system because new transactions are continually being submitted to the system. Hence, it is useful to define the concept of the committed projection C(S) of a schedule S, which includes only the operations in S that belong to committed trans-actions—that is, transactions Ti whose commit operation ci is in S.
2. Characterizing Schedules Based on Recoverability
For some schedules it is easy to recover from transaction and system failures, whereas for other schedules the recovery process can be quite involved. In some cases, it is even not possible to recover correctly after a failure. Hence, it is important to characterize the types of schedules for which recovery is possible, as well as those for which recovery is relatively simple. These characterizations do not actually pro-vide the recovery algorithm; they only attempt to theoretically characterize the different types of schedules.
First, we would like to ensure that, once a transaction T is committed, it should never be necessary to roll back T. This ensures that the durability property of trans-actions is not violated (see Section 21.3). The schedules that theoretically meet this criterion are called recoverable schedules; those that do not are called nonrecoverable and hence should not be permitted by the DBMS. The definition of recoverable schedule is as follows: A schedule S is recoverable if no transaction T in S commits until all transactions T that have written some item X that T reads have committed. A transaction T reads from transaction T in a schedule S if some item X is first written by T and later read by T. In addition, T should not have been aborted before T reads item X, and there should be no transactions that write X after T writes it and before T reads it (unless those transactions, if any, have aborted before T reads X).
Some recoverable schedules may require a complex recovery process as we shall see, but if sufficient information is kept (in the log), a recovery algorithm can be devised for any recoverable schedule. The (partial) schedules Sa and Sb from the preceding section are both recoverable, since they satisfy the above definition. Consider the schedule Sa given below, which is the same as schedule Sa except that two commit operations have been added to Sa:
Sa : r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;
Sa is recoverable, even though it suffers from the lost update problem; this problem is handled by serializability theory (see Section 21.5). However, consider the two (partial) schedules Sc and Sd that follow:
Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2; Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;
Sc is not recoverable because T2 reads item X from T1, but T2 commits before T1 commits. The problem occurs if T1 aborts after the c2 operation in Sc, then the value of X that T2 read is no longer valid and T2 must be aborted after it is committed, leading to a schedule that is not recoverable. For the schedule to be recoverable, the c2 operation in Sc must be postponed until after T1 commits, as shown in Sd. If T1 aborts instead of committing, then T2 should also abort as shown in Se, because the value of X it read is no longer valid. In Se, aborting T2 is acceptable since it has not committed yet, which is not the case for the nonrecoverable schedule Sc.
In a recoverable schedule, no committed transaction ever needs to be rolled back, and so the definition of committed transaction as durable is not violated. However, it is possible for a phenomenon known as cascading rollback (or cascading abort) to occur in some recoverable schedules, where an uncommitted transaction has to be rolled back because it read an item from a transaction that failed. This is illustrated in schedule Se, where transaction T2 has to be rolled back because it read item X from T1, and T1 then aborted.
Because cascading rollback can be quite time-consuming—since numerous transactions can be rolled back (see Chapter 23)—it is important to characterize the sched ules where this phenomenon is guaranteed not to occur. A schedule is said to be cascadeless, or to avoid cascading rollback, if every transaction in the schedule reads only items that were written by committed transactions. In this case, all items read will not be discarded, so no cascading rollback will occur. To satisfy this criterion, the r2(X) command in schedules Sd and Se must be postponed until after T1 has commit-ted (or aborted), thus delaying T2 but ensuring no cascading rollback if T1 aborts.
Finally, there is a third, more restrictive type of schedule, called a strict schedule, in which transactions can neither read nor write an item X until the last transaction that wrote X has committed (or aborted). Strict schedules simplify the recovery process. In a strict schedule, the process of undoing a write_item(X) operation of an aborted transaction is simply to restore the before image (old_value or BFIM) of data item X. This simple procedure always works correctly for strict schedules, but it may not work for recoverable or cascadeless schedules. For example, consider schedule Sf :
Sf : w1(X, 5); w2(X, 8); a1;
Suppose that the value of X was originally 9, which is the before image stored in the system log along with the w1(X, 5) operation. If T1 aborts, as in Sf , the recovery pro-cedure that restores the before image of an aborted write operation will restore the value of X to 9, even though it has already been changed to 8 by transaction T2, thus leading to potentially incorrect results. Although schedule Sf is cascadeless, it is not a strict schedule, since it permits T2 to write item X even though the transaction T1 that last wrote X had not yet committed (or aborted). A strict schedule does not have this problem.
It is important to note that any strict schedule is also cascadeless, and any cascade-less schedule is also recoverable. Suppose we have i transactions T1, T2, ..., Ti, and their number of operations are n1, n2, ..., ni, respectively. If we make a set of all possible schedules of these transactions, we can divide the schedules into two disjoint subsets: recoverable and nonrecoverable. The cascadeless schedules will be a subset of the recoverable schedules, and the strict schedules will be a subset of the cascade-less schedules. Thus, all strict schedules are cascadeless, and all cascadeless schedules are recoverable.
Copyright © 2018-2020 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.