Characterizing Schedules Based on Recoverability
When transactions are executing concurrently in an interleaved fashion,
then the order of execution of operations from all the various transactions is
known as a schedule (or history). In this section, first we
define the concept of schedules, and then we characterize the types of
schedules that facilitate recovery when failures occur. In Section 21.5, we
characterize schedules in terms of the interference of par-ticipating
transactions, leading to the concepts of serializability and serializable
schedules.
1. Schedules
(Histories) of Transactions
A schedule (or history) S of n transactions T1, T2,
..., Tn is an ordering of
the operations of the transactions. Operations from different transactions can
be interleaved in the schedule S.
However, for each transaction Ti
that participates in the schedule S,
the operations of Ti in S must appear in the same order in
which they occur in Ti. The order of operations in S is considered to be a total ordering, meaning that for any two operations in the schedule, one must occur before the
other. It is possible theoretically
to deal with schedules whose operations form partial orders (as we discuss later), but we will assume for now
total ordering of the operations in a schedule.
For the
purpose of recovery and concurrency control, we are mainly interested in the read_item and write_item operations of the transactions,
as well as the commit and abort operations. A shorthand notation
for describing a schedule uses the symbols
b, r, w,
e, c, and a for the operations begin_transaction, read_item, write_item, end_transaction, commit, and abort, respectively, and appends as a subscript the transaction id (transaction number) to each
operation in the schedule. In this notation, the data-base item X that is read or written follows the r and w operations in parentheses. In some schedules, we will only show
the read and write operations, whereas in other schedules, we will show all the
operations. For example, the schedule in Figure 21.3(a), which we shall call Sa, can be written as follows
in this notation:
Sa: r1(X); r2(X); w1(X);
r1(Y); w2(X); w1(Y);
Similarly,
the schedule for Figure 21.3(b), which we call Sb, can be written as follows, if we assume that
transaction T1 aborted
after its read_item(Y) operation:
Sb: r1(X); w1(X); r2(X);
w2(X); r1(Y); a1;
Two
operations in a schedule are said to conflict
if they satisfy all three of the fol-lowing conditions: (1) they belong to different transactions; (2) they access
the same item X; and (3) at least one of
the operations is a write_item(X). For example, in schedule
Sa, the operations r1(X) and w2(X) conflict, as do the operations r2(X) and w1(X), and the operations w1(X) and w2(X). However, the operations r1(X) and r2(X) do not conflict, since they are both
read operations; the operations w2(X)
and w1(Y) do not conflict because they operate
on distinct data items X and Y; and the operations r1(X) and w1(X) do not conflict because they belong
to the same transaction.
Intuitively,
two operations are conflicting if changing their order can result in a dif-ferent
outcome. For example, if we change the order of the two operations r1(X); w2(X) to
w2(X); r1(X), then the value of X that
is read by transaction T1 changes, because in the second order the value of X is changed by w2(X) before it is read by r1(X), whereas in the first order the value is read before it is
changed. This is called a read-write conflict. The other type is
called a write-write conflict, and
is illustrated by the case where we
change the order of two operations such as w1(X); w2(X) to w2(X); w1(X). For a write-write conflict, the last value of X will
differ because in one case it is
written by T2 and in the
other case by T1. Notice
that two read operations are not conflicting because changing their order makes
no difference in outcome.
The rest
of this section covers some theoretical definitions concerning schedules. A
schedule S of n transactions T1,
T2, ..., Tn is said to be a complete schedule if the following
conditions hold:
1. The
operations in S are exactly those
operations in T1, T2, ..., Tn, including a commit or abort operation as the last
operation for each transaction in the schedule.
2. For any
pair of operations from the same transaction Ti, their relative order of appearance in S is the same as their order of
appearance in Ti.
3. For any
two conflicting operations, one of the two must occur before the other in the
schedule.
The
preceding condition (3) allows for two nonconflicting
operations to occur in the schedule without defining which occurs first,
thus leading to the definition of a schedule as a partial order of the operations in the n transactions.11 However, a total order must be
specified in the schedule for any pair of conflicting operations (condition 3)
and for any pair of operations from the same transaction (condition 2).
Condition 1 simply states that all operations in the transactions must appear
in the complete schedule. Since every transaction has either committed or
aborted, a complete schedule will not
contain any active transactions at the end of the schedule.
In
general, it is difficult to encounter complete schedules in a transaction
processing system because new transactions are continually being submitted to
the system. Hence, it is useful to define the concept of the committed projection C(S)
of a schedule S, which includes only
the operations in S that belong to
committed trans-actions—that is, transactions Ti whose commit operation ci is in S.
2. Characterizing
Schedules Based on Recoverability
For some
schedules it is easy to recover from transaction and system failures, whereas
for other schedules the recovery process can be quite involved. In some cases,
it is even not possible to recover correctly after a failure. Hence, it is
important to characterize the types of schedules for which recovery is possible, as well as those for which recovery is relatively simple. These
characterizations do not actually pro-vide the recovery algorithm; they only
attempt to theoretically characterize the different types of schedules.
First, we
would like to ensure that, once a transaction T is committed, it should never
be necessary to roll back T. This
ensures that the durability property of trans-actions is not violated (see
Section 21.3). The schedules that theoretically meet this criterion are called recoverable schedules; those that do not
are called nonrecoverable and hence
should not be permitted by the DBMS. The definition of recoverable schedule is as follows: A schedule S is recoverable if no transaction T in S commits until all transactions
T that have written some item X that T reads have committed. A transaction T
reads from transaction T in a schedule S if some item X is first
written by T and later read by T. In addition, T should not have been aborted
before T reads item X, and there should be no transactions
that write X after T writes it and before T reads it (unless those transactions,
if any, have aborted before T reads X).
Some
recoverable schedules may require a complex recovery process as we shall see,
but if sufficient information is kept (in the log), a recovery algorithm can be
devised for any recoverable schedule. The (partial) schedules Sa and Sb from the preceding section are both recoverable,
since they satisfy the above definition. Consider the schedule Sa given below, which is the
same as schedule Sa except
that two commit operations have been added to Sa:
Sa : r1(X); r2(X); w1(X);
r1(Y); w2(X); c2; w1(Y); c1;
Sa is recoverable, even though it
suffers from the lost update problem; this problem is handled by serializability theory (see Section 21.5). However,
consider the two (partial) schedules Sc
and Sd that follow:
Sc: r1(X); w1(X); r2(X);
r1(Y); w2(X); c2; a1;
Sd: r1(X); w1(X); r2(X);
r1(Y); w2(X); w1(Y);
c1; c2; Se: r1(X); w1(X); r2(X);
r1(Y); w2(X); w1(Y);
a1; a2;
Sc is not recoverable because T2 reads item X from T1, but T2 commits
before T1 commits. The problem occurs if T1 aborts after the c2 operation in Sc, then the value of X that T2 read is no longer valid and T2 must be aborted after
it is committed, leading to a schedule that is not recoverable. For the schedule to be recoverable, the c2 operation in Sc must be postponed until
after T1 commits, as shown
in Sd. If T1 aborts instead of
committing, then T2 should
also abort as shown in Se,
because the value of X it read is no
longer valid. In Se,
aborting T2 is acceptable
since it has not committed yet, which is not the case for the nonrecoverable
schedule Sc.
In a
recoverable schedule, no committed transaction ever needs to be rolled back,
and so the definition of committed transaction as durable is not violated.
However, it is possible for a phenomenon known as cascading rollback (or cascading
abort) to occur in some recoverable schedules, where an uncommitted transaction has to be rolled
back because it read an item from a transaction that failed. This is
illustrated in schedule Se,
where transaction T2 has
to be rolled back because it read item X
from T1, and T1 then aborted.
Because
cascading rollback can be quite time-consuming—since numerous transactions can
be rolled back (see Chapter 23)—it is important to characterize the sched ules
where this phenomenon is guaranteed not to occur. A schedule is said to be cascadeless, or to avoid cascading rollback, if every transaction in the schedule
reads only items that were written
by committed transactions. In this case, all items read will not be discarded,
so no cascading rollback will occur. To satisfy this criterion, the r2(X) command in schedules Sd
and Se must be
postponed until after T1 has commit-ted (or aborted), thus
delaying T2 but ensuring
no cascading rollback if T1
aborts.
Finally,
there is a third, more restrictive type of schedule, called a strict schedule, in which transactions
can neither read nor write an item X until the last transaction that wrote X has committed (or aborted). Strict
schedules simplify the recovery process. In a strict schedule, the process of
undoing a write_item(X) operation of an aborted transaction
is simply to restore the before image
(old_value or BFIM) of data item X. This simple procedure always works
correctly for strict schedules, but it may not work for recoverable or
cascadeless schedules. For example, consider schedule Sf :
Sf : w1(X, 5); w2(X, 8); a1;
Suppose
that the value of X was originally 9,
which is the before image stored in the system log along with the w1(X, 5) operation. If T1
aborts, as in Sf , the
recovery pro-cedure that restores the before image of an aborted write
operation will restore the value of X
to 9, even though it has already been changed to 8 by transaction T2, thus leading to
potentially incorrect results. Although schedule Sf is cascadeless, it is not a strict schedule, since it
permits T2 to write item X even though the transaction T1 that last wrote X had not yet committed (or aborted). A
strict schedule does not have this problem.
It is
important to note that any strict schedule is also cascadeless, and any
cascade-less schedule is also recoverable. Suppose we have i transactions T1,
T2, ..., Ti, and their number of
operations are n1, n2, ..., ni, respectively. If we make a set of all possible
schedules of these transactions, we can divide the schedules into two disjoint
subsets: recoverable and nonrecoverable. The cascadeless schedules will be a
subset of the recoverable schedules, and the strict schedules will be a subset
of the cascade-less schedules. Thus, all strict schedules are cascadeless, and
all cascadeless schedules are recoverable.
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2023 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.