Chapter: Compilers : Principles, Techniques, & Tools : Intermediate-Code Generation

Backpatching

1 One-Pass Code Generation Using Backpatching 2 Backpatching for Boolean Expressions 3 Flow-of-Control Statements 4 Break-, Continue-, and Goto-Statements 5 Exercises for Section 6.7

Backpatching

1 One-Pass Code Generation Using Backpatching

2 Backpatching for Boolean Expressions

3 Flow-of-Control Statements

4 Break-, Continue-, and Goto-Statements

5 Exercises for Section 6.7

A key problem when generating code for boolean expressions and flow-of-control statements is that of matching a jump instruction with the target of the jump For example, the translation of the boolean expression B in if ( B ) S contains a jump, for when B is false, to the instruction following the code for S. In a one-pass translation, B must be translated before S is examined. What then is the target of the goto that jumps over the code for S? In Section 6.6 we addressed this problem by passing labels as inherited attributes to where the relevant jump instructions were generated. But a separate pass is then needed to bind labels to addresses.

This section takes a complementary approach, called backpatching, in which lists of jumps are passed as synthesized attributes. Specifically, when a jump is generated, the target of the jump is temporarily left unspecified. Each such jump is put on a list of jumps whose labels are to be filled in when the proper label can be determined. All of the jumps on a list have the same target label.

1. One-Pass Code Generation Using Backpatching

Backpatching can be used to generate code for boolean expressions and flow-of-control statements in one pass. The translations we generate will be of the same form as those in Section 6.6, except for how we manage labels.

In this section, synthesized attributes truelist and falselist of nonterminal B are used to manage labels in jumping code for boolean expressions. In particu-lar, B. truelist will be a list of jump or conditional jump instructions into which we must insert the label to which control goes if B is true. B.falselist likewise is the list of instructions that eventually get the label to which control goes when B is false. As code is generated for B, jumps to the true and false exits are left incomplete, with the label field unfilled. These incomplete jumps are placed on lists pointed to by B.truelist and B.falselist, as appropriate. Similarly, a statement S has a synthesized attribute S.nextlist, denoting a list of jumps to the instruction immediately following the code for S.

For specificity, we generate instructions into an instruction array, and labels will be indices into this array. To manipulate lists of jumps, we use three functions:

1. makelist(i) creates a new list containing only i, an index into the array of instructions; makelist returns a pointer to the newly created list.

2. merge(pi,p2) concatenates the lists pointed to by p1 and p2, and returns a pointer to the concatenated list.

3. backpatch(p,i) inserts i as the target label for each of the instructions on the list pointed to by p.

2. Backpatching for Boolean Expressions

We now construct a translation scheme suitable for generating code for boolean expressions during bottom-up parsing. A marker nonterminal M in the gram-mar causes a semantic action to pick up, at appropriate times, the index of the next instruction to be generated. The grammar is as follows:

Consider semantic action (1) for the production B -> B1 || M B2. If Bx is true, then B is also true, so the jumps on Bi.truelist become part of B.truelist. If Bi is false, however, we must next test B2, so the target for the jumps B>i .falselist must be the beginning of the code generated for B2 • This target is obtained using the marker nonterminal M. That nonterminal produces, as a synthesized attribute M.instr, the index of the next instruction, just before B2 code starts being generated.

To obtain that instruction index, we associate with the production M -» e the semantic action

The variable nextinstr holds the index of the next instruction to follow. This value will be backpatched onto the Bi.falselist (i.e., each instruction on the list B1.falselist will receive M.instr as its target label) when we have seen the remainder of the production B ->• B1 || M B2.

Semantic action (2) for B -» B1 && M B2 is similar to (1). Action (3) for B -> ! B swaps the true and false lists. Action (4) ignores parentheses.

For simplicity, semantic action (5) generates two instructions, a conditional goto and an unconditional one. Neither has its target filled in. These instructions are put on new lists, pointed to by B.truelist and B.falselist, respectively.

Example 6.24 : Consider again the expression

An annotated parse tree is shown in Fig. 6.44; for readability, attributes tru-elist, falselist, and instr are represented by their initial letters. The actions are performed during a depth-first traversal of the tree. Since all actions appear at the ends of right sides, they can be performed in conjunction with reductions during a bottom-up parse. In response to the reduction of x < 100 to B by production (5), the two instructions

are generated. (We arbitrarily start instruction numbers at 100.) The marker nonterminal M in the production

records the value of nextinstr, which at this time is 102. The reduction of x > 200 to B by production (5) generates the instructions

The marker nonterminal M records the current value of nextinstr, which is now

Reducing x ! = y into B by production (5) generates

We now reduce by B —> B1 && M B2. The corresponding semantic action calls backpatch(B1.truelist,M.instr) to bind the true exit of B1 to the first instruction of B2. Since B1.truelist is {102} and M.instr is 104, this call to backpatch fills in 104 in instruction 102. The six instructions generated so far are thus as shown in Fig. 6.45(a).

The semantic action associated with the final reduction by B —> B1 || MB2 calls backpatch({101},102) which leaves the instructions as in Fig. 6.45(b).

The entire expression is true if and only if the gotos of instructions 100 or 104 are reached, and is false if and only if the gotos of instructions 103 or 105 are reached. These instructions will have their targets filled in later in the compilation, when it is seen what must be done depending on the truth or falsehood of the expression.

3. Flow-of-Control Statements

We now use backpatching to translate flow-of-control statements in one pass. Consider statements generated by the following grammar:

Here S denotes a statement, L a statement list, A an assignment-statement, and B a boolean expression. Note that there must be other productions, such as

those for assignment-statements. The productions given, however, are sufficient to illustrate the techniques used to translate flow-of-control statements.

The code layout for if-, if-else-, and while-statements is the same as in Section 6.6. We make the tacit assumption that the code sequence in the instruction array reflects the natural flow of control from one instruction to the next. If not, then explicit jumps must be inserted to implement the natural sequential flow of control.

The translation scheme in Fig. 6.46 maintains lists of jumps that are filled in when their targets are found. As in Fig. 6.43, boolean expressions generated by nonterminal B have two lists of jumps, B.truelist and B.falselist, corresponding to the true and false exits from the code for B, respectively. Statements gener-ated by nonterminals 5 and L have a list of unfilled jumps, given by attribute nextlist, that must eventually be completed by backpatching. S.nextlist is a list of all conditional and unconditional jumps to the instruction following the code for statement S in execution order. L.nextlist is defined similarly.

Consider the semantic action (3) in Fig. 6.46. The code layout for production 5 -» while (B) S1 is as in Fig. 6.35(c). The two occurrences of the marker nonterminal M in the production

record the instruction numbers of the beginning of the code for B and the beginning of the code for Si. The corresponding labels in Fig. 6.35(c) are begin and B.true, respectively.

Again, the only production for M is M -> e. Action (6) in Fig. 6.46 sets attribute M.instr to the number of the next instruction. After the body Si of the while-statement is executed, control flows to the beginning. Therefore, when we reduce while Mi (B ) M₂ S1 to S, we backpatch Sy.nextlist to make all targets on that list be My.instr. An explicit jump to the beginning of the code for B is appended after the code for Si because control may also "fall out the bottom." B. truelist is backpatched to go to the beginning of Si by making jumps on B.truelist go to M₂.instr.

A more compelling argument for using S.nextlist and L.nextlist comes when code is generated for the conditional statement if ( B ) Si else S2. If control "falls out the bottom" of Si, as when Si is an assignment, we must include at the end of the code for Si a jump over the code for S2- We use another marker nonterminal to generate this jump after Si. Let nonterminal N be this marker with production N -> e. N has attribute N.nextlist, which will be a list consisting of the instruction number of the jump goto _ that is generated by the semantic action (7) for N.

Semantic action (2) in Fig. 6.46 deals with if-else-statements with the syntax

We backpatch the jumps when B is true to the instruction Mi.instr; the latter is the beginning of the code for Si. Similarly, we backpatch jumps when B is false to go to the beginning of the code for S 2 . The list S.nextlist includes all jumps out of Si and S 2 , as well as the jump generated by N. (Variable temp is a temporary that is used only for merging lists.)

Semantic actions (8) and (9) handle sequences of statements. In

the instruction following the code for L± in order of execution is the beginning of S. Thus the L1.nextlist list is backpatched to the beginning of the code for S, which is given by M.instr. In L —> S, L.nextlist is the same as S.nextlist.

Note that no new instructions are generated anywhere in these semantic rules, except for rules (3) and (7). All other code is generated by the semantic actions associated with assignment-statements and expressions. The flow of control causes the proper backpatching so that the assignments and boolean expression evaluations will connect properly.

4. Break-, Continue-, and Goto-Statements

The most elementary programming language construct for changing the flow of control in a program is the goto-statement. In C, a statement like goto L sends control to the statement labeled L — there must be precisely one statement with label L in this scope. Goto-statements can be implemented by maintaining a list of unfilled jumps for each label and then backpatching the target when it is known.

Java does away with goto-statements. However, Java does permit disci-plined jumps called break-statements, which send control out of an enclosing construct, and continue-statements, which trigger the next iteration of an en-closing loop. The following excerpt from a lexical analyzer illustrates simple break- and continue-statements:

Control jumps from the break-statement On line 4 to the next statement after the enclosing for loop. Control jumps from the continue-statement on line 2 to code to evaluate readch() and then to the if-statement on line 2.

If 5 is the enclosing construct, then a break-statement is a jump to the first instruction after the code for S. We can generate code for the break by (1) keeping track of the enclosing statement S, (2) generating an unfilled jump for the break-statement, and (3) putting this unfilled jump on S.nextlist, where nextlist is as discussed in Section 6.7.3.

In a two-pass front end that builds syntax trees, S.nextlist can be imple-mented as a field in the node for S. We can keep track of S by using the symbol table to map a special identifier b r e a k to the node for the enclosing statement S. This approach will also handle labeled break-statements in Java, since the symbol table can be used to map the label to the syntax-tree node for the enclosing construct.

Alternatively, instead of using the symbol table to access the node for 5, we can put a pointer to S.nextlist in the symbol table. Now, when a break-statement is reached, we generate an unfilled jump, look up nextlist through the symbol table, and add the jump to the list, where it will be backpatched as discussed in Section 6.7.3.

Continue-statements can be handled in a manner analogous to the break-statement. The main difference between the two is that the target of the gen-erated jump is different.

5. Exercises for Section 6.7

Exercise 6.7.1 : Using the translation of Fig. 6.43, translate each of the fol-lowing expressions. Show the true and false lists for each subexpression. You may assume the address of the first instruction generated is 100.

E x e r c i s e 6 . 7 . 2 : In Fig. 6.47(a) is the outline of a program, and Fig. 6.47(b) sketches the structure of the generated three-address code, using the backpatch-ing translation of Fig. 6.46. Here, i_x through i₈ are the labels of the generated instructions that begin each of the "Code" sections. When we implement this translation, we maintain, for each boolean expression E, two lists of places in the code for E, which we denote by E.true and E.false. The places on list E.true are those places where we eventually put the label of the statement to which control must flow whenever E is true; E.false similarly lists the places where we put the label that control flows to when E is found to be false. Also,

Exercise 6 . 7 . 3 : When performing the translation of Fig. 6.47 using the scheme of Fig. 6.46, we create lists S.next for each statement, starting with the assign-ment-statements Si, S ₂ , and S 3 , and proceeding to progressively larger if-statements, if-else-statements, while-statements, and statement blocks. There are five constructed statements of this type in Fig. 6.47:

For each of these constructed statements, there is a rule that allows us to construct Sj.next in terms of other Sj.next lists, and the lists Ek.true and Ek-false for the expressions in the program. Give the rules for

Study Material, Lecturing Notes, Assignment, Reference, Wiki description explanation, brief detail

Compilers : Principles, Techniques, & Tools : Intermediate-Code Generation : Backpatching |