Issues in the Design of a Code
Generator
1 Input to the Code Generator
2 The Target Program
3 Instruction Selection
4 Register Allocation
5 Evaluation Order
While the details are dependent on the specifics of the intermediate
represen-tation, the target language, and the run-time system, tasks such as
instruction selection, register allocation and assignment, and instruction
ordering are en-countered in the design of almost all code generators.
The most important criterion for a code generator is that it produce
cor-rect code. Correctness takes on special significance because of the number
of special cases that a code generator might face. Given the premium on
correct-ness, designing a code generator so it can be easily implemented,
tested, and maintained is an important design goal.
1. Input to the Code Generator
The input to the code generator is the intermediate representation of
the source program produced by the front end, along with information in the
symbol table that is used to determine the run-time addresses of the data
objects denoted by the names in the IR.
The many choices for the IR include three-address representations such
as quadruples, triples, indirect triples; virtual machine representations such
as bytecodes and stack-machine code; linear representations such as postfix
no-tation; and graphical representations such as syntax trees and DAG's. Many
of the algorithms in this chapter are couched in terms of the representations
considered in Chapter 6: three-address code, trees, and DAG's. The techniques
we discuss can be applied, however, to the other intermediate representations
as well.
In this chapter, we assume that the front end
has scanned, parsed,
and translated the source program into a relatively low-level IR, so
that the values of the names appearing in the IR can be represented by
quantities that the target machine can directly manipulate, such as integers
and floating-point numbers. We also assume that all syntactic and static
semantic errors have been detected, that the necessary type checking has taken
place, and that type-conversion operators have been inserted wherever
necessary. The code generator can therefore proceed on the assumption that its
input is free of these kinds of errors.
2. The Target Program
The instruction-set
architecture of the target machine has a
significant impact on the
difficulty of constructing a good
code generator that
produces high-quality machine code. The most common target-machine
architectures are RISC (reduced instruction set computer), CISC (complex
instruction set computer), and stack based.
A RISC machine typically has many
registers, three-address instructions, simple addressing modes, and a
relatively simple instruction-set architecture. In contrast, a CISC machine
typically has few registers, two-address instruc-tions, a variety of addressing
modes, several register classes, variable-length instructions, and instructions
with side effects.
In a stack-based machine,
operations are done by pushing operands onto a stack and then performing the
operations on the operands at the top of the stack. To achieve high performance
the top of the stack is typically kept in registers. Stack-based machines
almost disappeared because it was felt that the stack organization was too
limiting and required too many swap and copy operations.
However, stack-based architectures were revived
with the introduction of the Java Virtual Machine (JVM). The JVM is a software
interpreter for Java bytecodes, an intermediate language produced by Java
compilers. The inter- preter provides software compatibility across multiple
platforms, a major factor in the success of Java.
To overcome the high performance
penalty of interpretation, which can be on the order of a factor of 10, just-in-time (JIT) Java compilers have
been created. These JIT compilers translate bytecodes during run time to the
native hardware instruction set of the target machine. Another approach to
improving Java performance is to build a compiler that compiles directly into
the machine instructions of the target machine, bypassing the Java bytecodes
entirely.
Producing an absolute
machine-language program as output has the ad-vantage that it can be placed in
a fixed location in memory and immediately executed. Programs can be compiled
and executed quickly.
Producing a relocatable
machine-language program (often called an object
module) as output allows subprograms
to be compiled separately. A set of relocatable
object modules can be linked together and loaded for execution by a linking
loader. Although we must pay the added expense of linking and loading if we
produce relocatable object modules, we gain a great deal of flexibility in
being able to compile subroutines separately and to call other previously
compiled programs from an object module. If the target machine does not handle
relocation automatically, the compiler must provide explicit relocation
information to the loader to link the separately compiled program modules.
Producing an assembly-language program as output
makes the process of code generation somewhat easier. We can generate symbolic
instructions and use the macro facilities of the assembler to help generate
code. The price paid is the assembly step after code generation.
In this chapter, we shall use a very simple
RISC-like computer as our target machine. We add to it some CISC-like addressing
modes so that we can also discuss code-generation techniques for CISC machines.
For readability, we use assembly code as the target language . As long as
addresses can be calculated from offsets and other information stored in the
symbol table, the code gener-ator can produce relocatable or absolute addresses
for names just as easily as symbolic addresses.
3. Instruction Selection
The code generator must map the IR program into a code sequence that can
be executed by the target machine. The complexity of performing this mapping is
determined by a factors such as
• the level
of the IR
• the nature
of the instruction-set architecture
• the
desired quality of the generated code.
If the IR is high level, the code generator may
translate each IR statement into a sequence of machine instructions using code
templates. Such statement-by-statement code generation, however, often produces
poor code that needs further optimization. If the IR reflects some of the
low-level details of the un-derlying machine, then the code generator can use
this information to generate
more efficient code sequences.
The nature of the instruction set
of the target machine has a strong effect on the difficulty of instruction
selection. For example, the uniformity and com-pleteness of the instruction set
are important factors. If the target machine does not support each data type in
a uniform manner, then each exception to the general rule requires special
handling. On some machines, for example, floating-point operations are done
using separate registers.
Instruction speeds and machine idioms are other important factors. If we
do not care about the efficiency of the target program, instruction selection
is straightforward. For each type of three-address statement, we can design a
code skeleton that defines the target code to be generated for that construct.
For example, every three-address statement of the form x = y + z, where x, y,
and z are statically allocated, can be translated into the code sequence
Here, the fourth statement is redundant since it
loads a value that has just been stored, and so is the third if a is not
subsequently used.
The quality of the generated code is usually
determined by its speed and size. On most machines, a given IR program can be
implemented by many different code sequences, with significant cost differences
between the different implementations. A naive translation of the intermediate
code may therefore lead to correct but unacceptably inefficient target code.
For example, if the target machine has an "increment" instruction (INC), then the three-address statement a = a + 1 may be implemented more efficiently by the single instruction INC a, rather than by a more obvious sequence that loads a into a register, adds one to the register, and then stores the result back into a:
We need to know instruction costs
in order to design good code sequences but, unfortunately, accurate cost
information is often difficult to obtain. De-ciding which machine-code sequence
is best for a given three-address construct may also require knowledge about
the context in which that construct appears.
In Section 8.9 we shall see that instruction
selection can be modeled as a tree-pattern matching process in which we
represent the IR and the machine instructions as trees. We then attempt to
"tile" an IR tree with a set of sub-trees that correspond to machine
instructions. If we associate a cost with each machine-instruction subtree, we
can use dynamic programming to generate op-timal code sequences. Dynamic
programming is discussed in Section 8.11.
4. Register Allocation
A key problem in code generation is deciding what
values to hold in what registers. Registers are the fastest computational unit
on the target machine, but we usually do not have enough of them to hold all
values. Values not held in registers need to reside in memory. Instructions involving
register operands are invariably shorter and faster than those involving
operands in memory, so efficient utilization of registers is particularly
important.
The use of registers is often subdivided into two
subproblems:
Register allocation, during
which we select the set of variables that will reside in registers at each point in the program.
Register assignment, during
which we pick the specific register that a
variable will reside in.
Finding an optimal assignment of registers to
variables is difficult, even with single-register machines. Mathematically, the
problem is NP-complete. The problem is further complicated because the hardware
and/or the operating system of the target machine may require that certain
register-usage conventions be observed.
E x a m p l e 8 . 1 : Certain machines require register-pairs (an even and next odd-numbered register) for some
operands and results. For example, on some ma-chines, integer multiplication
and integer division involve register pairs. The multiplication instruction is
of the form
M x, y
where x, the multiplicand, is the even register of an even/odd register
pair and y, the multiplier, is the odd register. The product occupies the
entire even/odd register pair. The division instruction is of the form
D x, y
where the dividend occupies an even/odd register pair whose even
register is x; the divisor is y. After division, the even register holds the
remainder and the odd register the quotient.
Now, consider the two
three-address code sequences in Fig. 8.2 in which
the only difference in (a) and (b) is the operator in the second statement. The
shortest assembly-code sequences for (a)
and (b) are given in Fig. 8.3.
Ri stands for register i. SRDA stands for Shift-Right-Double-Arithmetic and SRDA RO, 32 shifts
the dividend into
Rl and clears RO so all
bits equal its sign
bit. L, ST, and A stand for load, store, and add, respectively. Note that the optimal
choice for the register into which a is to be loaded depends on what will ultimately happen to t.
Strategies for register
allocation and assignment are discussed in Section 8.8. Section 8.10 shows
that for certain classes of machines we can construct code sequences that
evaluate expressions using as few registers as possible.
5. Evaluation Order
The order in which computations
are performed can affect the efficiency of the target code. As we shall see,
some computation orders require fewer registers to hold intermediate results
than others. However, picking a best order in the general case is a difficult
NP-complete problem. Initially, we shall avoid the problem by generating code
for the three-address statements in the order in which they have been produced
by the intermediate code generator. In Chapter 10, we shall study code scheduling
for pipelined machines that can execute several operations in a single clock
cycle.
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2026 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.