Using Profile Feedback
Most compilers support profile feedback, which is a mechanism that
enables the compiler to gather information about the runtime behavior of the
application. Consider the snippet of code shown in Listing 2.43.
Listing 2.43 Code Where the Runtime Behavior of the Code Is Uncertain
if ( a != 0 )
{
  d++;
}
else
{
  d--;
}
In this situation, the compiler has no idea whether the general case is
to increment or decrement the variable d. The usual solution is for the compiler to either guess one is more
likely than the other or produce code that favors neither assumption. However,
if the code is in the frequently executed part of the application, the
appropriate choice may lead to an observable improvement in performance.
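To make the uncertainty concrete, the following sketch wraps the branch from Listing 2.43 in a loop. The function name net_change and its array parameter are hypothetical, introduced only for illustration. Which arm of the branch is hot depends entirely on the input data, which the compiler cannot see at compile time but a training run can record.

```c
#include <stddef.h>

/* Hypothetical helper: applies the branch from Listing 2.43 once per
   element of a[] and returns the final value of d. */
long net_change(const int *a, size_t n)
{
    long d = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] != 0) {
            d++;    /* hot arm when the input is mostly nonzero */
        } else {
            d--;    /* hot arm when the input is mostly zero */
        }
    }
    return d;
}
```

Calling net_change on an array that is mostly nonzero exercises the increment arm almost exclusively, while a mostly zero array does the opposite. The source code is identical in both cases, so only runtime data can reveal the common direction.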
Another case where knowledge of the runtime behavior of the application is useful is in
determining which routines to inline. As discussed in the previous section, picking
the correct routine to inline can lead to significant performance benefits.
However, every time a called routine gets inlined, it increases the number of
instructions in the calling routine. This code size increase is likely to cause
the instruction cache to be less efficiently utilized, leading to a drop in
performance. Hence, it can be quite important to inline routines that will
benefit performance and avoid inlining those that will only increase the
instruction cache footprint.
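The trade-off can be sketched with two routines that are called at very different rates. All names below (clamp_to_byte, count_out_of_range, sum_clamped) are hypothetical and chosen only for illustration. The small routine called once per element is the kind of call a profile would typically mark as worth inlining, while the routine called once outside the loop would mostly add code size if inlined. Whether a given compiler actually makes these choices is an assumption, not something the sketch demonstrates.

```c
#include <stddef.h>

/* Small and called once per element: a plausible inlining candidate. */
static int clamp_to_byte(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

/* Called once per invocation, outside the hot loop: inlining it would
   grow the caller without removing many calls. */
static long count_out_of_range(const int *v, size_t n)
{
    long bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0 || v[i] > 255) {
            bad++;
        }
    }
    return bad;
}

/* Sums the clamped values; optionally reports how many were out of range. */
long sum_clamped(const int *v, size_t n, long *out_of_range)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += clamp_to_byte(v[i]);   /* call made n times */
    }
    if (out_of_range != NULL) {
        *out_of_range = count_out_of_range(v, n);   /* call made once */
    }
    return sum;
}
```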
Profile feedback, or feedback-directed optimization, gives the compiler the
opportunity to gather runtime information on the behavior of the application.
It is a three-step process. The first step is to build an instrumented version
of the application to collect runtime metrics. The next step is to run this
instrumented application on a data set that is “typical” of the one that the
application will really run on but whose runtime is much shorter. The final step is to
recompile the application using this profile information. Listing 2.44 shows
the steps using the Solaris Studio compiler.
Listing 2.44 Steps for Using Profile Feedback with the Sun Studio Compiler
$ cc -O -xprofile=collect:./profile -o a.out prog.c
$ a.out
$ cc -O -xprofile=use:./profile -o a.out prog.c
The benefit of profile feedback depends on the
application. Some applications will see no benefit, while some may see a
significant gain. As outlined earlier, the gains typically come from either
getting the compiler to lay out a performance-critical section of code in an
optimal way or inlining a performance-critical routine.
It is interesting to observe that profile feedback
tends to give the greatest benefit to codes where there are lots of branches or
calls rather than codes where there are a lot of loops. The compiler can
predict that loops will be iterated many times but has a harder job correctly
guessing for codes where there are plenty of control flow instructions. Codes
that have significant control flow instructions also tend to have few
instructions between control flow, so there are not many opportunities for the
compiler to extract performance in other ways. Hence, profile feedback can be
the most effective way of improving performance in a class of codes that is
otherwise hard to optimize.
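The contrast between the two kinds of code can be illustrated with a pair of small routines. Both are hypothetical examples written for this comparison: dot is dominated by a single loop whose backward branch is almost always taken, whereas classify consists of several data-dependent tests with very little work between them.

```c
#include <stddef.h>

/* Loop-dominated: one loop branch whose direction is easy to predict. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Branch-dominated: the likely outcome of each test depends on the input.
   Returns 0 for a digit, 1 for a letter, 2 for white space, 3 otherwise.
   The range tests on letters assume an ASCII-compatible character set. */
int classify(int c)
{
    if (c >= '0' && c <= '9') return 0;
    if (c >= 'a' && c <= 'z') return 1;
    if (c >= 'A' && c <= 'Z') return 1;
    if (c == ' ' || c == '\t' || c == '\n') return 2;
    return 3;
}
```

For classify, the best ordering of the tests depends on whether the input is mostly digits, letters, or white space, which is the kind of information a training run supplies.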
There are two concerns with using profile feedback. The first is that using profile
feedback complicates the build process and increases its duration. This can be
controlled by using profile feedback only on the release builds and not as part
of the regular developer builds. It can also be managed by ensuring that the
build process is as efficient as possible. For instance, the build process can
be parallelized so that it takes advantage of multiple cores. The other concern
is that using profile feedback optimizes the application for one particular
scenario at the expense of the performance in other scenarios. This is the
zero-sum view of performance; a gain on one workload has
to be compensated by a loss of performance in another. In general, this concern
is misplaced. Profile feedback helps the compiler make decisions about the
frequently executed paths and frequently called functions. In most instances,
the behavior of the application is only weakly dependent on the input data set.
For example, the same routines get called (although with a different
frequency), the same branches get taken, and so forth. This does not mean that
every control transfer instruction has the same profile, but the majority of
the control transfers in the code have the same direction.
The exception is an application that has different “modes”: explicit
modes where the application is requested to perform different tasks or implicit
modes where some characteristic of the input data causes the application to
behave in a particular way.
An explicit mode might appear in the code as a switch/case
statement that calls entirely different code sections depending on an input
condition. An implicit mode might be an application that has multiple ways of
solving a problem, and the problem-solving approach used at each stage in the
solution depends on the results of the previous steps.
If the application has modes of operation, then it is necessary to
provide training inputs that capture all the different modes of operation. The
profile of the application and the code coverage data for the particular
training data used provide the best indication of whether the application has
these modes. Input data sets that leave significant parts of the code
base unexecuted are a strong indicator that these modes exist and that
more input data sets should be used in providing training data
for the application’s build.
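As a minimal sketch of an explicit mode and of checking that training inputs reach every mode, the following code dispatches through a switch/case statement and keeps a hit counter per mode. The enum task, the functions run_task and untrained_tasks, and the trivial work done in each case are all hypothetical. A real project would normally rely on the coverage data produced by its tools rather than hand-written counters.

```c
#include <stddef.h>

/* Hypothetical explicit modes of an application. */
enum task { TASK_ENCODE, TASK_DECODE, TASK_VERIFY, TASK_COUNT };

/* How often each mode was requested during the current run. */
static unsigned long task_hits[TASK_COUNT];

/* Dispatches to a different code section for each mode.
   Returns -1 for a value that is not a known mode. */
int run_task(enum task t, int value)
{
    if ((unsigned)t >= (unsigned)TASK_COUNT) {
        return -1;
    }
    task_hits[t]++;
    switch (t) {
    case TASK_ENCODE: return value + 1;    /* stand-in for encode work */
    case TASK_DECODE: return value - 1;    /* stand-in for decode work */
    case TASK_VERIFY: return value == 0;   /* stand-in for verify work */
    default:          return -1;
    }
}

/* Number of modes that the run so far has never exercised. */
size_t untrained_tasks(void)
{
    size_t missing = 0;
    for (size_t i = 0; i < TASK_COUNT; i++) {
        if (task_hits[i] == 0) {
            missing++;
        }
    }
    return missing;
}
```

If untrained_tasks returns a nonzero value at the end of a training run, the training inputs missed at least one mode, and the profile will contain no information about the code behind it.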
The performance benefit from compiling with profile feedback is
variable. Codes where the time is spent in loops tend to benefit less from
profile feedback, whereas codes containing high numbers of control transfer
instructions tend to see a much greater benefit. The typical gain is probably
around 5% to 10%, but gains can be much greater if the profile feedback happens
to lead to other opportunities for further performance gains. The decision
to use profile feedback should be based on whether it delivers a
measurable performance gain for the application in question.