Chapter: Multicore Application Programming For Windows, Linux, and Oracle Solaris : Coding for Performance

How Potential Pointer Aliasing Can Inhibit Compiler Optimizations

One of the most common barriers to optimization with C and C++ codes is pointer aliasing.

How Potential Pointer Aliasing Can Inhibit Compiler Optimizations

One of the most common barriers to optimization with C and C++ codes is pointer aliasing. In most situations, a compiler cannot tell whether two (or more) pointers point to the same address in memory or to different addresses. The compiler needs to make the safe choice, so it will often default to assuming that the pointers do alias, even when the programmer knows that the pointers do not. In some cases, the compiler is able to work around this problem by producing multiple versions of the code and using runtime check-ing to determine which version is appropriate. Consider the code shown in Listing 2.45.

Listing 2.45 Code with Potential Pointer Aliasing

void sum( double * total, double * array, int len )

{

for (i=0; i<len; i++)

{

*total += array[i];

}

The compiler cannot determine whether the memory location where the variable total is stored is part of the array. It has to make the safe assumption that the variable is part of the array, which results in code that stores the value of the variable total back to memory in every iteration. The loop shown in Listing 2.46 comes from the code compiled at a low level of optimization.

Listing 2.46 Loop Containing Store Operation Because of Potential Aliasing

top:

ldd [%o1],%f0 ! Load of array[i]

add %o4,1,%o4 ! Increment i

add %o1,8,%o1 ! Increment pointer to array[i];

cmp %o4,%o5 ! Check for end of loop

faddd %f2,%f0,%f4 ! Perform addition

std %f4,[%o0] ! Store of the variable total

bl,a,pt %icc,top ! Loop to top

ldd [%o0],%f2 ! Reload of the variable total

At higher levels of optimization, the compiler can version the loop so that the version that performs the store to memory can be avoided if the loop contains no alias. Listing 2.47 shows the equivalent source code.

Listing 2.47 Source Code Showing Two Versions of Loop with Potential Pointer Aliasing

void sum( double * total, double * array, int len )

{

if ( (total < array) || (total > array + len) )

{

double tmp = *total;

for (int i=0; i<len; i++)

{

tmp += array[i];

}

*total = tmp;

}

else

for ( int i=0; i<len; i++ )

{

*total += array[i];

}

The modified source code uses a temporary variable to hold the calculated value so that the compiler knows that no aliasing is possible. This is a good technique to use in order to avoid possible aliasing issues with pointers to global data.

There are two fundamental performance issues in the presence of potential aliasing. The first is illustrated in the example disassembly in Listing 2.46. Aliasing requires the compiler to include unnecessary stores or loads of variables. It is possible to identify this problem by counting the generated memory operations and confirming that they corre-spond to the expected number from the source code.

The second issue is more subtle and involves the ability of the compiler to reorder instructions. Often, a compiler will move loads earlier in the instruction stream to give them more time to complete and move stores to later in the instruction stream to give more time for the instruction feeding data to them to complete. Unfortunately, aliasing issues limit the ability of the compiler to do this. When the disassembly is viewed, the problem appears as a store instruction followed immediately by a load instruction. This schedule ensures the correct memory ordering, but it may not be optimal for perform-ance. Listing 2.48 shows this problem.

Listing 2.48 Code with Potential Aliasing Issues

void func(int * a, int * b)

{

(*a)++;

(*b)++;

}

When compiled, this code produces the SPARC assembly code shown in Listing 2.49.

Listing 2.49 SPARC Assembly Code Produced in the Presence of Possible

Pointer Aliasing

ld [%o0],%o5 ! Load *a

add %o5,1,%o5 ! Increment

st %o5,[%o0] ! Store *a // Store of first variable

ld [%o1],%o4 ! Load *b // Load of second variable

add %o4,1,%o3 ! Increment

st %o3,[%o1] ! Store *b

The store of the variable a needs to be completed before the load of the variable b is issued. The code shown in Listing 2.50 has no pointer aliasing problems.

Listing 2.50 Code with No Aliasing Problems

void func(int * a)

{

a[0]++;

a[1]++;

}

Compiling the code shown in Listing 2.50 produces the SPARC assembly code shown in Listing 2.51.

Listing 2.51 SPARC Assembly Code Produced in the Absence of Pointer Aliasing

ld [%o0],%o5 ! Load a[0]

ld [%o0+4],%o4 ! Load a[1] // Load of second variable

add %o5,1,%o5 ! Increment

st %o5,[%o0] ! Store a[0] // Store of first variable

add %o4,1,%o3 ! Increment

st %o3,[%o1] ! Store a[1]

Because the compiler is able to tell that there is no aliasing between the two opera-tions, it can reorder the instructions to ensure that both loads start as early as possible.

A bigger problem with aliasing is that it often inhibits the compiler’s ability to per-form complex transformations of the code. Once a code has more than two streams of input data, it becomes very difficult to produce runtime code that dynamically checks for aliasing issues. For instance, the compiler would find runtime checking for aliasing difficult for the code shown in Listing 2.52. In this code, the matrix a is accessed con-tiguously so the compiler has knowledge of the range of memory that will be modified. The matrix b is accessed through the first index, which is a pointer into multiple arrays. Any of these arrays might overlap with the one pointed to by a.

Listing 2.52 Code Where Code Runtime Checking of Aliasing Is Difficult

void add(double **a, double **b)

{

for( int i=0; i<100; i++ ) for( int j=0; j<100; j++ )

a[i][j] += b[j][i];

}

There are multiple ways to avoid aliasing issues. The first is to use local or stack-based variables. The compiler knows that these variables cannot alias with global variables; therefore, it can produce code based on this knowledge. For scalar variables with aliasing problems, this can often be the simplest solution.

Another approach is to advise the compiler what assumptions it is allowed to make about the code. The Solaris Studio compiler supports the flag -xalias_level=<level>, which allows the developer to specify, per file, the degree of aliasing that the code uses. The compiler also supports the flag -xrestrict, which tells the compiler that pointers passed into functions do not alias. Incidentally, this is the default for the Fortran standard. The gcc compiler supports the flag -fansi_alias, which tells the compiler that the code has aliasing that conforms to the C standard. The biggest issue with these flags is that they specify aliasing at the file or whole application level, and for large applications it can be difficult or impossible to prove that applying the flags is safe.

Compilers often support pragmas or directives that can be added to the source code to indicate the degree of aliasing at a function level. The finer level of control means that the directives can be added in places (a) where the biggest performance impact is seen and (b) where the developer can be certain that aliasing does not occur. However, com-piler pragmas or directives are rarely the same across multiple compilers, so using them may lead to code that only compiles with a particular compiler and is not portable to others.

The most effective solution to the aliasing problem for C-language programs may be the restrict keyword. This enables the developer to use restrict-qualified pointers. These tell the compiler that no other pointers point to the same memory at the position in the code where the pointer is assigned. This is most useful for when pointers are passed into functions. Listing 2.53 shows the code from Listing 2.52 modified to use the restrict keyword.

Listing 2.53 Code Modified to Use the restrict Keyword

void add( double ** restrict a, double ** restrict b )

{

for( int i=0; i<100; i++ ) for( int j=0; j<100; j++ )

a[i][j] += b[j][i];

}

The fact that either array a or b is restrict-qualified means that there is no aliasing between the arrays and means the compiler can generate more efficient code.

Study Material, Lecturing Notes, Assignment, Reference, Wiki description explanation, brief detail

Multicore Application Programming For Windows, Linux, and Oracle Solaris : Coding for Performance : How Potential Pointer Aliasing Can Inhibit Compiler Optimizations |