4.6. Restrict
4.6.1. Overview
The restrict keyword is a qualifier for a pointer variable’s type. By applying restrict to the type declaration of a pointer p, the programmer makes the following guarantee to the compiler:
Within the scope of the declaration of p, only p or expressions based on p will be used to access the object pointed to by p.
The compiler can take advantage of this guarantee to generate more efficient code.
Explanation of the guarantee:
Within the scope of the declaration of p
p is a pointer variable. Examples are p1, s.p2, p3[i], and both p4 and p5 in p4->p5[]. The program region over which the restriction applies is the scope of p’s declaration.
only p or expressions based on p
This refers to the pointer in such accesses as *p, p[i], and p[i+3].
will be used to access the object pointed to by p
Only actual fetches and stores are accesses. p[i] is an access, but &p[i] and p+i are not.
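The distinction between accesses and address computations can be sketched as follows. The function name sum_range and its body are hypothetical and serve only to mark which expressions count as accesses under the guarantee.

```c
#include <stddef.h>

/* Hypothetical helper: sums n floats starting at p. */
float sum_range(const float *restrict p, size_t n)
{
    const float *end = p + n;   /* address computation only: not an access */
    const float *cur = &p[0];   /* taking an address: not an access */
    float total = 0.0f;

    while (cur != end)
    {
        total += *cur;          /* fetch through an expression based on p: an access */
        cur++;
    }
    return total;
}
```

Here cur is derived from p, so *cur is an access through an expression based on p and stays within the guarantee.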
Warning
Incorrect usage of restrict can lead to the compiler generating incorrect code. An example of incorrect usage is applying restrict to pointers that point to overlapping objects in memory. Refer to Incorrect Usage for an example.
4.6.2. Example
The comparison below illustrates the effectiveness of using restrict. Adding the restrict qualifier to the types of pointers a1 and b1 guarantees to the compiler that these pointers will not be used to access the same memory locations as t->sum1 or t->sum2. Without the qualifier, the compiler must assume that a store to t->sum1 or t->sum2 could change a1[i] or b1[i]. With it, the compiler can generate a more efficient sequence of instructions for the loop.
In Table 4.7, the loop executes 256 times. The cycle counts were measured on F280049C with code and data in RAM and with -O3 --opt_for_speed=5. With restrict, the cycle count drops from 3618 to 1209 cycles.
Table 4.7
Without restrict (3618 cycles):
#include <stdint.h>
typedef struct
{
    float* a;
    float* b;
    float sum1;
    float sum2;
    int16_t N;
} Test;
void foo2(Test *t)
{
    float* a1 = t->a;
    float* b1 = t->b;
    int i;
    for (i = 0; i < t->N; i++)
    {
        t->sum1 += a1[i] * b1[i];
        t->sum2 += a1[i] * a1[i];
    }
}
With restrict (1209 cycles):
#include <stdint.h>
typedef struct
{
    float* a;
    float* b;
    float sum1;
    float sum2;
    int16_t N;
} Test;
void foo1(Test *t)
{
    float* restrict a1 = t->a;
    float* restrict b1 = t->b;
    int i;
    for (i = 0; i < t->N; i++)
    {
        t->sum1 += a1[i] * b1[i];
        t->sum2 += a1[i] * a1[i];
    }
}
Note
restrict is effective only at --opt_level=2 or higher.
4.6.3. Usage
4.6.3.1. Global variables
int *restrict p1;
int *restrict p2;
extern int A[];
Taken together, these file scope declarations of global variables guarantee to the compiler that if an object is accessed using any one of p1, p2, or A[] it will not be accessed using any of the others. Furthermore, since the file scope encompasses all other scopes, no accesses through local pointer variables can access the object pointed to by p1 or p2.
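One way to satisfy this file-scope guarantee is to give each restrict-qualified pointer its own heap block, as in the sketch below. The names N, setup, fill, and teardown are hypothetical, and A is defined here (rather than declared extern) only so that the sketch builds on its own.

```c
#include <stdlib.h>

#define N 4                      /* hypothetical size for this sketch */

int *restrict p1;
int *restrict p2;
int A[N] = {1, 2, 3, 4};         /* defined here so the sketch links on its own */

/* Give each restrict pointer its own heap block, disjoint from A. */
int setup(void)
{
    p1 = malloc(N * sizeof *p1);
    p2 = malloc(N * sizeof *p2);
    return p1 != NULL && p2 != NULL;
}

/* Each object is reached through exactly one of p1, p2, or A. */
void fill(void)
{
    int i;
    for (i = 0; i < N; i++)
    {
        p1[i] = A[i];
        p2[i] = 2 * A[i];
    }
}

/* Release both blocks; free(NULL) is harmless if setup() failed. */
void teardown(void)
{
    free(p1);
    free(p2);
    p1 = NULL;
    p2 = NULL;
}
```

Because the three objects never overlap, the stores through p1 and p2 cannot change A, which is what the declarations promise to the compiler.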
4.6.3.2. Function parameters
The parameters in a function declaration have function prototype scope, which terminates at the end of the declaration:
void foo(float *restrict v1, float *v2, int n);
In this function’s definition, the parameters have the same block scope as i:
void foo(float *restrict v1, float *v2, int n)
{
    int i;
    ...
}
Restricting v1 guarantees to the compiler that the object pointed to by v1 does not overlap with objects pointed to by other pointers in the body of foo().
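As a rough illustration, the sketch below uses the same parameter shape as foo() with a hypothetical name and body (add_into is not from the original). Since only the first parameter is restrict-qualified, the caller is promising that the array written through v1 is not also read through v2.

```c
/* Hypothetical function: adds each element of v2 into v1.
 * The restrict qualifier on v1 tells the compiler that a store
 * to v1[i] cannot change any element read through v2. */
void add_into(float *restrict v1, const float *v2, int n)
{
    int i;
    for (i = 0; i < n; i++)
    {
        v1[i] += v2[i];
    }
}
```

A call such as add_into(x, y, n) with two separate arrays x and y honors the guarantee. A call such as add_into(x, x + 1, n) does not.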
Note
In C, an array parameter is adjusted to a pointer to the array’s first element. To restrict-qualify that pointer, the restrict keyword should appear inside the brackets as follows:
void foo(short a[restrict 100]);
4.6.3.3. Local pointer variables
void foo(Test *t)
{
    float* restrict a1 = t->a;
    float* restrict b1 = t->b;
    ...
}
Adding the restrict qualifier to the types of the local pointer variables a1 and b1 lets the programmer make the guarantee only within the smaller scope of the function body.
4.6.4. Incorrect Usage
Listing 4.13 is an example of incorrect use of restrict. Pointers p and q are restrict-qualified. However, the arguments passed to copy refer to overlapping regions of the same array. This can lead to the compiler generating invalid code.
void copy(int n, int *restrict p, int *restrict q)
{
    while (n-- > 0)
        *p++ = *q++;
}
void test(void)
{
    extern int d[100];
    copy(50, d+1, d); // Breaks the restrict guarantee!
}
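For contrast, the sketch below shows two ways to stay within the guarantee. The names test_ok and copy_overlap_ok are hypothetical, and d is defined here only so the sketch builds on its own. The first keeps copy() as it is and passes non-overlapping halves of the array. The second drops restrict and relies on the standard memmove function, which is specified to handle overlapping regions.

```c
#include <stddef.h>
#include <string.h>

int d[100];

void copy(int n, int *restrict p, int *restrict q)
{
    while (n-- > 0)
        *p++ = *q++;
}

/* Valid: d[50..99] and d[0..49] do not overlap. */
void test_ok(void)
{
    copy(50, d + 50, d);
}

/* Hypothetical variant without restrict for regions that may overlap.
 * memmove copies as if through a temporary buffer. */
void copy_overlap_ok(int n, int *p, const int *q)
{
    if (n > 0)
        memmove(p, q, (size_t)n * sizeof *p);
}
```

Note that memmove preserves the original source values, so its result can differ from what an element-by-element forward loop would produce on overlapping regions.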