10.5.10. Assigning Symbols at Link Time
Linker assignment statements allow you to define external (global) symbols and assign values to them at link time. You can use this feature to initialize a variable or pointer to an allocation-dependent value. See Using Linker Symbols in C/C++ Applications for information about referring to linker symbols in C/C++ code.
10.5.10.1. Syntax of Assignment Statements
The syntax of assignment statements in the linker is similar to that of assignment statements in the C language:
Assignment Statement Syntax in Linker Command Files
symbol = expression;     assigns the value of expression to symbol
symbol += expression;    adds the value of expression to symbol
symbol -= expression;    subtracts the value of expression from symbol
symbol *= expression;    multiplies symbol by expression
symbol /= expression;    divides symbol by expression
The symbol should be defined externally. If it is not, the linker defines a new symbol and enters it into the symbol table. The expression must follow the rules defined in Assignment Expressions. Assignment statements must terminate with a semicolon.
The linker processes assignment statements after it allocates all the output sections. Therefore, if an expression contains a symbol, the address used for that symbol reflects the symbol’s address in the executable output file.
For example, suppose a program reads data from one of two tables identified by two external symbols, Table1 and Table2. The program uses the symbol cur_tab as the address of the current table. The cur_tab symbol must point to either Table1 or Table2. You could accomplish this in the assembly code, but you would need to reassemble the program to change tables. Instead, you can use a linker assignment statement to assign cur_tab at link time:
prog.c.o /* Input file */
cur_tab = Table1; /* Assign cur_tab to one of the tables */
10.5.10.2. Assigning the SPC to a Symbol
A special symbol, denoted by a dot (.), represents the current value of the section program counter (SPC) during allocation. The SPC keeps track of the current location within a section. The linker’s . symbol is analogous to the assembler’s $ symbol. The . symbol can be used only in assignment statements within a SECTIONS directive because . is meaningful only during allocation and SECTIONS controls the allocation process. (See The SECTIONS Directive.)
The . symbol refers to the current run address, not the current load address, of the section.
For example, suppose a program needs to know the address of the beginning of the .data section. By using the .global directive (see Global (External) Symbols), you can create an external undefined variable called Dstart in the program. Then, assign the value of . to Dstart:
SECTIONS
{
.text: {}
.data: {Dstart = .;}
.bss : {}
}
This defines Dstart to be the first linked address of the .data section. (Dstart is assigned before .data is allocated.) The linker relocates all references to Dstart.
A special type of assignment assigns a value to the . symbol. This adjusts the SPC within an output section and creates a hole between two input sections. Any value assigned to . to create a hole is relative to the beginning of the section, not to the address actually represented by the . symbol. Holes and assignments to . are described in Creating and Filling Holes.
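The difference between an offset and an address can be illustrated with a small host-side C sketch. It models only the arithmetic described above and is not linker code; the function name, the struct, and the example values are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Result of modeling ". = value;" inside an output section. */
typedef struct {
    uint32_t new_address; /* run address the SPC refers to afterwards */
    uint32_t hole_size;   /* bytes skipped between the old and new SPC */
} spc_assignment;

/*
 * Model an assignment to "." inside an output section.
 *   section_base   - run address where the output section starts (assumed)
 *   current_offset - SPC offset from section_base before the assignment
 *   value          - right-hand side of ". = value;"
 * The value is treated as an offset from section_base, not as an address.
 */
spc_assignment assign_to_dot(uint32_t section_base,
                             uint32_t current_offset,
                             uint32_t value)
{
    assert(value >= current_offset); /* this sketch only moves the SPC forward */

    spc_assignment result;
    result.new_address = section_base + value;
    result.hole_size = value - current_offset;
    return result;
}
```

Under these assumptions, a section based at 0x8000 whose SPC is at offset 0x40 ends up with the SPC at address 0x8100 after `. = 0x100;`, leaving a hole of 0xC0 bytes.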
10.5.10.3. Assignment Expressions
These rules apply to linker expressions:
Expressions can contain global symbols, constants, and the C language operators listed in the table below.
All numbers are treated as long (32-bit) integers.
Constants are identified by the linker in the same way as by the assembler. That is, numbers are recognized as decimal unless they have a suffix (H or h for hexadecimal and Q or q for octal). C language prefixes are also recognized (0 for octal and 0x for hex). Hexadecimal constants must begin with a digit. No binary constants are allowed.
Symbols within an expression have only the value of the symbol’s address. No type-checking is performed.
Linker expressions can be absolute or relocatable. If an expression contains any relocatable symbols (and 0 or more constants or absolute symbols), it is relocatable. Otherwise, the expression is absolute. If a symbol is assigned the value of a relocatable expression, it is relocatable; if it is assigned the value of an absolute expression, it is absolute.
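The rules for recognizing constants can be sketched as a small C function. This is an illustrative host-side model, not the linker's actual parser: the function names are hypothetical, and the handling of inputs that the rules above do not cover (for example, a constant with both a 0x prefix and an h suffix) is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return the numeric value of a hex digit, or -1 if c is not one. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Parse a constant using the rules listed above: decimal by default,
 * H/h suffix or 0x prefix for hexadecimal, Q/q suffix or 0 prefix for
 * octal. Every constant must begin with a digit, and there is no binary
 * form. Returns false if the text is not a valid constant.
 */
bool parse_linker_constant(const char *text, uint32_t *out)
{
    assert(text != NULL && out != NULL);

    size_t len = strlen(text);
    if (len == 0 || !isdigit((unsigned char)text[0]))
        return false;

    unsigned base = 10;
    size_t start = 0;
    size_t end = len;
    char last = text[len - 1];

    if (len > 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X')) {
        base = 16;      /* C-style hex prefix */
        start = 2;
    } else if (last == 'h' || last == 'H') {
        base = 16;      /* assembler-style hex suffix */
        end = len - 1;
    } else if (last == 'q' || last == 'Q') {
        base = 8;       /* assembler-style octal suffix */
        end = len - 1;
    } else if (len > 1 && text[0] == '0') {
        base = 8;       /* C-style octal prefix */
        start = 1;
    }

    uint32_t value = 0;
    for (size_t i = start; i < end; i++) {
        int d = digit_value(text[i]);
        if (d < 0 || (unsigned)d >= base)
            return false;
        value = value * base + (uint32_t)d; /* wraps modulo 2^32 */
    }

    *out = value;
    return true;
}
```

In this model, the strings 255, 0FFh, 0xff, 377q, and 0377 all produce the same value, while FFh is rejected because it does not begin with a digit.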
The linker supports the C language operators listed in the following table in order of precedence. Operators in the same group have the same precedence. Besides the operators listed in this table, the linker also has an align operator that allows a symbol to be aligned on an n-byte boundary within an output section (n is a power of 2). Because the align operator is a function of the current SPC, it can be used only in the same context as the . symbol, that is, within a SECTIONS directive. For example, the following expression aligns the SPC within the current section on the next 16-byte boundary:
. = align(16);
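The arithmetic behind such an alignment can be sketched in C. This is a minimal host-side model of rounding a value up to a power-of-2 boundary; the function name is hypothetical, and the sketch does not claim to reproduce how the linker implements align internally.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round spc up to the next multiple of n, where n is a power of 2.
 * A value that is already aligned is returned unchanged.
 * Values within n - 1 of UINT32_MAX wrap around in this sketch.
 */
uint32_t align_up(uint32_t spc, uint32_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0); /* n must be a power of 2 */
    return (spc + (n - 1)) & ~(n - 1);
}
```

With n equal to 16, an SPC value of 0x1003 rounds up to 0x1010, and 0x1010 stays at 0x1010.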
Groups of Operators Used in Expressions, from highest to lowest precedence:
Precedence Group  Operator  Description
Group 1           !         Logical NOT
                  ~         Bitwise NOT
                  -         Negation
Group 2           *         Multiplication
                  /         Division
                  %         Modulus
Group 3           +         Addition
                  -         Subtraction
Group 4           >>        Arithmetic right shift
                  <<        Arithmetic left shift
Group 5           ==        Equal to
                  !=        Not equal to
                  >         Greater than
                  <         Less than
                  <=        Less than or equal to
                  >=        Greater than or equal to
Group 6           &         Bitwise AND
Group 7           |         Bitwise OR
Group 8           &&        Logical AND
Group 9           ||        Logical OR
Group 10          =         Assignment
                  +=        A += B is equivalent to A = A + B
                  -=        A -= B is equivalent to A = A - B
                  *=        A *= B is equivalent to A = A * B
                  /=        A /= B is equivalent to A = A / B
10.5.10.4. Symbols Automatically Defined by the Linker
The linker automatically defines the following symbols:
.text is assigned the first address of the .text output section. (It marks the beginning of executable code.)
etext is assigned the first address following the .text output section. (It marks the end of executable code.)
.data is assigned the first address of the .data output section. (It marks the beginning of initialized data tables.)
edata is assigned the first address following the .data output section. (It marks the end of initialized data tables.)
.bss is assigned the first address of the .bss output section. (It marks the beginning of uninitialized data.)
end is assigned the first address following the .bss output section. (It marks the end of uninitialized data.)
The linker automatically defines the following symbols for C/C++ support when the --ram_model or --rom_model option is used.
__TI_STACK_SIZE | is assigned the size of the .stack section. |
__TI_STACK_END | is assigned the end of the .stack section. |
__TI_SYSMEM_SIZE | is assigned the size of the .sysmem section. |
These linker-defined symbols can be accessed in any assembly language module if they are declared with a .global directive (see Global (External) Symbols).
See Using Linker Symbols in C/C++ Applications for information about referring to linker symbols in C/C++ code.
10.5.10.5. Assigning Exact Start, End, and Size Values of a Section to a Symbol
The code generation tools currently support the ability to load program code in one area of (slow) memory and run it in another (faster) area. This is done by specifying separate load and run addresses for an output section or group in the linker command file. The program then executes a sequence of instructions (the copying code in Referring to the Load Address by Using the .label Directive) that moves the program code from its load area to its run area before the code is needed.
Setting up a system with this feature places several responsibilities on the programmer. One of these is to determine the size and run-time address of the program code to be moved. The current mechanism for doing this uses .label directives in the copying code. A simple example is illustrated in Referring to the Load Address by Using the .label Directive.
This method of specifying the size and load address of the program code has limitations. While it works fine for an individual input section that is contained entirely within one source file, this method becomes more complicated if the program code is spread over several source files or if the programmer wants to copy an entire output section from load space to run space.
Another problem with this method is that it does not account for the possibility that the section being moved may have an associated far call trampoline section that needs to be moved with it.
10.5.10.6. Why the Dot Operator Does Not Always Work
The dot operator (.) is used to define symbols at link time with a particular address inside an output section. It is interpreted like a program counter: the value associated with the dot is whatever the current offset within the current section is. Consider an output section specification within a SECTIONS directive:
outsect:
{
s1.c.o(.text)
end_of_s1 = .;
start_of_s2 = .;
s2.c.o(.text)
end_of_s2 = .;
}
This statement creates three symbols:
end_of_s1—the end address of .text in s1.c.o
start_of_s2—the start address of .text in s2.c.o
end_of_s2—the end address of .text in s2.c.o
Suppose there is padding between s1.c.o and s2.c.o created as a result of alignment. Then start_of_s2 is not really the start address of the .text section in s2.c.o, but it is the address before the padding needed to align the .text section in s2.c.o. This is due to the linker’s interpretation of the dot operator as the current PC. It is also true because the dot operator is evaluated independently of the input sections around it.
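The effect of alignment padding on these symbols can be modeled numerically in C. This is a host-side sketch only: the struct, the function name, and the idea of passing sizes and an alignment as parameters are assumptions made for illustration, and padding at the end of the output section is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses produced for the outsect example in this model. */
typedef struct {
    uint32_t end_of_s1;        /* end of .text from s1.c.o */
    uint32_t start_of_s2_dot;  /* value "." yields: before any padding */
    uint32_t start_of_s2_real; /* where .text from s2.c.o is placed */
    uint32_t end_of_s2;        /* end of .text from s2.c.o */
} outsect_layout;

/*
 * Lay out two input sections back to back, aligning the second one.
 *   base     - address where the output section starts (assumed)
 *   s1_size  - size of .text from s1.c.o
 *   s2_size  - size of .text from s2.c.o
 *   s2_align - required alignment of the second section, a power of 2
 */
outsect_layout layout_outsect(uint32_t base, uint32_t s1_size,
                              uint32_t s2_size, uint32_t s2_align)
{
    assert(s2_align != 0 && (s2_align & (s2_align - 1)) == 0);

    outsect_layout l;
    l.end_of_s1 = base + s1_size;
    /* The dot is evaluated before the padding for s2 is inserted. */
    l.start_of_s2_dot = l.end_of_s1;
    /* The section itself starts at the next aligned address. */
    l.start_of_s2_real = (l.end_of_s1 + (s2_align - 1)) & ~(s2_align - 1);
    l.end_of_s2 = l.start_of_s2_real + s2_size;
    return l;
}
```

For a base of 0x1000, a first section of 0x1A bytes, and a second section of 0x20 bytes aligned to 16, the dot yields 0x101A for start_of_s2 while the section actually starts at 0x1020.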
Another potential problem in the above example is that end_of_s2 may not account for any padding that was required at the end of the output section. You cannot reliably use end_of_s2 as the end address of the output section. One way to get around this problem is to create a dummy section immediately after the output section in question. For example:
GROUP
{
outsect:
{
start_of_outsect = .;
...
}
dummy: { size_of_outsect = . - start_of_outsect; }
}
10.5.10.7. Address and Dimension Operators
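Six operators allow you to define symbols for the load-time and run-time addresses and sizes of an allocation unit. A seventh operator, LAST, applies to a memory range and is described in LAST Operator: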
LOAD_START(sym), START(sym) | Defines sym with the load-time start address of related allocation unit |
LOAD_END(sym), END(sym) | Defines sym with the load-time end address of related allocation unit |
LOAD_SIZE(sym), SIZE(sym) | Defines sym with the load-time size of related allocation unit |
RUN_START(sym) | Defines sym with the run-time start address of related allocation unit |
RUN_END(sym) | Defines sym with the run-time end address of related allocation unit |
RUN_SIZE(sym) | Defines sym with the run-time size of related allocation unit |
LAST(sym) | Defines sym with the run-time address of the last allocated byte in the related memory range. |
Note
Linker Command File Operator Equivalencies: LOAD_START() and START() are equivalent, as are LOAD_END()/END() and LOAD_SIZE()/SIZE(). The LOAD names are recommended for clarity.
These address and dimension operators can be associated with several different kinds of allocation units, including input items, output sections, GROUPs, and UNIONs. The following sections provide some examples of how the operators can be used in each case.
Symbols defined by the linker can be accessed in C/C++ code using various techniques. See Using Linker Symbols in C/C++ Applications for more information about referring to linker symbols in C/C++ code.
10.5.10.7.1. Input Items
Consider an output section specification within a SECTIONS directive:
outsect:
{
s1.c.o(.text)
end_of_s1 = .;
start_of_s2 = .;
s2.c.o(.text)
end_of_s2 = .;
}
This can be rewritten using the START and END operators as follows:
outsect:
{
s1.c.o(.text) { END(end_of_s1) }
s2.c.o(.text) { START(start_of_s2), END(end_of_s2) }
}
The values of end_of_s1 and end_of_s2 will be the same as if you had used the dot operator in the original example, but start_of_s2 would be defined after any necessary padding that needs to be added between the two .text sections. Remember that the dot operator would cause start_of_s2 to be defined before any necessary padding is inserted between the two input sections.
The syntax for using these operators in association with input sections calls for braces { } to enclose the operator list. The operators in the list are applied to the input item that occurs immediately before the list.
10.5.10.7.2. Output Section
The START, END, and SIZE operators can also be associated with an output section. Here is an example:
outsect: START(start_of_outsect), SIZE(size_of_outsect)
{
<list of input items>
}
In this case, the SIZE operator defines size_of_outsect to incorporate any padding that is required in the output section to conform to any alignment requirements that are imposed.
The syntax for specifying the operators with an output section does not require braces to enclose the operator list. The operator list is simply included as part of the allocation specification for an output section.
10.5.10.7.3. GROUPs
Here is another use of the START and SIZE operators in the context of a GROUP specification:
GROUP
{
outsect1: { ... }
outsect2: { ... }
} load = ROM, run = RAM, START(group_start), SIZE(group_size);
This can be useful if the whole GROUP is to be loaded in one location and run in another. The copying code can use group_start and group_size as parameters for where to copy from and how much is to be copied. This makes the use of .label in the source code unnecessary.
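A copy routine along these lines might look like the following C sketch. It is a generic illustration rather than the copying code from the referenced example: the function name is hypothetical, and the way the three arguments are obtained from linker-defined symbols on a real target is deliberately left out. The GROUP example defines only the load start and the size, so the run-time destination is an additional assumption (for instance, a symbol defined with RUN_START).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a block of code or data from its load area to its run area.
 *   run_start  - destination; assumed to come from a separate symbol,
 *                for example one defined with RUN_START
 *   load_start - source; corresponds to group_start in the example
 *   size       - number of bytes; corresponds to group_size
 * The load and run areas are assumed not to overlap.
 */
void *copy_load_to_run(void *run_start, const void *load_start, size_t size)
{
    assert(run_start != NULL && load_start != NULL);
    return memcpy(run_start, load_start, size);
}
```

On a host, the routine can be exercised with two ordinary arrays standing in for ROM and RAM.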
10.5.10.7.4. UNIONs
The RUN_SIZE and LOAD_SIZE operators provide a mechanism to distinguish between the size of a UNION’s load space and the size of the space where its constituents are going to be copied before they are run. Here is an example:
UNION: run = RAM, LOAD_START(union_load_addr),
LOAD_SIZE(union_ld_sz), RUN_SIZE(union_run_sz)
{
.text1: load = ROM, SIZE(text1_size) { f1.c.o(.text) }
.text2: load = ROM, SIZE(text2_size) { f2.c.o(.text) }
}
Here union_ld_sz is going to be equal to the sum of the sizes of all output sections placed in the union. The union_run_sz value is equivalent to the largest output section in the union. Both of these symbols incorporate any padding due to blocking or alignment requirements.
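The relationship between the two sizes can be modeled in C as a sum versus a maximum. This host-side sketch uses hypothetical names and ignores the padding for blocking or alignment that the linker would add, so it shows the principle rather than the exact values the linker would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load-time and run-time sizes of a UNION in this simplified model. */
typedef struct {
    uint32_t load_size; /* members are stored one after another */
    uint32_t run_size;  /* members share the same run-time space */
} union_sizes;

/*
 * Compute the sizes of a UNION from the sizes of its output sections.
 * The load size is the sum of all member sizes; the run size is the
 * size of the largest member. Padding is not modeled.
 */
union_sizes compute_union_sizes(const uint32_t *member_sizes, size_t count)
{
    assert(member_sizes != NULL || count == 0);

    union_sizes u = {0, 0};
    for (size_t i = 0; i < count; i++) {
        u.load_size += member_sizes[i];
        if (member_sizes[i] > u.run_size)
            u.run_size = member_sizes[i];
    }
    return u;
}
```

If .text1 were 0x120 bytes and .text2 were 0x80 bytes, this model gives a load size of 0x1A0 and a run size of 0x120.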
10.5.10.8. LAST Operator
The LAST operator is similar to the START and END operators that were described previously. However, LAST applies to a memory range rather than to a section. You can use it in a MEMORY directive to define a symbol that can be used at run time to learn how much memory was allocated when linking the program. See MEMORY Directive Syntax for syntax details.
For example, a memory range might be defined as follows:
D_MEM : org = 0x20000020 len = 0x20000000 LAST(dmem_end)
Your C/C++ code can then access this symbol at run time as described in Using Linker Symbols in C/C++ Applications.
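Once the address of the last allocated byte is known, the extent of the range in use follows from simple arithmetic. The C sketch below assumes the address has already been obtained from the symbol and is passed in as a plain integer; the function name is hypothetical, and the sketch assumes that at least one byte was allocated in the range. The result is the span from the origin through the last allocated byte, so any unallocated gaps inside that span are counted as well.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes from the origin of a memory range through its last
 * allocated byte, inclusive.
 *   origin         - start address of the range (org in the MEMORY directive)
 *   last_allocated - address of the last allocated byte, as defined by LAST
 */
uint32_t bytes_in_use(uint32_t origin, uint32_t last_allocated)
{
    assert(last_allocated >= origin); /* assumes something was allocated */
    return last_allocated - origin + 1u;
}
```

For the D_MEM range above, a last allocated byte at the hypothetical address 0x2000101F would correspond to 0x1000 bytes in use.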