7.6. Symbols
An object file contains a symbol table that stores information about symbols in the object file. The linker uses this table when it performs relocation. See Symbolic Relocations.
An object file symbol is a named 32-bit integer value, usually representing an address. A symbol can represent such things as the starting address of a function, variable, section, or an absolute integer (such as the size of the stack).
Symbols are defined in assembly by adding a label or a directive such as .set, .equ, .bss, or .usect.
Symbols have a binding, which is similar to the C standard concept of linkage. ELF files may contain symbols bound as local symbols, global symbols, and weak symbols.
Global symbols are visible to the entire program. The linker does not allow more than one global definition of a particular symbol; it issues a multiple-definition error if a global symbol is defined more than once. (The assembler can provide a similar multiple-definition error for local symbols.) A reference to a global symbol from any object file refers to the one and only allowed global definition of that symbol. Assembly code must explicitly make a symbol global by adding a .def, .ref, or .global directive. (See Global (External) Symbols.)
Local symbols are visible only within one object file; each object file that uses a symbol needs its own local definition. References to local symbols in an object file are entirely unrelated to local symbols of the same name in another object file. By default, a symbol is local. (See Local Symbols.)
Weak symbols are symbols that may be used but not defined in the current module. They may or may not be defined in another module. A weak symbol is intended to be overridden by a strong (non-weak) global symbol definition of the same name in another object file. If a strong definition is available, the weak symbol is replaced by the strong symbol. If no definition is available (that is, if the weak symbol is unresolved), no error is generated, but the weak variable’s address is considered to be null (0). For this reason, application code that accesses a weak variable must check that its address is not zero before attempting to access the variable. (See Weak Symbols.)
Absolute symbols are symbols that have a numeric value. They may be constants. To the linker, such symbols are unsigned values, but the integer may be treated as signed or unsigned depending on how it is used. The range of legal values for an absolute integer is 0 to 2^32-1 for unsigned treatment and -2^31 to 2^31-1 for signed treatment.
In general, common symbols are preferred over weak symbols.
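The signed and unsigned ranges given for absolute symbols describe the same 32-bit pattern read two ways. The following C sketch illustrates that point only; the function names `as_signed` and `absolute_value_examples` are hypothetical and do not belong to any toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32-bit pattern of an absolute symbol value as signed.
   memcpy copies the bits unchanged, so no out-of-range conversion occurs. */
int32_t as_signed(uint32_t raw)
{
    int32_t value;
    memcpy(&value, &raw, sizeof value);
    return value;
}

/* Usage: the ends of the unsigned range map onto the signed range. */
void absolute_value_examples(void)
{
    assert(as_signed(0xFFFFFFFFu) == -1);         /* 2^32-1 when unsigned */
    assert(as_signed(0x80000000u) == INT32_MIN);  /* -2^31 when signed */
    assert(as_signed(0x7FFFFFFFu) == INT32_MAX);  /* 2^31-1 either way */
}
```

Whether a given use treats the value as signed or unsigned depends on the instruction or expression that consumes it, not on the symbol itself.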
7.6.1. Global (External) Symbols
Global symbols are symbols that are either accessed in the current module but defined in another (an external symbol) or defined in the current module and accessed in another. Such symbols are visible across object modules.
You must use the .def, .ref, or .global directive to identify a symbol as external:
.def: The symbol is defined in the current file and may be used in another file.
.ref: The symbol is referenced in the current file, but defined in another file.
.global: The symbol can be either of the above. The assembler chooses either .def or .ref as appropriate for each symbol.
The following code fragments illustrate the use of the .global directive.
x: ADD R0, #56h ; Define x
.global x ; acts as .def x
Because x is defined in this module, the assembler treats “.global x” as “.def x”. Now other modules can refer to x.
B y ; Reference y
.global y ; .ref of y
Because y is not defined in this module, the assembler treats “.global y” as “.ref y”. The symbol y must be defined in another module.
Both the symbols x and y are external symbols and are placed in the object file’s symbol table; x as a defined symbol, and y as an undefined symbol. When the object file is linked with other object files, the entry for x is used to resolve references to x in other files. The entry for y causes the linker to look through the symbol tables of other files for y’s definition.
The linker attempts to match all references with corresponding definitions. If the linker cannot find a symbol’s definition, it prints an error message about the unresolved reference. This type of error prevents the linker from creating an executable object module.
An error also occurs if the same symbol is defined more than once.
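The matching of references to definitions described above can be modelled with a small C sketch. This is a toy model, not the linker's implementation: the types `GlobalSym` and `LinkStatus` and the functions `resolve_global` and `resolve_demo` are hypothetical names, and the symbol z is an extra illustrative symbol added to show the multiple-definition case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One symbol-table entry with global binding, as seen by this toy model. */
typedef struct {
    const char *name;
    int defined;  /* 1 = definition (.def), 0 = undefined reference (.ref) */
} GlobalSym;

typedef enum { RESOLVED, UNRESOLVED, MULTIPLY_DEFINED } LinkStatus;

/* Count the definitions of `name` across the entries gathered from all
   object files and classify the outcome. */
LinkStatus resolve_global(const GlobalSym *syms, size_t count, const char *name)
{
    size_t definitions = 0;
    for (size_t i = 0; i < count; i++) {
        if (syms[i].defined && strcmp(syms[i].name, name) == 0) {
            definitions++;
        }
    }
    if (definitions == 0) {
        return UNRESOLVED;        /* reference with no definition: link error */
    }
    if (definitions > 1) {
        return MULTIPLY_DEFINED;  /* same symbol defined more than once: error */
    }
    return RESOLVED;
}

/* Usage: entries merged from three hypothetical object files. */
void resolve_demo(void)
{
    static const GlobalSym merged[] = {
        { "x", 1 },  /* file 1 defines x (.global x acts as .def x) */
        { "y", 0 },  /* file 1 references y (.global y acts as .ref y) */
        { "x", 0 },  /* file 2 references x */
        { "z", 1 },  /* file 2 defines z */
        { "z", 1 },  /* file 3 defines z again */
    };
    size_t count = sizeof merged / sizeof merged[0];

    assert(resolve_global(merged, count, "x") == RESOLVED);
    assert(resolve_global(merged, count, "y") == UNRESOLVED);
    assert(resolve_global(merged, count, "z") == MULTIPLY_DEFINED);
}
```

The model ignores weak and local binding; it covers only the two error cases named in this section.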
7.6.2. Local Symbols
Local symbols are visible within a single object file. Each object file may have its own local definition for a particular symbol. References to local symbols in an object file are entirely unrelated to local symbols of the same name in another object file.
By default, a symbol is local.
7.6.3. Weak Symbols
Weak symbols are symbols that may or may not be defined.
The linker processes symbols that are defined with a “weak” binding differently from symbols that are defined with global binding. Instead of including a weak symbol in the object file’s symbol table (as it would for a global symbol), the linker only includes a weak symbol in the output of a “final” link if the symbol is required to resolve an otherwise unresolved reference.
This allows the linker to minimize the number of symbols it includes in the output file’s symbol table by omitting those that are not needed to resolve references. Reducing the size of the output file’s symbol table reduces the time required to link, especially if there are a large number of pre-loaded symbols to link against.
You can define a weak symbol using either the .weak assembly directive or the weak operator in the linker command file.
Using Assembly: To define a weak symbol in an input object file, write the source file in assembly and use the .weak and .set directives in combination. The following example defines a weak symbol “ext_addr_sym”.
.weak ext_addr_sym
ext_addr_sym .set 0x12345678
Assemble the source file that defines weak symbols, and include the resulting object file in the link. The “ext_addr_sym” in this example is available as a weak symbol in a final link. It is a candidate for removal if the symbol is not referenced elsewhere in the application.
Using the Linker Command File: To define a weak symbol in a linker command file, use the “weak” operator in an assignment expression to designate the symbol as eligible for removal from the output file’s symbol table if it is not referenced. In a linker command file, an assignment expression outside a MEMORY or SECTIONS directive can be used to define a weak linker-defined symbol. For example, you can define “ext_addr_sym” as follows.
weak(ext_addr_sym) = 0x12345678;
If the linker command file is used to perform the final link, then “ext_addr_sym” is presented to the linker as a weak symbol; it is not included in the resulting output file if the symbol is not referenced. See Declaring Weak Symbols.
Using C/C++ code: See weak for information about the weak GCC-style variable attribute.
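Because an unresolved weak symbol has a null address, C code that reads a weak variable should test the address first. The following is a minimal sketch, assuming a GCC-style compiler that accepts `__attribute__((weak))` and an ELF target; the names `optional_config`, `read_optional_config`, and `weak_demo` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Weak reference: the link succeeds even if no module defines this variable.
   `optional_config` is a hypothetical name used only for illustration. */
extern int optional_config __attribute__((weak));

/* Return the weak variable's value, or `fallback` when the symbol is
   unresolved (its address is then null). */
int read_optional_config(int fallback)
{
    if (&optional_config != NULL) {
        return optional_config;
    }
    return fallback;
}

/* Usage: assumes no other module in the link defines optional_config,
   so the fallback value is returned. */
void weak_demo(void)
{
    assert(read_optional_config(42) == 42);
}
```

If another object file in the link provides a strong definition of `optional_config`, the same function returns that variable's value instead.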
If there are multiple definitions of the same symbol, the linker uses certain rules to determine which definition takes precedence. Some definitions may have weak binding and others may have strong binding. “Strong” in this context means that the symbol has not been given a weak binding by either of the two methods described above. Some definitions may come from an input object file (that is, using assembly directives) and others may come from an assignment statement in a linker command file.
The linker uses the following guidelines to determine which definition is used when resolving references to a symbol:
A strongly bound symbol always takes precedence over a weakly bound symbol.
If two symbols are both strongly bound or both weakly bound, a symbol defined in a linker command file takes precedence over a symbol defined in an input object file.
If two symbols are both strongly bound and both are defined in an input object file, the linker provides a symbol redefinition error and halts the link process.
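The three guidelines above can be written as a decision function. The C sketch below is a model of the stated rules only, not the linker's code; all type and function names are hypothetical. Combinations that the guidelines do not mention (for example, two weak definitions that both come from object files) are reported as unspecified rather than guessed.

```c
#include <assert.h>

typedef enum { BIND_WEAK, BIND_STRONG } Binding;
typedef enum { FROM_OBJECT_FILE, FROM_COMMAND_FILE } Origin;
typedef struct { Binding binding; Origin origin; } Definition;

typedef enum { PICK_FIRST, PICK_SECOND, REDEFINITION_ERROR, UNSPECIFIED } Choice;

/* Decide which of two definitions of the same symbol is used. */
Choice choose_definition(Definition a, Definition b)
{
    /* Guideline 1: a strong definition beats a weak one. */
    if (a.binding != b.binding) {
        return a.binding == BIND_STRONG ? PICK_FIRST : PICK_SECOND;
    }
    /* Guideline 2: with equal binding, the linker command file wins. */
    if (a.origin != b.origin) {
        return a.origin == FROM_COMMAND_FILE ? PICK_FIRST : PICK_SECOND;
    }
    /* Guideline 3: two strong definitions from object files are an error. */
    if (a.binding == BIND_STRONG && a.origin == FROM_OBJECT_FILE) {
        return REDEFINITION_ERROR;
    }
    /* Equal binding and equal origin otherwise: not covered by the text. */
    return UNSPECIFIED;
}

/* Usage: one check per guideline. */
void precedence_demo(void)
{
    Definition strong_obj = { BIND_STRONG, FROM_OBJECT_FILE };
    Definition weak_obj   = { BIND_WEAK,   FROM_OBJECT_FILE };
    Definition weak_cmd   = { BIND_WEAK,   FROM_COMMAND_FILE };

    assert(choose_definition(weak_cmd, strong_obj) == PICK_SECOND);
    assert(choose_definition(weak_obj, weak_cmd) == PICK_SECOND);
    assert(choose_definition(strong_obj, strong_obj) == REDEFINITION_ERROR);
}
```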
7.6.4. The Symbol Table
The assembler generates entries with global (external) binding in the symbol table for each of the following:
Each .ref, .def, or .global directive (see Global (External) Symbols)
The beginning of each section
The assembler generates entries with local binding for each locally-available function.
For informational purposes, there are also entries in the symbol table for each symbol in a program.