4.1.1. Assembly Source Anatomy

4.1.1.1. Fields of an Assembly Source Line

While there are many differences in the details between the legacy TI-syntax and GNU-syntax Arm assembly languages, they are generally very similar. Both follow the same general form:

label field: mnemonic field operand list field

For example, the following Arm instruction is legal in both legacy TI-syntax and GNU-syntax Arm assembly language:

add_me:   add    r0, r1

where “add_me” occupies the label field, “add” occupies the mnemonic field, and the operand list field consists of registers r0 and r1 with operands separated by a comma.

Many of the assembly directives supported in legacy TI-syntax must be converted into their functionally equivalent GNU-syntax counterparts. Most Arm instructions will likely assemble successfully with either the legacy TI-syntax assembler or the tiarmclang GNU-syntax assembler, but the two assemblers apply different rules to the label, mnemonic, and operand list fields. Be aware of these differences, described below, when migrating assembly source files.

4.1.1.2. Labels

An optional label field can be used to associate a value with a symbol. Label symbol names are case sensitive, and a label must begin in the leftmost column of the assembly source line. The rules governing the name of a symbol defined by a label are largely similar for legacy TI-syntax and GNU-syntax assembly source code, but there are some subtle differences:

Legacy TI-Syntax:

  • Label symbols must begin with a letter or an underscore

  • Label symbols can consist of alphanumeric characters, the dollar sign (“$”), and underscores (“_”)

  • Label symbol definitions may be delimited by an optional terminating colon (“:”)

GNU-Syntax:

  • Label symbols must start with a letter, an underscore, or a period (“.”)

  • Label symbols can consist of alphanumeric characters, the dollar sign (“$”), an underscore (“_”), or a period (“.”)

  • Label symbol definitions must be delimited with a terminating colon (“:”), otherwise the tiarmclang assembler tries to interpret the symbol as a mnemonic identifier
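As a minimal sketch of the GNU-syntax rules above, the following fragment defines labels that begin with an underscore or a period and that contain a dollar sign. The symbol names (_init_state, .retry.1, buf$end) are hypothetical and chosen only for illustration:

        .text
        .thumb
_init_state:                    // begins with an underscore
        movs    r0, #0
.retry.1:                       // begins with a period and contains a period
        adds    r0, r0, #1
buf$end:                        // contains a dollar sign
        bx      lr

Without its terminating colon, a line such as “_init_state movs r0, #0” would likely be rejected, because the assembler would try to interpret _init_state as a mnemonic.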

The value assigned to a symbol defined in a label field may vary depending on whether the label occurs within the context of an instruction or a directive. A more detailed discussion of the label field in each of these contexts is provided in the Converting TI-Syntax Arm Instructions to GNU-Syntax Arm Instructions and the Converting TI-Syntax Assembly Directives to GNU-Syntax Assembly Directives sections.

4.1.1.2.1. Local Labels

TI-Syntax Arm Assembler Local Labels

The TI-syntax Arm assembler provides support for local labels whose scope and effect are temporary. Local labels cannot be declared with global linkage. There are two forms of local labels supported in the TI-syntax Arm assembler:

  • $n - where n is an integer in the range [0,9]

  • name? - where name is a legal identifier. The TI-syntax Arm assembler replaces the ? with a period followed by a unique integer.

Normal labels must be unique, but the TI-syntax Arm assembler allows local labels to be undefined and defined again within the same compilation unit. Please see the TI Arm Assembly Language Tools User’s Guide for more details about how to use and manage local labels in the TI-syntax Arm assembler.

GNU-Syntax Arm Assembler Local Labels

The GNU-syntax Arm assembler that is integrated into the tiarmclang compiler also supports local labels, with a similar limitation on scope: GNU-syntax local labels cannot be declared with global linkage. The syntax for defining and referring to GNU-syntax local labels is as follows:

  • Local label definitions use the form N: in the label field of a line of GNU-syntax assembly code, where N is an integer in the range [0,9].

  • References to the most recently defined local label use the form Nb, where N is the ID of the local label (an integer in [0,9]) and b indicates a backward reference.

  • References to the next definition of a local label use the form Nf, where N is the ID of the local label (an integer in [0,9]) and f indicates a forward reference.

Like TI-syntax local labels, GNU-syntax local labels can be redefined in the same compilation unit. However, unlike TI-syntax local labels, no special directives are needed to undefine an existing local label. The GNU-syntax assembler associates a unique ordinal ID with every local label definition so that it can distinguish one instance of a local label definition from another that was defined with the same value N.
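The redefinition behavior described above can be sketched as follows. The function name count_twice is hypothetical, and the fragment assumes Thumb mode, as in the other examples in this section. Local label 1 is defined twice, and each reference resolves to the nearest definition in the indicated direction:

        .text
        .thumb
        .global count_twice
count_twice:
        movs    r0, #3
1:                              // first definition of local label 1
        subs    r0, r0, #1
        bne     1b              // backward reference: first definition
        movs    r0, #3
        b       1f              // forward reference: second definition
1:                              // second definition of local label 1
        subs    r0, r0, #1
        bne     1b              // backward reference: second definition
        bx      lr

No directive is placed between the two definitions; the assembler is expected to keep them apart on its own.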

Simple Local Label Example

Consider a snippet of TI-syntax Arm assembly source that implements a simple loop:

;* assume external global int "sum_tot"
     .global   sum_tot

;* assume incoming r0 has loop limit
     .global   foo
     .sect     ".text"
     .thumb
foo:
     ...
     MOVS   r1,#0
     CMP    r1, r0
     BLE    $1
$0:
     LDR    r2, C_CON1
     LDR    r3, [r2]
     ADDS   r3, r3, r1
     STR    r3, [r2]
     ADDS   r1, r1, #1
     CMP    r1, r0
     BGT    $0
$1:
     ...
     BX     LR

     .align 4
C_CON1:   .int   sum_tot

where $0 and $1 are TI-syntax local labels.

This loop example can be converted to GNU-syntax assembly as follows:

// assume external global int "sum_tot"
     .global   sum_tot

// assume incoming r0 has loop limit
     .global   foo
     .section  .text
     .thumb
foo:
     ...
     MOVS   r1,#0
     CMP    r1, r0
     BLE    1f
0:
     LDR    r2, C_CON1
     LDR    r3, [r2]
     ADDS   r3, r3, r1
     STR    r3, [r2]
     ADDS   r1, r1, #1
     CMP    r1, r0
     BGT    0b
1:
     ...
     BX     LR

     .align 4
C_CON1:   .int   sum_tot

where the following changes were made to the TI-syntax source to make a functionally equivalent GNU-syntax implementation of the loop:

  • The “;*” comment delimiters are replaced with “//”.

  • The TI-syntax .sect directive is converted to the GNU-syntax .section directive.

  • The TI-syntax local labels, $0 and $1, are replaced with GNU-syntax local labels 0 and 1.

  • The TI-syntax forward reference to $1 is converted to a GNU-syntax forward reference to local label 1 via the syntax 1f (‘f’ indicates a forward reference).

  • The TI-syntax backward reference to $0 is converted to a GNU-syntax backward reference to local label 0 via the syntax 0b (‘b’ indicates a backward reference).

Macro Example Use of Local Labels

The following macro is an example of the name? form of the TI-syntax local label:

;* TI-syntax implementation of trace_pc macro using local labels
trace_pc     .macro
L_?:
     .sect    ".trace_scn"
     .int     L_?
     .sect    ".text"
     .endm

     .sect    ".text"
foo:
     nop
     trace_pc
     nop
     trace_pc
     nop
     trace_pc
     nop

where L_? gets converted into L_0, L_1, and L_2 with each invocation of the trace_pc macro in the above example.

The above example can be converted to GNU-syntax as follows:

// GNU-syntax implementation of trace_pc macro using local labels
     .macro    trace_pc
\@:
     .section  .trace_scn,"aw",%progbits
     .int      \@b
     .previous
     .endm

     .section  .text
foo:
     nop
     trace_pc
     nop
     trace_pc
     nop
     trace_pc
     nop

where the following changes were made to the TI-syntax source to make a functionally equivalent GNU-syntax implementation of the macro example:

  • The “;*” comment delimiter is converted to “//”.

  • The name of the macro is moved from the label field for the TI-syntax implementation to the first operand field for the GNU-syntax implementation.

  • The definition of the TI-syntax local label L_? is replaced by the special \@ operator for GNU-syntax assembly macros. Used in the label field, \@ auto-generates a local label each time the macro is invoked.

  • Likewise, the reference to the TI-syntax local label L_? is replaced by \@b, which the GNU-syntax assembler interprets as a backward reference to the auto-generated local label.

  • Finally, the .previous directive returns the assembler back to the previous input section that was being assembled into.

If we were to use a normal label like xyz_\@ for the GNU-syntax implementation of the macro, the assembler would report a duplicate label definition for xyz_0. When the trace_pc macro is instead written with a local label definition, the local label 0 is auto-generated for each invocation. Because the GNU-syntax assembler assigns a unique ordinal ID to each instance of a local label, it avoids a duplicate label definition when local labels are used in the macro definition.

4.1.1.3. Mnemonics

The mnemonic field of a legal line of assembly code contains a pre-defined textual identifier that indicates whether the source line represents an instruction or a directive.

For example, the “push” mnemonic in the following line of assembly code is recognized as a valid Arm instruction:

        .text
        .thumb
        .global    simple_function

simple_function:
        push {r7,lr}
        ...

The “.text”, “.thumb”, and “.global” identifiers in the mnemonic field are recognized as Arm assembly directives.

More information about how to convert legacy TI-syntax Arm instructions and directives into GNU-syntax is provided in the Converting TI-Syntax Arm Instructions to GNU-Syntax Arm Instructions and the Converting TI-Syntax Assembly Directives to GNU-Syntax Assembly Directives sections, respectively.

With regard to the syntax rules governing the mnemonic field, the only major difference between legacy TI-syntax and GNU-syntax is that GNU-syntax mnemonic identifiers may begin in the leftmost column of the assembly source line, whereas legacy TI-syntax mnemonic identifiers must not.
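To illustrate the column rule, the following GNU-syntax fragment is a sketch in which one instruction begins in the leftmost column and another is indented. The label name col0_demo is hypothetical:

        .text
        .thumb
        .global col0_demo
col0_demo:
push    {r7, lr}                // mnemonic in the leftmost column
        pop     {r7, pc}        // indented mnemonic

Both forms are expected to assemble with the tiarmclang GNU-syntax assembler. Under legacy TI-syntax, the first form would not be legal, since a mnemonic must not begin in the leftmost column.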

4.1.1.4. Operand List

The syntax rules governing the operand list field depend on the identifier specified in the mnemonic field. For example, in the “push” instruction shown earlier in this section, the operand list field contains a list of one or more registers enclosed in braces, whereas the operand field of a .global directive expects a legal symbol identifier.
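As a brief sketch of how the operand list varies with the mnemonic, the GNU-syntax fragment below pairs several mnemonics with the operand forms they expect. The function name op_demo is hypothetical and the instructions are chosen only to show different operand shapes:

        .text
        .thumb
        .global op_demo             // directive: one symbol identifier
op_demo:
        push    {r4, lr}            // register list enclosed in braces
        movs    r4, #10             // register and immediate value
        ldr     r0, [sp, #4]        // register and bracketed memory address
        adds    r0, r0, r4          // three registers
        pop     {r4, pc}            // register list enclosed in braces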

You can find more information about the Arm instruction set in the Arm Developer’s Instruction Set Architecture page. Legacy TI-syntax assembler directives are described in the Arm Assembly Language Tools User’s Guide. More information about GNU-syntax Arm assembler directives can be found in an up-to-date description of the GNU as Arm assembler.

4.1.1.5. Comments

When migrating assembly language source from legacy TI-syntax to GNU-syntax, you’ll need to modify the way that comments are delimited in your code. The syntax rules governing the demarcation of comments are significantly different between legacy TI-syntax and GNU-syntax Arm assembly language.

4.1.1.5.1. Legacy TI-Syntax Comment Delimiters

In legacy TI-syntax assembly source, comments can be delimited in two ways:

  • Text appearing after an asterisk, ‘*’, in column 0 is interpreted as a comment.

  • Text appearing after a semi-colon, ‘;’, on any column is interpreted as a comment.

The following snippet of legacy TI-syntax Arm assembly code demonstrates the use of these two methods of delimiting comments:

* Loop entry
loop_entry:
        BL      ef1           ; call ext func 1, ef1
        BL      ef2           ; call ext func 2, ef2
        LDR     A1, [SP, #0]
        ADDS    A1, A1, #1    ; I++ (A1)
        STR     A1, [SP, #0]
        LDR     A1, $C$CON1
        LDR     A2, [SP, #0]  ; load I (A2)
        LDR     A1, [A1, #0]  ; load ext var, evar (A1)
        CMP     A1, A2        ; evar > I?
        BGT     loop_entry    ; I < evar, go to loop_entry

* Loop exit
loop_exit:
        MOVS    A1, #0
        POP     {A4, PC}

4.1.1.5.2. GNU-Syntax Comment Delimiters

In GNU-syntax assembly source, comments can be delimited using:

  • C-style comments: text enclosed between “/*” and “*/”, which may span multiple lines.

  • C++-style comments: text appearing after “//” on a line.

  • Text appearing after an at-sign, ‘@’, is interpreted as a comment unless that ‘@’ character appears in a macro definition preceded by a backslash, ‘\’. For more details about how to convert macro definitions from legacy TI-syntax to GNU-syntax, please see the Converting TI-syntax Assembly Macros into GNU-syntax Assembly Macros section.

Now consider the following snippet of GNU-syntax assembly code, which implements the same instructions as the above legacy TI-syntax example. Note the use of C- and C++-style comments compared to the above example:

/*
 * Loop entry - comment can span multiple lines
 */
loop_entry:
        bl      ef1               // call ext func 1, ef1
        bl      ef2               // call ext func 2, ef2
        ldr     r0, [sp]
        adds    r0, #1            // I++ (r0)
        str     r0, [sp]
        movw    r1, :lower16:evar
        movt    r1, :upper16:evar
        ldr     r1, [r1]          // load evar (r1)
        cmp     r0, r1            // I < evar?
        blt     loop_entry        // I < evar, go to loop_entry

/* Loop exit */
loop_exit:
        movs    r0, #0
        pop     {r7, PC}
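
The snippet above uses only the C- and C++-style delimiters. As a minimal sketch of the remaining form, the following fragment uses the at-sign delimiter outside of any macro definition. The label name at_demo is hypothetical:

        .text
        .thumb
at_demo:
        movs    r0, #0          @ text after the at-sign is a comment
        bx      lr              @ the comment runs to the end of the line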