Test Readme for C7000 Code Generation Tools v2.0.1

0 Introduction to the C7000 Code Generation Tools v2.0.x STS
1 Documentation
2 TI E2E Community - Where to get help
3 Defect Tracking Database
4 New --mma_version compiler option
5 EABI change between v1.4 and v2.0
6 SE/SA/MMA Interface Changes
7 Streaming Address Generator supports predicated loads on C7120
8 Link-Time Optimization not supported between targets
9 Notes on Host Emulation Support
- 9.1 Host Emulation is experimental
- 9.2 Additional Host Emulation Pointer Operations Supported
10 A Note on Intrinsics and Header Files
- Supported Intrinsics
11 Removal of MISRA 2004 compiler command-line options
12 Silicon errata i2117 workaround support
13 Half-float conversion intrinsics
14 Resolved defects
15 Known defects

0 Introduction to the C7000 Code Generation Tools v2.0.x STS

This C7000 compiler release is an STS (Short-Term Support) release.

This release supports the C7100 and C7120 ISA cores. To compile code for the C7100 core, use the compiler command-line option -mv7100 or equivalently, --silicon_version=7100. To compile code for the C7120 core, use the compiler command-line option -mv7120 or equivalently, --silicon_version=7120.

For definitions and explanations of STS, LTS, and the version numbering scheme, please see SDTO Compiler Version Numbers.

1 Documentation

The following documents provide information on how to use, program, and migrate to the C7000 CPU. (As of v2.0.0, these documents are no longer included with the compiler tools installer. They can always be found on the TI website.)

SPRUIG8***.PDF: C7000 C/C++ Optimizing Compiler User’s Guide

SPRUIV4***.PDF: C7000 Optimization Guide

SPRUIG4***.PDF: C7000 Embedded Application Binary Interface (EABI) Reference Guide

SPRUIG5***.PDF: C6000-to-C7000 Migration User’s Guide

SPRUIG3***.PDF: VCOP Kernel-C to C7000 Migration Tool User’s Guide

SPRUIG6***.PDF: C7000 Host Emulation User’s Guide (NOTE: Host Emulation is an experimental feature)

2 TI E2E Community - Where to get help

Post compiler related questions to the TI E2E design community forum and select the TI device being used.

The E2E Design Support Forum Website

If submitting a defect report, please attach a scaled-down test case with command-line options and the compiler version number to allow us to reproduce the issue easily.

The following is the top-level webpage for all of TI’s Code Generation Tools.

Code Generation Tools Landing Page

3 Defect Tracking Database

Compiler defect reports can be tracked at the new Development Tools bug database, SIR. SIR is a JIRA-based view into all public tools defects. The old SDOWP tracking database will be retired.

SIR Development Tools Defect Tracking Website

A my.ti.com account is required to access this page. To find an issue in SIR, enter your defect ID in the top right search box once logged in. Alternatively, from the top red navigation bar, select “Issues” then “Search for Issues”.

To find an old SDOWP issue, place the SDOWP ID in the search box and use double quotes around the SDOWP ID.

4 New --mma_version compiler option

The C7000 v2.0 C/C++ Compiler has a new command-line option, --mma_version. This option tells the compiler which version of the Matrix Multiply Accelerator (MMA) the compiler should compile for. It also causes the compiler to set certain predefined macros which turn on the appropriate MMA API configuration structures and enumeration values in include/c7x_mma.h.

 --mma_version=1       Enables use of MMA version 1 (C7100)
 --mma_version=2       Enables use of MMA version 2 (C7120)
 --mma_version=NONE    Disables use of the MMA

The compiler will place an appropriate MMA version build attribute in the object files that are generated. If the MMA is not used, an MMA version build attribute will be placed in the object file that indicates that the MMA is not used. MMA version build attributes ensure that linking of object files with incompatible versions of the MMA is disallowed. For more details, please see the C7000 Embedded Application Binary Interface (EABI) Reference Guide.

5 EABI change between v1.4 and v2.0

The C7000 Compiler v2.0 STS release changes the way boolean vectors (e.g. bool16) are represented. Object code compiled from C/C++ source that uses boolean vectors with the v1.4 and earlier compilers is therefore not compatible with object code compiled from C/C++ source that uses boolean vectors with the v2.0 and later compilers. Mixing the two could cause unexpected results and program crashes. If boolean vectors are not used, there is no incompatibility.

Note that there is no representation change for the __vpred type in the v2.0 compiler and therefore the above statements do not apply for the __vpred type.

6 SE/SA/MMA Interface Changes

In the C7000 v2.0 Compiler, some reserved fields in the __SA_TEMPLATE_v1 configuration structure and some reserved fields in the MMA __HWA_CONFIG_REG_v1 configuration structure have been renamed or split and renamed. This has been done in order to use those reserved fields for added functionality that has been implemented in the C7120 ISA or in the MMA v2 hardware. This means that any use of those reserved fields in code that was compiled with the 1.4.x compiler must either be replaced by the new struct member names or replaced with a function call that sets default values as described below. The latter approach is the one we recommend.

In the future, as we’ve done with the v2.0 compiler tools, existing reserved fields in the SE/SA/MMA configuration structures may be used for additional features on future devices. Therefore, in future releases of the C7000 compiler, we may again (1) change the name of a reserved field to support new features, (2) split the reserved field into two or more fields, or both (1) and (2).

A consequence of this is that directly using named reserved fields may not work with a future version of the C7000 Compiler. Therefore, it is recommended to set reserved fields with the __gen_SA_TEMPLATE_v1(), __gen_MMA_TEMPLATE_v1(), and similar functions, which set up defaults for the given SE/SA/MMA configuration structures. See include/c7x_strm.h and include/c7x_mma.h for details on the functions that set up safe default values for these configuration structs.

Examples of direct use of a named reserved field, which may not work with a future compiler:

     sa_params.reserved2 = 0;       // named struct field
     { . . ., .reserved2 = 0, . . } // named struct field in named struct instantiation

Also note that “ordered struct instantiation” (where struct member fields are not named) may also break if a reserved field has its type changed (e.g. int64_t bitfield to an enum type).

Recommended approach:

     // Sets defaults including zeroing-out reserved fields:
     __SA_TEMPLATE_v1 sa0_config = __gen_SA_TEMPLATE_v1();
     // Now setup necessary fields
     sa0_config.ICNT0 = 32;
     // Continue setup not using reserved fields
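The risk described above can be illustrated on any host compiler with a small sketch. Everything in it is hypothetical: demo_cfg_old, demo_cfg_new, their members, and the demo_* functions are invented for illustration and are not the real __SA_TEMPLATE_v1 layout or any TI API. It assumes a reserved field is split into a new feature field plus a smaller reserved field, and shows that the same ordered initializer silently lands values in different members, while the "generate defaults, then assign by name" pattern is unaffected.

```c
#include <stdint.h>

/* Hypothetical "old" layout with a single reserved field. */
typedef struct {
    uint32_t icnt0;
    uint32_t reserved;
    uint32_t dim1;
} demo_cfg_old;

/* Hypothetical "new" layout: the reserved field has been split. */
typedef struct {
    uint32_t icnt0;
    uint16_t new_feature;
    uint16_t reserved;
    uint32_t dim1;
} demo_cfg_new;

/* The same ordered (unnamed) initializer text used with both layouts. */
#define DEMO_ORDERED_INIT { 32, 0, 64 }

demo_cfg_old demo_positional_old(void)
{
    demo_cfg_old c = DEMO_ORDERED_INIT; /* icnt0=32, reserved=0, dim1=64 */
    return c;
}

demo_cfg_new demo_positional_new(void)
{
    /* Compiles, but 64 now lands in 'reserved' and dim1 is left at 0. */
    demo_cfg_new c = DEMO_ORDERED_INIT;
    return c;
}

/* Stand-in for a __gen_*()-style function: returns all-default values. */
demo_cfg_new demo_gen_cfg_new(void)
{
    demo_cfg_new c = { 0 };
    return c;
}

demo_cfg_new demo_generated_new(void)
{
    demo_cfg_new c = demo_gen_cfg_new(); /* reserved fields zeroed */
    c.icnt0 = 32;                        /* only non-reserved fields are named */
    c.dim1  = 64;
    return c;
}
```

With the new layout, demo_positional_new() puts 64 into the reserved member without any diagnostic being required, whereas demo_generated_new() keeps working because it never mentions a reserved field.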

In addition, as of version 1.4.0, the streaming address generator (SA) API has been modified to give greater compatibility with typedefs and const pointers. For example, __SA0ADV(int2_typedef, ptr) was disallowed, but is now legal. Similarly, __SA0ADV(const_int2, ptr) is also now legal and will return a const pointer. For more information, consult the descriptions in “c7x_strm.h”.

7 Streaming Address Generator supports predicated loads on C7120

On the C7120 ISA variant, implicit predication occurs on loads that use streaming address generator (SA) operands. If an SA may be used as an operand to a load and that SA may generate predicates with one or more predicate bits off, then a predicated load must be used to avoid unexpected behavior. Use the following idioms with implicitly predicated SA loads:

Well-defined behavior with normal predicated loads:

__vpred vp = __SA0_VPRED(int16);
int16 *ptr = __SA0ADV(int16, baseptr);
int16 x = __vload_pred(vp, ptr); // Normal load with explicit predication

In addition, specialized loads predicated with an SA predicate can be generated with the following idiom, which has well-defined behavior:

__vpred vp = __SA0_VPRED(uchar32);
uchar32 *ptr = __SA0ADV(uchar32, baseptr);
ushort32 x = __vload_pred_unpack_short(vp, ptr); // Specialized load with explicit predication

(Note that vector load intrinsics that have boolean vector arguments are also available.)

The compiler may optimize the above sequences to take advantage of the C7120 ISA’s implicit predication feature.

If implicit predication is not available (C7100), the idiom is malformed, or the compiler fails to optimize the idiom, an equivalent series of instructions will be generated instead to perform the load and then predicate the result.
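As a rough mental model of what a predicated load does, the following host-side sketch may help. It is not the TI implementation: model_vload_pred and model_tail_pred are hypothetical names, the predicate is simplified to one bit per lane (a real __vpred is not laid out this way), and the sketch assumes that lanes whose predicate bit is off are filled with zero. The tail case, where fewer elements remain than the vector holds, is assumed here only as an illustrative reason for predicate bits being off.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_LANES 16 /* models int16: a vector of 16 32-bit ints */

/* Hypothetical model: lanes with the predicate bit set are read from
 * memory; lanes with the bit clear are never read and (by assumption)
 * produce zero. */
void model_vload_pred(uint32_t pred_bits, const int32_t *src,
                      int32_t dst[MODEL_LANES])
{
    for (size_t lane = 0; lane < MODEL_LANES; ++lane) {
        if ((pred_bits >> lane) & 1u)
            dst[lane] = src[lane];
        else
            dst[lane] = 0; /* assumption: disabled lanes read as zero */
    }
}

/* Hypothetical helper: predicate with only the first 'remaining' lanes on. */
uint32_t model_tail_pred(size_t remaining)
{
    if (remaining >= MODEL_LANES)
        return (1u << MODEL_LANES) - 1u;
    return (1u << remaining) - 1u;
}
```

In this model, loading from a 5-element buffer with model_tail_pred(5) touches only the 5 valid elements; an unpredicated 16-lane load of the same buffer would read past its end.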

After configuring an SA for predication, beware that some C/C++ idioms have unspecified behavior:

ushort32 x = __vload_unpack_short(__SA0ADV(uchar32, baseptr)); // May be predicated, or not!

int16 *ptr = __SA0ADV(int16, baseptr);
int16 x = *ptr; // May be predicated, or not!

Please see the section titled “Using the Streaming Address Generator” in the C7000 C/C++ Optimizing Compiler User’s Guide for more information.

8 Link-Time Optimization not supported between targets

A clarification on Link-Time Optimization use:

When using Link-Time Optimization, use only source and object files compiled with the same --silicon_version and --mma_version options. Link-Time Optimization is not supported between source and/or object files compiled with different --silicon_version or --mma_version options. In this case, the compilation may fail.

For more information on Link-Time Optimization, see the C7000 C/C++ Optimizing Compiler User’s Guide.

9 Notes on Host Emulation Support

9.1 Host Emulation is experimental

Host Emulation is an experimental feature and may not work as intended or expected in certain situations. In addition, there may be limitations that are not disclosed in the Host Emulation User’s Guide, SPRUIG6.

9.2 Additional Host Emulation Pointer Operations Supported

The C7000 Host Emulation User’s Guide is being updated to reflect additional supported operations on pointer types when used with Host Emulation.

In addition to those arithmetic operations listed in “Vector and Complex Element Pointer Types”, the minus (“-”) operation should also be listed for pointer types that were created based on a conversion from a scalar pointer to memory.

An additional list will be added to the “Vector and Complex Element Pointer Types” section. This list will contain the pointer comparison operations that are supported.

Equal (“==”) pointer
Not-equal (“!=”) pointer
Less-than (“<”) pointer
Greater-than (“>”) pointer
Less-than-or-equal (“<=”) pointer
Greater-than-or-equal (“>=”) pointer

10 A Note on Intrinsics and Header Files

Supported Intrinsics

The included top-level header files “c7x.h” and “c6x_migration.h” list the supported intrinsics for C7x and C6x, respectively. Note that you must include these header files in your source in order to use many of the C7x intrinsics and all of the legacy C6x intrinsics. “c7x.h” includes other useful header files that document and describe supported intrinsics:

c7x_vpred.h: List of intrinsics supporting low-level __vpred vector predicate type.
c7x_direct.h: List of intrinsics that map directly to instructions.
c7x_strm.h: List of intrinsics and flags for C7x Streaming Engine and Stream Address Generator.
c7x_mma.h: List of intrinsics and associated structures and enumerations for the C7x MMA.
c7x_luthist.h: List of intrinsics and flags for C7x Lookup Table / Histogram support.

11 Removal of MISRA 2004 compiler command-line options

The C7000 C/C++ Compiler does not support MISRA 2004 checking as some other Texas Instruments compilers do. Therefore, the command-line options for MISRA 2004 checking have been removed and are no longer accepted by the compiler.

12 Silicon errata i2117 workaround support

The compiler option --silicon_errata_i2117 generates code that automatically works around silicon errata i2117 on devices with the C7100 CPU core. MMA performance may be negatively impacted by the use of this option in edge cases.

13 Half-float conversion intrinsics

Starting with the v1.4.1 compiler, half-float conversion intrinsics are supported, which map to appropriate C7000 half-float conversion instructions:

// VSPHP
uint   = __float_to_half_float(float);
uint2  = __float_to_half_float(float2);
uint3  = __float_to_half_float(float3);
uint4  = __float_to_half_float(float4);
uint8  = __float_to_half_float(float8);
uint16 = __float_to_half_float(float16);

// VINTHP
uint   = __int_to_half_float(int);
uint2  = __int_to_half_float(int2);
uint3  = __int_to_half_float(int3);
uint4  = __int_to_half_float(int4);
uint8  = __int_to_half_float(int8);
uint16 = __int_to_half_float(int16);

// VHPSP
float   = __half_float_to_float(uint);
float2  = __half_float_to_float(uint2);
float3  = __half_float_to_float(uint3);
float4  = __half_float_to_float(uint4);
float8  = __half_float_to_float(uint8);
float16 = __half_float_to_float(uint16);

// VHPINT
int   = __half_float_to_int(uint);
int2  = __half_float_to_int(uint2);
int3  = __half_float_to_int(uint3);
int4  = __half_float_to_int(uint4);
int8  = __half_float_to_int(uint8);
int16 = __half_float_to_int(uint16);

Each half-float in a vector register occupies the least significant 16 bits of a 32-bit vector element, with zero padding to 32 bits. Because the C and C++ language specifications do not define a half-float type and because only conversion instructions are supported by the C7000 architecture, **a vector of half-floats is represented with a __uint vector, where each __uint serves as a container for a half-float**. The following intrinsics may be used to load and store this representation efficiently:

__vload_unpack_int()  [VLDHUNPKWU]
__vstore_packl()      [VSTHSVPACKL]
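The container representation can be sketched in portable C. This is a host-side model only, not the behavior of the intrinsics themselves: the model_* function names are hypothetical, and the decode step assumes the half-float bit pattern follows the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits), which this document does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of an unpacking load: each packed 16-bit half-float is widened
 * into a 32-bit container whose upper 16 bits are zero. */
void model_unpack_halves(const uint16_t *packed, uint32_t *containers, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        containers[i] = (uint32_t)packed[i];
}

/* Model of a packing store: keep only the low 16 bits of each container. */
void model_pack_halves(const uint32_t *containers, uint16_t *packed, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        packed[i] = (uint16_t)(containers[i] & 0xFFFFu);
}

/* Decode one container to float, assuming IEEE 754 binary16 in the low
 * 16 bits. Every binary16 value is exactly representable as a float. */
float model_half_container_to_float(uint32_t container)
{
    uint32_t h    = container & 0xFFFFu;
    uint32_t sign = (h >> 15) & 1u;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    float    result;

    if (exp == 0 && man == 0) {
        bits = sign << 31;                              /* signed zero */
    } else if (exp == 0) {
        int shift = -1;                                 /* subnormal: normalize */
        do { man <<= 1; ++shift; } while ((man & 0x400u) == 0);
        man &= 0x3FFu;
        bits = (sign << 31) | ((uint32_t)(112 - shift) << 23) | (man << 13);
    } else if (exp == 31) {
        bits = (sign << 31) | 0x7F800000u | (man << 13); /* infinity or NaN */
    } else {
        bits = (sign << 31) | ((exp + 112u) << 23) | (man << 13);
    }
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

For instance, the packed half-float pattern 0x3C00 becomes the container 0x00003C00, which decodes to 1.0f under the binary16 assumption.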

14 Resolved defects

Resolved defects in v2.0.1:

ID	Summary
CODEGEN-9332	Certain SE* instructions can't go in parallel with each other (SA* also)

15 Known defects

The up-to-date known defects in v2.0.1 can be found here (dynamically generated):

Known defects in v2.0.1

End Of File