C2000 C/C++ CODE GENERATION TOOLS
Release Notes
6.4.10
March 2016

*******************************************************************************
Table of Contents
*******************************************************************************

1. Eliminated Outdated Hardware Support Options
   1.1 Eliminated Silicon Version 27 Support
   1.2 Eliminated Small Memory Model Support
2. Performance Enhancements
   2.1 Loop-Based Performance Improvements
3. Linker Predefined Macros
4. Expanded CLA C-Language Support
   4.1 Deeper function calls
   4.2 More than 2 function parameters
5. CLA Interrupt Attribute
6. TMU Performance Advice

==============================================================================
1. Eliminated Outdated Hardware Support Options
==============================================================================

The 6.4.0 compiler has eliminated support for outdated hardware options.
Thus, commonly used options are now the default and no longer have to be
specified.

==============================================================================
1.1 Eliminated Silicon Version 27 Support
==============================================================================

Support has been eliminated for silicon version 27. Therefore, it is no
longer necessary to specify --silicon_version=28 (-v28) as this is supported
by default.

==============================================================================
1.2 Eliminated Small Memory Model Support
==============================================================================

Support has been eliminated for small memory model. Previously, large memory
model had to be specified if compiling without float support. Now, large
memory model is supported by default and does not need to be specified.

==============================================================================
2. Performance Enhancements
==============================================================================

Performance enhancements were a major emphasis of the 6.4.0 release. The
release includes scheduling improvements to minimize register pipeline stalls
and improvements to loop-based codes through increased repeat block
generation and loop unrolling.

==============================================================================
2.1 Loop-Based Performance Improvements
==============================================================================

At --opt_for_speed=3 and above, the optimizer will implement loop unrolling
when it is possible and expected to be profitable. (This presumes an
optimization level of at least -O2.)

Loop unrolling is particularly beneficial for small loop bodies. On devices
with an FPU, loop unrolling can enable a small loop to reach the repeat block
size threshold when it doesn't otherwise. On devices without an FPU, loop
unrolling reduces branch overhead, which is a significant portion of total
loop cycles for small loops.

Because loop unrolling increases code size, it is only enabled at
--opt_for_speed settings of 3 or greater. However, at --opt_for_speed=1 and
above, if a small loop misses the repeat block threshold by fewer
instructions than would equal the branch cycles, NOPs will be inserted in
order to generate a RPTB. This will give a somewhat smaller performance boost
at a smaller code size increase.

==============================================================================
3. Linker Predefined Macros
==============================================================================

The following Linker predefined macros are available for C2000 and can be
used in linker command files. The corresponding device support compiler
options must be used in the link step.

   __TI_COMPILER_VERSION__
   __TMS320C2000__
   __TMS320C28XX__
   __TMS320C28XX_FPU32__
   __TMS320C28XX_TMU__
   __TMS320C28XX_VCU0__
   __TMS320C28XX_VCU2__
   __TMS320C28XX_CLA0__
   __TMS320C28XX_CLA1__
   __TMS320C28XX_CLA2__

==============================================================================
4. Expanded CLA C-Language Support
==============================================================================

All CLA versions now support deeper function calls and more than 2 function
parameters. The new support is backwards compatible as long as both
scratchpad naming conventions are supported in the linker command file.

See the C2000 compiler guide for more information and assembly-level support.

==============================================================================
4.1 Deeper function calls
==============================================================================

In previous compiler versions, top-level CLA interrupt functions could make
function calls to leaf functions, which could not make any function calls.
Now, normal function calling conventions are supported, with the exception of
recursive functions and function pointers.

For the new function call support, the user must place a ".scratchpad"
section in the linker command file. All functions will have frames allocated
to this section.

==============================================================================
4.2 More than 2 function parameters
==============================================================================

Previous compiler versions restricted CLA code to 2 function arguments. This
restriction has been eliminated. More than 2 function arguments are only
available in the context of the new function call support.

   A. Pointers are passed in MAR0 and MAR1.
   B. 32-bit values are passed in MR0, MR1, and MR2.
   C. 16-bit values are passed in MR0, MR1, and MR2.
   D. Any further arguments are passed on the function frame (function-local
      scratchpad space), starting at offset 0.

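As an illustration of rules A through D, the following is a minimal sketch of
a CLA-style function taking six arguments. The function name cla_scale_sum is
hypothetical, and the register and frame assignments in the comments are one
reading of the rules above, not verified compiler output. How mixed 16-bit
and 32-bit arguments share MR0-MR2 is not covered here.

```c
/* Sketch only: plain C that also compiles on a host to check the arithmetic.
 * cla_scale_sum is a hypothetical name. The placement comments interpret
 * rules A-D from the release notes and are assumptions, not compiler output. */
float cla_scale_sum(const float *x,  /* pointer       -> MAR0 (rule A) */
                    const float *y,  /* pointer       -> MAR1 (rule A) */
                    float k0,        /* 32-bit value  -> MR0  (rule B) */
                    float k1,        /* 32-bit value  -> MR1  (rule B) */
                    float k2,        /* 32-bit value  -> MR2  (rule B) */
                    float offset)    /* further arg   -> function frame,
                                        offset 0 (rule D) */
{
    /* Scale the two pointed-to samples, then add the two constant terms. */
    return k0 * x[0] + k1 * y[0] + k2 + offset;
}
```

When built for the CLA, a function like this relies on the new function call
support, so the ".scratchpad" section described in 4.1 would need to be
present in the linker command file.
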
==============================================================================
5. CLA Interrupt Attribute
==============================================================================

The C28x compiler supports both the interrupt attribute and pragma for CLA
interrupts. The interrupt keyword is also still available for interrupts, but
using a standard syntax in new development might be preferable. We encourage
all targets to standardize interrupt syntax by using the interrupt attribute.

   __attribute__((interrupt))
   void interrupt_name(void) {...}

   #pragma INTERRUPT(interrupt_name);
   void interrupt_name(void) {...}

==============================================================================
6. TMU Performance Advice
==============================================================================

Version 6.4.0 adds the C2000 compiler option "--advice:performance", which
takes the optional argument "all" or "none" and defaults to "all" if no
argument is given. If the advice option is not specified, the compiler
behavior defaults to:

   --advice:performance=all

Currently, this option generates TMU performance advice when TMU hardware
support is enabled.
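
The sketch below shows one way source code might check whether a build has
TMU support enabled, next to a floating-point division, which is one of the
operations TMU hardware is understood to accelerate. Both function names are
hypothetical. The check assumes that __TMS320C28XX_TMU__, listed in section 3
as a linker predefined macro, is also visible to the C preprocessor when TMU
support is enabled. The text of the advice the compiler emits is not shown
because these notes do not specify it.

```c
/* Returns 1 when the build defines __TMS320C28XX_TMU__, otherwise 0.
 * Assumption: the macro from section 3 is also defined for the C
 * preprocessor when TMU hardware support is enabled. */
int tmu_support_enabled(void)
{
#ifdef __TMS320C28XX_TMU__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: a plain floating-point division, the kind of
 * operation that performance advice about TMU usage might concern. */
float duty_ratio(float on_time, float period)
{
    return on_time / period;
}
```

On a host compiler the macro is not defined, so tmu_support_enabled() returns
0 there; the function is only meant to make the build configuration visible.
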