1. C7000 Software Development Flow

This chapter outlines a high-level software development process that many developers have found useful in developing high-performance code for the C7000 DSP core.

Figure 1.1 shows a possible software development strategy for the C7000 DSP core. The strategy has three phases. You may need to iterate within Phase 1 and between Phases 2 and 3. After adding each optimization, measure the resulting performance and code size, then repeat as needed.

Figure 1.1 Software Development - Profiling and Optimization

It is a good idea to set up a self-checking application early, so that you can verify its correctness automatically after each optimization.
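One way to make an application self-checking is to keep a simple reference version of each performance-critical routine and compare it against the version being optimized. The following is a minimal sketch in portable C, not a prescribed method. The routine names (dot_ref, dot_opt, self_check) and the test data are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference version: simple and easy to verify. Leave it unchanged. */
int32_t dot_ref(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}

/* Version under optimization (hypothetical): manually unrolled by 4. */
int32_t dot_opt(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += (int32_t)a[i]     * b[i];
        s1 += (int32_t)a[i + 1] * b[i + 1];
        s2 += (int32_t)a[i + 2] * b[i + 2];
        s3 += (int32_t)a[i + 3] * b[i + 3];
    }
    for (; i < n; i++)              /* leftover elements */
        s0 += (int32_t)a[i] * b[i];
    return s0 + s1 + s2 + s3;
}

/* Returns 0 if both versions agree for every tested length, 1 otherwise. */
int self_check(void)
{
    enum { MAX_N = 67 };            /* not a multiple of 4, on purpose */
    int16_t a[MAX_N], b[MAX_N];

    for (int i = 0; i < MAX_N; i++) {
        a[i] = (int16_t)(i * 7 - 100);
        b[i] = (int16_t)(50 - i * 3);
    }
    for (size_t n = 0; n <= MAX_N; n++) {
        if (dot_ref(a, b, n) != dot_opt(a, b, n))
            return 1;
    }
    return 0;
}
```

Calling self_check() after every optimization step, for example at the start of main() or from a test script, gives a quick pass/fail signal before you spend time on profiling.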

1.1. Phase 1: Create Functionally Correct Code

During the first phase of code development, concentrate on developing code that is functionally correct. A second aim of this phase is to use C7000 DSP programming model constructs and idioms that may help performance in performance-critical areas. These strategies and techniques are covered in Basic Code Optimization Strategies and Techniques. Write, test, and revise the code, iterating as necessary until it is functionally correct, before moving to the next phases of development.

Depending on the nature of the application and your testing infrastructure, it may or may not be necessary to test the code on the C7000 DSP. If testing the code on a personal computer is possible, the C7000 Compiler Tools provide a "Host Emulation" infrastructure, which allows you to use C7000 compiler intrinsics and native vector types while developing and debugging on a PC. You can therefore use different debugging tools and programming environments to prototype programs targeted for C7000 hardware before using the C7000 compiler. The Host Emulation package does not attempt to simulate the C7000 CPU. See the C7000 Host Emulation User's Guide (SPRUIG6) for more details.

Once the application is functionally correct, move to Phase 2.

1.2. Phase 2: Profile Code on the Target

During this phase, focus on invoking the C7000 compiler using the appropriate compiler optimization options. Run and profile the code on the target to determine:

  • Does the code meet performance expectations?

  • Which parts of the code consume the most time?

Perform this phase with the code running on the C7000 DSP core, typically with an emulator attached to the evaluation module (EVM) or device on which the system-on-chip and C7000 DSP core reside.

Running and profiling the code can give you a good idea of where cycles are being spent and thus which portions of the code may need further optimization work. For example, you may find that the application spends most of its time in one or two interrupt service routines (ISRs). In such scenarios, you would focus optimization efforts on those ISRs.

Profiling Code provides an overview of profiling tools.
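Before turning to dedicated profiling tools, coarse manual instrumentation can already show which region dominates. The sketch below is only an illustration under stated assumptions: read_ticks() is a hypothetical wrapper that uses the standard C clock() function so the example runs anywhere, and on the target you would substitute whatever cycle counter or profiling facility your device and tools provide. The workloads light_work() and heavy_work() are placeholders, not real application code.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical timer wrapper. Uses the portable clock() function here;
 * on the target, replace the body with a hardware cycle counter read. */
static unsigned long read_ticks(void)
{
    return (unsigned long)clock();
}

typedef struct {
    const char   *name;   /* label for reports */
    void        (*fn)(void);
    unsigned long ticks;  /* filled in by profile_regions() */
} region;

/* Runs each region once and records how many ticks it took. */
void profile_regions(region *r, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        assert(r[i].fn != NULL);
        unsigned long start = read_ticks();
        r[i].fn();
        r[i].ticks = read_ticks() - start;
    }
}

/* Returns the index of the region with the most ticks (first on ties).
 * count must be greater than zero. */
size_t hottest_region(const region *r, size_t count)
{
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (r[i].ticks > r[best].ticks)
            best = i;
    }
    return best;
}

/* Placeholder workloads; volatile keeps the loops from being removed. */
static volatile unsigned long sink;

void light_work(void)
{
    for (unsigned long i = 0; i < 1000UL; i++)
        sink += i;
}

void heavy_work(void)
{
    for (unsigned long i = 0; i < 1000000UL; i++)
        sink += i;
}
```

A typical use is to fill an array of region entries with the candidate functions, call profile_regions(), and then report the entry selected by hottest_region(). Timer resolution varies, so very short regions may need to be run many times for a meaningful reading.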

1.3. Phase 3: Improve Performance

In Phase 2, you use profiling information to identify sections of code that need improvement. If the performance and memory use meet requirements and testing reveals no issues, the development process can end. If performance and memory use do not meet requirements, concentrate your optimization efforts on sections of code that consume the most cycles.

Often, profiling in Phase 2 reveals memory system bottlenecks that must be addressed first. For example, some algorithms require a large amount of data in a short amount of time. If the data is placed in DDR, the algorithm could suffer from many cold cache misses and may run slowly. This kind of data should often be placed into fast, on-chip memory before it is consumed. Such memory needs to be set up appropriately. The DMA features of your device can help with moving data to appropriate locations. Detailed analysis of memory bottlenecks, setting up caches and memories, and using DMA is outside the scope of this document. Please consult your Texas Instruments Field Applications Engineer or the Texas Instruments E2E forums for guidance and more information.

Once the memories and DMA transfers are set up appropriately, further profiling can reveal hot spots that need further attention at the algorithm level. Typical steps may include:

  • Enabling the appropriate compiler options, which typically include:

    • Options to take advantage of optimization passes within the compiler, such as inlining and loop optimization. Note that loop optimization is especially important when optimizing performance on C7000 devices. Options to consider are covered in Selecting Compiler Options for Performance.

    • Options to take advantage of hardware features.

  • Where possible, using optimized libraries from TI.

  • Providing more information to the compiler to help its optimizations (for example: pragmas and the restrict keyword).

Basic Code Optimization Strategies and Techniques and Advanced Code Optimization Techniques describe these and more strategies to optimize your code.
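As an illustration of giving the compiler more information, the sketch below combines the restrict keyword with the TI MUST_ITERATE pragma on a simple loop. It is a minimal example under stated assumptions, not a recommended implementation: the function name vec_add and the trip-count limits (8 to 1024, multiple of 8) are hypothetical, and the pragma is guarded by the __C7000__ macro on the assumption that the C7000 compiler predefines it, so that the same source also builds with a host compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes out[i] = a[i] + b[i] for i in [0, n).
 *
 * restrict is a promise by the caller that out, a, and b do not overlap,
 * which lets the compiler reorder loads and stores across iterations.
 * The caller must also respect the trip-count limits asserted below. */
void vec_add(int32_t *restrict out,
             const int32_t *restrict a,
             const int32_t *restrict b,
             size_t n)
{
    /* Hypothetical limits for this sketch; they must match the pragma. */
    assert(n >= 8 && n <= 1024 && n % 8 == 0);

#ifdef __C7000__   /* assumed to be predefined by the C7000 compiler */
    /* Tells the compiler the minimum, maximum, and multiple of the
     * loop trip count. */
    #pragma MUST_ITERATE(8, 1024, 8)
#endif
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both annotations are promises rather than checks. If a caller passes overlapping buffers or a trip count outside the stated limits, the optimized code may produce wrong results, which is another reason to keep a self-checking test in place.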

If the application does not yet meet performance requirements, make improvements and return to Phase 2 to collect new profiling data.