7.3. Building and Running Memory Benchmarking Configurations

7.3.1. Demo Overview

  • This demo measures the performance of a realistic application whose text (code) resides in various memory locations while its data resides in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).
  • The application executes 10 different configurations of the same text, ranging from data-intensive to instruction-cache-intensive. Each test makes 500 calls in total, in random order, to 16 separate functions.
  • The most instruction-intensive example achieves an instruction cache miss rate (ICM/sec) of ~3-4 million per second when run entirely from OCMRAM. This rate is similar to what we have seen in real-world customer examples.
  • More data-intensive tests have more repetitive code and achieve much lower ICM rates.
Application Output              Description
Mem Cpy size    => 100          Size of the memcpy in bytes executed by each task
Exec Time in usec => 2567       Amount of time in microseconds
Iter            => 1            Number of times the test was run
Task calls      => 500          Number of randomly ordered calls to the 16 tasks
Inst Cache miss => 11421        Total instruction cache misses
Inst Cache acc  => 650207       Total instruction cache accesses
num switches    => 1469         Number of total context switches
num instr exec  => 1029260      Total number of executed instructions
ICM/sec         => 4449162      Instruction cache misses per second
INST/sec        => 400958317    Instructions executed per second
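
The two rate fields in the sample output are consistent with the raw counters scaled by the execution time: count × 1,000,000 / Exec Time in usec, truncated to an integer. The sketch below assumes that relationship; per_sec is a hypothetical helper name and is not part of the benchmark application.

```shell
# Sketch only: per_sec is a hypothetical helper, not part of the benchmark app.
# Assumes rate = count * 1,000,000 / exec_time_usec, truncated to an integer.
per_sec() {
    local count=$1 usec=$2
    echo $(( count * 1000000 / usec ))
}

# Values taken from the sample output above.
per_sec 11421 2567      # Inst Cache miss / Exec Time -> 4449162 (ICM/sec)
per_sec 1029260 2567    # num instr exec / Exec Time  -> 400958317 (INST/sec)
```

Both results match the ICM/sec and INST/sec values shown in the sample output, which makes this a handy sanity check when comparing logs from different memory configurations.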

7.3.2. Supported Configurations

Core              SOC     Supported Memory Configs (MEM_CONF)
mcu1_0            j721e   ocmc msmc ddr xip
mcu2_0            j721e   ocmc msmc ddr xip
mcu1_0 + mcu2_0   j721e   ddr xip
mcu1_0            j7200   ocmc msmc ddr xip
mcu2_0            j7200   ocmc msmc ddr xip
mcu1_0 + mcu2_0   j7200   ddr xip
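
The table above can be encoded as a small pre-build check so that an unsupported combination is caught before a long build. This is a sketch only; is_supported is a hypothetical helper, and "single" / "multi" are labels chosen here for the single-core rows and the mcu1_0 + mcu2_0 rows.

```shell
# Sketch only: is_supported is a hypothetical helper mirroring the table above.
# Usage: is_supported <cores> <mem_conf>, where <cores> is "single" or "multi".
is_supported() {
    local cores=$1 mem_conf=$2
    case "$cores:$mem_conf" in
        single:ocmc|single:msmc|single:ddr|single:xip) return 0 ;;
        multi:ddr|multi:xip) return 0 ;;
        *) return 1 ;;
    esac
}

is_supported multi ocmc || echo "ocmc is not listed for mcu1_0 + mcu2_0"
```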

7.3.3. How to Build

  • Go to the build folder, <pdk_path>/packages/ti/build/

7.3.3.1. Basic Single Core Benchmarking

  • Build the bootloader
    • make sbl_lib_cust_clean SOC=j7200 BOARD=j7200_evm CORE=mcu1_0 -sj
    • make sbl_cust_img SOC=j7200 BOARD=j7200_evm CORE=mcu1_0 RAT=1 -sj
  • Build the application (XIP and non-XIP):
    • make [MEM_CONF]_memory_benchmarking_app_freertos CORE=[CORE] SOC=j7200 BOARD=j7200_evm -sj
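
To benchmark every single-core memory configuration, the application build command above has to be repeated once per MEM_CONF value. The sketch below only prints those commands for review rather than running them; print_build_cmds is a hypothetical helper, and the SOC and BOARD values are the j7200 ones used in the steps above.

```shell
# Sketch only: prints the build commands instead of running them.
# Assumes SOC=j7200 and BOARD=j7200_evm, as in the steps above.
print_build_cmds() {
    local core=$1 mem_conf
    for mem_conf in ocmc msmc ddr xip; do
        echo "make ${mem_conf}_memory_benchmarking_app_freertos CORE=${core} SOC=j7200 BOARD=j7200_evm -sj"
    done
}

print_build_cmds mcu1_0
```

Printing first makes it easy to confirm the target names before committing to four full builds.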

7.3.3.2. Multicore Benchmarking

  • If building for XIP, edit <pdk_path>/packages/ti/boot/sbl/src/ospi/sbl_ospi.c: make sure it contains “#define BUILD_XIP” and comment out any line that says “#undef BUILD_XIP”.
  • Run the following commands:
    • make sbl_cust_img SOC=j7200 BOARD=j7200_evm RAT=1 -sj
    • make [MEM_CONF]_memory_benchmarking_app_freertos CORE=mcu1_0 SOC=j7200 BOARD=j7200_evm MULTICORE=1 -sj
    • make [MEM_CONF]_memory_benchmarking_app_freertos CORE=mcu2_0 SOC=j7200 BOARD=j7200_evm MULTICORE=1 -sj
    • ../boot/sbl/tools/multicoreImageGen/bin/MulticoreImageGen LE 55 [MEM_CONF]_multicore.appimage 8 ../binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/[MEM_CONF]_memory_benchmarking_app_freertos_mcu1_0_release.rprc  10 ../binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/[MEM_CONF]_memory_benchmarking_app_freertos_mcu2_0_release.rprc
  • The appimage to flash (step 4 of the flashing commands) should now be the [MEM_CONF]_multicore.appimage file in your build directory.
  • If building for XIP, also run the following command:
    • ../boot/sbl/tools/multicoreImageGen/bin/MulticoreImageGen LE 55 xip_multicore.appimage_xip 8 ../binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/xip_memory_benchmarking_app_freertos_mcu1_0_release.rprc_xip 10 ../binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/xip_memory_benchmarking_app_freertos_mcu2_0_release.rprc_xip
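
The non-XIP MulticoreImageGen command line is long and repeats MEM_CONF several times, so it is easy to mistype. The sketch below assembles and prints it for a given MEM_CONF; multicore_cmd is a hypothetical helper, and the "LE 55" arguments and the core IDs 8 (mcu1_0) and 10 (mcu2_0) are copied unchanged from the command above.

```shell
# Sketch only: multicore_cmd is a hypothetical helper that prints the command.
# "LE 55" and the core IDs 8 (mcu1_0) / 10 (mcu2_0) are copied from the steps above.
multicore_cmd() {
    local mem_conf=$1
    local app="${mem_conf}_memory_benchmarking_app_freertos"
    local bin="../binary/${app}/bin/j7200_evm"
    echo "../boot/sbl/tools/multicoreImageGen/bin/MulticoreImageGen LE 55" \
         "${mem_conf}_multicore.appimage" \
         "8 ${bin}/${app}_mcu1_0_release.rprc" \
         "10 ${bin}/${app}_mcu2_0_release.rprc"
}

multicore_cmd ddr
```

The printed line is meant to be run from the build folder, since the paths are relative to it.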

7.3.4. How to Run (Linux Host Machine Assumed)

7.3.4.1. Via OSPI

  • Build the custom SBL image (sbl_cust_img, with the RAT=1 option)
  • Put the EVM in UART boot mode
  • Download and install the uniflash tool

7.3.4.1.1. Non-XIP Use Cases

If using a multicore appimage, replace the file in flashing step 4 with the appimage that was created in the multicore build step.

  • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <uniflash_directory>/processors/FlashWriter/j7200_evm/uart_j7200_evm_flash_programmer_release.tiimage -i 0
  • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/boot/sbl/binary/j7200_evm/cust/bin/sbl_cust_img_mcu1_0_release.tiimage -d 3 -o 0
  • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/drv/sciclient/soc/V2/tifs.bin -d 3 -o 80000
  • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/[MEM_CONF]_memory_benchmarking_app_[CORE]_freertos_release.appimage -d 3 -o 100000
  • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/board/src/flash/nor/ospi/nor_spi_patterns.bin -d 3 -o 3FC0000
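
Only the file in step 4 changes between non-XIP runs, so the five commands above can be generated from a single parameter. The sketch below prints the sequence for review and executes nothing; flash_cmd and print_flash_cmds are hypothetical helpers, and UNIFLASH, PDK and TTY are placeholder variables standing in for <uniflash_directory>, <pdk_path> and the UART device.

```shell
# Sketch only: prints the flashing commands for review; nothing is executed.
# UNIFLASH, PDK and TTY are placeholders; adjust them for your setup.
UNIFLASH="<uniflash_directory>"
PDK="<pdk_path>"
TTY="/dev/ttyUSB1"

# flash_cmd <file> <remaining dslite arguments...>
flash_cmd() {
    local file=$1
    shift
    echo "sudo ${UNIFLASH}/dslite.sh --mode processors -c ${TTY} -f ${file} $*"
}

# print_flash_cmds <appimage>: offsets are the ones listed in the steps above.
print_flash_cmds() {
    local appimage=$1
    flash_cmd "${UNIFLASH}/processors/FlashWriter/j7200_evm/uart_j7200_evm_flash_programmer_release.tiimage" -i 0
    flash_cmd "${PDK}/packages/ti/boot/sbl/binary/j7200_evm/cust/bin/sbl_cust_img_mcu1_0_release.tiimage" -d 3 -o 0
    flash_cmd "${PDK}/packages/ti/drv/sciclient/soc/V2/tifs.bin" -d 3 -o 80000
    flash_cmd "${appimage}" -d 3 -o 100000
    flash_cmd "${PDK}/packages/ti/board/src/flash/nor/ospi/nor_spi_patterns.bin" -d 3 -o 3FC0000
}

# Example: a multicore ddr appimage produced in the build folder.
print_flash_cmds "${PDK}/packages/ti/build/ddr_multicore.appimage"
```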

7.3.4.1.2. Single Core XIP

  • To flash single core XIP:
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <uniflash_directory>/processors/FlashWriter/j7200_evm/uart_j7200_evm_flash_programmer_release.tiimage -i 0
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/boot/sbl/binary/j7200_evm/cust/bin/sbl_cust_img_mcu1_0_release.tiimage -d 3 -o 0
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/drv/sciclient/soc/V2/tifs.bin -d 3 -o 80000
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/[MEM_CONF]_memory_benchmarking_app_[CORE]_freertos_release.appimage -d 3 -o 100000
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/binary/[MEM_CONF]_memory_benchmarking_app_freertos/bin/j7200_evm/[MEM_CONF]_memory_benchmarking_app_[CORE]_freertos_release.appimage_xip -d 3
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/board/src/flash/nor/ospi/nor_spi_patterns.bin -d 3 -o 3FC0000

7.3.4.1.3. Multicore XIP

  • To flash multicore XIP (be sure to see the build instructions above for getting xip_multicore.appimage and xip_multicore.appimage_xip):
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <uniflash_directory>/processors/FlashWriter/j7200_evm/uart_j7200_evm_flash_programmer_release.tiimage -i 0
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/boot/sbl/binary/j7200_evm/cust/bin/sbl_cust_img_mcu1_0_release.tiimage -d 3 -o 0
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/drv/sciclient/soc/V2/tifs.bin -d 3 -o 80000
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/build/xip_multicore.appimage -d 3 -o 100000
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/build/xip_multicore.appimage_xip -d 3
    • sudo <uniflash_directory>/dslite.sh --mode processors -c /dev/ttyUSB1 -f <pdk_path>/packages/ti/board/src/flash/nor/ospi/nor_spi_patterns.bin -d 3 -o 3FC0000
  • For mcu1_0: attach a USB cable to UART Terminal 1 of the MCU UART port and open it (sudo minicom -D /dev/ttyUSB1) to see the output of the application
  • For mcu2_0: attach a USB cable to UART Terminal 0 of the Main UART port and open it (sudo minicom -D /dev/ttyUSB0) to see the output of the application
  • Power on the EVM in OSPI boot mode
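
When switching between cores it is easy to open the wrong terminal. The sketch below maps a core name to the minicom command given in the bullets above; console_cmd is a hypothetical helper, and the ttyUSB numbering is the one assumed in this section, which may differ on your host.

```shell
# Sketch only: console_cmd is a hypothetical helper that prints the minicom command.
# The device mapping follows the bullets above and may differ on your host.
console_cmd() {
    case "$1" in
        mcu1_0) echo "sudo minicom -D /dev/ttyUSB1" ;;  # MCU UART port
        mcu2_0) echo "sudo minicom -D /dev/ttyUSB0" ;;  # Main UART port
        *) echo "unknown core: $1" >&2; return 1 ;;
    esac
}

console_cmd mcu2_0
```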

7.3.5. Additional Notes

  • Again, XIP cannot run via CCS due to the location in memory where the application code sits.
  • When building the sbl_cust_img, you can get more verbose output by changing the “-DSBL_LOG_LEVEL” flag in CUST_SBL_TEST_FLAGS, in <pdk_path>/packages/ti/boot/sbl/sbl_component.mk, from 1 to 3. However, this substantially increases the cache miss rate and degrades the measured performance, so use it only for debugging and not for actual performance benchmarking.
  • A “Mem Cpy size” of 0 means that no memcpy occurred and the application test was strictly instruction-based.
  • When building different memory configurations, always do a clean build. Some consecutive builds will work and some will not, so building cleanly is the safe choice.
  • Problems have been noted when running at 166 MHz with a chip select delay of 2. Symptoms include junk characters in the UART output or the test stopping after executing only a few of the tasks. The issue is intermittent and unlikely, but possible. If you see it, change the clock speed to 133 MHz or the CSDA value to 3 in the sbl_ospi.c file.
  • The offset for flashing nor_spi_patterns.bin is SOC dependent. Please check the SOC-specific documentation for the correct value.
  • ttyUSB1 is used as the MCU UART terminal and ttyUSB0 as the Main UART terminal. This might not always be the case, so check the port assignment after connecting to the UART.
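
Because a raised SBL log level skews the results, it can be worth confirming the current setting before a benchmarking run. The sketch below extracts the value from makefile text on standard input; sbl_log_level is a hypothetical helper, and it assumes the flag is written in the form -DSBL_LOG_LEVEL=<n>. The sample input line is a stand-in, not the real contents of sbl_component.mk.

```shell
# Sketch only: sbl_log_level is a hypothetical helper.
# Assumes the flag appears as -DSBL_LOG_LEVEL=<n> in the text read from stdin.
sbl_log_level() {
    grep -o -e '-DSBL_LOG_LEVEL=[0-9]*' | head -n 1 | cut -d= -f2
}

# Stand-in input; in practice, redirect sbl_component.mk into the function.
echo 'CUST_SBL_TEST_FLAGS = -DSBL_LOG_LEVEL=1' | sbl_log_level
```

A result other than 1 suggests the SBL was configured for debugging and should be rebuilt before collecting numbers.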