1. J7200 Datasheet
1.1. Introduction
This section provides performance numbers for the device drivers supported in the PDK.
1.1.1. Setup Details
SOC Details | Values |
---|---|
Core | R5F |
Core Operating Speed | 1GHz |
DDR Speed | 2666 MHz |
Cache status | Enabled |
Optimization Details | Values |
---|---|
Profile | Release |
Compile Options for R5F | -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 --endian=little -mv7R5 --abi=eabi -eo.oer5f -ea.ser5f --symdebug:dwarf --embed_inline_assembly --float_support=vfpv3d16 --emit_warnings_as_errors |
Linker Options for R5F | --emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 --diag_suppress=10063 -x --zero_init=on |
Code Placement | DDR |
Data Placement | DDR |
1.1.2. Software Performance Numbers
1.1.2.1. CPSW
1.1.2.1.1. TCP (IP stack) Performance
1.1.2.1.1.1. CPSW5G - Main domain R5_0 core 0 (mcu2_0)
- Main domain R5_0 at 1GHz
- QSGMII interface at 1Gbps
- TCP window size: 128 KByte
Single Direction Test
Test | Measured Throughput (Mbps) | CPU Load (%) |
---|---|---|
TCP RX | 131.0 | 48 |
TCP TX | 23.4 | 12 |
Bidirectional Test
Test | Measured Throughput (Mbps) | CPU Load (%) |
---|---|---|
TCP RX | 138.0 | 62 |
TCP TX | 23.4 |
1.1.2.1.1.2. CPSW2G - MCU domain R5 core 0 (mcu1_0)
- MCU domain R5_0 at 1GHz
- RGMII interface at 1Gbps
- TCP window size: 128 KByte
Single Direction Test
Test | Measured Throughput (Mbps) | CPU Load (%) |
---|---|---|
TCP RX | 129.0 | 68 |
TCP TX | 23.4 | 18 |
Bidirectional Test
Test | Measured Throughput (Mbps) | CPU Load (%) |
---|---|---|
TCP RX | 125.0 | 80 |
TCP TX | 20.9 |
1.1.2.1.2. UDP (IP stack) Performance
1.1.2.1.2.1. CPSW5G - Main domain R5_0 core 0 (mcu2_0)
- Main domain R5_0 at 1GHz
- RGMII interface at 1Gbps
Single Direction Test
Test | Measured Throughput (Mbps) | CPU Load (%) |
---|---|---|
UDP RX | 231.0 | 91 |
1.1.2.1.2.2. CPSW2G - MCU domain R5 core 0 (mcu1_0)
- MCU domain R5_0 at 1GHz
- RGMII interface at 1Gbps
Single Direction Test
Test | Measured Throughput (Mbps) | CPU Load (%) |
---|---|---|
UDP RX | 163.0 | 79 |
Note:
- TCP/IP throughput results are measured using the Enet LLD lwIP example application built with the following modifications, which are not the PDK's default settings:
  - lwIP example app built with performance optimizations (OPTIMIZATION=PERFORMANCE passed to the make command)
  - Thumb2 mode enabled (CFLAGS_INTERNAL += -mthumb)
- Current performance numbers are preliminary because throughput profiling was not done in an optimized environment (for example, placing frequently used functions in fast memory, pacing with the ring monitor, or tuning descriptors).
1.1.2.2. UDMA
1.1.2.2.1. DMA Parameters
- Ring Order ID: 0
- Channel Order ID: 0
- Channel DMA Priority: 1
- Channel Bus Priority: 4
- Channel BUS QOS: 4
- Channel TX FIFO depth: 128
- Channel Fetch Word Size: 16
- Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels
1.1.2.2.2. Test Parameters
- Type: TR15 Block copy
- TR: one TR per TRPD in PBR mode
- TR Memory: Same as buffer memory (DDR, MSMC or OCMC depends on the test performed)
- Transfer Size: 1 MB read and 1 MB write
- 1 MB means 1000 x 1000 bytes and 1 KB means 1000 bytes
Note: The throughput numbers reported are the combined memory throughput of the read and write operations.
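The unit convention above (decimal megabytes, read and write counted together) can be illustrated with a minimal sketch. The function name and the 500 us elapsed time are hypothetical and only serve to show the arithmetic; they are not taken from the PDK test application.

```python
BYTES_PER_MB = 1000 * 1000  # decimal megabyte, as defined in the test parameters


def combined_throughput_mb_s(bytes_read, bytes_written, elapsed_us):
    """Combined read + write memory throughput in MB/s."""
    total_bytes = bytes_read + bytes_written
    elapsed_s = elapsed_us / 1e6
    return total_bytes / BYTES_PER_MB / elapsed_s


# Hypothetical timing: a block copy that reads 1 MB and writes 1 MB in 500 us.
example_mb_s = combined_throughput_mb_s(BYTES_PER_MB, BYTES_PER_MB, 500.0)
print(f"{example_mb_s:.1f} MB/s combined")
```

Because both directions are counted, the combined figure is twice what a copy-rate calculation based on the 1 MB payload alone would give.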
1.1.2.3. IPC
1.1.2.3.1. Test Set-up
- Release build binaries are used for measurement.
- Ring Buffer: Uncached DDR
- Buffer to be sent (RPMSG): Cached DDR
- Software/Application Used: ipc_multicore_perf_test loaded through SBL. Output is printed to UART.
- R5F/MPU config: DDR config
  - bufferable - 1
  - cacheable - 1
  - shareable - 0
- The tables below report round-trip time in microseconds (us) for different data sizes.
1.1.2.3.2. Performance - Host Core A72, Bios, 2 GHz
Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
---|---|---|---|---|---|---|---|
MCU R5F0 | 20 | 20 | 22 | 25 | 32 | 44 | 70 |
Main R5F0 | 18 | 19 | 20 | 24 | 29 | 41 | 65 |
1.1.2.3.3. Performance - Host Core MCU R5F0, 1 GHz
Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
---|---|---|---|---|---|---|---|
A72 (bios) | 21 | 21 | 23 | 26 | 32 | 43 | 68 |
Main R5F0 | 17 | 18 | 19 | 22 | 28 | 39 | 65 |
1.1.2.3.4. Performance - Host Core MAIN R5F0, 1 GHz
Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
---|---|---|---|---|---|---|---|
A72 (Bios) | 17 | 17 | 18 | 21 | 26 | 37 | 59 |
MCU R5F0 | 16 | 15 | 17 | 20 | 25 | 35 | 58 |
Main R5F1 | 16 | 16 | 17 | 21 | 26 | 36 | 59 |
1.1.2.4. OSPI
1.1.2.4.1. OSPI Memory Non Cached Test Set-up
- Platform: J7200 EVM.
- OS Type: Baremetal/Sysbios.
- Core : R5F_0 at 1 GHz, A72_0 at 2 GHz.
- Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp/OSPI_Baremetal_Flash_TestApp/OSPI_Baremetal_Flash_Dma_TestApp
- System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.4.2. OSPI Read/Write Performance (DDR Octal Mode)
OSPI RCLK | OS | CPU | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load |
---|---|---|---|---|---|---|---|
133 MHz | Baremetal | R5F_0 | DAC | | 100% | 7.25 | 100% |
133 MHz | Baremetal | R5F_0 | DAC DMA | | 100% | 118.625 | 100% |
133 MHz | Baremetal | R5F_0 | INDAC | 0.582 | 100% | 24.5 | 100% |
133 MHz | Baremetal | A72_0 | DAC | | 100% | 5.25 | 100% |
133 MHz | Baremetal | A72_0 | DAC DMA | | 100% | 102 | 100% |
133 MHz | Baremetal | A72_0 | INDAC | 0.575 | 100% | 13.25 | 100% |
133 MHz | RTOS | R5F_0 | DAC | | 100% | 7.25 | 100% |
133 MHz | RTOS | R5F_0 | DAC DMA | | 0% | 118.625 | 56% |
133 MHz | RTOS | R5F_0 | INDAC | 0.563 | 99% | 24.5 | 100% |
133 MHz | RTOS | A72_0 | DAC | | 100% | 5.375 | 100% |
133 MHz | RTOS | A72_0 | DAC DMA | | 0% | 102.875 | 61% |
133 MHz | RTOS | A72_0 | INDAC | 0.557 | 99% | 13.375 | 100% |
166 MHz | Baremetal | R5F_0 | DAC | | 100% | 7.625 | 100% |
166 MHz | Baremetal | R5F_0 | DAC DMA | | 100% | 92 | 100% |
166 MHz | Baremetal | R5F_0 | INDAC | 0.581 | 100% | 24.5 | 100% |
166 MHz | Baremetal | A72_0 | DAC | | 100% | 5.625 | 100% |
166 MHz | Baremetal | A72_0 | DAC DMA | | 100% | 70.375 | 100% |
166 MHz | Baremetal | A72_0 | INDAC | 0.571 | 100% | 13.25 | 100% |
166 MHz | RTOS | R5F_0 | DAC | | 100% | 7.625 | 100% |
166 MHz | RTOS | R5F_0 | DAC DMA | | 0% | 91.75 | 73% |
166 MHz | RTOS | R5F_0 | INDAC | 0.563 | 99% | 24.5 | 100% |
166 MHz | RTOS | A72_0 | DAC | | 100% | 5.875 | 100% |
166 MHz | RTOS | A72_0 | DAC DMA | | 0% | 70.75 | 79% |
166 MHz | RTOS | A72_0 | INDAC | 0.556 | 99% | 13.375 | 100% |
1.1.2.4.3. OSPI Memory Cached Test Set-up
- Platform: J7200 EVM.
- OS Type: Baremetal/Sysbios.
- Core : R5F_0 at 1 GHz, A72_0 at 2 GHz.
- Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp/OSPI_Baremetal_Flash_Cache_TestApp/OSPI_Baremetal_Flash_Dma_Cache_TestApp
- System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.4.4. OSPI Read/Write Performance (DDR Octal Mode)
OSPI RCLK | OS | CPU | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load |
---|---|---|---|---|---|---|---|
133 MHz | Baremetal | R5F_0 | DAC | | 100% | 7.25 | 100% |
133 MHz | Baremetal | R5F_0 | DAC DMA | | 100% | 119 | 100% |
133 MHz | Baremetal | R5F_0 | INDAC | 0.572 | 100% | 24.625 | 100% |
133 MHz | Baremetal | A72_0 | DAC | | 100% | | 100% |
133 MHz | Baremetal | A72_0 | DAC DMA | | 100% | | 100% |
133 MHz | Baremetal | A72_0 | INDAC | | 100% | | 100% |
133 MHz | RTOS | R5F_0 | DAC | | 100% | 63 | 100% |
133 MHz | RTOS | R5F_0 | DAC DMA | | 0% | 119 | 55% |
133 MHz | RTOS | R5F_0 | INDAC | 0.582 | 99% | 24.5 | 100% |
133 MHz | RTOS | A72_0 | DAC | | 100% | 5.25 | 100% |
133 MHz | RTOS | A72_0 | DAC DMA | | 0% | 102.875 | 61% |
133 MHz | RTOS | A72_0 | INDAC | 0.571 | 99% | 13.5 | 100% |
166 MHz | Baremetal | R5F_0 | DAC | | 100% | 7.625 | 100% |
166 MHz | Baremetal | R5F_0 | DAC DMA | | 100% | 92 | 100% |
166 MHz | Baremetal | R5F_0 | INDAC | 0.570 | 100% | 24.625 | 100% |
166 MHz | Baremetal | A72_0 | DAC | | 100% | | 100% |
166 MHz | Baremetal | A72_0 | DAC DMA | | 100% | | 100% |
166 MHz | Baremetal | A72_0 | INDAC | | 100% | | 100% |
166 MHz | RTOS | R5F_0 | DAC | | 100% | 48.875 | 100% |
166 MHz | RTOS | R5F_0 | DAC DMA | | 0% | 91.75 | 73% |
166 MHz | RTOS | R5F_0 | INDAC | 0.582 | 99% | 24.5 | 100% |
166 MHz | RTOS | A72_0 | DAC | | 100% | 5.75 | 100% |
166 MHz | RTOS | A72_0 | DAC DMA | | 0% | 70.75 | 79% |
166 MHz | RTOS | A72_0 | INDAC | 0.568 | 99% | 13.5 | 100% |
1.1.2.5. MMCSD
1.1.2.5.1. Test Set-up
- Platform: J7200 EVM.
- OS Type: Sysbios.
- Core : A72_0, 2 GHz.
- Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (A menu based application which outputs the benchmark numbers on UART)
- System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.
- SD Card used: Sandisk 16GB, Class 10. FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput & compatibility with various cards)
- EMMC: EMMC on J7200 EVM. Please refer to the EVM data sheet for details
1.1.2.5.2. SD Card Performance
1.1.2.5.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) | FATFS Write Throughput (MB/s) | FATFS Read Throughput (MB/s) |
---|---|---|---|---|
256 | 9.1059 | 9.4340 | 4.1804 | 7.5307 |
512 | 9.8377 | 10.4257 | 4.5550 | 8.0084 |
1024 | 10.0432 | 10.7388 | 4.9630 | 8.2052 |
2048 | 10.4119 | 10.9066 | 5.8666 | 8.0361 |
5120 | 10.0376 | 10.9829 | 4.7683 | 8.3273 |
1.1.2.5.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) | FATFS Write Throughput (MB/s) | FATFS Read Throughput (MB/s) |
---|---|---|---|---|
256 | 15.9483 | 16.4356 | 4.3909 | 11.8113 |
512 | 18.5548 | 19.6683 | 6.2893 | 12.6380 |
1024 | 19.9566 | 20.8116 | 6.5560 | 13.1697 |
2048 | 19.9830 | 21.4463 | 6.5847 | 13.4176 |
5120 | 20.0178 | 21.8337 | 6.2207 | 13.4776 |
1.1.2.5.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) | FATFS Write Throughput (MB/s) | FATFS Read Throughput (MB/s) |
---|---|---|---|---|
256 | 9.0146 | 9.4187 | 4.2206 | 7.4148 |
512 | 9.7703 | 10.4165 | 4.9643 | 8.0081 |
1024 | 10.0714 | 10.7345 | 4.7311 | 8.2015 |
2048 | 9.6667 | 10.8930 | 5.0503 | 8.3087 |
5120 | 10.0025 | 11.0095 | 4.8343 | 8.3287 |
1.1.2.5.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) | FATFS Write Throughput (MB/s) | FATFS Read Throughput (MB/s) |
---|---|---|---|---|
256 | 16.2732 | 16.4143 | 5.6652 | 11.2796 |
512 | 18.3847 | 19.6669 | 6.3413 | 12.6358 |
1024 | 19.0623 | 20.8100 | 6.5959 | 13.1657 |
2048 | 17.4704 | 21.3765 | 6.3836 | 13.4073 |
5120 | 19.6133 | 21.8508 | 6.0397 | 12.5147 |
1.1.2.5.2.5. SDR50 Mode (100 MHz, 4-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) | FATFS Write Throughput (MB/s) | FATFS Read Throughput (MB/s) |
---|---|---|---|---|
256 | 24.6037 | 26.1130 | 4.5208 | 7.6322 |
512 | 29.9576 | 35.3214 | 4.9401 | 7.9848 |
1024 | 32.6505 | 39.1811 | 4.9564 | 8.1912 |
2048 | 30.3629 | 41.3373 | 4.9362 | 8.2954 |
5120 | 34.7683 | 43.0374 | 4.8785 | 8.3285 |
1.1.2.5.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) | FATFS Write Throughput (MB/s) | FATFS Read Throughput (MB/s) |
---|---|---|---|---|
256 | 23.4774 | 25.6365 | 4.2197 | 7.5511 |
512 | 26.2276 | 34.4773 | 4.4524 | 7.9936 |
1024 | 34.0707 | 38.1547 | 4.9994 | 8.2083 |
2048 | 29.2400 | 40.1979 | 5.0277 | 8.3036 |
5120 | 32.5992 | 41.6822 | 4.8337 | 8.3316 |
1.1.2.5.3. EMMC Performance
1.1.2.5.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 15.9600 | 18.5776 |
512 | 18.1068 | 20.1941 |
1024 | 19.4310 | 21.1389 |
2048 | 20.1785 | 21.6574 |
5120 | 20.6573 | 21.9851 |
1.1.2.5.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 25.6862 | 31.8970 |
512 | 31.7678 | 36.9522 |
1024 | 36.0882 | 40.2272 |
2048 | 38.7699 | 42.1508 |
5120 | 39.6647 | 43.3818 |
1.1.2.5.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 34.8107 | 47.9176 |
512 | 41.8965 | 60.3240 |
1024 | 48.6215 | 69.5793 |
2048 | 53.9672 | 75.5317 |
5120 | 56.1397 | 79.6654 |
1.1.2.5.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 37.8881 | 68.9168 |
512 | 46.4331 | 97.8488 |
1024 | 50.7672 | 124.6944 |
2048 | 54.6804 | 145.1625 |
5120 | 55.0597 | 160.8638 |
1.1.2.5.3.5. HS-400 Mode (200 MHz, 8-bit) Theoretical Max: 400 MB/s
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 36.2206 | 84.0709 |
512 | 47.7269 | 130.8260 |
1024 | 51.6706 | 184.4708 |
2048 | 55.3375 | 203.5146 |
5120 | 56.7088 | 208.5778 |
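The theoretical maxima quoted in the SD and EMMC mode headings follow from the bus clock, the bus width, and whether data is transferred on one or both clock edges. The sketch below shows that arithmetic and how a measured value compares with it. The function names are illustrative and not part of the MMCSD driver.

```python
def theoretical_max_mb_s(clock_mhz, bus_width_bits, ddr=False):
    """Peak bus throughput in MB/s for a given clock, bus width and data rate."""
    transfers_per_clock = 2 if ddr else 1  # DDR modes transfer on both clock edges
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


def bus_efficiency(measured_mb_s, clock_mhz, bus_width_bits, ddr=False):
    """Fraction of the theoretical maximum reached by a measured throughput."""
    return measured_mb_s / theoretical_max_mb_s(clock_mhz, bus_width_bits, ddr)


# SD card: DS (25 MHz, 4-bit), SDR25 (50 MHz, 4-bit), DDR50 (50 MHz, 4-bit, DDR)
print(theoretical_max_mb_s(25, 4))
print(theoretical_max_mb_s(50, 4))
print(theoretical_max_mb_s(50, 4, ddr=True))

# EMMC HS-400 (200 MHz, 8-bit, DDR): RAW read of 5120 KB measured at 208.5778 MB/s
hs400_read_eff = bus_efficiency(208.5778, 200, 8, ddr=True)
print(f"HS-400 read efficiency: {hs400_read_eff:.1%}")
```

The gap between measured and theoretical throughput widens in the faster modes, which suggests that card and command overhead, rather than the bus itself, becomes the limiting factor there.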
1.1.2.6. CSL-FL based Optimized OSPI Example
1.1.2.6.1. CPU Mode - Test Set-up
- Platform: J7200 EVM.
- OS Type: Baremetal
- Core: R5F_0 at 1 GHz
- Software/Application Used: csl_ospi_flash_app
- System Configuration:
  - RCLK 133/166 MHz
  - Cache ON
  - Buffer and critical functions in TCMB
  - DMA Disabled
  - Interrupts OFF
- Theoretical Max Throughput:
  - 133 MHz: 253.67 MB/s
  - 166 MHz: 316.62 MB/s
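The theoretical maxima listed above appear consistent with one byte (eight data lines) transferred on each of the two clock edges, with the result expressed in units of 2^20 bytes per second. That interpretation is an inference from the numbers, not something the test application documents. A minimal sketch of the calculation:

```python
def ospi_octal_ddr_peak(rclk_mhz):
    """Peak octal DDR throughput, in units of 2**20 bytes per second (assumed unit)."""
    bytes_per_edge = 1   # octal mode: 8 data lines carry one byte per edge
    edges_per_clock = 2  # dual data rate: both clock edges carry data
    bytes_per_second = rclk_mhz * 1e6 * edges_per_clock * bytes_per_edge
    return bytes_per_second / 2**20


peak_133 = ospi_octal_ddr_peak(133)
peak_166 = ospi_octal_ddr_peak(166)
print(f"133 MHz: {peak_133:.3f}, 166 MHz: {peak_166:.3f}")
```

In decimal megabytes (10^6 bytes) the same peaks would be 266 MB/s and 332 MB/s.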
1.1.2.6.2. DAC Mode OSPI Read Performance (Dual Data Rate - Octal Mode)
OSPI RCLK | Size of transfer (B) | Read Time (ns) | Throughput (MB/s) |
---|---|---|---|
133 MHz | 16 | 815 | 19.6 |
133 MHz | 32 | 1445 | 22.1 |
133 MHz | 64 | 2700 | 23.7 |
133 MHz | 128 | 5225 | 24.5 |
133 MHz | 256 | 10265 | 24.9 |
133 MHz | 512 | 20360 | 25.1 |
133 MHz | 1024 | 40510 | 25.3 |
166 MHz | 16 | 945 | 16.9 |
166 MHz | 32 | 2330 | 13.7 |
166 MHz | 64 | 4580 | 14.0 |
166 MHz | 128 | 9105 | 14.1 |
166 MHz | 256 | 18145 | 14.1 |
166 MHz | 512 | 36185 | 14.1 |
166 MHz | 1024 | 72295 | 14.2 |
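The Throughput column in the table above can be reproduced from the transfer size and read time, taking 1 MB as 10^6 bytes. The helper below is a sketch for checking or extending the table and is not part of csl_ospi_flash_app.

```python
def read_throughput_mb_s(size_bytes, read_time_ns):
    """Throughput in MB/s (1 MB = 10**6 bytes) from a size in bytes and a time in ns."""
    # bytes per nanosecond is numerically GB/s, so scale by 1000 to get MB/s
    return size_bytes / read_time_ns * 1000


# (RCLK, size in bytes, read time in ns) rows taken from the DAC mode table
samples = [("133 MHz", 16, 815), ("133 MHz", 1024, 40510), ("166 MHz", 1024, 72295)]
for rclk, size, time_ns in samples:
    print(f"{rclk}, {size} B: {read_throughput_mb_s(size, time_ns):.1f} MB/s")
```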
1.1.2.6.3. DMA Mode - Test Set-up
- Platform: J7200 EVM.
- OS Type: Baremetal
- Core: R5F_0 at 1 GHz
- Software/Application Used: udma_baremetal_ospi_flash_testapp
- System Configuration:
  - RCLK 133/166 MHz
  - Cache ON
  - Buffer and critical functions in TCMB
  - DMA Enabled - SW Trigger mode
  - Interrupts OFF
- Theoretical Max Throughput:
  - 133 MHz: 253.67 MB/s
  - 166 MHz: 316.62 MB/s
1.1.2.6.4. DAC DMA Mode OSPI Read Performance (Dual Data Rate - Octal Mode)
OSPI RCLK | Size of transfer (B) | Read Time (ns) | Throughput (MB/s) |
---|---|---|---|
133 MHz | 16 | 800 | 20 |
133 MHz | 32 | 805 | 39.8 |
133 MHz | 64 | 970 | 66 |
133 MHz | 128 | 1315 | 97.3 |
133 MHz | 256 | 1955 | 130.9 |
133 MHz | 512 | 3120 | 164.1 |
133 MHz | 1024 | 5450 | 187.9 |
166 MHz | 16 | 675 | 23.7 |
166 MHz | 32 | 805 | 39.8 |
166 MHz | 64 | 850 | 75.3 |
166 MHz | 128 | 1180 | 108.5 |
166 MHz | 256 | 1685 | 151.9 |
166 MHz | 512 | 2730 | 187.5 |
166 MHz | 1024 | 4670 | 219.3 |
1.1.2.7. SBL OSPI Boot Performance App
1.1.2.7.1. Test Set-up
- Platform: J7200 EVM.
- OS Type: Baremetal
- Core : R5F_0 at 1 GHz
- Software/Application Used: sbl_cust_img (with custom flags) and sbl_boot_perf_test appimage
1.1.2.7.2. GP EVM Performance
SBL Boot Time Breakdown | Time (ms) |
---|---|
MCU_PORZ_OUT to MCU_RESETSTATz | 0.63 |
ROM : init + SBL load from OSPI | 10.00 |
SBL : Board_init (PINMUX) | 0.20 |
SBL: SPI_init | 0.01 |
SBL : SBL_SciClientInit: ReadSysfwImage | 0.05 |
Load/Start SYSFW | 10.70 |
Board Config | 5.20 |
PM Config | 0.90 |
RM Config | 1.16 |
Security Config | 0.57 |
SBL: SoC Late-Init | 0.00 |
SBL : Board_init (PLL) | 1.60 |
SBL: Board_init (CLOCKS) | 0.80 |
SBL: OSPI init | 0.06 |
SBL: Misc (Sciclient_pmSetModule) | 0.30 |
SBL: App copy to MCU SRAM & Jump to App | 2.90 |
MCUSW: CAN response | 1.00 |
TOTAL time | 36.10 |
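As a quick consistency check on the breakdown above, the rows can be summed and ranked. The sketch below copies the stage times from the table. Treating the rows as sequential, non-overlapping stages is an assumption made here, not something the table states.

```python
# Rows copied from the GP EVM boot time breakdown (times in ms).
stages_ms = [
    ("MCU_PORZ_OUT to MCU_RESETSTATz", 0.63),
    ("ROM: init + SBL load from OSPI", 10.00),
    ("SBL: Board_init (PINMUX)", 0.20),
    ("SBL: SPI_init", 0.01),
    ("SBL: SBL_SciClientInit: ReadSysfwImage", 0.05),
    ("Load/Start SYSFW", 10.70),
    ("Board Config", 5.20),
    ("PM Config", 0.90),
    ("RM Config", 1.16),
    ("Security Config", 0.57),
    ("SBL: SoC Late-Init", 0.00),
    ("SBL: Board_init (PLL)", 1.60),
    ("SBL: Board_init (CLOCKS)", 0.80),
    ("SBL: OSPI init", 0.06),
    ("SBL: Misc (Sciclient_pmSetModule)", 0.30),
    ("SBL: App copy to MCU SRAM & Jump to App", 2.90),
    ("MCUSW: CAN response", 1.00),
]

summed_ms = round(sum(t for _, t in stages_ms), 2)
largest = sorted(stages_ms, key=lambda s: s[1], reverse=True)[:3]

for name, t in largest:
    print(f"{name}: {t:.2f} ms ({100 * t / summed_ms:.1f}% of summed time)")
print(f"Sum of rows: {summed_ms:.2f} ms")
```

The rows sum to 36.08 ms, which agrees with the reported 36.10 ms total to within the rounding of the individual entries. SYSFW load/start and the ROM stage together account for more than half of the boot time.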
1.1.2.8. OSPI Memory Configuration Benchmarking
- These numbers were collected from the memory_benchmarking_app demo, which measures the performance of a realistic application whose text (code) is placed in various memory locations while its data resides in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).
- The application executes 10 different configurations of the same text, varying in data versus instruction cache intensity. Each test calls 16 separate functions a total of 500 times in random order.
- The most instruction-intensive example achieves an instruction cache miss rate (ICM/sec) of ~3-4 million per second when run entirely from OCMRAM. This is a rate that we have similarly seen in real-world customer examples.
- More data-intensive tests have more repetitive code, achieving much lower ICM rates.
- The “Multicore” configuration is defined as the same AUTOSAR application executing simultaneously, by means of a synchronization delay, on MCU Core 0 (mcu1_0) and MAIN Core 0 (mcu2_0).
- The memcpy size is just a knob to make the synthetic benchmark application more data-centric or instruction-centric, with no additional significance. (A small memcpy size is more instruction-centric with a higher ICM rate, and vice versa.)
1.1.2.8.1. Supported Configurations
Core | SOC | Supported Memory Configurations (MEM_CONF) |
---|---|---|
mcu1_0 | j721e | ocmc msmc ddr xip |
mcu2_0 | j721e | ocmc msmc ddr xip |
mcu1_0 + mcu2_0 | j721e | ddr xip |
1.1.2.8.2. Test Set-up
- Platform: J7200 EVM.
- OS Type: FreeRTOS
- Core: MCU Domain R5_0 (MCU1_0) & Main Domain R5_0 (MCU2_0)
- Software/Application Used: sbl_cust_img (with RAT=1 for mcu2_0 execution) and [MEM_CONF]_memory_benchmarking_app_freertos appimage
1.1.2.8.3. MCU Domain Single Core Execution
- Cache miss rate of 3M/sec is at memcpy size of 50 bytes.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 3393 | 3580 | 5189 | 7065 | 11072 |
OCMC | ICM/sec | 3335101 | 3072905 | 2055309 | 1481811 | 987897 |
DDR | DDR execution time (us) | 6907 | 7051 | 8633 | 10489 | 14896 |
DDR | DDR / OCMC Baseline | 2.036 | 1.97 | 1.664 | 1.485 | 1.345 |
MSMC | MSMC execution time (us) | 5613 | 5908 | 7434 | 9303 | 13541 |
MSMC | MSMC / OCMC Baseline | 1.654 | 1.65 | 1.433 | 1.317 | 1.223 |
XIP | XIP 133 MHz execution time (us) | 5301 | 5593 | 7179 | 9228 | 13742 |
XIP | XIP 133 MHz / OCMC Baseline | 1.562 | 1.562 | 1.384 | 1.306 | 1.241 |
XIP | XIP 166 MHz execution time (us) | 6394 | 6495 | 7989 | 10020 | 14403 |
XIP | XIP 166 MHz / OCMC Baseline | 1.884 | 1.814 | 1.54 | 1.418 | 1.301 |
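Each "/ OCMC Baseline" row in these tables is the execution time for that memory divided by the OCMC baseline execution time at the same memcpy size. The sketch below reproduces the DDR row of the MCU domain single core table from its two execution-time rows. The variable names are illustrative only.

```python
memcpy_sizes = [0, 50, 500, 1000, 2048]

# Execution times in microseconds, copied from the MCU domain single core table.
ocmc_us = [3393, 3580, 5189, 7065, 11072]
ddr_us = [6907, 7051, 8633, 10489, 14896]

# Slowdown relative to running the same code from OCMC.
ddr_ratios = [round(ddr / ocmc, 3) for ddr, ocmc in zip(ddr_us, ocmc_us)]

for size, ratio in zip(memcpy_sizes, ddr_ratios):
    print(f"memcpy {size:>4} B: DDR / OCMC = {ratio}")
```

The ratio falls as the memcpy size grows because the workload becomes more data-centric, so less of the run time depends on where the code is fetched from.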
1.1.2.8.4. MAIN Domain Single Core Execution
- Cache miss rate of 3M/sec is closest at memcpy size of ~0 bytes.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 4505 | 4671 | 6346 | 8359 | 12495 |
OCMC | ICM/sec | 2487014 | 2466923 | 1792782 | 1405790 | 941336 |
DDR | DDR execution time (us) | 6073 | 6310 | 7873 | 9847 | 14027 |
DDR | DDR / OCMC Baseline | 1.348 | 1.351 | 1.241 | 1.178 | 1.123 |
MSMC | MSMC execution time (us) | 4944 | 5098 | 6682 | 8617 | 12743 |
MSMC | MSMC / OCMC Baseline | 1.097 | 1.091 | 1.053 | 1.031 | 1.02 |
XIP | XIP 133 MHz execution time (us) | 9744 | 9765 | 11629 | 13626 | 18044 |
XIP | XIP 133 MHz / OCMC Baseline | 2.163 | 2.091 | 1.832 | 1.63 | 1.444 |
XIP | XIP 166 MHz execution time (us) | 9619 | 9834 | 11359 | 13472 | 17992 |
XIP | XIP 166 MHz / OCMC Baseline | 2.135 | 2.105 | 1.79 | 1.612 | 1.44 |
1.1.2.8.5. MCU Domain Multi-Core Execution
- Cache miss rate of 3M/sec is at memcpy size of 50 bytes.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 3387 | 3596 | 5185 | 7083 | 11078 |
OCMC | ICM/sec | 3332152 | 3132647 | 2047058 | 1529436 | 981404 |
DDR | DDR execution time (us) | 6712 | 6948 | 8451 | 10490 | 14614 |
DDR | DDR / OCMC Baseline | 1.982 | 1.932 | 1.63 | 1.481 | 1.319 |
XIP | XIP 133 MHz execution time (us) | 10163 | 10320 | 10627 | 12514 | 16421 |
XIP | XIP 133 MHz / OCMC Baseline | 3.001 | 2.87 | 2.05 | 1.767 | 1.482 |
XIP | XIP 166 MHz execution time (us) | 9704 | 9623 | 10544 | 12200 | 16239 |
XIP | XIP 166 MHz / OCMC Baseline | 2.865 | 2.676 | 2.034 | 1.722 | 1.466 |
1.1.2.8.6. MAIN Domain Multi-Core Execution
- Cache miss rate of 3M/sec is closest at memcpy size of 0 bytes.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 4505 | 4671 | 6346 | 8359 | 12495 |
OCMC | ICM/sec | 2487014 | 2466923 | 1792782 | 1405790 | 941336 |
DDR | DDR execution time (us) | 6214 | 6390 | 7928 | 9818 | 14130 |
DDR | DDR / OCMC Baseline | 1.379 | 1.368 | 1.249 | 1.175 | 1.131 |
XIP | XIP 133 MHz execution time (us) | 13045 | 13100 | 13903 | 16066 | 20017 |
XIP | XIP 133 MHz / OCMC Baseline | 2.896 | 2.805 | 2.191 | 1.922 | 1.602 |
XIP | XIP 166 MHz execution time (us) | 11683 | 11832 | 12704 | 14763 | 18849 |
XIP | XIP 166 MHz / OCMC Baseline | 2.593 | 2.533 | 2.002 | 1.766 | 1.509 |
1.1.2.8.7. Additional OCMC Baseline Details - MCU Domain
- View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of 50 bytes.
Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
---|---|---|---|---|---|---|---|---|---|---|
Start Time in Usec | 55023 | 333022 | 613022 | 894023 | 1176022 | 1460024 | 1745024 | 2032024 | 2320024 | 2608024 |
Exec Time in Usec | 3387 | 3596 | 3804 | 4208 | 5185 | 6263 | 7083 | 8048 | 8937 | 11078 |
Task Calls | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 |
Inst Cache Miss | 11286 | 11265 | 11403 | 11650 | 10614 | 11374 | 10833 | 11141 | 11431 | 10872 |
Inst Cache Acc | 1166705 | 1270466 | 1371551 | 1574380 | 2161434 | 2680942 | 3174851 | 3687742 | 4176758 | 5281564 |
Num Instr Exec | 1244275 | 1419949 | 1595675 | 1946698 | 2982873 | 3872586 | 4740226 | 5629740 | 6491966 | 8412353 |
ICM/sec | 3332152 | 3132647 | 2997634 | 2768536 | 2047058 | 1816062 | 1529436 | 1384319 | 1279064 | 981404 |
INST/sec | 367367877 | 394869021 | 419472923 | 462618346 | 575288910 | 618327638 | 669239870 | 699520377 | 726414456 | 759374706 |
1.1.2.8.8. Additional OCMC Baseline Details - MAIN Domain
- View the ICM/sec row to see that the cache miss rate of 3M/sec is closest at a memcpy size of 0 bytes. The mcu2_0 application is marginally less complex because mcu1_0 is responsible for the sciserver and is the boot core.
Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
---|---|---|---|---|---|---|---|---|---|---|
Start Time in Usec | 55039 | 334030 | 615030 | 897030 | 1180030 | 1465031 | 1751030 | 2039030 | 2328030 | 2619031 |
Exec Time in Usec | 4505 | 4671 | 4875 | 5254 | 6346 | 7504 | 8359 | 9370 | 10320 | 12495 |
Task Calls | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 |
Inst Cache Miss | 11204 | 11523 | 11744 | 12214 | 11377 | 11371 | 11751 | 11997 | 12401 | 11762 |
Inst Cache Acc | 1152861 | 1255422 | 1358342 | 1559360 | 2148778 | 2666993 | 3158109 | 3674960 | 4163681 | 5266054 |
Num Instr Exec | 1248973 | 1424171 | 1600146 | 1950800 | 2986724 | 3876858 | 4743605 | 5632834 | 6495638 | 8416114 |
ICM/sec | 2487014 | 2466923 | 2409025 | 2324704 | 1792782 | 1515325 | 1405790 | 1280362 | 1201647 | 941336 |
INST/sec | 277241509 | 304896381 | 328235076 | 371298058 | 470646706 | 516638859 | 567484746 | 601156243 | 629422286 | 673558543 |
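The ICM/sec and INST/sec rows in the two detail tables can be derived from the raw counters and the execution time: each is the counter value divided by the execution time in seconds. The sketch below uses integer arithmetic to reproduce the memcpy size 0 column of both tables. The helper name is illustrative and not taken from memory_benchmarking_app.

```python
def per_second(count, exec_time_us):
    """Events per second from an event count and an execution time in microseconds."""
    return count * 1_000_000 // exec_time_us


# Memcpy size 0 column, MCU domain: Exec Time 3387 us
mcu_icm = per_second(11286, 3387)       # Inst Cache Miss -> ICM/sec
mcu_inst = per_second(1244275, 3387)    # Num Instr Exec  -> INST/sec

# Memcpy size 0 column, MAIN domain: Exec Time 4505 us
main_icm = per_second(11204, 4505)
main_inst = per_second(1248973, 4505)

print(mcu_icm, mcu_inst)
print(main_icm, main_inst)
```

The instruction cache miss counts are nearly constant across memcpy sizes (roughly 10,600 to 12,400), so the drop in ICM/sec at larger sizes comes almost entirely from the longer execution time.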