1. J721E Datasheet

1.1. Introduction

This section provides the performance numbers of device drivers supported in PDK

1.1.1. Setup Details

SOC Details Values
Core R5F
Core Operating Speed 1GHz
DDR Speed 4266 MTs
VPAC Frequency 650 MHz
DMPAC Frequency 520 MHz
Cache status Enabled
Optimization Details Values
Profile Release
Compile Options for R5F -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 –endian=little -mv7R5 –abi=eabi -eo.oer5f -ea.ser5f –symdebug:dwarf –embed_inline_assembly –float_support=vfpv3d16 –emit_warnings_as_errors
Linker Options for R5F –emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 –diag_suppress=10063 -x –zero_init=on
Code Placement DDR
Data Placement DDR

1.1.2. Software Performance Numbers

1.1.2.1. VHWA

VHWA Driver Configuration Measured Throughput (MPix/S)
DOF 2MP (2048x1024), 12b Packed, 6 Levels, SR191x96 158.65
DOF 1MP (1312x736), 12b Packed, 5 Levels, SR170x124 150.93
MSC 1080P, 8b YUV420, 10 Scales output 609.46
NF 720P, 8b YUV420, Bilateral filter 615.54
SDE 2MP (2048x1024), 12b Packed, SR 192, LR Enabled 84.22
SDE 720P, 12b Packed, SR 192, LR Disabled 98.92
LDC 1080P, 8b YUV420, Single region 632.98
VISS 1080P, Raw 12 input, 2 frame Merge, YUV420 12b and 8b output 603.25

1.1.2.2. DSS

Display Type Configuration CPU Load
HDMI 1080P60 RGB888 1.0% (MCU2_0)
DP 1080P60 BGRA32 1.0% (MCU2_0)

1.1.2.3. CSI-Rx

Capture Type Configuration CPU Load
CSI2Rx Inst 0 4CH 1080P30 IMX390 Sensor Raw12 1.2% (MCU2_0)

1.1.2.4. CPSW_9G

1.1.2.4.1. Test Setup
_images/enet_j721e_cpsw9g_test_setup.png
Hardware Configuration Value
Processing Core Main R5F0 Core 0
Core Frequency 1 GHz
Ethernet Interface Type RGMII at 1Gbps
Packet buffer memory DDR
Hardware checksum offload Yes
Scatter-gather TX Yes
Scatter-gather RX No
Software Configuration Value
RTOS FreeRTOS
RTOS application Enet LLD lwIP example
TCP/IP stack lwIP 2.1.2
Host PC tool version iperf v2.0.10
1.1.2.4.2. TCP Performance
Test Bandwidth (Mbps) CPU Load (%)
TCP RX 122 95
TCP TX 103 100
TCP Bidirectional RX=45.7 TX=63.7 100

Host PC commands:

iperf -c <evm_ip> -r
iperf -c <evm_ip> -d
1.1.2.4.3. UDP Performance
Test Datagram Length = 64B Datagram Length = 256B Datagram Length = 512B Datagram Length = 1470B
Bandwidth
(Mbps)

CPU
Load
(%)
Packet
Loss
(%)
Bandwidth
(Mbps)

CPU
Load
(%)
Packet
Loss
(%)
Bandwidth
(Mbps)

CPU
Load
(%)
Packet
Loss
(%)
Bandwidth
(Mbps)

CPU
Load
(%)
Packet
Loss
(%)
UDP RX 5.24 41 0.00053 26.2 63 0.0029 26.2 45 0.0052 26.2 32 0.0
10.5 60 0.0021 52.4     52.4 67 0.051 52.4 44 0.0
15.7 79 0.013 105     105     105 64 0.015
UDP RX (Max) 21 97 0.098 47.2 100 0.012 86 100 0.014 189 97 0.0
UDP TX (Max) 16.3 100 0.00017 40.1 100 0.0054 80.1 100 0.00017 228 100 0.0

Host PC commands:

  • Test with datagram length of 64B:

    iperf -c <evm_ip> -u -l64 -b<bw> -r
    where <bw> is 5M, 10M, 15M, etc
    
  • Test with datagram length of 256B:

    iperf -c <evm_ip> -u -l256 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 512B:

    iperf -c <evm_ip> -u -l512 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 1470B (max):

    iperf -c <evm_ip> -u -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    

1.1.2.5. UDMA

1.1.2.5.1. DMA Parameters
  • Ring Order ID: 0
  • Channel Order ID: 0
  • Channel DMA Priority: 1
  • Channel Bus Priority: 4
  • Channel BUS QOS: 4
  • Channel TX FIFO depth: 128
  • Channel Fetch Word Size: 16
  • Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels
1.1.2.5.2. Test Parameters
  • Type: TR15 Block copy
  • TR: one TR per TRPD in PBR mode
  • TR Memory: Same as buffer memory (DDR, MSMC or OCMC depends on the test performed)
  • Transfer Size: 1 MB read and 1MB write
  • 1MB means 1000x1000 bytes and 1KB means 1000 bytes

Note: Throughput numbers mentioned is the combined memory throughput of both read and write operations

1.1.2.5.3. DRU Blockcopy

DRU channel performance with TR submitted through ring

Test Description Throughput (MCU2) CPU Load (MCU2) Throughput (C66x_1/2) CPU Load (C66x_1/2) Throughput (C7x_1) CPU Load (C7x_1)
[PDK-3501] 1CH DDR 1MB to DDR 1MB 11262 MB/sec 11% 12011 MB/sec 4% 11202 MB/sec 6%
[PDK-3502] 1CH MSMC 1KB Circular to DDR 1MB 17757 MB/sec 13% 18657 MB/sec 5% 17652 MB/sec 8%
[PDK-3503] 1CH DDR 1MB to MSMC circular 1KB 20580 MB/sec 8% 22844 MB/sec 5% 20420 MB/sec 8%
[PDK-3504] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) 27377 MB/sec 14% 29413 MB/sec 6% 27200 MB/sec 8%
[PDK-3505] Multi CH DDR 1MB to DDR 1MB 12446 MB/sec (2CH) 25% 12679 MB/sec (4CH) 7% 10597 MB/sec (4CH) 14%
[PDK-3506] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) 33581 MB/sec (2CH) 33% 33554 MB/sec (4CH) 18% 17955 MB/sec (4CH) 14%
1.1.2.5.5. MCU NAVSS Blockcopy (Normal Channel)

MCU NAVSS normal channel performance with TR submitted through ring

Test Description Throughput (MCU1) CPU Load (MCU1)
[PDK-3490] 1CH DDR 1MB to DDR 1MB 659 MB/sec 1%
[PDK-3491] 1CH MSMC 1KB Circular to DDR 1MB 982 MB/sec 1%
[PDK-3492] 1CH DDR 1MB to MSMC circular 1KB 717 MB/sec 1%
[PDK-3493] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) 960 MB/sec 1%
[PDK-3489] 1CH OCMC 1KB to OCMC circular 1KB (1MB per TR) 2460 MB/sec 1%
[PDK-3495] Multi CH DDR 1MB to DDR 1MB 1186 MB/sec (2CH) 2%
[PDK-3497] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) 1640 MB/sec (2CH) 2%

1.1.2.6. IPC

1.1.2.6.1. Test Set-up
  • Release build binaries are used for measurement

  • Ring Buffer : Uncached DDR

  • Buffer to be sent (RPMSG) – Cached DDR

  • C66x - L2 Cache 128K

  • C7x - L2 Cache 128K

  • Software/Application Used : ipc_multicore_perf_test loaded through SBL. Output is printed to UART.

  • R5F/MPU config : DDR config

    • bufferable - 1
    • cacheable - 1
    • shareable - 0

Capturing Round trip time in us with different data sizes

1.1.2.6.2. Performance - Host Core A72, Bios, 2 GHz
Remote Core 4 Bytes 8 Bytes 16 Bytes 32 Bytes 64 Bytes 128 Bytes 256 Bytes
MCU R5F0 20 20 22 25 32 44 70
Main R5F0 18 19 20 24 29 41 65
C66x1 17 16 17 16 18 20 25
C7x 20 20 20 20 23 24 25
1.1.2.6.3. Performance - Host Core MCU R5F0, 1 GHz
Remote Core 4 Bytes 8 Bytes 16 Bytes 32 Bytes 64 Bytes 128 Bytes 256 Bytes
A72 (bios) 21 21 23 26 32 43 68
Main R5F0 17 18 19 22 28 39 65
C66x1 17 17 19 22 28 40 64
C7x 18 18 20 23 29 40 66
1.1.2.6.4. Performance - Host Core MAIN R5F0, 1 GHz
Remote Core 4 Bytes 8 Bytes 16 Bytes 32 Bytes 64 Bytes 128 Bytes 256 Bytes
A72 (Bios) 17 17 18 21 26 37 59
MCU R5F0 16 15 17 20 25 35 58
Main R5F1 16 16 17 21 26 36 59
C66x1 16 15 17 20 25 36 58
C7x 16 16 17 20 25 36 58
1.1.2.6.5. Performance - Host Core C66X1, 1.35 GHz
Remote Core 4 Bytes 8 Bytes 16 Bytes 32 Bytes 64 Bytes 128 Bytes 256 Bytes
A72 (Bios) 19 18 18 18 18 22 26
MCU R5F0 26 26 28 30 37 52 81
Main R5F0 25 25 27 29 35 48 75
C66x2 23 22 22 21 23 28 35
C7x 30 29 29 28 31 34 37
1.1.2.6.6. Performance - Host Core C7x, 1GHz
Remote Core 4 Bytes 8 Bytes 16 Bytes 32 Bytes 64 Bytes 128 Bytes 256 Bytes
A72 (Bios) 21 21 21 21 24 23 25
Mcu R5F0 32 32 34 37 45 55 82
Main R5F0 28 29 30 34 42 51 75
C66x1 29 28 28 27 20 31 36

1.1.2.7. OSPI

1.1.2.7.1. OSPI Memory Non Cached Test Set-up
  • Platform: J721e EVM.
  • OS Type: Baremetal/FreeRTOS
  • Core : R5F_0 at 1 GHz, A72_0 at 2 GHz.
  • Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp/OSPI_Baremetal_Flash_TestApp/OSPI_Baremetal_Flash_Dma_TestApp
  • System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.7.2. OSPI Read/Write Performance (DDR Octal Mode)
OSPI RCLK OS CPU Mode Write Tput (MB/s) Write CPU Load Read Tput (MB/s) Read CPU Load
133 MHz Baremetal R5F_0 DAC 0.083   7.039  
DAC DMA 1.567   264.524  
INDAC 1.565   8.331  
A72_0 DAC 0.086   3.571  
DAC DMA 1.567   265.798  
INDAC 1.569   8.332  
RTOS R5F_0 DAC 0.083 100% 7.038 51%
DAC DMA 1.567 68% 262.669 19%
INDAC 1.566 74% 8.330 0%
166 MHz Baremetal R5F_0 DAC 0.090   8.212  
DAC DMA 1.646   329.430  
INDAC 1.640   10.413  
A72_0 DAC 0.089   4.240  
DAC DMA 1.581   331.617  
INDAC 1.638   10.415  
RTOS R5F_0 DAC 0.090 100% 8 51%
DAC DMA 1.646 69% 326.964 1%
INDAC 1.643 75% 10.411 0%
1.1.2.7.3. OSPI Memory Cached Test Set-up
  • Platform: J721e EVM.
  • OS Type: Baremetal/FreeRTOS
  • Core : R5F_0 at 1 GHz, A72_0 at 2 GHz.
  • Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp/OSPI_Baremetal_Flash_Cache_TestApp/OSPI_Baremetal_Flash_Dma_Cache_TestApp
  • System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.7.4. OSPI Read/Write Performance (DDR Octal Mode)
OSPI RCLK OS CPU Mode Write Tput (MB/s) Write CPU Load Read Tput (MB/s) Read CPU Load
133 MHz Baremetal R5F_0 DAC 0.325   46.282  
DAC DMA 1.567   264.458  
INDAC 1.567   8.331  
RTOS R5F_0 DAC 0.325 100% 46.245 100%
DAC DMA 1.567 71% 262.537 67%
INDAC 1.563 74% 8.329 4%
166 MHz Baremetal R5F_0 DAC 0.360   57.497  
DAC DMA 1.646   329.533  
INDAC 1.640   10.413  
RTOS R5F_0 DAC 0.360 100% 57.440 100%
DAC DMA 1.646 72% 326.761 73%
INDAC 1.639 76% 10.411 4%

1.1.2.8. MMCSD

1.1.2.8.1. Test Set-up
  • Platform: J721e EVM.
  • OS Type: Sysbios
  • Core : A72_0, 2 GHz.
  • Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (A menu based application which outputs the benchmark numbers on UART)
  • System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.
  • SD Card used: Sandisk 16GB, Class 10. FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput & compatibility with various cards)
  • EMMC: EMMC on J721E EVM. Please refer to the EVM data sheet for details
1.1.2.8.2. SD Card Performance
1.1.2.8.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s) FATFS Write Throughput (MB/s) FATFS Read Throughput (MB/s)
256 9.1059 9.4340 4.1804 7.5307
512 9.8377 10.4257 4.5550 8.0084
1024 10.0432 10.7388 4.9630 8.2052
2048 10.4119 10.9066 5.8666 8.0361
5120 10.0376 10.9829 4.7683 8.3273
1.1.2.8.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s) FATFS Write Throughput (MB/s) FATFS Read Throughput (MB/s)
256 15.9483 16.4356 4.3909 11.8113
512 18.5548 19.6683 6.2893 12.6380
1024 19.9566 20.8116 6.5560 13.1697
2048 19.9830 21.4463 6.5847 13.4176
5120 20.0178 21.8337 6.2207 13.4776
1.1.2.8.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s) FATFS Write Throughput (MB/s) FATFS Read Throughput (MB/s)
256 9.0146 9.4187 4.2206 7.4148
512 9.7703 10.4165 4.9643 8.0081
1024 10.0714 10.7345 4.7311 8.2015
2048 9.6667 10.8930 5.0503 8.3087
5120 10.0025 11.0095 4.8343 8.3287
1.1.2.8.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s) FATFS Write Throughput (MB/s) FATFS Read Throughput (MB/s)
256 16.2732 16.4143 5.6652 11.2796
512 18.3847 19.6669 6.3413 12.6358
1024 19.0623 20.8100 6.5959 13.1657
2048 17.4704 21.3765 6.3836 13.4073
5120 19.6133 21.8508 6.0397 12.5147
1.1.2.8.2.5. SDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s) FATFS Write Throughput (MB/s) FATFS Read Throughput (MB/s)
256 24.6037 26.1130 4.5208 7.6322
512 29.9576 35.3214 4.9401 7.9848
1024 32.6505 39.1811 4.9564 8.1912
2048 30.3629 41.3373 4.9362 8.2954
5120 34.7683 43.0374 4.8785 8.3285
1.1.2.8.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s) FATFS Write Throughput (MB/s) FATFS Read Throughput (MB/s)
256 23.4774 25.6365 4.2197 7.5511
512 26.2276 34.4773 4.4524 7.9936
1024 34.0707 38.1547 4.9994 8.2083
2048 29.2400 40.1979 5.0277 8.3036
5120 32.5992 41.6822 4.8337 8.3316
1.1.2.8.3. EMMC Performance
1.1.2.8.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s)
256 15.9600 18.5776
512 18.1068 20.1941
1024 19.4310 21.1389
2048 20.1785 21.6574
5120 20.6573 21.9851
1.1.2.8.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s)
256 25.6862 31.8970
512 31.7678 36.9522
1024 36.0882 40.2272
2048 38.7699 42.1508
5120 39.6647 43.3818
1.1.2.8.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s)
256 34.8107 47.9176
512 41.8965 60.3240
1024 48.6215 69.5793
2048 53.9672 75.5317
5120 56.1397 79.6654
1.1.2.8.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s)
256 37.8881 68.9168
512 46.4331 97.8488
1024 50.7672 124.6944
2048 54.6804 145.1625
5120 55.0597 160.8638
1.1.2.8.3.5. HS-400 Mode (200 MHz, 8-bit) Theoretical Max: 400 MB/s
Size of transfer (KB) RAW Write Throughput (MB/s) RAW Read Throughput (MB/s)
256 36.2206 84.0709
512 47.7269 130.8260
1024 51.6706 184.4708
2048 55.3375 203.5146
5120 56.7088 208.5778

1.1.2.9. CSL-FL based Optimized OSPI Example

1.1.2.9.1. CPU Mode - Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: csl_ospi_flash_app

  • System Configuration:
    • RCLK 133/166 MHz
    • Cache ON,
    • Buffer & Critical Fxn’s in TCMB,
    • DMA Disabled,
    • Interrupts OFF.
  • Theoretical Max Throughput:
    • 133 MHz :- 253.67 MB/s
    • 166 MHz :- 316.62 MB/s
1.1.2.9.2. DAC Mode OSPI Read Performance (Dual Data Rate - Octal Mode)
OSPI RCLK Size of transfer (B) Read Time (ns) Throughput (MB/s)
133 MHz 16 815 19.6
32 1445 22.1
64 2700 23.7
128 5225 24.5
256 10265 24.9
512 20360 25.1
1024 40510 25.3
166 MHz 16 945 16.9
32 2330 13.7
64 4580 14.0
128 9105 14.1
256 18145 14.1
512 36185 14.1
1024 72295 14.2
1.1.2.9.3. DMA Mode - Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: udma_baremetal_ospi_flash_testapp

  • System Configuration:
    • RCLK 133/166 MHz
    • Cache ON,
    • Buffer & Critical Fxn’s in TCMB,
    • DMA Enabled - SW Trigger mode,
    • Interrupts OFF.
  • Theoretical Max Throughput:
    • 133 MHz :- 253.67 MB/s
    • 166 MHz :- 316.62 MB/s
1.1.2.9.4. DAC DMA Mode OSPI Read Performance (Dual Data Rate - Octal Mode)
OSPI RCLK Size of transfer (B) Read Time (ns) Throughput (MB/s)
133 MHz 16 800 20
32 805 39.8
64 970 66
128 1315 97.3
256 1955 130.9
512 3120 164.1
1024 5450 187.9
166 MHz 16 675 23.7
32 805 39.8
64 850 75.3
128 1180 108.5
256 1685 151.9
512 2730 187.5
1024 4670 219.3

1.1.2.10. SBL OSPI Boot Performance App

1.1.2.10.1. Test Set-up
  • Platform: J721e EVM.
  • OS Type: Baremetal
  • Core : R5F_0 at 1 GHz
  • Software/Application Used: sbl_cust_img (with custom flags) and sbl_boot_perf_test appimage
  • Please note that these performance numbers were from 8.2 release.
1.1.2.10.2. GP EVM Performance
SBL Boot Time Breakdown Time (ms)
MCU_PORZ_OUT to MCU_RESETSTATz 0.63
ROM : init + SBL load from OSPI 12.00
SBL : SBL_SciClientInit: ReadSysfwImage 6.116
Load/Start SYSFW 4.022
Sciclient_init 3.166
Board Config 2.010
PM Config 0.133
Security Config 0.605
RM Config 0.762
SBL : Board_init (PINMUX) 2.826
SBL: SoC Late-Init
SBL : Board_init (PLL) 1.545
SBL: Board_init (CLOCKS) 1.055
SBL: OSPI init 0.129
SBL: sbl_misc 0.041
SBL: App copy to MCU SRAM & Jump to App 3.90
MCUSW: CAN response 1.00
TOTAL time 46.858

1.1.2.11. OSPI Memory Configuration Benchmarking

  • These numbers were collected from the memory_benchmarking_app demo which provides a means of measuring the performance of a realistic application where the text of the application is sitting in various memory locations and the data is sitting in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).
  • The application executes 10 different configurations of the same text varying by data vs. instruction cache intensity. Each test calls 16 separate functions 500 total times in random order.
  • The most instruction intensive example achieves a instruction cache miss rate (ICM/sec) of ~3-4 million per second when run entirely from OCMRAM. This is a rate that we have similarly seen in real-world customer examples.
  • More data instensive tests have more repetitive code, achieving much lower ICM rates
  • When “Multicore” Configuration is used, it is defined as the execution of the same AUTOSAR application executed simultaneously by means of a synchronization delay on MCU Core 0 (mcu1_0) and MAIN Core 0 (mcu2_0)
  • The Memcpy size is just a knob to make the synthetic benchmark application more data or instruction centric with no additional significance. (small memcpy size is more instruction centric with more ICM rate and vice versa)

1.1.2.12. Supported Configurations

Core SOC Supported Memory Configurations (MEM_CONF)
mcu1_0 j721e ocmc msmc ddr xip
mcu2_0 j721e ocmc msmc ddr xip
mcu1_0 + mcu2_0 j721e ddr xip
1.1.2.12.1. Test Set-up
  • Platform: J721e EVM.
  • OS Type: FreeRTOS
  • Core – MCU Domain R5_0 (MCU1_0) & Main Domain R5_0 (MCU2_0)
  • Software/Application Used: sbl_ospi_img and [MEM_CONF]_memory_benchmarking_app_freertos appimage
1.1.2.12.2. MCU Domain Single Core Execution
  • Cache miss rate of 3M/sec is at memcpy size ~500 bytes.
Memcpy Size   0 50 500 1000 2048
OCMC OCMC Baseline Execution Time (us) 4663 4919 7765 9845 19864
ICM/sec 3795200 3562720 2205920 1770850 902990
DDR DDR execution time (us) 8305 8543 10994 12880 22851
DDR / OCMC Baseline 1.781 1.737 1.416 1.308 1.150
MSMC MSMC execution time (us) 6355 6719 9069 11136 20434
MSMC / OCMC Baseline 1.363 1.366 1.168 1.131 1.029
XIP XIP 133 MHz execution time (us) 18408 18664 21562 24082 38343
XIP 133 MHz / OCMC Baseline 3.948 3.794 2.777 2.446 1.93
XIP 166 MHz execution time (us) 14838 15245 17973 20372 33936
XIP 166 MHz / OCMC Baseline 3.182 3.099 2.315 2.069 1.708
1.1.2.12.3. MAIN Domain Single Core Execution
  • Cache miss rate of 3M/sec is at memcpy size of ~0 bytes.
Memcpy Size   0 50 500 1000 2048
OCMC OCMC Baseline Execution Time (us) 6036 6312 10037 12069 26220
ICM/sec 3107020 2983680 1798050 1503850 718611
DDR DDR execution time (us) 9242 9546 13132 15586 30625
DDR / OCMC Baseline 1.531 1.512 1.308 1.291 1.168
MSMC MSMC execution time (us) 7640 7805 11385 13806 28436
MSMC / OCMC Baseline 1.266 1.237 1.134 1.133 1.085
XIP XIP 133 MHz execution time (us) 17515 17979 21713 24437 41887
XIP 133 MHz / OCMC Baseline 2.902 2.848 2.163 2.025 1.598
XIP 166 MHz execution time (us) 16024 16291 21090 23278 40347
XIP 166 MHz / OCMC Baseline 2.655 2.581 2.101 1.929 1.539
1.1.2.12.4. MCU Domain Multi-Core Execution
  • Cache miss rate of 3M/sec is at memcpy size ~500 bytes.
Memcpy Size   0 50 500 1000 2048
OCMC OCMC Baseline Execution Time (us) 3937 4169 6895 8982 18906
ICM/sec 3875030 3635160 2111960 1641280 798000
DDR DDR execution time (us) 6947 7115 9503 11546 21434
DDR / OCMC Baseline 1.765 1.707 1.378 1.285 1.134
XIP XIP 133 MHz execution time (us) 21855 21627 24340 27134 40861
XIP 133 MHz / OCMC Baseline 5.551 5.188 3.53 3.021 2.161
XIP 166 MHz execution time (us) 18374 18509 20716 23269 35770
XIP 166 MHz / OCMC Baseline 4.667 4.440 3.004 2.591 1.892
1.1.2.12.5. MAIN Domain Multi-Core Execution
  • Cache miss rate of 3M/sec is at memcpy size of ~0 bytes.
Memcpy Size   0 50 500 1000 2048
OCMC OCMC Baseline Execution Time (us) 5317 5644 9109 11169 25285
ICM/sec 3713750 3516480 2112740 1774380 830373
DDR DDR execution time (us) 9189 9462 12350 15140 30428
DDR / OCMC Baseline 1.728 1.676 1.356 1.356 1.203
XIP XIP 133 MHz execution time (us) 26740 27046 30984 33206 51039
XIP 133 MHz / OCMC Baseline 5.029 4.792 3.401 2.973 2.019
XIP 166 MHz execution time (us) 22835 23164 26311 28894 45654
XIP 166 MHz / OCMC Baseline 4.295 4.104 2.888 2.587 1.806
1.1.2.12.6. Extra OCMC Baseline Details - MCU Domain
  • View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of ~500 bytes.
Mem Cpy Size 0 50 100 200 500 750 1000 1250 1500 2048
Start Time in Usec 54039 333046 614046 897045 1180042 1466038 1753039 2042038 2335036 2630038
Exec Time in Usec 4663 4919 5149 5711 7765 8924 9845 12219 14816 19864
Task Calls 500 500 500 500 500 500 500 500 500 500
Inst Cache Miss 17697 17525 17629 18069 17129 17586 17434 17887 18072 17937
Inst Cache Acc 1037352 1144030 1241053 1444733 2046048 2560479 3057361 3588774 4100627 5247423
Num Instr Exec 1289635 1466490 1639789 1992303 3028824 3918782 4785630 5674469 6537458 8458758
ICM/sec 3795196 3562715 3423771 3163894 2205924 1970640 1770848 1463867 1219762 902990
INST/sec 276567660 298127668 318467469 348853615 390061043 439128417 486097511 464397168 441243115 425833568
1.1.2.12.7. Extra OCMC Baseline Details - MAIN Domain
  • View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of ~0 bytes.
Mem Cpy Size 0 50 100 200 500 750 1000 1250 1500 2048
Start Time in Usec 53044 332050 614048 897048 1181047 1470045 1760044 2052045 2347043 2645045
Exec Time in Usec 6036 6312 6758 7534 10037 11197 12069 15613 19311 26220
Task Calls 500 500 500 500 500 500 500 500 500 500
Inst Cache Miss 18754 18833 18473 19525 18047 18554 18150 19296 19015 18842
Inst Cache Acc 997171 1100220 1203451 1404605 2007865 2520578 3016345 3547178 4059553 5202267
Num Instr Exec 1291289 1465254 1641057 1992809 3029927 3918807 4786278 5675441 6538372 8460201
ICM/sec 3107024 2983681 2733501 2591584 1798047 1657050 1503852 1235893 984671 718611
INST/sec 213931245 232137832 242831754 264508760 301875759 349987228 396576186 363507397 338582776 322662128