1. J721S2 Datasheet¶
1.1. Introduction¶
This section provides performance numbers for the device drivers supported in the PDK.
1.1.1. Setup Details¶
SOC Details | Values |
---|---|
Core | R5F |
Core Operating Speed | 1 GHz |
DDR Speed | 4266 MT/s |
Cache status | Enabled |
Optimization Details | Values |
---|---|
Profile | Release |
Compile Options for R5F | -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 --endian=little -mv7R5 --abi=eabi -eo.oer5f -ea.ser5f --symdebug:dwarf --embed_inline_assembly --float_support=vfpv3d16 --emit_warnings_as_errors |
Linker Options for R5F | --emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 --diag_suppress=10063 -x --zero_init=on |
Code Placement | DDR |
Data Placement | DDR |
1.1.2. Software Performance Numbers¶
1.1.2.1. DSS¶
Display Type | Configuration | FPS | CPU Load |
---|---|---|---|
DP | 1080P60 BGRA32 | 60 | 1.0% (MCU2_0) |
1.1.2.2. CSI-RX¶
Instance | Configuration | Time taken to receive one frame | ISR latency |
---|---|---|---|
CSI2Rx Inst 0 | 1CH 1080P30 IMX390 Sensor Raw12 | 33.3 ms (MCU2_0) | 6 us (MCU2_0) |
1.1.2.3. CSI-Tx¶
Instance | Configuration | Time taken to transmit one frame | ISR latency |
---|---|---|---|
CSI2Tx Inst 0 | 1CH 1080P 2.5GBPS IMX390 Sensor Raw12 | 7.09 ms (MCU2_0) | 13 us (MCU2_0) |
1.1.2.4. UDMA¶
1.1.2.4.1. DMA Parameters¶
Ring Order ID: 0
Channel Order ID: 0
Channel DMA Priority: 1
Channel Bus Priority: 4
Channel BUS QOS: 4
Channel TX FIFO depth: 128
Channel Fetch Word Size: 16
Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels
1.1.2.4.2. Test Parameters¶
Type: TR15 Block copy
TR: one TR per TRPD in PBR mode
TR Memory: Same as buffer memory (DDR, MSMC or OCMC depends on the test performed)
Transfer Size: 1 MB read and 1MB write
1MB means 1000x1000 bytes and 1KB means 1000 bytes
Note: The reported throughput numbers are the combined memory throughput of the read and write operations.
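As a rough illustration of how a combined figure of this kind can be derived, the sketch below adds the bytes read and the bytes written and divides by the elapsed time, using the decimal megabyte defined above. The function name and the 100 us timing are hypothetical and are not taken from the PDK test application.

```python
MB = 1000 * 1000  # this document defines 1MB as 1000x1000 bytes


def combined_throughput_mb_per_s(bytes_read, bytes_written, elapsed_s):
    """Combined read + write memory throughput in MB/s."""
    return (bytes_read + bytes_written) / MB / elapsed_s


# Hypothetical timing: a 1MB block copy (1MB read + 1MB written) taking 100 us.
example = combined_throughput_mb_per_s(1 * MB, 1 * MB, 100e-6)
print(f"{example:.0f} MB/s")
```

Because both directions are counted, a 1MB block copy contributes 2MB of memory traffic to the reported number.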
1.1.2.4.3. DRU Blockcopy¶
DRU channel performance with TR submitted through ring
Test Description | Throughput (MCU2) | CPU Load (MCU2) | Throughput (C7x_1) | CPU Load (C7x_1) |
---|---|---|---|---|
[PDK-3501] 1CH DDR 1MB to DDR 1MB | 14593 MB/sec | 5% | 14810 MB/sec | 9% |
[PDK-3502] 1CH MSMC 1KB Circular to DDR 1MB | 36472 MB/sec | 7% | 33394 MB/sec | 11% |
[PDK-3503] 1CH DDR 1MB to MSMC circular 1KB | 25922 MB/sec | 4% | 24585 MB/sec | 10% |
[PDK-3504] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) | 55188 MB/sec | 7% | 43965 MB/sec | 11% |
[PDK-3505] Multi CH DDR 1MB to DDR 1MB | 19636 MB/sec (2CH) | 12% | 14251 MB/sec (2CH) | 17% |
[PDK-3506] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) | 52298 MB/sec (2CH) | 25% | 19535 MB/sec (2CH) | 16% |
1.1.2.5. OSPI¶
1.1.2.5.1. OSPI Memory Non Cached Test Set-up¶
Platform: J721S2 EVM.
OS Type: Baremetal/FreeRTOS.
Core : R5F_0 at 1 GHz.
Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp
System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.5.2. OSPI Phy Tuning Time (DDR Octal Mode)¶
OSPI RCLK | Tuning Time |
---|---|
133 MHz | 7.801 |
166 MHz | 6.980 |
Note: PHY tuning time varies across silicon samples and PHY tuning point varies with voltage and temperature.
1.1.2.5.3. OSPI Read/Write Performance (DDR Octal Mode)¶
1.1.2.5.3.1. S28 (NOR)¶
OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s) |
---|---|---|---|---|---|---|
133 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 7.467 | 51% | 266 |
133 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 264.658 | 1% | 266 |
133 MHz | INDAC | 0.485 | 100% | 8.332 | 51% | 266 |
166 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 8.568 | 51% | 332 |
166 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 329.948 | 1% | 332 |
166 MHz | INDAC | 0.487 | 100% | 10.414 | 51% | 332 |
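The theoretical maximum values in the table above are consistent with a simple bus calculation: an 8-bit (octal) bus moving one byte on each clock edge (DDR). The sketch below reproduces them and compares a measured DAC DMA read against that ceiling. The formula is inferred from the tabulated values rather than stated by the source, and the function names are hypothetical.

```python
def octal_ddr_peak_mb_per_s(rclk_mhz):
    """Peak read rate of an 8-bit bus that transfers on both clock edges."""
    bytes_per_transfer = 1   # octal bus: 8 data lines carry 1 byte per transfer
    transfers_per_clock = 2  # DDR: one transfer on each clock edge
    return rclk_mhz * transfers_per_clock * bytes_per_transfer


def bus_efficiency(measured_mb_per_s, rclk_mhz):
    """Fraction of the peak rate that a measured throughput reaches."""
    return measured_mb_per_s / octal_ddr_peak_mb_per_s(rclk_mhz)


print(octal_ddr_peak_mb_per_s(133), octal_ddr_peak_mb_per_s(166))
# Measured DAC DMA read throughput at 133 MHz, taken from the table above.
print(round(bus_efficiency(264.658, 133), 3))
```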
1.1.2.5.3.2. W35N (NAND)¶
Mode | Frequency | Read Tput (MB/s) | Read Tput Theoretical Max (MB/s) |
---|---|---|---|
DAC | 50 MHz | 3.782 | 33 |
DAC DMA | 166 MHz | 55.763 | 62 |
Note: The theoretical maximum for W35N is calculated assuming a page load time of 42 us.
1.1.2.5.4. OSPI Memory Cached Test Set-up¶
Platform: J721S2 EVM.
OS Type: Baremetal/FreeRTOS.
Core : R5F_0 at 1 GHz.
Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp
System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.5.5. OSPI Read/Write Performance (DDR Octal Mode)¶
1.1.2.5.5.1. S28 (NOR)¶
OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s) |
---|---|---|---|---|---|---|
133 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 81.722 | 51% | 266 |
133 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 264.324 | 1% | 266 |
133 MHz | INDAC | 0.486 | 100% | 8.332 | 51% | 266 |
166 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 93.916 | 51% | 332 |
166 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 320.156 | 2% | 332 |
166 MHz | INDAC | 0.492 | 100% | 10.414 | 51% | 332 |
1.1.2.5.5.2. W35N (NAND)¶
Mode | Frequency | Read Tput (MB/s) | Read Tput Theoretical Max (MB/s) |
---|---|---|---|
DAC | 50 MHz | 32.633 | 33 |
DAC DMA | 166 MHz | 57.667 | 62 |
Note: The theoretical maximum for W35N is calculated assuming a page load time of 42 us.
1.1.2.6. MMCSD¶
1.1.2.6.1. Test Set-up¶
Platform: J721S2 EVM.
OS Type: FreeRTOS
Core : R5F_1 at 1 GHz.
Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (A menu based application which outputs the benchmark numbers on UART)
System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.
SD Card used: SanDisk 16GB, Class 10, FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput and compatibility with various cards)
EMMC: EMMC on the J721S2 EVM. Refer to the EVM data sheet for details.
1.1.2.6.2. SD Card Performance¶
1.1.2.6.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 9.482 | 11.196 |
512 | 10.770 | 11.392 |
1024 | 10.964 | 11.439 |
2048 | 10.982 | 11.465 |
5120 | 11.042 | 11.480 |
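The theoretical maximum quoted in each SD and EMMC mode heading follows from the bus clock, the bus width, and whether data moves on one or both clock edges. The sketch below is a minimal illustration of that arithmetic, checked against the DS mode figures above. The function names are hypothetical, and the calculation ignores command and protocol overhead.

```python
def bus_peak_mb_per_s(clock_mhz, bus_width_bits, ddr=False):
    """Raw bus ceiling in MB/s, ignoring command and protocol overhead."""
    edges_per_clock = 2 if ddr else 1
    return clock_mhz * bus_width_bits * edges_per_clock / 8


def utilisation(measured_mb_per_s, clock_mhz, bus_width_bits, ddr=False):
    """Measured throughput as a fraction of the raw bus ceiling."""
    return measured_mb_per_s / bus_peak_mb_per_s(clock_mhz, bus_width_bits, ddr)


# SD DS mode: 25 MHz, 4-bit, single data rate.
print(bus_peak_mb_per_s(25, 4))
# Largest DS mode read (5120 KB transfer) from the table above.
print(round(utilisation(11.480, 25, 4), 3))
```

The same function covers the double data rate modes by passing `ddr=True`, for example an 8-bit bus at 200 MHz.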
1.1.2.6.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 16.327 | 21.705 |
512 | 20.542 | 22.468 |
1024 | 21.270 | 22.647 |
2048 | 21.329 | 22.746 |
5120 | 21.210 | 22.803 |
1.1.2.6.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 8.693 | 11.201 |
512 | 10.720 | 11.391 |
1024 | 10.986 | 11.441 |
2048 | 10.967 | 11.465 |
5120 | 10.453 | 11.480 |
1.1.2.6.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 13.945 | 21.721 |
512 | 20.525 | 22.459 |
1024 | 21.258 | 22.647 |
2048 | 21.289 | 22.745 |
5120 | 21.585 | 22.803 |
1.1.2.6.2.5. SDR50 Mode (100 MHz, 4-bit) Theoretical Max: 50 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 20.438 | 40.997 |
512 | 34.332 | 43.690 |
1024 | 39.105 | 44.396 |
2048 | 40.363 | 44.769 |
5120 | 39.606 | 44.990 |
1.1.2.6.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 19.974 | 39.896 |
512 | 32.693 | 42.444 |
1024 | 37.624 | 43.147 |
2048 | 38.468 | 43.486 |
5120 | 26.301 | 43.633 |
1.1.2.6.3. EMMC Performance¶
1.1.2.6.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 15.853 | 18.389 |
512 | 18.001 | 20.035 |
1024 | 19.298 | 20.978 |
2048 | 20.081 | 21.471 |
5120 | 20.508 | 21.802 |
1.1.2.6.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 25.794 | 31.332 |
512 | 31.722 | 36.411 |
1024 | 35.704 | 39.655 |
2048 | 38.248 | 41.503 |
5120 | 39.957 | 42.696 |
1.1.2.6.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 33.201 | 46.655 |
512 | 43.128 | 58.893 |
1024 | 48.972 | 67.882 |
2048 | 49.117 | 73.482 |
5120 | 57.078 | 77.309 |
1.1.2.6.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 31.236 | 48.384 |
512 | 42.985 | 61.611 |
1024 | 48.643 | 71.490 |
2048 | 52.649 | 77.705 |
5120 | 55.582 | 81.985 |
1.1.2.6.3.5. HS-400 Mode (200 MHz, 8-bit) Theoretical Max: 400 MB/s¶
Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
---|---|---|
256 | 20.637 | 65.088 |
512 | 43.414 | 91.018 |
1024 | 49.904 | 114.36 |
2048 | 56.637 | 131.11 |
5120 | 59.072 | 143.79 |
1.1.2.7. CPSW_2G¶
1.1.2.7.1. Test Setup¶
Hardware Configuration | Value |
---|---|
Processing Core | Main R5F0 Core 0 |
Core Frequency | 1 GHz |
Ethernet Interface Type | RGMII at 1Gbps |
Packet buffer memory | DDR |
Hardware checksum offload | Yes |
Scatter-gather TX | Yes |
Scatter-gather RX | No |
Software Configuration | Value |
---|---|
RTOS | FreeRTOS |
RTOS application | Enet LLD lwIP example |
TCP/IP stack | lwIP 2.2.0 |
Host PC tool version | iperf v2.0.10 |
1.1.2.7.2. TCP Performance¶
Test | Bandwidth (Mbps) | CPU Load (%) |
---|---|---|
TCP RX | 187 | 44 |
TCP TX | 186 | 67 |
TCP Bidirectional | RX=169 TX=164 | 96 |
Host PC commands:
iperf -c <evm_ip> -r
iperf -c <evm_ip> -d
1.1.2.7.3. UDP Performance¶
Test | 64B Bandwidth (Mbps) | 64B CPU Load (%) | 64B Packet Loss (%) | 256B Bandwidth (Mbps) | 256B CPU Load (%) | 256B Packet Loss (%) | 512B Bandwidth (Mbps) | 512B CPU Load (%) | 512B Packet Loss (%) | 1470B Bandwidth (Mbps) | 1470B CPU Load (%) | 1470B Packet Loss (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
UDP RX | 5.24 | 17 | 0.000 | 26.2 | 36 | 0.000 | 26.2 | 25 | 0.000 | 26.2 | 13 | 0.000 |
UDP RX | 10.5 | 29 | 0.000 | 52.4 | 62 | 0.000 | 52.4 | 35 | 0.000 | 52.4 | 20 | 0.000 |
UDP RX | 15.7 | 41 | 0.000 | 105 | | | 105 | 67 | 0.000 | 105 | 34 | 0.000 |
UDP RX (Max) | 37 | 96 | 0.013 | 82 | 99 | 0.016 | 150 | 99 | 0.008 | 325 | 100 | 0.037 |
UDP TX (Max) | 39.6 | 100 | 0.000 | 90.5 | 100 | 0.000 | 180 | 100 | 0.000 | 500 | 100 | 0.000 |
Host PC commands:
Test with datagram length of 64B:
iperf -c <evm_ip> -u -l64 -b<bw> -r where <bw> is 5M, 10M, 15M, etc
Test with datagram length of 256B:
iperf -c <evm_ip> -u -l256 -b<bw> -r where <bw> is 25M, 50M, 100M, etc
Test with datagram length of 512B:
iperf -c <evm_ip> -u -l512 -b<bw> -r where <bw> is 25M, 50M, 100M, etc
Test with datagram length of 1470B (max):
iperf -c <evm_ip> -u -b<bw> -r where <bw> is 25M, 50M, 100M, etc
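The host commands above differ only in datagram length and offered bandwidth, so the sweep can be generated rather than typed by hand. The sketch below builds the same iperf argument lists using only the flags shown above (`-c`, `-u`, `-l`, `-b`, `-r`). The helper name, the placeholder EVM address, and the choice of three bandwidth steps per length are assumptions for illustration.

```python
def iperf_udp_command(evm_ip, bandwidth, datagram_len=None):
    """Build the host-side iperf argument list for one UDP test point."""
    cmd = ["iperf", "-c", evm_ip, "-u"]
    if datagram_len is not None:
        # The 1470B case relies on iperf's default length, so -l is omitted.
        cmd.append(f"-l{datagram_len}")
    cmd += [f"-b{bandwidth}", "-r"]
    return cmd


EVM_IP = "192.168.0.2"  # placeholder address, replace with the EVM's IP

# Datagram length -> offered bandwidths, following the steps listed above.
sweep = {
    64: ["5M", "10M", "15M"],
    256: ["25M", "50M", "100M"],
    512: ["25M", "50M", "100M"],
    None: ["25M", "50M", "100M"],  # default length (1470B)
}

for length, bandwidths in sweep.items():
    for bw in bandwidths:
        print(" ".join(iperf_udp_command(EVM_IP, bw, length)))
```

Each printed line can be pasted into a shell, or the list form can be handed to `subprocess.run` on the host PC.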
1.1.2.8. SBL OSPI Boot Performance App¶
1.1.2.8.1. Test Set-up¶
Platform: J721S2 EVM.
OS Type: Baremetal
Core : R5F_0 at 1 GHz
Software/Application Used: sbl_boot_perf_cust_img and sbl_boot_perf_test appimage
Note that app image load time could vary depending on the actual image size
1.1.2.8.2. GP EVM Performance¶
SBL Boot Time Breakdown | Time (ms) |
---|---|
MCU_PORZ_OUT to MCU_RESETSTATz | 0.63 |
ROM : init + SBL load from OSPI | 9.113 |
SBL : SBL_SciClientInit: ReadSysfwImage | 0.050 |
Load/Start SYSFW | 4.870 |
Sciclient_init | 3.151 |
Board Config | 1.847 |
PM Config | 0.400 |
Security Config | 0.916 |
RM Config | 0.372 |
SBL: SoC Late-Init | 0.00 |
SBL : Board_init (pinmux) | 0.598 |
SBL : Board_init (PLL) | 0.981 |
SBL: Board_init (CLOCKS) | 1.154 |
SBL: OSPI init | 1.254 |
SBL: OSPI PHY tuning time | 7.397 |
SBL: App copy to MCU SRAM & Jump to App | 1.958 |
Misc | 0.035 |
TOTAL time | 34.72 |
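A quick way to read the breakdown above is to sum the stages and see which ones dominate. The sketch below copies the stage times from the table, adds them up, and reports the share taken by OSPI PHY tuning. The variable names are illustrative only and do not come from the SBL sources.

```python
# Stage times in ms, copied from the SBL boot time breakdown above.
stages_ms = [
    ("MCU_PORZ_OUT to MCU_RESETSTATz", 0.63),
    ("ROM : init + SBL load from OSPI", 9.113),
    ("SBL : SBL_SciClientInit: ReadSysfwImage", 0.050),
    ("Load/Start SYSFW", 4.870),
    ("Sciclient_init", 3.151),
    ("Board Config", 1.847),
    ("PM Config", 0.400),
    ("Security Config", 0.916),
    ("RM Config", 0.372),
    ("SBL: SoC Late-Init", 0.00),
    ("SBL : Board_init (pinmux)", 0.598),
    ("SBL : Board_init (PLL)", 0.981),
    ("SBL: Board_init (CLOCKS)", 1.154),
    ("SBL: OSPI init", 1.254),
    ("SBL: OSPI PHY tuning time", 7.397),
    ("SBL: App copy to MCU SRAM & Jump to App", 1.958),
    ("Misc", 0.035),
]

total_ms = sum(t for _, t in stages_ms)
phy_ms = dict(stages_ms)["SBL: OSPI PHY tuning time"]
phy_share = phy_ms / total_ms

print(f"sum of stages: {total_ms:.3f} ms")
print(f"OSPI PHY tuning share: {phy_share:.1%}")
```

The summed stages agree with the reported total of 34.72 ms to within rounding.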
1.1.2.9. Combined SBL OSPI Boot Performance App¶
1.1.2.9.1. Test Set-up¶
Platform: J721S2 EVM.
OS Type: Baremetal
Core : R5F_0 at 1 GHz
Software/Application Used: sbl_boot_perf_cust_img_combined and sbl_combined_boot_perf_test appimage
Note that app image load time could vary depending on the actual image size
1.1.2.9.2. GP EVM Performance¶
SBL Boot Time Breakdown | Time (ms) |
---|---|
MCU_PORZ_OUT to MCU_RESETSTATz | 0.63 |
ROM : init + SBL and TIFS load from OSPI | 12.274 |
Sciclient Boot Notification | 7.144 |
Sciclient_init | 0.025 |
Board Config | 0.161 |
PM Config | 0.395 |
Security Config | 0.782 |
RM Config | 0.756 |
SBL: SoC Late-Init | 0.00 |
SBL : Board_init (pinmux) | 0.090 |
SBL : Board_init (PLL) | 0.468 |
SBL: Board_init (CLOCKS) | 0.460 |
SBL: OSPI init | 0.776 |
SBL: OSPI PHY Tuning time | 7.466 |
SBL: App copy to MCU SRAM & Jump to App | 2.898 |
Misc | 0.035 |
TOTAL time | 34.291 |
1.1.2.10. Early CAN Response¶
CAN response is measured from MCU_PORZ_OUT to pulling the CAN-H line out of standby.
The numbers below were measured on a J721S2 ES1.1 GP EVM.
Scenario | Measured Time |
---|---|
Early CAN | 35.37 ms |
POST + Early CAN | 58.67 ms |
1.1.2.11. OSPI Memory Configuration Benchmarking¶
These numbers were collected with the memory_benchmarking_app demo, which measures the performance of a realistic application whose text is placed in various memory locations while its data resides in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).
The application executes 10 different configurations of the same text, varying in data versus instruction cache intensity. Each test makes 500 calls in total, in random order, across 16 separate functions.
The most instruction-intensive example achieves an instruction cache miss rate (ICM/sec) of ~3-4 million per second when run entirely from OCMRAM. This rate is similar to what has been seen in real-world customer examples.
More data-intensive tests have more repetitive code and achieve much lower ICM rates.
The “Multicore” configuration is defined as the same AUTOSAR application executing simultaneously, by means of a synchronization delay, on MCU Core 0 (mcu1_0) and MAIN Core 0 (mcu2_0).
The memcpy size is only a knob that makes the synthetic benchmark application more data centric or more instruction centric; it has no additional significance. A small memcpy size is more instruction centric, with a higher ICM rate, and vice versa.
Memory benchmarking numbers have not been updated for the 9.2 release; the current numbers are from the 9.1 release.
1.1.2.11.1. Supported Configurations¶
Core | SOC | Supported Memory Configurations (MEM_CONF) |
---|---|---|
mcu1_0 | j721s2 | ocmc msmc ddr xip |
mcu2_0 | j721s2 | ocmc msmc ddr xip |
mcu1_0 + mcu2_0 | j721s2 | ocmc ddr xip |
1.1.2.11.2. Test Set-up¶
Platform: J721S2 EVM.
OS Type: FreeRTOS
Core: MCU Domain R5_0 (MCU1_0) and Main Domain R5_0 (MCU2_0)
Software/Application Used: sbl_cust_img and [MEM_CONF]_memory_benchmarking_app_freertos appimage
Refer to the Memory Benchmarking Apps user guide for which SBL variant to use with each [MEM_CONF]_memory_benchmarking_app_freertos image.
1.1.2.11.3. MCU Domain Single Core Execution¶
Cache miss rate of 3M/sec is at memcpy size of 50 bytes.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 3991 | 4053 | 5875 | 7085 | 10665 |
OCMC | ICM/sec | 1969932 | 1914631 | 1250723 | 1071136 | 721331 |
DDR | DDR execution time (us) | 5418 | 5520 | 6850 | 8108 | 11263 |
DDR | DDR / OCMC Baseline | 1.358 | 1.362 | 1.166 | 1.144 | 1.056 |
MSMC | MSMC execution time (us) | 4637 | 4647 | 6008 | 7308 | 10409 |
MSMC | MSMC / OCMC Baseline | 1.162 | 1.147 | 1.023 | 1.031 | 0.976 |
XIP | XIP 133 MHz execution time (us) | 8230 | 8460 | 9842 | 11210 | 14786 |
XIP | XIP 133 MHz / OCMC Baseline | 2.062 | 2.087 | 1.675 | 1.582 | 1.386 |
XIP | XIP 166 MHz execution time (us) | 5642 | 5680 | 7556 | 8765 | 12552 |
XIP | XIP 166 MHz / OCMC Baseline | 1.414 | 1.401 | 1.286 | 1.237 | 1.177 |
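The "/ OCMC Baseline" rows are the execution time in a given memory divided by the OCMC execution time at the same memcpy size. The sketch below recomputes the DDR row from the execution times in the table above. It is an illustration of the arithmetic only, not the benchmarking application's code.

```python
# Execution times in us for memcpy sizes 0, 50, 500, 1000, 2048 bytes,
# copied from the MCU domain single core table above.
ocmc_us = [3991, 4053, 5875, 7085, 10665]
ddr_us = [5418, 5520, 6850, 8108, 11263]


def slowdown_vs_baseline(times_us, baseline_us):
    """Per-size ratio of execution time to the OCMC baseline, 3 decimals."""
    return [round(t / b, 3) for t, b in zip(times_us, baseline_us)]


ddr_ratio = slowdown_vs_baseline(ddr_us, ocmc_us)
print(ddr_ratio)
```

A ratio of 1.358 means the run from DDR took about 36% longer than the same run from OCMC.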
1.1.2.11.4. MAIN Domain Single Core Execution¶
Cache miss rate of 3M/sec is closest at memcpy size of ~0 bytes.
Due to an issue with the XIP memory benchmarking application on mcu2_0, the benchmarking numbers for XIP have not been updated.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 3174 | 3326 | 4772 | 6060 | 9142 |
OCMC | ICM/sec | 2614681 | 2491581 | 1641869 | 1292079 | 874206 |
DDR | DDR execution time (us) | 4821 | 4977 | 6284 | 7673 | 10907 |
DDR | DDR / OCMC Baseline | 1.519 | 1.496 | 1.317 | 1.266 | 1.193 |
MSMC | MSMC execution time (us) | 3822 | 3925 | 5267 | 6556 | 9820 |
MSMC | MSMC / OCMC Baseline | 1.204 | 1.18 | 1.104 | 1.082 | 1.074 |
XIP | XIP 133 MHz execution time (us) | | | | | |
XIP | XIP 133 MHz / OCMC Baseline | | | | | |
XIP | XIP 166 MHz execution time (us) | | | | | |
XIP | XIP 166 MHz / OCMC Baseline | | | | | |
1.1.2.11.5. MCU Domain Multi-Core Execution¶
Cache miss rate of 3M/sec is at memcpy size of 50 bytes.
Due to an issue with the XIP memory benchmarking application on mcu2_0, the benchmarking numbers for XIP have not been updated.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 3933 | 4006 | 5786 | 7050 | 10572 |
OCMC | ICM/sec | 1847698 | 1861707 | 1228655 | 1037730 | 671679 |
DDR | DDR execution time (us) | 5473 | 5526 | 6893 | 7989 | 11205 |
DDR | DDR / OCMC Baseline | 1.392 | 1.379 | 1.191 | 1.133 | 1.06 |
XIP | XIP 133 MHz execution time (us) | | | | | |
XIP | XIP 133 MHz / OCMC Baseline | | | | | |
XIP | XIP 166 MHz execution time (us) | | | | | |
XIP | XIP 166 MHz / OCMC Baseline | | | | | |
1.1.2.11.6. MAIN Domain Multi-Core Execution¶
Cache miss rate of 3M/sec is closest at memcpy size of 0 bytes.
Due to an issue with the XIP memory benchmarking application on mcu2_0, the benchmarking numbers for XIP have not been updated.
Memory | Memcpy Size (bytes) | 0 | 50 | 500 | 1000 | 2048 |
---|---|---|---|---|---|---|
OCMC | OCMC Baseline Execution Time (us) | 3172 | 3310 | 4781 | 6082 | 9159 |
OCMC | ICM/sec | 2644388 | 2452265 | 1740849 | 1347254 | 896495 |
DDR | DDR execution time (us) | 4971 | 5112 | 6332 | 7625 | 10948 |
DDR | DDR / OCMC Baseline | 1.567 | 1.544 | 1.324 | 1.254 | 1.195 |
XIP | XIP 133 MHz execution time (us) | | | | | |
XIP | XIP 133 MHz / OCMC Baseline | | | | | |
XIP | XIP 166 MHz execution time (us) | | | | | |
XIP | XIP 166 MHz / OCMC Baseline | | | | | |
1.1.2.11.7. Additional OCMC Baseline Details - MCU Domain¶
View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of 50 bytes.
Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
---|---|---|---|---|---|---|---|---|---|---|
Start Time in Usec | 305275 | 584111 | 864111 | 1145111 | 1427112 | 1711106 | 1995110 | 2281111 | 2567106 | 2854101 |
Exec Time in Usec | 3991 | 4053 | 4212 | 4557 | 5875 | 6632 | 7085 | 8021 | 9266 | 10665 |
Task Calls | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 |
Inst Cache Miss | 7862 | 7760 | 7796 | 8172 | 7348 | 8054 | 7589 | 7887 | 8179 | 7693 |
Inst Cache Acc | 1004611 | 1092896 | 1166609 | 1319969 | 1763715 | 2151058 | 2518645 | 2907033 | 3276948 | 4102653 |
Num Instr Exec | 1394127 | 1496444 | 1595698 | 1798513 | 2384113 | 2899421 | 3390255 | 3905453 | 4391667 | 5490555 |
ICM/sec | 1969932 | 1914631 | 1850902 | 1793285 | 1250723 | 1214414 | 1071136 | 983293 | 882689 | 721331 |
INST/sec | 349317714 | 369218850 | 378845679 | 394670397 | 405806468 | 437186519 | 478511644 | 486903503 | 473954996 | 514819971 |
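The ICM/sec and INST/sec rows can be recomputed from the raw counters: each is the counter value divided by the execution time, scaled from microseconds to seconds. The sketch below checks this for the memcpy size 0 column of the table above. Integer division is used because the tabulated values appear to be truncated rather than rounded, which is an inference from the numbers and not something the source states.

```python
def per_second(count, exec_time_us):
    """Scale an event count over exec_time_us microseconds to events per second."""
    return count * 1_000_000 // exec_time_us


# Memcpy size 0 column of the MCU domain table above.
exec_time_us = 3991
inst_cache_miss = 7862
num_instr_exec = 1394127

icm_per_sec = per_second(inst_cache_miss, exec_time_us)
inst_per_sec = per_second(num_instr_exec, exec_time_us)

print(icm_per_sec, inst_per_sec)
```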
1.1.2.11.8. Additional OCMC Baseline Details - MAIN Domain¶
View ICM/sec row to see that cache miss rate of 3M/sec is closest at memcpy size of 0 bytes. The mcu2_0 application is marginally less complex because mcu1_0 is responsible for the sciserver and is the boot core.
Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
---|---|---|---|---|---|---|---|---|---|---|
Start Time in Usec | 55137 | 331079 | 610078 | 890079 | 1170078 | 1452075 | 1735075 | 2020076 | 2305078 | 2591077 |
Exec Time in Usec | 3174 | 3326 | 3467 | 3814 | 4772 | 5505 | 6060 | 6784 | 7753 | 9142 |
Task Calls | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 |
Inst Cache Miss | 8299 | 8287 | 8426 | 8622 | 7835 | 8276 | 7830 | 8387 | 8491 | 7992 |
Inst Cache Acc | 986422 | 1074914 | 1149556 | 1304496 | 1747327 | 2134404 | 2503805 | 2890003 | 3259333 | 4084005 |
Num Instr Exec | 1396682 | 1498666 | 1598766 | 1802446 | 2386732 | 2901636 | 3393660 | 3908624 | 4395364 | 5493164 |
ICM/sec | 2614681 | 2491581 | 2430343 | 2260618 | 1641869 | 1503360 | 1292079 | 1236291 | 1095188 | 874206 |
INST/sec | 440038437 | 450591100 | 461138159 | 472586785 | 500153394 | 527091008 | 560009900 | 576153301 | 566924287 | 600871144 |