1. J721E Datasheet¶
1.1. Introduction¶
This section provides the performance numbers of the device drivers supported in the PDK.
1.1.1. Setup Details¶
| SOC Details | Values |
|---|---|
| Core | R5F |
| Core Operating Speed | 1 GHz |
| DDR Speed | 4266 MT/s |
| Cache status | Enabled |
| Optimization Details | Values |
|---|---|
| Profile | Release |
| Compile Options for R5F | -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 --endian=little -mv7R5 --abi=eabi -eo.oer5f -ea.ser5f --symdebug:dwarf --embed_inline_assembly --float_support=vfpv3d16 --emit_warnings_as_errors |
| Linker Options for R5F | --emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 --diag_suppress=10063 -x --zero_init=on |
| Code Placement | DDR |
| Data Placement | DDR |
1.1.2. Software Performance Numbers¶
1.1.2.1. DSS¶
| Display Type | Configuration | FPS | CPU Load |
|---|---|---|---|
| HDMI | 1080P60 RGB888 | 60 | 1.0% (MCU2_0) |
| DP | 1080P60 BGRA32 | 60 | 1.0% (MCU2_0) |
1.1.2.2. CSI-Rx¶
| Capture Type | Configuration | CPU Load |
|---|---|---|
| CSI2Rx Inst 0 | 4CH 1080P30 IMX390 Sensor Raw12 | 1.2% (MCU2_0) |

| Instance | Configuration | Time taken to receive one frame | ISR latency |
|---|---|---|---|
| CSI2Rx Inst 0 | 1CH 1080P30 IMX390 Sensor Raw12 | 33.3ms (MCU2_0) | 9us (MCU2_0) |
1.1.2.3. CSI-Tx¶
| Instance | Configuration | Time taken to transmit one frame | ISR latency |
|---|---|---|---|
| CSI2Tx Inst 0 | 1CH 1080P 2.5GBPS IMX390 Sensor Raw12 | 6.7ms (MCU2_0) | 21us (MCU2_0) |
1.1.2.4. CPSW_9G¶
1.1.2.4.1. Test Setup¶

| Hardware Configuration | Value |
|---|---|
| Processing Core | Main R5F0 Core 0 |
| Core Frequency | 1 GHz |
| Ethernet Interface Type | RGMII at 1Gbps |
| Packet buffer memory | DDR |
| Hardware checksum offload | Yes |
| Scatter-gather TX | Yes |
| Scatter-gather RX | No |

| Software Configuration | Value |
|---|---|
| RTOS | FreeRTOS |
| RTOS application | Enet LLD lwIP example |
| TCP/IP stack | lwIP 2.2.0 |
| Host PC tool version | iperf v2.0.10 |
1.1.2.4.2. TCP Performance¶
| Test | Bandwidth (Mbps) | CPU Load (%) |
|---|---|---|
| TCP RX | 141 | 84 |
| TCP TX | 133 | 100 |
| TCP Bidirectional | RX=72.7 TX=71.8 | 100 |
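Host PC commands:
TCP RX and TCP TX (the two directions run one after the other):
iperf -c <evm_ip> -r
TCP Bidirectional (both directions run simultaneously):
iperf -c <evm_ip> -d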
1.1.2.4.3. UDP Performance¶
Test |
Datagram Length = 64B |
Datagram Length = 256B |
Datagram Length = 512B |
Datagram Length = 1470B |
||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Bandwidth (Mbps) |
CPU Load (%) |
Packet Loss (%) |
Bandwidth (Mbps) |
CPU Load (%) |
Packet Loss (%) |
Bandwidth (Mbps) |
CPU Load (%) |
Packet Loss (%) |
Bandwidth (Mbps) |
CPU Load (%) |
Packet Loss (%) |
|
UDP RX |
4.80 |
29 |
0.00 |
24.0 |
45 |
0.00 |
24.0 |
34 |
0.00 |
24.0 |
26 |
0.00 |
9.61 |
40 |
0.00 |
48.0 |
73 |
0.0016 |
48.1 |
49 |
0.00 |
48.1 |
34 |
0.00 |
|
14.4 |
52 |
0.01 |
105 |
96.1 |
81 |
0.0018 |
96.1 |
50 |
0.00 |
|||
UDP RX (Max) |
22.1 |
70 |
0.045 |
51.1 |
75 |
0.1 |
105 |
86 |
1.3 |
247 |
100 |
0.65 |
UDP TX (Max) |
22.2 |
100 |
0.004 |
48.6 |
100 |
0.004 |
97.1 |
100 |
0.003 |
279 |
100 |
0.002 |
Host PC commands:
Test with datagram length of 64B:
iperf -c <evm_ip> -u -l64 -b<bw> -r where <bw> is 5M, 10M, 15M, etc
Test with datagram length of 256B:
iperf -c <evm_ip> -u -l256 -b<bw> -r where <bw> is 25M, 50M, 100M, etc
Test with datagram length of 512B:
iperf -c <evm_ip> -u -l512 -b<bw> -r where <bw> is 25M, 50M, 100M, etc
Test with datagram length of 1470B (max):
iperf -c <evm_ip> -u -b<bw> -r where <bw> is 25M, 50M, 100M, etc
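The UDP sweep above repeats one command shape with different datagram lengths and target bandwidths. The Python sketch below only builds those command strings and does not run them. The helper name `udp_iperf_command` and the address `192.0.2.10` are placeholders chosen for illustration; only the flags already listed above (`-c`, `-u`, `-l`, `-b`, `-r`) are used.

```python
def udp_iperf_command(evm_ip, bandwidth, datagram_len=None):
    """Build one UDP iperf command line in the form used in this section.

    datagram_len=None omits -l, which corresponds to the 1470B (max) case above.
    """
    args = ["iperf", "-c", evm_ip, "-u"]
    if datagram_len is not None:
        args.append(f"-l{datagram_len}")
    args += [f"-b{bandwidth}", "-r"]
    return " ".join(args)


# Placeholder EVM address; replace with the real <evm_ip>.
sweep = [udp_iperf_command("192.0.2.10", bw, 64) for bw in ("5M", "10M", "15M")]
for line in sweep:
    print(line)
```

Keeping the command construction in one function makes it easier to repeat the same sweep for each datagram length without retyping flags.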
1.1.2.5. UDMA¶
1.1.2.5.1. DMA Parameters¶
Ring Order ID: 0
Channel Order ID: 0
Channel DMA Priority: 1
Channel Bus Priority: 4
Channel BUS QOS: 4
Channel TX FIFO depth: 128
Channel Fetch Word Size: 16
Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels
1.1.2.5.2. Test Parameters¶
Type: TR15 Block copy
TR: one TR per TRPD in PBR mode
TR Memory: Same as buffer memory (DDR, MSMC or OCMC, depending on the test performed)
Transfer Size: 1 MB read and 1 MB write
Here 1 MB means 1000 x 1000 bytes and 1 KB means 1000 bytes.
Note: The throughput numbers reported are the combined memory throughput of the read and write operations.
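As a rough illustration of how a combined throughput figure follows from the definitions above, the sketch below adds the bytes read and the bytes written and divides by the elapsed time, using 1 MB = 1000 x 1000 bytes. The function name and the 172 us timing are hypothetical and are not taken from the test application.

```python
MB = 1000 * 1000  # this section defines 1 MB as 1000 x 1000 bytes


def combined_throughput_mb_s(bytes_read, bytes_written, elapsed_s):
    """Combined read + write memory throughput in MB/s."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (bytes_read + bytes_written) / elapsed_s / MB


# Hypothetical measurement: a 1 MB block copy (1 MB read, 1 MB written) taking 172 us.
rate = combined_throughput_mb_s(1 * MB, 1 * MB, 172e-6)
print(f"{rate:.0f} MB/s")
```

Because both directions are counted, the reported figure is twice the payload copy rate.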
1.1.2.5.3. DRU Blockcopy¶
DRU channel performance with TR submitted through ring
| Test Description | Throughput (MCU2) | CPU Load (MCU2) | Throughput (C66x_1/2) | CPU Load (C66x_1/2) | Throughput (C7x_1) | CPU Load (C7x_1) |
|---|---|---|---|---|---|---|
| [PDK-3501] 1CH DDR 1MB to DDR 1MB | 11605 MB/sec | 11% | 11963 MB/sec | 4% | 11287 MB/sec | 7% |
| [PDK-3502] 1CH MSMC 1KB Circular to DDR 1MB | 18509 MB/sec | 12% | 18558 MB/sec | 6% | 17682 MB/sec | 9% |
| [PDK-3503] 1CH DDR 1MB to MSMC circular 1KB | 21620 MB/sec | 12% | 22647 MB/sec | 5% | 20560 MB/sec | 9% |
| [PDK-3504] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) | 29086 MB/sec | 12% | 29208 MB/sec | 7% | 27413 MB/sec | 9% |
| [PDK-3505] Multi CH DDR 1MB to DDR 1MB | 12066 MB/sec | 20% | 12310 MB/sec (4CH) | 8% | 10433 MB/sec (4CH) | 15% |
| [PDK-3506] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) | 30885 MB/sec | 21% | 30931 MB/sec (4CH) | 17% | 18020 MB/sec (4CH) | 15% |
1.1.2.6. IPC¶
1.1.2.6.1. Test Set-up¶
Release build binaries are used for measurement
Ring Buffer : Uncached DDR
Buffer to be sent (RPMSG) – Cached DDR
C66x - L2 Cache 128K
C7x - L2 Cache 128K
Software/Application Used : ipc_multicore_perf_test loaded through SBL. Output is printed to UART.
R5F/MPU config : DDR config
bufferable - 1
cacheable - 1
shareable - 0
The round-trip time is captured in microseconds (us) for different data sizes; all values in the tables below are round-trip times in us.
1.1.2.6.2. Performance - Host Core A72, Bios, 2 GHz¶
| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| MCU R5F0 | 20 | 20 | 22 | 25 | 32 | 44 | 70 |
| Main R5F0 | 18 | 19 | 20 | 24 | 29 | 41 | 65 |
| C66x1 | 17 | 16 | 17 | 16 | 18 | 20 | 25 |
| C7x | 20 | 20 | 20 | 20 | 23 | 24 | 25 |
1.1.2.6.3. Performance - Host Core MCU R5F0, 1 GHz¶
| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 21 | 21 | 23 | 26 | 32 | 43 | 68 |
| Main R5F0 | 17 | 18 | 19 | 22 | 28 | 39 | 65 |
| C66x1 | 17 | 17 | 19 | 22 | 28 | 40 | 64 |
| C7x | 18 | 18 | 20 | 23 | 29 | 40 | 66 |
1.1.2.6.4. Performance - Host Core MAIN R5F0, 1 GHz¶
| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 17 | 17 | 18 | 21 | 26 | 37 | 59 |
| MCU R5F0 | 16 | 15 | 17 | 20 | 25 | 35 | 58 |
| Main R5F1 | 16 | 16 | 17 | 21 | 26 | 36 | 59 |
| C66x1 | 16 | 15 | 17 | 20 | 25 | 36 | 58 |
| C7x | 16 | 16 | 17 | 20 | 25 | 36 | 58 |
1.1.2.6.5. Performance - Host Core C66X1, 1.35 GHz¶
| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 19 | 18 | 18 | 18 | 18 | 22 | 26 |
| MCU R5F0 | 26 | 26 | 28 | 30 | 37 | 52 | 81 |
| Main R5F0 | 25 | 25 | 27 | 29 | 35 | 48 | 75 |
| C66x2 | 23 | 22 | 22 | 21 | 23 | 28 | 35 |
| C7x | 30 | 29 | 29 | 28 | 31 | 34 | 37 |
1.1.2.6.6. Performance - Host Core C7x, 1GHz¶
| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 21 | 21 | 21 | 21 | 24 | 23 | 25 |
| MCU R5F0 | 32 | 32 | 34 | 37 | 45 | 55 | 82 |
| Main R5F0 | 28 | 29 | 30 | 34 | 42 | 51 | 75 |
| C66x1 | 29 | 28 | 28 | 27 | 20 | 31 | 36 |
1.1.2.7. OSPI¶
1.1.2.7.1. OSPI Memory Non Cached Test Set-up¶
Platform: J721e EVM.
OS Type: Baremetal/FreeRTOS
Core : R5F_0 at 1 GHz.
Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp
System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.7.2. OSPI Phy Tuning Time (DDR Octal Mode)¶
| OSPI RCLK | Tuning Time |
|---|---|
| 133 MHz | 3.512 |
| 166 MHz | 3.139 |
Note: PHY tuning time varies across silicon samples and PHY tuning point varies with voltage and temperature.
1.1.2.7.3. OSPI Read/Write Performance (DDR Octal Mode)¶
| OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s) |
|---|---|---|---|---|---|---|
| 133 MHz | DAC | 0.77 | 100% | 7.186 | 51% | 266 |
| 133 MHz | DAC DMA | 1.502 | 70% | 264.925 | 2% | 266 |
| 133 MHz | INDAC | 1.510 | 75% | 8.331 | 0% | 266 |
| 166 MHz | DAC | 0.080 | 100% | 8.208 | 51% | 332 |
| 166 MHz | DAC DMA | 1.580 | 71% | 330.781 | 1% | 332 |
| 166 MHz | INDAC | 1.575 | 76% | 10.414 | 1% | 332 |
1.1.2.7.4. OSPI Memory Cached Test Set-up¶
Platform: J721e EVM.
OS Type: Baremetal/FreeRTOS
Core : R5F_0 at 1 GHz.
Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp
System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.
1.1.2.7.5. OSPI Read/Write Performance (DDR Octal Mode)¶
| OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s) |
|---|---|---|---|---|---|---|
| 133 MHz | DAC | 0.304 | 100% | 46.788 | 51% | 266 |
| 133 MHz | DAC DMA | 1.501 | 75% | 264.925 | 20% | 266 |
| 133 MHz | INDAC | 1.512 | 100% | 8.331 | 0% | 266 |
| 166 MHz | DAC | 0.342 | 100% | 57.503 | 51% | 332 |
| 166 MHz | DAC DMA | 1.581 | 72% | 330.572 | 2% | 332 |
| 166 MHz | INDAC | 1.575 | 76% | 10.414 | 0% | 332 |
1.1.2.8. MMCSD¶
1.1.2.8.1. Test Set-up¶
Platform: J721e EVM.
OS Type: FreeRTOS
Core : R5F_0 at 1 GHz.
Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (a menu-based application that outputs the benchmark numbers on UART)
System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.
SD Card used: Sandisk 16GB, Class 10. FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput & compatibility with various cards)
EMMC: EMMC on J721E EVM. Refer to the EVM data sheet for details.
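The theoretical maximum quoted in each mode heading below follows from the bus clock, the bus width and whether data is transferred on one or both clock edges. The sketch below is a simplified calculation that ignores command and protocol overhead; the function name and the `ddr` flag are illustrative assumptions, not part of the test application.

```python
def theoretical_max_mb_s(clock_mhz, bus_width_bits, ddr=False):
    """Raw bus bandwidth in MB/s: clock x transfers per clock x bytes per transfer."""
    transfers_per_clock = 2 if ddr else 1  # DDR modes transfer on both clock edges
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


print(theoretical_max_mb_s(25, 4))             # SD DS mode: 25 MHz, 4-bit
print(theoretical_max_mb_s(50, 8, ddr=True))   # eMMC HS-DDR mode: 50 MHz, 8-bit
print(theoretical_max_mb_s(200, 8))            # eMMC HS-200 mode: 200 MHz, 8-bit
```

Dividing a measured throughput by this value gives a quick estimate of bus utilization for a given transfer size.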
1.1.2.8.2. SD Card Performance¶
1.1.2.8.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 8.749 | 10.018 |
| 512 | 5.584 | 10.876 |
| 1024 | 9.416 | 11.177 |
| 2048 | 9.617 | 11.283 |
| 5120 | 7.480 | 11.406 |
1.1.2.8.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 13.784 | 17.634 |
| 512 | 16.981 | 20.663 |
| 1024 | 17.856 | 21.698 |
| 2048 | 18.587 | 22.255 |
| 5120 | 7.682 | 22.511 |
1.1.2.8.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 6.732 | 10.022 |
| 512 | 9.569 | 10.917 |
| 1024 | 9.763 | 11.197 |
| 2048 | 4.646 | 11.272 |
| 5120 | 7.626 | 11.406 |
1.1.2.8.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 15.268 | 17.632 |
| 512 | 19.378 | 20.669 |
| 1024 | 20.746 | 21.699 |
| 2048 | 21.503 | 22.255 |
| 5120 | 16.631 | 22.491 |
1.1.2.8.2.5. SDR50 Mode (100 MHz, 4-bit) Theoretical Max: 50 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 24.812 | 28.552 |
| 512 | 33.850 | 37.459 |
| 1024 | 39.192 | 40.954 |
| 2048 | 25.812 | 42.779 |
| 5120 | 42.487 | 44.238 |
1.1.2.8.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 24.330 | 28.074 |
| 512 | 11.394 | 36.119 |
| 1024 | 37.642 | 39.881 |
| 2048 | 39.679 | 41.764 |
| 5120 | 22.710 | 42.646 |
1.1.2.8.3. EMMC Performance¶
1.1.2.8.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 15.837 | 18.427 |
| 512 | 17.996 | 20.060 |
| 1024 | 19.315 | 20.998 |
| 2048 | 20.030 | 21.494 |
| 5120 | 20.498 | 21.806 |
1.1.2.8.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 25.440 | 31.468 |
| 512 | 31.417 | 36.512 |
| 1024 | 35.476 | 39.713 |
| 2048 | 38.232 | 41.536 |
| 5120 | 38.861 | 42.691 |
1.1.2.8.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 30.613 | 46.946 |
| 512 | 42.009 | 59.158 |
| 1024 | 47.635 | 68.044 |
| 2048 | 52.767 | 73.574 |
| 5120 | 54.303 | 77.353 |
1.1.2.8.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s¶
| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 36.074 | 67.081 |
| 512 | 43.292 | 94.905 |
| 1024 | 49.768 | 119.96 |
| 2048 | 52.007 | 138.18 |
| 5120 | 52.941 | 152.06 |
1.1.2.9. CSL-FL based Optimized OSPI Example¶
1.1.2.9.1. CPU Mode - Test Set-up¶
Platform: J721e EVM.
OS Type: Baremetal
Core : R5F_0 at 1 GHz
Software/Application Used: csl_ospi_flash_app
System Configuration: RCLK 133/166 MHz, Cache ON, buffer and critical functions in TCMB, DMA Disabled, Interrupts OFF.
Theoretical Max Throughput: 253.67 MB/s at 133 MHz, 316.62 MB/s at 166 MHz.
1.1.2.9.2. DAC Mode OSPI Read Performance (Dual Data Rate - Octal Mode)¶
| OSPI RCLK | Size of transfer (B) | Read Time (ns) | Throughput (MB/s) |
|---|---|---|---|
| 133 MHz | 16 | 815 | 19.6 |
| 133 MHz | 32 | 1445 | 22.1 |
| 133 MHz | 64 | 2700 | 23.7 |
| 133 MHz | 128 | 5225 | 24.5 |
| 133 MHz | 256 | 10265 | 24.9 |
| 133 MHz | 512 | 20360 | 25.1 |
| 133 MHz | 1024 | 40510 | 25.3 |
| 166 MHz | 16 | 945 | 16.9 |
| 166 MHz | 32 | 2330 | 13.7 |
| 166 MHz | 64 | 4580 | 14.0 |
| 166 MHz | 128 | 9105 | 14.1 |
| 166 MHz | 256 | 18145 | 14.1 |
| 166 MHz | 512 | 36185 | 14.1 |
| 166 MHz | 1024 | 72295 | 14.2 |
1.1.2.9.3. DMA Mode - Test Set-up¶
Platform: J721e EVM.
OS Type: Baremetal
Core : R5F_0 at 1 GHz
Software/Application Used: udma_baremetal_ospi_flash_testapp
System Configuration: RCLK 133/166 MHz, Cache ON, buffer and critical functions in TCMB, DMA Enabled (SW Trigger mode), Interrupts OFF.
Theoretical Max Throughput: 253.67 MB/s at 133 MHz, 316.62 MB/s at 166 MHz.
1.1.2.9.4. DAC DMA Mode OSPI Read Performance (Dual Data Rate - Octal Mode)¶
| OSPI RCLK | Size of transfer (B) | Read Time (ns) | Throughput (MB/s) |
|---|---|---|---|
| 133 MHz | 16 | 800 | 20 |
| 133 MHz | 32 | 805 | 39.8 |
| 133 MHz | 64 | 970 | 66 |
| 133 MHz | 128 | 1315 | 97.3 |
| 133 MHz | 256 | 1955 | 130.9 |
| 133 MHz | 512 | 3120 | 164.1 |
| 133 MHz | 1024 | 5450 | 187.9 |
| 166 MHz | 16 | 675 | 23.7 |
| 166 MHz | 32 | 805 | 39.8 |
| 166 MHz | 64 | 850 | 75.3 |
| 166 MHz | 128 | 1180 | 108.5 |
| 166 MHz | 256 | 1685 | 151.9 |
| 166 MHz | 512 | 2730 | 187.5 |
| 166 MHz | 1024 | 4670 | 219.3 |
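The Throughput column in the two read tables of this section is consistent with dividing the transfer size by the read time. The sketch below reproduces a few rows under that assumption; the function name is illustrative, and the rounding to one decimal is only there to match how the tables are printed.

```python
def throughput_mb_s(size_bytes, read_time_ns):
    """Throughput in MB/s, with 1 MB = 10**6 bytes."""
    # bytes per nanosecond equals GB/s, so multiply by 1000 to get MB/s
    return size_bytes / read_time_ns * 1000


# (size in bytes, read time in ns) pairs taken from the tables in this section
for size, time_ns in [(16, 815), (1024, 5450), (1024, 4670)]:
    print(size, time_ns, round(throughput_mb_s(size, time_ns), 1))
```

For small transfers the fixed per-read overhead dominates the read time, which is why the throughput rises with the transfer size in the DMA table.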
1.1.2.10. SBL Boot Performance Numbers¶
1.1.2.10.1. Test Set-up¶
Platform: J721E EVM.
OS Type: Baremetal
Core : R5F_0 at 1 GHz
Note that the app image load time can vary depending on the actual image size.
Note that RBL boot time is not included in the tables below.
1.1.2.10.2. GP EVM Performance (Legacy Boot)¶
| Boot Modes | SBL Used | Application Used |
|---|---|---|
| MMCSD | sbl_mmcsd_img | sbl_boot_perf_test |
| eMMC Boot0 | sbl_emmc_boot0_img | sbl_boot_perf_test |
| eMMC UDA | sbl_emmc_uda_img | sbl_boot_perf_test |
| OSPI NOR | sbl_ospi_img | sbl_boot_perf_test |
| OSPI NOR Optimized | sbl_boot_perf_cust_img | sbl_boot_perf_early_can_test |
| SBL Boot Time Breakdown | MMCSD | eMMC BOOT0 | OSPI NOR Optimized | OSPI NOR | eMMC UDA |
|---|---|---|---|---|---|
| SBL: SBL_SciClientInit: ReadSysfwImage | 57.702ms | 100.490ms | 8.306ms | 8.307ms | 103.497ms |
| Load/Start SYSFW | 4.101ms | 4.101ms | 4.178ms | 4.178ms | 4.101ms |
| Sciclient_init | 3.164ms | 3.165ms | 3.165ms | 3.165ms | 3.165ms |
| Board Config | 7.096ms | 7.095ms | 2.007ms | 7.153ms | 7.096ms |
| PM Config | 1.378ms | 1.367ms | 0.107ms | 1.358ms | 1.360ms |
| Security Config | 4.229ms | 4.229ms | 6.322ms | 4.229ms | 4.229ms |
| RM Config | 1.772ms | 1.773ms | 0.760ms | 1.774ms | 1.772ms |
| SBL: Board_init (pinmux) | 4.590ms | 4.585ms | 2.878ms | 4.690ms | 4.588ms |
| SBL: Board_init (PLL) | 0.220ms | 0.216ms | 0.790ms | 0.224ms | 0.218ms |
| SBL: Board_init (CLOCKS) | 1.321ms | 1.358ms | 0.660ms | 1.283ms | 1.362ms |
| SBL: DDR initialization | 30.101ms | 30.096ms | 0.000ms | 30.254ms | 30.095ms |
| SBL: Ethernet Configuration | 153.115ms | 153.089ms | 0.000ms | 146.200ms | 153.084ms |
| SBL: EEPROM copying time | 13.202ms | 13.201ms | 0.000ms | 6.893ms | 13.201ms |
| SBL: HSM Core App Copying Time | 0.487ms | 0.487ms | 0.476ms | 0.488ms | 0.488ms |
| SBL: Boot Media Drivers init | 16.006ms | 24.374ms | 2.304ms | 2.227ms | 24.221ms |
| SBL: OSPI PHY Tuning time | 0.226ms | 0.001ms | 3.302ms | 3.292ms | 0.143ms |
| SBL: Application Image Verification | 0.001ms | 0.000ms | 0.000ms | 0.000ms | 0.001ms |
| SBL: App copy to MCU SRAM & Jump to App | 89.217ms | 2.505ms | 2.600ms | 2.615ms | 55.624ms |
| Misc | 0.001ms | 0.000ms | 0.000ms | 0.000ms | 0.000ms |
| TOTAL time | 387.929ms | 352.132ms | 37.855ms | 228.276ms | 408.245ms |
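The TOTAL time row is the sum of the individual stages in its column. The sketch below checks this for the MMCSD column of the GP EVM breakdown and reports the largest contributor; the dictionary simply restates the values listed above, and the variable names are illustrative.

```python
# Stage times in ms for the MMCSD column of the GP EVM breakdown above
mmcsd_stages_ms = {
    "ReadSysfwImage": 57.702,
    "Load/Start SYSFW": 4.101,
    "Sciclient_init": 3.164,
    "Board Config": 7.096,
    "PM Config": 1.378,
    "Security Config": 4.229,
    "RM Config": 1.772,
    "Board_init (pinmux)": 4.590,
    "Board_init (PLL)": 0.220,
    "Board_init (CLOCKS)": 1.321,
    "DDR initialization": 30.101,
    "Ethernet Configuration": 153.115,
    "EEPROM copying time": 13.202,
    "HSM Core App Copying Time": 0.487,
    "Boot Media Drivers init": 16.006,
    "OSPI PHY Tuning time": 0.226,
    "Application Image Verification": 0.001,
    "App copy to MCU SRAM & Jump to App": 89.217,
    "Misc": 0.001,
}

total_ms = round(sum(mmcsd_stages_ms.values()), 3)
largest = max(mmcsd_stages_ms, key=mmcsd_stages_ms.get)
share = mmcsd_stages_ms[largest] / total_ms * 100
print(f"total = {total_ms} ms, largest stage = {largest} ({share:.1f}%)")
```

Ranking the stages this way shows where boot time optimization is likely to pay off first for a given boot mode.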
1.1.2.10.3. HS EVM Performance (Legacy Boot)¶
| Boot Modes | SBL Used | Application Used |
|---|---|---|
| MMCSD | sbl_mmcsd_img_hs | sbl_boot_perf_test |
| OSPI NOR | sbl_ospi_img_hs | sbl_boot_perf_test |
| OSPI NOR Optimized | sbl_boot_perf_cust_img_hs | sbl_boot_perf_hs_early_can_test |
| SBL Boot Time Breakdown | OSPI NOR Optimized | OSPI NOR | MMCSD |
|---|---|---|---|
| SBL: SBL_SciClientInit: ReadSysfwImage | 8.305ms | 8.309ms | 71.654ms |
| Load/Start SYSFW | 12.939ms | 13.416ms | 12.759ms |
| Sciclient_init | 3.165ms | 3.165ms | 3.165ms |
| Board Config | 4.210ms | 9.344ms | 9.282ms |
| PM Config | 0.105ms | 1.361ms | 1.390ms |
| Security Config | 9.866ms | 6.832ms | 6.833ms |
| RM Config | 3.052ms | 4.060ms | 4.061ms |
| SBL: Board_init (pinmux) | 2.877ms | 4.646ms | 4.523ms |
| SBL: Board_init (PLL) | 0.797ms | 0.224ms | 0.217ms |
| SBL: Board_init (CLOCKS) | 0.660ms | 1.286ms | 1.345ms |
| SBL: DDR initialization | 0.000ms | 30.182ms | 30.137ms |
| SBL: Ethernet Configuration | 0.000ms | 146.241ms | 146.278ms |
| SBL: EEPROM copying time | 0.000ms | 6.840ms | 6.839ms |
| SBL: HSM Core App Copying Time | 0.482ms | 0.492ms | 0.493ms |
| SBL: Boot Media Drivers init | 2.297ms | 2.220ms | 15.796ms |
| SBL: OSPI PHY Tuning time | 3.355ms | 3.378ms | 0.235ms |
| SBL: Application Image Verification | 50.463ms | 51.497ms | 135.413ms |
| SBL: App copy to MCU SRAM & Jump to App | 1.947ms | 3.288ms | 2.520ms |
| Misc | 0.000ms | 0.001ms | 0.000ms |
| TOTAL time | 104.520ms | 296.782ms | 452.931ms |
1.1.2.11. Early CAN Response¶
The CAN response time is measured from MCU_PORZ_OUT to the point where the CAN-H line is pulled out of standby.
The numbers below were measured on a J721E ES2.0 GP EVM.
| Scenario | Measured Time |
|---|---|
| Early CAN | 55.2 ms |
| POST + Early CAN | 82.9 ms |
1.1.2.12. Memory Configuration Benchmarking¶
These numbers were collected from the memory_benchmarking_app demo, which measures the performance of a realistic application whose text (code) is placed in various memory locations while its data is placed in on-chip memory RAM (referred to as OCM, OCMC or OCMRAM).
The application executes 10 different configurations of the same text, varying by data buffer size. Each test calls 16 separate functions a total of 200 times in random order.
More data-intensive tests have more repetitive code and therefore achieve much lower instruction cache miss (ICM) rates.
The memcpy size is just a knob that makes the synthetic benchmark application more data-centric or more instruction-centric; it has no additional significance. A small memcpy size is more instruction-centric with a higher ICM rate, and vice versa.
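The ICM/sec and INST/sec values reported in this section appear to be the raw counters from the Additional OCMC Baseline Details tables divided by the execution time and truncated to an integer. The sketch below reproduces a few entries under that assumption; the function name is illustrative and is not taken from the benchmarking application.

```python
def per_second(count, exec_time_us):
    """Events per second from a raw count and an execution time in us, truncated."""
    return count * 1_000_000 // exec_time_us


# MCU domain, memcpy size 0: Inst Cache Miss = 38076, Num Instr Exec = 2239389,
# Exec Time = 11594 us (values from the Additional OCMC Baseline Details table)
icm_rate = per_second(38076, 11594)
inst_rate = per_second(2239389, 11594)
print(icm_rate, inst_rate)
```

The instruction cache miss count stays roughly constant across memcpy sizes while the execution time grows, which is why the ICM rate drops as the benchmark becomes more data-centric.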
1.1.2.12.1. Supported Configurations¶
| Core | SOC | Supported Memory Configurations (MEM_CONF) |
|---|---|---|
| mcu1_0 | j721e | ocmc msmc ddr xip |
| mcu2_0 | j721e | ocmc msmc ddr xip |
| mcu1_0 + mcu2_0 | j721e | ocmc ddr xip |
1.1.2.12.2. Test Set-up¶
Platform: J721E EVM.
OS Type: FreeRTOS
Core: MCU Domain R5_0 (MCU1_0) and Main Domain R5_0 (MCU2_0)
Software/Application Used: sbl_cust_img and [MEM_CONF]_memory_benchmarking_app_freertos appimage
Refer to the Memory Benchmarking Apps user guide to determine which SBL variant to use for each [MEM_CONF]_memory_benchmarking_app_freertos variant.
1.1.2.12.3. MCU Domain Single Core Execution¶
| Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 11594 | 11730 | 13250 | 14806 | 18014 |
| OCMC | ICM/sec | 3284112 | 3249190 | 2888075 | 2588882 | 2150827 |
| DDR | DDR execution time (us) | 17164 | 17296 | 18810 | 20472 | 23931 |
| DDR | DDR / OCMC Baseline | 1.48 | 1.475 | 1.42 | 1.383 | 1.328 |
| MSMC | MSMC execution time (us) | 13750 | 13888 | 15398 | 17056 | 20436 |
| MSMC | MSMC / OCMC Baseline | 1.186 | 1.184 | 1.162 | 1.152 | 1.134 |
| XIP | XIP 133 MHz execution time (us) | 132612 | 133229 | 134857 | 136657 | 140532 |
| XIP | XIP 133 MHz / OCMC Baseline | 11.438 | 11.358 | 10.178 | 9.23 | 7.801 |
| XIP | XIP 166 MHz execution time (us) | 108390 | 108500 | 110290 | 112023 | 115944 |
| XIP | XIP 166 MHz / OCMC Baseline | 9.349 | 9.25 | 8.324 | 7.566 | 6.436 |
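Each "/ OCMC Baseline" row is the execution time for that memory configuration divided by the OCMC baseline execution time at the same memcpy size. The sketch below recomputes a few of the MCU domain single-core ratios; the function name and the rounding to three decimals are illustrative choices.

```python
def slowdown(exec_time_us, baseline_us):
    """Execution time relative to the OCMC baseline, rounded to three decimals."""
    return round(exec_time_us / baseline_us, 3)


ocmc_baseline_us = 11594  # MCU domain, memcpy size 0
print("DDR:", slowdown(17164, ocmc_baseline_us))
print("XIP 133 MHz:", slowdown(132612, ocmc_baseline_us))
print("XIP 166 MHz:", slowdown(108390, ocmc_baseline_us))
```

A ratio close to 1 means the configuration runs nearly as fast as code placed in OCMC; larger values indicate a proportionally longer execution time.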
1.1.2.12.4. MAIN Domain Single Core Execution¶
| Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 13077 | 13312 | 16134 | 19148 | 25175 |
| OCMC | ICM/sec | 2770971 | 2740910 | 2265960 | 1918216 | 1482343 |
| DDR | DDR execution time (us) | 19370 | 19663 | 23623 | 27783 | 35981 |
| DDR | DDR / OCMC Baseline | 1.481 | 1.477 | 1.464 | 1.451 | 1.429 |
| MSMC | MSMC execution time (us) | 16317 | 16596 | 20527 | 24771 | 32974 |
| MSMC | MSMC / OCMC Baseline | 1.248 | 1.247 | 1.272 | 1.294 | 1.31 |
| XIP | XIP 133 MHz execution time (us) | 130603 | 130680 | 133710 | 137101 | 144103 |
| XIP | XIP 133 MHz / OCMC Baseline | 9.987 | 9.817 | 8.287 | 7.16 | 5.724 |
| XIP | XIP 166 MHz execution time (us) | 106578 | 106743 | 109596 | 112910 | 119876 |
| XIP | XIP 166 MHz / OCMC Baseline | 8.15 | 8.019 | 6.793 | 5.897 | 4.762 |
1.1.2.12.5. MCU Domain Multi-Core Execution¶
| Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 11667 | 11812 | 13342 | 14896 | 18045 |
| OCMC | ICM/sec | 3352361 | 3309346 | 2957352 | 2652792 | 2214353 |
| DDR | DDR execution time (us) | 17380 | 17546 | 19036 | 20700 | 24122 |
| DDR | DDR / OCMC Baseline | 1.49 | 1.485 | 1.427 | 1.39 | 1.337 |
| XIP | XIP 133 MHz execution time (us) | 131562 | 131416 | 133664 | 135317 | 139443 |
| XIP | XIP 133 MHz / OCMC Baseline | 11.276 | 11.126 | 10.018 | 9.084 | 7.728 |
| XIP | XIP 166 MHz execution time (us) | 107443 | 107608 | 109255 | 111171 | 114828 |
| XIP | XIP 166 MHz / OCMC Baseline | 9.209 | 9.11 | 8.189 | 7.463 | 6.363 |
1.1.2.12.6. MAIN Domain Multi-Core Execution¶
| Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 13095 | 13327 | 16154 | 19156 | 25152 |
| OCMC | ICM/sec | 2733867 | 2708786 | 2240621 | 1898987 | 1466960 |
| DDR | DDR execution time (us) | 19264 | 19650 | 23538 | 27783 | 36057 |
| DDR | DDR / OCMC Baseline | 1.471 | 1.474 | 1.457 | 1.45 | 1.434 |
| XIP | XIP 133 MHz execution time (us) | 130218 | 130498 | 133754 | 137025 | 144117 |
| XIP | XIP 133 MHz / OCMC Baseline | 9.944 | 9.792 | 8.28 | 7.153 | 5.73 |
| XIP | XIP 166 MHz execution time (us) | 106207 | 106548 | 109510 | 112786 | 119534 |
| XIP | XIP 166 MHz / OCMC Baseline | 8.111 | 7.995 | 6.779 | 5.888 | 4.752 |
1.1.2.12.7. Additional OCMC Baseline Details - MCU Domain¶
| Memcpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
|---|---|---|---|---|---|---|---|---|---|---|
| Start Time (us) | 371093 | 659053 | 948056 | 1239056 | 1531056 | 1824057 | 2118059 | 2413057 | 2709057 | 3006057 |
| Exec Time (us) | 11594 | 11730 | 11924 | 12164 | 13250 | 13996 | 14806 | 15595 | 16406 | 18014 |
| Task Calls | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| Inst Cache Miss | 38076 | 38113 | 38199 | 37990 | 38267 | 38083 | 38331 | 38216 | 38283 | 38745 |
| Inst Cache Acc | 1783333 | 1818260 | 1851033 | 1909183 | 2099970 | 2253931 | 2414026 | 2568612 | 2724316 | 3074076 |
| Num Instr Exec | 2239389 | 2281387 | 2323599 | 2397349 | 2642441 | 2839089 | 3044459 | 3240141 | 3440563 | 3886547 |
| ICM/sec | 3284112 | 3249190 | 3203539 | 3123150 | 2888075 | 2720991 | 2588882 | 2450529 | 2333475 | 2150827 |
| INST/sec | 193150681 | 194491645 | 194867410 | 197085580 | 199429509 | 202850028 | 205623328 | 207767938 | 209713702 | 215751471 |
1.1.2.12.8. Additional OCMC Baseline Details - MAIN Domain¶
| Memcpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
|---|---|---|---|---|---|---|---|---|---|---|
| Start Time (us) | 53084 | 342056 | 632056 | 924056 | 1217058 | 1513060 | 1810062 | 2110062 | 2411064 | 2713064 |
| Exec Time (us) | 13077 | 13312 | 13635 | 14149 | 16134 | 17609 | 19148 | 20648 | 22192 | 25175 |
| Task Calls | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| Inst Cache Miss | 36236 | 36487 | 36543 | 36285 | 36559 | 36425 | 36730 | 36664 | 36640 | 37318 |
| Inst Cache Acc | 1867441 | 1901851 | 1934369 | 1992426 | 2184540 | 2338146 | 2498247 | 2653106 | 2809352 | 3158407 |
| Num Instr Exec | 2240027 | 2281511 | 2323877 | 2398123 | 2643705 | 2839953 | 3045731 | 3241955 | 3442149 | 3888243 |
| ICM/sec | 2770971 | 2740910 | 2680088 | 2564492 | 2265960 | 2068544 | 1918216 | 1775668 | 1651045 | 1482343 |
| INST/sec | 171295174 | 171387545 | 170434690 | 169490635 | 163859241 | 161278493 | 159062617 | 157010606 | 155107651 | 154448579 |