1. J721E Datasheet

1.1. Introduction

This section provides performance numbers for the device drivers supported in the PDK.

1.1.1. Setup Details

| SOC Details | Values |
|---|---|
| Core | R5F |
| Core Operating Speed | 1 GHz |
| DDR Speed | 4266 MT/s |
| Cache status | Enabled |

| Optimization Details | Values |
|---|---|
| Profile | Release |
| Compile Options for R5F | -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 --endian=little -mv7R5 --abi=eabi -eo.oer5f -ea.ser5f --symdebug:dwarf --embed_inline_assembly --float_support=vfpv3d16 --emit_warnings_as_errors |
| Linker Options for R5F | --emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 --diag_suppress=10063 -x --zero_init=on |
| Code Placement | DDR |
| Data Placement | DDR |

1.1.2. Software Performance Numbers

1.1.2.1. DSS

| Display Type | Configuration | FPS | CPU Load |
|---|---|---|---|
| HDMI | 1080P60 RGB888 | 60 | 1.0% (MCU2_0) |
| DP | 1080P60 BGRA32 | 60 | 1.0% (MCU2_0) |

1.1.2.2. CSI-Rx

| Capture Type | Configuration | CPU Load |
|---|---|---|
| CSI2Rx Inst 0 | 4CH 1080P30 IMX390 Sensor Raw12 | 1.2% (MCU2_0) |

| Instance | Configuration | Time taken to receive one frame | ISR latency |
|---|---|---|---|
| CSI2Rx Inst 0 | 1CH 1080P30 IMX390 Sensor Raw12 | 33.3 ms (MCU2_0) | 9 us (MCU2_0) |

1.1.2.3. CSI-Tx

| Instance | Configuration | Time taken to transmit one frame | ISR latency |
|---|---|---|---|
| CSI2Tx Inst 0 | 1CH 1080P 2.5GBPS IMX390 Sensor Raw12 | 6.7 ms (MCU2_0) | 21 us (MCU2_0) |

1.1.2.4. CPSW_9G

1.1.2.4.1. Test Setup
[Figure: CPSW9G test setup (enet_j721e_cpsw9g_test_setup.png)]

| Hardware Configuration | Value |
|---|---|
| Processing Core | Main R5F0 Core 0 |
| Core Frequency | 1 GHz |
| Ethernet Interface Type | RGMII at 1 Gbps |
| Packet buffer memory | DDR |
| Hardware checksum offload | Yes |
| Scatter-gather TX | Yes |
| Scatter-gather RX | No |

| Software Configuration | Value |
|---|---|
| RTOS | FreeRTOS |
| RTOS application | Enet LLD lwIP example |
| TCP/IP stack | lwIP 2.2.0 |
| Host PC tool version | iperf v2.0.10 |

1.1.2.4.2. TCP Performance

| Test | Bandwidth (Mbps) | CPU Load (%) |
|---|---|---|
| TCP RX | 141 | 84 |
| TCP TX | 133 | 100 |
| TCP Bidirectional | RX=72.7 TX=71.8 | 100 |

Host PC commands:

iperf -c <evm_ip> -r
iperf -c <evm_ip> -d
1.1.2.4.3. UDP Performance

Table columns: Test, then Bandwidth (Mbps), CPU Load (%) and Packet Loss (%) for each datagram length of 64B, 256B, 512B and 1470B.

UDP RX

4.80

29

0.00

24.0

45

0.00

24.0

34

0.00

24.0

26

0.00

9.61

40

0.00

48.0

73

0.0016

48.1

49

0.00

48.1

34

0.00

14.4

52

0.01

105

96.1

81

0.0018

96.1

50

0.00

UDP RX (Max)

22.1

70

0.045

51.1

75

0.1

105

86

1.3

247

100

0.65

UDP TX (Max)

22.2

100

0.004

48.6

100

0.004

97.1

100

0.003

279

100

0.002

Host PC commands:

  • Test with datagram length of 64B:

    iperf -c <evm_ip> -u -l64 -b<bw> -r
    where <bw> is 5M, 10M, 15M, etc
    
  • Test with datagram length of 256B:

    iperf -c <evm_ip> -u -l256 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 512B:

    iperf -c <evm_ip> -u -l512 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 1470B (max):

    iperf -c <evm_ip> -u -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
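
As a rough aid for reading the UDP numbers, the sketch below converts an iperf bandwidth figure into an approximate datagram rate. It assumes the reported bandwidth counts UDP payload bits only and that 1 Mbps is 10^6 bits per second. `datagrams_per_second` is a hypothetical helper for illustration, not part of iperf or the PDK.

```python
def datagrams_per_second(bandwidth_mbps: float, datagram_len_bytes: int) -> float:
    """Approximate datagram rate implied by an iperf UDP bandwidth figure.

    Assumption: the bandwidth counts payload bits only (no Ethernet/IP/UDP headers).
    """
    bits_per_datagram = datagram_len_bytes * 8
    return bandwidth_mbps * 1e6 / bits_per_datagram


# UDP TX (Max) bandwidth values from the UDP performance table, keyed by datagram length.
udp_tx_max_mbps = {64: 22.2, 256: 48.6, 512: 97.1, 1470: 279}

for length, mbps in udp_tx_max_mbps.items():
    rate = datagrams_per_second(mbps, length)
    print(f"{length:>5} B datagrams: about {rate:,.0f} datagrams/s")
```

Read this way, the smaller datagram sizes move fewer bits but many more datagrams per second, which is consistent with the CPU load saturating at low bandwidth for 64B traffic.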
    

1.1.2.5. UDMA

1.1.2.5.1. DMA Parameters
  • Ring Order ID: 0

  • Channel Order ID: 0

  • Channel DMA Priority: 1

  • Channel Bus Priority: 4

  • Channel BUS QOS: 4

  • Channel TX FIFO depth: 128

  • Channel Fetch Word Size: 16

  • Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels

1.1.2.5.2. Test Parameters
  • Type: TR15 Block copy

  • TR: one TR per TRPD in PBR mode

  • TR Memory: Same as buffer memory (DDR, MSMC or OCMC, depending on the test performed)

  • Transfer Size: 1 MB read and 1MB write

  • 1MB means 1000x1000 bytes and 1KB means 1000 bytes

Note: The throughput numbers reported are the combined memory throughput of the read and write operations.
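
The unit conventions above can be made concrete with a minimal sketch. It assumes throughput is computed as total bytes moved (read plus write) divided by elapsed time, with 1 MB taken as 1000x1000 bytes as stated. The function name and the elapsed time used here are hypothetical and are not measured values from the PDK test application.

```python
MB = 1000 * 1000  # the test parameters define 1 MB as 1000 x 1000 bytes


def combined_throughput_mb_per_sec(bytes_read: int, bytes_written: int,
                                   elapsed_sec: float) -> float:
    """Combined read + write memory throughput in MB/sec."""
    return (bytes_read + bytes_written) / elapsed_sec / MB


# Hypothetical timing for illustration only: a 1 MB block copy
# (1 MB read and 1 MB written) that completes in 200 microseconds.
elapsed = 200e-6
print(f"{combined_throughput_mb_per_sec(1 * MB, 1 * MB, elapsed):.0f} MB/sec")
```

Because reads and writes are both counted, a copy that moves 1 MB of payload contributes 2 MB to the reported figure.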

1.1.2.5.3. DRU Blockcopy

DRU channel performance with TR submitted through ring

| Test Description | Throughput (MCU2) | CPU Load (MCU2) | Throughput (C66x_1/2) | CPU Load (C66x_1/2) | Throughput (C7x_1) | CPU Load (C7x_1) |
|---|---|---|---|---|---|---|
| [PDK-3501] 1CH DDR 1MB to DDR 1MB | 11605 MB/sec | 11% | 11963 MB/sec | 4% | 11287 MB/sec | 7% |
| [PDK-3502] 1CH MSMC 1KB Circular to DDR 1MB | 18509 MB/sec | 12% | 18558 MB/sec | 6% | 17682 MB/sec | 9% |
| [PDK-3503] 1CH DDR 1MB to MSMC circular 1KB | 21620 MB/sec | 12% | 22647 MB/sec | 5% | 20560 MB/sec | 9% |
| [PDK-3504] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) | 29086 MB/sec | 12% | 29208 MB/sec | 7% | 27413 MB/sec | 9% |
| [PDK-3505] Multi CH DDR 1MB to DDR 1MB | 12066 MB/sec | 20% | 12310 MB/sec (4CH) | 8% | 10433 MB/sec (4CH) | 15% |
| [PDK-3506] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) | 30885 MB/sec | 21% | 30931 MB/sec (4CH) | 17% | 18020 MB/sec (4CH) | 15% |

1.1.2.5.5. MCU NAVSS Blockcopy (Normal Channel)

MCU NAVSS normal channel performance with TR submitted through ring

| Test Description | Throughput (MCU1) | CPU Load (MCU1) |
|---|---|---|
| [PDK-3490] 1CH DDR 1MB to DDR 1MB | 668 MB/sec | 1% |
| [PDK-3491] 1CH MSMC 1KB Circular to DDR 1MB | 981 MB/sec | 2% |
| [PDK-3492] 1CH DDR 1MB to MSMC circular 1KB | 719 MB/sec | 1% |
| [PDK-3493] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) | 969 MB/sec | 2% |
| [PDK-3489] 1CH OCMC 1KB to OCMC circular 1KB (1MB per TR) | 2478 MB/sec | 4% |
| [PDK-3495] Multi CH DDR 1MB to DDR 1MB | 1181 MB/sec (2CH) | 2% |
| [PDK-3497] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) | 1638 MB/sec (2CH) | 3% |
| [PDK-12918] 1CH MCU OCMC 1MB to DDR 1MB | 1498 MB/sec | 3% |
| [PDK-12919] 1CH DDR 1MB to MCU OCMC 1 MB | 1232 MB/sec | 2% |

1.1.2.6. IPC

1.1.2.6.1. Test Set-up
  • Release build binaries are used for measurement

  • Ring Buffer : Uncached DDR

  • Buffer to be sent (RPMSG) – Cached DDR

  • C66x - L2 Cache 128K

  • C7x - L2 Cache 128K

  • Software/Application Used : ipc_multicore_perf_test loaded through SBL. Output is printed to UART.

  • R5F/MPU config : DDR config

    • bufferable - 1

    • cacheable - 1

    • shareable - 0

The tables below report the round-trip time, in microseconds, for different data sizes.

1.1.2.6.2. Performance - Host Core A72, Bios, 2 GHz

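| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| MCU R5F0 | 20 | 20 | 22 | 25 | 32 | 44 | 70 |
| Main R5F0 | 18 | 19 | 20 | 24 | 29 | 41 | 65 |
| C66x1 | 17 | 16 | 17 | 16 | 18 | 20 | 25 |
| C7x | 20 | 20 | 20 | 20 | 23 | 24 | 25 |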

1.1.2.6.3. Performance - Host Core MCU R5F0, 1 GHz

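| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 21 | 21 | 23 | 26 | 32 | 43 | 68 |
| Main R5F0 | 17 | 18 | 19 | 22 | 28 | 39 | 65 |
| C66x1 | 17 | 17 | 19 | 22 | 28 | 40 | 64 |
| C7x | 18 | 18 | 20 | 23 | 29 | 40 | 66 |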

1.1.2.6.4. Performance - Host Core MAIN R5F0, 1 GHz

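| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 17 | 17 | 18 | 21 | 26 | 37 | 59 |
| MCU R5F0 | 16 | 15 | 17 | 20 | 25 | 35 | 58 |
| Main R5F1 | 16 | 16 | 17 | 21 | 26 | 36 | 59 |
| C66x1 | 16 | 15 | 17 | 20 | 25 | 36 | 58 |
| C7x | 16 | 16 | 17 | 20 | 25 | 36 | 58 |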

1.1.2.6.5. Performance - Host Core C66X1, 1.35 GHz

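| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 19 | 18 | 18 | 18 | 18 | 22 | 26 |
| MCU R5F0 | 26 | 26 | 28 | 30 | 37 | 52 | 81 |
| Main R5F0 | 25 | 25 | 27 | 29 | 35 | 48 | 75 |
| C66x2 | 23 | 22 | 22 | 21 | 23 | 28 | 35 |
| C7x | 30 | 29 | 29 | 28 | 31 | 34 | 37 |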

1.1.2.6.6. Performance - Host Core C7x, 1GHz

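| Remote Core | 4 Bytes | 8 Bytes | 16 Bytes | 32 Bytes | 64 Bytes | 128 Bytes | 256 Bytes |
|---|---|---|---|---|---|---|---|
| A72 (Bios) | 21 | 21 | 21 | 21 | 24 | 23 | 25 |
| MCU R5F0 | 32 | 32 | 34 | 37 | 45 | 55 | 82 |
| Main R5F0 | 28 | 29 | 30 | 34 | 42 | 51 | 75 |
| C66x1 | 29 | 28 | 28 | 27 | 20 | 31 | 36 |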

1.1.2.7. OSPI

1.1.2.7.1. OSPI Memory Non Cached Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal/FreeRTOS

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp

  • System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.

1.1.2.7.2. OSPI Phy Tuning Time (DDR Octal Mode)

| OSPI RCLK | Tuning Time |
|---|---|
| 133 MHz | 3.512 |
| 166 MHz | 3.139 |

Note: PHY tuning time varies across silicon samples and PHY tuning point varies with voltage and temperature.

1.1.2.7.3. OSPI Read/Write Performance (DDR Octal Mode)

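| OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s) |
|---|---|---|---|---|---|---|
| 133 MHz | DAC | 0.77 | 100% | 7.186 | 51% | 266 |
| 133 MHz | DAC DMA | 1.502 | 70% | 264.925 | 2% | 266 |
| 133 MHz | INDAC | 1.510 | 75% | 8.331 | 0% | 266 |
| 166 MHz | DAC | 0.080 | 100% | 8.208 | 51% | 332 |
| 166 MHz | DAC DMA | 1.580 | 71% | 330.781 | 1% | 332 |
| 166 MHz | INDAC | 1.575 | 76% | 10.414 | 1% | 332 |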
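
The theoretical maximum column is consistent with a simple bus calculation: in DDR octal mode, 8 data lines transfer on both clock edges, so each RCLK cycle carries 2 bytes. The sketch below reproduces those figures and relates the measured DAC DMA read throughput to them. The function name is hypothetical and the calculation ignores command, address and dummy-cycle overhead.

```python
def ospi_theoretical_read_mb_per_sec(rclk_mhz: float, bus_width_bits: int = 8,
                                     ddr: bool = True) -> float:
    """Peak bus throughput in MB/s, ignoring command/address/dummy cycles."""
    transfers_per_clock = 2 if ddr else 1
    return rclk_mhz * transfers_per_clock * bus_width_bits / 8


# Measured non-cached DAC DMA read throughput (MB/s) from the table above.
measured_dac_dma_read = {133: 264.925, 166: 330.781}

for rclk_mhz, measured in measured_dac_dma_read.items():
    peak = ospi_theoretical_read_mb_per_sec(rclk_mhz)
    print(f"{rclk_mhz} MHz: peak {peak:.0f} MB/s, "
          f"DAC DMA read reaches {measured / peak:.1%} of peak")
```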

1.1.2.7.4. OSPI Memory Cached Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal/FreeRTOS

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp

  • System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.

1.1.2.7.5. OSPI Read/Write Performance (DDR Octal Mode)

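| OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s) |
|---|---|---|---|---|---|---|
| 133 MHz | DAC | 0.304 | 100% | 46.788 | 51% | 266 |
| 133 MHz | DAC DMA | 1.501 | 75% | 264.925 | 20% | 266 |
| 133 MHz | INDAC | 1.512 | 100% | 8.331 | 0% | 266 |
| 166 MHz | DAC | 0.342 | 100% | 57.503 | 51% | 332 |
| 166 MHz | DAC DMA | 1.581 | 72% | 330.572 | 2% | 332 |
| 166 MHz | INDAC | 1.575 | 76% | 10.414 | 0% | 332 |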

1.1.2.8. MMCSD

1.1.2.8.1. Test Set-up
  • Platform: J721e EVM.

  • OS Type: FreeRTOS

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (A menu based application which outputs the benchmark numbers on UART)

  • System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.

  • SD Card used: Sandisk 16GB, Class 10. FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput & compatibility with various cards)

  • EMMC: EMMC on J721E EVM. Please refer to the EVM data sheet for details.

1.1.2.8.2. SD Card Performance
1.1.2.8.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 8.749 | 10.018 |
| 512 | 5.584 | 10.876 |
| 1024 | 9.416 | 11.177 |
| 2048 | 9.617 | 11.283 |
| 5120 | 7.480 | 11.406 |
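
The theoretical maxima quoted in the mode headings can be cross-checked with a short calculation: bus clock times bus width, doubled for DDR modes, divided by 8 bits per byte. The sketch below applies it to a few of the modes listed in this section. The helper name is hypothetical, and protocol overhead (commands, CRC, busy time) is ignored.

```python
def mmcsd_theoretical_mb_per_sec(clock_mhz: float, bus_width_bits: int,
                                 ddr: bool = False) -> float:
    """Peak data-bus throughput in MB/s, ignoring command and CRC overhead."""
    edges_per_clock = 2 if ddr else 1
    return clock_mhz * edges_per_clock * bus_width_bits / 8


# (mode, clock in MHz, bus width in bits, DDR?) as given in the section headings.
modes = [
    ("SD DS", 25, 4, False),
    ("SD DDR50", 50, 4, True),
    ("eMMC HS-DDR", 50, 8, True),
    ("eMMC HS-200", 200, 8, False),
]

for name, clock_mhz, width, ddr in modes:
    peak = mmcsd_theoretical_mb_per_sec(clock_mhz, width, ddr)
    print(f"{name}: {peak:g} MB/s")

# Example: the SD DS read at 5120 KB (11.406 MB/s) relative to its peak.
print(f"DS read efficiency: {11.406 / mmcsd_theoretical_mb_per_sec(25, 4):.1%}")
```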

1.1.2.8.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 13.784 | 17.634 |
| 512 | 16.981 | 20.663 |
| 1024 | 17.856 | 21.698 |
| 2048 | 18.587 | 22.255 |
| 5120 | 7.682 | 22.511 |

1.1.2.8.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 6.732 | 10.022 |
| 512 | 9.569 | 10.917 |
| 1024 | 9.763 | 11.197 |
| 2048 | 4.646 | 11.272 |
| 5120 | 7.626 | 11.406 |

1.1.2.8.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 15.268 | 17.632 |
| 512 | 19.378 | 20.669 |
| 1024 | 20.746 | 21.699 |
| 2048 | 21.503 | 22.255 |
| 5120 | 16.631 | 22.491 |

1.1.2.8.2.5. SDR50 Mode (100 MHz, 4-bit) Theoretical Max: 50 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 24.812 | 28.552 |
| 512 | 33.850 | 37.459 |
| 1024 | 39.192 | 40.954 |
| 2048 | 25.812 | 42.779 |
| 5120 | 42.487 | 44.238 |

1.1.2.8.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 24.330 | 28.074 |
| 512 | 11.394 | 36.119 |
| 1024 | 37.642 | 39.881 |
| 2048 | 39.679 | 41.764 |
| 5120 | 22.710 | 42.646 |

1.1.2.8.3. EMMC Performance
1.1.2.8.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 15.837 | 18.427 |
| 512 | 17.996 | 20.060 |
| 1024 | 19.315 | 20.998 |
| 2048 | 20.030 | 21.494 |
| 5120 | 20.498 | 21.806 |

1.1.2.8.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 25.440 | 31.468 |
| 512 | 31.417 | 36.512 |
| 1024 | 35.476 | 39.713 |
| 2048 | 38.232 | 41.536 |
| 5120 | 38.861 | 42.691 |

1.1.2.8.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 30.613 | 46.946 |
| 512 | 42.009 | 59.158 |
| 1024 | 47.635 | 68.044 |
| 2048 | 52.767 | 73.574 |
| 5120 | 54.303 | 77.353 |

1.1.2.8.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s

| Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s) |
|---|---|---|
| 256 | 36.074 | 67.081 |
| 512 | 43.292 | 94.905 |
| 1024 | 49.768 | 119.96 |
| 2048 | 52.007 | 138.18 |
| 5120 | 52.941 | 152.06 |

1.1.2.9. CSL-FL based Optimized OSPI Example

1.1.2.9.1. CPU Mode - Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: csl_ospi_flash_app

  • System Configuration:
    • RCLK 133/166 MHz

    • Cache ON

    • Buffer and critical functions in TCMB

    • DMA Disabled

    • Interrupts OFF

  • Theoretical Max Throughput:
    • 133 MHz: 253.67 MB/s

    • 166 MHz: 316.62 MB/s

1.1.2.9.2. DAC Mode OSPI Read Performance (Dual Data Rate - Octal Mode)

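| OSPI RCLK | Size of transfer (B) | Read Time (ns) | Throughput (MB/s) |
|---|---|---|---|
| 133 MHz | 16 | 815 | 19.6 |
| 133 MHz | 32 | 1445 | 22.1 |
| 133 MHz | 64 | 2700 | 23.7 |
| 133 MHz | 128 | 5225 | 24.5 |
| 133 MHz | 256 | 10265 | 24.9 |
| 133 MHz | 512 | 20360 | 25.1 |
| 133 MHz | 1024 | 40510 | 25.3 |
| 166 MHz | 16 | 945 | 16.9 |
| 166 MHz | 32 | 2330 | 13.7 |
| 166 MHz | 64 | 4580 | 14.0 |
| 166 MHz | 128 | 9105 | 14.1 |
| 166 MHz | 256 | 18145 | 14.1 |
| 166 MHz | 512 | 36185 | 14.1 |
| 166 MHz | 1024 | 72295 | 14.2 |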
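
The throughput column in this table follows directly from the transfer size and read time. A minimal sketch of that arithmetic is shown below, assuming MB means 10^6 bytes; the helper name is hypothetical.

```python
def throughput_mb_per_sec(size_bytes: int, read_time_ns: float) -> float:
    """Throughput in MB/s (10^6 bytes per second) from a size and a time in ns."""
    # bytes per nanosecond equals GB/s, so multiply by 1000 to get MB/s
    return size_bytes / read_time_ns * 1e3


# (size in bytes, read time in ns) taken from the 133 MHz rows of the table.
samples_133mhz = [(16, 815), (32, 1445), (1024, 40510)]

for size_bytes, read_time_ns in samples_133mhz:
    mbps = throughput_mb_per_sec(size_bytes, read_time_ns)
    print(f"{size_bytes:>5} B in {read_time_ns} ns: {mbps:.1f} MB/s")
```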

1.1.2.9.3. DMA Mode - Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: udma_baremetal_ospi_flash_testapp

  • System Configuration:
    • RCLK 133/166 MHz

    • Cache ON

    • Buffer and critical functions in TCMB

    • DMA Enabled - SW Trigger mode

    • Interrupts OFF

  • Theoretical Max Throughput:
    • 133 MHz: 253.67 MB/s

    • 166 MHz: 316.62 MB/s

1.1.2.9.4. DAC DMA Mode OSPI Read Performance (Dual Data Rate - Octal Mode)

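| OSPI RCLK | Size of transfer (B) | Read Time (ns) | Throughput (MB/s) |
|---|---|---|---|
| 133 MHz | 16 | 800 | 20 |
| 133 MHz | 32 | 805 | 39.8 |
| 133 MHz | 64 | 970 | 66 |
| 133 MHz | 128 | 1315 | 97.3 |
| 133 MHz | 256 | 1955 | 130.9 |
| 133 MHz | 512 | 3120 | 164.1 |
| 133 MHz | 1024 | 5450 | 187.9 |
| 166 MHz | 16 | 675 | 23.7 |
| 166 MHz | 32 | 805 | 39.8 |
| 166 MHz | 64 | 850 | 75.3 |
| 166 MHz | 128 | 1180 | 108.5 |
| 166 MHz | 256 | 1685 | 151.9 |
| 166 MHz | 512 | 2730 | 187.5 |
| 166 MHz | 1024 | 4670 | 219.3 |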

1.1.2.10. SBL Boot Performance Numbers

1.1.2.10.1. Test Set-up
  • Platform: J721E EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Note that app image load time could vary depending on the actual image size

  • Note that RBL boot time numbers are not accounted in the below table

1.1.2.10.2. GP EVM Performance (Legacy Boot)

| Boot Modes | SBL Used | Application Used |
|---|---|---|
| MMCSD | sbl_mmcsd_img | sbl_boot_perf_test |
| eMMC Boot0 | sbl_emmc_boot0_img | sbl_boot_perf_test |
| eMMC UDA | sbl_emmc_uda_img | sbl_boot_perf_test |
| OSPI NOR | sbl_ospi_img | sbl_boot_perf_test |
| OSPI NOR Optimized | sbl_boot_perf_cust_img | sbl_boot_perf_early_can_test |

| SBL Boot Time Breakdown | MMCSD | eMMC BOOT0 | OSPI NOR Optimized | OSPI NOR | eMMC UDA |
|---|---|---|---|---|---|
| SBL: SBL_SciClientInit: ReadSysfwImage | 57.702ms | 100.490ms | 8.306ms | 8.307ms | 103.497ms |
| Load/Start SYSFW | 4.101ms | 4.101ms | 4.178ms | 4.178ms | 4.101ms |
| Sciclient_init | 3.164ms | 3.165ms | 3.165ms | 3.165ms | 3.165ms |
| Board Config | 7.096ms | 7.095ms | 2.007ms | 7.153ms | 7.096ms |
| PM Config | 1.378ms | 1.367ms | 0.107ms | 1.358ms | 1.360ms |
| Security Config | 4.229ms | 4.229ms | 6.322ms | 4.229ms | 4.229ms |
| RM Config | 1.772ms | 1.773ms | 0.760ms | 1.774ms | 1.772ms |
| SBL: Board_init (pinmux) | 4.590ms | 4.585ms | 2.878ms | 4.690ms | 4.588ms |
| SBL: Board_init (PLL) | 0.220ms | 0.216ms | 0.790ms | 0.224ms | 0.218ms |
| SBL: Board_init (CLOCKS) | 1.321ms | 1.358ms | 0.660ms | 1.283ms | 1.362ms |
| SBL: DDR initialization | 30.101ms | 30.096ms | 0.000ms | 30.254ms | 30.095ms |
| SBL: Ethernet Configuration | 153.115ms | 153.089ms | 0.000ms | 146.200ms | 153.084ms |
| SBL: EEPROM copying time | 13.202ms | 13.201ms | 0.000ms | 6.893ms | 13.201ms |
| SBL: HSM Core App Copying Time | 0.487ms | 0.487ms | 0.476ms | 0.488ms | 0.488ms |
| SBL: Boot Media Drivers init | 16.006ms | 24.374ms | 2.304ms | 2.227ms | 24.221ms |
| SBL: OSPI PHY Tuning time | 0.226ms | 0.001ms | 3.302ms | 3.292ms | 0.143ms |
| SBL: Application Image Verification | 0.001ms | 0.000ms | 0.000ms | 0.000ms | 0.001ms |
| SBL: App copy to MCU SRAM & Jump to App | 89.217ms | 2.505ms | 2.600ms | 2.615ms | 55.624ms |
| Misc | 0.001ms | 0.000ms | 0.000ms | 0.000ms | 0.000ms |
| TOTAL time | 387.929ms | 352.132ms | 37.855ms | 228.276ms | 408.245ms |
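
As a cross-check on how the breakdown is read, the TOTAL time row equals the sum of all stage rows above it, including the Board Config, PM Config, Security Config and RM Config entries. The sketch below shows this for the MMCSD column and lists the largest contributors. The dictionary keys are shortened labels chosen for illustration.

```python
# Stage times in ms for the GP EVM MMCSD boot, copied from the breakdown table.
mmcsd_stages_ms = {
    "ReadSysfwImage": 57.702,
    "Load/Start SYSFW": 4.101,
    "Sciclient_init": 3.164,
    "Board Config": 7.096,
    "PM Config": 1.378,
    "Security Config": 4.229,
    "RM Config": 1.772,
    "Board_init (pinmux)": 4.590,
    "Board_init (PLL)": 0.220,
    "Board_init (CLOCKS)": 1.321,
    "DDR initialization": 30.101,
    "Ethernet Configuration": 153.115,
    "EEPROM copying time": 13.202,
    "HSM Core App Copying Time": 0.487,
    "Boot Media Drivers init": 16.006,
    "OSPI PHY Tuning time": 0.226,
    "Application Image Verification": 0.001,
    "App copy to MCU SRAM & Jump to App": 89.217,
    "Misc": 0.001,
}

total_ms = sum(mmcsd_stages_ms.values())
print(f"Sum of stages: {total_ms:.3f} ms")

# Rank stages by their share of the total boot time.
top3 = sorted(mmcsd_stages_ms.items(), key=lambda item: item[1], reverse=True)[:3]
for stage, ms in top3:
    print(f"{stage}: {ms:.3f} ms ({ms / total_ms:.1%})")
```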

1.1.2.10.3. HS EVM Performance (Legacy Boot)

| Boot Modes | SBL Used | Application Used |
|---|---|---|
| MMCSD | sbl_mmcsd_img_hs | sbl_boot_perf_test |
| OSPI NOR | sbl_ospi_img_hs | sbl_boot_perf_test |
| OSPI NOR Optimized | sbl_boot_perf_cust_img_hs | sbl_boot_perf_hs_early_can_test |

| SBL Boot Time Breakdown | OSPI NOR Optimized | OSPI NOR | MMCSD |
|---|---|---|---|
| SBL: SBL_SciClientInit: ReadSysfwImage | 8.305ms | 8.309ms | 71.654ms |
| Load/Start SYSFW | 12.939ms | 13.416ms | 12.759ms |
| Sciclient_init | 3.165ms | 3.165ms | 3.165ms |
| Board Config | 4.210ms | 9.344ms | 9.282ms |
| PM Config | 0.105ms | 1.361ms | 1.390ms |
| Security Config | 9.866ms | 6.832ms | 6.833ms |
| RM Config | 3.052ms | 4.060ms | 4.061ms |
| SBL: Board_init (pinmux) | 2.877ms | 4.646ms | 4.523ms |
| SBL: Board_init (PLL) | 0.797ms | 0.224ms | 0.217ms |
| SBL: Board_init (CLOCKS) | 0.660ms | 1.286ms | 1.345ms |
| SBL: DDR initialization | 0.000ms | 30.182ms | 30.137ms |
| SBL: Ethernet Configuration | 0.000ms | 146.241ms | 146.278ms |
| SBL: EEPROM copying time | 0.000ms | 6.840ms | 6.839ms |
| SBL: HSM Core App Copying Time | 0.482ms | 0.492ms | 0.493ms |
| SBL: Boot Media Drivers init | 2.297ms | 2.220ms | 15.796ms |
| SBL: OSPI PHY Tuning time | 3.355ms | 3.378ms | 0.235ms |
| SBL: Application Image Verification | 50.463ms | 51.497ms | 135.413ms |
| SBL: App copy to MCU SRAM & Jump to App | 1.947ms | 3.288ms | 2.520ms |
| Misc | 0.000ms | 0.001ms | 0.000ms |
| TOTAL time | 104.520ms | 296.782ms | 452.931ms |

1.1.2.11. Early CAN Response

  • CAN response is measured from MCU_PORZ_OUT to pulling the CAN-H line out of standby.

  • Below numbers are measured on J721e ES2.0 GP EVM.

|  | Measured Time |
|---|---|
| Early CAN | 55.2 ms |
| POST + Early CAN | 82.9 ms |

1.1.2.12. Memory Configuration Benchmarking

  • These numbers were collected from the memory_benchmarking_app demo which provides a means of measuring the performance of a realistic application where the text of the application is sitting in various memory locations and the data is sitting in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).

  • The application executes 10 different configurations of the same text varying by data buffer size. Each test calls 16 separate functions 200 total times in random order.

  • More data-intensive tests have more repetitive code and therefore achieve much lower instruction cache miss (ICM) rates.

  • The memcpy size is just a knob to make the synthetic benchmark application more data-centric or more instruction-centric; it has no additional significance. A small memcpy size makes the test more instruction-centric, with a higher ICM rate, and vice versa.

1.1.2.12.1. Supported Configurations

| Core | SOC | Supported Memory Configurations (MEM_CONF) |
|---|---|---|
| mcu1_0 | j721e | ocmc msmc ddr xip |
| mcu2_0 | j721e | ocmc msmc ddr xip |
| mcu1_0 + mcu2_0 | j721e | ocmc ddr xip |

1.1.2.12.2. Test Set-up
  • Platform: J721E EVM.

  • OS Type: FreeRTOS

  • Core – MCU Domain R5_0 (MCU1_0) & Main Domain R5_0 (MCU2_0)

  • Software/Application Used: sbl_cust_img and [MEM_CONF]_memory_benchmarking_app_freertos appimage

  • Refer to the Memory Benchmarking Apps user guide to determine which SBL variant to use for testing each [MEM_CONF]_memory_benchmarking_app_freertos

1.1.2.12.3. MCU Domain Single Core Execution

|  | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 11594 | 11730 | 13250 | 14806 | 18014 |
| OCMC | ICM/sec | 3284112 | 3249190 | 2888075 | 2588882 | 2150827 |
| DDR | DDR execution time (us) | 17164 | 17296 | 18810 | 20472 | 23931 |
| DDR | DDR / OCMC Baseline | 1.48 | 1.475 | 1.42 | 1.383 | 1.328 |
| MSMC | MSMC execution time (us) | 13750 | 13888 | 15398 | 17056 | 20436 |
| MSMC | MSMC / OCMC Baseline | 1.186 | 1.184 | 1.162 | 1.152 | 1.134 |
| XIP | XIP 133 MHz execution time (us) | 132612 | 133229 | 134857 | 136657 | 140532 |
| XIP | XIP 133 MHz / OCMC Baseline | 11.438 | 11.358 | 10.178 | 9.23 | 7.801 |
| XIP | XIP 166 MHz execution time (us) | 108390 | 108500 | 110290 | 112023 | 115944 |
| XIP | XIP 166 MHz / OCMC Baseline | 9.349 | 9.25 | 8.324 | 7.566 | 6.436 |
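
The "/ OCMC Baseline" rows are the execution time for a memory configuration divided by the OCMC execution time at the same memcpy size. A minimal sketch using the MCU domain single-core DDR numbers is shown below; the variable names are chosen for illustration.

```python
memcpy_sizes = [0, 50, 500, 1000, 2048]

# Execution times in microseconds from the MCU domain single-core table.
ocmc_exec_us = [11594, 11730, 13250, 14806, 18014]
ddr_exec_us = [17164, 17296, 18810, 20472, 23931]

# Slowdown of DDR relative to the OCMC baseline, rounded as in the table.
ddr_vs_ocmc = [round(ddr / ocmc, 3) for ddr, ocmc in zip(ddr_exec_us, ocmc_exec_us)]

for size, ratio in zip(memcpy_sizes, ddr_vs_ocmc):
    print(f"memcpy size {size:>4}: DDR / OCMC = {ratio}")
```

A ratio of 1.48 means the same workload takes 48% longer when its text runs from DDR than from OCMC. The penalty shrinks as the memcpy size grows and the workload becomes more data-centric.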

1.1.2.12.4. MAIN Domain Single Core Execution

|  | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 13077 | 13312 | 16134 | 19148 | 25175 |
| OCMC | ICM/sec | 2770971 | 2740910 | 2265960 | 1918216 | 1482343 |
| DDR | DDR execution time (us) | 19370 | 19663 | 23623 | 27783 | 35981 |
| DDR | DDR / OCMC Baseline | 1.481 | 1.477 | 1.464 | 1.451 | 1.429 |
| MSMC | MSMC execution time (us) | 16317 | 16596 | 20527 | 24771 | 32974 |
| MSMC | MSMC / OCMC Baseline | 1.248 | 1.247 | 1.272 | 1.294 | 1.31 |
| XIP | XIP 133 MHz execution time (us) | 130603 | 130680 | 133710 | 137101 | 144103 |
| XIP | XIP 133 MHz / OCMC Baseline | 9.987 | 9.817 | 8.287 | 7.16 | 5.724 |
| XIP | XIP 166 MHz execution time (us) | 106578 | 106743 | 109596 | 112910 | 119876 |
| XIP | XIP 166 MHz / OCMC Baseline | 8.15 | 8.019 | 6.793 | 5.897 | 4.762 |

1.1.2.12.5. MCU Domain Multi-Core Execution

|  | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 11667 | 11812 | 13342 | 14896 | 18045 |
| OCMC | ICM/sec | 3352361 | 3309346 | 2957352 | 2652792 | 2214353 |
| DDR | DDR execution time (us) | 17380 | 17546 | 19036 | 20700 | 24122 |
| DDR | DDR / OCMC Baseline | 1.49 | 1.485 | 1.427 | 1.39 | 1.337 |
| XIP | XIP 133 MHz execution time (us) | 131562 | 131416 | 133664 | 135317 | 139443 |
| XIP | XIP 133 MHz / OCMC Baseline | 11.276 | 11.126 | 10.018 | 9.084 | 7.728 |
| XIP | XIP 166 MHz execution time (us) | 107443 | 107608 | 109255 | 111171 | 114828 |
| XIP | XIP 166 MHz / OCMC Baseline | 9.209 | 9.11 | 8.189 | 7.463 | 6.363 |

1.1.2.12.6. MAIN Domain Multi-Core Execution

|  | Memcpy Size | 0 | 50 | 500 | 1000 | 2048 |
|---|---|---|---|---|---|---|
| OCMC | OCMC Baseline Execution Time (us) | 13095 | 13327 | 16154 | 19156 | 25152 |
| OCMC | ICM/sec | 2733867 | 2708786 | 2240621 | 1898987 | 1466960 |
| DDR | DDR execution time (us) | 19264 | 19650 | 23538 | 27783 | 36057 |
| DDR | DDR / OCMC Baseline | 1.471 | 1.474 | 1.457 | 1.45 | 1.434 |
| XIP | XIP 133 MHz execution time (us) | 130218 | 130498 | 133754 | 137025 | 144117 |
| XIP | XIP 133 MHz / OCMC Baseline | 9.944 | 9.792 | 8.28 | 7.153 | 5.73 |
| XIP | XIP 166 MHz execution time (us) | 106207 | 106548 | 109510 | 112786 | 119534 |
| XIP | XIP 166 MHz / OCMC Baseline | 8.111 | 7.995 | 6.779 | 5.888 | 4.752 |

1.1.2.12.7. Additional OCMC Baseline Details - MCU Domain

| Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
|---|---|---|---|---|---|---|---|---|---|---|
| Start Time in Usec | 371093 | 659053 | 948056 | 1239056 | 1531056 | 1824057 | 2118059 | 2413057 | 2709057 | 3006057 |
| Exec Time in Usec | 11594 | 11730 | 11924 | 12164 | 13250 | 13996 | 14806 | 15595 | 16406 | 18014 |
| Task Calls | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| Inst Cache Miss | 38076 | 38113 | 38199 | 37990 | 38267 | 38083 | 38331 | 38216 | 38283 | 38745 |
| Inst Cache Acc | 1783333 | 1818260 | 1851033 | 1909183 | 2099970 | 2253931 | 2414026 | 2568612 | 2724316 | 3074076 |
| Num Instr Exec | 2239389 | 2281387 | 2323599 | 2397349 | 2642441 | 2839089 | 3044459 | 3240141 | 3440563 | 3886547 |
| ICM/sec | 3284112 | 3249190 | 3203539 | 3123150 | 2888075 | 2720991 | 2588882 | 2450529 | 2333475 | 2150827 |
| INST/sec | 193150681 | 194491645 | 194867410 | 197085580 | 199429509 | 202850028 | 205623328 | 207767938 | 209713702 | 215751471 |
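
The ICM/sec and INST/sec rows appear to be derived from the raw counters: each is the counter value divided by the execution time in seconds. The sketch below reproduces the first two columns of the MCU domain table under that assumption; `per_second` is a hypothetical helper.

```python
def per_second(count: int, exec_time_us: int) -> float:
    """Convert an event count over exec_time_us microseconds to events per second."""
    return count * 1e6 / exec_time_us


# First two columns (mem cpy size 0 and 50) of the MCU domain table.
exec_time_us = [11594, 11730]
inst_cache_miss = [38076, 38113]
num_instr_exec = [2239389, 2281387]

for t_us, icm, instr in zip(exec_time_us, inst_cache_miss, num_instr_exec):
    print(f"exec {t_us} us: ICM/sec = {per_second(icm, t_us):.0f}, "
          f"INST/sec = {per_second(instr, t_us):.0f}")
```

Since the instruction cache miss count stays nearly constant across memcpy sizes while the execution time grows, ICM/sec falls as the workload becomes more data-centric.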

1.1.2.12.8. Additional OCMC Baseline Details - MAIN Domain

| Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048 |
|---|---|---|---|---|---|---|---|---|---|---|
| Start Time in Usec | 53084 | 342056 | 632056 | 924056 | 1217058 | 1513060 | 1810062 | 2110062 | 2411064 | 2713064 |
| Exec Time in Usec | 13077 | 13312 | 13635 | 14149 | 16134 | 17609 | 19148 | 20648 | 22192 | 25175 |
| Task Calls | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| Inst Cache Miss | 36236 | 36487 | 36543 | 36285 | 36559 | 36425 | 36730 | 36664 | 36640 | 37318 |
| Inst Cache Acc | 1867441 | 1901851 | 1934369 | 1992426 | 2184540 | 2338146 | 2498247 | 2653106 | 2809352 | 3158407 |
| Num Instr Exec | 2240027 | 2281511 | 2323877 | 2398123 | 2643705 | 2839953 | 3045731 | 3241955 | 3442149 | 3888243 |
| ICM/sec | 2770971 | 2740910 | 2680088 | 2564492 | 2265960 | 2068544 | 1918216 | 1775668 | 1651045 | 1482343 |
| INST/sec | 171295174 | 171387545 | 170434690 | 169490635 | 163859241 | 161278493 | 159062617 | 157010606 | 155107651 | 154448579 |