1. J721E Datasheet

1.1. Introduction

This section lists performance numbers for the device drivers supported in the PDK.

1.1.1. Setup Details

SOC Details           Values
--------------------  ---------
Core                  R5F
Core Operating Speed  1GHz
DDR Speed             4266 MT/s
Cache status          Enabled

Optimization Details     Values
-----------------------  ------
Profile                  Release
Compile Options for R5F  -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 --endian=little -mv7R5 --abi=eabi -eo.oer5f -ea.ser5f --symdebug:dwarf --embed_inline_assembly --float_support=vfpv3d16 --emit_warnings_as_errors
Linker Options for R5F   --emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 --diag_suppress=10063 -x --zero_init=on
Code Placement           DDR
Data Placement           DDR

1.1.2. Software Performance Numbers

1.1.2.1. DSS

Display Type  Configuration   FPS  CPU Load
------------  --------------  ---  -------------
HDMI          1080P60 RGB888  60   1.0% (MCU2_0)
DP            1080P60 BGRA32  60   1.0% (MCU2_0)

1.1.2.2. CSI-Rx

Capture Type   Configuration                    CPU Load
-------------  -------------------------------  -------------
CSI2Rx Inst 0  4CH 1080P30 IMX390 Sensor Raw12  1.2% (MCU2_0)

Instance       Configuration                    Time taken to receive one frame  ISR latency
-------------  -------------------------------  -------------------------------  ------------
CSI2Rx Inst 0  1CH 1080P30 IMX390 Sensor Raw12  33.3ms (MCU2_0)                  9us (MCU2_0)

1.1.2.3. CSI-Tx

Instance       Configuration                          Time taken to transmit one frame  ISR latency
-------------  -------------------------------------  --------------------------------  -------------
CSI2Tx Inst 0  1CH 1080P 2.5GBPS IMX390 Sensor Raw12  6.7ms (MCU2_0)                    21us (MCU2_0)

1.1.2.4. CPSW_9G

1.1.2.4.1. Test Setup
[Figure: J721E CPSW9G test setup]

Hardware Configuration     Value
-------------------------  ----------------
Processing Core            Main R5F0 Core 0
Core Frequency             1 GHz
Ethernet Interface Type    RGMII at 1Gbps
Packet buffer memory       DDR
Hardware checksum offload  Yes
Scatter-gather TX          Yes
Scatter-gather RX          No

Software Configuration  Value
----------------------  ---------------------
RTOS                    FreeRTOS
RTOS application        Enet LLD lwIP example
TCP/IP stack            lwIP 2.2.0
Host PC tool version    iperf v2.0.10

1.1.2.4.2. TCP Performance

Test               Bandwidth (Mbps)  CPU Load (%)
-----------------  ----------------  ------------
TCP RX             123               89
TCP TX             105               100
TCP Bidirectional  RX=58 TX=59       100

Host PC commands:

iperf -c <evm_ip> -r
iperf -c <evm_ip> -d
1.1.2.4.3. UDP Performance

Columns: Test, followed by Bandwidth (Mbps), CPU Load (%), and Packet Loss (%) for each datagram length in turn (64B, 256B, 512B, 1470B).

UDP RX

5.24

38

0.00

26.2

58

0.00

26.2

41

0.00

26.2

28

0.00

10.5

56

0.00

52.4

95

2.2

52.4

63

0.00

52.4

40

0.00

15.7

73

0.01

105

105

105

60

0.00

UDP RX (Max)

22

90

0.21

51

98

1.24

85

100

1.08

200

100

0.92

UDP TX (Max)

13.1

100

0.08

34

100

0.13

70

100

0.09

240

100

0.12

Host PC commands:

  • Test with datagram length of 64B:

    iperf -c <evm_ip> -u -l64 -b<bw> -r
    where <bw> is 5M, 10M, 15M, etc
    
  • Test with datagram length of 256B:

    iperf -c <evm_ip> -u -l256 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 512B:

    iperf -c <evm_ip> -u -l512 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 1470B (max):

    iperf -c <evm_ip> -u -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
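
The UDP sweeps listed above can be scripted on the host. The following Python sketch only builds the iperf command lines from the flags already shown (-c, -u, -l, -b, -r); it does not run them. The address 192.0.2.10 and the names udp_iperf_command and all_commands are illustrative assumptions, and the bandwidth lists stop at the values named in the text rather than covering the full "etc" range.

```python
# Sketch: build the host-side iperf UDP command lines listed above.
# EVM_IP and the function names are placeholders, not part of the PDK.
EVM_IP = "192.0.2.10"  # hypothetical address; substitute the EVM's IP

# Datagram length in bytes -> offered bandwidths named in the text.
# None means "no -l option", which the text uses for the 1470B (max) case.
SWEEP = {
    64: ["5M", "10M", "15M"],
    256: ["25M", "50M", "100M"],
    512: ["25M", "50M", "100M"],
    None: ["25M", "50M", "100M"],
}


def udp_iperf_command(evm_ip, bandwidth, datagram_len=None):
    """Return one iperf UDP command line as a string."""
    parts = ["iperf", "-c", evm_ip, "-u"]
    if datagram_len is not None:
        parts.append(f"-l{datagram_len}")
    parts += [f"-b{bandwidth}", "-r"]
    return " ".join(parts)


def all_commands(evm_ip=EVM_IP):
    """Return every command in the sweep, in table order."""
    return [
        udp_iperf_command(evm_ip, bw, length)
        for length, bandwidths in SWEEP.items()
        for bw in bandwidths
    ]


if __name__ == "__main__":
    for line in all_commands():
        print(line)
```

Generating the strings first makes it easy to review the sweep before handing each line to a shell.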
    

1.1.2.5. UDMA

1.1.2.5.1. DMA Parameters
  • Ring Order ID: 0

  • Channel Order ID: 0

  • Channel DMA Priority: 1

  • Channel Bus Priority: 4

  • Channel BUS QOS: 4

  • Channel TX FIFO depth: 128

  • Channel Fetch Word Size: 16

  • Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels

1.1.2.5.2. Test Parameters
  • Type: TR15 Block copy

  • TR: one TR per TRPD in PBR mode

  • TR Memory: Same as buffer memory (DDR, MSMC, or OCMC, depending on the test performed)

  • Transfer Size: 1MB read and 1MB write

  • 1MB means 1000x1000 bytes and 1KB means 1000 bytes

Note: The throughput numbers reported are the combined memory throughput of the read and write operations.
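
As a rough illustration of how a combined throughput figure follows from the conventions above (1MB = 1000x1000 bytes, read and write counted together), consider the Python sketch below. The function name and the 200 us elapsed time are hypothetical and are not taken from the test application.

```python
# Sketch: combined read+write throughput under this section's conventions.
MB = 1000 * 1000  # the text defines 1MB as 1000x1000 bytes


def combined_throughput_mb_per_sec(bytes_read, bytes_written, elapsed_sec):
    """Return (read + write) traffic in MB/sec for one timed transfer."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return (bytes_read + bytes_written) / MB / elapsed_sec


if __name__ == "__main__":
    # Hypothetical timing: a 1MB block copy that takes 200 us moves
    # 1MB of reads plus 1MB of writes.
    rate = combined_throughput_mb_per_sec(1 * MB, 1 * MB, 200e-6)
    print(f"{rate:.0f} MB/sec")
```

Because reads and writes are summed, a block copy reports twice the rate at which the payload itself is copied.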

1.1.2.5.3. DRU Blockcopy

DRU channel performance with TR submitted through ring

Test Description                                                 Throughput (MCU2)  CPU Load (MCU2)  Throughput (C66x_1/2)  CPU Load (C66x_1/2)  Throughput (C7x_1)  CPU Load (C7x_1)
---------------------------------------------------------------  -----------------  ---------------  ---------------------  -------------------  ------------------  ----------------
[PDK-3501] 1CH DDR 1MB to DDR 1MB                                11447 MB/sec       10%              12080 MB/sec           3%                   11196 MB/sec        7%
[PDK-3502] 1CH MSMC 1KB Circular to DDR 1MB                      18315 MB/sec       13%              18774 MB/sec           5%                   17652 MB/sec        8%
[PDK-3503] 1CH DDR 1MB to MSMC circular 1KB                      21355 MB/sec       11%              22995 MB/sec           4%                   20360 MB/sec        9%
[PDK-3504] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR)        28571 MB/sec       14%              29704 MB/sec           6%                   27200 MB/sec        9%
[PDK-3505] Multi CH DDR 1MB to DDR 1MB                           12214 MB/sec       25%              12292 MB/sec (4CH)     7%                   10597 MB/sec (4CH)  14%
[PDK-3506] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR)  30931 MB/sec       31%              30988 MB/sec (4CH)     15%                  17962 MB/sec (4CH)  14%

1.1.2.5.5. MCU NAVSS Blockcopy (Normal Channel)

MCU NAVSS normal channel performance with TR submitted through ring

Test Description                                                 Throughput (MCU1)  CPU Load (MCU1)
---------------------------------------------------------------  -----------------  ---------------
[PDK-3490] 1CH DDR 1MB to DDR 1MB                                660 MB/sec         2%
[PDK-3491] 1CH MSMC 1KB Circular to DDR 1MB                      982 MB/sec         2%
[PDK-3492] 1CH DDR 1MB to MSMC circular 1KB                      717 MB/sec         2%
[PDK-3493] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR)        961 MB/sec         2%
[PDK-3489] 1CH OCMC 1KB to OCMC circular 1KB (1MB per TR)        2459 MB/sec        3%
[PDK-3495] Multi CH DDR 1MB to DDR 1MB                           1183 MB/sec (2CH)  3%
[PDK-3497] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR)  1624 MB/sec (2CH)  4%
[PDK-12918] 1CH MCU OCMC 1MB to DDR 1MB                          1503 MB/sec        3%
[PDK-12919] 1CH DDR 1MB to MCU OCMC 1 MB                         1234 MB/sec        2%

1.1.2.6. IPC

1.1.2.6.1. Test Set-up
  • Release build binaries are used for measurement

  • Ring Buffer : Uncached DDR

  • Buffer to be sent (RPMSG) – Cached DDR

  • C66x - L2 Cache 128K

  • C7x - L2 Cache 128K

  • Software/Application Used : ipc_multicore_perf_test loaded through SBL. Output is printed to UART.

  • R5F/MPU config : DDR config

    • bufferable - 1

    • cacheable - 1

    • shareable - 0

The tables below report round-trip time in us for different data sizes.

1.1.2.6.2. Performance - Host Core A72, Bios, 2 GHz

Remote Core  4 Bytes  8 Bytes  16 Bytes  32 Bytes  64 Bytes  128 Bytes  256 Bytes
-----------  -------  -------  --------  --------  --------  ---------  ---------
MCU R5F0     20       20       22        25        32        44         70
Main R5F0    18       19       20        24        29        41         65
C66x1        17       16       17        16        18        20         25
C7x          20       20       20        20        23        24         25

1.1.2.6.3. Performance - Host Core MCU R5F0, 1 GHz

Remote Core  4 Bytes  8 Bytes  16 Bytes  32 Bytes  64 Bytes  128 Bytes  256 Bytes
-----------  -------  -------  --------  --------  --------  ---------  ---------
A72 (Bios)   21       21       23        26        32        43         68
Main R5F0    17       18       19        22        28        39         65
C66x1        17       17       19        22        28        40         64
C7x          18       18       20        23        29        40         66

1.1.2.6.4. Performance - Host Core MAIN R5F0, 1 GHz

Remote Core  4 Bytes  8 Bytes  16 Bytes  32 Bytes  64 Bytes  128 Bytes  256 Bytes
-----------  -------  -------  --------  --------  --------  ---------  ---------
A72 (Bios)   17       17       18        21        26        37         59
MCU R5F0     16       15       17        20        25        35         58
Main R5F1    16       16       17        21        26        36         59
C66x1        16       15       17        20        25        36         58
C7x          16       16       17        20        25        36         58

1.1.2.6.5. Performance - Host Core C66X1, 1.35 GHz

Remote Core  4 Bytes  8 Bytes  16 Bytes  32 Bytes  64 Bytes  128 Bytes  256 Bytes
-----------  -------  -------  --------  --------  --------  ---------  ---------
A72 (Bios)   19       18       18        18        18        22         26
MCU R5F0     26       26       28        30        37        52         81
Main R5F0    25       25       27        29        35        48         75
C66x2        23       22       22        21        23        28         35
C7x          30       29       29        28        31        34         37

1.1.2.6.6. Performance - Host Core C7x, 1GHz

Remote Core  4 Bytes  8 Bytes  16 Bytes  32 Bytes  64 Bytes  128 Bytes  256 Bytes
-----------  -------  -------  --------  --------  --------  ---------  ---------
A72 (Bios)   21       21       21        21        24        23         25
MCU R5F0     32       32       34        37        45        55         82
Main R5F0    28       29       30        34        42        51         75
C66x1        29       28       28        27        20        31         36

1.1.2.7. OSPI

1.1.2.7.1. OSPI Memory Non Cached Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal/FreeRTOS

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp

  • System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.

1.1.2.7.2. OSPI Phy Tuning Time (DDR Octal Mode)

OSPI RCLK  Tuning Time
---------  -----------
133 MHz    3.517
166 MHz    3.115

Note: PHY tuning time varies across silicon samples and PHY tuning point varies with voltage and temperature.

1.1.2.7.3. OSPI Read/Write Performance (DDR Octal Mode)

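OSPI RCLK  Mode     Write Tput (MB/s)  Write CPU Load  Read Tput (MB/s)  Read CPU Load  Read Tput Theoretical Max (MB/s)
---------  -------  -----------------  --------------  ----------------  -------------  --------------------------------
133 MHz    DAC      0.77               100%            7.185             51%            266
133 MHz    DAC DMA  1.504              70%             264.058           2%             266
133 MHz    INDAC    1.515              75%             8.331             0%             266
166 MHz    DAC      0.081              100%            8.212             51%            332
166 MHz    DAC DMA  1.622              71%             327.371           1%             332
166 MHz    INDAC    1.625              76%             10.410            1%             332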
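
The theoretical maximum read throughput in the table above is consistent with an 8-bit bus that moves one byte on each clock edge, that is, two bytes per RCLK cycle in DDR octal mode. The Python sketch below reproduces that arithmetic and compares it with the measured DAC DMA read numbers; the function names are illustrative and are not part of any driver.

```python
# Sketch: peak OSPI read rate in DDR octal mode and measured efficiency.
def ospi_ddr_octal_peak_mb_per_sec(rclk_mhz):
    """8 data lines, two transfers per clock cycle -> 2 bytes per cycle."""
    bus_width_bits = 8
    transfers_per_cycle = 2  # double data rate
    return rclk_mhz * transfers_per_cycle * bus_width_bits / 8


def efficiency_percent(measured_mb_per_sec, rclk_mhz):
    """Measured throughput as a percentage of the theoretical peak."""
    return 100.0 * measured_mb_per_sec / ospi_ddr_octal_peak_mb_per_sec(rclk_mhz)


if __name__ == "__main__":
    # Measured DAC DMA read throughput values from the table above.
    for rclk, measured in ((133, 264.058), (166, 327.371)):
        peak = ospi_ddr_octal_peak_mb_per_sec(rclk)
        print(f"{rclk} MHz: peak {peak:.0f} MB/s, "
              f"measured {measured} MB/s ({efficiency_percent(measured, rclk):.1f}%)")
```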

1.1.2.7.4. OSPI Memory Cached Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal/FreeRTOS

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp

  • System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.

1.1.2.7.5. OSPI Read/Write Performance (DDR Octal Mode)

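OSPI RCLK  Mode     Write Tput (MB/s)  Write CPU Load  Read Tput (MB/s)  Read CPU Load  Read Tput Theoretical Max (MB/s)
---------  -------  -----------------  --------------  ----------------  -------------  --------------------------------
133 MHz    DAC      0.314              100%            46.284            51%            266
133 MHz    DAC DMA  1.550              75%             264.858           20%            266
133 MHz    INDAC    1.549              100%            8.331             0%             266
166 MHz    DAC      0.344              100%            57.446            51%            332
166 MHz    DAC DMA  1.623              72%             330.572           2%             332
166 MHz    INDAC    1.623              76%             10.414            0%             332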

1.1.2.8. MMCSD

1.1.2.8.1. Test Set-up
  • Platform: J721e EVM.

  • OS Type: FreeRTOS

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (a menu-based application that prints the benchmark numbers to UART)

  • System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.

  • SD Card used: Sandisk 16GB, Class 10. FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput & compatibility with various cards)

  • EMMC: EMMC on J721E EVM. Please refer to the EVM data sheet for details

1.1.2.8.2. SD Card Performance
1.1.2.8.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    9.946                        11.201
512                    10.484                       11.389
1024                   10.778                       11.441
2048                   11.075                       11.465
5120                   10.462                       11.475

1.1.2.8.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    16.638                       21.731
512                    20.286                       22.450
1024                   20.871                       22.649
2048                   21.686                       22.744
5120                   21.680                       22.803

1.1.2.8.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    9.037                        11.194
512                    10.780                       11.391
1024                   10.948                       11.439
2048                   11.036                       11.465
5120                   11.037                       11.480

1.1.2.8.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    13.948                       21.719
512                    20.519                       22.460
1024                   21.264                       22.645
2048                   21.273                       22.745
5120                   19.968                       22.803

1.1.2.8.2.5. SDR50 Mode (100 MHz, 4-bit) Theoretical Max: 50 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    22.783                       40.996
512                    27.901                       43.689
1024                   37.582                       44.397
2048                   40.934                       44.773
5120                   41.188                       44.992

1.1.2.8.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    24.739                       39.858
512                    32.020                       42.436
1024                   37.169                       43.141
2048                   40.220                       43.475
5120                   40.193                       43.702

1.1.2.8.3. EMMC Performance
1.1.2.8.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    15.931                       18.426
512                    17.999                       20.058
1024                   19.581                       20.994
2048                   19.943                       21.493
5120                   20.487                       21.805

1.1.2.8.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    25.399                       31.410
512                    30.382                       36.506
1024                   34.968                       39.711
2048                   37.818                       41.528
5120                   39.769                       42.711

1.1.2.8.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    32.824                       46.822
512                    42.265                       59.121
1024                   41.457                       68.050
2048                   52.341                       73.576
5120                   54.078                       77.343

1.1.2.8.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s

Size of transfer (KB)  RAW Write Throughput (MB/s)  RAW Read Throughput (MB/s)
---------------------  ---------------------------  --------------------------
256                    30.821                       48.619
512                    43.690                       61.828
1024                   47.311                       71.659
2048                   51.253                       77.801
5120                   51.320                       81.940
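
The theoretical maximum for an SD or eMMC bus mode can be estimated from the bus clock, the bus width, and whether data is transferred on one or both clock edges. The Python sketch below shows that arithmetic for a few of the modes in this section; the function name is illustrative, and the estimate ignores command and protocol overhead, so measured raw throughput will always fall below it.

```python
# Sketch: peak bus rate for an SD/eMMC mode, ignoring protocol overhead.
def mmcsd_peak_mb_per_sec(clock_mhz, bus_width_bits, ddr=False):
    """Return clock x width / 8, doubled when data moves on both edges."""
    edges = 2 if ddr else 1
    return clock_mhz * bus_width_bits * edges / 8


if __name__ == "__main__":
    examples = [
        ("SD DS (25 MHz, 4-bit)", 25, 4, False),
        ("SD DDR50 (50 MHz, 4-bit, DDR)", 50, 4, True),
        ("eMMC HS-DDR (50 MHz, 8-bit, DDR)", 50, 8, True),
        ("eMMC HS-200 (200 MHz, 8-bit)", 200, 8, False),
    ]
    for name, mhz, width, ddr in examples:
        print(f"{name}: {mmcsd_peak_mb_per_sec(mhz, width, ddr)} MB/s")
```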

1.1.2.9. CSL-FL based Optimized OSPI Example

1.1.2.9.1. CPU Mode - Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: csl_ospi_flash_app

  • System Configuration:
    • RCLK 133/166 MHz

    • Cache ON,

    • Buffers and critical functions in TCMB,

    • DMA Disabled,

    • Interrupts OFF.

  • Theoretical Max Throughput:
    • 133 MHz :- 253.67 MB/s

    • 166 MHz :- 316.62 MB/s

1.1.2.9.2. DAC Mode OSPI Read Performance (Dual Data Rate - Octal Mode)

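OSPI RCLK  Size of transfer (B)  Read Time (ns)  Throughput (MB/s)
---------  --------------------  --------------  -----------------
133 MHz    16                    815             19.6
133 MHz    32                    1445            22.1
133 MHz    64                    2700            23.7
133 MHz    128                   5225            24.5
133 MHz    256                   10265           24.9
133 MHz    512                   20360           25.1
133 MHz    1024                  40510           25.3
166 MHz    16                    945             16.9
166 MHz    32                    2330            13.7
166 MHz    64                    4580            14.0
166 MHz    128                   9105            14.1
166 MHz    256                   18145           14.1
166 MHz    512                   36185           14.1
166 MHz    1024                  72295           14.2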
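
The throughput column in the table above follows directly from the transfer size and the read time. The Python sketch below reproduces that calculation, assuming 1 MB = 10^6 bytes for the throughput column; the function name is illustrative.

```python
# Sketch: derive throughput (MB/s) from transfer size and read time.
def read_throughput_mb_per_sec(size_bytes, read_time_ns):
    """bytes / ns equals GB/s, so multiply by 1000 to get MB/s (10^6 bytes)."""
    if read_time_ns <= 0:
        raise ValueError("read_time_ns must be positive")
    return size_bytes / read_time_ns * 1000.0


if __name__ == "__main__":
    # (size in bytes, read time in ns) pairs from the 133 MHz rows above.
    for size, time_ns in ((16, 815), (64, 2700), (1024, 40510)):
        print(f"{size} B in {time_ns} ns -> "
              f"{read_throughput_mb_per_sec(size, time_ns):.1f} MB/s")
```

The theoretical maxima quoted in the test set-up (253.67 and 316.62 MB/s) appear to be 266 x 10^6 and 332 x 10^6 bytes/s divided by 2^20, so they are not in the same unit as the throughput column; treat that reading as an inference from the arithmetic rather than a documented fact.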

1.1.2.9.3. DMA Mode - Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: udma_baremetal_ospi_flash_testapp

  • System Configuration:
    • RCLK 133/166 MHz

    • Cache ON,

    • Buffers and critical functions in TCMB,

    • DMA Enabled - SW Trigger mode,

    • Interrupts OFF.

  • Theoretical Max Throughput:
    • 133 MHz :- 253.67 MB/s

    • 166 MHz :- 316.62 MB/s

1.1.2.9.4. DAC DMA Mode OSPI Read Performance (Dual Data Rate - Octal Mode)

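OSPI RCLK  Size of transfer (B)  Read Time (ns)  Throughput (MB/s)
---------  --------------------  --------------  -----------------
133 MHz    16                    800             20
133 MHz    32                    805             39.8
133 MHz    64                    970             66
133 MHz    128                   1315            97.3
133 MHz    256                   1955            130.9
133 MHz    512                   3120            164.1
133 MHz    1024                  5450            187.9
166 MHz    16                    675             23.7
166 MHz    32                    805             39.8
166 MHz    64                    850             75.3
166 MHz    128                   1180            108.5
166 MHz    256                   1685            151.9
166 MHz    512                   2730            187.5
166 MHz    1024                  4670            219.3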

1.1.2.10. SBL OSPI Boot Performance App

1.1.2.10.1. Test Set-up
  • Platform: J721e EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: sbl_boot_perf_cust_img and sbl_boot_perf_test appimage

  • Note that app image load time could vary depending on the actual image size

1.1.2.10.2. GP EVM Performance

SBL Boot Time Breakdown                  Time (ms)
---------------------------------------  ---------
MCU_PORZ_OUT to MCU_RESETSTATz           0.63
ROM: init + SBL load from OSPI           12.36
SBL: SBL_SciClientInit: ReadSysfwImage   8.270
Load/Start SYSFW                         4.058
Sciclient_init                           3.163
Board Config                             2.009
PM Config                                0.122
Security Config                          5.503
RM Config                                0.758
SBL: SoC Late-Init
SBL: Board_init (PINMUX)                 2.819
SBL: Board_init (PLL)                    1.340
SBL: Board_init (CLOCKS)                 1.038
SBL: OSPI init                           2.273
SBL: OSPI PHY tuning time                3.346
SBL: App copy to MCU SRAM & Jump to App  2.597
Misc                                     0.036
TOTAL time                               50.33
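
As a quick consistency check, the individual stage times in the breakdown above can be summed and compared with the reported total. The Python sketch below does this with the values copied from the table; the stage labels are abbreviated, and the small difference from 50.33 ms is presumably rounding in the per-stage figures.

```python
# Sketch: sum the SBL boot stages (ms) and compare with the reported total.
STAGES_MS = {
    "MCU_PORZ_OUT to MCU_RESETSTATz": 0.63,
    "ROM: init + SBL load from OSPI": 12.36,
    "SBL_SciClientInit: ReadSysfwImage": 8.270,
    "Load/Start SYSFW": 4.058,
    "Sciclient_init": 3.163,
    "Board Config": 2.009,
    "PM Config": 0.122,
    "Security Config": 5.503,
    "RM Config": 0.758,
    "Board_init (PINMUX)": 2.819,
    "Board_init (PLL)": 1.340,
    "Board_init (CLOCKS)": 1.038,
    "OSPI init": 2.273,
    "OSPI PHY tuning time": 3.346,
    "App copy to MCU SRAM & Jump to App": 2.597,
    "Misc": 0.036,
}
REPORTED_TOTAL_MS = 50.33


def total_boot_time_ms(stages=STAGES_MS):
    """Return the sum of all stage times in milliseconds."""
    return sum(stages.values())


if __name__ == "__main__":
    total = total_boot_time_ms()
    print(f"sum of stages: {total:.3f} ms, reported: {REPORTED_TOTAL_MS} ms")
    # Show the three largest contributors to the boot time.
    for name, ms in sorted(STAGES_MS.items(), key=lambda kv: kv[1], reverse=True)[:3]:
        print(f"  {name}: {ms} ms ({100 * ms / total:.1f}%)")
```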

1.1.2.11. Early CAN Response

  • CAN response is measured from MCU_PORZ_OUT to pulling the CAN-H line out of standby.

  • The numbers below were measured on a J721e ES2.0 GP EVM.

                  Measured Time
----------------  -------------
Early CAN         52.89 ms
POST + Early CAN  82.23 ms

1.1.2.12. OSPI Memory Configuration Benchmarking

  • These numbers were collected from the memory_benchmarking_app demo which provides a means of measuring the performance of a realistic application where the text of the application is sitting in various memory locations and the data is sitting in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).

  • The application executes 10 different configurations of the same text varying by data vs. instruction cache intensity. Each test calls 16 separate functions 500 total times in random order.

  • The most instruction-intensive example achieves an instruction cache miss rate (ICM/sec) of ~3-4 million per second when run entirely from OCMRAM. This rate is similar to what we have seen in real-world customer examples.

  • More data-intensive tests have more repetitive code and achieve much lower ICM rates.

  • The “Multicore” configuration is defined as the same AUTOSAR application executing simultaneously, by means of a synchronization delay, on MCU Core 0 (mcu1_0) and MAIN Core 0 (mcu2_0).

  • The memcpy size is just a knob that makes the synthetic benchmark application more data-centric or more instruction-centric; it has no additional significance. (A small memcpy size is more instruction-centric with a higher ICM rate, and vice versa.)

  • Memory benchmarking numbers have not been updated for the 9.2 release; the numbers shown are from the 9.1 release.

1.1.2.13. Supported Configurations

Core             SOC    Supported Memory Configurations (MEM_CONF)
---------------  -----  ------------------------------------------
mcu1_0           j721e  ocmc msmc ddr xip
mcu2_0           j721e  ocmc msmc ddr xip
mcu1_0 + mcu2_0  j721e  ocmc ddr xip

1.1.2.13.1. Test Set-up
  • Platform: J721e EVM.

  • OS Type: FreeRTOS

  • Core – MCU Domain R5_0 (MCU1_0) & Main Domain R5_0 (MCU2_0)

  • Software/Application Used: sbl_cust_img and [MEM_CONF]_memory_benchmarking_app_freertos appimage

  • Refer to the Memory Benchmarking Apps user guide for which SBL variant to use when testing each [MEM_CONF]_memory_benchmarking_app_freertos

1.1.2.13.2. MCU Domain Single Core Execution
  • Cache miss rate of 3M/sec is at memcpy size ~500 bytes.

Memcpy Size                          0        50       500      1000     2048
-----------------------------------  -------  -------  -------  -------  -------
OCMC
  OCMC Baseline Execution Time (us)  4603     4731     7023     8751     16966
  ICM/sec                            3403432  3337983  2124875  1783910  956029
DDR
  DDR execution time (us)            7390     7496     9229     10632    17393
  DDR / OCMC Baseline                1.605    1.584    1.314    1.215    1.025
MSMC
  MSMC execution time (us)           5954     6072     7774     9189     15816
  MSMC / OCMC Baseline               1.294    1.283    1.107    1.05     0.932
XIP
  XIP 133 MHz execution time (us)    11852    11923    14099    16171    24594
  XIP 133 MHz / OCMC Baseline        2.575    2.52     2.008    1.848    1.45
  XIP 166 MHz execution time (us)    10290    10326    12797    14698    23119
  XIP 166 MHz / OCMC Baseline        2.235    2.183    1.822    1.68     1.363
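
Each "/ OCMC Baseline" row above is the execution time in that memory divided by the OCMC execution time at the same memcpy size. The Python sketch below recomputes a few of the ratios from the MCU domain single-core numbers; the function name is illustrative, and the last digit can differ from the table depending on how the original values were rounded.

```python
# Sketch: slowdown of a memory configuration relative to the OCMC baseline.
def slowdown_vs_ocmc(exec_time_us, ocmc_baseline_us):
    """Return exec_time / baseline; values above 1.0 are slower than OCMC."""
    if ocmc_baseline_us <= 0:
        raise ValueError("ocmc_baseline_us must be positive")
    return exec_time_us / ocmc_baseline_us


if __name__ == "__main__":
    ocmc_us = 4603  # OCMC baseline at memcpy size 0
    # Execution times (us) at memcpy size 0 from the table above.
    for name, exec_us in (("DDR", 7390), ("MSMC", 5954),
                          ("XIP 133 MHz", 11852), ("XIP 166 MHz", 10290)):
        print(f"{name}: {slowdown_vs_ocmc(exec_us, ocmc_us):.3f}x OCMC")
```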

1.1.2.13.3. MAIN Domain Single Core Execution
  • Cache miss rate of 3M/sec is at memcpy size of ~0 bytes.

Memcpy Size                          0        50       500      1000     2048
-----------------------------------  -------  -------  -------  -------  -------
OCMC
  OCMC Baseline Execution Time (us)  5231     5390     8591     9974     22082
  ICM/sec                            2744599  2584601  1571644  1392320  649261
DDR
  DDR execution time (us)            7454     7516     10385    11910    24746
  DDR / OCMC Baseline                1.425    1.394    1.209    1.194    1.121
MSMC
  MSMC execution time (us)           6081     6307     9188     10605    23436
  MSMC / OCMC Baseline               1.162    1.17     1.069    1.063    1.061
XIP
  XIP 133 MHz execution time (us)    13308    13368    16347    17866    30252
  XIP 133 MHz / OCMC Baseline        2.544    2.48     1.903    1.791    1.37
  XIP 166 MHz execution time (us)    11843    12073    15119    16350    28805
  XIP 166 MHz / OCMC Baseline        2.264    2.24     1.76     1.639    1.304

1.1.2.13.4. MCU Domain Multi-Core Execution
  • Cache miss rate of 3M/sec is at memcpy size ~500 bytes.

Memcpy Size                          0        50       500      1000     2048
-----------------------------------  -------  -------  -------  -------  -------
OCMC
  OCMC Baseline Execution Time (us)  4571     4721     7034     8751     17048
  ICM/sec                            3271712  3117983  2054449  1690892  909432
DDR
  DDR execution time (us)            7260     7364     9033     10521    17226
  DDR / OCMC Baseline                1.588    1.56     1.284    1.202    1.01
XIP
  XIP 133 MHz execution time (us)    18428    18597    19559    21197    28418
  XIP 133 MHz / OCMC Baseline        4.032    3.939    2.781    2.422    1.667
  XIP 166 MHz execution time (us)    15539    15753    16763    18232    25841
  XIP 166 MHz / OCMC Baseline        3.399    3.337    2.383    2.083    1.516

1.1.2.13.5. MAIN Domain Multi-Core Execution
  • Cache miss rate of 3M/sec is at memcpy size of ~0 bytes.

Memcpy Size                          0        50       500      1000     2048
-----------------------------------  -------  -------  -------  -------  -------
OCMC
  OCMC Baseline Execution Time (us)  5314     5493     8722     10052    22205
  ICM/sec                            2783966  2742399  1669915  1429864  681783
DDR
  DDR execution time (us)            7641     7924     10711    12160    24928
  DDR / OCMC Baseline                1.438    1.443    1.228    1.21     1.123
XIP
  XIP 133 MHz execution time (us)    18652    19062    20486    21897    33199
  XIP 133 MHz / OCMC Baseline        3.51     3.47     2.349    2.178    1.495
  XIP 166 MHz execution time (us)    15597    16020    17907    19159    30759
  XIP 166 MHz / OCMC Baseline        2.935    2.916    2.053    1.906    1.385

1.1.2.13.6. Extra OCMC Baseline Details - MCU Domain
  • View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of ~500 bytes.

Mem Cpy Size        0          50         100        200        500        750        1000       1250       1500       2048
------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
Start Time in Usec  305090     585064     866063     1148063    1432062    1718057    2005058    2293059    2584056    2877056
Exec Time in Usec   4603       4731       4791       5258       7023       8075       8751       10556      12827      16966
Task Calls          500        500        500        500        500        500        500        500        500        500
Inst Cache Miss     15666      15792      15310      15909      14923      15683      15611      16085      15584      16220
Inst Cache Acc      1014446    1099854    1175010    1326169    1776019    2165775    2535337    2939693    3325001    4193387
Num Instr Exec      1393997    1496837    1596941    1797781    2383503    2899351    3390792    3905715    4393051    5491808
ICM/sec             3403432    3337983    3195575    3025675    2124875    1942167    1783910    1523777    1214937    956029
INST/sec            302845318  316389135  333321018  341913465  339385305  359052755  387474802  369999526  342484680  323694919
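
The ICM/sec and INST/sec rows above can be derived from the raw counters and the execution time in the same column. The Python sketch below shows the calculation for the first column (memcpy size 0); the function name is illustrative, and integer division is used because the tabulated values appear to be truncated rather than rounded.

```python
# Sketch: convert a raw event count over an interval (us) to events/second.
def events_per_sec(event_count, exec_time_us):
    """Return event_count scaled to one second, truncated to an integer."""
    if exec_time_us <= 0:
        raise ValueError("exec_time_us must be positive")
    return event_count * 1_000_000 // exec_time_us


if __name__ == "__main__":
    exec_time_us = 4603         # Exec Time in Usec, memcpy size 0
    inst_cache_miss = 15666     # Inst Cache Miss, memcpy size 0
    num_instr_exec = 1393997    # Num Instr Exec, memcpy size 0
    print("ICM/sec :", events_per_sec(inst_cache_miss, exec_time_us))
    print("INST/sec:", events_per_sec(num_instr_exec, exec_time_us))
```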

1.1.2.13.7. Extra OCMC Baseline Details - MAIN Domain
  • View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of ~0 bytes.

Mem Cpy Size        0          50         100        200        500        750        1000       1250       1500       2048
------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
Start Time in Usec  54087      332065     614062     897064     1181061    1467060    1755063    2044063    2336062    2631062
Exec Time in Usec   5231       5390       5529       5974       8591       9470       9974       12718      16118      22082
Task Calls          500        500        500        500        500        500        500        500        500        500
Inst Cache Miss     14357      13931      13952      14276      13502      13856      13887      13903      14034      14337
Inst Cache Acc      950981     1040093    1115527    1268512    1720681    2108002    2477505    2883088    3267323    4133421
Num Instr Exec      1394131    1496074    1596026    1799279    2383899    2899323    3391676    3906621    4393553    5492726
ICM/sec             2744599    2584601    2523421    2389688    1571644    1463146    1392320    1093175    870703     649261
INST/sec            266513286  277564749  288664496  301184968  277487952  306158711  340051734  307172590  272586735  248742233