1. J721S2 Datasheet

1.1. Introduction

This section provides performance numbers for the device drivers supported in the PDK.

1.1.1. Setup Details

SOC Details | Values
--- | ---
Core | R5F
Core Operating Speed | 1 GHz
DDR Speed | 4266 MT/s
Cache status | Enabled

Optimization Details | Values
--- | ---
Profile | Release
Compile Options for R5F | -g -ms -DMAKEFILE_BUILD -c -qq -pdsw225 --endian=little -mv7R5 --abi=eabi -eo.oer5f -ea.ser5f --symdebug:dwarf --embed_inline_assembly --float_support=vfpv3d16 --emit_warnings_as_errors
Linker Options for R5F | --emit_warnings_as_errors -w -q -u _c_int00 -c -mv7R5 --diag_suppress=10063 -x --zero_init=on
Code Placement | DDR
Data Placement | DDR

1.1.2. Software Performance Numbers

1.1.2.1. DSS

Display Type | Configuration | FPS | CPU Load
--- | --- | --- | ---
DP | 1080P60 BGRA32 | 60 | 1.0% (MCU2_0)

1.1.2.2. CSI-RX

Instance | Configuration | Time taken to receive one frame | ISR latency
--- | --- | --- | ---
CSI2Rx Inst 0 | 1CH 1080P30 IMX390 Sensor Raw12 | 33.3ms (MCU2_0) | 6us (MCU2_0)

1.1.2.3. CSI-Tx

Instance | Configuration | Time taken to transmit one frame | ISR latency
--- | --- | --- | ---
CSI2Tx Inst 0 | 1CH 1080P 2.5GBPS IMX390 Sensor Raw12 | 7.09ms (MCU2_0) | 13us (MCU2_0)

1.1.2.4. UDMA

1.1.2.4.1. DMA Parameters
  • Ring Order ID: 0

  • Channel Order ID: 0

  • Channel DMA Priority: 1

  • Channel Bus Priority: 4

  • Channel BUS QOS: 4

  • Channel TX FIFO depth: 128

  • Channel Fetch Word Size: 16

  • Channel Burst Size: 64 bytes for normal channel, 128 bytes for HC and UHC channels

1.1.2.4.2. Test Parameters
  • Type: TR15 Block copy

  • TR: one TR per TRPD in PBR mode

  • TR Memory: Same as buffer memory (DDR, MSMC, or OCMC, depending on the test performed)

  • Transfer Size: 1 MB read and 1 MB write

  • 1 MB means 1000x1000 bytes and 1 KB means 1000 bytes

Note: The throughput numbers reported are the combined memory throughput of the read and write operations.
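
As a minimal sketch of how a combined figure of this kind could be computed, the snippet below adds the bytes read and the bytes written and divides by the elapsed time, using the decimal convention stated above (1 MB = 1000x1000 bytes). The function name and the sample elapsed time are hypothetical and are not taken from the PDK test application.

```python
MB = 1000 * 1000  # decimal megabyte, per the convention stated above


def combined_throughput_mb_per_sec(bytes_read, bytes_written, elapsed_us):
    """Combined read + write throughput in MB/sec for one transfer."""
    total_bytes = bytes_read + bytes_written
    elapsed_sec = elapsed_us / 1_000_000
    return total_bytes / MB / elapsed_sec


# Hypothetical example: a 1 MB read plus a 1 MB write completing in 137 us.
print(round(combined_throughput_mb_per_sec(1 * MB, 1 * MB, 137)))
```

Because both directions are counted, a block copy of N bytes contributes 2N bytes to the numerator.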

1.1.2.4.3. DRU Blockcopy

DRU channel performance with TR submitted through ring

Test Description | Throughput (MCU2) | CPU Load (MCU2) | Throughput (C7x_1) | CPU Load (C7x_1)
--- | --- | --- | --- | ---
[PDK-3501] 1CH DDR 1MB to DDR 1MB | 14593 MB/sec | 5% | 14810 MB/sec | 9%
[PDK-3502] 1CH MSMC 1KB Circular to DDR 1MB | 36472 MB/sec | 7% | 33394 MB/sec | 11%
[PDK-3503] 1CH DDR 1MB to MSMC circular 1KB | 25922 MB/sec | 4% | 24585 MB/sec | 10%
[PDK-3504] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) | 55188 MB/sec | 7% | 43965 MB/sec | 11%
[PDK-3505] Multi CH DDR 1MB to DDR 1MB | 19636 MB/sec (2CH) | 12% | 14251 MB/sec (2CH) | 17%
[PDK-3506] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) | 52298 MB/sec (2CH) | 25% | 19535 MB/sec (2CH) | 16%

1.1.2.4.5. MCU NAVSS Blockcopy (Normal Channel)

MCU NAVSS normal channel performance with TR submitted through ring

Test Description | Throughput (MCU1) | CPU Load (MCU1)
--- | --- | ---
[PDK-3490] 1CH DDR 1MB to DDR 1MB | 534 MB/sec | 1%
[PDK-3491] 1CH MSMC 1KB Circular to DDR 1MB | 850 MB/sec | 1%
[PDK-3492] 1CH DDR 1MB to MSMC circular 1KB | 600 MB/sec | 1%
[PDK-3493] 1CH MSMC 1KB to MSMC circular 1KB (1MB per TR) | 833 MB/sec | 1%
[PDK-3489] 1CH OCMC 1KB to OCMC circular 1KB (1MB per TR) | 2490 MB/sec | 3%
[PDK-3495] Multi CH DDR 1MB to DDR 1MB | 1119 MB/sec (2CH) | 1%
[PDK-3497] Multi CH MSMC 1KB to MSMC circular 1KB (1 MB per TR) | 1710 MB/sec (2CH) | 2%
[PDK-12918] 1CH MCU OCMC 1MB to DDR 1MB | 1213 MB/sec | 1%
[PDK-12919] 1CH DDR 1MB to MCU OCMC 1 MB | 1046 MB/sec | 1%

1.1.2.5. OSPI

1.1.2.5.1. OSPI Memory Non Cached Test Set-up
  • Platform: J721S2 EVM.

  • OS Type: Baremetal/FreeRTOS.

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: OSPI_Flash_TestApp/OSPI_Flash_Dma_TestApp

  • System Configuration: Cache OFF, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.

1.1.2.5.2. OSPI Phy Tuning Time (DDR Octal Mode)

OSPI RCLK | Tuning Time
--- | ---
133 MHz | 7.801
166 MHz | 6.980

Note: PHY tuning time varies across silicon samples and PHY tuning point varies with voltage and temperature.

1.1.2.5.3. OSPI Read/Write Performance (DDR Octal Mode)
1.1.2.5.3.1. S28 (NOR)

OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s)
--- | --- | --- | --- | --- | --- | ---
133 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 7.467 | 51% | 266
133 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 264.658 | 1% | 266
133 MHz | INDAC | 0.485 | 100% | 8.332 | 51% | 266
166 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 8.568 | 51% | 332
166 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 329.948 | 1% | 332
166 MHz | INDAC | 0.487 | 100% | 10.414 | 51% | 332
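
The theoretical maximum read throughput listed for the S28 flash is consistent with an 8-bit (octal) bus that transfers one byte on each clock edge, for example 133 MHz x 2 edges x 1 byte = 266 MB/s. The sketch below reproduces that arithmetic and expresses the measured DAC DMA read throughput as a percentage of it. The function names are hypothetical, and the formula is an interpretation of the listed values rather than something stated in the datasheet.

```python
def octal_ddr_max_mb_per_sec(rclk_mhz):
    """Peak transfer rate of an 8-bit (octal) bus clocked on both edges."""
    bytes_per_edge = 1   # eight data lines carry one byte per clock edge
    edges_per_cycle = 2  # DDR: data on both the rising and falling edge
    return rclk_mhz * edges_per_cycle * bytes_per_edge


def efficiency_percent(measured_mb_per_sec, rclk_mhz):
    """Measured throughput as a percentage of the theoretical maximum."""
    return 100.0 * measured_mb_per_sec / octal_ddr_max_mb_per_sec(rclk_mhz)


# DAC DMA read throughput values from the non-cached S28 results above.
for rclk_mhz, dac_dma_read in ((133, 264.658), (166, 329.948)):
    print(rclk_mhz,
          octal_ddr_max_mb_per_sec(rclk_mhz),
          round(efficiency_percent(dac_dma_read, rclk_mhz), 1))
```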

1.1.2.5.3.2. W35N (NAND)

Mode | Frequency | Read Tput (MB/s) | Read Tput Theoretical Max (MB/s)
--- | --- | --- | ---
DAC | 50 MHz | 3.782 | 33
DAC DMA | 166 MHz | 55.763 | 62

Note: The theoretical max for W35N is calculated assuming a page load time of 42 us.

1.1.2.5.4. OSPI Memory Cached Test Set-up
  • Platform: J721S2 EVM.

  • OS Type: Baremetal/FreeRTOS.

  • Core : R5F_0 at 1 GHz.

  • Software/Application Used: OSPI_Flash_Cache_TestApp/OSPI_Flash_Dma_Cache_TestApp

  • System Configuration: Cache ON, Read/Write Buffer in DDR. DMA Enabled/Disabled, Interrupts ON.

1.1.2.5.5. OSPI Read/Write Performance (DDR Octal Mode)
1.1.2.5.5.1. S28 (NOR)

OSPI RCLK | Mode | Write Tput (MB/s) | Write CPU Load | Read Tput (MB/s) | Read CPU Load | Read Tput Theoretical Max (MB/s)
--- | --- | --- | --- | --- | --- | ---
133 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 81.722 | 51% | 266
133 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 264.324 | 1% | 266
133 MHz | INDAC | 0.486 | 100% | 8.332 | 51% | 266
166 MHz | DAC | NOT-SUPPORTED | NOT-SUPPORTED | 93.916 | 51% | 332
166 MHz | DAC DMA | NOT-SUPPORTED | NOT-SUPPORTED | 320.156 | 2% | 332
166 MHz | INDAC | 0.492 | 100% | 10.414 | 51% | 332

1.1.2.5.5.2. W35N (NAND)

Mode | Frequency | Read Tput (MB/s) | Read Tput Theoretical Max (MB/s)
--- | --- | --- | ---
DAC | 50 MHz | 32.633 | 33
DAC DMA | 166 MHz | 57.667 | 62

Note: The theoretical max for W35N is calculated assuming a page load time of 42 us.

1.1.2.6. MMCSD

1.1.2.6.1. Test Set-up
  • Platform: J721S2 EVM.

  • OS Type: FreeRTOS

  • Core : R5F_1 at 1 GHz.

  • Software/Application Used: MMCSD_<EMMC>_Regression_TestApp (A menu based application which outputs the benchmark numbers on UART)

  • System Configuration: Cache ON, Read/Write Buffer in DDR. ADMA enabled, Interrupts ON.

  • SD Card used: Sandisk 16GB, Class 10. FAT32 formatted with allocation size = 4K (for optimal FAT32 throughput & compatibility with various cards)

  • EMMC: EMMC on J721S2 EVM. Please refer to the EVM data sheet for details

1.1.2.6.2. SD Card Performance
1.1.2.6.2.1. DS Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 9.482 | 11.196
512 | 10.770 | 11.392
1024 | 10.964 | 11.439
2048 | 10.982 | 11.465
5120 | 11.042 | 11.480

1.1.2.6.2.2. HS Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 16.327 | 21.705
512 | 20.542 | 22.468
1024 | 21.270 | 22.647
2048 | 21.329 | 22.746
5120 | 21.210 | 22.803

1.1.2.6.2.3. SDR12 Mode (25 MHz, 4-bit) Theoretical Max: 12.5 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 8.693 | 11.201
512 | 10.720 | 11.391
1024 | 10.986 | 11.441
2048 | 10.967 | 11.465
5120 | 10.453 | 11.480

1.1.2.6.2.4. SDR25 Mode (50 MHz, 4-bit) Theoretical Max: 25 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 13.945 | 21.721
512 | 20.525 | 22.459
1024 | 21.258 | 22.647
2048 | 21.289 | 22.745
5120 | 21.585 | 22.803

1.1.2.6.2.5. SDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 20.438 | 40.997
512 | 34.332 | 43.690
1024 | 39.105 | 44.396
2048 | 40.363 | 44.769
5120 | 39.606 | 44.990

1.1.2.6.2.6. DDR50 Mode (50 MHz, 4-bit) Theoretical Max: 50 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 19.974 | 39.896
512 | 32.693 | 42.444
1024 | 37.624 | 43.147
2048 | 38.468 | 43.486
5120 | 26.301 | 43.633

1.1.2.6.3. EMMC Performance
1.1.2.6.3.1. DS Mode (25 MHz, 8-bit) Theoretical Max: 25 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 15.853 | 18.389
512 | 18.001 | 20.035
1024 | 19.298 | 20.978
2048 | 20.081 | 21.471
5120 | 20.508 | 21.802

1.1.2.6.3.2. HS-SDR Mode (50 MHz, 8-bit) Theoretical Max: 50 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 25.794 | 31.332
512 | 31.722 | 36.411
1024 | 35.704 | 39.655
2048 | 38.248 | 41.503
5120 | 39.957 | 42.696

1.1.2.6.3.3. HS-DDR Mode (50 MHz, 8-bit) Theoretical Max: 100 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 33.201 | 46.655
512 | 43.128 | 58.893
1024 | 48.972 | 67.882
2048 | 49.117 | 73.482
5120 | 57.078 | 77.309

1.1.2.6.3.4. HS-200 Mode (200 MHz, 8-bit) Theoretical Max: 200 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 31.236 | 48.384
512 | 42.985 | 61.611
1024 | 48.643 | 71.490
2048 | 52.649 | 77.705
5120 | 55.582 | 81.985

1.1.2.6.3.5. HS-400 Mode (200 MHz, 8-bit) Theoretical Max: 400 MB/s

Size of transfer (KB) | RAW Write Throughput (MB/s) | RAW Read Throughput (MB/s)
--- | --- | ---
256 | 20.637 | 65.088
512 | 43.414 | 91.018
1024 | 49.904 | 114.36
2048 | 56.637 | 131.11
5120 | 59.072 | 143.79
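
The theoretical maxima quoted in the eMMC mode headings above are consistent with a simple bus-rate calculation: clock frequency times bus width in bytes, doubled for the modes that transfer data on both clock edges (HS-DDR and HS-400). The sketch below reproduces that arithmetic. The function name and the mode list are illustrative assumptions, and the result is a raw bus ceiling that ignores command and protocol overhead.

```python
def mmc_theoretical_max_mb_per_sec(clock_mhz, bus_width_bits, ddr=False):
    """Peak bus rate: clock x bus width, doubled when both clock edges carry data."""
    transfers_per_cycle = 2 if ddr else 1
    return clock_mhz * transfers_per_cycle * bus_width_bits / 8


# (mode, clock in MHz, bus width in bits, DDR?) as listed in the eMMC headings.
EMMC_MODES = [
    ("DS", 25, 8, False),
    ("HS-SDR", 50, 8, False),
    ("HS-DDR", 50, 8, True),
    ("HS-200", 200, 8, False),
    ("HS-400", 200, 8, True),
]

for mode, clock_mhz, width_bits, ddr in EMMC_MODES:
    print(mode, mmc_theoretical_max_mb_per_sec(clock_mhz, width_bits, ddr))
```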

1.1.2.7. CPSW_2G

1.1.2.7.1. Test Setup
[Figure: CPSW_2G test setup]

Hardware Configuration | Value
--- | ---
Processing Core | Main R5F0 Core 0
Core Frequency | 1 GHz
Ethernet Interface Type | RGMII at 1Gbps
Packet buffer memory | DDR
Hardware checksum offload | Yes
Scatter-gather TX | Yes
Scatter-gather RX | No

Software Configuration | Value
--- | ---
RTOS | FreeRTOS
RTOS application | Enet LLD lwIP example
TCP/IP stack | lwIP 2.2.0
Host PC tool version | iperf v2.0.10

1.1.2.7.2. TCP Performance

Test | Bandwidth (Mbps) | CPU Load (%)
--- | --- | ---
TCP RX | 187 | 44
TCP TX | 186 | 67
TCP Bidirectional | RX=169 TX=164 | 96

Host PC commands:

iperf -c <evm_ip> -r
iperf -c <evm_ip> -d
1.1.2.7.3. UDP Performance

Test | 64B Bandwidth (Mbps) | 64B CPU Load (%) | 64B Packet Loss (%) | 256B Bandwidth (Mbps) | 256B CPU Load (%) | 256B Packet Loss (%) | 512B Bandwidth (Mbps) | 512B CPU Load (%) | 512B Packet Loss (%) | 1470B Bandwidth (Mbps) | 1470B CPU Load (%) | 1470B Packet Loss (%)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
UDP RX | 5.24 | 17 | 0.000 | 26.2 | 36 | 0.000 | 26.2 | 25 | 0.000 | 26.2 | 13 | 0.000
UDP RX | 10.5 | 29 | 0.000 | 52.4 | 62 | 0.000 | 52.4 | 35 | 0.000 | 52.4 | 20 | 0.000
UDP RX | 15.7 | 41 | 0.000 | 105 | - | - | 105 | 67 | 0.000 | 105 | 34 | 0.000
UDP RX (Max) | 37 | 96 | 0.013 | 82 | 99 | 0.016 | 150 | 99 | 0.008 | 325 | 100 | 0.037
UDP TX (Max) | 39.6 | 100 | 0.000 | 90.5 | 100 | 0.000 | 180 | 100 | 0.000 | 500 | 100 | 0.000

Host PC commands:

  • Test with datagram length of 64B:

    iperf -c <evm_ip> -u -l64 -b<bw> -r
    where <bw> is 5M, 10M, 15M, etc
    
  • Test with datagram length of 256B:

    iperf -c <evm_ip> -u -l256 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 512B:

    iperf -c <evm_ip> -u -l512 -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
    
  • Test with datagram length of 1470B (max):

    iperf -c <evm_ip> -u -b<bw> -r
    where <bw> is 25M, 50M, 100M, etc
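
One way to read the UDP results is in terms of datagrams per second, since the same bandwidth needs many more datagrams at a small datagram length than at a large one. The sketch below converts a reported bandwidth and datagram length into a datagram rate. It assumes that the reported Mbps means 10^6 bits per second and that the bandwidth counts only the UDP payload; the function name is hypothetical.

```python
def datagrams_per_second(bandwidth_mbps, datagram_bytes):
    """Datagram rate needed to carry a given UDP payload bandwidth."""
    bits_per_datagram = datagram_bytes * 8
    return bandwidth_mbps * 1_000_000 / bits_per_datagram


# Bandwidth values taken from the first UDP RX row of the results above.
for bandwidth_mbps, length in ((5.24, 64), (26.2, 256), (26.2, 512), (26.2, 1470)):
    print(length, round(datagrams_per_second(bandwidth_mbps, length)))
```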
    

1.1.2.8. SBL OSPI Boot Performance App

1.1.2.8.1. Test Set-up
  • Platform: J721S2 EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: sbl_boot_perf_cust_img and sbl_boot_perf_test appimage

  • Note that app image load time could vary depending on the actual image size

1.1.2.8.2. GP EVM Performance

SBL Boot Time Breakdown | Time (ms)
--- | ---
MCU_PORZ_OUT to MCU_RESETSTATz | 0.63
ROM: init + SBL load from OSPI | 9.113
SBL: SBL_SciClientInit: ReadSysfwImage | 0.050
Load/Start SYSFW | 4.870
Sciclient_init | 3.151
Board Config | 1.847
PM Config | 0.400
Security Config | 0.916
RM Config | 0.372
SBL: SoC Late-Init | 0.00
SBL: Board_init (pinmux) | 0.598
SBL: Board_init (PLL) | 0.981
SBL: Board_init (CLOCKS) | 1.154
SBL: OSPI init | 1.254
SBL: OSPI PHY tuning time | 7.397
SBL: App copy to MCU SRAM & Jump to App | 1.958
Misc | 0.035
TOTAL time | 34.72
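
As a quick consistency check, the individual stages above can be summed and compared with the listed total; they add up to about 34.73 ms, which agrees with the 34.72 ms total to within rounding. The sketch below does this and also ranks the largest contributors. The dictionary and function names are hypothetical; the values are copied from the breakdown above.

```python
# Stage times in ms, copied from the SBL boot time breakdown above.
SBL_STAGES_MS = {
    "MCU_PORZ_OUT to MCU_RESETSTATz": 0.63,
    "ROM: init + SBL load from OSPI": 9.113,
    "SBL: SBL_SciClientInit: ReadSysfwImage": 0.050,
    "Load/Start SYSFW": 4.870,
    "Sciclient_init": 3.151,
    "Board Config": 1.847,
    "PM Config": 0.400,
    "Security Config": 0.916,
    "RM Config": 0.372,
    "SBL: SoC Late-Init": 0.00,
    "SBL: Board_init (pinmux)": 0.598,
    "SBL: Board_init (PLL)": 0.981,
    "SBL: Board_init (CLOCKS)": 1.154,
    "SBL: OSPI init": 1.254,
    "SBL: OSPI PHY tuning time": 7.397,
    "SBL: App copy to MCU SRAM & Jump to App": 1.958,
    "Misc": 0.035,
}


def total_boot_time_ms(stages):
    """Sum of all stage times in ms."""
    return sum(stages.values())


def largest_stages(stages, count=3):
    """The `count` longest stages as (name, ms) pairs, longest first."""
    return sorted(stages.items(), key=lambda item: item[1], reverse=True)[:count]


print(round(total_boot_time_ms(SBL_STAGES_MS), 3))
for name, ms in largest_stages(SBL_STAGES_MS):
    print(f"{name}: {ms} ms")
```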

1.1.2.9. Combined SBL OSPI Boot Performance App

1.1.2.9.1. Test Set-up
  • Platform: J721S2 EVM.

  • OS Type: Baremetal

  • Core : R5F_0 at 1 GHz

  • Software/Application Used: sbl_boot_perf_cust_img_combined and sbl_combined_boot_perf_test appimage

  • Note that app image load time could vary depending on the actual image size

1.1.2.9.2. GP EVM Performance

SBL Boot Time Breakdown | Time (ms)
--- | ---
MCU_PORZ_OUT to MCU_RESETSTATz | 0.63
ROM: init + SBL and TIFS load from OSPI | 12.274
Sciclient Boot Notification | 7.144
Sciclient_init | 0.025
Board Config | 0.161
PM Config | 0.395
Security Config | 0.782
RM Config | 0.756
SBL: SoC Late-Init | 0.00
SBL: Board_init (pinmux) | 0.090
SBL: Board_init (PLL) | 0.468
SBL: Board_init (CLOCKS) | 0.460
SBL: OSPI init | 0.776
SBL: OSPI PHY Tuning time | 7.466
SBL: App copy to MCU SRAM & Jump to App | 2.898
Misc | 0.035
TOTAL time | 34.291

1.1.2.10. Early CAN Response

  • CAN response is measured from MCU_PORZ_OUT to pulling the CAN-H line out of standby.

  • The numbers below were measured on a J721S2 ES1.1 GP EVM.

Measurement | Measured Time
--- | ---
Early CAN | 35.37 ms
POST + Early CAN | 58.67 ms

1.1.2.11. OSPI Memory Configuration Benchmarking

  • These numbers were collected from the memory_benchmarking_app demo which provides a means of measuring the performance of a realistic application where the text of the application is sitting in various memory locations and the data is sitting in On-Chip-Memory RAM (referred to as OCM, OCMC or OCMRAM).

  • The application executes 10 different configurations of the same text varying by data vs. instruction cache intensity. Each test calls 16 separate functions 500 total times in random order.

  • The most instruction intensive example achieves an instruction cache miss rate (ICM/sec) of ~3-4 million per second when run entirely from OCMRAM. This is a rate that we have similarly seen in real-world customer examples.

  • More data intensive tests have more repetitive code, achieving much lower ICM rates.

  • The “Multicore” configuration is defined as the same AUTOSAR application executing simultaneously, by means of a synchronization delay, on MCU Core 0 (mcu1_0) and MAIN Core 0 (mcu2_0).

  • The memcpy size is just a knob to make the synthetic benchmark application more data or instruction centric, with no additional significance. (A small memcpy size is more instruction centric, with a higher ICM rate, and vice versa.)

  • Memory benchmarking numbers have not been updated for the 9.2 release; the current numbers are from the 9.1 release.

1.1.2.11.1. Supported Configurations

Core | SOC | Supported Memory Configurations (MEM_CONF)
--- | --- | ---
mcu1_0 | j721s2 | ocmc msmc ddr xip
mcu2_0 | j721s2 | ocmc msmc ddr xip
mcu1_0 + mcu2_0 | j721s2 | ocmc ddr xip

1.1.2.11.2. Test Set-up
  • Platform: J721S2 EVM.

  • OS Type: FreeRTOS

  • Core: MCU Domain R5_0 (MCU1_0) & Main Domain R5_0 (MCU2_0)

  • Software/Application Used: sbl_cust_img and [MEM_CONF]_memory_benchmarking_app_freertos appimage

  • Refer to the Memory Benchmarking Apps user guide for which SBL variant to use with each [MEM_CONF]_memory_benchmarking_app_freertos appimage.

1.1.2.11.3. MCU Domain Single Core Execution
  • Cache miss rate of 3M/sec is at memcpy size of 50 bytes.

Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048
--- | --- | --- | --- | --- | --- | ---
OCMC | OCMC Baseline Execution Time (us) | 3991 | 4053 | 5875 | 7085 | 10665
OCMC | ICM/sec | 1969932 | 1914631 | 1250723 | 1071136 | 721331
DDR | DDR execution time (us) | 5418 | 5520 | 6850 | 8108 | 11263
DDR | DDR / OCMC Baseline | 1.358 | 1.362 | 1.166 | 1.144 | 1.056
MSMC | MSMC execution time (us) | 4637 | 4647 | 6008 | 7308 | 10409
MSMC | MSMC / OCMC Baseline | 1.162 | 1.147 | 1.023 | 1.031 | 0.976
XIP | XIP 133 MHz execution time (us) | 8230 | 8460 | 9842 | 11210 | 14786
XIP | XIP 133 MHz / OCMC Baseline | 2.062 | 2.087 | 1.675 | 1.582 | 1.386
XIP | XIP 166 MHz execution time (us) | 5642 | 5680 | 7556 | 8765 | 12552
XIP | XIP 166 MHz / OCMC Baseline | 1.414 | 1.401 | 1.286 | 1.237 | 1.177
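
Each "/ OCMC Baseline" row is the execution time for that memory divided by the OCMC baseline execution time at the same memcpy size, so a value above 1 means slower than OCMC. The sketch below recomputes the DDR row from the MCU domain single-core execution times listed above; the variable and function names are hypothetical.

```python
# MCU domain single-core execution times in us, copied from the results above.
MEMCPY_SIZES = (0, 50, 500, 1000, 2048)
OCMC_BASELINE_US = (3991, 4053, 5875, 7085, 10665)
DDR_US = (5418, 5520, 6850, 8108, 11263)


def relative_to_baseline(exec_times_us, baseline_us):
    """Ratio of each execution time to the baseline, rounded to 3 decimals."""
    return [round(t / b, 3) for t, b in zip(exec_times_us, baseline_us)]


for size, ratio in zip(MEMCPY_SIZES, relative_to_baseline(DDR_US, OCMC_BASELINE_US)):
    print(size, ratio)
```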

1.1.2.11.4. MAIN Domain Single Core Execution
  • Cache miss rate of 3M/sec is closest at memcpy size of ~0 bytes.

  • Due to an issue with the XIP memory benchmarking application on mcu2_0, benchmarking numbers are not updated for XIP.

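Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048
--- | --- | --- | --- | --- | --- | ---
OCMC | OCMC Baseline Execution Time (us) | 3174 | 3326 | 4772 | 6060 | 9142
OCMC | ICM/sec | 2614681 | 2491581 | 1641869 | 1292079 | 874206
DDR | DDR execution time (us) | 4821 | 4977 | 6284 | 7673 | 10907
DDR | DDR / OCMC Baseline | 1.519 | 1.496 | 1.317 | 1.266 | 1.193
MSMC | MSMC execution time (us) | 3822 | 3925 | 5267 | 6556 | 9820
MSMC | MSMC / OCMC Baseline | 1.204 | 1.18 | 1.104 | 1.082 | 1.074
XIP | XIP 133 MHz execution time (us) | - | - | - | - | -
XIP | XIP 133 MHz / OCMC Baseline | - | - | - | - | -
XIP | XIP 166 MHz execution time (us) | - | - | - | - | -
XIP | XIP 166 MHz / OCMC Baseline | - | - | - | - | -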

1.1.2.11.5. MCU Domain Multi-Core Execution
  • Cache miss rate of 3M/sec is at memcpy size of 50 bytes.

  • Due to an issue with the XIP memory benchmarking application on mcu2_0, benchmarking numbers are not updated for XIP.

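Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048
--- | --- | --- | --- | --- | --- | ---
OCMC | OCMC Baseline Execution Time (us) | 3933 | 4006 | 5786 | 7050 | 10572
OCMC | ICM/sec | 1847698 | 1861707 | 1228655 | 1037730 | 671679
DDR | DDR execution time (us) | 5473 | 5526 | 6893 | 7989 | 11205
DDR | DDR / OCMC Baseline | 1.392 | 1.379 | 1.191 | 1.133 | 1.06
XIP | XIP 133 MHz execution time (us) | - | - | - | - | -
XIP | XIP 133 MHz / OCMC Baseline | - | - | - | - | -
XIP | XIP 166 MHz execution time (us) | - | - | - | - | -
XIP | XIP 166 MHz / OCMC Baseline | - | - | - | - | -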

1.1.2.11.6. MAIN Domain Multi-Core Execution
  • Cache miss rate of 3M/sec is closest at memcpy size of 0 bytes.

  • Due to an issue with the XIP memory benchmarking application on mcu2_0, benchmarking numbers are not updated for XIP.

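Memory | Memcpy Size | 0 | 50 | 500 | 1000 | 2048
--- | --- | --- | --- | --- | --- | ---
OCMC | OCMC Baseline Execution Time (us) | 3172 | 3310 | 4781 | 6082 | 9159
OCMC | ICM/sec | 2644388 | 2452265 | 1740849 | 1347254 | 896495
DDR | DDR execution time (us) | 4971 | 5112 | 6332 | 7625 | 10948
DDR | DDR / OCMC Baseline | 1.567 | 1.544 | 1.324 | 1.254 | 1.195
XIP | XIP 133 MHz execution time (us) | - | - | - | - | -
XIP | XIP 133 MHz / OCMC Baseline | - | - | - | - | -
XIP | XIP 166 MHz execution time (us) | - | - | - | - | -
XIP | XIP 166 MHz / OCMC Baseline | - | - | - | - | -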

1.1.2.11.7. Additional OCMC Baseline Details - MCU Domain
  • View ICM/sec row to see that cache miss rate of 3M/sec is at memcpy size of 50 bytes.

Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Start Time in Usec | 305275 | 584111 | 864111 | 1145111 | 1427112 | 1711106 | 1995110 | 2281111 | 2567106 | 2854101
Exec Time in Usec | 3991 | 4053 | 4212 | 4557 | 5875 | 6632 | 7085 | 8021 | 9266 | 10665
Task Calls | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500
Inst Cache Miss | 7862 | 7760 | 7796 | 8172 | 7348 | 8054 | 7589 | 7887 | 8179 | 7693
Inst Cache Acc | 1004611 | 1092896 | 1166609 | 1319969 | 1763715 | 2151058 | 2518645 | 2907033 | 3276948 | 4102653
Num Instr Exec | 1394127 | 1496444 | 1595698 | 1798513 | 2384113 | 2899421 | 3390255 | 3905453 | 4391667 | 5490555
ICM/sec | 1969932 | 1914631 | 1850902 | 1793285 | 1250723 | 1214414 | 1071136 | 983293 | 882689 | 721331
INST/sec | 349317714 | 369218850 | 378845679 | 394670397 | 405806468 | 437186519 | 478511644 | 486903503 | 473954996 | 514819971
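
The ICM/sec and INST/sec rows follow from the raw counters: each is the event count divided by the execution time, scaled from microseconds to seconds. The sketch below reproduces the first two columns of the MCU domain results above. The function name is hypothetical, and the use of truncating integer division is an assumption that happens to match the listed values for these columns.

```python
def per_second(event_count, exec_time_us):
    """Events per second from a count and an execution time in us (truncated)."""
    return event_count * 1_000_000 // exec_time_us


# (memcpy size, exec time in us, inst cache misses, instructions executed),
# copied from the first two columns of the MCU domain results above.
SAMPLES = [
    (0, 3991, 7862, 1394127),
    (50, 4053, 7760, 1496444),
]

for size, exec_us, misses, instructions in SAMPLES:
    print(size, per_second(misses, exec_us), per_second(instructions, exec_us))
```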

1.1.2.11.8. Additional OCMC Baseline Details - MAIN Domain
  • See the ICM/sec row: the cache miss rate closest to 3M/sec is at a memcpy size of 0 bytes. The mcu2_0 application is marginally less complex because mcu1_0 is responsible for the sciserver and is the boot core.

Mem Cpy Size | 0 | 50 | 100 | 200 | 500 | 750 | 1000 | 1250 | 1500 | 2048
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Start Time in Usec | 55137 | 331079 | 610078 | 890079 | 1170078 | 1452075 | 1735075 | 2020076 | 2305078 | 2591077
Exec Time in Usec | 3174 | 3326 | 3467 | 3814 | 4772 | 5505 | 6060 | 6784 | 7753 | 9142
Task Calls | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500 | 500
Inst Cache Miss | 8299 | 8287 | 8426 | 8622 | 7835 | 8276 | 7830 | 8387 | 8491 | 7992
Inst Cache Acc | 986422 | 1074914 | 1149556 | 1304496 | 1747327 | 2134404 | 2503805 | 2890003 | 3259333 | 4084005
Num Instr Exec | 1396682 | 1498666 | 1598766 | 1802446 | 2386732 | 2901636 | 3393660 | 3908624 | 4395364 | 5493164
ICM/sec | 2614681 | 2491581 | 2430343 | 2260618 | 1641869 | 1503360 | 1292079 | 1236291 | 1095188 | 874206
INST/sec | 440038437 | 450591100 | 461138159 | 472586785 | 500153394 | 527091008 | 560009900 | 576153301 | 566924287 | 600871144