Arrow Lake NPU Overclocking


We take a closer look at tuning the performance of the Arrow Lake NPU (Neural Processing Unit), located on the SoC Tile.

https://www.youtube.com/watch?v=YeJEB8lHsfo

Arrow Lake is Intel’s revolutionary new processor for mainstream desktop, featuring new P-cores and E-cores, disaggregated tile-based 3D Foveros packaging, an integrated NPU for AI acceleration, a next-generation uncore, DLVR power rails, and so much more.

Arrow Lake infographics 2

In this blog post series, I have a closer look at Arrow Lake and explore its performance tuning and overclocking opportunities. I will cover the Compute (P-core, E-core, Graphics, NPU), Memory Subsystem (DDR, MC), and Data Fabric (Ring, NGU, D2D).

Arrow Lake NPU: Introduction

The Neural Processing Unit (NPU) is located on the SoC tile, which it shares with many other IP blocks, including the media engine, the memory controller, and the system agent. The SoC tile is manufactured using the TSMC N6 process.

arrow lake npu3 ai boost

The AI Boost-branded integrated neural processor is based on the NPU 3 design, which is also featured in Meteor Lake. NPU 3 features two Neural Compute Engines (NCE), each with two SHAVE DSP processors. A single NCE is capable of delivering 4 INT8 TOPS at 1 GHz. On Arrow Lake, however, the NPU can boost up to 1600 MHz.
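To put those numbers together, here is a minimal sketch of the peak throughput arithmetic. It assumes that INT8 throughput scales linearly with the NPU clock, which the text implies but does not state outright; the function and constant names are made up for illustration.

```python
# Figures taken from the text: 2 NCEs, 4 INT8 TOPS per NCE at 1 GHz.
NCE_COUNT = 2
TOPS_PER_NCE_PER_GHZ = 4.0


def peak_int8_tops(npu_freq_mhz: float) -> float:
    """Estimate peak INT8 TOPS, assuming linear scaling with clock frequency."""
    return NCE_COUNT * TOPS_PER_NCE_PER_GHZ * (npu_freq_mhz / 1000.0)


if __name__ == "__main__":
    for freq_mhz in (1000, 1600):
        print(f"{freq_mhz} MHz -> {peak_int8_tops(freq_mhz):.1f} TOPS")
```

At the default 1600 MHz boost clock, this works out to roughly 12.8 peak INT8 TOPS. Treat it as a theoretical ceiling, not a measured result.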

Arrow Lake NPU: Clocking

The clocking of the NPU is similar to that of other parts of the CPU: a reference clock is multiplied by a ratio to achieve the final operating frequency.

arrow lake clocking topology

Reference Clock

The 100 MHz reference clock is derived internally from the SoC PLL. However, the SoC Tile can also take its reference clock from an external clock generator. This clock affects nearly all the IP blocks of Arrow Lake, except for those in the Compute Tile and the PCIe/DMI links. The SoC PLL can be linked to the CPU PLL when you run in synchronous mode, or work independently when you run in asynchronous mode.

You can configure the SOC BCLK frequency between 40 and 1000 MHz.

In the ASUS ROG BIOS, you can configure the SOC BCLK Frequency in the Ai Tweaker menu by first setting the Ai Overclock Tuner to anything other than Auto.

arrow lake soc bclk bios

NPU Ratio

The reference clock is multiplied by the NPU ratio to achieve the final clock frequency. Unfortunately, we cannot adjust the NPU Ratio on Arrow Lake, though this may change on future platforms. The default NPU frequency is 1600 MHz.
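The relationship between the SoC BCLK and the NPU frequency can be sketched as follows. Note that the ratio of 16 is not documented in this post; it is inferred from the 1600 MHz default frequency and the 100 MHz reference clock. The function name is hypothetical, and the 40 to 1000 MHz range is the configurable SOC BCLK range mentioned earlier, not a range in which the system will actually run.

```python
# Configurable SOC BCLK range, as stated in the text.
SOC_BCLK_MIN_MHZ = 40
SOC_BCLK_MAX_MHZ = 1000

# Assumption: inferred from 1600 MHz default / 100 MHz reference clock.
INFERRED_NPU_RATIO = 16


def npu_frequency_mhz(soc_bclk_mhz: float, ratio: int = INFERRED_NPU_RATIO) -> float:
    """Return the NPU frequency for a given SoC BCLK and (fixed) NPU ratio."""
    if not SOC_BCLK_MIN_MHZ <= soc_bclk_mhz <= SOC_BCLK_MAX_MHZ:
        raise ValueError(
            f"SOC BCLK must be between {SOC_BCLK_MIN_MHZ} and {SOC_BCLK_MAX_MHZ} MHz"
        )
    return soc_bclk_mhz * ratio


if __name__ == "__main__":
    for bclk in (100, 110, 120):
        print(f"SOC BCLK {bclk} MHz -> NPU {npu_frequency_mhz(bclk):.0f} MHz")
```

Because the ratio is fixed, every 1 MHz added to the SoC BCLK adds about 16 MHz to the NPU clock under this assumption.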

Arrow Lake NPU: Voltage

The voltage regulation for the neural processor is similar to that of other secondary devices on the Arrow Lake processor, such as the memory controller and the network-on-chip. 

arrow lake voltage topology

VccSA MBVR

The external VccSA MBVR powers several parts of the SOC dielet, including the NPU. Unlike Compute IP, the parts of the SOC dielet are not powered using DLVR. So, power delivery is identical to previous architectures. The most relevant parts powered by the VccSA voltage rail are the neural processor, the next-generation uncore, and the memory controller.

The voltage configuration of the VccSA voltage rail is rather complicated. Since multiple IP domains share the voltage rail, the VccSA voltage is set based on the highest requested voltage from the various connected IP blocks.
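The "highest request wins" behavior of a shared rail can be illustrated with a short sketch. The voltage values below are invented purely for illustration and are not measured or documented Arrow Lake values; the function name and block labels are hypothetical as well.

```python
def resolve_vccsa(requests_v: dict[str, float]) -> tuple[str, float]:
    """Return the (block, voltage) pair with the highest request on the shared rail."""
    if not requests_v:
        raise ValueError("at least one IP block must request a voltage")
    block = max(requests_v, key=requests_v.get)
    return block, requests_v[block]


if __name__ == "__main__":
    # Hypothetical per-block requests in volts, for illustration only.
    example_requests = {"npu": 0.95, "ngu": 1.05, "memory_controller": 1.10}
    block, voltage = resolve_vccsa(example_requests)
    print(f"VccSA follows {block}: {voltage:.2f} V")
```

The practical consequence is that raising the voltage requirement of one block, for example by overclocking the memory controller, raises the voltage seen by every other block on the rail, including the NPU.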

arrow lake vccsa svid mode

There’s no NPU-specific voltage available in the BIOS. However, in the ASUS ROG BIOS, you can set the VccSA voltage rail in the Ai Tweaker menu by configuring the CPU System Agent Voltage.

arrow lake vccsa pmbus mode bios

Arrow Lake NPU: Power

NPU power management is similar to that of other Uncore devices. There are multiple so-called “work points,” each defined by a certain frequency. Depending on the workload, the CPU will switch between the work points to adjust the NPU frequency.

By default, the NPU frequency idles at 733 MHz and boosts up to 1.6 GHz. There’s also a work point at 333 MHz, which gets activated when the NGU Ratio is set to anything except 26X. That should get fixed in later BIOS releases, however.

Arrow Lake NPU: Overclock

Because the NPU ratio cannot be adjusted, the only way to improve NPU performance is by overclocking the SoC base clock frequency. When the BCLK is configured in asynchronous mode, the only frequencies affected by the SoC BCLK are those of the Graphics and SoC tiles. The IP blocks on the Compute tile, including the P-cores, E-cores, and Ring, are not affected.

The NPU has surprisingly large overclocking headroom. I could easily increase the SoC BCLK to 120 MHz. That increases the NPU frequency from 1600 to 1920 MHz, and it also increases the performance in the Procyon benchmark by about 20%. Increasing the memory frequency can also help with NPU performance, but the scaling is limited. When increasing the memory frequency from DDR5-4800 to DDR5-7200, the Procyon performance improved by about 3%.
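To compare the two results, it helps to express each as a scaling efficiency: the performance gain divided by the frequency gain. The sketch below uses only the approximate figures quoted above; the function names are made up for illustration.

```python
def relative_gain(before: float, after: float) -> float:
    """Fractional increase going from `before` to `after`."""
    return (after - before) / before


def scaling_efficiency(clock_before: float, clock_after: float, perf_gain: float) -> float:
    """Performance gain per unit of clock gain (1.0 means perfectly linear scaling)."""
    return perf_gain / relative_gain(clock_before, clock_after)


if __name__ == "__main__":
    # NPU clock: 1600 -> 1920 MHz for about +20% Procyon performance.
    npu = scaling_efficiency(1600, 1920, 0.20)
    # Memory: DDR5-4800 -> DDR5-7200 for about +3% Procyon performance.
    mem = scaling_efficiency(4800, 7200, 0.03)
    print(f"NPU clock scaling efficiency: {npu:.2f}")
    print(f"Memory scaling efficiency:    {mem:.2f}")
```

With these rounded figures, the NPU clock scales almost linearly (about 1.0), while memory frequency returns only about 0.06 of its increase. In this benchmark the NPU clock is clearly the more effective lever.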

With a bit of extra tuning, we can even get the NPU up to 2 GHz!

arrow lake npu 2000 mhz

2 thoughts on “Arrow Lake NPU Overclocking”

  1. Dmitry

    How to check stability of the NPU? What’s software?

    1. Pieter

      I only used UL’s Procyon (https://benchmarks.ul.com/procyon) to evaluate performance and stability of the NPU
