
SkatterBencher #102: Intel Xeon 658X Overclocked to 5100 MHz


Today we undervolt and overclock the Intel Xeon 658X 24-core processor up to 5100 MHz with custom loop water cooling.

https://www.youtube.com/watch?v=ek3ou89kKdc

I do this by leveraging several of the advanced tools in Intel’s overclocking toolkit, including adaptive voltage and the Turbo Ratio Limits. In this guide, I break down the Xeon 658X tuning process into five unique overclocking strategies for beginner and advanced overclockers.

  • First, we unlock the power limits and load a memory preset,
  • Second, we overclock using ASUS’ water-cooled OC preset,
  • Third, we manually tune our CPU core overclock,
  • Fourth, we finetune the memory timings,
  • And, finally, we improve performance with an undervolt.

However, before we jump into overclocking, let us quickly review the hardware and benchmarks used in this guide.

Intel Xeon 658X: Introduction

The Intel Xeon 658X is part of Intel’s Granite Rapids-based Xeon 600 for workstation processor lineup. The processors were announced in February 2026, even though their datacenter counterparts have been available since September 2024.

[Figure: Intel Xeon 600 processors for workstation]

Granite Rapids features up to 86 Redwood Cove+ P-cores built using the Intel 3 manufacturing process. You also get up to 8 memory channels and 128 CPU PCIe 5.0 lanes. There are four variants of the Granite Rapids package: UCC, XCC, HCC, and LCC.

  • The UCC package comes with three compute dies and is only available for data center as it requires the larger LGA7529 socket.
  • The XCC package features two compute dies and is available in 64- and 86-core configurations. All SKUs are unlocked for overclocking.
  • The HCC package features a single compute die and is available in 24-, 28-, 32-, and 48-core configurations. All SKUs are unlocked for overclocking.
  • The LCC package also features a single compute die, but, unfortunately, doesn’t have a single unlocked part.

[Figure: Granite Rapids die configurations]

The Xeon 658X we’re overclocking today features the HCC package with 24 cores and 250W TDP. It has a base frequency of 4.3 GHz for SSE workloads and 3.5 GHz for AMX workloads. It also has a maximum frequency of 4.9 GHz when up to 2 cores are active.

Intel Xeon 658X: Platform Overview

The system we’re overclocking today consists of the following hardware.

Item: SKU
CPU: Intel Xeon 658X
Motherboard: ASUS Pro WS W890E-Sage SE
CPU Cooling: Bitspower Summit Pro X4677, 2x EK-Quantum Surface S360
Memory: G.SKILL R-DIMM DDR5-6400 128GB F5-6400R3239G16GQ4-G5
Power Supply: XPG Fusion 1600W Titanium
Graphics Card: GALAX GeForce RTX 4090 HOF
Storage: Corsair MP700 Elite 2TB PCIe 5.0 NVME
Chassis: Open Benchtable V2
Telemetry: BENCHLAB
Monitor: ASUS VS228
[Figure: Intel Xeon 658X overclock system]

Benchmark Software

We use Windows 11 and the following benchmark applications and games to measure performance and ensure system stability.

Benchmark: Link
3DMark CPU Profile: https://www.3dmark.com/
7-Zip 25.01: https://www.7-zip.org/
AI-Benchmark: https://ai-benchmark.com/
AIDA64: https://www.aida64.com/
Blender 4.5.0: https://opendata.blender.org/
Cinebench 2026.1: https://www.maxon.net/en/cinebench/
Corona 10 Benchmark: https://corona-renderer.com/benchmark
CPU-Z: https://www.cpuid.com/softwares/cpu-z.html
Geekbench 6: https://www.geekbench.com/
Geekbench AI: https://www.geekbench.com/ai/
IndigoBench: https://www.indigorenderer.com/indigobench
LocalScore 14B: https://www.localscore.ai/
OCCT: https://www.ocbase.com/
PugetBench for Creators: https://www.pugetsystems.com/pugetbench/creators/
Pov-Ray 2.01: https://www.povray.org/
PYPrime 2.0 32B: https://github.com/mbntr/PYPrime-2.x
V-Ray 6: https://www.chaosgroup.com/vray/benchmark
Y-Cruncher 25B: http://www.numberworld.org/y-cruncher/

Game: Link
Counter Strike 2: https://store.steampowered.com/app/730/CounterStrike_2/
Shadow of the Tomb Raider: https://store.steampowered.com/app/750920/Shadow_of_the_Tomb_Raider_Definitive_Edition/
Homeworld 3: https://store.steampowered.com/app/1840080/Homeworld_3/

Intel Xeon 658X: Stock Performance

Before starting overclocking, we must check the system performance at default settings.

In the past, there was some confusion about what constitutes default settings for Intel processors. For new platforms, Intel spends extra resources clarifying to industry and media partners what constitutes default settings. This default configuration is available in the BIOS.

[Figure: Xeon 658X default settings in the BIOS]

In the ASUS BIOS we can easily enforce the default settings by switching the Performance Preferences option to Intel Default Settings.

The Intel Default Turbo Boost 2.0 and Turbo Ratio Limit parameters for the Xeon 658X are as follows:

[Figure: Xeon 658X default Turbo Boost parameters]

We can see that the operating frequency of the Xeon 658X is dynamic: it depends on the type of workload and the number of active cores. The biggest difference in operating frequency between SSE and AMX workloads is 900 MHz, which occurs when 3 or 4 cores are active. We have a total of 24 P-cores, two of which can boost to 4.9 GHz, two to 4.8 GHz, and the remaining 20 to 4.7 GHz.

Here is the benchmark performance at stock.

When running the OCCT CPU AVX512 Stability Test, the average CPU core effective clock is 3949 MHz with 0.931 volts. The average CPU temperature is 63 degrees Celsius. The average system power is 338.1 watts.

When running the OCCT CPU SSE Stability Test, the average CPU core effective clock is 4193 MHz with 0.959 volts. The average CPU temperature is 68 degrees Celsius. The average system power is 339.0 watts.

Of course, we can increase the maximum power consumption limit using Turbo Boost 2.0 adjustments. That’s what we’ll do in our first overclocking strategy.

However, before we get going, make sure to locate the Clear CMOS button. Pressing the Clear CMOS button resets all your BIOS settings to default, which is helpful if you want to start your BIOS configuration from scratch. The Clear CMOS button is located on the back I/O of the motherboard.

OC Strategy #1: OC Profile + MCE + Memory Preset

In our first overclocking strategy, we leverage Intel Turbo Boost 2.0 to increase the power limits and use Intel Extreme Memory Profile to boost memory performance.

ASUS Advanced OC Profile

As I mentioned earlier, Intel has clarified the rules of engagement by defining what constitutes default settings. In doing so, it created stricter rules for default BIOS settings but also opened up new avenues for innovation. One such innovation is the ASUS Advanced OC Profile.

In essence, the ASUS Advanced OC Profile sets up the BIOS so it’s ready for overclocking. Not only does it reconfigure some of the parameters included in the definition of Intel’s Default Settings, but it also configures a number of other settings related to power-saving, performance limiters, and so on.

By modifying some parameters, it’s possible to improve the system performance. Most importantly for our guide, however, you’re required to switch to the Advanced OC Profile for any kind of overclocking.

Intel Turbo Boost 2.0

Intel Turbo Boost 2.0 Technology allows the processor cores to run faster than the base operating frequency if the processor is operating below rated power, temperature, and current specification limits. The ultimate advantage is opportunistic performance improvements in both multi-threaded and single-threaded workloads.

The turbo boost algorithm works according to an EWMA formula. This stands for Exponentially Weighted Moving Average. There are three main parameters to consider: PL1, PL2, and Tau.

  • Power Limit 1, or PL1, is the threshold that the average power will not exceed. Historically, this has always been set equal to Intel’s advertised TDP. Very importantly, PL1 should not be set higher than the thermal solution cooling limits.
  • Power Limit 2, or PL2, is the maximum power the processor is allowed to use for a limited amount of time.
  • Tau is a weighting constant used in the algorithm to calculate the moving average power consumption. Tau, in seconds, is the time window for calculating the average power consumption. If the average power consumed is higher than PL1, the CPU will reduce the CPU frequency.
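
The interaction between PL1, PL2, and Tau can be sketched in a few lines of Python. This is a minimal illustration, not Intel’s actual algorithm: the update rule (a textbook exponentially weighted moving average), the one-second time step, and the function name simulate_turbo are all assumptions, and the 300 W PL2 and 10-second Tau in the usage note are hypothetical values picked for readability.

```python
import math


def simulate_turbo(demand_w, pl1_w, pl2_w, tau_s, dt_s=1.0):
    """Return the power granted at each time step under an EWMA budget.

    Assumption: the running average is updated as
        avg = alpha * avg + (1 - alpha) * power, with alpha = exp(-dt / tau)
    which is one common EWMA form, not Intel's documented implementation.
    """
    alpha = math.exp(-dt_s / tau_s)
    avg = 0.0
    granted = []
    for want in demand_w:
        # Allow bursts up to PL2 while the average is below PL1,
        # otherwise hold the processor at PL1.
        cap = pl2_w if avg < pl1_w else pl1_w
        power = min(want, cap)
        avg = alpha * avg + (1.0 - alpha) * power
        granted.append(power)
    return granted
```

For example, simulate_turbo([400] * 30, pl1_w=250, pl2_w=300, tau_s=10) grants the hypothetical 300 W PL2 at first and falls back to the 250 W PL1 once the moving average catches up. A larger Tau stretches the time spent at PL2.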

Turbo Boost 2.0 Technology has evolved over the past years to incorporate a lot of power, thermal, and electrical performance limiters, including PL3, PL4, IccMax, TCC_Offset, VR_TDC, RATL, Pmax, and many more.

ASUS motherboards offer an easy way to unleash the Turbo Boost power limits: the ASUS MultiCore Enhancement option. Set the option to Enabled – Remove All Limits and enjoy maximum performance.

Enabling ASUS MultiCore Enhancement adjusts the following parameters:

Intel Extreme Memory Profile 3.0

Intel Extreme Memory Profile 3.0 is the new XMP standard for DDR5 memory and is the successor to XMP 2.0 for DDR4 memory. It was introduced together with the Alder Lake processors in 2021. It is largely based on the XMP 2.0 standard but has additional functionality.

XMP allows memory vendors such as G.SKILL to program higher performance settings onto the memory sticks. If the motherboard supports XMP, you can enable higher performance with a single BIOS setting. So, it saves you lots of manual configuration.

BIOS Settings & Benchmark Results

Upon entering the BIOS

  • Go to the Ai Tweaker menu
  • Set Performance Preferences to ASUS Advanced OC Profile
  • Set Ai Overclock Tuner to XMP
  • Set ASUS MultiCore Enhancement to Enabled – Remove All limits

Then save and exit the BIOS.

We re-ran the benchmarks and checked the performance increase compared to the default operation.

While you’d expect that unlocking the power limits would make this chip fly, we’re still performance-limited by the default Turbo Ratios. So overall, the performance improvement is modest, although slightly faster memory gives us a nice little bump in most workloads. The Geomean performance speedup is +4.21%, and we get a maximum benchmark improvement of +12.01% in PYPrime.
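
For readers unfamiliar with the Geomean figure, here is a minimal sketch of how a geometric-mean speedup can be computed from per-benchmark scores. The function name and the dictionary layout are assumptions for illustration, and the sketch assumes every score is higher-is-better. A time-based result such as PYPrime would need its ratio inverted first.

```python
import math


def geomean_speedup(baseline, tuned):
    """Geometric mean of tuned/baseline score ratios, minus 1.

    Both arguments map a benchmark name to a higher-is-better score
    (hypothetical layout, not the guide's actual data format).
    """
    ratios = [tuned[name] / baseline[name] for name in baseline]
    # Averaging the logarithms avoids overflow with many benchmarks.
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return math.exp(log_mean) - 1.0
```

The geometric mean is used rather than the arithmetic mean so that a single benchmark with a large gain does not dominate the summary.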

When running the OCCT CPU AVX512 Stability Test, the average CPU core effective clock is 4001 MHz with 0.942 volts. The average CPU temperature is 68 degrees Celsius. The average system power is 412.6 watts.

When running the OCCT CPU SSE Stability Test, the average CPU core effective clock is 4286 MHz with 0.975 volts. The average CPU temperature is 74 degrees Celsius. The average system power is 409.4 watts.

OC Strategy #2: Water-Cooled OC Preset

In our second overclocking strategy, we use ASUS’ water-cooled OC preset to overclock the CPU cores. The preset changes the Turbo Ratio Limits and Per Core Ratio Limits, and adjusts the Turbo Boost 2.0 power limits.

ASUS Water-Cooled OC Preset

The ASUS water-cooled OC preset is an excellent addition to the ASUS Pro WS W890 motherboards, giving Xeon customers an easy path to additional performance. We can enable the preset with a single button click.

The preset drastically improves the all-core performance by changing the Turbo Ratio Limit and Per Core Ratio Limit configuration. It also sets a per core adaptive target voltage and a -20mV per core adaptive voltage offset. Lastly, it also manages the temperature and power consumption by adjusting the Turbo Boost 2.0 power limit to 374W.

To better understand what the preset does in terms of CPU frequency, let’s briefly talk about the relevant Intel overclocking technologies and see how the water-cooled preset uses them.

Intel Turbo Ratio Limit

We all know the Turbo Boost 2.0 technology from its impact on the power limits, but a second significant aspect of Turbo Boost 2.0 is configuring the CPU frequency based on the number of active cores.

Intel Turbo Ratio Limits allows us to configure the overclock for different scenarios ranging from 1 active core to all active cores. That enables us to run some cores significantly faster than others when the conditions are right. Intel provides eight (8) registers to configure the Turbo Ratio Limits.

On mainstream platforms where the top SKU has no more than 8 P-cores, these registers are configured from 1-active P-core to 8-active P-cores. However, on platforms with core counts beyond eight cores, we can configure each register by target Turbo Boost Ratio and the number of active cores. Well … we used to but on Granite Rapids it appears the Turbo Ratio Limit registers are configured for a specific active core count.

Enabling the water-cooled OC preset adjusts the Turbo Ratio Limits as follows:

By Core Usage is not the same as configuring each core individually. When using By Core Usage, we determine an overclock according to the actual usage. For example, if a workload uses four cores, the CPU determines which cores should execute this workload and applies our set frequency to those cores.

Intel Per Core Ratio Limit

Intel Per Core Ratio Limit allows you to set a maximum CPU Ratio for every individual P-core. It is an extension of the Intel Turbo Boost Max 3.0 technology introduced in 2016. It acts independently from the Turbo Ratio Limit, meaning that when you set a Per Core Ratio Limit, the core ratio will be restricted even if the Turbo Ratio Limit allows for a higher boost frequency.
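
The way these two limits combine can be sketched as follows. This is an illustrative model only: the function name, the table layout, and the EXAMPLE_TRL rows are hypothetical, with only the 49X/48X/47X per-core limits taken from this guide.

```python
def effective_ratio(active_cores, turbo_ratio_limits, per_core_limit):
    """Return the highest ratio a core may run at.

    turbo_ratio_limits is a list of (max_active_cores, ratio) pairs sorted
    by ascending core count, one pair per Turbo Ratio Limit register.
    """
    for max_active, ratio in turbo_ratio_limits:
        if active_cores <= max_active:
            # The stricter of the two limits wins.
            return min(ratio, per_core_limit)
    raise ValueError("active core count exceeds the last register")


# Hypothetical table for illustration; not the actual default registers.
EXAMPLE_TRL = [(2, 49), (4, 48), (24, 47)]
```

With this table, a core limited to 47X stays at 47X even when it is the only active core, while a favored core with a 49X limit drops to 47X once all 24 cores are active.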

The Per Core Ratio Limit plays an important role in the Granite Rapids overclocking process. It’s not only the ratio used in the V/F point configuration but also the ratio referenced by many other ratio tuning technologies. For now, what’s important to know is that each Granite Rapids processor has a number of favored cores which are allowed to boost higher than others. These cores are different for each CPU.

The water-cooled OC preset adjusts these Per Core Ratio Limits as follows:

So, all cores got “promoted” to boost to a higher frequency than default.

BIOS Settings & Benchmark Results

Upon entering the BIOS

  • Go to the Ai Tweaker menu
  • Set Performance Preferences to ASUS Advanced OC Profile
  • Set Ai Overclock Tuner to XMP
  • Set ASUS MultiCore Enhancement to Enabled – Remove All limits
  • Set CPU Core Ratio to Water-Cooled OC Preset

Then save and exit the BIOS.

The boost curve now starts at 5.1 GHz when 1 core is active and gradually decreases to 4.9 GHz when all cores are active. That’s between 100 and 700 MHz higher frequency than stock. All P-cores now also boost 200 to 300 MHz higher than at stock.

We re-ran the benchmarks and checked the performance increase compared to the default operation.

The water-cooled OC preset is the simplest way to extract more performance from the CPU P-cores on ASUS W890 motherboards. The additional frequency obviously adds performance across the board. The geomean performance speedup improves by nine and a half percentage points and we get a maximum benchmark improvement of +23.76% in Cinebench 2026 Multi Core.

When running the OCCT CPU AVX512 Stability Test, the average CPU core effective clock is 4516 MHz with 1.021 volts. The average CPU temperature is 87 degrees Celsius. The average system power is 512.8 watts.

When running the OCCT CPU SSE Stability Test, the average CPU core effective clock is 4882 MHz with 1.043 volts. The average CPU temperature is 89 degrees Celsius. The average system power is 509.8 watts.

OC Strategy #3: Manual CPU P-core Overclock

In our third overclocking strategy, we pursue a basic manual, dynamic CPU P-core overclock using the Turbo Ratio Limits and adaptive voltage mode. Before we get to the settings, I need to briefly cover Intel’s overclocking toolkit for Granite Rapids as well as have a look at the Granite Rapids Clocking and Voltage topology.

Granite Rapids Clocking Topology

The clocking of a standard Granite Rapids platform slightly differs from what we’re used to with mainstream platforms. The standard clocking topology relies on a 25 MHz crystal or crystal oscillator input to the PCH and the processor. The processor then generates four 100 MHz reference clocks, which are used for the various on-chip and off-chip devices, including those connected via the PCH.

Unfortunately, there’s no reference clock control like we had on previous platforms. We can’t even adjust the frequency to 101 MHz! That’s a bit unfortunate because, as we’ll see later on, it limits the clock granularity of, for example, the system memory.

The 100 MHz CPU BCLK is then multiplied with specific ratios for each of the different parts in the CPU.

  • Each P-core can run at an independent frequency with ratio support of up to 127X.
  • The mesh frequency ties together the last-level cache and CHAs. It can run at a frequency independent from the P-cores. Unlike Sapphire Rapids, the Mesh Ratio is fully unlocked on single-tile as well as multi-tile CPUs. It has ratio support up to 80X.
  • The memory frequency is also driven by the CPU BCLK and multiplied by a memory ratio. The available memory ratios depend on the type of system memory. For regular R-DIMM, the memory controller is fixed at Gear 4 and for MR-DIMM it’s fixed at Gear 8!

Then, there’s also a mesh frequency for the two IO dies (referred to as IO Die North and South). Their ratios are limited to 25X even though you can program the register up to 100X.
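
Since the reference clock is fixed, every frequency in this guide is simply the ratio multiplied by 100 MHz. A small sketch makes the arithmetic explicit; the constant and function names are made up for illustration, while the ratio ceilings are the ones listed above.

```python
BCLK_MHZ = 100  # fixed reference clock, not adjustable on this platform

# Maximum supported ratios as described in this guide.
MAX_RATIO = {"p_core": 127, "compute_mesh": 80, "iod_mesh": 25}


def clock_mhz(domain, ratio):
    """Return the clock in MHz for a ratio within the domain's limit."""
    limit = MAX_RATIO[domain]
    if not 0 < ratio <= limit:
        raise ValueError(f"{domain} ratio must be between 1 and {limit}")
    return ratio * BCLK_MHZ
```

So clock_mhz("p_core", 51) gives the 5100 MHz target of this guide, and clock_mhz("compute_mesh", 22) gives the 2200 MHz default mesh frequency. With the reference clock locked, the only possible step size is one full ratio, or 100 MHz.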

Granite Rapids Voltage Topology

Granite Rapids uses a combination of fully integrated voltage regulators (FIVR) and motherboard voltage regulators (MBVR) for power management. There are eight (8) distinct voltage inputs to a Granite Rapids processor, most of which feed a FIVR. The FIVR then manages the voltage provided to specific parts of the CPU. The end user can control some of these voltages.

The voltages most relevant for Granite Rapids Xeon 600 processor overclocking are those driven by the VccIN, including the P-core and Mesh voltages, and, to a lesser extent, the voltages driven by the VccFA_EHV for the PCIE & IO and VccD_HVx for the memory controllers.

Granite Rapids Overclocking Toolkit

I described the history of Intel’s overclocking toolkit in a previous guide on this website. Long story short, Intel developed and maintained a technology called the OC Mailbox which contains the entire overclocker’s toolkit. This toolkit is not always the same for each CPU architecture, as sometimes we need different tools.

On Granite Rapids, the overclocking toolkit consists of the following tools:

  • Global turbo ratio limit and voltage control
  • Per core ratio limit, oc ratio, and voltage control
  • Mesh ratio limit, oc ratio, and voltage control
  • HDC, DDRD, and INF voltage control
  • DRAM ratio and timings control
  • AVX2, AVX-512, and TMUL ratio offset
  • Voltage limits & undervolt protection
  • SVID disable
  • XMP 3.0 support

Notably missing from the OC toolbox are prominent features we know from mainstream desktop like Advanced Voltage Offset, better known as V/F points, and OverClocking Thermal Velocity Boost, or OCTVB, as well as XTU support.

However, Intel has enabled at-runtime overclocking tools through their collaboration with OCCT. It should also be available on Linux, which I want to try at some point in the future.

Xeon 658X Voltage-Frequency Curve

Each core inside this Xeon 658X has its own factory-fused voltage-frequency curve. According to the specification, two cores can run up to 4.9 GHz, two cores can run up to 4.8 GHz, while the 20 remaining cores can only run up to 4.7 GHz.

Unfortunately, there’s no simple way to extract the V/F curve from the CPU. However, we can use HWiNFO and a light workload to get an approximation of the curve.

The maximum voltage at 47X ranges from 1.038V (Core 20) to 1.061V (6 different cores). We also find that the favored cores are not necessarily those with the best V/F curves. For example, while Core 21 can boost to 49X it only has the 13th best V/F curve. Cores 20, 14, and 16 are 1st, 3rd, and 4th though.

Granite Rapids Adaptive Voltage Mode

Like any previous Intel architecture, there are two main ways of configuring the voltage for the CPU cores: override mode and adaptive mode.

  • Override mode specifies a single static voltage across all ratios. It is mainly used for extreme overclocking where stability at high frequencies is the only consideration.
  • Adaptive mode is the standard mode of operation. In Adaptive Mode, the CPU relies on the factory-fused voltage-frequency curves to set the appropriate voltage for a given ratio. When configuring an adaptive voltage, it is mapped against the “OC Ratio,” the highest configured ratio. We’ll get back to that in a minute.

Since Granite Rapids uses FIVR, we can only adjust the core voltage by configuring the CPU PCU registers via BIOS or specialized tools.

We can specify a voltage offset for override and adaptive modes. Of course, this doesn’t make much sense for override mode – if you set 1.15V with a +50mV offset, you could just set 1.20V – but it can be helpful in adaptive mode as you can offset the entire V/F curve by up to 500mV in both directions.

On Granite Rapids, you can configure the override or adaptive voltage on a Global or Per-Core level. Let’s focus on adaptive mode voltage configuration and first look at how it works for a single core.

When we set an adaptive voltage for a core, this voltage is mapped against a core-specific “OC Ratio.” This ratio is not always configurable from the BIOS and is usually equal to the Per Core Ratio Limit, though it can in theory be configured independently. In the case of the Xeon 658X, that ratio would be 49X for the two super-favored cores, 48X for the other two favored cores, and 47X for the rest of the cores.

Then, specific rules govern what adaptive voltage can be set.

A) the voltage set for a given ratio n must be higher than or equal to the voltage set for ratio n-1.

Suppose Core 0 of our Xeon 658X runs 47X at 1.05V. In that case, setting the adaptive voltage, mapped to 47X, lower than 1.05V, is pointless. 47X always runs at 1.05V or higher. Usually, BIOSes may allow you to configure lower values. However, the CPU’s internal mechanisms will override your configuration if it doesn’t follow the rules.

B) the adaptive voltage configured for any ratio below the default maximum Turbo Ratio will be ignored.

Take the same example of Core 0, specified to run 47X at 1.05V. If you try to configure all cores to 45X and set 0.95V, the CPU will ignore this because it has its own factory-fused target voltage for all ratios up to 47X and will use this voltage. You can only change the voltage of the OC Ratio, which, as mentioned before, on most cores is 47X. If you wish to set a fixed frequency and voltage, you should switch to override mode.

C) for ratios between the set OC Ratio and default maximum Turbo Ratio, the voltage is interpolated between the set adaptive voltage and the factory-fused voltage.

Returning to the Core 0 example, specified to run 47X at 1.05V, let’s say we manually configure the OC Ratio to be 51X at 1.25V. The target voltage for ratios 48X, 49X, and 50X will now be interpolated between 1.05V and 1.25V.
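
The Core 0 example can be worked through in code. This is a sketch under a stated assumption: the guide only says the voltage is interpolated, so the straight-line interpolation below is a guess at the shape, and the function name is hypothetical.

```python
def interpolated_voltage(ratio, fused_ratio, fused_v, oc_ratio, oc_v):
    """Target voltage for a ratio between the fused maximum and the OC Ratio.

    Assumption: linear interpolation between the two end points.
    """
    if oc_ratio <= fused_ratio:
        raise ValueError("OC Ratio must exceed the fused maximum ratio")
    if not fused_ratio <= ratio <= oc_ratio:
        raise ValueError("ratio lies outside the interpolated range")
    # Rule A: the OC Ratio voltage cannot end up below the fused voltage.
    oc_v = max(oc_v, fused_v)
    volts_per_ratio = (oc_v - fused_v) / (oc_ratio - fused_ratio)
    return fused_v + volts_per_ratio * (ratio - fused_ratio)
```

Under that linear assumption, the example of 47X at 1.05V and 51X at 1.25V works out to roughly 1.10V at 48X, 1.15V at 49X, and 1.20V at 50X.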

As I mentioned already, we can do this for each core individually. However, that would be rather painful, especially on a 24-core CPU! Fortunately, there’s also an alternative way: set a global adaptive voltage.

When we set a global adaptive voltage, it maps this voltage to the OC Ratio for each core in our CPU. That certainly makes things easier.

Granite Rapids Per Core Ratio Limit & Voltage

While we only recently saw the addition of per-core ratio control on mainstream desktop with Rocket Lake, on the high-end desktop, the ability to control the maximum ratio and voltage for each core has been around since Broadwell-E in 2016.

The Per Core Ratio Limit and Voltage control options let you control the upper end of the voltage-frequency curve of each core inside your CPU. While the general rules of adaptive voltage mode still apply, this enables two crucial new avenues for CPU overclocking.

  • First, it allows users to overclock each core and find its maximum stable frequency individually.
  • Second, it allows users to set an aggressive by core usage overclock while constraining the worst cores.

Since each core has an independent FIVR-regulated power rail, it’s possible to fine-tune each core to its maximum capability.

In theory, the Per Core Ratio Limit’s only function is to limit the maximum frequency of a specific P-core. That means you could independently define a core’s V/F curve and its maximum allowed ratio. For example, set Core 0 to 1.25V at 51X but limit the core’s maximum ratio to 50X at its interpolated voltage. However, on most motherboards, the auto-rules will also use the Per Core Ratio Limit value to configure the P-core’s OC Ratio.

Xeon 658X: Manual Tuning Process

With all the theory in mind, let’s get clocking. Our basic manual overclock consists of two main steps:

  1. Set a global adaptive voltage, which the motherboard auto-rules will then map to each core’s OC Ratio.
  2. Push the Turbo Ratio Limits to increase the performance when 1 to all cores are active.

For the adaptive voltage, I pick 1.15V which is about 50mV higher than the highest default voltage for any P-core.

Then, for the Turbo Ratio Limit I first try to find the maximum stable configuration in the heaviest possible workload. In my case, that’s OCCT AVX-512. Here I find that setting 51X is stable but 52X will shut down the system.

BIOS Settings & Benchmark Results

Upon entering the BIOS

  • Go to the Ai Tweaker menu
  • Set Performance Preferences to ASUS Advanced OC Profile
  • Set Ai Overclock Tuner to XMP
  • Set ASUS MultiCore Enhancement to Enabled – Remove All limits
  • Set CPU Core Ratio to By Core Usage
  • Enter the By Core Usage submenu
    • Set Turbo Ratio Limit 1 through 8 to 51
  • Leave the By Core Usage submenu
  • Set Compute Die0 Core Voltage to Adaptive Mode
    • Set Additional Turbo Mode Voltage to 1.15

Then save and exit the BIOS.

The boost curve now starts at 5.1 GHz when one core is active and remains there even when all 24 cores are active. Every P-core is now able to boost up to 5.1 GHz too, which is 400 MHz higher than stock for most cores.

We re-ran the benchmarks and checked the performance increase compared to the default operation.

Manually tuning the P-core frequency gives us another performance uplift over the water-cooled OC preset we tried before, but also makes us hit the thermal limit in intense multi-core workloads. The Geomean performance speedup improves by another five percentage points and we get a maximum benchmark improvement of +33.25% in Counter Strike 2.

When running the OCCT CPU AVX512 Stability Test, the average CPU core effective clock is 4900 MHz with 1.116 volts. The average CPU temperature is 104 degrees Celsius. The average system power is 704.7 watts.

When running the OCCT CPU SSE Stability Test, the average CPU core effective clock is 5019 MHz with 1.136 volts. The average CPU temperature is 104 degrees Celsius. The average system power is 648.6 watts.

OC Strategy #4: Memory Subsystem Tuning

In our fourth overclocking strategy, we delve into tuning the data fabric and memory subsystem which consists of the compute die mesh, memory controllers, and system memory.

Granite Rapids: Mesh

The Granite Rapids CPUs use mesh interconnects on all tiles, including the Compute tiles and both IOD tiles. On the Compute tile, the Mesh ties together the various modules like CPU P-cores, memory controllers, last-level cache, and CHAs (Cache and Home Agent).

Each Mesh has its own V/F curve. Here is the Compute Mesh V/F curve for my specific Xeon 658X. The voltage is about 703 mV at 800 MHz and gradually increases to 730 mV at 2.2 GHz. As said, the IODs have their own V/F curve which is about 100 mV “worse” than the compute mesh.

The mesh reference clock frequency is generated internally by the CPU. This clock affects all IP blocks in the CPU and, unfortunately, cannot be adjusted. The reference clock is multiplied by the mesh ratio to achieve the final clock frequency, which is independent of the P-core frequency. The default ratio is 22X, which yields a 2.2 GHz operating frequency. The Compute Mesh is entirely unlocked for overclocking. The IOD Meshes operate at 2.5 GHz by default but, unfortunately, can’t be overclocked.

The voltage regulation for the mesh is more complex than mainstream platforms due to the use of FIVR. As we discussed earlier in the guide, the external VccIN motherboard voltage regulator (MBVR) provides the input voltage for individual P-cores and Mesh.

  • VccCOREn is the rail powering an individual P-core on the Compute tiles.
  • VccM is the rail powering the Mesh on the Compute tiles.
  • VccCFN is the rail powering ?
  • VccHDC is the rail powering the last-level cache on the Compute tiles.
  • VccDDRD is the rail powering the memory controllers on the Compute tiles.

Similar to the P-cores, the compute mesh voltage can be configured in adaptive and override voltage mode, each supporting additional offset as well. The specific rules governing the adaptive voltage mode which I covered at length in the previous OC Strategy also apply to the compute mesh adaptive voltage.

In my case I could set 26X for 2.6 GHz at 0.9V.

Granite Rapids: Memory Controller

The Granite Rapids memory controllers are integrated on the compute tiles, connected to the P-cores via the mesh interconnect.

Granite Rapids memory support comes in a variety of configurations, enabling different levels of channel count, DRAM type support, and DRAM speed.

  • The UCC package for Granite Rapids-AP supports 12 memory channels (4 channels per Compute tile), both RDIMM and MRDIMM, with MRDIMM speeds up to DDR5-8800.
  • The XCC package for Granite Rapids-SP supports 8 memory channels (4 channels per Compute tile), also both RDIMM and MRDIMM, but MRDIMM only up to DDR5-8000.
  • The HCC package for Granite Rapids-SP supports 8 memory channels (8 channels per Compute tile) but has split DRAM type support, as only SKUs with 28 cores and up support MRDIMM.
  • The LCC package for Granite Rapids-SP has split memory channel support, with core counts below 18 only having 4 memory channels. All SKUs support RDIMM up to DDR5-6400.

Unlike other Intel DDR5 platforms, the memory controller frequency is fixed in the sense that you can only run gear 4 mode with R-DIMM and gear 8 mode with MR-DIMM.

In terms of memory overclocking, there’s not much news, as it works the same as on previous RDIMM platforms.

System Memory Tuning

Usually for the DRAM tuning section I rely on the ASUS Memory Presets. Since there’s a profile available for 8x16GB, that’s the one I picked. Unfortunately, my kit was not stable at DDR5-8000 which is not unreasonable as it’s rated at only XMP-6400.

To make my life easy, I decided to just lower the memory frequency. Turns out it was stable at DDR5-7200, which is still 12.5% higher than its rating.
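
The 12.5% figure, and what it means for theoretical bandwidth, can be checked with a short calculation. The function names are made up for this sketch, and the bandwidth formula assumes 8 bytes of data per transfer per channel across all 8 channels. It is a theoretical peak, not what AIDA64 will measure.

```python
def overclock_percent(rated_mts, actual_mts):
    """Percentage by which the actual data rate exceeds the rated one."""
    return (actual_mts / rated_mts - 1) * 100


def peak_bandwidth_gbs(mts, channels=8, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s (assumes 64 data bits per channel)."""
    return mts * bytes_per_transfer * channels / 1000
```

Going from DDR5-6400 to DDR5-7200 raises the theoretical peak from about 409.6 GB/s to about 460.8 GB/s under these assumptions.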

After tuning the memory subsystem, our AIDA64 performance improved quite significantly. We got about +10% extra bandwidth and better latency by enabling XMP. Tuning the CPU frequency added another 5 percentage points. Then, tuning the memory added another 15% to 30% of bandwidth and further improved the latency by 20%.

BIOS Settings & Benchmark Results

Upon entering the BIOS

  • Go to the Ai Tweaker menu
  • Set Performance Preferences to ASUS Advanced OC Profile
  • Set Ai Overclock Tuner to XMP
  • Set ASUS MultiCore Enhancement to Enabled – Remove All limits
  • Set CPU Core Ratio to By Core Usage
  • Enter the By Core Usage submenu
    • Set Turbo Ratio Limit 1 through 8 to 51
  • Leave the By Core Usage submenu
  • Set DRAM Frequency to DDR5-7200
  • Set Min and Max Compute Die0 Mesh Ratio to 26
  • Set Min and Max IOD South and North Mesh Ratio to 25
  • Enter the DRAM Timing Control submenu
    • Enter the Memory Presets submenu
      • Select Load Hynix 8000MHz 1.4V 8x16GB SR
    • Leave the Memory Presets submenu
  • Leave the DRAM Timing Control submenu
  • Set Compute Die0 Core Voltage to Adaptive Mode
    • Set Additional Turbo Mode Voltage to 1.150
  • Set Compute Die0 Mesh Voltage to Adaptive Mode
    • Set Additional Turbo Mode Voltage to 0.900

Then save and exit the BIOS.
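For reference, the ratios in the list above translate into clock frequencies as sketched below. This assumes the standard 100 MHz reference clock (BCLK), which the guide does not state explicitly but which matches the 51X ratio and the 5100 MHz target. The dictionary labels are shorthand for the BIOS options.

```python
# Convert the ratios from the BIOS list into clock frequencies.
# Assumption: a 100 MHz reference clock (BCLK) for both core and mesh.
BCLK_MHZ = 100

ratios = {
    "CPU core (Turbo Ratio Limit 1 to 8)": 51,
    "Compute Die0 mesh": 26,
    "IOD South/North mesh": 25,
}

# Frequency is simply ratio x reference clock.
frequencies_mhz = {name: ratio * BCLK_MHZ for name, ratio in ratios.items()}

for name, mhz in frequencies_mhz.items():
    print(f"{name}: {mhz} MHz")
```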

We re-ran the benchmarks and checked the performance increase compared to the default operation.

As we’re now used to, even on mainstream desktop, optimizing the memory subsystem can have a significant impact on workload performance. That’s no different with this workstation platform as we see a performance uplift across the board. The geomean performance speedup improves by another seven percentage points and we get a maximum benchmark improvement of +44.41% in Counter-Strike 2.

OC Strategy #5: Undervolt

In our final overclocking strategy, we delve into undervolting to reduce the thermal throttling that limits performance in intense multi-threaded workloads.

Granite Rapids: Undervolt Protection

Intel introduced an Undervolt Protection feature on 12th gen processors to mitigate Plundervolt exploits.

In essence, the feature disables runtime undervolting when core isolation memory integrity or Hyper-V is enabled, and it must be enabled by default on all motherboards. Unfortunately, that makes it difficult to test undervolting headroom at runtime because the lowest allowed voltage is defined by the BIOS configuration.

The option can be disabled in the BIOS but it’s possible tuning tools won’t run unless Undervolt Protection is enabled. Also, if you set an override voltage in the BIOS and undervolt protection is enabled, then you can’t switch back to adaptive mode in the operating system.

Setting an undervolt in the BIOS is entirely unaffected by this feature, however. So, you can still try out the undervolting headroom by configuring it there.

Granite Rapids: Adaptive Voltage Mode

As we discussed earlier in the guide, the Granite Rapids overclocking toolkit lacks some of the advanced voltage tuning tools like V/F points. However, the adaptive voltage technology is pretty flexible in that we can set both a target voltage as well as use a voltage offset. So, we can use voltage offset to undervolt.

The process is pretty straightforward: you can configure a negative voltage offset for each of the individual P-cores or configure a global adaptive voltage offset to adjust them all at once. The voltage offset is applied across the entire V/F curve, from idling at 800 MHz to boosting at over 5 GHz. That means stress-testing can become a little tricky since you need to verify stability across light and heavy workloads.

The ASUS water-cooled oc preset applies a negative 20 mV voltage offset; however, I was able to set it to 50 mV. The main challenge with undervolting this Xeon 658X is that the negative voltage offset requires us to increase the voltage for the higher frequency range. To be specific: I need to increase the adaptive voltage, matched to 51X, to 1.230 V to ensure stability at 49X and 50X. After accounting for the negative voltage offset, the configured voltage for 51X is 1.180 V, still 30 mV higher than in OC Strategy #3.
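The arithmetic behind those numbers can be written out as a short sketch. All values come from this guide, where 1.150 V is the adaptive voltage used before the undervolt. The variable names are illustrative only.

```python
# Voltage arithmetic for the undervolt, using the values from this guide.
# Integer millivolts avoid floating-point rounding noise.

adaptive_target_mv = 1230   # adaptive voltage matched to 51X
offset_mv = -50             # global negative adaptive voltage offset
previous_target_mv = 1150   # adaptive voltage before undervolting

# The offset shifts the whole V/F curve, including the 51X point.
effective_51x_mv = adaptive_target_mv + offset_mv
delta_mv = effective_51x_mv - previous_target_mv

print(f"Configured voltage for 51X: {effective_51x_mv / 1000:.3f} V")
print(f"Change versus the earlier setting: {delta_mv:+d} mV")
```

In other words, the 50 mV undervolt applies everywhere except at the very top of the curve, where the net result is an overvolt.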

And that will cause some performance issues since we’re able to run at 4.9 to 5.1 GHz in our all-core OCCT workloads. At slightly higher voltages, we’ll see slightly more thermal throttling, and thus slightly lower effective clock frequencies.

This is one of those situations where having access to Advanced Voltage Offset would’ve probably given us a better outcome.

BIOS Settings & Benchmark Results

Upon entering the BIOS

  • Go to the Ai Tweaker menu
  • Set Performance Preferences to ASUS Advanced OC Profile
  • Set Ai Overclock Tuner to XMP
  • Set ASUS MultiCore Enhancement to Enabled – Remove All limits
  • Set CPU Core Ratio to By Core Usage
  • Enter the By Core Usage submenu
    • Set Turbo Ratio Limit 1 to 8 to 51
  • Leave the By Core Usage submenu
  • Set DRAM Frequency to DDR5-7200
  • Set Min and Max Compute Die0 Mesh Ratio to 26
  • Set Min and Max IOD South and North Mesh Ratio to 25
  • Enter the DRAM Timing Control submenu
    • Enter the Memory Presets submenu
      • Select Load Hynix 8000MHz 1.4V 8x16GB SR
    • Leave the Memory Presets submenu
  • Leave the DRAM Timing Control submenu
  • Set Compute Die0 Core Voltage to Adaptive Mode
    • Set Additional Turbo Mode Voltage to 1.230
    • Set Offset Mode Sign to –
    • Set Offset Voltage to 0.05
  • Set Compute Die0 Mesh Voltage to Adaptive Mode
    • Set Additional Turbo Mode Voltage to 0.900

Then save and exit the BIOS.
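Compared with the BIOS list from the memory tuning strategy, only the voltage settings change here. A small sketch makes the difference explicit. The dictionary keys are shorthand for the BIOS labels, the "Core:" and "Mesh:" prefixes are added for illustration, and `diff_settings` is a hypothetical helper rather than part of any tool.

```python
# Voltage-related settings from the two BIOS lists in this guide.
# Everything above the voltage settings is identical in both lists.
memory_tuned = {
    "Compute Die0 Core Voltage": "Adaptive Mode",
    "Core: Additional Turbo Mode Voltage": 1.150,
    "Compute Die0 Mesh Voltage": "Adaptive Mode",
    "Mesh: Additional Turbo Mode Voltage": 0.900,
}
undervolt = {
    "Compute Die0 Core Voltage": "Adaptive Mode",
    "Core: Additional Turbo Mode Voltage": 1.230,
    "Core: Offset Mode Sign": "-",
    "Core: Offset Voltage": 0.05,
    "Compute Die0 Mesh Voltage": "Adaptive Mode",
    "Mesh: Additional Turbo Mode Voltage": 0.900,
}

def diff_settings(old: dict, new: dict) -> dict:
    """Return {setting: (old_value, new_value)} for every setting that differs."""
    changes = {}
    # Walk the new settings first, then anything that only exists in the old set.
    for key in list(new) + [k for k in old if k not in new]:
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes

changes = diff_settings(memory_tuned, undervolt)
for setting, (before, after) in changes.items():
    print(f"{setting}: {before} -> {after}")
```

A value of `None` on the left-hand side means the setting was left untouched in the earlier strategy.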

We re-ran the benchmarks and checked the performance increase compared to the default operation.

Since we weren’t thermal throttling in previous OC Strategies and we’re unable to increase the operating frequency in lighter workloads, undervolting has a minimal impact on the system performance. In fact, the slightly higher voltages for the upper frequency range cause it to thermal throttle quicker, and thus lower the frequency faster. The geomean performance speedup regresses by 0.15 percentage points and we get a maximum benchmark improvement of +45.51% in PyPrime.

When running the OCCT CPU AVX512 Stability Test, the average CPU core effective clock is 4889 MHz with 1.112 volts. The average CPU temperature is 104 degrees Celsius. The average system power is 742.4 watts.

When running the OCCT CPU SSE Stability Test, the average CPU core effective clock is 4997 MHz with 1.140 volts. The average CPU temperature is 104 degrees Celsius. The average system power is 682.9 watts.
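To put the two stability test results in perspective, the sketch below compares the average effective clocks against the configured 51X ratio. It assumes a 100 MHz reference clock, so 51X equals 5100 MHz. The helper name is illustrative.

```python
# How far do the average effective clocks fall short of the 51X target?
# Assumption: 100 MHz reference clock, so 51X = 5100 MHz.
TARGET_MHZ = 51 * 100

# Average effective clocks reported in this guide.
results_mhz = {
    "OCCT CPU AVX512": 4889,
    "OCCT CPU SSE": 4997,
}

def shortfall_mhz(effective_mhz: int, target_mhz: int = TARGET_MHZ) -> int:
    """Difference between the configured target and the measured effective clock."""
    return target_mhz - effective_mhz

for name, mhz in results_mhz.items():
    pct_of_target = mhz / TARGET_MHZ * 100
    print(f"{name}: {shortfall_mhz(mhz)} MHz below target "
          f"({pct_of_target:.1f}% of {TARGET_MHZ} MHz)")
```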

Intel Xeon 658X: Conclusion

Alright, let us wrap this up.

The Xeon 658X is the entry-level SKU in the overclockable Granite Rapids product family. While its P-core overclocking headroom was surprisingly limited, once again we find that there’s a lot of performance headroom up for grabs on this platform.

Perhaps the biggest challenge with this particular CPU is the lack of Advanced Voltage Offset, or V/F Points. Not that it eliminates undervolting altogether, but it forces us to overvolt at the top end of the V/F curve. That results in a small performance regression which we wouldn’t have with appropriate V/F Point configuration options.

Anyway, that’s it for this guide.

I want to thank my Patreon supporters and YouTube members for supporting my work. I’ll have some other content available with this system on the channel as well which YouTube members have early access to. If you have any questions or comments, please drop them in the comment section below. 

’Til next time!

1 Comment

  1. 11

    Thanks once again Skatter for sharing these comprehensive OC guides and benchmarks for these less canvased platforms.

    That 39.41% performance jump after OC in CS2 is crazy. Almost directly mirrors the similar uplift you saw on W790 but in Tomb-Raider on the 2495X.
