SkatterBencher #63: Intel Xeon w7-3465X Overclocked to 5100 MHz
We overclock the Intel Xeon w7-3465X up to 5100 MHz with the ASUS Pro WS W790E-Sage SE motherboard and EK-Pro water cooling.
We’ll do this by individually adjusting each P-core’s maximum allowed Turbo Ratio and core voltage. I’ll also cover more broadly the basics of Sapphire Rapids overclocking. But first, let’s look at the hardware we’re overclocking today.
All right, we have lots to cover, so let’s jump straight in.
Intel Sapphire Rapids: Introduction
The Intel Xeon w7-3465X processor is part of Intel’s 4th generation Xeon Scalable processor line-up, formerly known as Sapphire Rapids-112L and Sapphire Rapids-64L.
Sapphire Rapids is the successor to, well, a variety of architectures. On the 4S/8S server side, it’s the successor to the 2020 14nm Cooper Lake. On the 1S/2S server and workstation side, it’s the successor to the 2021 10nm Ice Lake. And on the high-end desktop (HEDT) side, it’s the successor to the 2019 14nm Cascade Lake.
Enthusiasts like myself can think of the Sapphire Rapids W790 platform as the successor of the overclockable Cascade Lake-X and locked Cascade Lake-W processors. Perhaps the real spiritual predecessor of the unlocked Xeon W-2400 and W-3400 series is the overclockable 28-core Xeon W-3175X, launched in 2018.
Intel spoke at length about Sapphire Rapids during the 2021 Architecture Day. I won’t go over the architecture details, but it suffices to say there are some significant improvements over the Ice Lake, Cooper Lake, and Cascade Lake architectures.
The most significant improvements are the Intel 7 process technology and up to 56 Golden Cove P-cores. That makes Alder Lake the equivalent on mainstream desktop. It also features PCIe 5.0, DDR5 ECC RDIMM support, and Intel’s 3rd generation Deep Learning Boost technology. Lastly, Sapphire Rapids transitions from a single monolithic die design to a multi-tile design for increased scalability. Well, sort of. Only the Xeon W-3400 series uses the multi-tile die design, whereas the Xeon W-2400 segment still features a monolithic die.
And that’s not where the difference between the W-2400 and W-3400 segment ends.
While the W-3400 series goes up to 56 P-cores, the W-2400 series only goes up to 24 P-cores. The W-3400 series supports 8-channel memory, whereas the W-2400 series only supports 4-channel memory. The W-3400 series also supports 112 PCIe 5.0 lanes, whereas the W-2400 series only supports 64 lanes.
Intel further segments the Sapphire Rapids CPUs according to the Xeon w3, w5, w7, and w9 brands. That’s similar to how we have Core i3 to Core i9 on the mainstream desktop. Xeon w9 is reserved exclusively for the W-3400 series, and you can only find Xeon w3 processors in the Xeon W-2400 product line. Xeon w5 and w7 are available in both series.
Across all Sapphire Rapids Workstation products, eight overclockable SKUs are split evenly between the W-2400 and W-3400 segments. We’ll get back to how overclocking is enabled later in this article.
The Xeon w7-3465X has 28 P-cores with 56 threads. The base frequency is 2.5 GHz, the Turbo Boost 2.0 boost frequency is 4.6 GHz, and the Turbo Boost Max 3.0 boost frequency is 4.8 GHz. The maximum boost frequency gradually decreases from 4.8 GHz for up to 2 active cores to 3.2 GHz when all cores are active. The base TDP is 300W, and the Turbo TDP is 360W. The TjMax is 97 degrees Celsius.
In this article, we will cover four overclocking strategies:
- First, we rely on ASUS MCE and ASUS Memory Presets
- Second, we use the ASUS water-cooled OC preset
- Third, we try a simple basic overclock
- Lastly, we go for a simple dynamic overclock
However, before we jump into overclocking, let us quickly review the hardware and benchmarks used in this article.
Intel Xeon w7-3465X: Platform Overview
The system we’re overclocking today consists of the following hardware.
Item | SKU | Price (USD) |
CPU | Intel Xeon w7-3465X | 2,899 |
Motherboard | ASUS Pro WS W790E-Sage SE | 1,300 |
CPU Cooling | EK-Pro CPU WB 4677 Ni + Acetal Prototype; EK-Quantum Kinetic FLT 240 D5; EK-Quantum Surface P480M – Black | 134; 221; 150 |
Fan Controller | ElmorLabs EFC-SB; ElmorLabs EVC2; ElmorLabs PMD-USB | 50; 35; 60 |
Memory | V-Color TR51640S840 | |
Power Supply | Cooler Master V1200 Platinum; Enermax MaxRevo 1500W | 270; 270 |
Graphics Card | ASUS ROG Strix RTX 2080 TI | 880 |
Storage | AORUS RGB 512 GB M.2-2280 NVME | 120 |
Chassis | Open Benchtable V2 | 200 |
ElmorLabs EFC-SB & EVC2
The Easy Fan Controller SkatterBencher Edition (EFC-SB) is a customized EFC resulting from a collaboration between SkatterBencher and ElmorLabs.
I explained how I use the EFC-SB in a separate blog post on this website. By connecting the EFC-SB to the EVC2 device, I monitor the ambient temperature (EFC), water temperature (EFC), and fan duty cycle (EFC). I include the measurements in my Prime95 stability test results.
I also use the ElmorLabs EFC-SB to map the radiator fan curve to the water temperature. Without going into too many details: I have attached an external temperature sensor from the water in the loop to the EFC-SB. Then, I use the low/high setting to map the fan curve from 25 to 40 degrees water temperature. I use this configuration for all overclocking strategies.
The main takeaway from this configuration is that it gives us a good indicator of whether the cooling solution is saturated.
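To make the fan curve mapping concrete, here is a minimal Python sketch of a linear ramp between the low and high water temperature points. The 25 and 40 degree points come from my configuration; the 30% and 100% duty cycle endpoints and the linear shape are assumptions for illustration, not the EFC-SB firmware's actual behavior.

```python
def fan_duty(water_temp_c, low_temp_c=25.0, high_temp_c=40.0,
             min_duty=30.0, max_duty=100.0):
    """Map water temperature (deg C) to a fan duty cycle (%) with a linear ramp.

    min_duty and max_duty are hypothetical values chosen for illustration.
    """
    if water_temp_c <= low_temp_c:
        return min_duty
    if water_temp_c >= high_temp_c:
        return max_duty
    span = (water_temp_c - low_temp_c) / (high_temp_c - low_temp_c)
    return min_duty + span * (max_duty - min_duty)


for temp in (24.0, 32.5, 40.0, 46.0):
    print(f"{temp:.1f} C water -> {fan_duty(temp):.0f}% fan duty")
```

Once the water temperature reaches the high point, the fans are already at their maximum duty cycle. Any further rise in water temperature then suggests the cooling solution is saturated.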
Intel Xeon w7-3465X: Benchmark Software
We use Windows 11 and the following benchmark applications to measure performance and ensure system stability.
BENCHMARK | LINK |
SuperPI 4M | https://www.techpowerup.com/download/super-pi/ |
Geekbench 6 | https://www.geekbench.com/ |
Cinebench R23 | https://www.maxon.net/en/cinebench/ |
CPU-Z | https://www.cpuid.com/softwares/cpu-z.html |
V-Ray 5 | https://www.chaosgroup.com/vray/benchmark |
AI-Benchmark | https://ai-benchmark.com/ |
Y-Cruncher | http://www.numberworld.org/y-cruncher/ |
Blender Monster | https://opendata.blender.org/ |
3DMark CPU Profile | https://www.3dmark.com/ |
3DMark Night Raid | https://www.3dmark.com/ |
Nero Score | https://store.steampowered.com/app/1942030/Nero_Score__PC_benchmark__performance_test/ |
Handbrake | https://handbrake.fr/ |
CS:GO FPS Bench | https://steamcommunity.com/sharedfiles/filedetails/?id=500334237 |
Shadow of the Tomb Raider | https://store.steampowered.com/app/750920/Shadow_of_the_Tomb_Raider_Definitive_Edition/ |
Final Fantasy XV | http://benchmark.finalfantasyxv.com/na/ |
Prime 95 | https://www.mersenne.org/download/ |
Xeon w7-3465X: Stock Performance
Before starting overclocking, we must check the system performance at default settings. Note that on this motherboard, Turbo Boost 2.0 is unleashed by default. So, to check the performance at default settings, you must enter the BIOS and
- Go to the Ai Tweaker menu
- Set ASUS MultiCore Enhancement to Disabled – Enforce All limits
Then save and exit the BIOS.
The default Turbo Boost 2.0 parameters for the Xeon w7-3465X are as follows:
- PL1: 300W
- PL2: 360W
- Tau: 67sec
- ICCIN_MAX: 550A
- ICIN_VR_TDC: 185A
- PMAX: 922W
- VTRIP: 1.6095V
Here is the benchmark performance at stock:
- SuperPI 4M: 36.600 seconds
- Geekbench 6 (single): 2,277 points
- Geekbench 6 (multi): 18,258 points
- Cinebench R23 Single: 1,714 points
- Cinebench R23 Multi: 42,151 points
- CPU-Z V17.01.64 Single: 709.4 points
- CPU-Z V17.01.64 Multi: 18,664.3 points
- V-Ray 5: 31,800 vsamples
- AI Benchmark: 8,599 points
- Y-Cruncher PI MT 25B: 439.425 seconds
- Blender Monster: 303.95 fps
- Blender Classroom: 146.45 fps
- 3DMark Night Raid: 56,406 points
- Nero Score: 2,821 points
- Handbrake: 38.82 fps
- CS:GO FPS Bench: 501.38 fps
- Tomb Raider: 197 fps
- Final Fantasy XV: 194.93 fps
Here are the 3DMark CPU Profile scores at stock
- CPU Profile 1 Thread: 869
- CPU Profile 2 Threads: 1,749
- CPU Profile 4 Threads: 2,730
- CPU Profile 8 Threads: 5,125
- CPU Profile 16 Threads: 9,882
- CPU Profile Max Threads: 15,938
When running Prime 95 Small FFTs with AVX2 enabled, the average CPU effective clock is 2553 MHz with 0.828 volts. The average CPU temperature is 39.0 degrees Celsius. The ambient and water temperatures are 27.1 and 32.0 degrees Celsius, respectively. The average CPU package power is 299.0 watts.
When running Prime 95 Small FFTs with AVX disabled, the average CPU effective clock is 3093 MHz with 0.884 volts. The average CPU temperature is 41.0 degrees Celsius. The ambient and water temperatures are 27.2 and 31.9 degrees Celsius, respectively. The average CPU package power is 299.8 watts.
Now, let us try our first overclocking strategy.
However, before we get going, make sure to locate the Clear CMOS button.
Pressing the Clear CMOS button will reset all your BIOS settings to default which is helpful if you want to start your BIOS configuration from scratch. However, it does not delete any of the BIOS profiles previously saved. The Clear CMOS button is located on the rear I/O panel.
OC Strategy #1: MCE + Memory OC
In our first overclocking strategy, we use ASUS MultiCore Enhancement to unleash the Turbo Boost 2.0 power limits and ASUS Memory Presets to overclock the memory.
Turbo Boost 2.0
Intel Turbo Boost 2.0 Technology allows the processor cores to run faster than the base operating frequency. Turbo Boost is available when the processor works below its rated power, temperature, and current specification limits. The ultimate advantage is opportunistic performance improvements in both multi-threaded and single-threaded workloads.
The turbo boost algorithm works according to a proprietary EWMA formula. That stands for Exponentially Weighted Moving Average. There are three parameters to consider: PL1, PL2, and Tau.
- Power Limit 1, or PL1, is the threshold the average power won’t exceed. Historically, this has always been set equal to Intel’s advertised TDP. PL1 should not be set higher than the thermal solution cooling limits.
- Power Limit 2, or PL2, is the maximum power the processor can use for a limited time.
- Tau, in seconds, is the time window for calculating the average power consumption. The CPU will reduce the CPU frequency if the average power consumed exceeds PL1.
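The interplay between PL1, PL2, and Tau is easier to see in a small simulation. Intel's actual formula is proprietary, so the sketch below assumes a textbook exponentially weighted moving average with a smoothing factor of 1 - exp(-dt / Tau) and a simple rule: allow up to PL2 while the average is below PL1, otherwise clamp to PL1. The function name and the rule are my own illustration; only the default PL1, PL2, and Tau values come from the Xeon w7-3465X.

```python
import math


def simulate_turbo(requested_w, pl1=300.0, pl2=360.0, tau=67.0, dt=1.0):
    """Return the power granted at each time step for a requested power draw.

    Assumption: a generic EWMA, avg += alpha * (power - avg), with
    alpha = 1 - exp(-dt / tau). This is not Intel's proprietary formula.
    """
    alpha = 1.0 - math.exp(-dt / tau)
    avg = 0.0  # assume the CPU was idle before the workload started
    granted = []
    for want in requested_w:
        # Below PL1 on average: bursts up to PL2 are allowed.
        # At or above PL1 on average: clamp to PL1.
        limit = pl2 if avg < pl1 else pl1
        power = min(want, limit)
        avg += alpha * (power - avg)
        granted.append(power)
    return granted


# A workload that would like 500 W for ten minutes, sampled once per second.
trace = simulate_turbo([500.0] * 600)
burst_seconds = sum(1 for p in trace if p > 300.0)
print(f"burst above PL1 lasted about {burst_seconds} s, "
      f"then settled at {trace[-1]:.0f} W")
```

Under these assumptions, a heavy workload starting from idle runs at PL2 for roughly two minutes before the average catches up and the CPU drops to PL1. Raising or removing the limits simply keeps the CPU in the burst state for longer.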
Intel Turbo Boost 2.0 technology is available on Sapphire Rapids as it’s the primary driver of performance over the base frequency.
An easy ASUS MultiCore Enhancement option on ASUS motherboards allows you to unleash the Turbo Boost power limits. Set the option to Enabled – Remove All Limits and enjoy maximum performance.
Strictly speaking, adjusting the power limits is not considered overclocking, as we don’t change the CPU’s thermal, electrical, or frequency parameters. Intel provides the Turbo Boost parameters as guidance to motherboard vendors and system integrators to ensure their designs enable the base performance of the CPU. Better motherboard designs, thermal solutions, and system configurations can facilitate peak performance for longer.
ASUS Memory Presets
ASUS Memory Presets is an ASUS overclocking technology that provides a selection of memory-tuning presets for specific memory ICs. The presets will adjust the memory timings and voltages.
The technology was first introduced in 2012 on Z77 and has been on select ASUS ROG motherboards ever since. The memory profiles available differ from platform to platform. For example, there were no fewer than 14 profiles available for various ICs and memory configurations on the Maximus V Extreme.
Four memory profiles are available on the ASUS Pro WS W790E-Sage SE motherboard, two each for Hynix and Micron. Since our memory can overclock pretty well, we use the profile for Hynix DDR5-6800 memory.
BIOS Settings & Benchmark Results
Upon entering the BIOS
- Go to the Ai Tweaker menu
- Set ASUS MultiCore Enhancement to Enabled – Remove All limits
- Set DRAM Frequency to DDR5-6800MHz
- Enter the DRAM Timing Control submenu
- Enter the Memory Presets submenu
- Select Load Hynix 6800 1.4V 8x16GB SR
- Select Yes
Then save and exit the BIOS.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- SuperPI 4M: +4.97%
- Geekbench 6 (single): +3.78%
- Geekbench 6 (multi): +5.65%
- Cinebench R23 Single: +0.58%
- Cinebench R23 Multi: +1.23%
- CPU-Z V17.01.64 Single: +0.03%
- CPU-Z V17.01.64 Multi: +0.03%
- V-Ray 5: +6.31%
- AI Benchmark: +4.52%
- Y-Cruncher PI MT 25B: +13.49%
- Blender Monster: +3.53%
- Blender Classroom: +3.90%
- 3DMark Night Raid: +0.54%
- Nero Score: +6.20%
- Handbrake: +5.51%
- CS:GO FPS Bench: +0.77%
- Tomb Raider: +1.52%
- Final Fantasy XV: +1.61%
Here are the 3DMark CPU Profile scores
- CPU Profile 1 Thread: +3.91%
- CPU Profile 2 Threads: +0.97%
- CPU Profile 4 Threads: +7.66%
- CPU Profile 8 Threads: +0.80%
- CPU Profile 16 Threads: +0.06%
- CPU Profile Max Threads: +1.18%
The performance improves slightly after unleashing the Turbo Boost 2.0 power limits and increasing the memory frequency. In light workloads, the performance gain is minimal. However, as the workload intensity increases, the performance gains also increase. We see the most significant improvement of +13.49% in the heavy and memory-sensitive Y-Cruncher workload.
When running Prime 95 Small FFTs with AVX2 enabled, the average CPU effective clock is 3095 MHz with 0.881 volts. The average CPU temperature is 50.0 degrees Celsius. The ambient and water temperatures are 28.7 and 35.3 degrees Celsius, respectively. The average CPU package power is 467.0 watts.
When running Prime 95 Small FFTs with AVX disabled, the average CPU effective clock is 3128 MHz with 0.886 volts. The average CPU temperature is 46.0 degrees Celsius. The ambient and water temperatures are 27.8 and 33.8 degrees Celsius, respectively. The average CPU package power is 404.2 watts.
OC Strategy #2: Water-Cooled OC Preset
In our second overclocking strategy, we use the Water-Cooled OC Preset available in the BIOS. However, the preset usage isn’t as straightforward as with the Xeon w7-2495X and Xeon w5-3435X in SkatterBencher #59 and #61.
As you’ll see later in the blog post, the OC profile isn’t entirely stable for this particular processor. The reason is apparent, and I’ll explain it in due time. But before we can understand why the OC profile isn’t stable, we must understand a couple of fundamental Intel overclocking technologies: Turbo Boost 2.0, Turbo Boost Max 3.0, and Intel Adaptive Voltage.
Turbo Boost 2.0 Ratio Configuration
We all know the Turbo Boost 2.0 technology from its impact on the power limits, but a second significant aspect of Turbo Boost 2.0 is configuring the CPU frequency based on the number of active cores.
Turbo Boost 2.0 Ratio Configuration allows us to configure the overclock for different scenarios ranging from 1 active core to all active cores. That enables us to run some cores significantly faster than others when the conditions are right. Intel provides eight (8) registers to configure the Turbo Boost 2.0 Ratio.
On mainstream platforms where the top SKU has no more than 8 P-cores, these registers are configured from 1-active P-core to 8-active P-cores. However, on platforms with core counts beyond eight cores, we can configure each register by target Turbo Boost Ratio and the number of active cores.
By Core Usage is not the same as configuring each core individually. When using By Core Usage, we determine an overclock according to the actual usage. For example, if a workload uses four cores, the CPU determines which cores should execute this workload and applies our set frequency to those cores.
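As a rough illustration of By Core Usage, the Python sketch below models the registers as pairs of "maximum active cores" and "ratio" and looks up the ratio limit for a given number of active cores. The values are the default Xeon w7-3465X limits listed in this article; the function and table names are my own, and the real selection happens inside the CPU, not in software like this.

```python
# (max active cores, ratio) pairs: default Xeon w7-3465X limits from this article.
DEFAULT_TURBO_TABLE = [
    (2, 48), (4, 47), (12, 43), (16, 39), (20, 36), (24, 33), (28, 32),
]


def turbo_ratio(active_cores, table=DEFAULT_TURBO_TABLE):
    """Return the ratio limit that applies for a given number of active cores."""
    if active_cores < 1:
        raise ValueError("at least one core must be active")
    for max_cores, ratio in table:
        if active_cores <= max_cores:
            return ratio
    raise ValueError("more active cores than the table covers")


for cores in (1, 4, 5, 28):
    print(f"{cores:2d} active cores -> {turbo_ratio(cores)}X")
```

Note that the lookup only depends on how many cores are active, not on which cores they are. That is the key difference with configuring each core individually.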
Turbo Boost Max 3.0 Technology
In 2016, Intel introduced the Turbo Boost Max Technology 3.0. While carrying a similar name, Turbo Boost Max 3.0 is not an iteration of Turbo Boost 2.0.
Turbo Boost Max Technology 3.0 aims to exploit the natural variance in CPU core quality observed in multi-core CPUs. With Turbo Boost Max 3.0, Intel identifies the best cores in your CPU and calls those the “favored cores.” The favored cores are essential for two reasons.
- Intel allows for additional frequency boosts of the favored cores. On the Sapphire Rapids Xeon w7-3465X, there are four favored P-cores. Two can boost to 4.8 GHz, and two can boost to 4.7 GHz. The rest of the non-favored cores are limited to 4.6 GHz.
- The operating system will automatically assign the most demanding workloads to these favored cores, ensuring potentially higher performance.
The performance benefit of ITBMT 3.0 is most visible in low thread count workloads. Highly threaded workloads do not benefit from ITBMT 3.0.
ASUS Water-Cooled OC Preset
The ASUS water-cooled OC preset is an excellent addition to the ASUS Pro WS W790 motherboards, giving Xeon customers an easy path to additional performance. We can enable the preset with a single button click. The preset drastically improves the all-core performance by changing the Turbo Boost 2.0 Ratio configuration. Furthermore, the preset also adjusts the per-core ratio limit.
- 48X up to 2 active cores -> 48X up to 4 active cores
- 47X up to 4 active cores -> 48X up to 4 active cores
- 43X up to 12 active cores -> 46X up to 16 active cores
- 39X up to 16 active cores -> 46X up to 16 active cores
- 36X up to 20 active cores -> 45X up to 20 active cores
- 33X up to 24 active cores -> 44X up to 24 active cores
- 32X up to 28 active cores -> 40X up to 28 active cores
On this Xeon w7-3465X, for example, by enabling the preset, the frequency in all situations (except for when two cores are active) increases. The all-core frequency even increases by 800 MHz from 3.2 GHz to 4.0 GHz.
In addition, it also adjusts the Per Core Ratio Limits for a variety of cores. You can use Intel Extreme Tuning Utility to check each core’s ratio limit. Instead of having two cores that go up to 48X, two cores that go up to 47X, and 24 cores that go up to 46X, the water-cooled OC profile has four cores going up to 48X, 12 cores going up to 46X, four cores up to 45X, four cores up to 44X, and four cores up to 40X.
Ratio Limit | Default | WaterCooled Profile |
48X | 2 cores | 4 cores |
47X | 2 cores | |
46X | 24 cores | 12 cores |
45X | | 4 cores |
44X | | 4 cores |
40X | | 4 cores |
While it seemingly does this without any voltage adjustments, we can see some adjustments when we have a closer look.
Twenty-four cores get assigned a core-specific adaptive voltage ranging from 1.1V for 44X to 1.2V for 48X. The four cores with a ratio limit of 40X get an override voltage of 0.9V. This override voltage is likely chosen to help lower temperatures in all-core heavy workloads. We’ll discuss the voltage on Sapphire Rapids later in the article.
As I said at the beginning of the OC Strategy, the preset isn’t entirely stable on our system. As you’ll see in a minute, while it passes the light to medium workloads, it fails on most heavy workloads. The reason is simple: the voltage adjustments are too aggressive for this particular CPU, causing some cores to be unstable.
BIOS Settings & Benchmark Results
Upon entering the BIOS
- Go to the Ai Tweaker menu
- Set ASUS MultiCore Enhancement to Enabled – Remove All limits
- Set CPU Core Ratio to Water-Cooled OC Preset
- Set DRAM Frequency to DDR5-6800MHz
- Enter the DRAM Timing Control submenu
- Enter the Memory Presets submenu
- Select Load Hynix 6800 1.4V 8x16GB SR
- Select Yes
Then save and exit the BIOS.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- SuperPI 4M: +5.84%
- Geekbench 6 (single): +6.59%
- Geekbench 6 (multi): +19.19%
- Cinebench R23 Single: +6.18%
- Cinebench R23 Multi: FAIL
- CPU-Z V17.01.64 Single: +6.01%
- CPU-Z V17.01.64 Multi: +23.84%
- V-Ray 5: +29.45%
- AI Benchmark: FAIL
- Y-Cruncher PI MT 25B: FAIL
- Blender Monster: FAIL
- Blender Classroom: FAIL
- 3DMark Night Raid: +19.01%
- Nero Score: +18.22%
- Handbrake: FAIL
- CS:GO FPS Bench: FAIL
- Tomb Raider: +2.54%
- Final Fantasy XV: +3.09%
Here are the 3DMark CPU Profile scores
- CPU Profile 1 Thread: +16.23%
- CPU Profile 2 Threads: +14.69%
- CPU Profile 4 Threads: +33.00%
- CPU Profile 8 Threads: +26.81%
- CPU Profile 16 Threads: +25.11%
- CPU Profile Max Threads: +29.98%
When enabling the Water-Cooled OC Preset, the system becomes unstable in most heavy workloads. However, the performance improvement is significant when the system is stable enough to pass the benchmarks. We get a maximum performance improvement of +33% in 3DMark CPU Profile 4 Threads.
The system is unstable when running Prime 95 Small FFTs with AVX2 enabled. When running Prime 95 Small FFTs with AVX disabled, the system is also unstable.
OC Strategy #3: Simple Basic Overclock
In our third overclocking strategy, we pursue a simple basic overclock. Since the Water-Cooled OC Preset failed, the goal is to get a simple, stable overclock. We rely primarily on the Turbo Boost 2.0 ratio control without touching the core voltage, so this will be a quick OC Strategy.
Turbo Boost 2.0 Ratio Configuration
As I explained earlier in the article, Turbo Boost 2.0 ratio configuration allows us to configure the overclock for different scenarios ranging from 1 active core to all active cores. That enables us to run some cores significantly faster than others when the conditions are right.
From the Xeon w7-3465X specification, we know that every core is spec’d to run up to 46X, that four ‘favored’ cores are spec’d up to at least 47X, and that two of those, the ‘super-favored’ cores, can even go up to 48X. However, we also know that the CPU is limited to much lower frequencies as more cores become active.
So, we can use this information to our advantage to extract a lot more performance in multi-threaded applications.
- 48X up to 2 active cores -> 48X up to 2 active cores
- 47X up to 4 active cores -> 47X up to 4 active cores
- 43X up to 12 active cores -> 46X up to 28 active cores
- 39X up to 16 active cores -> 46X up to 28 active cores
- 36X up to 20 active cores -> 46X up to 28 active cores
- 33X up to 24 active cores -> 46X up to 28 active cores
- 32X up to 28 active cores -> 46X up to 28 active cores
In this overclocking strategy, we simply raise the Turbo Boost 2.0 ratio to 46X whenever five or more cores are active. That should give us a boost of 300 MHz when five cores are active, up to a boost of 1.4 GHz when all 28 cores are active.
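The expected frequency gain per active core count can be checked with a few lines of Python. This is just the arithmetic from the paragraph above, assuming a 100 MHz BCLK; the table and function names are my own.

```python
BCLK_MHZ = 100  # assumes the default 100 MHz CPU base clock

# (max active cores, ratio) pairs for the default and overclocked configurations.
DEFAULT = [(2, 48), (4, 47), (12, 43), (16, 39), (20, 36), (24, 33), (28, 32)]
OVERCLOCK = [(2, 48), (4, 47), (28, 46)]


def ratio_for(active_cores, table):
    """Return the first ratio whose active-core bucket covers active_cores."""
    return next(ratio for max_cores, ratio in table if active_cores <= max_cores)


def uplift_mhz(active_cores):
    """Frequency gain of the overclock over default for this active core count."""
    gain = ratio_for(active_cores, OVERCLOCK) - ratio_for(active_cores, DEFAULT)
    return gain * BCLK_MHZ


for cores in (2, 5, 16, 28):
    print(f"{cores:2d} active cores: +{uplift_mhz(cores)} MHz")
```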
As I pointed out in the previous OC Strategy, since every core has a V/F curve up to at least its maximum default frequency, we don’t have to adjust any voltage. The CPU can use the default V/F curve to regulate each core’s voltage.
However, we do make one change to the voltages. We slightly increase the VccIN voltage to 2.4V. That’s to make it easier on the VccIN VRM. After all, a higher voltage means a lower current at a given input power in Watts. I’ll discuss Sapphire Rapids voltage configuration in the next OC Strategy.
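The relationship between input voltage and current is simply I = P / V. The Python sketch below compares the VccIN current at two input voltages for an illustrative 800 W load. Both the 800 W figure and the 1.8 V reference (taken from the nominal value in the BIOS option name) are assumptions for illustration, and the calculation ignores any losses inside the FIVR.

```python
def input_current_a(power_w, vccin_v):
    """Current through the VccIN rail for a given input power (I = P / V)."""
    return power_w / vccin_v


power_w = 800.0  # illustrative heavy all-core load, not a measurement
for vccin_v in (1.8, 2.4):
    amps = input_current_a(power_w, vccin_v)
    print(f"VccIN {vccin_v:.1f} V -> {amps:.0f} A")
```

Under these assumptions, going from 1.8V to 2.4V cuts the current the VRM has to deliver by a quarter.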
BIOS Settings & Benchmark Results
Upon entering the BIOS
- Go to the Ai Tweaker menu
- Set ASUS MultiCore Enhancement to Enabled – Remove All limits
- Set CPU Core Ratio to By Core Usage
- Enter the By Core Usage submenu
- Set Turbo Ratio Limit 1 to 48
- Set Turbo Ratio Cores 1 to 2
- Set Turbo Ratio Limit 2 to 47
- Set Turbo Ratio Cores 2 to 4
- Set Turbo Ratio Limit 3 to 46
- Set Turbo Ratio Cores 3 to 28
- Leave the By Core Usage submenu
- Enter the Specific Core submenu
- Set double * Core Specific Ratio Limit to 48
- Set single * Core Specific Ratio Limit to 47
- Set *-less Core Specific Ratio Limit to 46
- Leave the Specific Core submenu
- Set DRAM Frequency to DDR5-6800MHz
- Enter the DRAM Timing Control submenu
- Enter the Memory Presets submenu
- Select Load Hynix 6800 1.4V 8x16GB SR
- Select Yes
- Leave the Memory Presets submenu
- Leave the DRAM Timing Control submenu
- Set Vcore 1.8V IN to Manual mode
- Set CPU Core Voltage Override to 2.4
Then save and exit the BIOS.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- SuperPI 4M: +5.13%
- Geekbench 6 (single): +8.08%
- Geekbench 6 (multi): +29.33%
- Cinebench R23 Single: +6.24%
- Cinebench R23 Multi: +45.69%
- CPU-Z V17.01.64 Single: +6.32%
- CPU-Z V17.01.64 Multi: +42.80%
- V-Ray 5: +46.46%
- AI Benchmark: +32.93%
- Y-Cruncher PI MT 25B: +50.76%
- Blender Monster: +45.10%
- Blender Classroom: +46.46%
- 3DMark Night Raid: +34.76%
- Nero Score: +27.79%
- Handbrake: +46.34%
- CS:GO FPS Bench: +19.15%
- Tomb Raider: +3.05%
- Final Fantasy XV: +2.50%
Here are the 3DMark CPU Profile scores
- CPU Profile 1 Thread: +16.69%
- CPU Profile 2 Threads: +15.27%
- CPU Profile 4 Threads: +43.99%
- CPU Profile 8 Threads: +44.33%
- CPU Profile 16 Threads: +42.88%
- CPU Profile Max Threads: +43.56%
We significantly improve the system performance after these minor adjustments in the Turbo Boost 2.0 Ratio Configuration. The heavy all-core workloads improve in particular. We see a maximum performance increase of +50.76% in Y-Cruncher.
When running Prime 95 Small FFTs with AVX2 enabled, the average CPU effective clock is 4432 MHz with 1.148 volts. The average CPU temperature is 96.0 degrees Celsius. The ambient and water temperatures are 33.0 and 46.1 degrees Celsius, respectively. The average CPU package power is 797.4 watts.
When running Prime 95 Small FFTs with AVX disabled, the average CPU effective clock is 4587 MHz with 1.180 volts. The average CPU temperature is 95.0 degrees Celsius. The ambient and water temperatures are 31.8 and 42.3 degrees Celsius, respectively. The average CPU package power is 681.8 watts.
OC Strategy #4: Simple Dynamic Overclock
We pursue a modern, dynamic manual overclock in our final overclocking strategy. To explore how we can do this, we must discuss Intel’s overclocking toolkit for Sapphire Rapids and briefly talk about the Sapphire Rapids clocking and voltage topologies.
Sapphire Rapids Clocking Topology
The clocking of a standard Sapphire Rapids platform slightly differs from what we’re used to with mainstream platforms. The standard clocking topology relies on a 25 MHz crystal or crystal oscillator input to an external CK440Q clock generator which then connects to one or more DB2000Q differential buffer devices. The platform supports two clocking topologies: balanced and unbalanced.
- Balanced: all CPU BCLKs and PCIe reference clocks are driven by the same DB or by different DBs at the same depth level.
- Unbalanced: the CPU BCLKs are driven by a DB and the PCIe reference clocks by the extCLK/PCHCLK or another DB, or vice versa.
The specific implementation depends on your choice of motherboard. Ideally, we would isolate the CPU BCLK from any PCIe reference clocks. However, it seems that this unbalanced architecture is currently not working very well. So you’ll likely see all motherboards adopting a balanced clocking architecture. That means if you increase the CPU BCLK, you also increase the CPU PCIe clock frequency.
Either way, the external clock generator generates multiple 100 or 25 MHz clock sources. These sources can be used in a variety of ways:
- 100 MHz CPU base clock frequency
- 100 MHz CPU PCIe clock frequency
- 100 MHz PCH PCIe clock frequency
- 100 MHz NIC clock frequency
- 100 MHz clock input for the PCH
The 100 MHz CPU BCLK is then multiplied with specific ratios for each of the different parts in the CPU.
Each P-core can run at its own independent frequency. The Mesh PLL ties together the last-level cache, cache box, and seemingly also the memory controller. It can run at a frequency independent of the P-cores. On the multi-tile dies of the W-3400 processors, the Mesh ratio is limited to 27X.
The memory frequency is also driven by the CPU BCLK and multiplied by a memory ratio. Unlike on mainstream desktop, the memory frequency is not tied to the memory controller frequency and can run independently. The memory ratio goes up to 88X or a frequency of up to DDR5-8800.
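Every clock domain described above follows the same rule: frequency equals BCLK times ratio. Here is a short Python sketch of that arithmetic, using ratios mentioned in this article. The 102 MHz BCLK in the last line is a hypothetical value, included only to show how a BCLK change scales a domain.

```python
BCLK_MHZ = 100.0  # default CPU base clock


def domain_mhz(ratio, bclk_mhz=BCLK_MHZ):
    """Frequency of a clock domain as BCLK multiplied by its ratio."""
    return ratio * bclk_mhz


print(f"P-core at 51X:   {domain_mhz(51):.0f} MHz")
print(f"Mesh at 27X:     {domain_mhz(27):.0f} MHz")
print(f"Memory at 88X:   DDR5-{domain_mhz(88):.0f}")
# Hypothetical BCLK bump: every domain tied to BCLK scales with it.
print(f"P-core at 51X with 102 MHz BCLK: {domain_mhz(51, 102.0):.0f} MHz")
```

With a balanced clocking topology, the same BCLK increase would also raise the CPU PCIe clock, which is why ratio-based overclocking is the more practical route on this platform.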
Sapphire Rapids Voltage Topology
Sapphire Rapids uses a combination of fully integrated voltage regulators (FIVR) and motherboard voltage regulators (MBVR) for power management. There are eight (8) distinct voltage inputs to a Sapphire Rapids processor, most of which power a FIVR. The FIVR then manages the voltage provided to specific parts of the CPU. The end user can control some of these voltages.
The voltages most relevant for Sapphire Rapids W-3400 processor overclocking are those driven by the VccIN, including the P-core and Mesh voltages, and, to a lesser extent, the voltages driven by the VccFA_EHV, including VccCFN and VccMDFI.
Sapphire Rapids Overclocking Toolkit
I described the history of Intel’s overclocking toolkit in a video on my YouTube channel titled: “How is Alder Lake Non-K Overclocking Even Possible?!” Long story short, Intel developed and maintained a technology called the OC Mailbox which contains the entire overclocker’s toolkit. This toolkit is not always the same for each CPU architecture, as sometimes we need different tools.
On Sapphire Rapids, the overclocking toolkit consists of the following tools:
- Per Core ratio and voltage control
- Mesh ratio and voltage control
- DRAM ratio control
- AVX2, AVX-512, and TMUL ratio offset
- Turbo Boost 2.0 ratio and power control
- Turbo Boost Max 3.0 ratio control
- SVID disable
- XMP 3.0 support
- XTU Support
Notably missing from the OC toolbox are prominent features we know from mainstream desktop like Advanced Voltage Offset, better known as V/F points, and OverClocking Thermal Velocity Boost, or OCTVB.
Sapphire Rapids Adaptive Voltage Mode
Like any previous Intel architecture, there are two main ways of configuring the voltage for the CPU cores: override mode and adaptive mode.
- Override mode specifies a single static voltage across all ratios. It is mainly used for extreme overclocking where stability at high frequencies is the only consideration.
- Adaptive mode is the standard mode of operation. In Adaptive Mode, the CPU relies on the factory-fused voltage-frequency curves to set the appropriate voltage for a given ratio. When configuring an adaptive voltage, it is mapped against the “OC Ratio,” the highest configured ratio. We’ll get back to that in a minute.
Since Sapphire Rapids uses FIVR, we can only adjust the core voltage by configuring the CPU PCU via BIOS or specialized tools like XTU.
We can specify a voltage offset for override and adaptive modes. Of course, this doesn’t make much sense for override mode – if you set 1.15V with a +50mV offset, you could just set 1.20V – but it can be helpful in adaptive mode as you can offset the entire V/F curve by up to 500mV in both directions.
On Sapphire Rapids, you can configure the override or adaptive voltage on a Global or Per-Core level. Let’s focus on adaptive mode voltage configuration and first look at how it works for a single core.
When we set an adaptive voltage for a core, this voltage is mapped against the “OC Ratio.” The “OC Ratio” is the highest ratio configured for the CPU across all settings and cores. The default maximum turbo ratio determines the OC ratio when you leave everything at default. In the case of the Xeon w7-3465X, that ratio would be 48X because of the Turbo Boost Max 3.0. The “OC Ratio” equals the highest configured ratio if you overclock.
Specific rules govern what adaptive voltage can be set.
A) the voltage set for a given ratio n must be higher than or equal to the voltage set for ratio n-1.
Suppose our Xeon w7-3465X runs 48X at 1.30V. In that case, setting the adaptive voltage, mapped to 48X, lower than 1.30V, is pointless. 48X always runs at 1.30V or higher. Usually, BIOSes may allow you to configure lower values. However, the CPU’s internal mechanisms will override your configuration if it doesn’t follow the rules.
B) the adaptive voltage configured for any ratio below the maximum default turbo ratio will be ignored.
Take the same example of the Xeon w7-3465X, specified to run 48X at 1.30V. If you try to configure all cores to 45X and set 1.10V, the CPU will ignore this because it has its own factory-fused target voltage for all ratios up to 48X and will use this voltage. You can only change the voltage of the OC Ratio, which, as mentioned before, on the Xeon w7-3465X, is 48X and up.
C) for ratios between the OC Ratio and the next highest factory-fused V/f point, the voltage is interpolated between the set adaptive voltage and the factory-fused voltage.
Returning to our example of a Xeon w7-3465X, specified to run 48X at 1.30V, let’s say we manually configure the OC ratio to be 52X at 1.40V. The target voltage for ratios 51X, 50X, and 49X will now be interpolated between 1.30V and 1.40V.
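To make the three rules concrete, here is a minimal Python sketch of the 48X at 1.30V and 52X at 1.40V example. It assumes the interpolation is a straight line between the two points, which the text above does not state, and the function name `adaptive_curve` is purely illustrative.

```python
def adaptive_curve(fused_ratio, fused_v, oc_ratio, oc_v):
    """Target voltage per ratio from the top fused point up to the OC Ratio.

    Assumes a straight line between the two points; the real curve may differ.
    """
    if oc_ratio <= fused_ratio:
        # Rule B: at or below the default maximum ratio, the fused voltage wins.
        return {fused_ratio: fused_v}
    # Rule A: a higher ratio can't get a lower voltage than the ratio beneath it.
    oc_v = max(oc_v, fused_v)
    span = oc_ratio - fused_ratio
    # Rule C: interpolate the ratios in between.
    return {
        ratio: round(fused_v + (oc_v - fused_v) * (ratio - fused_ratio) / span, 4)
        for ratio in range(fused_ratio, oc_ratio + 1)
    }


curve = adaptive_curve(48, 1.30, 52, 1.40)
for ratio, volts in curve.items():
    print(f"{ratio}X -> {volts:.3f} V")
```

Under that linear assumption, 49X, 50X, and 51X land at roughly 1.325V, 1.350V, and 1.375V. Asking for 45X at 1.10V simply returns the fused 48X point, mirroring rule B.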
As I mentioned already, we can do this for each core individually. However, that would be rather painful, especially on a 56-core CPU! Fortunately, there’s also an alternative way to set a global adaptive voltage.
When we set a global adaptive voltage, it maps this voltage to the OC Ratio for each core in our CPU. So, if our OC Ratio is 52X and the global adaptive voltage is 1.40V, then every core in our CPU has a voltage frequency curve that goes up to 52X at 1.40V. That certainly makes things easier.
Sapphire Rapids Per Core Ratio & Voltage Control
While we only recently saw the addition of per-core ratio control on mainstream desktop with Rocket Lake, on the high-end desktop, the ability to control the maximum ratio and voltage for each core has been around since Broadwell-E in 2016.
The Per Core Ratio Limit and Voltage control options let you control the upper end of the voltage-frequency curve of each core inside your CPU. While the general rules of adaptive voltage mode still apply, this enables two crucial new avenues for CPU overclocking.
- First, it allows users to overclock each core and find its maximum stable frequency individually.
- Second, it allows users to set an aggressive By Core Usage overclock while constraining the worst cores.
Since each core has an independent FIVR-regulated power rail, it’s possible to finetune each core to its maximum capability.
When we set a Per Core Ratio Limit, counter-intuitively, this Ratio doesn’t act as a core-specific OC Ratio but as a means to limit what parts of the V/F curve can be used. Let’s use that same example of the 52X at 1.40V. If we set the Per-Core Ratio Limit to 51X, the CPU core will boost up to 5.1 GHz at a voltage interpolated between 52X at 1.40V and 48X at 1.30V.
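As a rough illustration of that example, the sketch below caps a core at its Per Core Ratio Limit and reads the voltage off the curve. The linear interpolation and the helper name `capped_operating_point` are assumptions for illustration only, not something the CPU documentation confirms.

```python
def capped_operating_point(per_core_limit, fused_ratio, fused_v, oc_ratio, oc_v):
    """Highest ratio a core may use and the interpolated voltage at that ratio."""
    if not fused_ratio < oc_ratio:
        raise ValueError("OC Ratio must be above the top fused ratio")
    # The limit only restricts how far up the curve the core may go.
    top_ratio = min(per_core_limit, oc_ratio)
    if top_ratio < fused_ratio:
        raise ValueError("limits below the fused top point are not modeled here")
    fraction = (top_ratio - fused_ratio) / (oc_ratio - fused_ratio)
    return top_ratio, fused_v + (oc_v - fused_v) * fraction


ratio, volts = capped_operating_point(51, 48, 1.30, 52, 1.40)
print(f"{ratio}X at about {volts:.3f} V")
```

With a straight-line curve, the 51X limit would put the core at about 1.375V, three quarters of the way from 1.30V to 1.40V.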
Similarly, if we set a Per Core Adaptive Voltage, this voltage is mapped to the OC Ratio. However, the voltage interpolation is based on the core-specific voltage-frequency curve. So, while each core has the same OC Ratio mapped to a core-specific adaptive voltage, each core will have a unique voltage-frequency curve.
I will dig deeper into this topic when we discuss the manual tuning process. But first, let’s have a closer look at the voltage-frequency curves.
Xeon w7-3465X Voltage-Frequency Curve
Each core inside this Xeon w7-3465X has its own factory-fused voltage-frequency curve. According to the specification, two cores can run up to 4.8 GHz, two cores can run up to 4.7 GHz, and the remaining 24 of the 28 cores can only run up to 4.6 GHz. We’d expect each of the twenty-eight cores to have a factory-fused voltage specific to its maximum ratio. And according to the Adaptive Voltage rules, when you increase a core’s ratio without adjusting the adaptive voltage, the core will use the voltage of its highest V/F point.
We can investigate this theory by mapping the voltage-frequency curve for each core of this CPU. The process is relatively simple.
- Set the C-State to C0/C1 in the BIOS to ensure the CPU cores always run at the maximum frequency.
- We use Shamino’s OC Tool or Intel Extreme Tuning Utility to set all cores to a fixed frequency. Double-check that each core’s Per Core Ratio Limit is also set to this frequency.
- Then, we use HWiNFO to check the maximum VID for each core.
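Once the readings are collected, turning them into per-core curves is simple bookkeeping. The sketch below assumes you have noted each reading as a `(core, ratio, VID)` tuple by hand or from a log; the sample values are made up for illustration and are not measurements from this CPU.

```python
from collections import defaultdict


def build_vf_curves(samples):
    """Keep the highest VID seen for every (core, ratio) pair."""
    curves = defaultdict(dict)
    for core, ratio, vid in samples:
        previous = curves[core].get(ratio, 0.0)
        curves[core][ratio] = max(previous, vid)
    # Sort each core's points by ratio so the curve reads from low to high.
    return {core: dict(sorted(points.items())) for core, points in curves.items()}


# Hypothetical readings: (core index, fixed ratio, VID in volts).
samples = [
    (0, 46, 1.190), (0, 46, 1.201), (0, 48, 1.262),
    (7, 46, 1.290), (7, 46, 1.285),
]
curves = build_vf_curves(samples)
for core, points in curves.items():
    print(core, points)
```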
When we checked the voltage-frequency curves for this Xeon w7-3465X, we didn’t exactly find what we expected. We can put the 28 cores in 4 buckets: “rightfully favored cores” (RFCs), “unrightfully non-favored cores” (UNFCs), “normal, regular cores” (NRCs), and “shitty regular cores” (SRCs).
- RFCs are favored and have a V/F curve up to 48X. This group includes Cores 0, 1, 3, and 20.
- UNFCs are not favored but still have a V/F curve up to 48X. This group includes Cores 5, 17, and 19.
- NRCs are regular cores with a V/F curve up to only 46X and a reasonable maximum voltage. This group includes Cores 2, 4, 6, 8, 10, 11, 12, 13, 14, 15, 16, 18, 21, 22, 23, 25, 26, and 27.
- SRCs are also regular cores with a V/F curve up to only 46X, but the highest voltage is exceptionally high. This group includes Cores 7, 9, and 24.
Not only do we have unfavored cores with V/F curves as if they are favored cores, but we also have three regular cores with an exceptionally high maximum VID. This information will become relevant when we finally start our manual tuning.
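For reference, the four buckets can be written down as a small lookup table. The sketch below uses only the core lists given above and checks that they cover all 28 cores exactly once; the variable names are just for illustration.

```python
# Core lists exactly as given in the four buckets above.
buckets = {
    "RFC": [0, 1, 3, 20],
    "UNFC": [5, 17, 19],
    "NRC": [2, 4, 6, 8, 10, 11, 12, 13, 14, 15, 16, 18, 21, 22, 23, 25, 26, 27],
    "SRC": [7, 9, 24],
}

# Reverse lookup: core index -> bucket name.
core_bucket = {core: name for name, cores in buckets.items() for core in cores}

# Highest fused V/F ratio per core: 48X for RFC/UNFC, 46X for NRC/SRC.
top_fused_ratio = {
    core: 48 if name in ("RFC", "UNFC") else 46
    for core, name in core_bucket.items()
}

cores_at_48x = sorted(c for c, r in top_fused_ratio.items() if r == 48)
print(len(core_bucket), "cores total;", len(cores_at_48x), "with a curve up to 48X")
```

That gives seven cores with a curve up to 48X and twenty-one that stop at 46X.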
Sapphire Rapids AVX2, AVX-512, and TMUL Ratio Offset
Intel first introduced the AVX negative ratio offset on Broadwell-E processors. Successive processors adopted this feature and eventually expanded it with AVX2 and AVX-512 negative offsets.
New on Sapphire Rapids is the addition of the TMUL ratio offset. TMUL stands for Tile matrix MULtiply and is an Intel Advanced Matrix Extensions (AMX) technology component. It’s designed to accelerate AI and deep learning workloads.
The ratio offsets help achieve maximum performance for SSE, AVX, and AMX workloads. The rabbit hole of AVX offset goes deep, but it suffices to know three pieces of information for this guide.
- By default, the AVX ratio offset is 0;
- The AVX ratio offset is applied on a per-core basis as it’s subtracted from each core’s Per Core Ratio Limit;
- The ratios are triggered based on workload intensity, not necessarily what type of instructions.
That last part is crucial because “light” AVX loads may not trigger the offset, and heavy non-AVX workloads may.
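A tiny sketch of the second point: the offset is subtracted from each core’s own ratio limit. Whether the offset triggers depends on workload intensity inside the CPU, so it is modeled here as a plain boolean, which is a deliberate simplification. The function name is hypothetical.

```python
def effective_ratio(per_core_limit, avx2_offset=0, offset_triggered=False):
    """Ratio ceiling for one core once the AVX2 offset is (or isn't) applied."""
    if offset_triggered:
        return per_core_limit - avx2_offset
    return per_core_limit


# Example values: a 51X core and a 50X core, each with an offset of 3.
print(effective_ratio(51, avx2_offset=3, offset_triggered=True))
print(effective_ratio(50, avx2_offset=3, offset_triggered=True))
print(effective_ratio(51, avx2_offset=3, offset_triggered=False))
```

Because the subtraction happens per core, two cores with different ratio limits end up at different ceilings under the same offset.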
Xeon w7-3465X Simple Dynamic OC Manual Tuning
As demonstrated in my previous Sapphire Rapids overclocking guides, setting up a dynamic overclock is tricky with these CPUs. There are two main reasons for that.
- As I highlighted, the CPU core V/F curves are strange. Not only do some cores have curves beyond their maximum allowed default frequency, but some also have maximum voltages that are exceptionally high. For example, non-favored Core 17 has a V/F curve up to 48X at 1.31V!
- This CPU has 28 cores that can be tuned individually. Finding the most optimal overclock for each core will be a painstaking activity unless we find tuning shortcuts. Since I’ve received many requests to provide more detail on my manual tuning process, let’s dig into the practical side of Sapphire Rapids tuning.
The first step is to set an overclocking objective. In this case, I aim to get more single-threaded performance by overclocking each core to its approximate maximum stable frequency. Next, the maximum voltage I’m comfortable with will determine the stable frequency limit. I know that the maximum factory-fused voltage is 1.31V for Core 17. So, I pick a maximum voltage of 1.35V.
When I set an adaptive core voltage of 1.35V, it means every core will have the voltage of its highest V/F point set at 1.35V. What that V/F point’s frequency is … well, that’s what we need to figure out. I try to shorten the test procedure by relying on CoreCycler.
Sp00n initially developed this nifty stability test script to test Curve Optimizer for AMD CPUs, but I often use it on other platforms too. In short, the script cycles through each CPU core, applying a workload of your choice like Prime95, Y-Cruncher, or AIDA, and tells you which core is unstable.
First, I ensure each core runs stable through 30 seconds of CoreCycler Prime95 SSE. Then I verify the stability for 30 seconds of CoreCycler Prime95 AVX2. Lastly, I also run 30 seconds of CoreCycler Y-Cruncher 00-x86. If my configuration passes all these tests, I try higher frequencies.
Sounds simple, doesn’t it? Yes and no. You need to be aware that increasing the frequency may also decrease the voltage for a given V/F Point. Let me explain by looking back at our CPU’s voltage-frequency curve.
As we know, each core has its own factory-fused voltage-frequency curve. At default, each V/F curve has a maximum voltage equal to the highest V/F Point. For some cores, that’s at 46X, and for others at 48X.
Let’s say we set all cores to 50X with an adaptive voltage of 1.35V. This is what the V/F curve looks like for each core.
Now let’s see what happens when we increase the frequency to 52X but keep the same voltage of 1.35V. This is what the V/F curve looks like across all cores.
As expected, the V/F curves now converge to 52X at 1.35V. But did you also notice that the voltage for most V/F points is lower than before? While at 45X, the voltage ranges from 1.136V for Core 20 to 1.251V for Core 7, all V/F curves converge to 52X at 1.35V.
We can better illustrate this by mapping the average V/F point for each core. Let’s compare three data points: the factory-fused V/F curve, a manual curve to 50X at 1.35V, and a manual curve at 52X. Let’s compare the V/F point for 49X.
- Factory-fused AVG V/F Point: 49X at 1.214V
- 50X at 1.35V AVG V/F Point: 49X at 1.313V
- 52X at 1.35V AVG V/F Point: 49X at 1.279V
So, as we increase the OC Ratio while keeping the same adaptive voltage, the voltage for ratios between our factory-fused highest V/F point and the OC Ratio will reduce. If we’d set 55X at 1.35V, the average voltage at 49X would be 1.257V.
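The trend is easy to reproduce with a simplified model. The sketch below assumes a single representative core whose highest fused point is 46X at 1.214V (the average factory voltage quoted above, applied to the most common 46X top point) and assumes linear interpolation up to the OC Ratio. It is not expected to match the measured averages exactly, since the real numbers average over 28 different curves.

```python
def voltage_at(ratio, fused_ratio, fused_v, oc_ratio, oc_v):
    """Linearly interpolated voltage between the top fused point and the OC Ratio."""
    fraction = (ratio - fused_ratio) / (oc_ratio - fused_ratio)
    return fused_v + (oc_v - fused_v) * fraction


# Representative core (assumption): top fused point 46X at 1.214 V.
estimates = {
    oc_ratio: round(voltage_at(49, 46, 1.214, oc_ratio, 1.35), 3)
    for oc_ratio in (50, 52, 55)
}
for oc_ratio, volts in estimates.items():
    print(f"OC Ratio {oc_ratio}X at 1.35 V -> 49X at about {volts} V")
```

This model gives about 1.316V, 1.282V, and 1.259V at 49X, close to the measured 1.313V, 1.279V, and 1.257V and showing the same downward trend.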
As I said, this is a crucial aspect of Per Core manual tuning on Sapphire Rapids. The implication is that if a core is stable with 50X at 1.35V, the core may become unstable when we set 52X at 1.35V even if we set the Per Core Ratio Limit to 50X! That’s why I run CoreCycler across all cores every time I increase the frequency.
For configuring the overclock, you can rely on any tool. For this guide, I choose Intel Extreme Tuning Utility. First, I set 50X and 1.35V in the BIOS. Then I run the CoreCycler tests. If a core fails, I reduce that core’s ratio limit by 1X and try again.
Here’s the complete tuning test log. As you can see, it’s not a quick process, as many cores can fail for any given reason. During my testing, many cores could reach 54X but ultimately ran into strange stability issues when I set the By Core Usage ratio over 51X from the BIOS.
After spending way more time than I should trying to figure out what was going on, I settled for a modest manual overclock of 51X for all cores, except Core 7, and an AVX offset of -3.
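The test-and-back-off routine described above boils down to a simple loop. The sketch below is only a model of that bookkeeping: `is_stable` is a stand-in for running the CoreCycler passes yourself and noting the result, not a call into CoreCycler, and the per-core limits in the example are made up. It also ignores the interaction described earlier, where raising the OC Ratio changes the voltages lower on the curve.

```python
def tune_ratio_limits(cores, start_ratio, is_stable, floor_ratio):
    """Lower each failing core's ratio limit by 1X until it passes or hits the floor."""
    limits = {core: start_ratio for core in cores}
    pending = set(cores)
    while pending:
        failed = {core for core in pending if not is_stable(core, limits[core])}
        for core in failed:
            limits[core] -= 1
        # Only re-test cores that failed and still have room to drop.
        pending = {core for core in failed if limits[core] > floor_ratio}
    return limits


# Hypothetical "true" stability ceilings, used only to drive the example.
max_stable = {0: 52, 7: 49, 20: 50}
limits = tune_ratio_limits(
    cores=max_stable,
    start_ratio=50,
    is_stable=lambda core, ratio: ratio <= max_stable[core],
    floor_ratio=46,
)
print(limits)
```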
ElmorLabs PMD-USB
Lastly, I added the ElmorLabs PMD-USB to my setup to measure the input CPU power to the motherboard. That’s because while the reported CPU Package Power in HWiNFO for this CPU during a Prime95 AVX2 stability test run is about 800W, my wall socket power measurement showed 1500W. That’s a large discrepancy, so I wanted to see how much power the CPU pulls during the stability testing. As you’ll see in a minute, the power draw is over 1100W.
BIOS Settings & Benchmark Results
Upon entering the BIOS
- Go to the Ai Tweaker menu
- Set ASUS MultiCore Enhancement to Enabled – Remove All limits
- Set CPU Core Ratio to By Core Usage
- Enter the By Core Usage sub-menu
- Set Turbo Ratio Limit 1 to 51
- Set Turbo Ratio Cores 1 to 8
- Set Turbo Ratio Limit 2 to 49
- Set Turbo Ratio Cores 2 to 12
- Set Turbo Ratio Limit 3 to 48
- Set Turbo Ratio Cores 3 to 16
- Set Turbo Ratio Limit 4 to 47
- Set Turbo Ratio Cores 4 to 28
- Leave the By Core Usage sub-menu
- Enter the Specific Core submenu
- Set Core 7 Specific Ratio Limit to 50
- Leave the Specific Core submenu
- Set DRAM Frequency to DDR5-6800MHz
- Enter the AVX Related Controls submenu
- Set AVX2 Ratio Offset to per-core Ratio Limit to User Specify
- Set AVX2 Ratio Offset to 3
- Leave the AVX Related Controls submenu
- Enter the DRAM Timing Control submenu
- Enter the Memory Presets submenu
- Select Load Hynix 6800 1.4V 8x16GB SR
- Select Yes
- Leave the Memory Presets submenu
- Leave the DRAM Timing Control submenu
- Set Max. CPU Cache Ratio to 27
- Set Vcore 1.8V IN to Manual Mode
- Set CPU Core Voltage Override to 2.4
- Set Global Core SVID Voltage to Adaptive Mode
- Set Additional Turbo Mode CPU Core Voltage to 1.35
Then save and exit the BIOS.
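To visualize what the By Core Usage settings do, here is a small lookup sketch. It assumes each “Turbo Ratio Cores” value is the largest number of active cores that its “Turbo Ratio Limit” applies to, which is the usual reading of these options; the table and function names are illustrative.

```python
# (max active cores, ratio limit) pairs taken from the By Core Usage settings above.
TURBO_TABLE = [(8, 51), (12, 49), (16, 48), (28, 47)]


def ratio_for_active_cores(active_cores):
    """Return the turbo ratio limit that applies for a given active-core count."""
    if active_cores < 1:
        raise ValueError("at least one core must be active")
    for max_cores, ratio in TURBO_TABLE:
        if active_cores <= max_cores:
            return ratio
    raise ValueError("more active cores than the table covers")


for count in (1, 8, 9, 16, 28):
    print(f"{count} active cores -> {ratio_for_active_cores(count)}X")
```

Under that reading, the 51X ceiling only applies while eight or fewer cores are active, and a full 28-core load is limited to 47X before any AVX2 offset.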
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- SuperPI 4M: +12.12%
- Geekbench 6 (single): +13.97%
- Geekbench 6 (multi): +31.27%
- Cinebench R23 Single: +15.75%
- Cinebench R23 Multi: +47.52%
- CPU-Z V17.01.64 Single: +15.86%
- CPU-Z V17.01.64 Multi: +46.00%
- V-Ray 5: +50.60%
- AI Benchmark: +40.71%
- Y-Cruncher PI MT 10B: +46.65%
- Blender Monster: +48.07%
- Blender Classroom: +49.70%
- 3DMark Night Raid: +37.66%
- Nero Score: +28.78%
- Handbrake: +47.14%
- CS:GO FPS Bench: +20.65%
- Tomb Raider: +4.06%
- Final Fantasy XV: +3.23%
Here are the 3DMark CPU Profile scores
- CPU Profile 1 Thread: +23.13%
- CPU Profile 2 Threads: +21.10%
- CPU Profile 4 Threads: +49.01%
- CPU Profile 8 Threads: +46.01%
- CPU Profile 16 Threads: +46.20%
- CPU Profile Max Threads: +38.00%
We get a solid uplift in single and lightly threaded benchmarks by increasing the maximum core frequency of most cores by 500 MHz, from 4.6 GHz to 5.1 GHz. For example, we get a +23.13% performance uplift in 3DMark CPU Profile 1 Thread. Overall, we get the highest performance improvement of +50.60% in V-Ray 5.
When running Prime 95 Small FFTs with AVX2 enabled, the average CPU effective clock is 4368 MHz with 1.134 volts. The average CPU temperature is 96.0 degrees Celsius. The ambient and water temperatures are 33.0 and 45.2 degrees Celsius, respectively. The average CPU package power is 760.5 watts, and the average CPU input power is 1140.7 watts.
When running Prime 95 Small FFTs with AVX disabled, the average CPU effective clock is 4466 MHz with 1.154 volts. The average CPU temperature is 91.0 degrees Celsius. The ambient and water temperatures are 32.9 and 42.3 degrees Celsius, respectively. The average CPU package power is 639.4 watts, and the average CPU input power is 930.3 watts.
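To put the gap between reported package power and measured input power in numbers, here is a short worked calculation using only the four averages above. The helper name `power_gap` is illustrative, and the ratio is simply reported divided by measured; it is not presented as a VRM efficiency figure.

```python
def power_gap(package_w, input_w):
    """Return (input minus package in watts, package as a share of input)."""
    return input_w - package_w, package_w / input_w


runs = {
    "Small FFTs, AVX2 enabled": (760.5, 1140.7),
    "Small FFTs, AVX disabled": (639.4, 930.3),
}
results = {name: power_gap(*watts) for name, watts in runs.items()}
for name, (gap_w, share) in results.items():
    print(f"{name}: gap {gap_w:.1f} W, package is {share:.1%} of input")
```

In both runs, the reported package power accounts for roughly two thirds of what the PMD-USB measures at the CPU power inputs.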
Intel Xeon w7-3465X: Conclusion
All right, let us wrap this up.
It took me way too long to finish this overclocking guide. I assembled this Xeon w7-3465X system in the middle of May and only wrapped up testing three months later. The main reason is that Sapphire Rapids is a challenging architecture to finetune, both in terms of all-core stability in heavy workloads and single-core stability in light workloads.
For the heavy workloads, the issue comes down to the massive power requirements. Going from idle to an all-core overclocked Prime95 AVX2 run spikes the power consumption well over 1000W. That’s huge, and you must set up your system correctly.
For light workloads, it’s difficult to distinguish which cores are causing the instability when finetuning, especially in adaptive voltage mode. We effectively have only one OC V/F point that controls the V/F curves of all the CPU cores. That makes it tough to finesse the overclock.
But the performance improvements are worth the trouble. Getting over 50% extra performance with what I’d consider minimal tuning is pretty sweet.
Anyway, that’s all for today! I want to thank my Patreon supporters for supporting my work. If you have any questions or comments, please drop them in the comment section below.
See you next time!
Lex
Hey Pieter,
i get an easy 4.8 all-core OC with only ASUS MultiCore Enhancement to Enabled – Remove All limits, CPU Core Ratio to By Core Usage and disable C-states basically. This pushes the CPU under full load to 4.8ghz on all cores and close to 600W according to HWinfo. Its running at about 62° in my rig.
Where did you get your Memory from? I am stuck with Kingston 8x16gb 6000mt as it was/is the only OC 8x16gb memory available in Germany. Are the Hynix presets for the Zeta R5? The preset timings seem to be way better than Kingston XMP…
Thanks for your hard work and keep on going !
Cheers, Lex
Pieter
Thanks for the kind words!
The memory is not a retail kit but an engineering sample and I just got it for testing with Sapphire Rapids.