SkatterBencher #41: AMD Radeon RX 6500 XT Overclocked to 3002 MHz
We overclock the AMD Radeon RX 6500 XT up to 3002 MHz using the ElmorLabs EVC2SX and PARAMi water cooling.
It is the first time I’m overclocking an AMD graphics card for this blog and the first AMD graphics card I’ve overclocked in over ten years. The last AMD GPU I overclocked with serious intent was the Radeon HD 7870 back in 2012.
Like my GT 1030 video, my main goal was to get back up to speed on how AMD GPU overclocking works. Unfortunately, the outcome was quite disappointing because of AMD’s overclocking restrictions, which I’ll share in this blog post.
Let’s get started.
AMD Radeon RX 6500 XT: Introduction
The AMD Radeon RX 6500 XT is one of the two discrete graphics cards equipped with the Navi 24 GPU alongside its little brother, the RX 6400. Navi 24 features AMD’s latest RDNA2 architecture built on the TSMC 6nm process. It is, in fact, the first 6nm GPU.
Navi 24 is the most low-end GPU of the second generation Navi product lineup and features 16 Compute Units. That’s significantly less than the 80 Compute Units featured on the Navi 21 powered RX 6900 XT.
The Radeon RX 6500 XT has a rated TBP of 107W, a base clock of 2310 MHz, a game frequency of 2610 MHz, and a listed boost clock frequency of 2815 MHz. It also has 4GB of GDDR6 memory.
In today’s video, we cover five overclocking strategies.
- First, we increase the GPU frequency using AMD’s automatic overclocking feature
- Second, we manually increase the GPU and memory frequency with the AMD Adrenalin tuning toolset
- Third, we try to use the undervolting tool to increase the GPU frequency further
- Fourth, we use the ElmorLabs EVC2SX to work around some performance limiting issues
- Lastly, we slap on a water block to get the most performance out of our GPU
Before we jump into the overclocking, let us quickly go over the hardware and benchmarks we use in this video.
AMD Radeon RX 6500 XT: Platform Overview
Along with the GIGABYTE Radeon RX 6500 XT Eagle graphics card, in this guide, we use an Intel Core i9-12900KF processor, an ASUS ROG Maximus Z690 Apex motherboard, a pair of generic 16GB DDR5 Micron memory sticks, a 512GB Aorus M.2 NVMe SSD, an Antec HCP 1000W Platinum power supply, a Noctua NH-L9i-17xx chromax.black heatsink, the ElmorLabs Easy Fan Controller, the ElmorLabs EVC2SX, the ElmorLabs Power Measurement Device, EK-Pro QDC Kit P360 water cooling, and a PARAMi universal GPU water block. All this is mounted on top of our favorite Open Benchtable V2.
The cost of the components should be around $3,318.
- GIGABYTE Radeon™ RX 6500 XT EAGLE graphics card: $210
- Intel Core i9-12900KF processor: $573
- ASUS ROG Maximus Z690 Apex motherboard: $720
- 32GB DDR5-4800 Micron memory: $250
- AORUS RGB NVMe M.2 512GB SSD: $90
- Antec HCP 1000W Platinum power supply: $200
- Noctua NH-L9i-17xx chromax.black heatsink: $55
- ElmorLabs Easy Fan Controller: $20
- ElmorLabs EVC2SX: $32
- ElmorLabs Power Measurement Device: $45
- EK-Pro QDC Kit P360: $893
- PARAMi water block: $30
- Open Benchtable V2: $200
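As a quick sanity check on the total, here is a minimal Python sketch that adds up the prices from the list above. The dictionary is simply my transcription of that list, nothing more.

```python
# Prices in US dollars, transcribed from the component list above.
component_prices = {
    "GIGABYTE Radeon RX 6500 XT EAGLE graphics card": 210,
    "Intel Core i9-12900KF processor": 573,
    "ASUS ROG Maximus Z690 Apex motherboard": 720,
    "32GB DDR5-4800 Micron memory": 250,
    "AORUS RGB NVMe M.2 512GB SSD": 90,
    "Antec HCP 1000W Platinum power supply": 200,
    "Noctua NH-L9i-17xx chromax.black heatsink": 55,
    "ElmorLabs Easy Fan Controller": 20,
    "ElmorLabs EVC2SX": 32,
    "ElmorLabs Power Measurement Device": 45,
    "EK-Pro QDC Kit P360": 893,
    "PARAMi water block": 30,
    "Open Benchtable V2": 200,
}

total = sum(component_prices.values())
print(f"Total: ${total:,}")  # prints "Total: $3,318"
```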
AMD Radeon RX 6500 XT: Benchmark Software
We use Windows 11 and the following benchmark applications to measure performance and ensure system stability.
- Geekbench 5 (OpenCL, Vulkan) https://www.geekbench.com/
- Geeks3D FurMark https://geeks3d.com/furmark/
- PhysX FluidMark https://www.ozone3d.net/benchmarks/physx-fluidmark/
- 3DMark Night Raid https://www.3dmark.com/
- Simple Raytracing Benchmark https://marvizer.itch.io/simple-raytracing-benchmark
- Unigine Superposition https://benchmark.unigine.com/superposition
- Spaceship https://store.steampowered.com/app/1605230/Spaceship__Visual_Effect_Graph_Demo/
- Shadow of the Tomb Raider https://store.steampowered.com/app/750920/Shadow_of_the_Tomb_Raider_Definitive_Edition/
- CS:GO FPS Bench https://steamcommunity.com/sharedfiles/filedetails/?id=500334237
- Final Fantasy XV http://benchmark.finalfantasyxv.com/na/
- GPU-Z Render Test https://www.techpowerup.com/gpuz/
The benchmark selection is similar to the one we used in SkatterBencher #40 except for the Simple RayTracing Benchmark and Tomb Raider benchmark. Since RDNA2 has Ray Accelerators, it makes sense also to include a raytracing benchmark, and I included Tomb Raider to test the PCIe performance scaling.
AMD Radeon RX 6500 XT: Stock Performance
Before starting any overclocking, we must first check the system performance at default settings. Unlike some other cards on the market, this GIGABYTE RX 6500 XT EAGLE is not overclocked out of the box and thus runs the default AMD-rated frequencies.
Here is the benchmark performance at stock:
- Geekbench 5 OpenCL: 57,113 points
- Geekbench 5 Vulkan: 42,347 points
- Furmark 1080P: 5,296 points
- FluidMark 1080P: 4,573 points
- 3DMark Night Raid: 46,991 marks
- Simple RayTracing Benchmark: 19.58
- Unigine Superposition: 9,604 points
- Spaceship: 94.7 fps
- Shadow of the Tomb Raider: 77 fps
- CS:GO FPS Bench: 345.28 fps
- Final Fantasy XV: 67.89 fps
When running Furmark GPU Stress Test, the average GPU clock is 2493 MHz with 0.958 volts, and the GPU Memory clock is 2234 MHz with 1.36 volts. The average GPU and GPU Hot Spot temperature is 60.7 degrees Celsius and 74.2 degrees Celsius. The average TGP power is 79.976 watts.
When running the GPU-Z Render Test, the maximum GPU Clock is 2870 MHz with 1.195 volts.
PCIe Performance Scaling
A big talking point of the RX 6500 XT is that it’s physically limited to a PCIe 4.0 x4 interface.
PCIe 4.0 x4 offers 8GB/s of theoretical bandwidth in each direction, and the reduction of lanes means a cost-saving opportunity due to lower design complexity and fewer components required. On the flip side, tech media argued that some customers would use this low-end graphics card in a system that doesn’t support PCIe Gen 4. In that case, the PCIe bandwidth would downgrade to the 4 GB/s offered by PCIe 3.0 x4.
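For those who want to see where the 8 GB/s and 4 GB/s figures come from, here is a small Python sketch. It uses the per-lane transfer rates of 8 GT/s for PCIe 3.0 and 16 GT/s for PCIe 4.0, both with 128b/130b encoding. The function name is my own, and the result is a theoretical ceiling before any protocol overhead.

```python
# Per-lane transfer rates in GT/s. PCIe 3.0 and 4.0 both use 128b/130b encoding.
TRANSFER_RATE_GT_S = {"3.0": 8.0, "4.0": 16.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Theoretical one-way bandwidth in GB/s, before protocol overhead."""
    gigabits_per_second = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_second / 8  # 8 bits per byte


for generation, lanes in [("3.0", 4), ("4.0", 4), ("4.0", 8)]:
    bandwidth = pcie_bandwidth_gb_s(generation, lanes)
    print(f"PCIe {generation} x{lanes}: {bandwidth:.2f} GB/s")
```

The exact values land slightly below the rounded 4 GB/s and 8 GB/s because of the encoding overhead.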
TechPowerUp published a detailed article on PCIe performance scaling with the RX 6500 XT graphics card, and I suggest you check it out for more details.
The article points out a 13% performance loss when downgrading from PCIe 4.0 to PCIe 3.0. Furthermore, they also estimate a 6-10% performance loss compared to if AMD had chosen a PCIe 4.0 x8 configuration.
Out of curiosity, we checked if there was any performance scaling from PCIe overclocking. We first check the theoretical bandwidth using the 3DMark PCI Express Feature Test. Going up from gen 3 to gen 4 increases the bandwidth by 87%, and increasing the PCIe frequency from 100 MHz to 110 MHz increases the bandwidth by 18%.
When checking two synthetic 3D benchmarks, Time Spy Extreme and Final Fantasy XV, there’s no performance difference between gen 3 and gen 4. However, in a real game benchmark such as Shadow of the Tomb Raider, we find a performance uplift of 18% going from gen 3 to gen 4 and a performance increase between 3% and 6% when increasing the PCIe frequency to 110 MHz.
So, in summary:
- The performance loss associated with installing the RX 6500 XT in a system that offers up to PCIe 3.0 is dependent on the type of workload
- Workloads that rely on lots of communication between the CPU and GPU, such as real-world gaming scenarios, will suffer from a significant performance penalty on Gen 3 systems
- Increasing the PCIe frequency impacts the performance in real-world gaming workloads positively. Increased performance scaling when overclocking the PCIe frequency on a PCIe 4.0 system suggests that the x4 physical connection may indeed be limiting the RX 6500 XT performance out of the box already
While I would have liked to include overclocking the PCIe frequency in my overclocking strategies, it’s important to note that this isn’t a straightforward solution because many devices are running on the PCIe clock.
On Alder Lake, the PCH PLL generates the PCIe clock frequency, which drives the clock for the CPU PCIe lanes, the DMI interconnect, and the chipset PCIe lanes. Overclocking the PCIe increases the clock frequency for all those devices. That causes three main issues:
- Not all graphics cards can run high PCIe frequency. Up to 110 MHz was OK for the RX 6500 XT, however.
- Not all M.2 storage devices can run high PCIe frequency. My Aorus drive, in particular, has issues around 104MHz already, so I have to switch to a SATA drive connected via the chipset
- The DMI link connecting the CPU and chipset may not be capable of running at a high frequency. In my case, at around 115 MHz PCIe frequency, the system no longer boots due to too high DMI frequency
Considering all the above challenges, I decided to stick with 100 MHz PCIe frequency for all overclocking strategies.
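To make the trade-off a bit more tangible, here is a rough Python sketch that lists which parts of my test system would be at risk for a given PCIe clock. The thresholds are the approximate values I observed on this one system, not general specifications, and the function and variable names are purely illustrative.

```python
# Approximate clocks (MHz) at which problems appeared on this test system.
PROBLEM_THRESHOLD_MHZ = {
    "Aorus M.2 NVMe SSD": 104,  # issues start around here
    "DMI link": 115,            # system no longer boots around here
}
GPU_VERIFIED_UP_TO_MHZ = 110    # the RX 6500 XT was OK up to this clock


def components_at_risk(pcie_clock_mhz: float) -> list[str]:
    """Return the components expected to misbehave at the given PCIe clock."""
    at_risk = [
        name
        for name, threshold in PROBLEM_THRESHOLD_MHZ.items()
        if pcie_clock_mhz >= threshold
    ]
    if pcie_clock_mhz > GPU_VERIFIED_UP_TO_MHZ:
        at_risk.append("RX 6500 XT (untested above 110 MHz)")
    return at_risk


for clock in (100, 105, 110, 116):
    print(clock, "MHz:", components_at_risk(clock) or "no known issues")
```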
Now let’s jump into the overclocking. First up is AMD’s automatic overclocking.
OC Strategy #1: AMD Automatic Overclocking
In our first overclocking strategy, we use the AMD Adrenalin Tuning toolset to overclock the Radeon RX 6500 XT automatically.
AMD Software: Adrenalin Edition
AMD Software Adrenalin Edition, introduced in March 2022, is a rebranded version of the Radeon Software: Adrenalin Edition, the successor of the Crimson ReLive Edition software package from 2016. In short, the software includes not only the graphics card drivers but also a bunch of valuable tools for finetuning your gaming experience. As this is an overclocking guide, I will skip all the gaming features and go straight to the overclocking.
Embedded in the Adrenalin software, you can find a segment dedicated to performance. That includes logging metrics and tuning knobs. As part of the tuning knobs, you have three options for automatic overclocking:
- Undervolting the GPU
- Overclocking the GPU, and
- Overclocking the VRAM
Unfortunately, you can’t do all three simultaneously, so we only overclock the GPU for this strategy.
Upon opening the AMD Adrenalin driver software:
- Go to the Performance tab
- Click on Tuning
- Under Automatic Tuning, click Overclock GPU
- Click on Proceed
- Click OK.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- Geekbench 5 OpenCL: +1.37%
- Geekbench 5 Vulkan: +2.04%
- Furmark 1080P: +0.66%
- FluidMark 1080P: +2.16%
- 3DMark Night Raid: +5.81%
- Simple RayTracing Benchmark: +1.53%
- Unigine Superposition: +2.33%
- Spaceship: +2.01%
- Shadow of the Tomb Raider: +1.30%
- CS:GO FPS Bench: +4.59%
- Final Fantasy XV: +5.99%
We’re using the automatic overclocking function, so any performance increase is entirely free. We see improvements ranging from 0.66% in Furmark 1080P to 5.99% in Final Fantasy XV.
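If you want to pull the range out of the results yourself, a minimal Python sketch looks like this. The dictionary is a transcription of the gains listed above, and the average is only a simple unweighted mean for orientation.

```python
# Performance gains in percent over stock, transcribed from the list above.
gains_percent = {
    "Geekbench 5 OpenCL": 1.37,
    "Geekbench 5 Vulkan": 2.04,
    "Furmark 1080P": 0.66,
    "FluidMark 1080P": 2.16,
    "3DMark Night Raid": 5.81,
    "Simple RayTracing Benchmark": 1.53,
    "Unigine Superposition": 2.33,
    "Spaceship": 2.01,
    "Shadow of the Tomb Raider": 1.30,
    "CS:GO FPS Bench": 4.59,
    "Final Fantasy XV": 5.99,
}

smallest = min(gains_percent, key=gains_percent.get)
largest = max(gains_percent, key=gains_percent.get)
average = sum(gains_percent.values()) / len(gains_percent)

print(f"Smallest gain: {smallest} (+{gains_percent[smallest]:.2f}%)")
print(f"Largest gain:  {largest} (+{gains_percent[largest]:.2f}%)")
print(f"Unweighted average: +{average:.2f}%")
```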
When running Furmark GPU Stress Test, the average GPU clock is 2532 MHz with 0.958 volts, and the GPU Memory clock is 2234 MHz with 1.36 volts. The average GPU and GPU Hot Spot temperature is 63.3 degrees Celsius and 78.1 degrees Celsius. The average TGP power is 84.748 watts.
When running the GPU-Z Render Test, the maximum GPU Clock is 2901 MHz with 1.195 volts.
Now it’s time to start our manual overclock. But before we get to that, let’s look at AMD’s GPU boost technology.
AMD GPU Frequency Boost Technology
Over the past couple of months, I’ve done quite a few videos where I take a closer look at semiconductor frequency boosting technologies. Typically there’s a clear timeline for the technology evolution as many iterations follow a clear starting point.
That’s not quite the case with AMD. Our story can start in 2015, when AMD introduced the Adaptive Voltage Frequency Scaling, or in 2001 when ATI introduced the PowerPlay technology.
While I would love to do a deep dive into the PowerPlay technology starting from its inception in 2001, I will save that for a future SkatterBencher video. Instead, let’s kick off our journey in 2010 when AMD launched its first AMD-branded graphics card.
2010 – PowerPlay, PowerTune, & Overdrive (TeraScale 3, Northern Islands)
In 2006, AMD acquired ATI Technologies, a Canadian graphics semiconductor company, for US$5.4 billion. The acquisition was part of AMD’s plan to change how CPUs are built. In AMD’s eyes, the future of the CPU was not just a chip but a SOC or system-on-chip. The SOC would integrate functions like I/O and graphics alongside the CPU. In addition to developing a true system-on-chip, acquiring ATI Technologies also brought AMD to the forefront of the discrete graphics market.
AMD continued to market their discrete graphics under the ATI Technologies brand until the Radeon HD 5000 series launched in 2009. On October 22, 2010, AMD launched its first AMD branded discrete graphics card in the form of the AMD Radeon HD 6870.
Less than two months later, AMD launched the new TeraScale 3 GPU architecture and Northern Islands product family with AMD Radeon HD 6970 as the flagship product. Among the many novelties, AMD introduced a new Dynamic Voltage Frequency Scaling technology, DVFS, called AMD PowerTune as the successor to the acquired ATI PowerPlay technology.
ATI PowerPlay
ATI introduced the PowerPlay technology in 2001 as a form of Dynamic Power Management (DPM) for their mobile graphics. In short, PowerPlay consisted of three fixed power states, each defined by a specific power, voltage, and frequency for the GPU. As the workload shifts from low to heavy and back, the power state adjusts accordingly. That ensures maximum performance when needed for heavy workloads and minimum power usage when there are no workloads.
Traditionally, the power states have fixed voltage and frequency. However, workloads can vary significantly in how they use the GPU. As a result, GPU power draw in the highest performance state can differ widely depending on the specific application. If we had an unlimited power budget, this would be no problem. But unfortunately, we are always power-limited.
Semiconductor companies like AMD address this challenge by utilizing a TDP or Thermal Design Power limit. The TDP captures the maximum power a product can reliably operate at within the warranty period. The voltage and clock speeds selected for the highest power state are therefore limited by:
- The worst-case scenario for peak power use
- The maximum voltage that fits within the power budget
- The highest frequency that the GPU can run with the given voltage
These three compounding “worst-case” factors result in a very conservative configuration for maximum performance. Furthermore, most real-world applications would never meet these worst-case conditions. So, in the real world, much performance is lost.
AMD PowerTune
A 2010 whitepaper lays out the argument for the AMD PowerTune technology.
PowerTune aims to deliver higher performance optimized to the GPU’s power limits by dynamically adjusting the clock during runtime based on an internally calculated GPU power assessment.
By dynamically managing the GPU clock frequency based on the GPU’s proximity to the rated TDP, the technology enables higher clock speeds in the highest power state. So, the AMD PowerTune technology allows for dynamic frequency adjustment to fit within the TDP envelope rather than restricting the GPU’s frequency by a worst-case power expectation. Essentially, AMD employs several inferred power states with lower GPU frequencies between the Highest P-state and the Intermediate P-State.
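The idea can be illustrated with a very rough Python sketch. It assumes that power scales linearly with frequency at a fixed voltage, which is my simplification and not AMD’s actual algorithm. All numbers and names are hypothetical.

```python
def powertune_clock(max_clock_mhz: float, estimated_power_w: float,
                    tdp_w: float, floor_clock_mhz: float) -> float:
    """Pick a clock that keeps the estimated power within the TDP.

    Simplifying assumption: at a fixed voltage, power scales linearly with
    frequency, so the clock is scaled by the ratio of TDP to estimated power.
    """
    if estimated_power_w <= tdp_w:
        return max_clock_mhz  # within budget: run the full high-state clock
    scaled_clock = max_clock_mhz * tdp_w / estimated_power_w
    return max(scaled_clock, floor_clock_mhz)  # never drop below the next P-state


# Hypothetical card: 880 MHz high state, 500 MHz intermediate state, 250 W TDP.
for estimated_power in (200, 250, 275, 600):
    clock = powertune_clock(880, estimated_power, 250, 500)
    print(f"Estimated {estimated_power} W at full clock -> run at {clock:.0f} MHz")
```

A typical game that stays under the power budget keeps the full clock, while only a worst-case workload gets pulled down. That is the whole point of the technology.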
To make the inferred states as effective as possible, it’s essential to have high granularity in terms of frequency. AnandTech noted that AMD’s frequency steps were much finer than the 13 MHz steps offered by NVIDIA at the time. Furthermore, AMD GPUs would switch between states within milliseconds, much faster than the 100ms for NVIDIA cards. As a result of the high granularity and the high-speed state switching, it’s tough to accurately measure which frequency the GPU is currently running at. On the plus side, the result is substantially higher frequencies and resulting performance in typical workloads like gaming while staying within the power budget.
ATI Overdrive
Another technology that AMD added to its portfolio with the acquisition of ATI in 2006 is ATI Overdrive. Overdrive was a new feature of the Catalyst 3.8 driver, released alongside the ATI Radeon 9800 XT graphics card in December 2003. Its function was pretty simple: if you enable Overdrive, then the GPU frequency would increase from 412 MHz to 419 MHz if the GPU temperature is below 56 degrees Celsius.
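Expressed as code, the original Overdrive rule is about as simple as it gets. This Python sketch only restates the behavior described above; the function name is mine.

```python
def overdrive_clock_mhz(gpu_temp_c: float, overdrive_enabled: bool = True) -> int:
    """Return the GPU clock under the original Overdrive rule described above."""
    if overdrive_enabled and gpu_temp_c < 56:
        return 419  # cool enough: apply the small frequency bump
    return 412      # default clock


print(overdrive_clock_mhz(50))  # prints 419
print(overdrive_clock_mhz(60))  # prints 412
```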
By the third iteration of the Overdrive feature, ATI had enabled advanced performance tuning tools such as manual GPU and memory overclocking, and even an automatic overclocking function!
The Overdrive feature would continue to live within the ATI Catalyst software suite with the same overclocking toolkit even after AMD’s acquisition and the introduction of the PowerTune technology in 2010. With the transition from PowerPlay to PowerTune, the Overdrive overclocking knobs also received a minor overhaul. Whereas we used to have just two options to control the GPU Clock and Memory Clock, from the Radeon HD 6000 series onwards, we have three settings:
- High-Performance GPU clock,
- High-Performance Memory clock, and
- Power Control
The high-performance GPU and memory clocks represent the maximum clock frequencies the card is allowed to run at in the highest power state. This maximum frequency is still constrained by the power usage and may decrease if the current power exceeds the TDP.
The Power Control knob allows us to adjust the maximum power usage in the highest performance state.
PowerPlayInfo VBIOS Module
The last topic that we must tackle before moving forward is the VBIOS.
In short: the VBIOS is the BIOS of a graphics card and contains all information required to initialize the graphics card at boot. The VBIOS often contains multiple modules which provide information about the graphics card or define certain specifications.
On AMD graphics cards, a specific module called PowerPlayInfo contains all information related to the performance specification, such as the frequency and voltage for each DPM state, power and thermal limits, and fan speed configuration. We can find this module by extracting the VBIOS ROM.
In the past, enthusiasts could edit the PowerPlay module and flash a VBIOS with custom settings to their graphics card. Nowadays, AMD digitally signs the VBIOS firmware. While it’s still possible to extract the BIOS and make modifications, without the ability to correctly digitally sign, the card will not boot up with a custom VBIOS.
The graphics software has a powerplay driver component that reads the information from the VBIOS and configures the appropriate frequency, voltage, and fan speed configuration. End-users could override the powerplay driver configuration using a “softpowerplay” registry entry. This registry entry is essentially a direct copy of the VBIOS module.
However, since the first Navi graphics card, the VBIOS information is parsed by the GPU System Management Unit, and the System Management Controller manages the power.
We’ll touch on this topic a couple of times during this video as it is particularly relevant for overclockers.
2012 – PowerTune Technology with Boost (GCN 1.0, Southern Islands)
In January 2012, AMD launched the brand new Graphics Core Next GPU architecture and the Southern Islands product family with the AMD Radeon HD 7970 as the flagship product.
Six months later, on June 22, 2012, AMD introduced the AMD Radeon HD 7970 GHz Edition graphics card in response to the launch of the NVIDIA GeForce GTX 680. As the name already suggests, the GHz edition was a higher clock version of the same 7970 launched six months earlier, but with a twist. There’s a boost!
AMD PowerTune Technology with Boost, or PT Boost, introduces a new approach to maximizing out-of-the-box frequency and performance.
The idea is pretty simple: AMD adds a Boost P-state on top of the previous highest P-state. This additional P-state allows for extra voltage and thus higher operating frequency.
The Boost P-state works the same as the other P-states: the internal power usage estimate primarily drives it. However, on top of that, AMD also adds temperature into the equation. AMD calls this Digital Temperature Estimation, or DTE.
AMD estimates the actual junction temperature using DTE instead of assuming the GPU is operating at the worst-case junction temperature. The GPU rarely runs at the worst-case temperature, which opens up opportunities for about 4% additional frequency headroom.
2013 – PowerTune Enhanced & SVI2 (GCN 2.0, Sea Islands)
On March 22, 2013, AMD launched the second generation Graphics Core Next GPU architecture and the Sea Islands product family with the Radeon HD 7790 as the initial flagship product and Radeon R9 290X as the later flagship product.
AMD also introduced an enhanced version of PowerTune Technology alongside many new improvements. Enhanced PowerTune comes with five new features:
- Increasing the number of DPM states and removing the inferred states
- Improved switching speed between DPM states
- Introduction of the Effective Clock
- Complete inclusion of operating temperature in the DPM state characterization
- Exposing the ability to change the GPU temperature target to end-user
Until now, AMD’s technology offered four distinct dynamic power management (DPM) states: Boost State, High State, Intermediate State, and Low State. Between the Boost State and the Intermediate State sat several inferred states, each with a slightly lower frequency but a similar voltage as its master P-state.
While this implementation offered AMD many improvements in performance per watt and power efficiency, a key drawback was that these inferred states didn’t adjust the operating voltage. However, as we all know, voltage is a crucial driver for power usage.
The Enhanced version of PowerTune doubled the number of DPM states from four to eight. That also effectively doubled the clock/voltage pairings and, in theory, should offer more significant power efficiency gains. Unfortunately, AMD did not continue with the inferred states, which reduced the clock frequency granularity.
Implementing more fine grained power states is not that straightforward. First, you need to switch as quickly as possible between states to optimize power efficiency. Second, you also need fast and accurate telemetry to enable more rapid switching.
The Radeon HD 7790 can state jump as quickly as every 10ms, typically bouncing between two or more states to keep the card within its limits.
The telemetry is also improved.
The Enhanced PowerTune relies on the AMD SVI2 interface. SVI stands for Serial VID Interface and is an AMD-designed VR controller interface. The 2nd generation SVI was initially released in early 2010 and provides a range of features, including greater granularity in voltage selection, faster data rate, and telemetry functions.
As a result, the Enhanced PowerTune architecture includes three main inputs: hardware temperature monitoring, internal power & current estimates, and external VRM telemetry inputs. The SMU evaluates these inputs and decides which is the optimal DPM state for the GPU. The DPM state then determines the clock frequency, the voltage, and the fan control setting.
AMD also aimed to exploit the many available inputs and controls to throttle performance based on various limits. These include explicit temperature, power, and fan speed limits. On the Radeon R9 290X, the temperature limit was 95C, and the power limit was 300W. The fan speed limit was 40% in Quiet Mode and 55% in Uber Mode.
If the GPU temperature exceeds the throttle point within the power limit, the GPU will ramp the fan speed before throttling the frequency. So, you can go fast but not too loud.
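The order of operations can be sketched in a few lines of Python. This is only my simplified reading of the behavior described above, using the R9 290X limits as defaults, not AMD’s actual control loop.

```python
def thermal_response(gpu_temp_c: float, fan_speed_pct: float,
                     temp_limit_c: float = 95, fan_limit_pct: float = 40) -> str:
    """Decide what to do next, assuming the card is within its power limit."""
    if gpu_temp_c < temp_limit_c:
        return "hold"                # below the throttle point: nothing to do
    if fan_speed_pct < fan_limit_pct:
        return "raise fan speed"     # fan headroom left: cool harder first
    return "throttle frequency"      # fan already at its limit: reduce clocks


print(thermal_response(90, 30))                    # hold
print(thermal_response(95, 30))                    # raise fan speed
print(thermal_response(95, 40))                    # throttle frequency
print(thermal_response(95, 40, fan_limit_pct=55))  # raise fan speed (Uber Mode)
```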
As the GPU will switch faster between states, it’s also more challenging to know at what frequency the GPU is operating. To accurately communicate the actual operating frequency to the end-user, AMD introduced the “Effective Clock.” The Effective Clock is essentially the average clock frequency over a period of 50 milliseconds.
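As a minimal illustration, here is how a time-weighted average over a 50 ms window could be computed in Python. The clock values are hypothetical, and the 10 ms segments mirror the state-switching interval mentioned earlier. How AMD samples the clock internally is not something I know.

```python
def effective_clock_mhz(segments: list[tuple[float, float]]) -> float:
    """Time-weighted average clock over a window.

    segments: (clock_mhz, duration_ms) pairs that together cover the window.
    """
    total_time_ms = sum(duration for _, duration in segments)
    weighted_sum = sum(clock * duration for clock, duration in segments)
    return weighted_sum / total_time_ms


# Hypothetical 50 ms window with a state change every 10 ms.
window = [(1000, 10), (950, 10), (1000, 10), (900, 10), (950, 10)]
print(f"Effective clock: {effective_clock_mhz(window):.0f} MHz")  # prints 960 MHz
```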
The final aspect of the Enhanced PowerTune is exposing new overclocking tools like Total Design Current (TDC) and the ability to adjust the target GPU temperature.
2015 – Adaptive Voltage Frequency Scaling & Voltage Adaptive Operation (Excavator+GCN 3.0, Carrizo)
Before continuing with the AMD graphics frequency technology, let’s make a small sidestep to the CPU side.
In 2014 AMD launched the 25×20 initiative, a bold goal to deliver at least 25 times more energy efficiency by 2020 in their mobile processors. They proudly announced having exceeded this target by achieving 31.7 times energy efficiency six years later. To incredibly oversimplify this challenge, there are two main ways AMD went about this:
- Increase the compute performance for a given power budget through architectural innovation of their CPUs and GPUs integrated into the APUs
- Decrease the power use for a given compute performance by innovations in real-time power management and silicon-level power optimizations.
A first stepping stone for that second point was the introduction of the Adaptive Voltage and Frequency Scaling (AVFS) technology introduced in the Carrizo APU in 2015.
Carrizo is a 15W 28nm APU based on the Excavator microarchitecture, the fourth generation of the Bulldozer family. Importantly, it was the first AMD product to introduce AVFS for CPU and GPU cores.
In a modern processor, everything starts with power delivered from the 12V DC rail of the power supply across the motherboard to a voltage regulator. The voltage regulator then converts it to about 1V. That 1V is then delivered via the motherboard to the processor socket, where it gets transferred via the socket pins to the processor package and ultimately ends up at the processor die.
The voltage to the processor is not constant. The processor uses higher voltage when it boosts to a high frequency in response to a demanding workload. Conversely, the processor uses lower voltage when in an idle state. Switching from high to low load and vice versa causes significant voltage fluctuations. This phenomenon is commonly known as voltage droop.
The problem with voltage droops is very well known to overclockers: a voltage droop can cause the operating voltage to fall too low causing system instability. In the past, chip designers would address this issue by supplying 10 to 15% excess voltage to ensure the chip always has sufficient voltage, even after a significant voltage droop.
Unfortunately, power increases proportionally to the square of the voltage (P = C × V² × F). In other words: a 10% increase in voltage equals a 21% increase in power. When aiming to improve power efficiency by 25 times, clearly getting a tight grip on operating voltage is an important topic.
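The arithmetic is easy to verify with a short Python sketch. It holds capacitance and frequency constant and only varies the voltage; the function name is mine.

```python
def relative_power(voltage_scale: float, frequency_scale: float = 1.0) -> float:
    """Relative dynamic power for scaled voltage and frequency (P ~ C * V^2 * F)."""
    return voltage_scale ** 2 * frequency_scale


for margin_pct in (10, 15):
    power_increase_pct = (relative_power(1 + margin_pct / 100) - 1) * 100
    print(f"+{margin_pct}% voltage -> +{power_increase_pct:.1f}% power")
```

So the 10 to 15% voltage guard band mentioned earlier costs roughly 21 to 32% in extra power at the same frequency.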
The Carrizo APU addresses this challenge with two technologies:
- Clock Stretching
- Adaptive Voltage Frequency Scaling
Clock Stretching is something any Ryzen CPU overclocker already knows very well. To keep it short: a circuit on the processor tracks the operating voltage and compares the average voltage to droops. If the actual voltage drops below the average, the processor briefly reduces the frequency accordingly. The system can respond in nanoseconds, which means, typically, the user would not notice the temporary reductions in clock frequency. If clock stretching is active for an extended period, you would see a significant decrease in performance.
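Here is a toy Python model of the idea. It assumes the frequency is reduced in proportion to the voltage shortfall, which is my own simplification. The real circuit works in hardware at nanosecond speed, and its exact response is not documented here.

```python
def stretched_clock_mhz(nominal_mhz: float, average_v: float, actual_v: float) -> float:
    """Toy clock-stretching model: slow down in proportion to the voltage droop."""
    if actual_v >= average_v:
        return nominal_mhz  # no droop: run at the nominal clock
    return nominal_mhz * actual_v / average_v


# Hypothetical 3000 MHz core at an average of 1.00 V with a brief droop to 0.95 V.
for voltage in (1.00, 0.98, 0.95):
    clock = stretched_clock_mhz(3000, 1.00, voltage)
    print(f"{voltage:.2f} V -> {clock:.0f} MHz")
```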
Adaptive Voltage Frequency Scaling is a technology that implements many silicon speed capability sensors and voltage sensors in addition to the already present temperature and power sensors. Silicon speed capability means the maximum operating frequency of a chip. AVFS enables setting the optimal frequency for a given power or performance level across process, voltage, and temperature ranges.
More importantly, this technology can optimize the frequency for a specific individual chip. It can finetune the optimal frequency for a golden chip and a total dud. Does this sound familiar to the Ryzen overclockers?
Now that we understand the purpose of Clock Stretching and Adaptive Voltage Frequency Scaling, let’s get back to the GPUs.
2015 – AVFS & Voltage Adaptive Mode (GCN 3.0, Volcanic Islands)
While AMD first introduced the third generation GCN architecture with the Volcanic Islands product family in 2014, it still featured the same PowerTune technology from the previous graphics cards. In 2015, AMD introduced their next high-end GPU, codenamed Fiji, and the Radeon R9 Fury X graphics card as the flagship product.
In terms of power management, the Fury X improved from the previous generation GPUs as it featured the Voltage Adaptive Operation and Adaptive Voltage Frequency Scaling technologies first introduced in Carrizo.
Towards the end of 2015, AMD worked on a new driver and software package. Retiring the Catalyst moniker, which had served ATI and later AMD since 2002, the new software launched on November 24, 2015, was called Radeon Software Crimson Edition. The software package was designed from scratch and featured a total overhaul of features and interface.
Fortunately for overclockers, Overdrive was still present, and all overclocking knobs still were available.
2016 – Radeon Wattman (GCN 4.0, Polaris)
The successor to AMD Fiji, Polaris, was announced at CES 2016. AMD Polaris is a 14nm GPU based on the 4th generation GCN architecture. The GPU would eventually make it to market on June 29, 2016, with the Radeon RX 480 graphics card.
It’s a bit fuzzy on whether Polaris was the first or second-generation GPU to integrate AMD’s clock stretching and adaptive voltage frequency scaling technologies. As we just learned, Fiji seems to have been the first. However, in the Polaris whitepaper, we read: “Polaris is AMD’s first GPU architecture to take advantage of the advanced power management and circuit design techniques that have been developed for CPUs at AMD.”
A significant change for overclockers was retiring the Overdrive feature in the AMD software package. Instead, AMD offered Radeon Wattman as the toolkit for people who want to tune the performance of their graphics card.
Radeon Wattman offers a wide range of overclocking tools, including performance/watt profiles, automatic overclocking, manual overclocking, graphics card monitoring, exporting and importing profiles, and power limit adjustments.
2017 – SMU & Infinity Fabric (GCN 5.0, Vega 10)
The Polaris architecture spawned ten different GPUs: 3 in the 1st generation, 5 in the 2nd generation, and 2 in the 3rd and final generation. Alongside the 2nd generation Polaris, AMD launched their next GPU: Vega.
AMD Vega is a 14nm GPU based on the 5th generation GCN architecture. The GPU would eventually make it to market on August 14, 2017, in the form of the Radeon RX Vega 64 graphics card. Naturally, it inherits the AVFS and Adaptive Clocking technology from Polaris but also comes with a couple of other relevant improvements.
Vega also included a more powerful integrated power management microcontroller. The new microcontroller allows for implementing more complex power-control algorithms, and the addition of a floating-point unit allows for higher-precision calculations to be a part of that mix.
Vega also added a third clock domain for its Infinity Fabric SoC level interconnect alongside the graphics core and memory domains. The Infinity Fabric connects the graphics core to other on-chip functions like multimedia, display, and I/O. Thanks to the addition of the third domain, the graphics core can be clocked separately from the fabric and thus allows for both maximizing the GPU clock and ensuring a high interconnect speed when the GPU is idle.
By combining the more robust power management unit, the addition of the third clock domain, and active workload identification, Vega can more precisely tune the graphics card for optimal power and performance across various workloads.
2019 – SW SMU (GCN 5.1, Vega 20)
Vega 20, the world’s first 7nm GPU, came to consumer products on February 7, 2019, with the Radeon VII as the flagship product. Vega 20 is a simple die shrink of the original Vega GPU launched a year earlier, so there are only minor architectural differences.
However, a significant change would come to how AMD tackles the challenge of power management. As disclosed in a Linux patch note, from Vega 20 onwards, the GPU will no longer use the traditional powerplay driver but instead use a new software SMU framework.
A key line in the patch notes indicates that the system management unit (SMU) reads the powerplay parameters from the VBIOS and stores the information with the system management controller (SMC). The SMC controls all power management-related functions, including DPM states, clock frequency, and voltage.
While this new approach was not implemented for Vega 20, it would come into effect with the next generation GPUs, as will be apparent in a couple of minutes.
2019 – Voltage Frequency Curve (RDNA, Navi 10)
The Vega architecture spawned three different GPUs: 2 in the 1st and 1 in the 2nd generation. Alongside the 2nd generation Vega, AMD launched their next GPU: Navi.
AMD Navi is a 7nm GPU based on a new microarchitecture called RDNA and came to market on July 7, 2019, as the Radeon RX 5700 XT graphics card. It gets a little fuzzy on the specific power management technology used in the current Navi graphics cards. We only know there’s no PowerTune on the product specification pages.
Furthermore, overclocking support embedded in the new driver software package is entirely different. As we just mentioned, from Navi onwards, the GPU power management has moved away from the powerplay driver to an SMU-managed algorithm. That is visualized with the updated Radeon Wattman software. We now have a voltage-frequency curve instead of having frequency and voltage control over specific DPM states.
The voltage frequency curve enables the configuration of 2 distinct points: the upper limit and the lower limit. Each point can be configured with a specific frequency and voltage within arbitrary limits.
Aside from the voltage frequency curve, there’s also an option to increase the memory frequency and the power limit slightly.
In the 2020 release of the Radeon Software Adrenalin Edition, a more general Performance Tuning tab replaces the Radeon Wattman. Again, the same features are still available; however, they’re re-arranged in a more easy-to-digest format.
2020 – Min/Max GPU Frequency (RDNA2, Navi 20)
The 2nd generation RDNA and Navi 20 products launched on November 18, 2020, initially with the Radeon RX 6800 XT. The flagship Radeon RX 6900 XT followed one month later.
It’s again not very easy to cover what’s changed on the GPU frequency boosting technology as there’s little to no information available from AMD. We can only highlight the slight change in the performance tuning software interface. The voltage/frequency curve was exchanged for two sliders: minimum and maximum GPU frequency.
We’ll use these settings to overclock our Radeon RX 6500 XT manually.
OC Strategy #2: Manual Overclock
In our second overclocking strategy, we use the AMD Adrenalin Tuning toolset to overclock the Radeon RX 6500 XT manually.
As I pointed out earlier, the AMD Adrenalin software houses a couple of performance tuning knobs that we can use to overclock our graphics card. Specifically, we have access to:
- GPU Frequency and voltage
- Memory Frequency and timings
- Fan configuration
- Power configuration
While at first sight, it may seem AMD is offering a large selection of overclocking tools, in practice, we’re extremely constrained:
- In terms of GPU and memory frequency, there is no way to override the default limits and go beyond what AMD allows for
- In terms of GPU voltage, we’re already at the maximum setting by default, and again there’s no way to override the setting
- There are only two options to select for memory timings: the default timings and so-called fast timings.
- In terms of power, we are also limited by AMD’s maximum allowed limit, and there’s no way to go beyond this
In short: ironically, AMD overclocking is the exact opposite of “adrenaline.” It’s incredibly boring as it’s just a matter of maxing out all the sliders except for minimum GPU frequency.
To understand why we use the minimum frequency for performance tuning, we can look at the Radeon RX 6500 XT voltage-frequency curve.
Radeon RX 6500 XT V/F Curve
Generating the default voltage-frequency curve is difficult as AMD employs AVFS to finetune the voltage and frequency to a specific situation.
In this case, we manually generate the VF curve by checking the VDDCR_GFX GPU core voltage at a specific frequency. We run the GPU-Z render test in the background to ensure the GPU is in the highest power state. To map the voltage, we ensure the minimum and maximum GPU frequency delta is exactly 100 MHz, which is the minimum delta allowed.
The final result is that the GPU AVFS algorithm will set a GPU frequency between the minimum and maximum specified frequency. We record the maximum frequency set within those guardbands and the maximum voltage. That generates the following voltage-frequency curve.
As you can see, the curve starts at 720mV for the lowest frequencies and ends at 1196mV for the highest frequencies. The highest voltage is applied from around 2750 MHz to the maximum configurable frequency of 2975 MHz.
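The mapping procedure amounts to sliding a 100 MHz wide min/max window across the slider range and recording one voltage reading per window. Here is a minimal Python sketch of how the test points could be generated. The function name and the 25 MHz step are assumptions for illustration; the 500 MHz floor, 2975 MHz ceiling, and 100 MHz minimum delta are the values mentioned in this article.

```python
def sweep_windows(lowest=500, highest=2975, delta=100, step=25):
    """Return (min_mhz, max_mhz) pairs, each exactly `delta` MHz wide."""
    windows = []
    max_mhz = lowest + delta
    while max_mhz <= highest:
        windows.append((max_mhz - delta, max_mhz))
        max_mhz += step  # assumed step size; pick whatever resolution you need
    return windows


windows = sweep_windows()
print(windows[0])    # lowest test window
print(windows[-1])   # highest test window
print(len(windows))  # number of readings to take
```

For each window you would set the two sliders, run the render test, and note the highest frequency and voltage reported.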
When comparing the voltage frequency curve using GPU-Z Render Test and FurMark as workload, we can see minor differences in both voltage and frequency for a given test point.
For example, if we set the minimum GPU Frequency to 2300 MHz and the maximum GPU Frequency to 2400 MHz, the resulting voltage and frequency are:
- 2386 MHz at 0.924 volts with GPU-Z Render Test, and
- 2376 MHz at 0.913 volts with FurMark
The standard minimum GPU frequency is 500 MHz. By increasing this value, we essentially prevent the GPU from using the bottom part of the voltage frequency curve. In some benchmarks, this helps improve the performance.
However, increasing the minimum GPU frequency is not without its challenges.
Minimum GPU Frequency Performance Scaling
The minimum GPU frequency option is a great tool to prevent the GPU from using lower frequencies in workloads that require high performance but don’t push the GPU frequency to the limit. However, this is only useful if no limiter throttles the frequency. Unfortunately, this does happen on the Radeon RX 6500 XT graphics card, and, twice unfortunately, it’s not that easy to spot.
To demonstrate what’s going on, I track the relative performance improvement in two benchmarks: Furmark 1080P and Geekbench 5 Vulkan. We also set the following parameters:
- Maximum GPU Frequency: 2975 MHz
- Maximum GPU Voltage: 1200 mV
- Maximum Memory Frequency: 2400 MHz
- Maximum Power Limit: 15%
Then we start with a minimum frequency of 500 MHz and gradually increase to the maximum of 2875 MHz. The resulting chart looks as follows.
We can see that the Geekbench 5 Vulkan performance scales up to a minimum GPU frequency of 2875 MHz. With FurMark, however, the performance drops sharply if the minimum GPU frequency exceeds 2600 MHz.
AMD’s GPU frequency technologies, such as clock stretching, are fantastic. But while these technologies protect the GPU from crashes and instabilities, they also make it challenging to detect performance throttling.
If we set the minimum GPU frequency to 2875 MHz, the FurMark benchmark keeps running, and HWiNFO keeps reporting both the GPU clock and GPU effective clock as 2800 MHz+. So as an end-user, you have no idea the performance is worse unless you check with a benchmark.
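Since the reported clocks cannot be trusted to reveal the throttling, the practical way to find the limit is to compare benchmark scores across minimum-frequency settings. The sketch below shows one way to pick the highest setting before the score drops. The function name, the 1% tolerance, and the sample scores are all invented for illustration; they only mimic the shape of the FurMark behavior described above.

```python
def highest_safe_min_frequency(samples, tolerance=0.01):
    """samples: (min_mhz, score) pairs sorted by min_mhz.

    Returns the last min_mhz before the score falls more than
    `tolerance` below the best score seen so far.
    """
    best_score = None
    safe_mhz = None
    for min_mhz, score in samples:
        if best_score is not None and score < best_score * (1 - tolerance):
            break  # performance dropped off: stop at the previous setting
        safe_mhz = min_mhz
        best_score = score if best_score is None else max(best_score, score)
    return safe_mhz


# Invented example scores, NOT measurements from this article.
furmark_scores = [(500, 100.0), (2000, 100.5), (2400, 101.2),
                  (2600, 102.1), (2700, 90.3), (2875, 85.0)]
print(highest_safe_min_frequency(furmark_scores))
```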
We use 2600 MHz as the minimum GPU frequency and max out the rest of the sliders for this overclocking strategy.
Upon opening the AMD Adrenalin driver software:
- Go to the Performance tab
- Click on Tuning
- Under Tuning Control, Manual Tuning, click Custom
- Set GPU Tuning to Enabled
- Set Advanced Control to Enabled
- Set Min Frequency (MHz) to 2600 MHz
- Set Max Frequency (MHz) to 2975 MHz
- Set VRAM Tuning to Enabled
- Set Memory Timing to Fast Timing
- Set Advanced Control to Enabled
- Set Max Frequency (MHz) to 2400
- Set Power Tuning to Enabled
- Set Power Limit (%) to 15
Then click Apply Changes to confirm the overclocked settings.
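To keep track of what was applied, the settings can be written down as a small profile and checked against the slider limits described in this article. This is only a bookkeeping sketch: the `TuningProfile` class and its field names are hypothetical and do not correspond to any AMD file format or API.

```python
from dataclasses import dataclass

# Limits as reported in the article for this particular card.
GPU_MHZ_RANGE = (500, 2975)
MIN_WINDOW_MHZ = 100      # smallest allowed gap between min and max
MAX_MEM_MHZ = 2400
MAX_POWER_LIMIT_PCT = 15


@dataclass(frozen=True)
class TuningProfile:  # hypothetical container, not an AMD format
    min_gpu_mhz: int
    max_gpu_mhz: int
    mem_mhz: int
    power_limit_pct: int
    fast_timing: bool = True

    def problems(self):
        """Return a list of human-readable limit violations."""
        found = []
        low, high = GPU_MHZ_RANGE
        if self.min_gpu_mhz < low or self.max_gpu_mhz > high:
            found.append("GPU frequency outside slider range")
        if self.max_gpu_mhz - self.min_gpu_mhz < MIN_WINDOW_MHZ:
            found.append("min/max GPU frequency closer than 100 MHz")
        if self.mem_mhz > MAX_MEM_MHZ:
            found.append("memory frequency above slider maximum")
        if self.power_limit_pct > MAX_POWER_LIMIT_PCT:
            found.append("power limit above slider maximum")
        return found


strategy_2 = TuningProfile(min_gpu_mhz=2600, max_gpu_mhz=2975,
                           mem_mhz=2400, power_limit_pct=15)
print(strategy_2.problems())  # an empty list means every value fits
```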
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- Geekbench 5 OpenCL: +5.50%
- Geekbench 5 Vulkan: +15.92%
- Furmark 1080P: +2.11%
- FluidMark 1080P: +14.59%
- 3DMark Night Raid: +9.83%
- Simple RayTracing Benchmark: +8.78%
- Unigine Superposition: +7.91%
- Spaceship: +7.92%
- Shadow of the Tomb Raider: +6.49%
- CS:GO FPS Bench: +9.21%
- Final Fantasy XV: +10.66%
While severely restricted by AMD, our manual overclock still results in a nice performance boost of up to +15.92% in Geekbench 5 Vulkan.
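For a quick overview of the spread, the listed gains can be dropped into a few lines of Python. The percentages are copied from the list above; the arithmetic mean is only a rough summary added here for illustration and is not a figure from the original results.

```python
# Percentage gains over stock, copied from the benchmark list above.
gains = {
    "Geekbench 5 OpenCL": 5.50,
    "Geekbench 5 Vulkan": 15.92,
    "Furmark 1080P": 2.11,
    "FluidMark 1080P": 14.59,
    "3DMark Night Raid": 9.83,
    "Simple RayTracing Benchmark": 8.78,
    "Unigine Superposition": 7.91,
    "Spaceship": 7.92,
    "Shadow of the Tomb Raider": 6.49,
    "CS:GO FPS Bench": 9.21,
    "Final Fantasy XV": 10.66,
}

best = max(gains, key=gains.get)
worst = min(gains, key=gains.get)
mean_gain = sum(gains.values()) / len(gains)

print(f"best:  {best} (+{gains[best]:.2f}%)")
print(f"worst: {worst} (+{gains[worst]:.2f}%)")
print(f"mean:  +{mean_gain:.2f}%")
```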
When running Furmark GPU Stress Test, the average GPU clock is 2602 MHz with 1.014 volts, and the GPU Memory clock is 2384 MHz with 1.36 volts. The average GPU and GPU Hot Spot temperature is 60.3 degrees Celsius and 76.5 degrees Celsius. The average TGP power is 92.009 watts.
When running the GPU-Z Render Test, the maximum GPU Clock is 2962 MHz with 1.195 volts.
OC Strategy #3: Manual Overclock & Undervolt
In our third overclocking strategy, we use the AMD Adrenalin Tuning toolset to overclock and undervolt the Radeon RX 6500 XT manually.
While this may seem like a straightforward process, again, there’s a twist on this AMD Radeon graphics card.
Undervolting the Radeon RX 6500 XT
The undervolting slider available in the Adrenalin software appears to need no explanation. However, things aren’t as they seem.
Looking back at our manually charted voltage-frequency curve, we see that the top end of the frequency uses 1196 mV, close to the 1200 mV we see in the software. So, our gut feeling says the value shown in the software represents the voltage allowed for the highest frequency. But that gut feeling would be wrong.
The voltage shown in the software represents the maximum voltage, or Vmax, allowed for the GPU. Similarly, as we can see from the voltage frequency curve, there’s a minimum voltage, or Vmin, at which the GPU is allowed to operate. Any requested voltage from the voltage controller by the GPU AVFS algorithm is constrained by the Vmin and Vmax guardbands.
Reducing the Vmax value negatively offsets the entire voltage-frequency curve by a certain amount. For example, reducing the Vmax from 1200 mV to 1150 mV will offset the voltage frequency curve by 50 mV. To demonstrate this behavior, I again charted the voltage frequency curve, but this time with 1150 mV Vmax.
I want to focus your attention on four things.
- First, you can see that the minimum voltage of 720mV at the lower end of the voltage frequency curve, up to 1500 MHz, is not affected by the negative voltage offset. That’s because there’s a lower limit for GPU voltage of 720 mV.
- Second, we can see that the voltage for frequencies between 1500 MHz and 1800 MHz is reduced by less than 50 mV because the lower voltage limit of 720 mV is enforced.
- Third, we can see that the voltage for frequencies between 1800 MHz and 2850 MHz is, as expected, negatively offset by 50 mV.
- Lastly, we find that the voltage for frequencies between 2850 MHz and 2975 MHz is reduced by less than 50 mV. Surprisingly, our highest frequency of 2975 MHz is still using 1200 mV! Just like at stock!
Interestingly, it seems that the actual voltage-frequency curve for this GPU extends to about 1250 mV. However, due to the Vmax imposed by AMD, the maximum voltage is limited to 1200 mV. Let me demonstrate this by charting the voltage frequency curve with 1125mV set in the driver.
I want to focus your attention on two interesting phenomena.
- First, as I alluded to in the previous segment, the voltage for the highest frequency of 2975 MHz is now lower than the Vmax of 1200 mV. With a negative offset of 75 mV and a resulting voltage of 1170 mV, we can conclude that the original voltage frequency point for 2975 MHz is around 1250 mV. However, that voltage is by default capped at 1200 mV.
- Second, did you notice the gap in the voltage frequency curve? Due to excessive negative voltage offset, GPU frequencies between 1800 MHz and 2450 MHz are no longer stable. However, frequencies lower and higher are stable. How odd, indeed. This behavior continues when further reducing the Vmax. For example, at 1100 mV, only the highest frequency point of 2975 MHz is stable.
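The observations so far fit a simple model: lowering the driver voltage shifts the whole curve down by the same amount, and the result is then clamped between the 720 mV floor and the stock 1200 mV ceiling. The Python sketch below encodes that model. It is an interpretation of the measurements in this article, not AMD's documented algorithm, and it ignores the stability gap entirely; the function names are made up for illustration.

```python
VMIN_MV = 720          # lower voltage limit observed in the article
VMAX_STOCK_MV = 1200   # default Vmax shown in the driver


def effective_voltage(native_mv, driver_setting_mv):
    """Shift the native V/F point by the negative offset, then clamp."""
    offset = VMAX_STOCK_MV - driver_setting_mv
    return max(VMIN_MV, min(VMAX_STOCK_MV, native_mv - offset))


def estimate_native(measured_mv, driver_setting_mv):
    """Back out the uncapped V/F point from an unclamped measurement."""
    return measured_mv + (VMAX_STOCK_MV - driver_setting_mv)


# 1170 mV measured at 2975 MHz with the driver set to 1125 mV (75 mV offset).
print(estimate_native(1170, 1125))     # 1245, i.e. "around 1250 mV"

# A native point of ~1250 mV stays pinned at 1200 mV with a 50 mV offset.
print(effective_voltage(1250, 1150))   # 1200

# Points near the floor cannot drop below 720 mV.
print(effective_voltage(730, 1150))    # 720
```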
This gap in the voltage frequency curve after applying a negative offset is the trick to squeezing the most performance out of your RX 6500 XT graphics card.
We can extract more performance from our GPU by combining this behavior with the minimum GPU frequency setting we explored in the previous overclocking strategy. The negative offset reduces the operating voltage for a given frequency, thus increasing the headroom for additional overclocking. However, be careful! As I already pointed out, setting the minimum frequency too high may cause the GPU performance to drop sharply in heavy workloads.
My approach to tuning the GPU frequency is as follows:
- Open the Adrenalin software and set the minimum GPU frequency to the highest value that doesn’t cause performance throttling.
- Then, keep the Furmark GPU Stress Test running and monitor the current FPS.
- Reduce the GPU voltage setting to negatively offset the voltage frequency curve until you see graphical errors on your screen.
- In that case, INCREASE the minimum GPU frequency slightly until you see the performance drop off
- Repeat steps 3 and 4 until you find the right balance between instability and throttling
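The steps above can be sketched as a loop. In the Python sketch below, the two callbacks stand in for the manual checks ("do I see graphical errors?" and "did the FPS drop?"). They are invented toy models whose thresholds were chosen so the run ends at the same 2675 MHz / 1125 mV reported in this article; they are not measurements, and the function names and step sizes are assumptions.

```python
def tune(min_mhz, voltage_mv, has_artifacts, is_throttling,
         mhz_step=25, mv_step=25, mhz_limit=2875, mv_floor=1000):
    """Lower the voltage step by step; when artifacts appear, raise the
    minimum frequency until they disappear or throttling would begin."""
    while voltage_mv - mv_step >= mv_floor:
        trial_mv = voltage_mv - mv_step
        trial_mhz = min_mhz
        while (has_artifacts(trial_mhz, trial_mv)
               and trial_mhz + mhz_step <= mhz_limit
               and not is_throttling(trial_mhz + mhz_step, trial_mv)):
            trial_mhz += mhz_step
        if has_artifacts(trial_mhz, trial_mv):
            break  # no stable, unthrottled setting at this voltage
        min_mhz, voltage_mv = trial_mhz, trial_mv
    return min_mhz, voltage_mv


# Invented stand-ins for watching the screen and the FPS counter.
def fake_artifacts(mhz, mv):
    # Toy rule: the lower the voltage, the higher the minimum frequency
    # must be to stay out of the unstable part of the curve.
    return mhz < 2600 + (1200 - mv)


def fake_throttling(mhz, mv):
    # Toy rule: performance drops off above this minimum frequency.
    return mhz > 2675


print(tune(2600, 1200, fake_artifacts, fake_throttling))
```

On real hardware each callback is a human judgment made while FurMark is running, so treat the loop as a description of the order of operations rather than something to automate blindly.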
Using this approach, I could increase the minimum GPU frequency to 2675 MHz with an adjusted GPU voltage setting of 1125 mV.
Upon opening the AMD Adrenalin driver software:
- Go to the Performance tab
- Click on Tuning
Then click Apply Changes to confirm the overclocked settings.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- Geekbench 5 OpenCL: +5.50%
- Geekbench 5 Vulkan: +15.92%
- Furmark 1080P: +2.11%
- FluidMark 1080P: +14.59%
- 3DMark Night Raid: +9.83%
- Simple RayTracing Benchmark: +8.78%
- Unigine Superposition: +7.91%
- Spaceship: +7.92%
- Shadow of the Tomb Raider: +6.49%
- CS:GO FPS Bench: +9.21%
- Final Fantasy XV: +10.66%
After tuning the voltage frequency curve, our improved manual overclock enabled us to improve the benchmark performance. While severely restricted by AMD, we still achieve a performance boost of up to +15.92% in Geekbench 5 Vulkan.
When running Furmark GPU Stress Test, the average GPU clock is 2602 MHz with 1.014 volts, and the GPU Memory clock is 2384 MHz with 1.36 volts. The average GPU and GPU Hot Spot temperature is 60.3 degrees Celsius and 76.5 degrees Celsius. The average TGP power is 92.009 watts.
When running the GPU-Z Render Test, the maximum GPU Clock is 2962 MHz with 1.195 volts.
OC Strategy #4: ElmorLabs EVC2 & Hardware Modifications
In our fourth overclocking strategy, we call upon the help of Elmor and his ElmorLabs EVC2SX device. We resort to hardware modifications to work around the power and voltage limitations of the Radeon RX 6500 XT graphics card.
I will elaborate on two distinct topics:
- Why we need and use hardware modifications
- How we use the ElmorLabs EVC2SX for the hardware modification
Situation Analysis: Power, Voltage, and Frequency Limitations
It’s hard to determine where to begin this situation analysis. Every aspect of this card is restricted well before we’re running out of headroom:
- We can max out the frequency sliders for GPU and memory frequency;
- We’re at the upper limit for GPU Vmax
- We have plenty of thermal headroom despite maxing out the power limit as the GPU reaches only 60 degrees Celsius during FurMark.
As I discussed earlier in this video, since Navi 10, AMD has discontinued using the powerplay driver and moved all GPU power management functions to the integrated SMU. As a result, where historically, enthusiasts could work around the overclocking limitations using so-called soft powerplay tables, this approach is no longer supported.
Unfortunately, we know very little about AMD’s power management technologies as they’ve not been very forthcoming since the release of Navi. But, we can make a reasonable assumption that the SMU reads the VBIOS information at boot, stores it, and uses it for runtime power management.
In this process, there are two ways we could work around the overclocking limitations:
- Modify the VBIOS with a different configuration
- Communicate directly with the SMU to override the stored configuration
Unfortunately, neither is an option.
The VBIOS requires a valid digital signature that only AMD can provide. Software VBIOS flashing tools will reject a VBIOS that doesn’t have a valid digital signature. You can still get a modified VBIOS onto the graphics card by flashing it with a physical ROM flasher, but the card will then simply not boot.
Communication with the SMU is very complicated, and unless AMD provides direct access, it is simply impossible.
The last option I want to discuss is using soft powerplay tables. As a reminder: the soft powerplay table is just a Windows registry key that mirrors the information of the VBIOS PowerPlayInfo module, which contains the power management configuration. While AMD had said they would discontinue the use of powerplay from Navi onwards, soft powerplay tables could still work around the overclocking limitations on Navi 10 graphics cards like the Radeon RX 5700 XT.
The most user-friendly approach is using MorePowerTool, which enables users to customize the registry entry with their settings. Unfortunately, MorePowerTool doesn’t work with the Radeon RX 6500 XT because the PowerPlayInfo module for Navi 24 has a slightly different structure than prior Navi cards.
But there are more problems with the soft powerplay table: it’s not very useful even if you build your own registry key from scratch by extracting the PowerPlayInfo module from the VBIOS.
Yes, you can get the soft powerplay table to work. And, yes, you can extend the maximum overclocking limits as shown in the Adrenalin software. But the extended limits don’t work.
Setting a value higher than the maximum default limitations forces the GPU into some failsafe state. As a result, the GPU frequency is reduced to 500 MHz. That happens not only when we try to go beyond the default GPU frequency limits, but also when we exceed the memory frequency limit or even the power limit.
We can only guess what’s causing this behavior, but our original theory holds up. At boot, the SMU reads the power management configuration from the VBIOS and stores it locally for power management at run time. Any value requested from the driver that exceeds these limits triggers a failsafe state.
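The theory can be written down as a tiny model. The Python sketch below only illustrates the behavior observed in this article: any request above a stored limit drops the GPU to 500 MHz. The actual SMU firmware logic is not documented, and the dictionary keys and function name are hypothetical.

```python
FAILSAFE_MHZ = 500

# Default limits for this card as reported in the article.
STORED_LIMITS = {"gpu_mhz": 2975, "mem_mhz": 2400, "power_limit_pct": 15}


def resulting_gpu_clock(request):
    """Return the GPU clock ceiling under the failsafe theory."""
    for key, limit in STORED_LIMITS.items():
        if request.get(key, 0) > limit:
            return FAILSAFE_MHZ  # any out-of-range value trips the failsafe
    return request["gpu_mhz"]


within = {"gpu_mhz": 2975, "mem_mhz": 2400, "power_limit_pct": 15}
beyond = {"gpu_mhz": 2975, "mem_mhz": 2400, "power_limit_pct": 20}
print(resulting_gpu_clock(within))  # 2975
print(resulting_gpu_clock(beyond))  # 500
```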
The conclusion is that unless AMD provides the performance tuning tools, there’s no way for us to go beyond their arbitrary and excessively constrained limitations.
Well, almost.
Hardware Modifications
Now that we understand the hardware modifications’ need and purpose, let’s get started. For this graphics card, I will do one hardware modification.
This hardware modification adds an I2C pin header on the graphics card PCB, allowing us to communicate with the onboard ON Semiconductor NCP81022 digital voltage controllers. We can then connect the EVC2SX device to the I2C pins and easily control the behavior of the voltage controllers.
I will focus only on the voltage controller providing the GPU voltage (VDDCR_GFX) and SOC voltage (VDDCR_SOC). The other voltage controller offers the voltage for the memory controller and the memory. I focus exclusively on the GPU voltage controller because that’s where I obtained the only tangible performance benefit.
GPU Voltage Controller – Modification
When we open the ON Semiconductor NCP81022 datasheet, we find much information about the voltage controller. Under the SVI2 Interface Register Map, we can find more information on the various settings for this voltage controller. The more suitable options for us are:
- Special purpose offset, which we can use to offset the GPU and SOC voltage
- Loadline, which lets us adjust the voltage loadline
- ADC disable
It’s that last one that will give us the most benefit.
ADC is short for Analog to Digital Converter and does what the name says: translate analog electrical signals into digital signals mainly for data processing purposes.
In the NCP81022 diagram, the ADC sits between the SVI2 interface and a multiplexer. SVI stands for Serial VID Interface and is AMD’s standard for communication between the VRM controller and the CPU or GPU. A multiplexer, or data selector, is an electrical device that selects between several analog or digital input signals and forwards the selected input to a single output line.
The GPU sends VID commands for the GPU voltage to the voltage controller via the SVI2 interface. The controller then uses a DAC, or digital to analog converter, to set the voltage. Via the multiplexer, six analog inputs feed information back to the GPU. The ADC first converts these analog signals to digital information, which is then stored in the controller registers. The GPU will then read those registers back via the SVI2 interface and adjust voltage requests accordingly.
The reported information via the ADC includes sensed GPU and SOC voltage, maximum capable GPU and SOC current, and slew rate for GPU and SOC voltage.
By disabling the ADC, we force the voltage controller to stop updating these registers. As a result, the registers keep whatever values were written during the last update. If we disable the ADC while the card is idle, those values are very low.
While I don’t know how the GPU SMU interacts with the voltage controller, as a result of disabling the ADC, we can use much more power. The GPU voltage is also no longer limited to the Vmax of 1.2V. In Furmark, the power draw on the 6-pin connector, measured with the Elmorlabs PMD, doubles from about 75W to 150W. Also, the maximum voltage increases from 1.2V to 1.26V.
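Conceptually, disabling the ADC freezes the telemetry the GPU reads back. The toy Python model below illustrates just that idea. The class, the field name, and the current values are invented for illustration; this is not the NCP81022 register map, and it says nothing about how the SMU actually uses the data.

```python
class TelemetryRegisters:
    """Toy model of controller registers that are fed by an ADC."""

    def __init__(self):
        self.adc_enabled = True
        self.values = {}

    def sample(self, readings):
        """Called whenever the ADC would convert new analog readings."""
        if self.adc_enabled:
            self.values = dict(readings)
        # With the ADC disabled, the previous values simply stay in place.

    def read(self):
        """What the GPU would read back over the interface."""
        return dict(self.values)


regs = TelemetryRegisters()
regs.sample({"gfx_current_a": 5.0})    # invented idle reading
regs.adc_enabled = False               # "ADC Update: No" while idle
regs.sample({"gfx_current_a": 90.0})   # invented load reading, ignored
print(regs.read())                     # still shows the idle value
```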
ElmorLabs EVC2SX
The ElmorLabs EVC2SX is the latest addition to the EVC2 product line.
The ElmorLabs EVC2 enables digital or analog voltage control using I2C/SMBus/PMBus. The device also has UART and SPI functionality. It can be considered the foundation for the ElmorLabs ecosystem as you can expand the functionality of some ElmorLabs products by connecting it to an EVC2.
In this case, we’re interested in the 3 I2C headers that provide digital voltage control. I’ll try to keep the step-by-step explanation as practical as possible in this video.
Step 1: identify the digital voltage controllers you want to control with the EVC2SX.
We did this in the previous segment.
Step 2: determine how the hardware modification will work
We did this in the previous segment.
Step 3: ensure the EVC2SX supports the I2C device
You can refer to the EVC2 Beta Software forum topic for a complete list of the supported I2C devices. If your device is not listed, you can leave a message in the forum or Discord.
Step 4: find the headers near the I2C marking on the EVC2SX PCB
On the EVC2SX, each I2C header has three pins: SCL, SDA, and GND. That stands for Serial Clock, Serial Data, and Ground. It’s essential to connect the pins on the EVC2 to the correct pins on the graphics card.
Step 5: connect the various pins to the relevant points on your graphics card
Since there are only three pins, it should be straightforward. If you are unsure, use a digital multimeter to locate the ground pin on both the graphics card and EVC2 I2C header. The data pin is always in the middle, and the other pin is the clock.
Step 6: open the ElmorLabs EVC2 software for voltage monitoring and control
You can find the relevant controls under the I2C submenu items. First click “Find Devices.” That will check if any supported devices are present on the I2C bus. In our case, it will find two NCP81022 voltage controllers. One controller manages the GPU and SOC voltage, and the other handles the Memory Controller and Memory voltage.
We select the top voltage controller in the menu and immediately enable the Monitoring function. If the I2C is connected well, you should now see the charts update. Now you can check if this controller indeed manages the GPU voltage. You can see the Loop 1 Output Voltage jumps between 0.01V and 0.719V. You can also run a 3D workload like GPU-Z Render to verify.
Then you can configure the voltage controller using the dropdown menus. In our case, we simply change the ADC Update setting from Yes to No. After applying the changes, you should now see the monitoring charts flatline.
You can change other settings as well. Remember that the Loop 1 options control the GPU voltage, and Loop 2 settings control the SOC voltage.
3GHz Radeon 6500 XT with Voltage Offset
Before I move on to the overclocking settings and the performance results, I want to show you a neat little trick. As I already mentioned, using the Elmorlabs EVC2SX, you can increase the GPU voltage using a voltage offset. As it turns out, increasing the voltage can also help increase the frequency. We can use this trick to get the Radeon RX 6500 XT to 3 GHz.
With ADC disabled, our maximum voltage for the highest frequency is about 1.26V. As we increase the voltage offset with the EVC, we can increase the effective voltage. The maximum reported frequency increases from 2973 MHz at 1.16V to 2996 MHz at 1.36V.
By playing with the undervolt option in the GPU driver and the voltage offset function of the voltage controller, we can increase the GPU voltage to 1.375V. With this voltage, we see a maximum GPU frequency of 3002 MHz reported.
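To put the frequency gain in perspective, we can work out how many MHz each extra 100 mV buys between the two reported points. The short Python sketch below does that arithmetic. It assumes a straight line between just two data points, so treat the result as a rough indication only; the function name is made up.

```python
def mhz_per_100mv(f_low_mhz, v_low_mv, f_high_mhz, v_high_mv):
    """Average frequency gain per 100 mV between two measured points."""
    return (f_high_mhz - f_low_mhz) * 100 / (v_high_mv - v_low_mv)


# 2973 MHz at 1.16 V and 2996 MHz at 1.36 V, as reported above.
print(mhz_per_100mv(2973, 1160, 2996, 1360))  # 11.5
```

Roughly 11.5 MHz per 100 mV shows how little frequency the extra voltage buys on this card.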
I also ran 3DMark Time Spy to check the performance, and as you can see, it’s possible to run 3DMark near 3 GHz with the Radeon RX 6500 XT.
Unfortunately, I cannot give you a reason why this happens. I suspect it’s related to the clock stretching, or AVFS circuit in the GPU die, which adjusts the GPU frequency down if a Vdroop or unstable voltage delivery is detected. But that’s pure speculation, so don’t take my word for it.
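Upon opening the ElmorLabs EVC2 software:
- Access the I2C1 section
- Click Find Devices and ensure two NCP81022 controllers pop up
- Access the NCP81022 (20) menu
- Enable the Monitoring function
- Verify that the monitoring is indeed active
- Set ADC Update to No
- Click Apply Changes
- Verify that the monitoring output is no longer dynamic but a fixed value.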
Upon opening the AMD Adrenalin driver software:
- Go to the Performance tab
- Click on Tuning
- Under Tuning Control, Manual Tuning, click Custom
- Set GPU Tuning to Enabled
- Set Advanced Control to Enabled
- Set Min Frequency (MHz) to 2875 MHz
- Set Max Frequency (MHz) to 2975 MHz
- Set GPU Voltage to 1100 mV
- Set VRAM Tuning to Enabled
- Set Memory Timing to Fast Timing
- Set Advanced Control to Enabled
- Set Max Frequency (MHz) to 2400
- Set Power Tuning to Enabled
- Set Power Limit (%) to 15
Then click Apply Changes to confirm the overclocked settings.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- Geekbench 5 OpenCL: +6.97%
- Geekbench 5 Vulkan: +33.96%
- Furmark 1080P: +13.65%
- FluidMark 1080P: +18.89%
- 3DMark Night Raid: +11.84%
- Simple RayTracing Benchmark: +9.04%
- Unigine Superposition: +10.03%
- Spaceship: +11.19%
- Shadow of the Tomb Raider: +7.79%
- CS:GO FPS Bench: +9.84%
- Final Fantasy XV: +13.46%
Working around some of the electrical limitations of the RX 6500 XT allows us to maintain a higher GPU boost frequency for longer. That results in significant performance improvements in workloads limited by the minimum GPU frequency or the maximum permitted power. We see a maximum performance uplift of +34% in Geekbench 5 Vulkan.
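Because every result in this article is expressed against the same stock baseline, two uplift figures can be converted into the gain of one strategy over another. The Python sketch below shows that conversion for the hardware modification versus the manual overclock from OC Strategy #2. The function name is made up, and the derived numbers are simple arithmetic on the published percentages, not additional measurements.

```python
def relative_gain(old_pct, new_pct):
    """Gain of `new` over `old`, both given as % uplift vs the same baseline."""
    return ((100 + new_pct) / (100 + old_pct) - 1) * 100


# Geekbench 5 Vulkan: +15.92% (manual overclock) vs +33.96% (ADC disabled).
vulkan = relative_gain(15.92, 33.96)
# Furmark 1080P: +2.11% (manual overclock) vs +13.65% (ADC disabled).
furmark = relative_gain(2.11, 13.65)

print(f"Geekbench 5 Vulkan: +{vulkan:.2f}% over the manual overclock")
print(f"Furmark 1080P:      +{furmark:.2f}% over the manual overclock")
```

Note that simply subtracting the two percentages would overstate the step, because the second uplift is measured from stock rather than from the already overclocked result.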
When running Furmark GPU Stress Test, the average GPU clock is 2931 MHz with 1.27 volts, and the GPU Memory clock is 2384 MHz with 1.36 volts. The average GPU and GPU Hot Spot temperature is 71.7 degrees Celsius and 104.1 degrees Celsius. The average TGP power is an estimated 160 watts.
When running the GPU-Z Render Test, the maximum GPU Clock is 2968 MHz with 1.27 volts.
OC Strategy #5: PARAMi Water Cooling
In our final overclocking strategy, we slap on a water block to see if improving the cooling also improves our performance.
The mounting holes are very close, 43×43 mm, so finding a fitting water block was not easy. I eventually settled for a PARAMi universal GPU water block I found locally in Taiwan. It was still a bit of a hack job to fit the water block, but eventually, everything worked.
We use the same settings as we already maxed out the clock frequency sliders in our previous overclocking strategy.
Upon opening the ElmorLabs EVC2 software:
- Access the I2C1 section
- Click Find Devices and ensure two NCP81022 controllers pop up
- Access the NCP81022 (20) menu
- Enable the Monitoring function
- Verify that the monitoring is indeed active
- Set ADC Update to No
- Click Apply Changes
- Verify that the monitoring output is no longer dynamic but a fixed value.
Upon opening the AMD Adrenalin driver software:
- Go to the Performance tab
- Click on Tuning
- Under Tuning Control, Manual Tuning, click Custom
- Set GPU Tuning to Enabled
- Set Advanced Control to Enabled
- Set Min Frequency (MHz) to 2875 MHz
- Set Max Frequency (MHz) to 2975 MHz
- Set GPU Voltage to 1100 mV
- Set VRAM Tuning to Enabled
- Set Memory Timing to Fast Timing
- Set Advanced Control to Enabled
- Set Max Frequency (MHz) to 2400
- Set Power Tuning to Enabled
- Set Power Limit (%) to 15
Then click Apply Changes to confirm the overclocked settings.
We re-ran the benchmarks and checked the performance increase compared to the default operation.
- Geekbench 5 OpenCL: +7.00%
- Geekbench 5 Vulkan: +33.97%
- Furmark 1080P: +20.09%
- FluidMark 1080P: +19.59%
- 3DMark Night Raid: +11.93%
- Unigine Superposition: +11.93%
- Spaceship: +11.19%
- Shadow of the Tomb Raider: +7.79%
- CS:GO FPS Bench: +9.95%
- Final Fantasy XV: +13.86%
The main benefit of adding water cooling is the significantly improved temperatures. During a FurMark stress test, the GPU and GPU hotspot temperature decreased by 8.5 and 20 degrees Celsius, respectively. Unfortunately, since AMD has imposed arbitrary clock limits, there’s no way to improve the clock frequency further to achieve better performance. The only place where we see any uplift from our previous overclocking strategy is in FurMark, where the card can sustain slightly higher clock frequencies due to the lower temperature.
When running Furmark GPU Stress Test, the average GPU clock is 2948 MHz with 1.27 volts, and the GPU Memory clock is 2384 MHz with 1.36 volts. The average GPU and GPU Hot Spot temperature is 63.2 degrees Celsius and 84.2 degrees Celsius. The average TGP power is an estimated 160 watts.
When running the GPU-Z Render Test, the maximum GPU Clock is 2967 MHz with 1.27 volts.
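The temperature improvement follows directly from the FurMark averages reported for the air-cooled run in OC Strategy #4 and the water-cooled run here. A few lines of Python make the comparison explicit; the dictionary keys are just labels chosen for this sketch.

```python
# Average FurMark temperatures in degrees Celsius, as reported in the article.
air_cooled = {"gpu": 71.7, "hot_spot": 104.1}
water_cooled = {"gpu": 63.2, "hot_spot": 84.2}

# Round to one decimal to hide floating-point noise.
deltas = {sensor: round(air_cooled[sensor] - water_cooled[sensor], 1)
          for sensor in air_cooled}
print(deltas)
```

The hot spot difference works out to 19.9 degrees, which the text rounds to 20.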
AMD Radeon RX 6500 XT: Conclusion
Alright, let us wrap this up.
As I explained at the beginning of the video, this project was mostly about me getting up to speed on overclocking AMD discrete graphics cards. The RX 6500 XT seemed like a great product to do that as it was the cheapest RDNA 2 graphics card available.
Unfortunately, I was quickly proven wrong. To make a long story short: AMD is incredibly restrictive for custom performance tuning. It’s a simple situation: once you run out of slider, that’s the end of your overclocking journey. While we could jump to the conclusion that AMD simply hates overclocking, I think it’s important to highlight a couple of things.
First and foremost, it’s essential to recognize the tremendous progress AMD has made in terms of GPU power management. They’ve developed some genuinely brilliant technologies integrated into their GPUs in their search to provide increased performance within a specified power budget. Unfortunately, an unintended side-effect of the more comprehensive integrated power management technologies is that overclocking is highly restricted unless explicitly enabled by the vendor.
On the CPU side, we’ve seen AMD address this particular challenge by providing enthusiasts with performance tuning tools such as Precision Boost Overdrive and OC Mode. That effectively allows an enthusiast to tune their AMD silicon as they please: not just maximize the performance through overclocking, but also maximize power efficiency by undervolting.
On the GPU side, this effort is – to put it gently – lacking. It looks like the AMD GPU team is somewhat going out of their way to prevent their customers from tuning the product to their needs and desires. To me, that seems more like a self-defeating move than anything else.
Take the Radeon RX 6500 XT, for example. Media and customers alike crucified this graphics card for not meeting performance expectations. Looking at the overclocking headroom, it seems the story for this product could’ve been about its great overclocking potential. At a time when reasonably priced high-performance GPUs are hard to come by, providing your customers with an opportunity to squeeze as much performance per dollar out of an affordable card seems like a great way to win over the hearts and minds of enthusiasts and gamers alike.
I get the business logic of wanting your customers to pay for a certain amount of value offered with your products. But I think AMD overlooks that offered value doesn’t equate to perceived value. Enabling advanced performance tuning allows customers to find perceived value in your product that wasn’t offered in the first place and thus may increase their willingness to purchase.
Something to think about.
Anyway, that’s all for today!
I want to thank my Patreon supporters, Coffeepenbit, Andrew, Furna, and Chris, for supporting my work.
As per usual, if you have any questions or comments, feel free to drop them in the comment section below.
See you next time!