The SkatterBencher AI Benchmark leaderboard consists of all benchmark results from SkatterBencher CPU and GPU overclocking guides. You can find more information about how the benchmark is configured below the AI Benchmark leaderboard.
AI Benchmark Leaderboard
SKU | OC Strategy | Benchmark Score |
---|---|---|
NVIDIA RTX A5000 Laptop | WSL_CUDNN | 26830 |
NVIDIA RTX A5000 Laptop | WSL_TENSORRT | 26767 |
NVIDIA RTX A5000 Laptop | WN_DIRECTML | 26318 |
NVIDIA RTX A5000 Laptop | WSL_DIRECTML | 23287 |
Intel Arc A770 | SB64_OCS4 | 16935 |
Intel Arc A770 | SB64_OCS3 | 16561 |
NVIDIA GeForce RTX 3050 | SB62_OCS2 | 16418 |
Intel Arc A770 | SB64_OCS2 | 15972 |
NVIDIA GeForce RTX 3050 | SB62_OCS1 | 15847 |
Intel Arc A770 | SB64_OCS1 | 15669 |
NVIDIA GeForce RTX 3050 | SB62_OCS3 | 15167 |
NVIDIA GeForce RTX 3050 | SB62_EVGA | 14556 |
Intel Arc A770 | SB64_Stock | 14325 |
NVIDIA GeForce RTX 3050 | SB62_Stock | 14209 |
AMD Radeon 780M | SB70_OCS5 | 12738 |
Intel Xeon w7-3465X | SB63_OCS4 | 12100 |
AMD Radeon 780M | SB70_OCS4 | 11705 |
AMD Radeon 780M | SB70_OCS3 | 11650 |
Intel Xeon w7-3465X | SB63_OCS3 | 11431 |
AMD Radeon 780M | SB70_OCS2 | 10688 |
AMD Radeon 760 | SB73_OCS5 | 10509 |
AMD Radeon 780M | SB70_OCS1 | 10353 |
AMD Radeon 780M | SB70_Stock | 9986 |
AMD Radeon 760 | SB73_OCS4 | 9751 |
AMD Radeon 760 | SB73_OCS3 | 9742 |
AMD Ryzen Threadripper 7960X | SB71_OCS4 | 9605 |
Intel Xeon w5-3435X | SB61_OCS4 | 9591 |
AMD Ryzen Threadripper 7980X | SB66_OCS2 | 9567 |
Intel Xeon w7-2495X | SB59_OCS4 | 9492 |
AMD Ryzen Threadripper 7980X | SB66_OCS2 | 9485 |
AMD Ryzen Threadripper 7980X | SB66_OCS1 | 9477 |
AMD Ryzen Threadripper 7980X | SB66_OCS1 | 9438 |
AMD Ryzen Threadripper 7980X | SB66_OCS3 | 9426 |
Intel Xeon w7-2495X | SB59_OCS2 | 9354 |
AMD Ryzen Threadripper 7960X | SB71_OCS3 | 9350 |
AMD Ryzen Threadripper 7980X | SB66_Stock | 9327 |
Intel Xeon w5-3435X | SB61_OCS3 | 9283 |
AMD Ryzen Threadripper 7980X | SB66_OCS4 | 9270 |
Intel Xeon w7-2495X | SB59_OCS3 | 9033 |
AMD Ryzen Threadripper 7980X | SB66_Stock | 9023 |
Intel Xeon w7-3465X | SB63_OCS1 | 8988 |
AMD Ryzen Threadripper 7960X | SB71_OCS2 | 8840 |
AMD Ryzen Threadripper 7960X | SB71_OCS1 | 8798 |
AMD Radeon 760 | SB73_OCS2 | 8786 |
Intel Xeon w7-3465X | SB63_Stock | 8599 |
Intel Xeon w5-3435X | SB61_OCS2 | 8541 |
AMD Ryzen Threadripper 7960X | SB71_Stock | 8501 |
AMD Ryzen 7 9700X | SB78X_OCS2 | 8343 |
AMD Ryzen 7 9700X | SB78X_OCS1 | 8119 |
AMD Radeon 760 | SB73_OCS1 | 8091 |
Intel Xeon w7-2495X | SB59_OCS1 | 7988 |
AMD Ryzen 7 9700X | SB78_OCS5 | 7882 |
AMD Ryzen 7 9700X | SB78_OCS4 | 7876 |
Intel Xeon w7-2495X | SB59_Stock | 7593 |
AMD Ryzen 9 7950X | SB45_OCS5 | 7497 |
AMD Ryzen 9 7950X | SB45_OCS3 | 7484 |
AMD Ryzen 9 7950X | SB45_OCS4 | 7441 |
AMD Ryzen 7 9700X | SB78_OCS3 | 7438 |
AMD Ryzen 9 7950X | SB45_OCS2 | 7363 |
AMD Radeon 760 | SB73_Stock | 7347 |
AMD Ryzen 7 9700X | SB78_OCS2 | 7221 |
AMD Ryzen 7 9700X | SB78_OCS1 | 7212 |
AMD Ryzen 9 7950X3D | SB56_OCS2 | 7080 |
AMD Ryzen 9 7950X | SB45_OCS1 | 6990 |
Intel Xeon w5-3435X | SB61_OCS1 | 6981 |
Intel Core i9-13900K | SB49_OCS3 | 6980 |
AMD Ryzen 9 7950X3D | SB56_OCS3 | 6956 |
Intel Core i9-13900K | SB49_OCS2 | 6954 |
Intel Core i9-14900K | SB67_OCS3 | 6949 |
Intel Core i9-14900K | SB67_OCS4 | 6943 |
Intel Core i9-13900K | SB49_OCS4 | 6935 |
AMD Ryzen 9 7950X3D | SB56_OCS1 | 6928 |
Intel Xeon w5-3435X | SB61_Stock | 6800 |
AMD Ryzen 5 9600X | SB79_OCS4 | 6780 |
Intel Core i9-14900K | SB67_OCS2 | 6780 |
AMD Ryzen 5 9600X | SB79_OCS3 | 6772 |
Intel Core i9-13900K | SB49_OCS1 | 6754 |
AMD Ryzen 9 7900 | SB54_OCS5 | 6687 |
AMD Ryzen 9 7900 | SB54_OCS3 | 6657 |
AMD Ryzen 9 7900 | SB54_OCS4 | 6587 |
Intel Arc A380 | SB44_OCS4 | 6582 |
AMD Ryzen 9 7950X | SB45_Stock | 6580 |
Intel Core i9-14900K | SB67_OCS1 | 6552 |
Intel Core i9-13900KS | SB53_OCS4 | 6526 |
AMD Ryzen 9 7900 | SB54_OCS2 | 6489 |
Intel Core i9-13900KS | SB53_OCS3 | 6476 |
AMD Ryzen 9 7900X | SB46_OCS4 | 6466 |
AMD Ryzen 9 7900X | SB46_OCS3 | 6457 |
AMD Ryzen Threadripper 5990X | SB43_OCS5 | 6426 |
AMD Ryzen 9 7900X | SB46_OCS5 | 6407 |
AMD Ryzen 9 7950X3D | SB56_Stock | 6405 |
AMD Ryzen 5 9600X | SB79_OCS2 | 6386 |
Intel Arc A380 | SB44_OCS3 | 6382 |
AMD Ryzen 9 7900X | SB46_OCS2 | 6359 |
Intel Core i7-14700K | SB68_OCS5 | 6331 |
Intel Arc A380 | SB44_OCS2 | 6331 |
AMD Ryzen 9 7900 | SB54_OCS1 | 6328 |
Intel Core i7-14700K | SB68_OCS4 | 6321 |
AMD Ryzen Threadripper 5990X | SB43_OCS4 | 6315 |
AMD Ryzen 9 7900X | SB46_OCS1 | 6311 |
AMD Ryzen Threadripper 5990X | SB43_OCS3_NoSMT | 6252 |
Intel Core i9-13900KS | SB53_OCS2 | 6219 |
Intel Core i7-14700K | SB68_OCS3 | 6212 |
Intel Core i7-14700K | SB68_OCS2 | 6211 |
AMD Ryzen 9 7900X3D | SB58_OCS3 | 6195 |
Intel Core i9-13900K | SB49_Stock | 6192 |
AMD Ryzen 5 9600X | SB79_OCS1 | 6191 |
AMD Ryzen 9 7900X3D | SB58_OCS2 | 6125 |
AMD Ryzen 7 9700X | SB78_Stock | 6108 |
AMD Ryzen 9 7900X3D | SB58_OCS1 | 6098 |
Intel Arc A380 | SB44_OCS1 | 6098 |
Intel Core i9-13900KS | SB53_OCS1 | 6073 |
Intel Core i7-14700K | SB68_OCS1 | 6006 |
AMD Ryzen Threadripper 5990X | SB43_OCS2_NoSMT | 5952 |
AMD Ryzen 9 7900X | SB46_Stock | 5951 |
AMD Ryzen Threadripper 5990X | SB43_OCS1_NoSMT | 5950 |
AMD Radeon 740M | SB75_OCS5 | 5920 |
Intel Arc A380 | SB44_Stock | 5799 |
Intel Core i9-13900K P-Core | SB52_OCS5 | 5773 |
Intel Core i7-13700K | SB50_OCS4 | 5764 |
Intel Core i9-13900K P-Core | SB52_OCS4 | 5755 |
AMD Ryzen 9 7900X3D | SB58_Stock | 5710 |
AMD Radeon 740M | SB75_OCS4 | 5671 |
Intel Core i9-13900K P-Core | SB52_OCS3 | 5670 |
Intel Core i9-14900K | SB67_Stock | 5663 |
AMD Ryzen 7 7800X3D | SB60_OCS4 | 5655 |
Intel Core i7-13700K | SB50_OCS3 | 5629 |
AMD Ryzen 7 7700X | SB47_OCS3 | 5587 |
AMD Ryzen 7 7700X | SB47_OCS4 | 5576 |
AMD Ryzen 7 7700X | SB47_OCS2 | 5548 |
AMD Ryzen 5 9600X | SB79_Stock | 5543 |
AMD Ryzen 7 7700X | SB47_OCS5 | 5541 |
Intel Core i9-13900K P-Core | SB52_OCS2 | 5539 |
AMD Ryzen 7 7700X | SB47_OCS6 | 5528 |
AMD Ryzen 7 7800X3D | SB60_OCS5 | 5527 |
AMD Radeon 740M | SB75_OCS3 | 5515 |
AMD Ryzen 7 7800X3D | SB60_OCS2 | 5473 |
Intel Core i9-13900K P-Core | SB52_OCS1 | 5471 |
AMD Ryzen 7 7700X | SB47_OCS1 | 5469 |
AMD Ryzen 7 7800X3D | SB60_OCS3 | 5468 |
Intel Core i9-13900KS | SB53_Stock | 5448 |
AMD Ryzen Threadripper 5990X | SB43_OCS3 | 5401 |
Intel Core i7-14700K | SB68_Stock | 5370 |
AMD Ryzen 9 7900 | SB54_Stock | 5368 |
AMD Ryzen 7 7800X3D | SB60_OCS1 | 5363 |
AMD Ryzen Threadripper 3990X | SB36_OCS5 | 5344 |
AMD Ryzen Threadripper 3990X | SB36_OCS2_NoSMT | 5325 |
AMD Ryzen Threadripper 5990X | SB43_OCS2 | 5227 |
AMD Ryzen Threadripper 3990X | SB36_OCS1_NoSMT | 5212 |
AMD Ryzen 7 7700X | SB47_Stock | 5172 |
AMD Ryzen Threadripper 5990X | SB43_OCS1 | 5158 |
Intel Core i9-12900KF | SB34_OCS4 | 5153 |
AMD Ryzen 7 7800X3D | SB60_Stock | 5144 |
Intel Core i9-12900K | SB30_OCS5 | 5115 |
Intel Core i7-13700K | SB50_OCS2 | 5100 |
Intel Core i7-13700K | SB50_OCS1 | 5062 |
Intel Core i9-12900K | SB30_OCS2 | 5044 |
Intel Core i9-12900KF | SB34_OCS3 | 4989 |
Intel Core i9-13900K P-Core | SB52_Stock | 4961 |
AMD Radeon 740M | SB75_OCS2 | 4864 |
Intel Core i9-12900KF | SB34_OCS2 | 4821 |
AMD Ryzen Threadripper 3990X | SB36_OCS2 | 4820 |
Intel Core i9-12900K | SB30_OCS4 | 4754 |
Intel Core i7-13700K | SB50_Stock | 4727 |
AMD Ryzen Threadripper 3990X | SB36_OCS4 | 4687 |
AMD Ryzen 7 8700G | SB69_OCS2 | 4680 |
Intel Core i9-12900K | SB30_OCS1 | 4650 |
AMD Ryzen Threadripper 3990X | SB36_OCS1 | 4632 |
AMD Ryzen Threadripper 5990X | SB43_Stock_NoSMT | 4594 |
AMD Radeon 740M | SB75_OCS1 | 4580 |
AMD Ryzen 5 7600X | SB48_OCS4 | 4575 |
AMD Ryzen 5 7600X | SB48_OCS5 | 4573 |
AMD Ryzen 5 7600X | SB48_OCS6 | 4570 |
AMD Ryzen 5 7600X | SB48_OCS3 | 4562 |
Intel Core i9-12900KF | SB34_OCS1 | 4550 |
AMD Ryzen 7 8700G | SB69_OCS4 | 4544 |
AMD Ryzen 7 8700G | SB69_OCS1 | 4544 |
AMD Radeon 740M | SB75_Stock | 4530 |
AMD Ryzen 5 7600X | SB48_OCS2 | 4529 |
Intel Core i9-12900K | SB30_OCS3 | 4498 |
Intel Core i7-12700K | SB31_OCS4 | 4494 |
AMD Ryzen 5 7600X | SB48_OCS1 | 4491 |
AMD Ryzen 7 8700F | SB76_OCS2 | 4465 |
Intel Core i7-12700K | SB31_OCS3 | 4454 |
AMD Ryzen Threadripper 3990X | SB36_OCS3 | 4418 |
AMD Ryzen 7 8700G | SB69_OCS3 | 4407 |
Intel Core i5-13600K | SB51_OCS3 | 4337 |
AMD Ryzen 7 8700F | SB76_OCS3 | 4328 |
Intel Core i9-12900K | SB30_Stock | 4324 |
AMD Ryzen 7 8700F | SB76_OCS1 | 4304 |
Intel Core i5-13600K | SB51_OCS4 | 4301 |
Intel Core i7-12700K | SB31_OCS2 | 4285 |
AMD Ryzen Threadripper 5990X | SB43_Stock | 4277 |
Intel Core i9-12900KF | SB34_Stock | 4269 |
AMD Ryzen 5 7600X | SB48_Stock | 4205 |
AMD Ryzen 3 5300GE (IGP) | SB35_OCS3 | 4202 |
AMD Ryzen 3 5300GE (IGP) | SB35_OCS4 | 4200 |
AMD Ryzen 7 8700F | SB76_Stock | 4140 |
Intel Core i7-12700K | SB31_OCS1 | 4115 |
AMD Ryzen 5 8600G | SB72_OCS3 | 4084 |
AMD Ryzen 5 8600G | SB72_OCS2 | 4037 |
AMD Ryzen Threadripper 3990X | SB36_Stock_NoSMT | 4023 |
Intel Core i7-12700K | SB31_Stock | 3939 |
AMD Ryzen 5 8600G | SB72_OCS1 | 3837 |
Intel Core i5-13600K | SB51_OCS2 | 3813 |
Intel Core i5-13600K | SB51_OCS1 | 3748 |
AMD Ryzen 7 8700G | SB69_Stock | 3746 |
Intel Core i5-12400 | SB37_OCS3 | 3680 |
Intel Core i5-12400 | SB37_OCS4 | 3648 |
AMD Ryzen Threadripper 3990X | SB36_Stock | 3626 |
AMD Ryzen 7 5800X3D | SB39_OCS3 | 3609 |
AMD Ryzen 3 5300GE (IGP) | SB35_OCS2 | 3596 |
AMD Ryzen 5 8600G | SB72_Stock | 3591 |
Intel Core i5-13600K | SB51_Stock | 3563 |
AMD Ryzen 7 5800X3D | SB39_OCS1 | 3480 |
AMD Ryzen 7 5800X3D | SB39_OCS2 | 3473 |
AMD Ryzen 3 5300GE (IGP) | SB35_OCS1 | 3470 |
AMD Ryzen 5 8500GE | SB74_OCS3 | 3435 |
AMD Ryzen 5 8500GE | SB74_OCS2 | 3395 |
Intel Core i5-12600KF | SB32_OCS3 | 3257 |
AMD Ryzen 5 8500GE | SB74_OCS1 | 3206 |
AMD Ryzen 7 5800X3D | SB39_Stock | 3196 |
Intel Core i5-12600KF | SB32_OCS2 | 3196 |
Intel Core i5-12400 | SB37_OCS2 | 3195 |
Intel Core i5-12600KF | SB32_OCS1 | 3130 |
AMD Ryzen 5 8500GE | SB74_Stock | 3110 |
Intel Core i9-11980HK | SB65_OCS3 | 3088 |
Intel Core i5-12400 | SB37_OCS1 | 3083 |
Intel Core i9-11980HK | SB65_OCS2 | 3079 |
AMD Ryzen 3 5300GE (IGP) | SB35_Stock | 2775 |
Intel Core i9-11980HK | SB65_OCS1 | 2771 |
Intel Core i9-11900H | WSL_DIRECTML_ONEDNN | 2735 |
Intel Core i9-11900H | WN_DIRECTML_ONEDNN | 2713 |
Intel Core i5-12600KF | SB32_Stock | 2681 |
AMD Radeon Graphics (Ryzen 7000) | SB55_OCS3 | 2645 |
AMD Radeon Graphics (Ryzen 7000) | SB55_OCS2 | 2622 |
Intel Core i9-11980HK | SB65_Stock | 2575 |
Intel Core i5-12400 | SB37_Stock | 2551 |
Intel Core i9-11900H | WSL_DIRECTML | 2240 |
Intel UHD Graphics 770 (13th Gen) | SB57_OCS4 | 2072 |
AMD Radeon Graphics (Ryzen 7000) | SB55_OCS1 | 2057 |
Intel UHD Graphics 770 (13th Gen) | SB57_OCS3 | 2055 |
Intel UHD Graphics 770 (12th Gen) | SB33_OCS4 | 2046 |
AMD Radeon Graphics (Ryzen 7000) | SB55_Stock | 2037 |
Intel UHD Graphics 770 (12th Gen) | SB33_OCS3 | 1949 |
Intel UHD Graphics 770 (12th Gen) | SB33_OCS2 | 1825 |
Intel UHD Graphics 770 (13th Gen) | SB57_OCS2 | 1739 |
AMD Ryzen 3 5300GE | SB35_OCS3 | 1504 |
AMD Ryzen 3 5300GE | SB35_OCS4 | 1469 |
AMD Ryzen 3 5300GE | SB35_OCS2 | 1448 |
Intel UHD Graphics 750 (11th Gen) | SB28_OCS6 | 1446 |
AMD Ryzen 3 5300GE | SB35_OCS1 | 1419 |
Intel UHD Graphics for 11th Gen | WN_DIRECTML | 1417 |
Intel UHD Graphics for 11th Gen | WSL_DIRECTML | 1396 |
Intel UHD Graphics 730 (12th Gen) | SB38_OCS4 | 1388 |
Intel UHD Graphics 770 (13th Gen) | SB57_OCS1 | 1387 |
Intel UHD Graphics 770 (13th Gen) | SB57_Stock | 1379 |
Intel UHD Graphics 770 (12th Gen) | SB33_OCS1 | 1361 |
Intel UHD Graphics 770 (12th Gen) | SB33_Stock | 1352 |
Intel UHD Graphics 750 (11th Gen) | SB28_OCS5 | 1264 |
Intel UHD Graphics 730 (12th Gen) | SB38_OCS3 | 1223 |
AMD Ryzen 3 5300GE | SB35_Stock | 1179 |
Intel UHD Graphics 730 (12th Gen) | SB38_OCS2 | 1116 |
Intel UHD Graphics 750 (11th Gen) | SB28_OCS4 | 1107 |
Intel UHD Graphics 750 (11th Gen) | SB28_OCS3 | 1098 |
Intel UHD Graphics 730 (12th Gen) | SB38_OCS1 | 1080 |
Intel UHD Graphics 730 (12th Gen) | SB38_Stock | 1072 |
Intel UHD Graphics 750 (11th Gen) | SB28_OCS2 | 1066 |
Intel UHD Graphics 750 (11th Gen) | SB28_Stock | 1064 |
Intel UHD Graphics 750 (11th Gen) | SB28_OCS1 | 1064 |
Intel Core i9-11900H | WN_DIRECTML | 892 |
Raspberry Pi 5 | SB77_OCS2 | 296 |
Raspberry Pi 5 | SB77_OCS1 | 293 |
Raspberry Pi 5 | SB77_Stock | 272 |
Leaderboard Notes:
- OC Strategy is coded as SkatterBencher guide number + OC Strategy number. For example, SB64_OCS4 is OC Strategy #4 from SkatterBencher #64.
- The Core i9-13900KS benchmark scores seem off. I noticed a particular issue with the OS for that system where the score would be significantly lower than expected unless the process affinity was manually set.
- TensorFlow-DirectML was used for: Intel UHD Graphics 750 (11th Gen), Intel UHD Graphics 730 (12th Gen), Intel UHD Graphics 770 (12th Gen), Intel UHD Graphics 770 (13th Gen), AMD Radeon Graphics (Ryzen 7000), AMD Ryzen 3 5300GE (IGP), and Intel Arc A380.
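The label scheme in the notes above can be decoded programmatically. The following is a minimal sketch, assuming labels follow the `SB<guide>_OCS<n>` or `SB<guide>_Stock` pattern. The names `parse_label` and `uplift_percent` are hypothetical helpers, and the two scores in the example are the Intel Arc A770 rows from the table.

```python
import re

# Assumed label pattern: "SB" + guide number, "_", then "Stock" or "OCS" + strategy number.
# Suffixes such as "_NoSMT" or the "X" in "SB78X" are not handled by this sketch.
LABEL_RE = re.compile(r"^SB(\d+)_(Stock|OCS(\d+))$")


def parse_label(label):
    """Return (guide_number, strategy_number); strategy 0 stands for stock."""
    match = LABEL_RE.match(label)
    if match is None:
        return None  # e.g. "WSL_CUDNN" does not follow the SB scheme
    guide = int(match.group(1))
    strategy = 0 if match.group(2) == "Stock" else int(match.group(3))
    return guide, strategy


def uplift_percent(score, stock_score):
    """Percentage gain of an overclocked score over the stock score."""
    return (score - stock_score) / stock_score * 100.0


print(parse_label("SB64_OCS4"))
print(round(uplift_percent(16935, 14325), 1))
```

For the Arc A770, SB64_OCS4 (16935) against SB64_Stock (14325) works out to roughly an 18.2% gain.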
AI Benchmark
AI Benchmark Alpha is an open-source Python library for evaluating the AI performance of various hardware platforms, including CPUs, GPUs, and TPUs. The benchmark relies on the TensorFlow machine learning library and provides a lightweight solution for assessing inference and training speed for key deep learning models. The benchmark takes about 20 minutes to run and outputs three scores: an inference score, a training score, and an AI-Score. I use the AI-Score as the performance measurement.
While there are many ways to run AI Benchmark on your machine, I use an Anaconda Python environment with the latest TensorFlow optimizations.
In this blog post, I describe my AI Benchmark process in detail: AI Benchmark to Measure Machine Learning Performance. Below you can find a quick summary.
Installing AI Benchmark on Windows Native
- First, download and install Anaconda for Windows.
- After completing the installation, run the Anaconda Prompt.
- Create a new Python environment for the benchmark. Make sure to specify Python version 3.10 as the tensorflow-directml-plugin supports only versions 3.7, 3.8, 3.9, and 3.10
- conda create -n aibench python=3.10
- Activate your newly created environment
- conda activate aibench
- Download and install the base TensorFlow-CPU
- pip install tensorflow-cpu
- Download and install the tensorflow-directml-plugin package
- pip install tensorflow-directml-plugin
- Download and install the numpy 1.23 package
- pip install numpy==1.23
- Download and install the ai_benchmark package
- pip install ai_benchmark
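Before running the benchmark, it can help to confirm that the environment matches the Python version constraint above. The following is a minimal sketch: `is_supported_python` and `list_tensorflow_devices` are hypothetical helper names, and the supported range is taken from the step above. `tf.config.list_physical_devices()` is the standard TensorFlow call for listing devices.

```python
import sys

# Python versions the tensorflow-directml-plugin supports, per the step above.
SUPPORTED_MINORS = (7, 8, 9, 10)


def is_supported_python(version_info=sys.version_info):
    """True if the interpreter is Python 3.7 through 3.10."""
    return version_info[0] == 3 and version_info[1] in SUPPORTED_MINORS


def list_tensorflow_devices():
    """Return the devices TensorFlow can see (imports TensorFlow lazily)."""
    import tensorflow as tf  # only available inside the aibench environment

    return tf.config.list_physical_devices()
```

Inside the activated environment, `is_supported_python()` should return True, and `list_tensorflow_devices()` should include a GPU entry if the DirectML plugin loaded correctly.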
Installing AI Benchmark on Windows Subsystem for Linux
- First, make sure WSL is correctly installed on your Windows PC.
- Open PowerShell and type wsl --install
- Follow the installation instructions
- Once WSL is installed, open the WSL prompt.
- Now download the Anaconda for Linux 64-Bit (x86) installer.
- Browse to ./Users/{your_pc_name}/Downloads (or any folder you have write permissions on)
- wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh (or latest Linux package)
- Next, install Anaconda
- bash ./Anaconda3-2023.07-2-Linux-x86_64.sh (run from the download folder; match the filename you downloaded)
- Then, close and reopen Windows Subsystem for Linux
- Create a new Python environment for the benchmark. Make sure to specify Python version 3.10 as the tensorflow-directml-plugin supports only versions 3.7, 3.8, 3.9, and 3.10
- conda create -n aibench python=3.10
- Activate your newly created environment
- conda activate aibench
- Download and install the base TensorFlow-CPU
- pip install tensorflow-cpu
- Download and install the tensorflow-directml-plugin package
- pip install tensorflow-directml-plugin
- Download and install the numpy 1.23 package
- pip install numpy==1.23
- Download and install the ai_benchmark package
- pip install ai_benchmark
Running the AI Benchmark
Starting the AI Benchmark is the same on Windows Native and Windows Subsystem for Linux.
- Open the Anaconda prompt
- Activate the conda environment with AI Benchmark
- conda activate aibench
- Start Python
- python
- Import the AI Benchmark package
- from ai_benchmark import AIBenchmark
- Specify the benchmark parameters
- benchmark = AIBenchmark(use_CPU=None, verbose_level=3)
- use_CPU=True runs the benchmark on the CPU
- use_CPU=None runs the benchmark on the GPU
- verbose_level=0 runs the test silently
- verbose_level=1 runs the test with a short summary
- verbose_level=2 provides information about each run
- verbose_level=3 provides the TensorFlow logs during the run
- Lastly, start the benchmark
- benchmark.run()
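The interactive steps above can also be collected into a script. This is a minimal sketch, assuming `benchmark.run()` returns a results object with an `ai_score` attribute; treat that attribute name as an assumption and check it against your installed ai_benchmark version. `benchmark_kwargs` and `run_ai_benchmark` are hypothetical helper names.

```python
def benchmark_kwargs(on_cpu=False, verbose_level=1):
    """Build the AIBenchmark arguments described in the steps above."""
    if verbose_level not in (0, 1, 2, 3):
        raise ValueError("verbose_level must be 0, 1, 2, or 3")
    # use_CPU=True selects the CPU; use_CPU=None selects the GPU.
    return {"use_CPU": True if on_cpu else None, "verbose_level": verbose_level}


def run_ai_benchmark(on_cpu=False, verbose_level=1):
    """Run the full benchmark (about 20 minutes) and return the AI-Score."""
    from ai_benchmark import AIBenchmark  # imported lazily; needs the aibench env

    benchmark = AIBenchmark(**benchmark_kwargs(on_cpu, verbose_level))
    results = benchmark.run()
    # Assumed attribute name; verify against your ai_benchmark version.
    return results.ai_score
```

Calling `run_ai_benchmark()` from the activated aibench environment starts a GPU run, while `run_ai_benchmark(on_cpu=True)` starts a CPU run.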
AI Benchmark Optimizations and Tricks
Since deep neural network and machine learning performance is a big selling point, companies work hard to release performance-optimizing software packages for their hardware. I tend to use these packages in my SkatterBencher guides.
oneDNN for Intel CPU Architectures
Intel oneDNN is an open-source, high-performance library designed to accelerate deep learning applications on Intel architecture CPUs. It provides optimized primitives for various deep learning operations, such as convolutions, inner products, and other key operations used in neural networks.
The oneAPI Deep Neural Network Library (oneDNN) optimizations are available in the official x86-64 TensorFlow after v2.5. The feature is off by default before v2.9, but users can enable those CPU optimizations by configuring the environment variable TF_ENABLE_ONEDNN_OPTS.
- Windows Native: set TF_ENABLE_ONEDNN_OPTS=1
- Windows Subsystem for Linux: export TF_ENABLE_ONEDNN_OPTS=1
Since TensorFlow v2.9, the oneAPI Deep Neural Network Library (oneDNN) optimizations are enabled by default.
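The same variable can be set from Python, as long as that happens before TensorFlow is imported. The following is a minimal sketch; `enable_onednn` is a hypothetical helper name.

```python
import os


def enable_onednn(enabled=True):
    """Set TF_ENABLE_ONEDNN_OPTS; call this before importing TensorFlow."""
    os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1" if enabled else "0"
    return os.environ["TF_ENABLE_ONEDNN_OPTS"]


enable_onednn(True)
# import tensorflow as tf  # import only after the variable is set
```

Passing `False` sets the variable to 0, which is a way to compare scores with and without the oneDNN optimizations.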
ZenDNN for AMD CPU Architectures
I also came across an AMD equivalent library called ZenDNN. However, I’ve yet to try this on an AMD system, so I’ll leave you a link for now.
cuDNN or TensorRT for NVIDIA GPU Architectures
You can also rely on the NVIDIA CUDA Deep Neural Network library (cuDNN) for NVIDIA GPUs. cuDNN is a GPU-accelerated library of primitives for deep neural networks. The installation requires a different version of TensorFlow, not TensorFlow-DirectML, and it runs on WSL because TensorFlow 2.10 was the last release with GPU support on Windows Native.
The installation is pretty straightforward. After installing Anaconda for Linux on WSL, do the following:
- First, create and activate a new Anaconda environment
- conda create -n aibenchNV
- conda activate aibenchNV
- Then, install the appropriate CUDA Toolkit. You can find a support matrix on NVIDIA’s website
- conda install -c conda-forge cudatoolkit=11.8
- Next, install the cuDNN package
- pip install nvidia-cudnn-cu11==8.9.4.25
- Then, install TensorFlow
- pip install tensorflow
- Then, install numpy 1.23
- pip install numpy==1.23
- Lastly, install AI Benchmark
- pip install ai_benchmark
Instead of cuDNN, you can also consider installing the TensorRT Python library. Like cuDNN, TensorRT is a deep-learning library powered by CUDA. TensorRT provides APIs and parsers to import trained models from all major deep learning frameworks. It then generates optimized runtime engines deployable in the data center, automotive, and embedded environments.
ROCm for AMD GPU Architectures
Lastly, I also want to mention AMD’s ROCm software package. ROCm is an open-source software stack for GPU computation featuring a collection of drivers, development tools, and APIs enabling GPU programming from the low-level kernel to end-user applications.
While ROCm is fully integrated into ML frameworks such as PyTorch and TensorFlow, it’s currently unavailable on Windows Native, so I’ve yet to use ROCm optimizations for AMD graphics cards. So, just like with ZenDNN, I’ll leave you with a link to the documentation.
Multiple GPUs
These days, it’s common to have multiple graphics devices in a single system. Usually, that’s the integrated graphics of the CPU and a high-performance discrete graphics card. If you want to switch between the graphics devices, you can set an environment variable before starting Python.
- Windows Native: set DML_VISIBLE_DEVICES=0,1
- Windows Subsystem for Linux: export DML_VISIBLE_DEVICES=0,1
Here, each number indicates a specific device. Device 0 is the NVIDIA dGPU on the EK Flat PC, and device 1 is the Intel iGPU. By default, AI Benchmark will run on the first available device. So, if I want to run on the iGPU, I’d have to set DML_VISIBLE_DEVICES=1.
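The device selection can also be done from Python before TensorFlow is imported. The following is a minimal sketch, assuming the device numbering described above; `select_dml_devices` is a hypothetical helper name.

```python
import os


def select_dml_devices(*indices):
    """Expose only the given DirectML device indices, in the given order."""
    if not indices or any(i < 0 for i in indices):
        raise ValueError("pass one or more non-negative device indices")
    value = ",".join(str(i) for i in indices)
    os.environ["DML_VISIBLE_DEVICES"] = value
    return value


# Run on device 1 only (the iGPU in the example above).
select_dml_devices(1)
# import tensorflow as tf  # import only after the variable is set
```

Device numbering differs between systems, so it is worth confirming which index maps to which device before comparing scores.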
DXGI_ERROR_DEVICE_REMOVED
AI Benchmark is a demanding benchmark that can take a long time to complete. Sometimes, you may run into an error called DXGI_ERROR_DEVICE_REMOVED while running it. That happens when the graphics device takes too long to complete a workload and Windows times out waiting for it.
You can increase the timeout with the registry entry “TdrDelay” to solve the issue. This registry entry extends the time Windows waits for the graphics device to respond before resetting it. See Microsoft’s documentation: https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys
- KeyPath: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers
- KeyValue: TdrDelay
- ValueType: REG_DWORD
- ValueData: Number of seconds to delay. The default value is 2 seconds.
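The registry entry above can also be written with Python’s standard `winreg` module. This is a minimal sketch that only works on Windows from an elevated (administrator) prompt. `validate_delay` and `set_tdr_delay` are hypothetical helper names, and the 10-second value mentioned below is only an example, not a recommendation from this guide.

```python
KEY_PATH = r"System\CurrentControlSet\Control\GraphicsDrivers"
VALUE_NAME = "TdrDelay"


def validate_delay(seconds):
    """Return seconds as an int if it is a usable REG_DWORD value."""
    seconds = int(seconds)
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("delay must be a positive 32-bit number of seconds")
    return seconds


def set_tdr_delay(seconds):
    """Write TdrDelay under HKEY_LOCAL_MACHINE (Windows only, needs admin)."""
    import winreg  # standard library, available on Windows only

    seconds = validate_delay(seconds)
    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, seconds)
    return seconds
```

For example, `set_tdr_delay(10)` would raise the timeout from the default 2 seconds to 10 seconds. A restart is typically needed before the change takes effect.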