TensorFlow-DirectML allows us to run the AI Benchmark Python benchmark on Intel Rocket Lake UHD Graphics 750 integrated graphics.
Unless you’ve been living under a rock for the past couple of years, AI and deep learning are a pretty big deal these days. So, I wanted to include at least one benchmark test to see if overclocking the IGP could improve deep learning performance.
I came across AI Benchmark: https://pypi.org/project/ai-benchmark/.
AI Benchmark Alpha is an open-source Python library for evaluating the AI performance of various hardware platforms, including CPUs, GPUs, and TPUs. The benchmark relies on the TensorFlow machine learning library and provides a lightweight solution for assessing inference and training speed for key deep learning models.
Installing AI Benchmark turned out to be a bit more of a hassle than initially expected. First, it relies on the TensorFlow machine learning library. TensorFlow can be used with AVX-enabled CPUs or CUDA-enabled GPUs, neither of which describes our integrated graphics.
Then I came across TensorFlow-DirectML. TensorFlow-DirectML broadens the reach of TensorFlow beyond its traditional Graphics Processing Unit (GPU) support by enabling high-performance training and inferencing of machine learning models on any Windows device with a DirectX 12-capable GPU through DirectML, a hardware-accelerated deep learning API on Windows.
While I am pretty software-illiterate, I was able to use the TensorFlow-DirectML package to get the benchmark to run on the integrated graphics. For those who want to also run it, I’ll quickly run you through the process.
- First, install Anaconda
- Then, run the Anaconda Prompt
- Create a new Python environment for the benchmark. Make sure to specify Python version 3.7 as only versions 3.5, 3.6, and 3.7 are supported by tensorflow-directml
- conda create -n aibench python=3.7
- Activate your newly created environment
- conda activate aibench
- Download and install the ai_benchmark package
- pip install ai_benchmark
- Download and install the tensorflow-directml package
- pip install tensorflow-directml
- Start python
- python
- Import the AI Benchmark package
- from ai_benchmark import AIBenchmark
- Specify AI Benchmark to run on the IGP
- benchmark = AIBenchmark(use_CPU=None, verbose_level=3)
- use_CPU=None will prevent the benchmark from running on the CPU
- verbose_level=3 will provide us detailed information during the benchmark
- Start the benchmark
- benchmark.run()
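The interactive steps above can also be collected into a small script. What follows is a minimal sketch, assuming the aibench environment is active and both packages are installed. The helper names `is_supported_python` and `run_on_igp` are hypothetical; only the `AIBenchmark(use_CPU=None, verbose_level=3)` call and `run()` come from the steps above. Note that Python is case-sensitive, so the import must be written with a lowercase `from`.

```python
import sys

# Python minor versions that tensorflow-directml supports, per the steps above.
SUPPORTED_PYTHONS = {(3, 5), (3, 6), (3, 7)}


def is_supported_python(version=None):
    """Return True if the (major, minor) version can host tensorflow-directml."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) in SUPPORTED_PYTHONS


def run_on_igp(verbose_level=3):
    """Run AI Benchmark without forcing the CPU path and return its results.

    Hypothetical helper: only the AIBenchmark constructor arguments and
    run() are taken from the steps above.
    """
    if not is_supported_python():
        raise RuntimeError("tensorflow-directml needs Python 3.5, 3.6, or 3.7")
    # Imported lazily so the version check can fail first with a clear message.
    from ai_benchmark import AIBenchmark

    benchmark = AIBenchmark(use_CPU=None, verbose_level=verbose_level)
    return benchmark.run()
```

Saving this to a file and calling `run_on_igp()` from the Anaconda Prompt starts the same run as typing the commands one by one.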
The benchmark itself takes about 20 minutes to run and outputs three scores: inference score, training score, and AI-Score. It’s the last of these that we use as our performance measurement.
Do note that the AI Benchmark requires a substantial amount of memory. The integrated graphics does not have dedicated memory but rather shares the system memory with the CPU. As you’ll see later in the video, when using 2 sticks of 8GB we ran out of memory and were unable to complete the benchmark. So, I recommend a minimum of 32GB of system memory.
Also, you may run into an error called DXGI_ERROR_DEVICE_REMOVED while running the benchmark. This happens when there’s a timeout. Basically, the benchmark figures our integrated graphics gave up and went home. But our IGP didn’t give up … it’s just a bit slow.
To solve the issue, you can increase the timeout with the registry entry “TdrDelay”. This entry extends the time Windows waits for the graphics device to respond before it decides the device has hung and resets the driver. I use a value of 20 and this resolved my problem.
KeyPath: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers
KeyValue: TdrDelay
ValueType: REG_DWORD
ValueData: Number of seconds to delay. The default value is 2 seconds.
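If you prefer not to edit the registry by hand, the same value can be written from Python. This is a minimal sketch, assuming a Windows machine and a prompt started with administrator rights. The helper names `validate_tdr_delay` and `set_tdr_delay` are hypothetical; the key path, value name, and type are the ones listed above, and `winreg` is part of the Python standard library on Windows.

```python
import sys

# Registry location and value name as listed above (relative to HKEY_LOCAL_MACHINE).
TDR_KEY_PATH = r"System\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUE_NAME = "TdrDelay"


def validate_tdr_delay(seconds):
    """Return seconds unchanged if it is a whole number that fits in a REG_DWORD."""
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise ValueError("TdrDelay must be a whole number of seconds")
    if not 1 <= seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must be between 1 and 0xFFFFFFFF seconds")
    return seconds


def set_tdr_delay(seconds=20):
    """Write TdrDelay as a REG_DWORD (Windows only, needs administrator rights)."""
    seconds = validate_tdr_delay(seconds)
    if sys.platform != "win32":
        raise OSError("the Windows registry is only available on Windows")
    import winreg  # standard library, Windows only

    # Open (or create) the GraphicsDrivers key with permission to set values.
    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, TDR_KEY_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, TDR_VALUE_NAME, 0, winreg.REG_DWORD, seconds)
    return seconds
```

Calling `set_tdr_delay(20)` writes the value I used. A restart is typically needed before the new timeout takes effect.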
Oh, and did I already mention that the IGP is pretty slow? The IGP gets a score of about 1,000 points at stock. When running the AI Benchmark on our 8 Rocket Lake CPU cores using the regular TensorFlow library, we get a score of about 3,000 points. So the CPU scores about 3x higher.
TensorFlow-DirectML in SkatterBencher Guides
We use TensorFlow-DirectML in the following SkatterBencher guides:
- SkatterBencher #28: Intel UHD Graphics 750 Overclocked to 1750 MHz (link)