ZLUDA: Running CUDA Programs Directly on AMD (Formerly Intel) GPUs

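I previously wrote about "ZLUDA: running CUDA programs directly on Intel integrated graphics". Since then things have taken a sharp turn: AMD stepped in to fund the project, and it now supports AMD GPUs instead. See "AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: It's Now Open-Source". The project lives at vosen/ZLUDA on GitHub, and AMD GPU support landed in a single enormous commit, 1b9ba2b2333746c5e2b05a2bf24fa6ec3828dcdf, whose commit log reads: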

Nobody expects the Red Team

Too many changes to list, but broadly:
* Remove Intel GPU support from the compiler
* Add AMD GPU support to the compiler
* Remove Intel GPU host code
* Add AMD GPU host code
* More device instructions. From 40 to 68
* More host functions. From 48 to 184
* Add proof of concept implementation of OptiX framework
* Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
* Improve ZLUDA launcher for Windows

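The twists along the way, and what came after, are honestly hard to sum up... The author was working at Intel when this started; after evaluating it, Intel decided the project had no future. AMD then got in touch and funded the project, and in the end also decided it had no future. Under the terms of the author's contract with AMD, if AMD reached that conclusion, he was free to release the code as open source: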

Why is this project suddenly back after 3 years? What happened to Intel GPU support?

In 2021 I was contacted by Intel about the development of ZLUDA. I was an Intel employee at the time. While we were building a case for ZLUDA internally, I was asked for a far-reaching discretion: not to advertise the fact that Intel was evaluating ZLUDA and definitely not to make any commits to the public ZLUDA repo. After some deliberation, Intel decided that there is no business case for running CUDA applications on Intel GPUs.

Shortly thereafter I got in contact with AMD and in early 2022 I have left Intel and signed a ZLUDA development contract with AMD. Once again I was asked for a far-reaching discretion: not to advertise the fact that AMD is evaluating ZLUDA and definitely not to make any commits to the public ZLUDA repo. After two years of development and some deliberation, AMD decided that there is no business case for running CUDA applications on AMD GPUs.

One of the terms of my contract with AMD was that if AMD did not find it fit for further development, I could release it. Which brings us to today.

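This is actually fairly easy to understand. CUDA is, after all, Nvidia's ecosystem. Unless you overtake Nvidia and then define a pile of your own proprietary features (the way Microsoft once played it with IE), you are just propping up someone else's platform.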

Phoronix got access to the software a few days before it was open-sourced, and after those few days of testing gave it a "pretty good" verdict:

Andrzej Janik reached out and provided access to the new ZLUDA implementation for AMD ROCm to allow me to test it out and benchmark it in advance of today's planned public announcement. I've been testing it out for a few days and it's been a positive experience: CUDA-enabled software indeed running atop ROCm and without any changes. Even proprietary renderers and the like working with this "CUDA on Radeon" implementation.

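Also, because some benchmark software reports results back to a server, which would have leaked information during testing, ZLUDA deliberately reported the device name as "Graphics Device". With this open-source release it switches back to the real name: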

In my screenshots and for the past two years of development the exposed device name for Radeon GPUs via CUDA has just been "Graphics Device" rather than the actual AMD Radeon graphics adapter with ROCm. The reason for this has been due to CUDA benchmarks auto-reporting results and other software that may have automated telemetry, to avoid leaking the fact of Radeon GPU use under CUDA, it's been set to the generic "Graphics Device" string. I'm told as part of today's open-sourcing of this ZLUDA on Radeon code that the change will be in place to expose the actual Radeon graphics card string rather than the generic "Graphics Device" concealer.

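The author's own results appear to vary quite a bit from one benchmark to another, but going by the author's way of calculating, overall performance is about on par with the OpenCL version.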

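Phoronix, for its part, ran tests comparing against Nvidia... These used Blender, which supports both Nvidia and AMD cards, and the results are jaw-dropping: the path translated through ZLUDA ran faster than the natively supported one, so there looks to be more to discuss about optimization here. (This is the BMW27 test; the same thing showed up in the Classroom test.)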

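Even so, CUDA on AMD GPUs probably still won't take off. The vendor will try to get each framework to support its hardware natively, and most developers build on top of frameworks; few write everything from scratch themselves...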


Author: Gea-Suan Lin. Posted on February 13, 2024. Categories: Computer, Murmuring, Software. Tags: amd, blender, cuda, driver, intel, nvidia, open, opencl, rocm, source, zluda
