https://blog.gslin.org/archives/2024/02/13/11655/%E8%AE%93-intelamd-gpu-%E7%9B%B4%E6%8E%A5%E8%B7%91-cuda-%E7%A8%8B%E5%BC%8F%E7%9A%84-zluda/

"Nobody expects the Red Team"

Too many changes to list, but broadly:
* Remove Intel GPU support from the compiler
* Add AMD GPU support to the compiler
* Remove Intel GPU host code
* Add AMD GPU host code
* More device instructions. From 40 to 68
* More host functions. From 48 to 184
* Add proof of concept implementation of OptiX framework
* Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
* Improve ZLUDA launcher for Windows
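To make the "device instructions" and "host functions" items above concrete: in an ordinary CUDA program, the runtime API calls on the host side are the functions a translation layer like ZLUDA has to reimplement on top of ROCm, while the `__global__` kernel body compiles to PTX, whose instructions ZLUDA's compiler has to map to AMD GPU code. The sketch below is just a generic CUDA saxpy for illustration; nothing in it is taken from ZLUDA itself.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Device code: compiles to PTX "device instructions" that a translation
// layer's compiler has to retarget to AMD GPU code.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float* hx = (float*)malloc(bytes);
    float* hy = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Host code: each CUDA runtime call below is one of the "host
    // functions" a CUDA compatibility layer has to provide.
    float *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);
    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);

    printf("y[0] = %f (expected 4.0)\n", hy[0]);

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}
```

The point is that source like this, built for CUDA as usual, is exactly what ZLUDA aims to run unchanged on a Radeon card.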
The twists in the story, and what came after, are honestly hard to sum up... The author originally worked at Intel; after some evaluation, Intel decided the project had no future. AMD then got in touch and sponsored it, but eventually also concluded it was not worth pursuing, so under the contract with AMD, if AMD saw no future in it, the code could be open sourced:

Why is this project suddenly back after 3 years? What happened to Intel GPU support?

In 2021 I was contacted by Intel about the development of ZLUDA. I was an Intel employee at the time. While we were building a case for ZLUDA internally, I was asked for a far-reaching discretion: not to advertise the fact that Intel was evaluating ZLUDA and definitely not to make any commits to the public ZLUDA repo. After some deliberation, Intel decided that there is no business case for running CUDA applications on Intel GPUs. Shortly thereafter I got in contact with AMD and in early 2022 I left Intel and signed a ZLUDA development contract with AMD. Once again I was asked for a far-reaching discretion: not to advertise the fact that AMD is evaluating ZLUDA and definitely not to make any commits to the public ZLUDA repo. After two years of development and some deliberation, AMD decided that there is no business case for running CUDA applications on AMD GPUs. One of the terms of my contract with AMD was that if AMD did not find it fit for further development, I could release it. Which brings us to today.
Phoronix got access to the software a few days before the open sourcing, and after several days of testing gave it a "pretty good" verdict:

Andrzej Janik reached out and provided access to the new ZLUDA implementation for AMD ROCm to allow me to test it out and benchmark it in advance of today's planned public announcement. I've been testing it out for a few days and it's been a positive experience: CUDA-enabled software indeed running atop ROCm and without any changes. Even proprietary renderers and the like working with this "CUDA on Radeon" implementation.
Also, to avoid leaking information through benchmark tools that automatically report results back to a server, ZLUDA deliberately exposed the GPU as "Graphics Device" during testing; with this open-source release it will switch back to the real name:

In my screenshots and for the past two years of development the exposed device name for Radeon GPUs via CUDA has just been "Graphics Device" rather than the actual AMD Radeon graphics adapter with ROCm. The reason for this has been due to CUDA benchmarks auto-reporting results and other software that may have automated telemetry; to avoid leaking the fact of Radeon GPU use under CUDA, it's been set to the generic "Graphics Device" string. I'm told as part of today's open-sourcing of this ZLUDA on Radeon code that the change will be in place to expose the actual Radeon graphics card string rather than the generic "Graphics Device" concealer.
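For reference, the "exposed device name ... via CUDA" mentioned here is simply the string the CUDA runtime returns from its device-properties query. A minimal, generic query program (plain CUDA, not anything ZLUDA-specific) looks like this:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Prints the device name reported by the CUDA runtime. Benchmarks that
// auto-report results typically read this same field, which is why the
// pre-release ZLUDA builds deliberately returned "Graphics Device" here
// instead of the actual Radeon model name.
int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA device visible\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("Device %d: %s\n", i, prop.name);
    }
    return 0;
}
```

Under the pre-release builds described above this would print "Graphics Device" for a Radeon card; according to the quote, the open-sourced code is supposed to expose the actual Radeon model string instead.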
The author's own tests vary quite a lot from one benchmark to another, but by his way of aggregating the numbers, overall performance is roughly on par with the OpenCL backend. Phoronix, meanwhile, ran comparisons against Nvidia using Blender, which natively supports both Nvidia and AMD cards, and the result is startling: rendering through the ZLUDA translation layer came out faster than the native AMD path, which leaves plenty to discuss about where the optimization gap lies. (This was the BMW27 scene; the Classroom scene showed the same thing.) Even so, CUDA over AMD GPUs probably still won't take off: AMD will keep pushing native support into the various frameworks, and most developers build on top of a framework rather than writing GPU code from scratch...
---------------------
【Stable Diffusion】Is AI image generation on AMD cards under Windows finally saved? Tests suggest Windows/ZLUDA efficiency is comparable to Linux/ROCm.
AMD YES! Stable Diffusion can now run at full speed on AMD cards under Windows.
------
【Stable Diffusion】A simple tutorial for running SD with ZLUDA on AMD GPUs under Windows: https://www.bilibili.com/video/BV1MW421N7gP/?share_source=copy_web&vd_source=919509e8ee7f16e2af1f245c5c17b664
I put together a simple tutorial.