AMD Releases ROCm 6.3 Including Vision Libraries, Multi-Node FFT, SGLang, and the Fortran Compiler
A redesigned FlashAttention-2 for faster AI training and inference, the addition of multi-node Fast Fourier Transform (FFT) support, a new Fortran compiler, improved computer vision libraries (rocDecode, rocJPEG, and rocAL), and SGLang integration for accelerated AI inference are just a few of the new features and optimizations included in AMD’s latest ROCm 6.3 release.
AMD claims that the SGLang runtime, now supported in ROCm 6.3, is specifically designed to optimize inference for models such as LLMs and VLMs on AMD Instinct GPUs. Python integration and pre-configured ROCm Docker containers promise up to 6x higher throughput and much simpler setup.

Beyond SGLang, ROCm 6.3 brings a new AMD Fortran compiler with direct GPU offloading, backward compatibility, and integration with HIP kernels and ROCm libraries; multi-node FFT support in rocFFT that simplifies scaling FFT workloads across nodes; and improved computer vision libraries, rocDecode, rocJPEG, and rocAL, adding AV1 codec support, GPU-accelerated JPEG decoding, and better audio augmentation.
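As a rough sketch of what the SGLang workflow could look like, the snippet below queries a locally running SGLang server through its OpenAI-compatible REST API from Python. The container launch command, model name, host, and port are illustrative assumptions, not details taken from AMD's announcement; consult the ROCm and SGLang documentation for the exact image tags and flags.

```python
# Minimal sketch: querying a locally running SGLang server from Python.
# Assumes the server was already started (for example inside a ROCm-enabled
# SGLang container) with something like:
#   python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
# The model, host, and port here are assumptions about a local setup.
import requests

response = requests.post(
    "http://localhost:30000/v1/chat/completions",  # SGLang's OpenAI-compatible endpoint
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # hypothetical model choice
        "messages": [
            {"role": "user", "content": "Summarize what ROCm is in one sentence."}
        ],
        "max_tokens": 64,
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, existing clients built around that API can generally be pointed at the SGLang server by changing only the base URL.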
Continuing to embrace its open-source philosophy and adapt to developer needs, AMD says ROCm 6.3 continues to “deliver cutting-edge tools to simplify development while driving better performance and scalability for AI and HPC workloads.” Additional information is available on the AMD ROCm Blogs and the ROCm Documentation Hub.