
llama.cpp Adds Full CUDA 12 Support — Up to 3x Speedup

Author: RekCore
Explaining everything happening in the AI world in plain, accessible language

llama.cpp CUDA 12 Backend Delivers Massive Performance Gains

The llama.cpp project has merged full CUDA 12 backend support into its mainline branch, unlocking significant performance improvements for NVIDIA GPU users. Benchmarks show up to a 3x speedup for popular models such as LLaMA 3.1, Qwen 2.5, and Mistral when running on the Ada Lovelace and Hopper architectures.
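For readers who want to try the CUDA backend themselves, a minimal build-and-run sketch follows the upstream llama.cpp build instructions. The `GGML_CUDA` CMake flag is the documented way to enable the CUDA backend in current versions (older releases used `LLAMA_CUBLAS`); the model path below is a placeholder.

```shell
# Build llama.cpp with the CUDA backend (requires the CUDA 12 toolkit installed).
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

# Run inference with all layers offloaded to the GPU (-ngl 99).
# "model.gguf" is a placeholder for your quantized model file.
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```

The `-ngl` (number of GPU layers) flag controls how much of the model is offloaded; setting it high offloads everything that fits in VRAM, which is where the CUDA backend's gains show up.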

Key Features