12 days ago
tiny-vLLM: high performance LLM inference engine in C++ and CUDA
tiny-vLLM is an open-source LLM inference engine built with C++ and CUDA. It is available on GitHub for developers to use.