@@ -10,11 +10,11 @@ Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
 
 ### Hot topics
 
+- Collecting Apple Silicon performance stats:
+  - M-series: https://github.com/ggerganov/llama.cpp/discussions/4167
+  - A-series: https://github.com/ggerganov/llama.cpp/discussions/4508
 - Added Mixtral support: https://github.com/ggerganov/llama.cpp/pull/4406
-- **llama.h API change for handling KV cache offloading and data type: https://github.com/ggerganov/llama.cpp/pull/4309**
-- Using `llama.cpp` with AWS instances: https://github.com/ggerganov/llama.cpp/discussions/4225
 - Looking for contributions to improve and maintain the `server` example: https://github.com/ggerganov/llama.cpp/issues/4216
-- Collecting Apple Silicon performance stats: https://github.com/ggerganov/llama.cpp/discussions/4167
 
 ----
 