Author | Commit | Message | Date
Georgi Gerganov | 0d52a69e4b | ci : fix cmake option (#11125) | 1 year ago
Mathieu Baudier | 02f0430141 | Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (#11117) | 1 year ago
ag2s20150909 | bec2183f2c | fix: Vulkan shader gen binary path when Cross-compiling (#11096) | 1 year ago
Johannes Gäßler | 53ff6b9b9f | GGUF: C++ refactor, backend support, misc fixes (#11030) | 1 year ago
Diego Devesa | 017cc5f446 | ggml-backend : only offload from host buffers (fix) (#11124) | 1 year ago
Diego Devesa | a3d50bc022 | ggml-backend : only offload from host buffers (#11120) | 1 year ago
Radoslav Gerganov | a4dd490069 | rpc : code cleanup (#11107) | 1 year ago
Akarshan Biswas | c0d6f790d0 | SYCL: Use get_multi_ptr instead of deprecated get_pointer in wkv6 (#11087) | 1 year ago
Eric Curtin | dc7cef9f37 | llama-run : fix context size (#11094) | 1 year ago
Georgi Gerganov | ecebbd292d | llama : remove unused headers (#11109) | 1 year ago
Xuan Son Nguyen | 96be8c3264 | github : add cmd line field to bug report (#11090) | 1 year ago
Georgi Gerganov | e6e7c75d94 | server : fix extra BOS in infill endpoint (#11106) | 1 year ago
Xuan Son Nguyen | 09186fabbe | llama : remove check flash_attn with lora (#11104) | 1 year ago
Asghar Ghorbani | 96a1dc27c3 | llama : prevent system info string accumulation across calls (#11101) | 1 year ago
Daniel Bevenius | 6369f867a4 | llama : rename missed batch params/vars to ubatch (#10059) | 1 year ago
Georgi Gerganov | 47182dd03f | llama : update llama_model API names (#11063) | 1 year ago
Georgi Gerganov | 3e6e7a6bc2 | tokenize : escape the prompt (#11058) | 1 year ago
Georgi Gerganov | ae2f606bb5 | mmap : fix fileno macro clash (#11076) | 1 year ago
Georgi Gerganov | 727368c60f | llama : use LLAMA_TOKEN_NULL (#11062) | 1 year ago
Georgi Gerganov | 5047dd3546 | llama : use _impl suffix instead of _internal (#11060) | 1 year ago
Johannes Gäßler | 46e3556e01 | CUDA: add BF16 support (#11093) | 1 year ago
0cc4m | b56f079e28 | Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (#11074) | 1 year ago
fairydreaming | 9394bbd484 | llama : Add support for DeepSeek V3 (#11049) | 1 year ago
matt23654 | f922a9c542 | [GGML][RPC] Support for models with non-512-aligned tensors over RPC. (#11047) | 1 year ago
DAN™ | 46be942214 | llama : add support for the cohere2 model architecture (#10900) | 1 year ago
Georgi Gerganov | 78c6785175 | sync : ggml | 1 year ago
Georgi Gerganov | 5e3b08d606 | ggml : do not install metal source when embed library (ggml/1054) | 1 year ago
Daniel Bevenius | db68c93b57 | ggml : improve inputs log sched_print_assignments (ggml/1053) | 1 year ago
Gilad S. | c31fc8b966 | fix: Vulkan shader gen binary path (#11037) | 1 year ago
Molly Sophia | 4b0c638b9a | common : disable KV cache shifting automatically for unsupported models (#11053) | 1 year ago