Christian Fillion
|
7ee953a64a
llama : add llama_sampler_init for safe usage of llama_sampler_free (#11727)
|
11 months ago |
Akarshan Biswas
|
ec3bc8270b
SYCL: remove XMX info from print devices (#11712)
|
11 months ago |
Daniel Bevenius
|
b7552cfcbc
common : add default embeddings presets (#11677)
|
11 months ago |
Jinyang He
|
225bbbfa39
ggml : optimize and build warning fix for LoongArch (#11709)
|
11 months ago |
tv1wnd
|
855cd0734a
llama : fix old glm4 models (#11670)
|
11 months ago |
Georgi Gerganov
|
8a59053f63
sync : ggml
|
11 months ago |
Patrick Peng
|
1d20e53c40
rpc: fix known RCE in rpc-server (ggml/1103)
|
11 months ago |
Xuan-Son Nguyen
|
2fb3c32a16
server : (webui) migrate project to ReactJS with typescript (#11688)
|
11 months ago |
Tei Home
|
9ab42dc722
docs: update fedora cuda guide for 12.8 release (#11393)
|
11 months ago |
Akarshan Biswas
|
194b2e69f8
SYCL: Adjust support condition for norm operators (#11674)
|
11 months ago |
Georgi Gerganov
|
9dd7a0390f
llama : add log about loading model tensors (#11699)
|
11 months ago |
Adrien Gallouët
|
c0d4843225
build : fix llama.pc (#11658)
|
11 months ago |
junchao-zhao
|
8d4d2be143
ggml : fix LoongArch compile error with 128-bit SIMD (#11701)
|
11 months ago |
Jeff Bolz
|
2c6c8df56d
vulkan: optimize coopmat2 iq2/iq3 callbacks (#11521)
|
11 months ago |
Rémy O
|
8a7e3bf17a
vulkan: initial support for IQ4_XS quantization (#11501)
|
11 months ago |
Jeff Bolz
|
1b598b3058
vulkan: use smaller combined allocations to avoid fragmentation (#11551)
|
11 months ago |
Charles Duffy
|
902368a06b
metal : avoid breaking build when metal API predates TARGET_OS_VISION (#11690)
|
11 months ago |
Matvey Soloviev
|
c3db0480bb
readme : add link to Autopen under UIs (#11684)
|
11 months ago |
Georgi Gerganov
|
d774ab3acc
metal : adjust support conditions for norm operators (#11671)
|
11 months ago |
Johannes Gäßler
|
fa62da9b2d
CUDA: support for mat. mul. with ne03 != ne13 (#11656)
|
11 months ago |
SAMI
|
1ec208083c
llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644)
|
11 months ago |
Olivier Chafik
|
9f4cc8f8d3
`sync`: minja (#11641)
|
11 months ago |
Johannes Gäßler
|
fd08255d0d
CUDA: non-contiguous (RMS) norm support (#11659)
|
11 months ago |
fxzjshm
|
3ec9fd4b77
HIP: force max threads per block to be 1024 (#11621)
|
11 months ago |
Xuan-Son Nguyen
|
3962fc1a79
server : add try..catch to places not covered by set_exception_handler (#11620)
|
11 months ago |
Radoslav Gerganov
|
1bef571f6a
arg : list RPC devices first when using --list-devices (#11655)
|
11 months ago |
Olivier Chafik
|
db288b60cb
`tool-call`: command r7b fix for normal responses (#11608)
|
11 months ago |
Shelby Jenkins
|
106045e7bb
readme : add llm_client Rust crate to readme bindings (#11628)
|
11 months ago |
Jhen-Jie Hong
|
f117d84b48
swift : fix llama-vocab api usage (#11645)
|
11 months ago |
Jhen-Jie Hong
|
534c46b53c
metal : use residency set for other platforms (#11648)
|
11 months ago |