Georgi Gerganov
|
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
|
10 months ago |
Ishaan Gandhi
|
2048b5913d
server : fix crash when using verbose output with input tokens that are not in printable range (#12178) (#12338)
|
10 months ago |
Oscar Barenys
|
f08f4b3187
Update build.yml for Windows Vulkan builder to use Vulkan 1.4.304 SDK for VK_NV_cooperative_matrix2 support (#12301)
|
10 months ago |
Daniel Bevenius
|
80a02aa858
llama.swiftui : fix xcframework dir in README [no ci] (#12353)
|
10 months ago |
Alberto Cabrera Pérez
|
363f8c5d67
sycl : variable sg_size support for mmvq kernels (#12336)
|
10 months ago |
uvos
|
34c961b181
CUDA/HIP: Fix fattn-vec-* when device warp size is not 32 (#12315)
|
10 months ago |
Xuan-Son Nguyen
|
7841fc723e
llama : Add Gemma 3 support (+ experimental vision capability) (#12343)
|
10 months ago |
Jeff Bolz
|
bf69cfe62f
vulkan: fix bug in coopmat1 mul_mat_id (#12316)
|
10 months ago |
uvos
|
10f2e81809
CUDA/HIP: refractor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177)
|
10 months ago |
jklincn
|
ba7654380a
ggml-backend : fix backend search path (#12330)
|
10 months ago |
BB-fat
|
6ab2e4765a
metal : Cache the Metal library at the device context level (#12265)
|
10 months ago |
Xuan-Son Nguyen
|
96e1280839
clip : bring back GPU support (#12322)
|
10 months ago |
Eve
|
2c9f833d17
mat vec double buffer (#12188)
|
10 months ago |
R0CKSTAR
|
251364549f
musa: support new arch mp_31 and update doc (#12296)
|
10 months ago |
Henry Linjamäki
|
8acdacb3ea
opencl: use OpenCL C standard supported by the device (#12221)
|
10 months ago |
John Bean
|
89b2b56e86
readme: added Sidekick to available UIs (#12311)
|
10 months ago |
Georgi Gerganov
|
e128a1bf5b
tests : fix test-quantize-fns to init the CPU backend (#12306)
|
10 months ago |
marcoStocchi
|
6ef79a67ca
common : refactor '-o' option (#12278)
|
10 months ago |
Olivier Chafik
|
4e39a3c332
`server`: extract <think> tags from qwq outputs (#12297)
|
10 months ago |
Olivier Chafik
|
be421fc429
`tool-call`: ensure there's always a non-empty tool call id (#12292)
|
10 months ago |
Olivier Chafik
|
87c2630546
allow missing content in message if tool_calls provided (#12293)
|
10 months ago |
Olivier Chafik
|
2b3a25c212
`sampler`: fixes trigger tokens + lazy grammars (fix typo cast from token to string) (#12291)
|
10 months ago |
tc-mb
|
8352cdc87b
llava : fix bug in minicpm-v code (#11513)
|
10 months ago |
Georgi Gerganov
|
1e2f78a004
server : add speculative decoding presets for FIM (#12287)
|
10 months ago |
Georgi Gerganov
|
0fd7ca7a21
authors : update (#12271)
|
10 months ago |
Jason C.H
|
6fefc05a7a
ggml-backend : make path_str compatible with C++20 (#12269)
|
10 months ago |
Georgi Gerganov
|
7ab364390f
server : infill gen ends on new line (#12254)
|
10 months ago |
Daniel Bevenius
|
7c7f3b7f43
ggml : skip intermediate .air file when compiling .metallib (#12247)
|
10 months ago |
Georgi Gerganov
|
102ac1891d
sync : ggml
|
10 months ago |
vmobilis
|
d6ae2fa061
ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)
|
10 months ago |