Georgi Gerganov | 196f5083ef | common : more accurate sampling timing (#17382) | 2 months ago
o7si | 5088b435d4 | convert : fix TypeError when loading base model remotely in convert_lora_to_gguf (#17385) | 2 months ago
Piotr Wilkin (ilintar) | 845f200b28 | ggml : Fix transposed SOLVE_TRI result (#17323) | 2 months ago
Scott Fudally | a7784a8b1d | DGX Spark: UMA support (#17368) | 2 months ago
Adrien Gallouët | 79bb743512 | ggml : remove useless and error-prone variadic macros (#17399) | 2 months ago
sudhiarm | 3ae282a06f | kleidiai: fix zero-size array declaration (#17240) | 2 months ago
ixgbe | 5be353ec4a | ggml-cpu:add RISC-V RVV (Zvfh) optimization for FP16 vector scaling (#17314) | 2 months ago
Giuseppe Scrivano | 7d77f07325 | vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP, ROUND, CEIL, FLOOR, TRUNC (#17319) | 2 months ago
Jeff Bolz | 1fa4551af0 | vulkan: support larger argsort (#17313) | 2 months ago
Jeff Bolz | 2eba631b81 | vulkan: Add copy_transpose shader (#17371) | 2 months ago
Aleksander Grygier | 99c53d6558 | webui: Add a "Continue" Action for Assistant Message (#16971) | 2 months ago
Sigbjørn Skjæret | 07b0e7a5ac | convert : use self.block_count everywhere instead of reading hparams (#17359) | 2 months ago
Aman Gupta | fd7353d5eb | cuda: fix rope fusion for gemma3 (#17378) | 2 months ago
Piotr Wilkin (ilintar) | 6fd4f95367 | Fix too relaxed check on CUDA "fast copy" (can_be_transposed) condition (#17332) | 2 months ago
Ruben Ortlam | 980b7cd17e | vulkan: force full subgroups for flash attention to fix intel subgroup crash (#17356) | 2 months ago
Jeremy Rand | c49daff5ba | ggml-cpu: Don't pass -mpowerpc64 when -mcpu already implies it (#17308) | 2 months ago
Xuan-Son Nguyen | 10e9780154 | chat: fix int overflow, prevent size calculation in float/double (#17357) | 2 months ago
Haiyue Wang | a045492088 | vocab : call reserve() for building plamo-2-translate suffix (#17343) | 2 months ago
hksdpc255 | 1920345c3b | common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) | 2 months ago
jiahao su | 561a3e2788 | ci : change the openEuler-310p image to fix release (#17361) | 2 months ago
Georgi Gerganov | f40a2e5f11 | gitignore : be more specific about ignored stuff (#17354) | 2 months ago
Chenguang Li | bc4064cfea | CANN: fix acl_tensor_ptr usage in ASCEND_310P ROPE (#17347) | 2 months ago
o7si | 97cb3fd5ae | fix: resolve undefined variable 'svr' compilation error (#17348) | 2 months ago
jiahao su | ffa277a54c | CANN: Add openEuler-cann in build and release (#17192) | 2 months ago
Jeff Bolz | da95bf2a85 | vulkan: support noncontig i32 copy (#17328) | 2 months ago
Xuan-Son Nguyen | 0de8878c96 | server: split HTTP into its own interface (#17216) | 2 months ago
Ruben Ortlam | 38e2c1b412 | vulkan: add log RTE support to fix Nvidia CI (#17320) | 2 months ago
Adrien Gallouët | cb44fc84e8 | cmake : fix ARM feature verification (#17170) | 2 months ago
Adrien Gallouët | cb623de3fc | ggml : add missing AVX512 feature checks (#17270) | 2 months ago
Georgi Gerganov | 7aaeedc098 | metal : support I32 -> I32 copy (#17317) | 2 months ago