Lukas Straub
|
a9f77a8be3
server : add openai-style logit_bias support (#14946)
|
5 months ago |
Aman Gupta
|
8a4a856277
Add LLaDA 8b Diffusion model (#14771)
|
5 months ago |
hipudding
|
11490b3672
CANN: Improve loading efficiency after converting weights to NZ format. (#14985)
|
5 months ago |
compilade
|
66625a59a5
graph : reduce splits for recurrent and hybrid models (#14825)
|
5 months ago |
lhez
|
6e6725459a
opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (#14809)
|
5 months ago |
Ed Addario
|
e9192bec56
quantize : fix using combined imatrix GGUFs (multiple datasets) (#14973)
|
5 months ago |
Daniel Bevenius
|
41e78c567e
server : add support for `embd_normalize` parameter (#14964)
|
5 months ago |
uvos
|
ad4a700117
HIP: enable mfma mmq on gfx908 and gfx90a for select datatypes and shapes (#14949)
|
5 months ago |
Georgi Gerganov
|
e32a4ec60e
sync : ggml
|
5 months ago |
Kai Pastor
|
e228de9449
cmake : Fix BLAS link interface (ggml/1316)
|
5 months ago |
Kai Pastor
|
73a8e5ca03
vulkan : fix 32-bit builds (ggml/1313)
|
5 months ago |
Johannes Gäßler
|
92b8810ec7
CUDA: skip masked KV slices for all FA kernels (#14924)
|
5 months ago |
Georgi Gerganov
|
00131d6eaf
tests : update for LLAMA_SET_ROWS=1 (#14961)
|
5 months ago |
Georgi Gerganov
|
1e15bfd42c
graph : fix stack-use-after-return (#14960)
|
5 months ago |
Douglas Hanley
|
a118d80233
embeddings: fix extraction of CLS pooling results (#14927)
|
5 months ago |
Xinpeng Dou
|
61550f8231
CANN: update ops docs (#14935)
|
5 months ago |
uvos
|
aa79524c51
HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only AMD targets (#14945)
|
5 months ago |
uvos
|
b77d11179d
HIP: add GGML_HIP_MMQ_MFMA option to allow disabling the MFMA path. (#14930)
|
5 months ago |
uvos
|
c7aa1364fd
HIP: Ignore unsupported unroll transformation in fattn-vec (#14931)
|
5 months ago |
kallewoof
|
1a67fcc306
common : avoid logging partial messages (which can contain broken UTF-8 sequences) (#14937)
|
5 months ago |
hipudding
|
204f2cf168
CANN: Add ggml_set_rows (#14943)
|
5 months ago |
Sigbjørn Skjæret
|
138b288b59
cuda : add softcap fusion (#14907)
|
5 months ago |
Johannes Gäßler
|
bbd0f91779
server-bench: make seed choice configurable (#14929)
|
5 months ago |
Aman Gupta
|
0a5036bee9
CUDA: add roll (#14919)
|
5 months ago |
lhez
|
8ad7b3e65b
opencl : add ops docs (#14910)
|
5 months ago |
Leonard Mosescu
|
bda62193b2
test-backend-ops : extend test case filtering (#14865)
|
5 months ago |
Radoslav Gerganov
|
c556418b60
llama-bench : use local GPUs along with RPC servers (#14917)
|
5 months ago |
xctan
|
db16e2831c
ggml-cpu : deduplicate scalar implementations (#14897)
|
5 months ago |
Akarshan Biswas
|
cd1fce6d4f
SYCL: Add set_rows support for quantized types (#14883)
|
5 months ago |
Xuan-Son Nguyen
|
00fa15fedc
mtmd : add support for Voxtral (#14862)
|
5 months ago |