momonga
|
c392e5094d
server (webui): Fix Premature Submission During IME Conversion (#11971)
|
11 months ago |
Charles Xu
|
c5d91a7400
ggml-cpu: Add CPU backend support for KleidiAI library (#11390)
|
11 months ago |
Prashant Vithule
|
4806498bf1
ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (#11917)
|
11 months ago |
Michael Engel
|
0d559580a0
run : add --chat-template-file (#11961)
|
11 months ago |
Johannes Gäßler
|
d04e7163c8
doc: add links to ggml examples [no ci] (#11958)
|
11 months ago |
Daniel Bevenius
|
d07c621393
common : add llama.vim preset for Qwen2.5 Coder (#11945)
|
11 months ago |
Georgi Gerganov
|
abd4d0bc4f
speculative : update default params (#11954)
|
11 months ago |
Daniel Bevenius
|
9626d9351a
llama : fix indentation in llama-grammar [no ci] (#11943)
|
11 months ago |
igardev
|
b58934c183
server : (webui) Enable communication with parent html (if webui is in iframe) (#11940)
|
11 months ago |
Olivier Chafik
|
63e489c025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
|
11 months ago |
Xuan-Son Nguyen
|
63ac128563
server : add TEI API format for /rerank endpoint (#11942)
|
11 months ago |
MoonRide303
|
5137da7b8c
scripts: corrected encoding when getting chat template (#11866) (#11907)
|
11 months ago |
xiaobing318
|
09aaf4f1f5
docs : Fix duplicated file extension in test command (#11935)
|
11 months ago |
Johannes Gäßler
|
73e2ed3ce3
CUDA: use async data loading for FlashAttention (#11894)
|
11 months ago |
Eve
|
f7b1116af1
update release requirements (#11897)
|
11 months ago |
Antoine Viallon
|
c4d29baf32
server : fix divide-by-zero in metrics reporting (#11915)
|
11 months ago |
Rémy O
|
2eea03d86a
vulkan: implement several ops relevant for ggml_opt (#11769)
|
11 months ago |
Xuan-Son Nguyen
|
0f2bbe6564
server : bump httplib to 0.19.0 (#11908)
|
11 months ago |
standby24x7
|
fe163d5bf3
common : Fix a typo in help (#11899)
|
11 months ago |
Xuan-Son Nguyen
|
818a340ea8
ci : fix (again) arm64 build fails (#11895)
|
11 months ago |
Jeff Bolz
|
bf42a23d0a
vulkan: support multi/vision rope, and noncontiguous rope (#11902)
|
11 months ago |
Hale Chan
|
c2ea16f260
metal : fix the crash caused by the lack of residency set support on Intel Macs. (#11904)
|
11 months ago |
Johannes Gäßler
|
6dde178248
scripts: fix compare-llama-bench commit hash logic (#11891)
|
11 months ago |
708-145
|
fc10c38ded
examples: fix typo in imatrix/README.md (#11884)
|
11 months ago |
Adrian Kretz
|
22885105a6
metal : optimize dequant q6_K kernel (#11892)
|
11 months ago |
Georgi Gerganov
|
c2cd24fbfd
readme : add notice about new package registry (#11890)
|
11 months ago |
Georgi Gerganov
|
68ff663a04
repo : update links to new url (#11886)
|
11 months ago |
Olivier Chafik
|
f355229692
server: fix type promotion typo causing crashes w/ --jinja w/o tools (#11880)
|
11 months ago |
Rémy O
|
fc1b0d0936
vulkan: initial support for IQ1_S and IQ1_M quantizations (#11528)
|
11 months ago |
Michał Moskal
|
89daa2564f
llguidance build fixes for Windows (#11664)
|
11 months ago |