Sam/Samuel | 3f750f8d76 | metal: add support for opt_step_sgd (#16539) | 3 months ago
Georgi Gerganov | c515fc5771 | ggml : fix scalar path for computing norm (#16558) | 3 months ago
hipudding | f9bc66c3eb | CANN: Update several operators to support FP16 data format (#16251) | 3 months ago
Sam/Samuel | a31cf36ad9 | metal : add opt_step_adamw and op_sum (#16529) | 3 months ago
Pascal | 81d54bbfd5 | webui: remove client-side context pre-check and rely on backend for limits (#16506) | 3 months ago
Neo Zhang Jianyu | c7be9febcb | [SYCL] fix UT fault cases: count-equal, argsort, pad OPs (#16521) | 3 months ago
Mathieu Baudier | 8415f61e23 | ci : add Vulkan on Ubuntu with default packages build (#16532) | 3 months ago
Aldehir Rojas | 2c301e91ab | common : handle unicode during partial json parsing (#16526) | 3 months ago
Georgi Gerganov | 4b2dae383d | common : update presets (#16504) | 3 months ago
sirus20x6 | 41aac5c69b | ggml : Fix FP16 ELU positive branch (#16519) | 3 months ago
Daniel Bevenius | a2fba89a42 | hparams : add check for layer index in is_recurrent (#16511) | 3 months ago
sirus20x6 | 20cc625edc | ggml: Correct SVE implementation in ggml_vec_dot_f16_unroll (#16518) | 3 months ago
Johannes Gäßler | 11f0af5504 | CUDA: faster tile FA, add oob checks, more HSs (#16492) | 3 months ago
Georgi Gerganov | a3cb04744f | metal : fix mul-mm condition + fix mul-mv permuted kernels (#16494) | 3 months ago
Pascal | 4a8fbe0a5e | feat: render user content as markdown option (#16358) | 3 months ago
Yann Follet | 31d0ff1869 | server / ranking : add sorting and management of top_n (#16403) | 3 months ago
Diego Devesa | 97870e6497 | cuda : avoid initializing unused devices (#16510) | 3 months ago
amirai21 | 477a66b035 | convert : correctly handle LLaMA tokenizer for Jamba (#16470) | 3 months ago
Georgi Gerganov | e60f01d941 | server : fix division by zero when reporting stats (#16501) | 3 months ago
Georgi Gerganov | 81086cd6a3 | vocab : mark EOT token for Granite models (#16499) | 3 months ago
Radoslav Gerganov | 68ee98ae18 | server : return HTTP 400 if prompt exceeds context length (#16486) | 3 months ago
Radoslav Gerganov | cdb6da468c | server : log requests to /v1/completions (#16495) | 3 months ago
Prajwal B Mehendarkar | 6d69ab3f26 | cmake : Dont define XOPENSOURCE on AIX (#16481) | 3 months ago
Pascal | 1faa13a118 | webui: updated the chat service to only include max_tokens in the req… (#16489) | 3 months ago
duduta | 1deee0f8d4 | cpu : optimize the ggml NORM operation (#15953) | 3 months ago
Georgi Gerganov | d00cbea63c | server : host-memory prompt caching (#16391) | 3 months ago
Pascal | 8328fd4bae | No markdown in cot (#16483) | 3 months ago
Daniel Bevenius | 56b4795842 | model-conversion : add support for SentenceTransformers (#16387) | 3 months ago
sudhiarm | 2c0d875ae6 | ci: add ARM64 Kleidiai build and test support (#16462) | 3 months ago
Chenguang Li | aa4711d369 | CANN: Improve ACL graph matching (#16166) | 3 months ago