Georgi Gerganov
|
51543729ff
ggml : fix redefinition of vaddvq_f32 for 32-bit ARM (#6906)
|
1 year ago |
Daniel Bevenius
|
4ab99d8d47
clip : rename lerp function to avoid conflict (#6894)
|
1 year ago |
Georgi Gerganov
|
54770413c4
ggml : fix MIN / MAX macros (#6904)
|
1 year ago |
Georgi Gerganov
|
aa750c1ede
tests : minor bash stuff (#6902)
|
1 year ago |
jiez
|
1966eb2615
quantize : add '--keep-split' to quantize model into shards (#6688)
|
1 year ago |
Johannes Gäßler
|
784e11dea1
README: add graphic for matrix multiplication (#6881)
|
1 year ago |
Douglas Hanley
|
b4e4b8a935
llama : add llama_get_pooling_type function (#6862)
|
1 year ago |
mgroeber9110
|
3fe847b574
server : do not apply Markdown formatting in code sections (#6850)
|
1 year ago |
Kyle Mistele
|
37246b1031
common : revert showing control tokens by default for server (#6860)
|
1 year ago |
Johannes Gäßler
|
28103f4832
Server: fix seed for multiple slots (#6835)
|
1 year ago |
Georgi Gerganov
|
c0d1b3e03e
ggml : move 32-bit arm compat in ggml-impl.h (#6865)
|
1 year ago |
Tristan Druyen
|
abd3314064
llama : add phi 3 chat template (#6857)
|
1 year ago |
Junyang Lin
|
3fec68be4e
convert : add support of codeqwen due to tokenizer (#6707)
|
1 year ago |
liuwei-git
|
c8297c6af5
llama : add phi3 support (#6852)
|
1 year ago |
Anas Ahouzi
|
4e96a812b3
[SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 flag activated (#6767)
|
1 year ago |
Justine Tunney
|
192090bae4
llamafile : improve sgemm.cpp (#6796)
|
1 year ago |
Dave Airlie
|
e931888d50
ggml : fix calloc argument ordering. (#6820)
|
1 year ago |
Georgi Gerganov
|
8960fe86ae
llama : fix typo in <|im_end|> token text (#6745)
|
1 year ago |
Pierrick Hymbert
|
c0956b09ba
ci: fix job are cancelling each other (#6781)
|
1 year ago |
github-actions[bot]
|
e9b4a1bf68
flake.lock: Update
|
1 year ago |
Olivier Chafik
|
5cf5e7d490
`build`: generate hex dump of server assets during build (#6661)
|
1 year ago |
Georgi Gerganov
|
40f74e4d73
llama : add option to render special/control tokens (#6807)
|
1 year ago |
Georgi Gerganov
|
b9cc76d87e
ggml : fix ggml_backend_cpu_supports_op() for CPY (#0)
|
1 year ago |
Wouter
|
7dbdba5690
llama : add llama-3 chat template (#6751)
|
1 year ago |
pmysl
|
c1386c936e
gguf-py : add IQ1_M to GGML_QUANT_SIZES (#6761)
|
1 year ago |
Jan Boon
|
e8d35f47cb
doc : add link to falcon (#6789)
|
1 year ago |
Mohammadreza Hendiani
|
2cca09d509
readme : add Fedora instructions (#6783)
|
1 year ago |
Justine Tunney
|
89b0bf0d5d
llava : use logger in llava-cli (#6797)
|
1 year ago |
Pedro Cuenca
|
b97bc3966e
llama : support Llama 3 HF conversion (#6745)
|
1 year ago |
Jan Boon
|
b8109bc013
doc : server tests require llama to be built with curl enabled (#6788)
|
1 year ago |