slaren | e2764cd7ca | gguf : fix mismatch between alloc and free functions (#6929) | 1 year ago
Justine Tunney | 4b1c3c98b4 | llamafile : use 64-bit integers in sgemm (#6928) | 1 year ago
Pierrick Hymbert | bbe3c6e761 | ci: server: fix python installation (#6925) | 1 year ago
Pierrick Hymbert | 7f5ff558ee | server: stop generation at `n_ctx_train` if `n_predict` is not set (#6638) | 1 year ago
Pierrick Hymbert | 9e4e077ec5 | ci: server: fix python installation (#6922) | 1 year ago
Georgi Gerganov | 83b72cb086 | Merge pull request from GHSA-p5mv-gjc5-mwqv | 1 year ago
Pierrick Hymbert | d4a9afc100 | ci: server: fix python installation (#6918) | 1 year ago
Pierrick Hymbert | 7d641c26ac | ci: fix concurrency for pull_request_target (#6917) | 1 year ago
Pierrick Hymbert | 5790c8dac1 | bench: server add stop word for PHI-2 (#6916) | 1 year ago
vik | 46e12c4692 | llava : add support for moondream vision language model (#6899) | 1 year ago
Georgi Gerganov | dba497e0c1 | cmake : restore LLAMA_LLAMAFILE_DEFAULT | 1 year ago
Georgi Gerganov | fa0b4ad252 | cmake : remove obsolete ANDROID check | 1 year ago
slaren | d6e1d44f16 | llama : synchronize before get/set session data (#6911) | 1 year ago
Georgi Gerganov | 853d06ffe2 | ci : tmp disable slow tests | 1 year ago
BarfingLemurs | 3fe0596c18 | readme : update model list (#6908) | 1 year ago
slaren | 0ead1f1072 | llama : check that all the tensor data is in the model file (#6885) | 1 year ago
Georgi Gerganov | 51543729ff | ggml : fix redefinition of vaddvq_f32 for 32-bit ARM (#6906) | 1 year ago
Daniel Bevenius | 4ab99d8d47 | clip : rename lerp function to avoid conflict (#6894) | 1 year ago
Georgi Gerganov | 54770413c4 | ggml : fix MIN / MAX macros (#6904) | 1 year ago
Georgi Gerganov | aa750c1ede | tests : minor bash stuff (#6902) | 1 year ago
jiez | 1966eb2615 | quantize : add '--keep-split' to quantize model into shards (#6688) | 1 year ago
Johannes Gäßler | 784e11dea1 | README: add graphic for matrix multiplication (#6881) | 1 year ago
Douglas Hanley | b4e4b8a935 | llama : add llama_get_pooling_type function (#6862) | 1 year ago
mgroeber9110 | 3fe847b574 | server : do not apply Markdown formatting in code sections (#6850) | 1 year ago
Kyle Mistele | 37246b1031 | common : revert showing control tokens by default for server (#6860) | 1 year ago
Johannes Gäßler | 28103f4832 | Server: fix seed for multiple slots (#6835) | 1 year ago
Georgi Gerganov | c0d1b3e03e | ggml : move 32-bit arm compat in ggml-impl.h (#6865) | 1 year ago
Tristan Druyen | abd3314064 | llama : add phi 3 chat template (#6857) | 1 year ago
Junyang Lin | 3fec68be4e | convert : add support of codeqwen due to tokenizer (#6707) | 1 year ago
liuwei-git | c8297c6af5 | llama : add phi3 support (#6852) | 1 year ago