Xuan Son Nguyen | c421ac072d | lora : warn user if new token is added in the adapter (#9948) | 1 year ago
Molly Sophia | 4ff7fe1fb3 | llama : add chat template for RWKV-World + fix EOT (#9968) | 1 year ago
leo-pony | 6b8447352d | [CANN] Adapt to dynamically loadable backends mechanism (#9970) | 1 year ago
Daniel Bevenius | 674804a996 | arg : fix typo in embeddings argument help [no ci] (#9994) | 1 year ago
Georgi Gerganov | e94a138d64 | llama.vim : fix info text display [no ci] (#9787) | 1 year ago
Georgi Gerganov | e01c67affe | llama.vim : move info to the right of screen [no ci] (#9787) | 1 year ago
Asghar Ghorbani | 994cfb1acb | readme : update UI list (#9972) | 1 year ago
Daniel Bevenius | 94008cc760 | arg : fix attention non-causal arg value hint (#9985) | 1 year ago
Georgi Gerganov | dbd5f2f573 | llama.vim : plugin for Neovim (#9787) | 1 year ago
Georgi Gerganov | f594bc80ba | ggml : add asserts for type conversion in fattn kernels (#9971) | 1 year ago
Radoslav Gerganov | d5ebd79c76 | rpc : pack only RPC structs (#9959) | 1 year ago
Georgi Gerganov | 55e47786e3 | llama : default sampling changes + greedy update (#9897) | 1 year ago
Georgi Gerganov | bc21975084 | speculative : fix handling of some input params (#9963) | 1 year ago
Neo Zhang Jianyu | 1db8c84fc6 | fix mul_mat_vec_q and *_vec_q error (#9939) | 1 year ago
Loïc Carrère | 45f097645e | readme : update bindings list (#9951) | 1 year ago
icppWorld | 7cab2083c7 | readme : update infra list (#9942) | 1 year ago
Xuan Son Nguyen | cda0e4b648 | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 1 year ago
Radoslav Gerganov | afd9909a64 | rpc : backend refactoring (#9912) | 1 year ago
Ouadie EL FAROUKI | 87421a23e8 | [SYCL] Add SYCL Backend registry, device and Event Interfaces (#9705) | 1 year ago
Ma Mingfei | 60ce97c9d8 | add amx kernel for gemm (#8998) | 1 year ago
Georgi Gerganov | 8901755ba3 | server : add n_indent parameter for line indentation requirement (#9929) | 1 year ago
Daniel Bevenius | 6f55bccbb8 | llama : rename batch_all to batch (#8881) | 1 year ago
Georgi Gerganov | 17bb928080 | readme : remove --memory-f32 references (#9925) | 1 year ago
Georgi Gerganov | 9f45fc1e99 | llama : change warning to debug log | 1 year ago
Georgi Gerganov | 99bd4ac28c | llama : infill sampling handle very long tokens (#9924) | 1 year ago
Tim Wang | 3752217ed5 | readme : update bindings list (#9918) | 1 year ago
Diego Devesa | f010b77a37 | vulkan : add backend registry / device interfaces (#9721) | 1 year ago
Gilad S. | 2194200278 | fix: allocating CPU buffer with size `0` (#9917) | 1 year ago
Gilad S. | 73afe681aa | fix: use `vm_allocate` to allocate CPU backend buffer on macOS (#9875) | 1 year ago
Daniel Bevenius | 9e04102448 | llama : suppress conversion from 'size_t' to 'int' (#9046) | 1 year ago