Diego Devesa | e0e912f49b | llama : add option to override model tensor buffers (#11397) | 9 months ago
Georgi Gerganov | e0dbec0bc6 | llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) | 10 months ago
Xuan-Son Nguyen | 7841fc723e | llama : Add Gemma 3 support (+ experimental vision capability) (#12343) | 10 months ago
Georgi Gerganov | bdcf8b6a56 | cont : fix mmap flag print (#11699) | 11 months ago
Georgi Gerganov | ed926d8833 | llama : fix defrag logic (#11707) | 11 months ago
magicse | 333820d749 | llama : fix progress dots (#11730) | 11 months ago
tv1wnd | 855cd0734a | llama : fix old glm4 models (#11670) | 11 months ago
Johannes Gäßler | fd08255d0d | CUDA: non-contiguous (RMS) norm support (#11659) | 11 months ago
piDack | 0cec062a63 | llama : add support for GLM-Edge and GLM-Edge-V series models (#10573) | 11 months ago
Molly Sophia | 325afb370a | llama: fix missing k_cache store for rwkv6qwen2 (#11445) | 11 months ago
Johannes Gäßler | df984e0147 | llama: refactor llama_decode_impl (#11381) | 11 months ago
Frank Mai | 1d8ee06000 | rpc: fix register position (#11424) | 11 months ago
Radoslav Gerganov | 667d72846c | rpc : early register backend devices (#11262) | 1 year ago
Xuan Son Nguyen | 681149ced2 | llama : add `llama_model_load_from_splits` (#11255) | 1 year ago
Johannes Gäßler | 432df2d5f9 | RoPE: fix back, CUDA support for back + noncont. (#11240) | 1 year ago
Georgi Gerganov | afa8a9ec9b | llama : add `llama_vocab`, functions -> methods, naming (#11110) | 1 year ago
Molly Sophia | ee7136c6d1 | llama: add support for QRWKV6 model architecture (#11001) | 1 year ago
Pierrick Hymbert | f8feb4b01a | model: Add support for PhiMoE arch (#11003) | 1 year ago
Xuan Son Nguyen | 4d2b3d8804 | lora : improve compat with `mergekit-extract-lora` (#11131) | 1 year ago
Georgi Gerganov | ecebbd292d | llama : remove unused headers (#11109) | 1 year ago
Xuan Son Nguyen | 09186fabbe | llama : remove check flash_attn with lora (#11104) | 1 year ago
Asghar Ghorbani | 96a1dc27c3 | llama : prevent system info string accumulation across calls (#11101) | 1 year ago
Daniel Bevenius | 6369f867a4 | llama : rename missed batch params/vars to ubatch (#10059) | 1 year ago
Georgi Gerganov | 47182dd03f | llama : update llama_model API names (#11063) | 1 year ago
Georgi Gerganov | 5047dd3546 | llama : use _impl suffix instead of _internal (#11060) | 1 year ago
fairydreaming | 9394bbd484 | llama : Add support for DeepSeek V3 (#11049) | 1 year ago
DAN™ | 46be942214 | llama : add support for the cohere2 model architecture (#10900) | 1 year ago
Georgi Gerganov | f66f582927 | llama : refactor `src/llama.cpp` (#10902) | 1 year ago
Yun Dou | b92a14a841 | llama : support InfiniAI Megrez 3b (#10893) | 1 year ago
ymcki | 6f0c9e034b | llama : support for Llama-3_1-Nemotron-51B (#10669) | 1 year ago