Georgi Gerganov | d9d54e498d | speculative : refactor and add a simpler example (#10362) | 1 year ago
Xuan Son Nguyen | cda0e4b648 | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 1 year ago
Diego Devesa | 7eee341bee | common : use common_ prefix for common library functions (#9805) | 1 year ago
Georgi Gerganov | 6262d13e0b | common : reimplement logging (#9418) | 1 year ago
Georgi Gerganov | 0abc6a2c25 | llama : llama_perf + option to disable timings during decode (#9355) | 1 year ago
Xuan Son Nguyen | bfe76d4a17 | common : move arg parser code to `arg.cpp` (#9388) | 1 year ago
Xuan Son Nguyen | 1b9ae5189c | common : refactor arg parser (#9308) | 1 year ago
Georgi Gerganov | df270ef745 | llama : refactor sampling v2 (#9294) | 1 year ago
Liu Jia | 0a4ce78681 | common : Changed tuple to struct (TODO fix) (#8823) | 1 year ago
Georgi Gerganov | 1442677f92 | common : refactor cli arg parsing (#7675) | 1 year ago
Georgi Gerganov | 6ff13987ad | common : normalize naming style (#7462) | 1 year ago
Pedro Cuenca | b97bc3966e | llama : support Llama 3 HF conversion (#6745) | 1 year ago
compilade | 557410b8f0 | llama : greatly reduce output buffer memory usage (#6122) | 1 year ago
compilade | c2101a2e90 | llama : support Mamba Selective State Space Models (#5328) | 1 year ago
bmwl | f486f6e1e5 | ggml : add numa options (#5377) | 1 year ago
Georgi Gerganov | 6b0a7420d0 | llama : KV cache view API + better KV cache management (#4170) | 2 years ago
Daniel Bevenius | 9d5949f04b | examples : fix typo in parallel example doc comment (#4181) | 2 years ago
cebtenzzre | b12fa0d1c1 | build : link against build info instead of compiling against it (#3879) | 2 years ago
Marcus Dunn | 5be6c803fa | llama : remove token functions with `context` args in favor of `model` (#3720) | 2 years ago
Georgi Gerganov | d1031cf49c | sampling : refactor init to use llama_sampling_params (#3696) | 2 years ago
Georgi Gerganov | 0e89203b51 | speculative : add tree-based sampling example (#3624) | 2 years ago
Kerfuffle | 70c29da118 | common : fix mirostat state when using multiple sequences (#3543) | 2 years ago
Georgi Gerganov | fcca0a7004 | refact : fix convert script + zero out KV cache to avoid nans (#3523) | 2 years ago
pudepiedj | a8777ad84e | parallel : add option to load external prompt file (#3416) | 2 years ago
Georgi Gerganov | ac2219fef3 | llama : fix session saving/loading (#3400) | 2 years ago
slaren | 16bc66d947 | llama.cpp : split llama_context_params into model and context params (#3301) | 2 years ago
Georgi Gerganov | ec893798b7 | llama : custom attention mask + parallel decoding + no context swaps (#3228) | 2 years ago