Pierrick Hymbert
|
b804b1ef77
eval-callback: Example how to use eval callback for debugging (#6576)
|
1 year ago |
Minsoo Cheong
|
64e7b47c69
examples : add "retrieval" (#6193)
|
1 year ago |
Pierrick Hymbert
|
d0d5de42e5
gguf-split: split and merge gguf per batch of tensors (#6135)
|
1 year ago |
DAN™
|
bcebd7dbf6
llama : add support for GritLM (#5959)
|
1 year ago |
John
|
6c00a06692
gguf : add python reader example (#5216)
|
1 year ago |
Abhilash Majumder
|
0f648573dd
ggml : add unified SYCL backend for Intel GPUs (#2690)
|
2 years ago |
Georgi Gerganov
|
4be5ef556d
metal : remove old API (#4919)
|
2 years ago |
Kawrakow
|
326b418b59
Importance Matrix calculation (#4861)
|
2 years ago |
Georgi Gerganov
|
b0034d93ce
examples : add passkey test (#3856)
|
2 years ago |
LeonEricsson
|
7082d24cec
lookup : add prompt lookup decoding example (#4484)
|
2 years ago |
Georgi Gerganov
|
922754a8d6
lookahead : add example for lookahead decoding (#4207)
|
2 years ago |
zakkor
|
2fa02b4b3d
examples : add tokenize (#4039)
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |
M. Yusuf Sarıgöz
|
370359e5ba
examples: support LLaVA v1.5 (multimodal model) (#3436)
|
2 years ago |
Georgi Gerganov
|
8c70a5ff25
batched : add bench tool (#3545)
|
2 years ago |
xaedes
|
0e76a8992c
train : finetune LORA (#2632)
|
2 years ago |
Georgi Gerganov
|
ec893798b7
llama : custom attention mask + parallel decoding + no context swaps (#3228)
|
2 years ago |
Georgi Gerganov
|
47068e5170
speculative : PoC for speeding-up inference via speculative sampling (#2926)
|
2 years ago |
Georgi Gerganov
|
c90d135eb4
examples : fix underscore in beam-search + .gitignore (close #2900)
|
2 years ago |
Matt Pulver
|
c82742ac9c
llama : add llama_beam_search() (#2267)
|
2 years ago |
Georgi Gerganov
|
6381d4e110
gguf : new file format with flexible meta data (beta) (#2398)
|
2 years ago |
slaren
|
097e121e2f
llama : add benchmark example (#2626)
|
2 years ago |
byte-6174
|
b19edd54d5
Adding support for llama2.c models (#2559)
|
2 years ago |
DannyDaemonic
|
3498588e0f
Add --simple-io option for subprocesses and break out console.h and cpp (#1558)
|
2 years ago |
Evan Jones
|
84e09a7d8b
llama : add grammar-based sampling (#1773)
|
2 years ago |
ningshanwutuobang
|
cfa0750bc9
llama : support input embeddings directly (#1910)
|
2 years ago |
Georgi Gerganov
|
051e1b0e6a
llama : fix kv_cache `n` init (close #1903)
|
2 years ago |
xaedes
|
e32089b2c2
train : improved training-from-scratch example (#1652)
|
2 years ago |
Georgi Gerganov
|
ecb217db4f
llama : Metal inference (#1642)
|
2 years ago |
Steward Garcia
|
7e4ea5beff
examples : add server example with REST API (#1443)
|
2 years ago |