Georgi Gerganov | 47068e5170 | speculative : PoC for speeding-up inference via speculative sampling (#2926) | 2 years ago
Georgi Gerganov | c90d135eb4 | examples : fix underscore in beam-search + .gitignore (close #2900) | 2 years ago
Matt Pulver | c82742ac9c | llama : add llama_beam_search() (#2267) | 2 years ago
Georgi Gerganov | 6381d4e110 | gguf : new file format with flexible meta data (beta) (#2398) | 2 years ago
slaren | 097e121e2f | llama : add benchmark example (#2626) | 2 years ago
byte-6174 | b19edd54d5 | Adding support for llama2.c models (#2559) | 2 years ago
DannyDaemonic | 3498588e0f | Add --simple-io option for subprocesses and break out console.h and cpp (#1558) | 2 years ago
Evan Jones | 84e09a7d8b | llama : add grammar-based sampling (#1773) | 2 years ago
ningshanwutuobang | cfa0750bc9 | llama : support input embeddings directly (#1910) | 2 years ago
Georgi Gerganov | 051e1b0e6a | llama : fix kv_cache `n` init (close #1903) | 2 years ago
xaedes | e32089b2c2 | train : improved training-from-scratch example (#1652) | 2 years ago
Georgi Gerganov | ecb217db4f | llama : Metal inference (#1642) | 2 years ago
Steward Garcia | 7e4ea5beff | examples : add server example with REST API (#1443) | 2 years ago
xaedes | f954edda93 | ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360) | 2 years ago
Stephan Walter | f0d70f147d | Various fixes to mat_mul benchmark (#1253) | 2 years ago
xaedes | 0c5692345d | examples : add save_load_state example (#1150) | 2 years ago
unbounded | 62cfc54f77 | Add quantize-stats command for testing quantization (#728) | 2 years ago
Georgi Gerganov | a316a425d0 | Overhaul the examples structure | 2 years ago