slaren
|
d6e1d44f16
llama : synchronize before get/set session data (#6911)
|
1 year ago |
slaren
|
0ead1f1072
llama : check that all the tensor data is in the model file (#6885)
|
1 year ago |
Georgi Gerganov
|
aa750c1ede
tests : minor bash stuff (#6902)
|
1 year ago |
jiez
|
1966eb2615
quantize : add '--keep-split' to quantize model into shards (#6688)
|
1 year ago |
Douglas Hanley
|
b4e4b8a935
llama : add llama_get_pooling_type function (#6862)
|
1 year ago |
Johannes Gäßler
|
28103f4832
Server: fix seed for multiple slots (#6835)
|
1 year ago |
Tristan Druyen
|
abd3314064
llama : add phi 3 chat template (#6857)
|
1 year ago |
liuwei-git
|
c8297c6af5
llama : add phi3 support (#6852)
|
1 year ago |
Georgi Gerganov
|
8960fe86ae
llama : fix typo in <|im_end|> token text (#6745)
|
1 year ago |
Georgi Gerganov
|
40f74e4d73
llama : add option to render special/control tokens (#6807)
|
1 year ago |
Wouter
|
7dbdba5690
llama : add llama-3 chat template (#6751)
|
1 year ago |
Pedro Cuenca
|
b97bc3966e
llama : support Llama 3 HF conversion (#6745)
|
1 year ago |
nopperl
|
9958c81b79
Implement the OLMo architecture (#6741)
|
1 year ago |
slaren
|
0d56246f4b
ggml : group all experts in a single ggml_mul_mat_id (#6505)
|
1 year ago |
Ren Xuancheng
|
e11b2e6e1e
Qwen2 : assume tied weights if lm_head/output weights is missing (#6738)
|
1 year ago |
slaren
|
c71bfd736e
llama : fix compatibility with old 2 expert models (#6735)
|
1 year ago |
Georgi Gerganov
|
532c1737a1
llama : make general.name optional (#6709)
|
1 year ago |
Ashish
|
dbceec87c0
llama : add StableLM2 12B (#6635)
|
1 year ago |
Shijie
|
f4dea7da18
llama : add qwen2moe (#6074)
|
1 year ago |
Daniel Bevenius
|
4fbd8098e6
gguf : add special tokens metadata for FIM/Infill (#6689)
|
1 year ago |
compilade
|
132f55795e
llama : fix restoring the number of outputs from state files (#6687)
|
1 year ago |
David Renshaw
|
1958f7e06c
llama : add missing kv clear in llama_beam_search (#6664)
|
1 year ago |
Chao Jiang
|
04fbc5f23e
Add Command R chat template (#6650)
|
1 year ago |
Pierrick Hymbert
|
4bd0f93e4a
model: support arch `DbrxForCausalLM` (#6515)
|
1 year ago |
jiez
|
91c736015b
llama : add gguf_remove_key + remove split meta during quantize (#6591)
|
1 year ago |
MasterYi1024
|
dee7f8d692
Correct free memory and total memory. (#6630)
|
1 year ago |
Clint Herron
|
04a5ac211e
Optimization: eliminate addition of redundant stacks when advancing grammar. (#6616)
|
1 year ago |
Olivier Chafik
|
cbaadc9294
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)
|
1 year ago |
Pierrick Hymbert
|
b804b1ef77
eval-callback: Example how to use eval callback for debugging (#6576)
|
1 year ago |
slaren
|
4f407a0a35
llama : add model types for mixtral (#6589)
|
1 year ago |