Georgi Gerganov
|
666867b799
ggml : fix llamafile sgemm wdata offsets (#6710)
|
1 year ago |
Justine Tunney
|
8cc91dc63c
ggml : add llamafile sgemm (#6414)
|
1 year ago |
Ashish
|
dbceec87c0
llama : add StableLM2 12B (#6635)
|
1 year ago |
Shijie
|
f4dea7da18
llama : add qwen2moe (#6074)
|
1 year ago |
Daniel Bevenius
|
8a56075b07
gritlm : add --outdir option to hf.sh script (#6699)
|
1 year ago |
Georgi Gerganov
|
58227ffdeb
perplexity : require positive --ctx-size arg (#6695)
|
1 year ago |
Daniel Bevenius
|
4fbd8098e6
gguf : add special tokens metadata for FIM/Infill (#6689)
|
1 year ago |
Olivier Chafik
|
7593639ce3
`main`: add --json-schema / -j flag (#6659)
|
1 year ago |
compilade
|
132f55795e
llama : fix restoring the number of outputs from state files (#6687)
|
1 year ago |
Pierrick Hymbert
|
3272896d79
server : revert "minor layout improvements" (#6684)
|
1 year ago |
Steven Prichard
|
7fc16a2c32
swift : linux support (#6590)
|
1 year ago |
Neo Zhang Jianyu
|
17e98d4c96
fix mul_mat_id() for new input, make the ut pass (#6682)
|
1 year ago |
David Renshaw
|
1958f7e06c
llama : add missing kv clear in llama_beam_search (#6664)
|
1 year ago |
Chao Jiang
|
04fbc5f23e
Add Command R chat template (#6650)
|
1 year ago |
Georgi Gerganov
|
f184dd9208
flake.lock: Update (#6669)
|
1 year ago |
Dave
|
422c2aff1c
Added support for GGML_OP_CLAMP in Metal (#6662)
|
1 year ago |
Sigbjørn Skjæret
|
8800226d65
Fix --split-max-size (#6655)
|
1 year ago |
Jaemin Son
|
e689fc4e91
[bug fix] convert github repository_owner to lowercase (#6673)
|
1 year ago |
James A Capozzoli
|
a4ec34e1cd
convert : enable the `--use-temp-file` cli flag (#6645)
|
1 year ago |
Neo Zhang Jianyu
|
de17e3f745
fix memcpy() crash, add missed cmd in guide, fix softmax (#6622)
|
1 year ago |
Johannes Gäßler
|
b5e7285baf
CUDA: fix matrix multiplication logic for tests (#6667)
|
1 year ago |
Pierrick Hymbert
|
4bd0f93e4a
model: support arch `DbrxForCausalLM` (#6515)
|
1 year ago |
Olivier Chafik
|
ab9a3240a9
JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)
|
1 year ago |
slaren
|
fbbc030ba9
metal : unify mul_mv_id kernels (#6556)
|
1 year ago |
Daniel Bevenius
|
4cc120c744
infill : add download instructions for model (#6626)
|
1 year ago |
Pierrick Hymbert
|
24ee66ed0d
server : coherent log output for KV cache full (#6637)
|
1 year ago |
jiez
|
91c736015b
llama : add gguf_remove_key + remove split meta during quantize (#6591)
|
1 year ago |
Rene Leonhardt
|
5c4d767ac0
chore: Fix markdown warnings (#6625)
|
1 year ago |
Georgi Gerganov
|
ef21ce4ccb
imatrix : remove invalid assert (#6632)
|
1 year ago |
MasterYi1024
|
dee7f8d692
Correct free memory and total memory. (#6630)
|
1 year ago |