Jaemin Son
|
e689fc4e91
[bug fix] convert github repository_owner to lowercase (#6673)
|
1 year ago |
James A Capozzoli
|
a4ec34e1cd
convert : enable the `--use-temp-file` cli flag (#6645)
|
1 year ago |
Neo Zhang Jianyu
|
de17e3f745
fix memcpy() crash, add missed cmd in guide, fix softmax (#6622)
|
1 year ago |
Johannes Gäßler
|
b5e7285baf
CUDA: fix matrix multiplication logic for tests (#6667)
|
1 year ago |
Pierrick Hymbert
|
4bd0f93e4a
model: support arch `DbrxForCausalLM` (#6515)
|
1 year ago |
Olivier Chafik
|
ab9a3240a9
JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)
|
1 year ago |
slaren
|
fbbc030ba9
metal : unify mul_mv_id kernels (#6556)
|
1 year ago |
Daniel Bevenius
|
4cc120c744
infill : add download instructions for model (#6626)
|
1 year ago |
Pierrick Hymbert
|
24ee66ed0d
server : coherent log output for KV cache full (#6637)
|
1 year ago |
jiez
|
91c736015b
llama : add gguf_remove_key + remove split meta during quantize (#6591)
|
1 year ago |
Rene Leonhardt
|
5c4d767ac0
chore: Fix markdown warnings (#6625)
|
1 year ago |
Georgi Gerganov
|
ef21ce4ccb
imatrix : remove invalid assert (#6632)
|
1 year ago |
MasterYi1024
|
dee7f8d692
Correct free memory and total memory. (#6630)
|
1 year ago |
Pierrick Hymbert
|
81da18e71c
eval-callback: use ggml_op_desc to pretty print unary operator name (#6631)
|
1 year ago |
Georgi Gerganov
|
9ed2737acc
ci : disable Metal for macOS-latest-cmake-x64 (#6628)
|
1 year ago |
Clint Herron
|
04a5ac211e
Optimization: eliminate addition of redundant stacks when advancing grammar. (#6616)
|
1 year ago |
Clint Herron
|
f7001ccc5a
As suggested by @slaren, disabling Metal for test to fix CI build on OSX from #6576 (#6619)
|
1 year ago |
Nikolas
|
a474f50ebb
Refactor Error Handling for CUDA (#6575)
|
1 year ago |
Olivier Chafik
|
cbaadc9294
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)
|
1 year ago |
Hugo Roussel
|
1bbdaf6ecd
ci: download artifacts to release directory (#6612)
|
1 year ago |
Daniel Bevenius
|
f4183afe6a
scripts : add --outdir option to hf.sh (#6600)
|
1 year ago |
Pierrick Hymbert
|
b804b1ef77
eval-callback: Example how to use eval callback for debugging (#6576)
|
1 year ago |
Daniel Bevenius
|
8228b66dbc
gguf : add option to not check tensor data (#6582)
|
1 year ago |
Ralph Soika
|
b3a96f27f0
minor layout improvements (#6572)
|
1 year ago |
slaren
|
4f407a0a35
llama : add model types for mixtral (#6589)
|
1 year ago |
slaren
|
65c64dc36f
convert.py : add consolidated.safetensors for mixtral 8x22b (#6587)
|
1 year ago |
Pierrick Hymbert
|
67fac4b95f
docs : how to add a model (#6565)
|
1 year ago |
Artem Zinnatullin
|
29122d32ac
readme : fix ROCm link (#6579)
|
1 year ago |
sjxx
|
b231b37b09
readme : update UI list (#6560)
|
1 year ago |
Jiří Sejkora
|
ba5e134e07
readme: fix typo in amdgpu target name (#6573)
|
1 year ago |