Guillaume Wenzek
|
5f66ebca9c
ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639)
|
2 years ago |
Justin Parker
|
f2eb19bd8b
server : throw an error when `slot unavailable` (#4741)
|
2 years ago |
Georgi Gerganov
|
f3f62f0d83
metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725)
|
2 years ago |
Phil H
|
0ef3ca2ac6
server : add token counts to html footer (#4738)
|
2 years ago |
Georgi Gerganov
|
540938f890
llama : llama_model_desc print number of experts
|
2 years ago |
Marcus Dunn
|
0040d42eeb
llama : replace all API facing `int`'s with `int32_t` (#4577)
|
2 years ago |
postmasters
|
83e633c27e
llama : differentiate the KV dims in the attention (#4657)
|
2 years ago |
Georgi Gerganov
|
32866c5edd
editorconfig : fix whitespace and indentation #4710
|
2 years ago |
minarchist
|
5d7002d437
server : add --override-kv parameter (#4710)
|
2 years ago |
Nam D. Tran
|
26f3071d71
py : re-enable mmap in convert hf (#4732)
|
2 years ago |
Daniel Bevenius
|
775ac8712a
finetune: fix typo in README.md (#4733)
|
2 years ago |
Georgi Gerganov
|
58ba655af0
metal : enable shader debugging (cmake option) (#4705)
|
2 years ago |
Someone Serge
|
edd1ab7bc3
flake.lock: update
|
2 years ago |
Someone Serge
|
198ed7ebfc
flake.nix: suggest the binary caches
|
2 years ago |
Someone Serge
|
d836174731
workflows: nix-ci: add a qemu job for jetsons
|
2 years ago |
Someone Serge
|
06f2a5d190
workflows: nix-flakestry: drop tag filters
|
2 years ago |
Someone Serge
|
c5239944ba
workflows: weekly `nix flake update`
|
2 years ago |
Someone Serge
|
1e9ae54cf2
workflows: nix-ci: add a job for eval
|
2 years ago |
Someone Serge
|
7adedecbe3
workflows: nix-ci: init; build flake outputs
|
2 years ago |
Someone Serge
|
356ea17e0f
flake.nix: expose checks
|
2 years ago |
Someone Serge
|
a5c088d8c6
flake.nix: rocm not yet supported on aarch64, so hide the output
|
2 years ago |
Someone Serge
|
1e3900ebac
flake.nix: expose full scope in legacyPackages
|
2 years ago |
Georgi Gerganov
|
e39106c055
ggml : add ggml_vdotq_s32 alias (#4715)
|
2 years ago |
Georgi Gerganov
|
9fbda719de
clip : refactor + bug fixes (#4696)
|
2 years ago |
Johannes Gäßler
|
39d8bc71ed
CUDA: fixed tensor cores not being used on RDNA3 (#4697)
|
2 years ago |
automaticcat
|
24a447e20a
ggml : add ggml_cpu_has_avx_vnni() (#4589)
|
2 years ago |
Johannes Gäßler
|
a20f3c7465
CUDA: fix tensor core logic for Pascal and HIP (#4682)
|
2 years ago |
Georgi Gerganov
|
0235b9b571
clip : use ggml_backend_buffer_is_host (#4205)
|
2 years ago |
Steward Garcia
|
ce18d727a4
clip : enable gpu backend (#4205)
|
2 years ago |
hydai
|
91bb39cec7
cuda: fix vmm oom issue on NVIDIA AGX Orin (#4687)
|
2 years ago |