Georgi Gerganov
|
815fe72adc
sync : ggml
|
1 year ago |
Georgi Gerganov
|
f221d56220
ggml : alloc ggml_contexts on the heap (whisper/2525)
|
1 year ago |
Zhenwei Jin
|
e597e50794
build: fix build error in Windows env with OneAPI setup (#10107)
|
1 year ago |
Diego Devesa
|
85679d37f3
llama : improve output buffer type selection (#10098)
|
1 year ago |
Diego Devesa
|
1e9f94994e
quantize : fix --keep-split (#10114)
|
1 year ago |
Diego Devesa
|
c02e5ab2a6
llama : fix buffer checks for mamba and rwk (#10111)
|
1 year ago |
Zhenwei Jin
|
ab3d71f97f
loader: refactor tensor weights storage (#9935)
|
1 year ago |
Kevin Gibbons
|
0a683e8088
server : include scheme when printing URL (#10106)
|
1 year ago |
Diego Devesa
|
dea5e86051
ggml : check tensor name lengths in gguf files (#10100)
|
1 year ago |
Sergio López
|
1329c0a75e
kompute: add mul_mat_q4_k shader (#10097)
|
1 year ago |
Sergio López
|
61408e7fad
kompute: add backend registry / device interfaces (#10045)
|
1 year ago |
Diego Devesa
|
b9e02e8184
ggml : fix memory leaks when loading invalid gguf files (#10094)
|
1 year ago |
Rich Dougherty
|
6763f713bb
readme : more lora detail in main example readme (#10064)
|
1 year ago |
Rich Dougherty
|
79a2bc042d
convert : more detailed convert lora usage docs (#10065)
|
1 year ago |
xctan
|
fc83a9e584
ggml : add Q4_0_8_8 RISC-V GEMV and GEMM kernels (#10029)
|
1 year ago |
Diego Devesa
|
c5b0f4b5d9
llama : refactor model loader with backend registry (#10026)
|
1 year ago |
Changyeon Kim
|
8f275a7c45
ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
arch-btw
|
61715d5cc8
llama : Add IBM granite template (#10013)
|
1 year ago |
Georgi Gerganov
|
07028f9d74
flake.lock: Update (#10063)
|
1 year ago |
R0CKSTAR
|
524afeec9d
musa: workaround for Guilty Lockup in cleaning src0 (#10042)
|
1 year ago |
Georgi Gerganov
|
8125e6cbfc
server : don't overfill the batch during infill (#10018)
|
1 year ago |
Georgi Gerganov
|
8841ce3f43
llama : switch KQ multiplication to F32 precision by default (#10015)
|
1 year ago |
Georgi Gerganov
|
cc2983d375
sync : ggml
|
1 year ago |
bssrdf
|
8c60a8a462
increase cuda_cpy block size (ggml/996)
|
1 year ago |
Georgi Gerganov
|
9e4a2563ea
scripts : fix amx sync [no ci]
|
1 year ago |
Georgi Gerganov
|
668750357e
metal : support permuted matrix multiplications (#10033)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Michael Podvitskiy
|
d80fb71f8b
llama: string_split fix (#10022)
|
1 year ago |
Srihari-mcw
|
2f8bd2b901
llamafile : extend sgemm.cpp support for Q5_0 models (#10010)
|
1 year ago |