Diego Devesa
|
b9e02e8184
ggml : fix memory leaks when loading invalid gguf files (#10094)
|
1 year ago |
Rich Dougherty
|
6763f713bb
readme : more lora detail in main example readme (#10064)
|
1 year ago |
Rich Dougherty
|
79a2bc042d
convert : more detailed convert lora usage docs (#10065)
|
1 year ago |
xctan
|
fc83a9e584
ggml : add Q4_0_8_8 RISC-V GEMV and GEMM kernels (#10029)
|
1 year ago |
Diego Devesa
|
c5b0f4b5d9
llama : refactor model loader with backend registry (#10026)
|
1 year ago |
Changyeon Kim
|
8f275a7c45
ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
arch-btw
|
61715d5cc8
llama : Add IBM granite template (#10013)
|
1 year ago |
Georgi Gerganov
|
07028f9d74
flake.lock: Update (#10063)
|
1 year ago |
R0CKSTAR
|
524afeec9d
musa: workaround for Guilty Lockup in cleaning src0 (#10042)
|
1 year ago |
Georgi Gerganov
|
8125e6cbfc
server : don't overfill the batch during infill (#10018)
|
1 year ago |
Georgi Gerganov
|
8841ce3f43
llama : switch KQ multiplication to F32 precision by default (#10015)
|
1 year ago |
Georgi Gerganov
|
cc2983d375
sync : ggml
|
1 year ago |
bssrdf
|
8c60a8a462
increase cuda_cpy block size (ggml/996)
|
1 year ago |
Georgi Gerganov
|
9e4a2563ea
scripts : fix amx sync [no ci]
|
1 year ago |
Georgi Gerganov
|
668750357e
metal : support permuted matrix multiplicaions (#10033)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Michael Podvitskiy
|
d80fb71f8b
llama: string_split fix (#10022)
|
1 year ago |
Srihari-mcw
|
2f8bd2b901
llamafile : extend sgemm.cpp support for Q5_0 models (#10010)
|
1 year ago |
Georgi Gerganov
|
bc5ba007b2
server : check that the prompt fits in the slot's context (#10030)
|
1 year ago |
Xuan Son Nguyen
|
958367bf53
server : refactor slot input data, move tokenizer to HTTP thread (#10023)
|
1 year ago |
Georgi Gerganov
|
40f2555797
ci : fix cmake flags for SYCL
|
1 year ago |
Johannes Gäßler
|
167a515651
CUDA: fix insufficient buffer clearing for MMQ (#10032)
|
1 year ago |
Johannes Gäßler
|
c39665f589
CUDA: fix MMQ for non-contiguous src0, add tests (#10021)
|
1 year ago |
wwoodsTM
|
0a1c750c80
server : samplers accept the prompt correctly (#10019)
|
1 year ago |
Georgi Gerganov
|
190a37d797
sync : ggml
|
1 year ago |
Georgi Gerganov
|
2d3aba9ee8
llama.vim : bump generation time limit to 3s [no ci]
|
1 year ago |
Johannes Gäßler
|
80273a306d
CUDA: fix 1D im2col, add tests (ggml/993)
|
1 year ago |
Daniel Bevenius
|
c19af0acb1
ggml : remove redundant set of contexts used field (ggml/978)
|
1 year ago |
Michael Coppola
|
ac113a0fee
llama.vim : add classic vim support (#9995)
|
1 year ago |