Chenguang Li
|
67e3f6f601
CANN: add operator fusion support for ADD + RMS_NORM (#17512)
|
3 weeks ago |
Francisco Herrera
|
92ac1e016b
doc: clarify that steps also apply to linux for opencl (#18002)
|
3 weeks ago |
Ali Tariq
|
8e3a761189
ci : init git lfs in every build for RISC-V (#18590)
|
3 weeks ago |
Daniel Bevenius
|
d3dce4e0a5
sampling : add support for backend sampling (#17004)
|
3 weeks ago |
Tarek Dakhran
|
4974bf53cf
model : mtmd : make input norm optional in LFM2-VL (#18594)
|
3 weeks ago |
Aman Gupta
|
908a9e5a1e
CUDA: disable cuda graph when using n-cpu-moe (#18593)
|
3 weeks ago |
Aman Gupta
|
5126c41c1c
ggml-cuda: remove unused params in ggml_cuda_graph (#18579)
|
3 weeks ago |
Aldehir Rojas
|
cef1d23c5a
common/grammar : replace problematic backtracking regex `[\s\S]*` (#18342)
|
3 weeks ago |
Georgi Gerganov
|
c69c7ebc90
graph : fix graph reuse logic when `n_pos_per_embd > 1` (#18566)
|
3 weeks ago |
Aman Gupta
|
e57f52334b
ggml-cuda: fixes for concurrent streams (#18496)
|
3 weeks ago |
Georgi Gerganov
|
a554a1ecc7
context : fix reserve token padding to n_seqs (#18536)
|
3 weeks ago |
Johannes Gäßler
|
0f2e42ca1d
CUDA: only allocate FA tmp buffer if needed (#18564)
|
3 weeks ago |
pl752
|
9dba9f5352
(Bugfix, ggml-cuda) Pool alloc count fix + small size computation type adjustment (#18559)
|
3 weeks ago |
Shouyu
|
bcfc8c3cec
ggml-hexagon: optimize activation function (#18393)
|
3 weeks ago |
Jeff Bolz
|
18ddaea2ae
vulkan: Optimize GGML_OP_CUMSUM (#18417)
|
3 weeks ago |
Jeff Bolz
|
706e3f93a6
vulkan: Implement mmvq for iq1_s/iq1_m (#18450)
|
3 weeks ago |
Prabod
|
5755e52d15
model : Maincoder-1B support (#18534)
|
3 weeks ago |
Georgi Gerganov
|
f38de16341
metal : adjust extra size for FA buffer to avoid reallocations (#18545)
|
3 weeks ago |
Georgi Gerganov
|
af1e8e1a6c
graph : reduce topology branching (#18548)
|
3 weeks ago |
Georgi Gerganov
|
d84a6a98be
vocab : reduce debug logs about non-EOG control tokens (#18541)
|
3 weeks ago |
Chris Rohlf
|
c6f0e832da
rpc : use unordered_map::reserve and emplace (#18513)
|
3 weeks ago |
MeeMin
|
e86f3c2221
cuda : fix copy of large tensors (ggml_nbytes <= INT_MAX assertion) (#18433)
|
3 weeks ago |
Sigbjørn Skjæret
|
169ee68ffb
model : remove modern-bert iswa template (#18529)
|
3 weeks ago |
tt
|
ced765be44
model: support youtu-vl model (#18479)
|
3 weeks ago |
Piotr Wilkin (ilintar)
|
3ccccc83f7
Add conversion support for IQuestCoderForCausalLM (#18524)
|
4 weeks ago |
o7si
|
d0a6a31470
model : add support for JinaBertModel with non-gated ffn (#18475)
|
4 weeks ago |
o7si
|
2b2afade9f
convert : fix encoding of WPM vocab for BERT models (#18500)
|
4 weeks ago |
HelloKS
|
f4f5019254
model: add Solar Open model (#18511)
|
4 weeks ago |
Anri Lombard
|
d5574c919c
webui: fix code copy stripping XML/HTML tags (#18518)
|
4 weeks ago |
Aman Gupta
|
26831bded9
ggml-cuda: remove unneccesary prints on ggml_cuda_init (#18502)
|
4 weeks ago |