Daniel Bevenius
|
37f10f955f
make : remove make in favor of CMake (#15449)
|
5 months ago |
xctan
|
f470bc36be
ggml-cpu : split arch-specific implementations (#13892)
|
7 months ago |
Georgi Gerganov
|
4773d7a02f
examples : remove infill (#13283)
|
8 months ago |
Xuan-Son Nguyen
|
9b61acf060
mtmd : rename llava directory to mtmd (#13311)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |
David Huang
|
84778e9770
CUDA/HIP: Share the same unified memory allocation logic. (#12934)
|
9 months ago |
R0CKSTAR
|
251364549f
musa: support new arch mp_31 and update doc (#12296)
|
10 months ago |
Johannes Gäßler
|
a28e0d5eb1
CUDA: app option to compile without FlashAttention (#12025)
|
10 months ago |
Bodhi
|
0b3863ff95
MUSA: support ARM64 and enable dp4a .etc (#11843)
|
11 months ago |
Olivier Chafik
|
63e489c025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
|
11 months ago |
Georgi Gerganov
|
68ff663a04
repo : update links to new url (#11886)
|
11 months ago |
Johannes Gäßler
|
864a0b67a6
CUDA: use mma PTX instructions for FlashAttention (#11583)
|
11 months ago |
Olivier Chafik
|
8b576b6c55
Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
|
11 months ago |
Olivier Chafik
|
6171c9d258
Add Jinja template support (#11016)
|
1 year ago |
HimariO
|
ba1cb19cdd
llama : add Qwen2VL support + multimodal RoPE (#10361)
|
1 year ago |
Djip007
|
19d8762ab6
ggml : refactor online repacking (#10446)
|
1 year ago |
Xuan Son Nguyen
|
91c36c269b
server : (web ui) Various improvements, now use vite as bundler (#10599)
|
1 year ago |
Georgi Gerganov
|
8648c52101
make : deprecate (#10514)
|
1 year ago |
Wang Qin
|
43957ef203
build: update Makefile comments for C++ version change (#10598)
|
1 year ago |
Diego Devesa
|
7cc2d2c889
ggml : move AMX to the CPU backend (#10570)
|
1 year ago |
Tristan Druyen
|
be0e350c8b
Fix HIP flag inconsistency & build docs (#10524)
|
1 year ago |
R0CKSTAR
|
249cd93da3
mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)
|
1 year ago |
Eric Curtin
|
0cc63754b8
Introduce llama-run (#10291)
|
1 year ago |
Diego Devesa
|
5931c1f233
ggml : add support for dynamic loading of backends (#10469)
|
1 year ago |
Georgi Gerganov
|
d9d54e498d
speculative : refactor and add a simpler example (#10362)
|
1 year ago |
Anthony Van de Gejuchte
|
3952a221af
Fix missing file renames in Makefile due to changes in commit ae8de6d50a (#10413)
|
1 year ago |
Georgi Gerganov
|
cf32a9b93a
metal : refactor kernel args into structs (#10238)
|
1 year ago |
Johannes Gäßler
|
c3ea58aca4
CUDA: remove DMMV, consolidate F16 mult mat vec (#10318)
|
1 year ago |
Georgi Gerganov
|
a4200cafad
make : add ggml-opt (#0)
|
1 year ago |
Georgi Gerganov
|
84274a10c3
tests : remove test-grad0
|
1 year ago |