Ruben Ortlam
|
803dac2e48
vulkan: use vec dot for matrix matrix multiplications (#16056)
|
4 months ago |
Benni
|
459c0c2c1a
server: fix SSE and OpenAI compatibility for error messages when streaming (#16109)
|
4 months ago |
ssweens
|
be79d9fdd9
llama-bench: add --devices and --list-devices support (#16039)
|
4 months ago |
shun095
|
f432d8d83e
chat: Fix streaming parser for granite models (#15682)
|
4 months ago |
Aleksander Grygier
|
4067f07fc5
feat: Improve mobile UI for Settings Dialog (#16084)
|
4 months ago |
Xuan-Son Nguyen
|
4b8560ab56
chat : fix build on arm64 (#16101)
|
4 months ago |
Xuan-Son Nguyen
|
0dd58b6877
ggml : refactor forward_dup for cpu backend (#16062)
|
4 months ago |
Adrien Gallouët
|
69ffd89163
ggml-amx : fix ggml_amx_init() on generic Linux (#16049)
|
4 months ago |
Adrien Gallouët
|
246c0d9c79
cmake : fix static linking for OpenMP on Unix-like systems (#16031)
|
4 months ago |
Shawn Gu
|
3edd87cd05
opencl: optimize mxfp4 kernels (#16037)
|
4 months ago |
Jeff Bolz
|
c0b45097c3
rename optimize_graph to graph_optimize (#16082)
|
4 months ago |
Bowen Han
|
38dbdf4c05
CUDA: Optimize PAD_REFLECT_1D (#15957)
|
4 months ago |
Johannes Gäßler
|
368560a1e3
CUDA: fix compilation on CC 6.0 (#16091)
|
4 months ago |
Eric Curtin
|
4ca088b036
Add resumable downloads for llama-server model loading (#15963)
|
4 months ago |
Georgi Gerganov
|
703f9e32c4
metal : use function constants for mul_mv_ext kernels (#16074)
|
4 months ago |
Sigbjørn Skjæret
|
ad6bd9083b
cuda : add missing F32<->I32 entries in ggml_cuda_cpy_fn (#16060)
|
4 months ago |
Radoslav Gerganov
|
2b6b55a59f
server : include usage statistics only when user request them (#16052)
|
4 months ago |
Georgi Gerganov
|
e58174cecb
llama : bump max seq limit from 64 to 256 (#15916)
|
4 months ago |
Georgi Gerganov
|
b213fce89b
metal : improve F32, F16 and BF16 mat-vec multiplication (#16057)
|
4 months ago |
Jhen-Jie Hong
|
e00f3fd8ff
metal : avoid call free for non-owned buffer (#16067)
|
4 months ago |
Georgi Gerganov
|
f2f28380ea
metal : handle nil cv during pipeline creation (#16065)
|
4 months ago |
Chenguang Li
|
62c3b645c5
CANN: Remove print (#16044)
|
4 months ago |
Reese Levine
|
d304f459d8
GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS operators (#16018)
|
4 months ago |
Georgi Gerganov
|
0320ac5264
metal : refactor + optimize v2 (#15995)
|
4 months ago |
Aleksander Grygier
|
a7a98e0fff
SvelteKit-based WebUI (#14839)
|
4 months ago |
Xuan-Son Nguyen
|
8f8f2274ee
convert : add Llama4ForCausalLM (#16042)
|
4 months ago |
Johannes Gäßler
|
c959b676be
CUDA: fix FA occupancy, optimize tile kernel (#15982)
|
4 months ago |
David Ribeiro Alves
|
cd08fc3ecc
common : Fix corrupted memory error on json grammar initialization (#16038)
|
4 months ago |
Eve
|
cb5bb6cc05
vulkan: automatically remove unsupported devices (#15976)
|
4 months ago |
Daniel Bevenius
|
a91d035b90
ci : revert back to macos-13 for macOS-latest-cmake-x64 (#16040)
|
4 months ago |