Piotr Wilkin
|
890fa2c1e3
WE HAVE OUTPUT!
|
4 months ago |
Piotr Wilkin
|
e590a75905
Cleanup complete, now for the recurrent memory management...
|
4 months ago |
Piotr Wilkin
|
2b0673c315
Cleanup ggml_delta_net
|
4 months ago |
Piotr Wilkin (ilintar)
|
72c98b0c7d
Merge pull request #1 from ggml-org/xsn/qwen3next_experiment
|
4 months ago |
Xuan Son Nguyen
|
e83ef74733
one less magic number
|
4 months ago |
Xuan Son Nguyen
|
f643b957f4
refactor softplus fn
|
4 months ago |
Xuan Son Nguyen
|
46110e0630
split q_proj/gate
|
4 months ago |
Piotr Wilkin
|
9832f2934a
Remove comments as half of them are wrong anyways
|
4 months ago |
Piotr Wilkin
|
8152df60f3
Getting closer (graph builds for bs=1 but tensor shaping is still wrong for bigger sizes)
|
4 months ago |
Piotr Wilkin
|
e0c5dff2a7
Rewrite to tensor ops
|
4 months ago |
Piotr Wilkin
|
178230ee21
Getting to decode stage...
|
4 months ago |
Piotr Wilkin (ilintar)
|
c78f9fce68
Merge branch 'ggml-org:master' into qwen3_next
|
4 months ago |
Radoslav Gerganov
|
2b6b55a59f
server : include usage statistics only when user request them (#16052)
|
4 months ago |
Georgi Gerganov
|
e58174cecb
llama : bump max seq limit from 64 to 256 (#15916)
|
4 months ago |
Georgi Gerganov
|
b213fce89b
metal : improve F32, F16 and BF16 mat-vec multiplication (#16057)
|
4 months ago |
Jhen-Jie Hong
|
e00f3fd8ff
metal : avoid call free for non-owned buffer (#16067)
|
4 months ago |
Georgi Gerganov
|
f2f28380ea
metal : handle nil cv during pipeline creation (#16065)
|
4 months ago |
Chenguang Li
|
62c3b645c5
CANN: Remove print (#16044)
|
4 months ago |
Piotr Wilkin
|
344331c2b6
First draft
|
4 months ago |
Reese Levine
|
d304f459d8
GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS operators (#16018)
|
4 months ago |
Georgi Gerganov
|
0320ac5264
metal : refactor + optimize v2 (#15995)
|
4 months ago |
Aleksander Grygier
|
a7a98e0fff
SvelteKit-based WebUI (#14839)
|
4 months ago |
Xuan-Son Nguyen
|
8f8f2274ee
convert : add Llama4ForCausalLM (#16042)
|
4 months ago |
Johannes Gäßler
|
c959b676be
CUDA: fix FA occupancy, optimize tile kernel (#15982)
|
4 months ago |
David Ribeiro Alves
|
cd08fc3ecc
common : Fix corrupted memory error on json grammar initialization (#16038)
|
4 months ago |
Eve
|
cb5bb6cc05
vulkan: automatically remove unsupported devices (#15976)
|
4 months ago |
Daniel Bevenius
|
a91d035b90
ci : revert back to macos-13 for macOS-latest-cmake-x64 (#16040)
|
4 months ago |
Jie Fu (傅杰)
|
745cbcf2fe
llama-quant : fix the verification of attention layers for encoder-decoder models (#16023)
|
4 months ago |
Jie Fu (傅杰)
|
1cbd80f8cf
examples : support encoder-decoder models in the simple example (#16002)
|
4 months ago |
Shane A
|
85286f3548
model : add OLMo3 support (#16015)
|
4 months ago |