Georgi Gerganov | 2b3389677a ggml : refactor rope norm/neox (#7634) | 1 year ago
Georgi Gerganov | d48c88cbd5 ggml : remove ggml_flash_attn and ggml_flash_ff (#7463) | 1 year ago
slaren | 5bf3953d7e cuda : improve cuda pool efficiency using virtual memory (#4606) | 2 years ago
Georgi Gerganov | afefa319f1 ggml : change ggml_scale to take a float instead of tensor (#4573) | 2 years ago
Richard Kiss | 9494d7c477 english : use `typos` to fix comments and logs (#4354) | 2 years ago
Georgi Gerganov | 4760e7cc0b sync : ggml (backend v2) (#3912) | 2 years ago
Georgi Gerganov | f93af02488 sync : ggml (conv 1d + 2d updates, UB fixes) (#3468) | 2 years ago
Cebtenzzre | bc39553c90 build : enable more non-default compiler warnings (#3200) | 2 years ago
xaedes | 0e76a8992c train : finetune LORA (#2632) | 2 years ago
Georgi Gerganov | ec893798b7 llama : custom attention mask + parallel decoding + no context swaps (#3228) | 2 years ago
xaedes | 44c117f41e train : mem usage and other improvements (#2439) | 2 years ago
Eve | 81844fbcfd tests : Fix compilation warnings (Linux/GCC) (#2451) | 2 years ago