Damian Stewart
|
381efbf480
llava : expose as a shared library for downstream projects (#3613)
|
2 years ago |
slaren
|
2833a6f63c
ggml-cuda : fix f16 mul mat (#3961)
|
2 years ago |
Kerfuffle
|
d9ccce2e33
Allow common process_escapes to handle \x sequences (#3928)
|
2 years ago |
Thái Hoàng Tâm
|
bb60fd0bf6
server : fix typo for --alias shortcut from -m to -a (#3958)
|
2 years ago |
Jared Van Bortel
|
132d25b8a6
cuda : fix disabling device with --tensor-split 1,0 (#3951)
|
2 years ago |
Meng Zhang
|
3d48f42efc
llama : mark LLM_ARCH_STARCODER as full offload supported (#3945)
|
2 years ago |
Eve
|
c41ea36eaa
cmake : MSVC instruction detection (fixed up #809) (#3923)
|
2 years ago |
Eve
|
a7fac013cf
ci : use intel sde when ci cpu doesn't support avx512 (#3949)
|
2 years ago |
slaren
|
48ade94538
cuda : revert CUDA pool stuff (#3944)
|
2 years ago |
Kerfuffle
|
f28af0d81a
gguf-py: Support 01.AI Yi models (#3943)
|
2 years ago |
Peter Sugihara
|
d9b33fe95b
metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (#3938)
|
2 years ago |
Xiao-Yong Jin
|
5ba3746171
ggml-metal: fix yarn rope (#3937)
|
2 years ago |
slaren
|
abb77e7319
ggml-cuda : move row numbers to x grid dim in mmv kernels (#3921)
|
2 years ago |
Georgi Gerganov
|
8f961abdc4
speculative : change default p_accept to 0.5 + CLI args (#3919)
|
2 years ago |
Georgi Gerganov
|
05816027d6
common : YAYF (yet another YARN fix) (#3925)
|
2 years ago |
cebtenzzre
|
3fdbe6b66b
llama : change yarn_ext_factor placeholder to -1 (#3922)
|
2 years ago |
Kerfuffle
|
629f917cd6
cuda : add ROCM aliases for CUDA pool stuff (#3918)
|
2 years ago |
Andrei
|
51b2fc11f7
cmake : fix relative path to git submodule index (#3915)
|
2 years ago |
Georgi Gerganov
|
224e7d5b14
readme : add notice about #3912
|
2 years ago |
Georgi Gerganov
|
c7743fe1c1
cuda : fix const ptrs warning causing ROCm build issues (#3913)
|
2 years ago |
Oleksii Maryshchenko
|
d6069051de
cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)
|
2 years ago |
Georgi Gerganov
|
4ff1046d75
gguf : print error for GGUFv1 files (#3908)
|
2 years ago |
slaren
|
21958bb393
cmake : disable LLAMA_NATIVE by default (#3906)
|
2 years ago |
Georgi Gerganov
|
2756c4fbff
gguf : remove special-case code for GGUFv1 (#3901)
|
2 years ago |
Georgi Gerganov
|
1efae9b7dc
llm : prevent from 1-D tensors being GPU split (#3697)
|
2 years ago |
cebtenzzre
|
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
|
2 years ago |
Georgi Gerganov
|
4d719a6d4e
cuda : check if this fixes Pascal card regression (#3882)
|
2 years ago |
Georgi Gerganov
|
183b3fac6c
metal : fix build errors and kernel sig after #2268 (#3898)
|
2 years ago |
cebtenzzre
|
2fffa0d61f
cuda : fix RoPE after #2268 (#3897)
|
2 years ago |
cebtenzzre
|
0eb332a10f
llama : fix llama_context_default_params after #2268 (#3893)
|
2 years ago |