Ed Addario
|
30e5b01de2
quantize : change int to unsigned int for KV overrides (#14197)
|
7 months ago |
uvos
|
e54b394082
CUDA/HIP: fix ssm_scan on devices where warp size is not 32 (#14196)
|
7 months ago |
uvos
|
2c2caa4443
HIP: Replace usage of deprecated preprocessor macro __AMDGCN_WAVEFRONT_SIZE__ (#14183)
|
7 months ago |
Georgi Gerganov
|
5fce5f948d
kv-cache : fix use-after-move of defrag info (#14189)
|
7 months ago |
Mikko Juola
|
9ae4143bc6
model : add dots.llm1 architecture support (#14044) (#14118)
|
7 months ago |
Georgi Gerganov
|
c311ac664d
cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188)
|
7 months ago |
Georgi Gerganov
|
b9912ac570
batch : auto-gen positions + verify multi-sequence input (#14177)
|
7 months ago |
Pepijn de Vos
|
00ba772610
docs : remove WIP since PR has been merged (#13912)
|
7 months ago |
Piotr
|
3cb203c89f
llama-chat : Do not throw when tool parsing fails (#14012)
|
7 months ago |
Aman Gupta
|
2e42be42bd
compare-llama-bench: add option to plot (#14169)
|
7 months ago |
Georgi Gerganov
|
fb85a288d7
vocab : fix build (#14175)
|
7 months ago |
Svetlozar Georgiev
|
40643edb86
sycl: fix docker image (#14144)
|
7 months ago |
Guy Goldenberg
|
3cfbbdb44e
Merge commit from fork
|
7 months ago |
Georgi Gerganov
|
80709b70a2
batch : add LLAMA_BATCH_DEBUG environment variable (#14172)
|
7 months ago |
ddpasa
|
26ff3685bf
docs : Update multimodal.md (#14122)
|
7 months ago |
Georgi Gerganov
|
60c666347b
batch : rework llama_batch_allocr (#14153)
|
7 months ago |
Georgi Gerganov
|
b7cc7745e3
readme : remove survey link (#14168)
|
7 months ago |
Christian Kastner
|
cc8d081879
cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167)
|
7 months ago |
Đinh Trọng Huy
|
d714dadb57
pooling : make cls_b and cls_out_b optional (#14165)
|
7 months ago |
Georgi Gerganov
|
ffad043973
server : fix SWA condition for full context reprocess (#14163)
|
7 months ago |
Anton Mitkov
|
0889eba570
sycl: Adding additional cpy dbg print output (#14034)
|
7 months ago |
Ewan Crawford
|
c61285e739
SYCL: Bump oneMath commit (#14152)
|
7 months ago |
Christian Kastner
|
09cf2c7c65
cmake : Improve build-info.cpp generation (#14156)
|
7 months ago |
Georgi Gerganov
|
c33fe8b8c4
vocab : prevent heap overflow when vocab is too small (#14145)
|
7 months ago |
Anton Mitkov
|
ed52f3668e
sycl: Remove not needed copy f16->f32 for dnnl mul mat (#14125)
|
7 months ago |
Georgi Gerganov
|
a681b4ba83
readme : remove project status link (#14149)
|
7 months ago |
Georgi Gerganov
|
7d516443dd
server : re-enable SWA speculative decoding (#14131)
|
7 months ago |
Georgi Gerganov
|
f6e1a7aa87
context : simplify output counting logic during decode (#14142)
|
7 months ago |
Georgi Gerganov
|
c3ee46fab4
batch : remove logits_all flag (#14141)
|
7 months ago |
Georgi Gerganov
|
e2c0b6e46a
cmake : handle whitespaces in path during metal build (#14126)
|
7 months ago |