cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	85a7d8677b memory : remove KV cache size padding (#16812)	2 months ago
Georgi Gerganov	a8ca18b4b8 llama-bench : clarify benchmarked parts of the computation (#16823)	2 months ago
l3utterfly	8284efc35c initialise buffer.device in ggml_hexagon_session (#16816)	2 months ago
Sam Malayek	1c1409e131 embedding: add raw option for --embd-output-format (#16541)	2 months ago
Johannes Gäßler	7a0e900e36 llama: consistent ctx <-> buf order for KV cache (#16746)	2 months ago
Aldehir Rojas	280d97be96 grammar : support array references in json schema (#16792)	2 months ago
Chenguang Li	3479efd112 CANN: Improve device ID handling and aclnnArange checks (#16752)	2 months ago
Aman Gupta	463bbf20bf CUDA: add unused vars to mmvf and mmvq (#16807)	2 months ago
tamarPal	ad8d36beff sycl: add SSM_CONV operation support (#16800)	2 months ago
Yuri Khrustalev	c053e18a66 chat: Add LFM2 tool handling (#16763)	2 months ago
Xuan-Son Nguyen	e1ab084803 mtmd : fix idefics3 preprocessing (#16806)	2 months ago
Diego Devesa	5a4ff43e7d llama : disable pipeline parallelism if compute buffer allocation fails (#16748)	2 months ago
Acly	10640e31aa ggml : fix interpolate with align-corners and ne=1 (#16700)	2 months ago
Johannes Gäßler	80d28f104c HIP: fix AMDGPU_TARGETS, update documentation (#16803)	2 months ago
Xuan-Son Nguyen	c55d53acec model : add LightOnOCR-1B model (#16764)	2 months ago
Johannes Gäßler	945501f5ea llama: fix leaked buffers for mmap + split files (#16765)	2 months ago
Aman Gupta	75cbdd3fce test-backend-ops: print failed tests at the end (#16785)	2 months ago
tamarPal	2b9bd9bf4e sycl: add ROLL operation support (#16665)	2 months ago
shani-f	59fc1ec8e8 sycl: add REPEAT_BACK operation support (#16734)	2 months ago
Aman Gupta	75d33b9302 CUDA: support for weight clamp in top-k norm (#16702)	2 months ago
Acly	3470a5c891 ggml-alloc : make gallocr prefer chunks that allow memory reuse (#16788)	2 months ago
Sigbjørn Skjæret	bd562fe4f7 cuda : use fast copy when src and dst are of different type and contiguous (#16789)	2 months ago
leejet	bbac6a26b2 ggml: fix cuda kernel launch configuration for k_compute_batched_ptrs to support large batch (#16744)	2 months ago
Sigbjørn Skjæret	73a48c9790 convert : enable expert group selection for all models with it (#16691)	2 months ago
Sigbjørn Skjæret	f696428ce8 graph : add clamping to ffn_moe_weights_sum to avoid div-by-zero (#16655)	2 months ago
Sigbjørn Skjæret	7cce4f8158 model : set res->t_embd in SmallThinker models (#16782)	2 months ago
amirai21	8d8862829c docs : add Jamba to Text-only models list (#16778)	2 months ago
Aman Gupta	f77c13b91f CUDA: General GEMV fusion (#16715)	2 months ago
Gilad S.	3cfa9c3f12 vulkan: deduplicate Microsoft Direct3D12 devices (#16689)	2 months ago
Galunid	5d195f17bc convert : handle mmproj filename/path properly (#16760)	2 months ago

Commit History