cturan/llama.cpp

Author	SHA1 Message	Date
Rémy O	438a83926a vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)	10 months ago
Johannes Gäßler	9c42b1718c CUDA: fix logic for V100 + GGML_CUDA_FORCE_MMQ (#12098)	10 months ago
Prashant Vithule	05e6f5aad0 ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (#12064)	10 months ago
hipudding	673cfef9aa CANN: Fix build error with GCC 13 (#11990)	10 months ago
Eve	fbeda9002d vulkan: matmul dequantization improvements (#12015)	10 months ago
Daniele	581650b7ca vulkan: improve im2col (#11826)	10 months ago
Vladimir Vuksanovic	b95c8af37c cmake: Fix ggml backend dependencies and installation (#11818)	10 months ago
Ting Lou	a800ae46da llava : add struct for FFI bindgen (#12079)	10 months ago
Sigbjørn Skjæret	69050a11be Refactor gguf scripts to improve metadata handling (#11909)	10 months ago
Aleksei Nikiforov	3567ee3a94 gguf-py: enable reading non-native endian files (#12081)	10 months ago
Kante Yin	53e4db1012 readme : update infra list (#9096)	10 months ago
Olivier Chafik	d7cfe1ffe0 docs: add docs/function-calling.md to lighten server/README.md's plight (#12069)	11 months ago
Jeff Bolz	a82c9e7c23 vulkan: fix assertion when qy_needs_dequant (#12068)	11 months ago
rhjdvsgsgks	401af80b54 server: handle echo=false on /v1/completions (#12060)	11 months ago
Judd	c132239bfb add OP sigmoid (#12056)	11 months ago
Molly Sophia	393fca629e ggml-cpu: Fix build with sve (#12059)	11 months ago
Rémy O	61d4f39dfe vulkan: implement more backpropagation operators (#11914)	11 months ago
Olivier Chafik	0b52745649 server: support add_generation_prompt query param (#12062)	11 months ago
Alex Brooks	4d1051a40f Add Doc for Converting Granite Vision -> GGUF (#12006)	11 months ago
Vitali Lovich	3e9a2860e9 llama : expose llama_model_n_head_kv in the API (#11997)	11 months ago
Gian-Carlo Pascutto	58d07a8043 metal : copy kernels for quant to F32/F16 conversions (#12017)	11 months ago
lhez	34a846b584 opencl: fix for small models (#11950)	11 months ago
Alex Brooks	7a2c913e66 llava : Add Granite Vision Support (#11794)	11 months ago
Neo Zhang Jianyu	08d5986290 [SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035)	11 months ago
Aleksei Nikiforov	651adf4b66 gguf_convert_endian.py: implement byteswapping for q4_k and q6_k (#11349)	11 months ago
Akarshan Biswas	8303e8b0fb SYCL: Fix GGML_SYCL_DEBUG macro (#11995)	11 months ago
Florent BENOIT	7ad0779f5d run: allow to customize prompt by env var LLAMA_PROMPT_PREFIX (#12041)	11 months ago
Eric Curtin	f777a73e18 Some llama-run cleanups (#11973)	11 months ago
Aaron Teo	af7747c95a ggml-cpu: Support s390x SIMD Instruction Set (#12019)	11 months ago
Johannes Gäßler	a28e0d5eb1 CUDA: app option to compile without FlashAttention (#12025)	11 months ago
