cturan/llama.cpp

Author	SHA1 Message	Date
Neo Zhang Jianyu	08d5986290 [SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035)	11 months ago
Aleksei Nikiforov	651adf4b66 gguf_convert_endian.py: implement byteswapping for q4_k and q6_k (#11349)	11 months ago
Akarshan Biswas	8303e8b0fb SYCL: Fix GGML_SYCL_DEBUG macro (#11995)	11 months ago
Florent BENOIT	7ad0779f5d run: allow to customize prompt by env var LLAMA_PROMPT_PREFIX (#12041)	11 months ago
Eric Curtin	f777a73e18 Some llama-run cleanups (#11973)	11 months ago
Aaron Teo	af7747c95a ggml-cpu: Support s390x SIMD Instruction Set (#12019)	11 months ago
Johannes Gäßler	a28e0d5eb1 CUDA: app option to compile without FlashAttention (#12025)	11 months ago
Ting Lou	36c258ee92 llava: build clip image from pixels (#11999)	11 months ago
Georgi Gerganov	f3e64859ed ci : fix arm upload artifacts (#12024)	11 months ago
Johannes Gäßler	5fa07c2f93 CUDA: optimize FA for GQA + large batches (#12014)	11 months ago
Rohanjames1997	335eb04a91 ci : Build on Github-hosted arm64 runners (#12009)	11 months ago
Georgi Gerganov	cf756d6e0a server : disable Nagle's algorithm (#12020)	11 months ago
Gian-Carlo Pascutto	d70908421f cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#12000)	11 months ago
Daniel Bevenius	de8b5a3624 llama.swiftui : add "Done" dismiss button to help view (#11998)	11 months ago
Georgi Gerganov	51f311e057 llama : skip loading unused tensors (#12004)	11 months ago
Johannes Gäßler	586d5fe6eb doc: update contributing guidelines [no ci] (#11969)	11 months ago
PureJourney	ecc8e3aeff CUDA: correct the lowest Maxwell supported by CUDA 12 (#11984)	11 months ago
Bodhi	0b3863ff95 MUSA: support ARM64 and enable dp4a .etc (#11843)	11 months ago
Alex Brooks	ee02ad02c5 clip : fix visual encoders with no CLS (#11982)	11 months ago
momonga	c392e5094d server (webui): Fix Premature Submission During IME Conversion (#11971)	11 months ago
Charles Xu	c5d91a7400 ggml-cpu: Add CPU backend support for KleidiAI library (#11390)	11 months ago
Prashant Vithule	4806498bf1 ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (#11917)	11 months ago
Michael Engel	0d559580a0 run : add --chat-template-file (#11961)	11 months ago
Johannes Gäßler	d04e7163c8 doc: add links to ggml examples [no ci] (#11958)	11 months ago
Daniel Bevenius	d07c621393 common : add llama.vim preset for Qwen2.5 Coder (#11945)	11 months ago
Georgi Gerganov	abd4d0bc4f speculative : update default params (#11954)	11 months ago
Daniel Bevenius	9626d9351a llama : fix indentation in llama-grammar [no ci] (#11943)	11 months ago
igardev	b58934c183 server : (webui) Enable communication with parent html (if webui is in iframe) (#11940)	11 months ago
Olivier Chafik	63e489c025 tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)	11 months ago
Xuan-Son Nguyen	63ac128563 server : add TEI API format for /rerank endpoint (#11942)	11 months ago
