cturan/llama.cpp

Author	SHA1 Message	Date
Jeff Bolz	a82c9e7c23 vulkan: fix assertion when qy_needs_dequant (#12068)	10 months ago
rhjdvsgsgks	401af80b54 server: handle echo=false on /v1/completions (#12060)	10 months ago
Judd	c132239bfb add OP sigmoid (#12056)	10 months ago
Molly Sophia	393fca629e ggml-cpu: Fix build with sve (#12059)	10 months ago
Rémy O	61d4f39dfe vulkan: implement more backpropagation operators (#11914)	10 months ago
Olivier Chafik	0b52745649 server: support add_generation_prompt query param (#12062)	10 months ago
Alex Brooks	4d1051a40f Add Doc for Converting Granite Vision -> GGUF (#12006)	10 months ago
Vitali Lovich	3e9a2860e9 llama : expose llama_model_n_head_kv in the API (#11997)	10 months ago
Gian-Carlo Pascutto	58d07a8043 metal : copy kernels for quant to F32/F16 conversions (#12017)	10 months ago
lhez	34a846b584 opencl: fix for small models (#11950)	10 months ago
Alex Brooks	7a2c913e66 llava : Add Granite Vision Support (#11794)	10 months ago
Neo Zhang Jianyu	08d5986290 [SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035)	10 months ago
Aleksei Nikiforov	651adf4b66 gguf_convert_endian.py: implement byteswapping for q4_k and q6_k (#11349)	10 months ago
Akarshan Biswas	8303e8b0fb SYCL: Fix GGML_SYCL_DEBUG macro (#11995)	10 months ago
Florent BENOIT	7ad0779f5d run: allow to customize prompt by env var LLAMA_PROMPT_PREFIX (#12041)	11 months ago
Eric Curtin	f777a73e18 Some llama-run cleanups (#11973)	11 months ago
Aaron Teo	af7747c95a ggml-cpu: Support s390x SIMD Instruction Set (#12019)	11 months ago
Johannes Gäßler	a28e0d5eb1 CUDA: app option to compile without FlashAttention (#12025)	11 months ago
Ting Lou	36c258ee92 llava: build clip image from pixels (#11999)	11 months ago
Georgi Gerganov	f3e64859ed ci : fix arm upload artifacts (#12024)	11 months ago
Johannes Gäßler	5fa07c2f93 CUDA: optimize FA for GQA + large batches (#12014)	11 months ago
Rohanjames1997	335eb04a91 ci : Build on Github-hosted arm64 runners (#12009)	11 months ago
Georgi Gerganov	cf756d6e0a server : disable Nagle's algorithm (#12020)	11 months ago
Gian-Carlo Pascutto	d70908421f cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#12000)	11 months ago
Daniel Bevenius	de8b5a3624 llama.swiftui : add "Done" dismiss button to help view (#11998)	11 months ago
Georgi Gerganov	51f311e057 llama : skip loading unused tensors (#12004)	11 months ago
Johannes Gäßler	586d5fe6eb doc: update contributing guidelines [no ci] (#11969)	11 months ago
PureJourney	ecc8e3aeff CUDA: correct the lowest Maxwell supported by CUDA 12 (#11984)	11 months ago
Bodhi	0b3863ff95 MUSA: support ARM64 and enable dp4a .etc (#11843)	11 months ago
Alex Brooks	ee02ad02c5 clip : fix visual encoders with no CLS (#11982)	11 months ago
