cturan/llama.cpp

Author	SHA1 Message	Date
Johannes Gäßler	fcf6538ba6 CUDA: fix unused warning in mmq.cu (#7442)	1 year ago
Georgi Gerganov	c3f8d58356 tests : test-tokenizer-0.sh print more info (#7402)	1 year ago
Amir	11474e756d examples: cache hf model when --model not provided (#7353)	1 year ago
Johannes Gäßler	d8ee902227 CUDA: deduplicate mmq code (#7397)	1 year ago
jaime-m-p	d7e852c1bc Tokenizer SPM fixes for phi-3 and llama-spm (bugfix) (#7425)	1 year ago
jaime-m-p	917dc8cfa6 Tokenizer SPM fixes for phi-3 and llama-spm (#7375)	1 year ago
Georgi Gerganov	fabf30b4c4 llama : remove Persimmon (#7408)	1 year ago
Johannes Gäßler	20385cebcc perplexity: update README FP16 results [no ci] (#7413)	1 year ago
Radoslav Gerganov	db10f01310 rpc : track allocated buffers (#7411)	1 year ago
Georgi Gerganov	3bc10cb485 server : fix temperature + disable some tests (#7409)	1 year ago
AidanBeltonS	6bf9b66fa3 [SYCL] Update SYCL upscale operation (#7321)	1 year ago
Bingan	26cd4237bc Update README.md (#7410)	1 year ago
Herman Semenov	213e90ed73 ggml-opencl, llama: using reserve() if count already known (#7272)	1 year ago
junchao-loongson	65c58207ec ggml : add loongarch lsx and lasx support (#6454)	1 year ago
Georgi Gerganov	1cc0155d04 server : tuning tests (#7388)	1 year ago
Georgi Gerganov	e932094d58 server : return error on too large embedding input (#7389)	1 year ago
Georgi Gerganov	2789baf480 tests : fix --keep_split -> --keep-split (#7374)	1 year ago
Srihari-mcw	33c8d50acc Add provisions for windows support for BF16 code including CMake provision for enabling AVX512_BF16 (#7258)	1 year ago
slaren	d359f30921 llama : remove MPI backend (#7395)	1 year ago
Fred Douglas	1ea2a0036e quantize : fix --keep-split check (#7374)	1 year ago
0cc4m	f030ec1f7a Vulkan Embedding Fix (#7360)	1 year ago
slaren	e4e6f67be6 ggml : fix another case of quants nans (#7387)	1 year ago
Johannes Gäßler	5ca49cbecd ggml: implement quantized KV cache for FA (#7372)	1 year ago
Johannes Gäßler	1b01f06db0 server: add test for token probs (#7347)	1 year ago
Johannes Gäßler	41858392e1 server: fix seed being reported back (#7382)	1 year ago
Anas Ahouzi	6aade19ee7 Add StableLM2 pre-tokenizer (#7349)	1 year ago
slaren	ab33f7a338 cuda : clear error after buffer allocation failure (#7376)	1 year ago
Brian	e23b974f4c labeler.yml: Use settings from ggerganov/llama.cpp [no ci] (#7363)	1 year ago
Georgi Gerganov	854d365aba cmake : update android comments (#7341)	1 year ago
fraxy-v	f5bf761747 Capture CUDA logging output (#7298)	1 year ago

Commit History