cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	f8e58135cf metal : move dequantize templates to beginning of MSL source (#0)	1 year ago
leo-pony	329ed914c9 CANN: adjust backend registry refactor. (#10158)	1 year ago
Georgi Gerganov	ce027adfb3 sync : ggml	1 year ago
Yuri Khrustalev	284e5b0275 cmake : make it possible linking ggml as external lib (ggml/1003)	1 year ago
Plamen Minev	e2292aaa17 metal : fix minor string leaks (ggml/1004)	1 year ago
Diego Devesa	9f40989351 ggml : move CPU backend to a separate file (#10144)	1 year ago
Georgi Gerganov	08828a6d7d metal : minor fixup in FA kernel (#10143)	1 year ago
Georgi Gerganov	1839f69130 flake.lock: Update (#10146)	1 year ago
Christian Köhnenkamp	9830b6923b Add apple arm to presets (#10134)	1 year ago
sasha0552	42cadc74bd server : fix slot selection by lru (#10126)	1 year ago
Georgi Gerganov	45950415ed server : fix endpoint checks (#10135)	1 year ago
Georgi Gerganov	1926d6e39d llama : adjust default context size + print warnings (#10136)	1 year ago
Diego Devesa	b634f8a26f simple-chat : only add bos on first prompt (#10129)	1 year ago
Xuan Son Nguyen	7554aa4655 convert-lora : make `--base` optional (#10110)	1 year ago
Diego Devesa	a6744e43e8 llama : add simple-chat example (#10124)	1 year ago
Diego Devesa	e991e3127f llama : use smart pointers for ggml resources (#10117)	1 year ago
Shupei Fan	418f5eef26 vulkan : improve ggml_vk_create_buffer error handling (#9898)	1 year ago
Georgi Gerganov	ba6f62eb79 readme : update hot topics	1 year ago
sasha0552	d865d1478c server : fix smart selection of available slot (#10120)	1 year ago
Georgi Gerganov	1804adb0cf ggml : remove ggml_scratch (#10121)	1 year ago
Georgi Gerganov	815fe72adc sync : ggml	1 year ago
Georgi Gerganov	f221d56220 ggml : alloc ggml_contexts on the heap (whisper/2525)	1 year ago
Zhenwei Jin	e597e50794 build: fix build error in Windows env with OneAPI setup (#10107)	1 year ago
Diego Devesa	85679d37f3 llama : improve output buffer type selection (#10098)	1 year ago
Diego Devesa	1e9f94994e quantize : fix --keep-split (#10114)	1 year ago
Diego Devesa	c02e5ab2a6 llama : fix buffer checks for mamba and rwk (#10111)	1 year ago
Zhenwei Jin	ab3d71f97f loader: refactor tensor weights storage (#9935)	1 year ago
Kevin Gibbons	0a683e8088 server : include scheme when printing URL (#10106)	1 year ago
Diego Devesa	dea5e86051 ggml : check tensor name lengths in gguf files (#10100)	1 year ago
Sergio López	1329c0a75e kompute: add mul_mat_q4_k shader (#10097)	1 year ago


Commit History
