cturan/llama.cpp

Author	SHA1 Message	Date
Zhenwei Jin	506122d854 llama-bench : add support for getting cpu info on Windows (#8824)	1 year ago
slaren	2b1f616b20 ggml : reduce hash table reset cost (#8698)	1 year ago
hipudding	1bdd8ae19f [CANN] Add Ascend NPU backend (#6035)	1 year ago
Radoslav Gerganov	e65bbf606c llama-bench : fix RPC indication (#7936)	1 year ago
slaren	f578b86b21 move BLAS to a separate backend (#6210)	1 year ago
Johannes Gäßler	148995e5e5 llama-bench: more compact markdown tables (#7879)	1 year ago
Georgi Gerganov	1442677f92 common : refactor cli arg parsing (#7675)	1 year ago
Georgi Gerganov	554c247caf ggml : remove OpenCL (#7735)	1 year ago
slaren	adc9ff3841 llama-bench : allow using a different printer for stderr with -oe (#7722)	1 year ago
Radoslav Gerganov	210d99173d llama-bench : add support for the RPC backend (#7435)	1 year ago
Georgi Gerganov	6ff13987ad common : normalize naming style (#7462)	1 year ago
slaren	b18532a4ef phi3 : duplicate rope factors in each layer (#7447)	1 year ago
slaren	e849648888 llama-bench : add pp+tg test type (#7199)	1 year ago
kunnis	628b299106 Adding support for the --numa argument for llama-bench. (#7080)	1 year ago
Georgi Gerganov	9c67c2773d ggml : add Flash Attention (#5021)	1 year ago
Justine Tunney	8cc91dc63c ggml : add llamafile sgemm (#6414)	1 year ago
slaren	280345968d cuda : rename build flag to LLAMA_CUDA (#6299)	1 year ago
Kawrakow	76aa30a263 Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized K cache (#6183)	1 year ago
slaren	2bf8d0f7c4 backend : offload large batches to GPU (#6083)	1 year ago
slaren	b0bc9f4a9d llama-bench : use random tokens to improve accuracy with mixtral (#6069)	1 year ago
Steve Grubb	6e0438da3c gguf : fix resource leaks (#6061)	1 year ago
slaren	f30ea47a87 llama : add pipeline parallelism support (#6017)	1 year ago
Georgi Gerganov	6cdabe6526 llama-bench : add embeddings option (#5924)	1 year ago
Neo Zhang Jianyu	715641391d Support multiple GPUs (split mode) on SYCL backend (#5806)	1 year ago
Pierrick Hymbert	3ab8b3a92e llama : cleanup unused mmq flags (#5772)	1 year ago
Georgi Gerganov	ab336a9d5e code : normalize enum names (#5697)	1 year ago
bmwl	f486f6e1e5 ggml : add numa options (#5377)	1 year ago
Michael Klimenko	52bb63c708 refactor : switch to emplace_back to avoid extra object (#5291)	1 year ago
Neo Zhang Jianyu	128dcbd3c9 add --no-mmap in llama-bench (#5257)	2 years ago
Georgi Gerganov	5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	2 years ago

Commit History