cturan/llama.cpp

Author	SHA1 Message	Date
slaren	ae1f211ce2 cuda : refactor into multiple files (#6269)	1 year ago
Xuan Son Nguyen	ad3a0505e3 Server: clean up OAI params parsing function (#6284)	1 year ago
Neo Zhang Jianyu	95ad616cdd [SYCL] fix SYCL backend build on windows is break by LOG() error (#6290)	1 year ago
Minsoo Cheong	64e7b47c69 examples : add "retrieval" (#6193)	1 year ago
Justine Tunney	7733f0c760 ggml : support AVX512VNNI (#6280)	1 year ago
Rick G	a32b77c4b2 Fix heap corruption from wmode out-of-bound writes on windows (#6272)	1 year ago
Georgi Gerganov	a0e584defd imatrix : fix wname for mul_mat_id ops (#6271)	1 year ago
Johannes Gäßler	7aed0ffe68 Fixed lookup compilation issues on Windows (#6273)	1 year ago
Pierrick Hymbert	ea279d5609 ci : close inactive issue, increase operations per run (#6270)	1 year ago
Minsoo Cheong	586e7bc561 sampling : deduplicated code for probability distribution access (#6240)	1 year ago
Meng, Hengyu	ddf6568510 [SYCL] offload op (#6217)	1 year ago
Neo Zhang Jianyu	d03224ac98 Support build win release for SYCL (#6241)	1 year ago
Jared Van Bortel	94d1b3b411 use _wfopen instead of fopen on Windows (#6248)	1 year ago
Georgi Gerganov	95562175f8 gitignore : gguf-split	1 year ago
Pierrick Hymbert	f482bb2e49 common: llama_load_model_from_url split support (#6192)	1 year ago
Pierrick Hymbert	1997577d5e server: docs: `--threads` and `--threads`, `--ubatch-size`, `--log-disable` (#6254)	1 year ago
Julius Arkenberg	476b0251b2 llama : add grok-1 support (#6204)	1 year ago
Pierrick Hymbert	21cad01b6e split: add gguf-split in the make build target (#6262)	1 year ago
Pierrick Hymbert	1b26aebe4d server: flush stdout after logging in both text and json layout (#6253)	1 year ago
Johannes Gäßler	50ccaf5eac lookup: complement data from context with general text statistics (#5479)	1 year ago
Georgi Gerganov	56a00f0a2f common : default --hf-file to --model (#6234)	1 year ago
fraxy-v	92397d87a4 convert-llama2c-to-ggml : enable conversion of GQA models (#6237)	1 year ago
Kawrakow	1d0331c12a quantize: options for output and token embedding tensors qtype (#6239)	1 year ago
Pierrick Hymbert	dba1af6129 llama_model_loader: support multiple split/shard GGUFs (#6187)	1 year ago
Minsoo Cheong	ee804f6223 ci: apply concurrency limit for github workflows (#6243)	1 year ago
Georgi Gerganov	80bd33bc2c common : add HF arg helpers (#6234)	1 year ago
Nexesenex	e80f06d2a1 llama : correction of the attn.v.weight quantization for IQ3_XS (#6209)	1 year ago
Olivier Chafik	f77a8ffd3b tests : conditional python & node json schema tests (#6207)	1 year ago
Olivier Chafik	72114edf06 json-schema-to-grammar : fix order of props + non-str const/enum (#6232)	1 year ago
slaren	2f0e81e053 cuda : add LLAMA_CUDA_NO_PEER_COPY to workaround broken ROCm p2p copy (#6208)	1 year ago
