AidanBeltonS
|
e82f9e2b83
[SYCL] Fix batched impl for NVidia GPU (#6164)
|
1 year ago |
Kawrakow
|
cbc8343619
Make IQ1_M work for QK_K = 64 (#6327)
|
1 year ago |
Sigbjørn Skjæret
|
e562b9714b
common : change --no-penalize-nl to --penalize-nl (#6334)
|
1 year ago |
Georgi Gerganov
|
2ab4f00d25
llama2c : open file as binary (#6332)
|
1 year ago |
Mateusz Charytoniuk
|
1740d6dd4e
readme : add php api bindings (#6326)
|
1 year ago |
Eric Zhang
|
0642b22cd1
server: public: use relative routes for static files (#6325)
|
1 year ago |
Neo Zhang Jianyu
|
a4f569e8a3
[SYCL] fix no file in win rel (#6314)
|
1 year ago |
Jared Van Bortel
|
32c8486e1f
wpm : portable unicode tolower (#6305)
|
1 year ago |
compilade
|
557410b8f0
llama : greatly reduce output buffer memory usage (#6122)
|
1 year ago |
Kawrakow
|
55c1b2a3bb
IQ1_M: 1.75 bpw quantization (#6302)
|
1 year ago |
Pedro Cuenca
|
e097633f63
convert-hf : fix exception in sentencepiece with added tokens (#6320)
|
1 year ago |
Kawrakow
|
d25b1c31b0
quantize : be able to override metadata by key (#6321)
|
1 year ago |
Minsoo Cheong
|
deb7240100
embedding : adjust `n_ubatch` value (#6296)
|
1 year ago |
Jan Boon
|
3d032ece8e
server : add `n_discard` parameter (#6300)
|
1 year ago |
Joseph Stahl
|
e190f1fca6
nix: make `xcrun` visible in Nix sandbox for precompiling Metal shaders (#6118)
|
1 year ago |
slaren
|
280345968d
cuda : rename build flag to LLAMA_CUDA (#6299)
|
1 year ago |
Christian Kögler
|
b06c16ef9f
nix: fix blas support (#6281)
|
1 year ago |
Kawrakow
|
1f2fd4e727
tests : include IQ2_XXS and IQ2_XS in test-quantize-fns (#6303)
|
1 year ago |
Georgi Gerganov
|
43139cc528
flake.lock: Update (#6266)
|
1 year ago |
slaren
|
2f34b865b6
cuda : fix LLAMA_CUDA_F16 build (#6298)
|
1 year ago |
slaren
|
ae1f211ce2
cuda : refactor into multiple files (#6269)
|
1 year ago |
Xuan Son Nguyen
|
ad3a0505e3
Server: clean up OAI params parsing function (#6284)
|
1 year ago |
Neo Zhang Jianyu
|
95ad616cdd
[SYCL] fix SYCL backend build on windows is break by LOG() error (#6290)
|
1 year ago |
Minsoo Cheong
|
64e7b47c69
examples : add "retrieval" (#6193)
|
1 year ago |
Justine Tunney
|
7733f0c760
ggml : support AVX512VNNI (#6280)
|
1 year ago |
Rick G
|
a32b77c4b2
Fix heap corruption from wmode out-of-bound writes on windows (#6272)
|
1 year ago |
Georgi Gerganov
|
a0e584defd
imatrix : fix wname for mul_mat_id ops (#6271)
|
1 year ago |
Johannes Gäßler
|
7aed0ffe68
Fixed lookup compilation issues on Windows (#6273)
|
1 year ago |
Pierrick Hymbert
|
ea279d5609
ci : close inactive issue, increase operations per run (#6270)
|
1 year ago |
Minsoo Cheong
|
586e7bc561
sampling : deduplicated code for probability distribution access (#6240)
|
1 year ago |