Pierrick Hymbert
|
24ecb58168
Revert "server bench: fix bench not waiting for model load (#7284)" (#7334)
|
1 year ago |
Radoslav Gerganov
|
9afdffe70e
rpc : get available mem for the CPU backend
|
1 year ago |
Radoslav Gerganov
|
3b3963c55c
rpc : add command line arg for specifying backend memory
|
1 year ago |
Jared Van Bortel
|
dda64fc17c
convert : get general.name from model dir, not its parent (#5615)
|
1 year ago |
Herman Semenov
|
0350f58152
grammar, json, llama: replace push on emplace if it possible (#7273)
|
1 year ago |
Vaibhav Srivastav
|
ad52d5c259
doc: add references to hugging face GGUF-my-repo quantisation web tool. (#7288)
|
1 year ago |
Max Krasnyansky
|
172b78210a
ci: fix bin/Release path for windows-arm64 builds (#7317)
|
1 year ago |
Max Krasnyansky
|
13ad16af12
Add support for properly optimized Windows ARM64 builds with LLVM and MSVC (#7191)
|
1 year ago |
Daniel Bevenius
|
8f7080bf48
readme : remove stray double quote (#7310)
|
1 year ago |
kunnis
|
e1b40ac3b9
ggml : use dynamic thread scheduling for matrix multiplication (#6915)
|
1 year ago |
agray3
|
dc020985b8
Avoid unnecessarily disabling CUDA graphs (#7302)
|
1 year ago |
slaren
|
344f9126cc
ggml : tag ggml_tensor::backend as deprecated (#7290)
|
1 year ago |
AidanBeltonS
|
9a17ab914b
Add missing " (#7303)
|
1 year ago |
dm4
|
ea3b0590ee
embedding : free the batch after execution (#7297)
|
1 year ago |
Georgi Gerganov
|
29499bb593
sync : ggml
|
1 year ago |
John Balis
|
48aa8fd1f2
ggml : add `ggml_upscale_ext` (ggml/814)
|
1 year ago |
Johannes Gäßler
|
583fd6b000
server bench: fix bench not waiting for model load (#7284)
|
1 year ago |
Georgi Gerganov
|
9f773486ab
script : sync ggml-rpc
|
1 year ago |
Georgi Gerganov
|
e8a7fd4fb0
metal : support FA without mask + add asserts (#7278)
|
1 year ago |
Georgi Gerganov
|
a5e3fde857
sync : ggml
|
1 year ago |
Georgi Gerganov
|
f308ea7059
metal : tune soft_max number of threads (whisper/0)
|
1 year ago |
Georgi Gerganov
|
c3c88f296a
ggml : try fix ppc64 (whisper/0)
|
1 year ago |
Przemysław Pawełczyk
|
182adefcf3
ggml : expose SSE3 and SSSE3 for MSVC when AVX is available (whisper/2128)
|
1 year ago |
Hong Bo PENG
|
0d26d8ccd8
ggml : optimize for ppc64le using VSX intrinsics (ggml/784)
|
1 year ago |
Steve Grubb
|
4f0263633b
server: free sampling contexts on exit (#7264)
|
1 year ago |
Brian
|
1265c670fd
Revert "move ndk code to a new library (#6951)" (#7282)
|
1 year ago |
Radoslav Gerganov
|
5e31828d3e
ggml : add RPC backend (#6829)
|
1 year ago |
slaren
|
541600201e
llama : disable pipeline parallelism with nkvo (#7265)
|
1 year ago |
Elton Kola
|
efc8f767c8
move ndk code to a new library (#6951)
|
1 year ago |
Haggai Nuchi
|
e0f556186b
Add left recursion check: quit early instead of going into an infinite loop (#7083)
|
1 year ago |