Author | Commit | Message | Date
Eve | adc5dd92e8 | vulkan: scale caching for k quants + misc fixes (#11081) | 1 year ago
Georgi Gerganov | f11cfdfd7f | ci : use -no-cnv in gguf-split tests (#11254) | 1 year ago
Junil Kim | 1d8504338e | fix: ggml: fix vulkan-shaders-gen build (#10448) | 1 year ago
Johannes Gäßler | 432df2d5f9 | RoPE: fix back, CUDA support for back + noncont. (#11240) | 1 year ago
Daniel Bevenius | 0ccd7f3eb2 | examples : add embd_to_audio to tts-outetts.py [no ci] (#11235) | 1 year ago
Akarshan Biswas | f446c2cf6a | SYCL: Add gated linear attention kernel (#11175) | 1 year ago
Xuan Son Nguyen | b4d92a59a2 | ci : add -no-cnv for tests (#11238) | 1 year ago
Georgi Gerganov | bbf3e55e35 | vocab : add dummy tokens for "no_vocab" type (#11231) | 1 year ago
ebraminio | c5bf0d1bd7 | server : Improve code snippets direction between RTL text (#11221) | 1 year ago
Olivier Chafik | 091592d758 | Refactor test-chat-template.cpp (#11224) | 1 year ago
Georgi Gerganov | 44d1e796d0 | sync : ggml | 1 year ago
Georgi Gerganov | a4f3f5d8e6 | scripts : sync gguf (cont) | 1 year ago
Georgi Gerganov | 48e1ae0e61 | scripts : sync gguf | 1 year ago
Georgi Gerganov | d00a80e89d | scripts : sync opencl | 1 year ago
ebraminio | 504af20ee4 | server : (UI) Improve messages bubble shape in RTL (#11220) | 1 year ago
Xuan Son Nguyen | 84a44815f7 | cli : auto activate conversation mode if chat template is available (#11214) | 1 year ago
Andreas Kieslinger | 39509fb082 | cuda : CUDA Graph Compute Function Refactor (precursor for performance improvements) (#11042) | 1 year ago
Georgi Gerganov | a29f0870d4 | contrib : add naming guidelines (cont) (#11177) | 1 year ago
ebraminio | 437e05f714 | server : (UI) Support for RTL text as models input or output (#11208) | 1 year ago
Georgi Gerganov | ca001f6656 | contrib : add naming guidelines (cont) (#11177) | 1 year ago
Xuan Son Nguyen | 00b4c3da62 | common : support tag-based --hf-repo like on ollama (#11195) | 1 year ago
Georgi Gerganov | 7426a26b24 | contrib : add naming guidelines (#11177) | 1 year ago
Daniel Bevenius | 8f70fc3d1b | llama : remove 'd' from bad special token log (#11212) | 1 year ago
Radoslav Gerganov | 1244cdcf14 | ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL (#11211) | 1 year ago
Eric Curtin | 924518e2e5 | Reset color before we exit (#11205) | 1 year ago
Xuan Son Nguyen | 9a483999a6 | llama : fix chat template gguf key (#11201) | 1 year ago
Georgi Gerganov | 08f10f69c3 | llama : remove notion of CLS token (#11064) | 1 year ago
Georgi Gerganov | afa8a9ec9b | llama : add `llama_vocab`, functions -> methods, naming (#11110) | 1 year ago
Vinesh Janarthanan | c05e8c9934 | gguf-py: fixed local detection of gguf package (#11180) | 1 year ago
Daniel Bevenius | 2739a71e4b | convert : sort print supported models [no ci] (#11179) | 1 year ago