Johannes Gäßler
|
f64d44a9b9
CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590)
|
2 years ago |
byte-6174
|
b19edd54d5
Adding support for llama2.c models (#2559)
|
2 years ago |
Equim
|
53dc399472
server: fixed wrong variable name in timing json (#2579)
|
2 years ago |
DannyDaemonic
|
9ca4abed89
Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.
|
2 years ago |
Christian Demsar
|
e59fcb2bc1
Add --n-predict -2 for stopping generation on full context (#2565)
|
2 years ago |
Martin Krasser
|
1638757767
Fix grammar-based sampling issue in server (#2566)
|
2 years ago |
Sam Spilsbury
|
916a9acdd0
ggml-alloc: Don't try to re-use buffers of external tensors (#2562)
|
2 years ago |
grahameth
|
ea04a4ca19
add log_callback to llama_context_params for custom logging. (#2234)
|
2 years ago |
Johannes Gäßler
|
25d43e0eb5
CUDA: tuned mul_mat_q kernels (#2546)
|
2 years ago |
Martin Krasser
|
f5bfea0580
Allow passing grammar to completion endpoint (#2532)
|
2 years ago |
Johannes Gäßler
|
acfc5478ff
CUDA: tighter VRAM scratch size for 65b/70b (#2551)
|
2 years ago |
chaihahaha
|
7ed8d1fe7f
llm.vim : multiline autocompletion, get rid of "^@" (#2543)
|
2 years ago |
Georgi Gerganov
|
e7f94d6fdc
vim : bring back simple llm.vim example
|
2 years ago |
AustinMroz
|
2d7baaf50f
vim : streaming and more (#2495)
|
2 years ago |
klosax
|
f3c3b4b167
Add --rope-scale parameter (#2544)
|
2 years ago |
Georgi Gerganov
|
93356bdb7a
ggml : mul mat tweaks (#2372)
|
2 years ago |
Georgi Gerganov
|
60baff7c85
ggml : pad result of ggml_nbytes()
|
2 years ago |
Georgi Gerganov
|
9082b5dfbf
ggml : change params pointer (style change) (#2539)
|
2 years ago |
Georgi Gerganov
|
99d29c0094
ggml : sync (custom ops) (#2537)
|
2 years ago |
Johannes Gäßler
|
3d9a551816
Fixed mmap prefetch for GPU offloading (#2529)
|
2 years ago |
Georgi Gerganov
|
f6f9896ac3
metal : fix out-of-bounds access + inc concurrency nodes (#2416)
|
2 years ago |
GiviMAD
|
34a14b28ff
[Makefile] Move ARM CFLAGS before compilation (#2536)
|
2 years ago |
Henri Vasserman
|
7297128db8
[Zig] Rewrite build for Zig 0.11 (#2514)
|
2 years ago |
DannyDaemonic
|
86c3219895
console : fix issue related to Windows 11 PowerShell console mode persistence (#2521)
|
2 years ago |
Keiichi Tabata
|
2e8265ae17
convert.py : add missing abstract methods for quantized data (#2491)
|
2 years ago |
Johannes Gäßler
|
f514d1b306
CUDA: faster k-quant mul_mat_q kernels (#2525)
|
2 years ago |
Jonas Wunderlich
|
332311234a
fix firefox autoscroll (#2519)
|
2 years ago |
Cebtenzzre
|
182af739c4
server: regenerate completion.js.hpp (#2515)
|
2 years ago |
Cebtenzzre
|
4329d1acb0
CUDA: use min compute capability of GPUs actually used (#2506)
|
2 years ago |
Cebtenzzre
|
02f9d96a86
CUDA: check if event is NULL before cudaStreamWaitEvent (#2505)
|
2 years ago |