slaren | 335acd2ffd | fix convert-lora-to-ggml.py (#2738) | 2 years ago
klosax | 5290c38e6e | main : insert bos if no tokens (#2727) | 2 years ago
akawrykow | cc34dbda96 | gitignore : fix for windows (#2729) | 2 years ago
Cebtenzzre | 7c2227a197 | chmod : make scripts executable (#2675) | 2 years ago
JohnnyB | f19dca04ea | devops : RPM Specs (#2723) | 2 years ago
Kawrakow | 8207214b6a | Fix values shown in the quantize tool help (#2735) | 2 years ago
Kawrakow | 62959e740e | Strided perplexity (#2714) | 2 years ago
IgnacioFDM | 7f7ddd5002 | Fix ggml to gguf conversion on Windows (#2733) | 2 years ago
Xiao-Yong Jin | b8ad1b66b2 | server : allow json array in prompt or content for direct token input (#2306) | 2 years ago
Evan Jones | f5fe98d11b | docs : add grammar docs (#2701) | 2 years ago
Kerfuffle | 777f42ba18 | Improve handling of special tokens in GGML to GGUF converter (#2725) | 2 years ago
goerch | 46ef5b5fcf | llama : fix whitespace escaping in tokenizer (#2724) | 2 years ago
Johannes Gäßler | c63bb1d16a | CUDA: use mul_mat_q kernels by default (#2683) | 2 years ago
Alex Petenchea | 3b6cfe7c92 | convert.py : clarifying error message (#2718) | 2 years ago
Jiahao Li | 800c9635b4 | Fix CUDA softmax by subtracting max value before exp (#2665) | 2 years ago
Georgi Gerganov | deb7dfca4b | gguf : add ftype meta info to the model (#2710) | 2 years ago
Kawrakow | bac66994cf | Quantization imrovements for k_quants (#2707) | 2 years ago
slaren | 519c981f8b | embedding : evaluate prompt in batches (#2713) | 2 years ago
slaren | 1123f7fbdf | ggml-cuda : use graph allocator (#2684) | 2 years ago
Georgi Gerganov | ef3f333d37 | ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709) | 2 years ago
slaren | 8e4364f2af | llama-bench : minor fixes (#2695) | 2 years ago
Kylin | 1e3bc523d8 | ggml : support CUDA's half type for aarch64(#1455) (#2670) | 2 years ago
Shouzheng Liu | 14b1d7e6f7 | metal : add missing barriers for mul-mat (#2699) | 2 years ago
Jhen-Jie Hong | 226255b44e | server : fallback to default if client param is null (#2688) | 2 years ago
Kerfuffle | 930523c8e1 | Fix convert-llama-ggmlv3-to-gguf.py vocab conversion (#2698) | 2 years ago
Georgi Gerganov | c8dba409e6 | py : remove obsolete script | 2 years ago
Georgi Gerganov | 6381d4e110 | gguf : new file format with flexible meta data (beta) (#2398) | 2 years ago
Shouzheng Liu | dadbed99e6 | metal : fix synchronization in new matrix multiplication kernel (#2686) | 2 years ago
Kawrakow | cb1c0727bd | HellaSwag: split token evaluation into batches if needed (#2681) | 2 years ago
slaren | 9e232f0234 | ggml : move all type info to ggml_type_traits (#2663) | 2 years ago