Aarni Koskela
|
e4386f417f
server : add a subtle loading animation to the edit box (#2466)
|
2 years ago |
Jiahao Li
|
35195689cd
2x faster (rms) norm cuda kernels (3.7% e2e improvement) (#2985)
|
2 years ago |
slaren
|
cf9b08485c
ggml-alloc : use virtual memory for measurement (#2973)
|
2 years ago |
Georgi Gerganov
|
47068e5170
speculative : PoC for speeding-up inference via speculative sampling (#2926)
|
2 years ago |
Georgi Gerganov
|
8f429fa511
perplexity : fix ETA by warming up the model with an empty run
|
2 years ago |
Kerfuffle
|
6519e9c99c
gguf(python): Fix special vocab handling when id < 0 (#2984)
|
2 years ago |
Georgi Gerganov
|
b7f2aa9e51
metal : restore 363f0bf and fix reduce in F16_F32 kernels (#2986)
|
2 years ago |
Alon
|
73a12a6344
cov : disable comment in PRs (#2989)
|
2 years ago |
opparco
|
3730134776
llama : fix bpe tokenize from byte (#2889)
|
2 years ago |
Georgi Gerganov
|
d9151e6f57
metal : revert 6af0bab until we fix it
|
2 years ago |
Alon
|
afc43d5f82
cov : add Code Coverage and codecov.io integration (#2928)
|
2 years ago |
Wentai Zhang
|
6460f758db
opencl : fix a bug in ggml_cl_pool_malloc() for ggml_cl_mul_mat_f32() (#2955)
|
2 years ago |
Kawrakow
|
ca82cf7bac
metal : more optimizations (#2959)
|
2 years ago |
kchro3
|
6a31a3bd98
swift : add support for k-quants (#2983)
|
2 years ago |
Kerfuffle
|
cff7b0bf07
convert.py : BPE fixes (#2938)
|
2 years ago |
Ido S
|
340af42f09
docs : add `catai` to `README.md` (#2967)
|
2 years ago |
momonga
|
c42f0ec6b3
examples : fix gpt-neox (#2943)
|
2 years ago |
kchro3
|
2753415afd
swift : add missing c file to Package.swift (#2978)
|
2 years ago |
Cebtenzzre
|
bc054af97a
make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS (#2886)
|
2 years ago |
Kerfuffle
|
3358c381f6
logging: Fix creating empty file even when disabled (#2966)
|
2 years ago |
bandoti
|
52315a4216
readme : update clblast instructions (#2903)
|
2 years ago |
Karsten Weiss
|
8b56b4f2c3
metal : show all Metal device instances in the system (#2952)
|
2 years ago |
Jhen-Jie Hong
|
21f3d1be86
k-quants : fix build on armv7 (android only) (#2920)
|
2 years ago |
Jhen-Jie Hong
|
571083f508
server : avoid aniprompt in probabilities of final response (#2849)
|
2 years ago |
Engininja2
|
f04d002844
cuda : vsubss4 for older versions of ROCm/clang (#2942)
|
2 years ago |
ZHAOKAI WANG
|
69fdbb9abc
readme : quick start command fix (#2908)
|
2 years ago |
Kerfuffle
|
5d6f19f16b
Allow quantize to only copy tensors, some other improvements (#2931)
|
2 years ago |
Georgi Gerganov
|
0d58936686
llama2c : rename function
|
2 years ago |
Cebtenzzre
|
6c9c23429b
make : use unaligned vector moves on MinGW (#2945)
|
2 years ago |
m3ndax
|
ee8654bcd0
minor : add const qualifiers (#2853)
|
2 years ago |