Howard Su
|
051c70dcd5
llama: Don't double count the sampling time (#2107)
|
2 years ago |
Johannes Gäßler
|
9e4475f5cf
Fixed OpenCL offloading prints (#2082)
|
2 years ago |
Nigel Bosch
|
7f0e9a775e
embd-input: Fix input embedding example unsigned int seed (#2105)
|
2 years ago |
Georgi Gerganov
|
b472f3fca5
readme : add link web chat PR
|
2 years ago |
Georgi Gerganov
|
ed9a54e512
ggml : sync latest (new ops, macros, refactoring) (#2106)
|
2 years ago |
jwj7140
|
f257fd2550
Add an API example using server.cpp similar to OAI. (#2009)
|
2 years ago |
Tobias Lütke
|
7ee76e45af
Simple webchat for server (#1998)
|
2 years ago |
Henri Vasserman
|
acc111caf9
Allow old Make to build server. (#2098)
|
2 years ago |
ZhouYuChen
|
23c7c6fc91
Update Makefile: clean simple (#2097)
|
2 years ago |
Erik Scholz
|
698efad5fb
CI: make the brew update temporarily optional. (#2092)
|
2 years ago |
Govlzkoy
|
14a2cc71f6
[ggml] fix index for ne03 value in ggml_cl_mul_f32 (#2088)
|
2 years ago |
Henri Vasserman
|
1cf14ccef1
fix server crashes (#2076)
|
2 years ago |
Howard Su
|
cc45a7feb8
Fix crash of test-tokenizer-0 under Debug build (#2064)
|
2 years ago |
Howard Su
|
55dbb915cc
[llama] No need to check file version when loading vocab score (#2079)
|
2 years ago |
WangHaoranRobin
|
d7d2e6a0f0
server: add option to output probabilities for completion (#1962)
|
2 years ago |
Georgi Gerganov
|
46088f7231
ggml : fix build with OpenBLAS (close #2066)
|
2 years ago |
Johannes Gäßler
|
0bc2cdfc87
Better CUDA synchronization logic (#2057)
|
2 years ago |
Johannes Gäßler
|
befb3a3562
Test-based VRAM scratch size + context adjustment (#2056)
|
2 years ago |
Daniel Drake
|
b213227067
cmake : don't force -mcpu=native on aarch64 (#2063)
|
2 years ago |
Aaron Miller
|
2f8cd979ec
metal : release buffers when freeing metal context (#2062)
|
2 years ago |
Judd
|
471aab6e4c
convert : add support of baichuan-7b (#2055)
|
2 years ago |
Georgi Gerganov
|
463f2f4c4f
llama : fix return value of llama_load_session_file_internal (#2022)
|
2 years ago |
Rand Xie
|
cb44dbc7de
llama : catch llama_load_session_file_internal exceptions (#2022)
|
2 years ago |
Georgi Gerganov
|
79f634a19d
embd-input : fix returning ptr to temporary
|
2 years ago |
Georgi Gerganov
|
04606a1599
train : fix compile warning
|
2 years ago |
Qingyou Meng
|
b1ca8f36a9
ggml : disable GGML_TASK_INIT and GGML_TASK_FINALIZE by default (#1995)
|
2 years ago |
Howard Su
|
b8c8dda75f
Use unsigned for random seed (#2006)
|
2 years ago |
LostRuins
|
96a712ca1b
Porting the improved K-Quant CUDA kernels to OpenCL (#1966)
|
2 years ago |
m3ndax
|
d3494bb86b
llama : replacing auto &kv with const auto &kv (#2041)
|
2 years ago |
Salvador E. Tropea
|
5b351e94d0
cuda : remove nchannels_x argument from mul_mat_vec_nc_f16_f32 (#2028)
|
2 years ago |