3d804dec76  sync: minja (#11499)  (Olivier Chafik, 11 months ago)
ffd0821c57  vocab : correctly identify LF token for GPT-2 style BPE tokenizer (#11496)  (mgroeber9110, 11 months ago)
4314e56c4f  server : use lambda instead of std::bind (#11507)  (Daniel Bevenius, 11 months ago)
496e5bf46b  server : (docs) added response format for /apply-template [no ci] (#11503)  (Isaac McFadyen, 11 months ago)
7919256c57  readme : reference examples relative links (#11505)  (Guspan Tanadi, 11 months ago)
e0449763a4  server : update json snippets in README.md [no ci] (#11492)  (Daniel Bevenius, 11 months ago)
eb7cf15a80  server : add /apply-template endpoint for additional use cases of Minja functionality (#11489)  (Nigel Bosch, 11 months ago)
66ee4f297c  vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)  (Rémy Oudompheng, 11 months ago)
e51c47b401  server : update auto gen files comments [no ci] (#11484)  (Daniel Bevenius, 11 months ago)
2711d0215f  vulkan: Catch pipeline creation failure and print an error message (#11436)  (Jeff Bolz, 11 months ago)
f0d4b29edf  Parse https://ollama.com/library/ syntax (#11480)  (Eric Curtin, 11 months ago)
815857791d  sync : ggml  (Georgi Gerganov, 11 months ago)
1a0e87d291  ggml : add option to not print stack on abort (ggml/1081)  (William Tambellini, 1 year ago)
d2e518e9b4  ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. (ggml/1065)  (issixx, 1 year ago)
b636228c0a  embedding : enable --no-warmup option (#11475)  (Daniel Bevenius, 11 months ago)
325afb370a  llama: fix missing k_cache store for rwkv6qwen2 (#11445)  (Molly Sophia, 11 months ago)
794fe23f29  cmake: add hints for locating ggml on Windows using Llama find-package (#11466)  (Emreerdog, 11 months ago)
cf8cc856d7  server : Fixed wrong function name in llamacpp server unit test (#11473)  (peidaqi, 11 months ago)
d0c08040b6  ci : fix build CPU arm64 (#11472)  (Xuan-Son Nguyen, 11 months ago)
be5ef7963f  HIP: Supress transformation warning in softmax.cu  (uvos, 11 months ago)
cae9fb4361  HIP: Only call rocblas_initialize on rocblas versions with the multiple instantation bug (#11080)  (Nikita Sarychev, 11 months ago)
7fee2889e6  Add github protocol pulling and http:// (#11465)  (Eric Curtin, 11 months ago)
d7d1eccacc  docker: allow installing pip packages system-wide (#11437)  (Nuno, 11 months ago)
4bf3119d61  cmake : don't fail on `GGML_CPU=OFF` (#11457)  (someone13574, 11 months ago)
f643120bad  docker: add perplexity and bench commands to full image (#11438)  (Nuno, 11 months ago)
6e84b0ab8e  SYCL : SOFTMAX F16 mask support and other fixes (#11261)  (Akarshan Biswas, 11 months ago)
2b8525d5c8  Handle missing model in CLI parameters for llama-run (#11399)  (Michael Engel, 11 months ago)
a4417ddda9  Add new hf protocol for ollama (#11449)  (Eric Curtin, 11 months ago)
d6d24cd9ed  AMD: parse the architecture as supplied by gcnArchName (#11244)  (Haus1, 11 months ago)
a5203b4465  llama : minor fixes for up llama load model speed (#11448)  (lexasub, 11 months ago)