Neo Zhang Jianyu
|
715641391d
Support multiple GPUs (split mode) on SYCL backend (#5806)
|
1 year ago |
crasm
|
9bf297a02b
workflows : remove nocleanup arg for check-requirements.sh (#5826)
|
1 year ago |
Tushar
|
cb5e8f7fc4
build(nix): Introduce flake.formatter for `nix fmt` (#5687)
|
1 year ago |
nold
|
da3b9ba2b7
convert-hf-to-gguf : require einops for InternLM2ForCausalLM (#5792)
|
1 year ago |
Sourab Mangrulkar
|
c29af7e225
llama : add StarCoder2 support (#5795)
|
1 year ago |
Georgi Gerganov
|
38d16b1426
server : remove api_like_OAI.py proxy script (#5808)
|
1 year ago |
ddpasa
|
c2224f003b
ggml-vulkan: fix VULKAN_CHECK_RESULTS flag, which was previously broken (#5813)
|
1 year ago |
kunal-vaishnavi
|
e743386728
gemma : fix bfloat16 -> float16 conversion issue (#5810)
|
1 year ago |
Miwa / Ensan
|
f49a535686
common : fix flag `--logits-all` to `--all-logits` (#5805)
|
1 year ago |
Pierrick Hymbert
|
3ab8b3a92e
llama : cleanup unused mmq flags (#5772)
|
1 year ago |
Douglas Hanley
|
9600d59e01
unicode : switch to multimap based nfd_map (#5799)
|
1 year ago |
Pierrick Hymbert
|
5cb02b4a01
server: allow to override threads server pool with --threads-http (#5794)
|
1 year ago |
Eve
|
6ea0f010ff
ci : add Ubuntu 22 Vulkan CI run (#5789)
|
1 year ago |
Georgi Gerganov
|
f105471ef6
server : fix newlines in help (#5785)
|
1 year ago |
AidanBeltonS
|
38d1521608
[SYCL] Use batched mul_mat pathway (#5591)
|
1 year ago |
Xuan Son Nguyen
|
052051d8ae
Server: normalize naming (#5779)
|
1 year ago |
Marcus Dunn
|
d5ab29757e
llama : constified `llama_set_state_data`'s `src` (#5774)
|
1 year ago |
Georgi Gerganov
|
87c91c0766
ci : reduce 3b ppl chunks to 1 to avoid timeout (#5771)
|
1 year ago |
Eve
|
317709b2a8
make portability_enumeration_ext apple only (#5757)
|
1 year ago |
Georgi Gerganov
|
08c5ee87e4
llama : remove deprecated API (#5770)
|
1 year ago |
Georgi Gerganov
|
78aacf3634
awq-py : remove (#5768)
|
1 year ago |
Georgi Gerganov
|
8c0e8f4e73
sync : ggml
|
1 year ago |
slaren
|
2774b0c974
add google magika inference example (ggml/748)
|
1 year ago |
UEXTM.com
|
5f70671856
Introduce backend GUIDs (ggml/743)
|
1 year ago |
Xuan Son Nguyen
|
a693bea1e6
server : hit Ctrl+C twice to exit (#5734)
|
1 year ago |
compilade
|
adcb12a9ba
llama : fix non-quantization of expert gating tensors (#5754)
|
1 year ago |
Douglas Hanley
|
177628bfd8
llama : improve BERT tokenization (#5740)
|
1 year ago |
Daniel Bevenius
|
6c4416868d
readme : add link to LLaVA 1.6 models (#5758)
|
1 year ago |
Jorge A
|
efc72253f7
server : add "/chat/completions" alias for "/v1/...` (#5722)
|
1 year ago |
Kawrakow
|
7c4263d426
ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (#5760)
|
1 year ago |