jiahao su
|
466ea66f33
CANN: Add Ascend CANN build ci (#10217)
|
11 months ago |
uvos
|
5f0db9522f
hip : Add hipGraph and VMM support to ROCM (#11362)
|
11 months ago |
Johannes Gäßler
|
c5d9effb49
CUDA: fix FP16 cuBLAS GEMM (#11396)
|
11 months ago |
uvos
|
9fbadaef4f
rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (#11356)
|
11 months ago |
Georgi Gerganov
|
9755129c27
release : pack /lib in the packages (#11392)
|
11 months ago |
Jafar Uruç
|
a07c2c8a52
docs : Update readme to build targets for local docker build (#11368)
|
11 months ago |
Johannes Gäßler
|
8137b4bb2b
CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380)
|
11 months ago |
Bernhard M. Wiedemann
|
1af6945eb0
cmake : avoid -march=native when reproducible build is wanted (#11366)
|
11 months ago |
Eric Curtin
|
01f37edf1a
Update llama-run README.md (#11386)
|
11 months ago |
stduhpf
|
c07e87f38b
server : (webui) put DeepSeek R1 CoT in a collapsible <details> element (#11364)
|
11 months ago |
Jeff Bolz
|
564804b79b
tests: fix some mul_mat test gaps (#11375)
|
11 months ago |
Eric Curtin
|
05f63cc9ee
Update documentation (#11373)
|
11 months ago |
Eric Curtin
|
f7fb43cd0b
Add -ngl (#11372)
|
11 months ago |
Xuan Son Nguyen
|
5845661640
server : add more clean up when cancel_tasks is called (#11340)
|
11 months ago |
Eric Curtin
|
f211d1dc10
Treat hf.co/ prefix the same as hf:// (#11350)
|
11 months ago |
amd-dwang
|
955a6c2d91
Vulkan-run-test: fix mmq_wg_denoms (#11343)
|
11 months ago |
Jeff Bolz
|
1971adf55e
vulkan: sort shaders for more deterministic binary (#11315)
|
11 months ago |
Jeff Bolz
|
5245729e33
vulkan: fix diag_mask_inf (#11323)
|
11 months ago |
Diego Devesa
|
6152129d05
main : update README documentation for batch size (#11353)
|
11 months ago |
Georgi Gerganov
|
16d3df7ab0
readme : add plugin links (#11355)
|
11 months ago |
Diego Devesa
|
12c2bdf2de
server : fix draft context not being released (#11354)
|
11 months ago |
Olivier Chafik
|
c64d2becb1
`minja`: sync at https://github.com/google/minja/commit/0f5f7f2b3770eb682fbc11763266d45204173686 (#11352)
|
11 months ago |
Jiří Podivín
|
96f4053934
Adding logprobs to /v1/completions (#11344)
|
1 year ago |
Olivier Chafik
|
a94f3b2727
`common`: utils to split / join / repeat strings (from json converter) (#11342)
|
1 year ago |
tc-mb
|
3e3357fd77
llava : support Minicpm-omni (#11289)
|
1 year ago |
Olivier Chafik
|
6171c9d258
Add Jinja template support (#11016)
|
1 year ago |
Xuan Son Nguyen
|
e28245f35f
export-lora : fix tok_embd tensor (#11330)
|
1 year ago |
Radoslav Gerganov
|
6da5bec81c
rpc : better caching of the base buffer pointer (#11331)
|
1 year ago |
Eric Curtin
|
2e2f8f093c
linenoise.cpp refactoring (#11301)
|
1 year ago |
Georgi Gerganov
|
2139667ec4
metal : fix out-of-bounds write (#11314)
|
1 year ago |