Johannes Gäßler | cb5fad4c6c CUDA: refactor and optimize IQ MMVQ (#8215) | 1 year ago
Mateusz Charytoniuk | dae57a1ebc readme: add Paddler to the list of projects (#8239) | 1 year ago
Xuan Son Nguyen | 49122a873f gemma2: add sliding window mask (#8227) | 1 year ago
Roni | 0ddeff1023 readme : update tool list (#8209) | 1 year ago
Michael Francis | 3840b6f593 nix : enable curl (#8043) | 1 year ago
Georgi Gerganov | 257f8e41e2 nix : remove OpenCL remnants (#8235) | 1 year ago
iacore | 694c59cb42 Document BERT support. (#8205) | 1 year ago
zhentaoyu | 197fe6c1d7 [SYCL] Update SYCL-Rope op and Refactor (#8157) | 1 year ago
Georgi Gerganov | d0a7145ba9 flake.lock: Update (#8218) | 1 year ago
Xuan Son Nguyen | 9ef0780062 Fix new line issue with chat template, disable template when in-prefix/suffix is set (#8203) | 1 year ago
Andrei | 1c5eba6f8e llama: Add attention and final logit soft-capping, update scaling factor to Gemma2 (#8197) | 1 year ago
Xuan Son Nguyen | 72272b83a3 fix code typo in llama-cli (#8198) | 1 year ago
Olivier Chafik | 8748d8ac6f json: attempt to skip slow tests when running under emulator (#8189) | 1 year ago
Xuan Son Nguyen | 26a39bbd6b Add MiniCPM, Deepseek V2 chat template + clean up `llama_chat_apply_template_internal` (#8172) | 1 year ago
Sigbjørn Skjæret | 38373cfbab Add SPM infill support (#8016) | 1 year ago
slaren | b851b3fba0 cmake : allow user to override default options (#8178) | 1 year ago
Olivier Chafik | 139cc621e9 `json`: restore default additionalProperties to false, fix some pattern escapes (#8180) | 1 year ago
pculliton | e57dc62057 llama: Add support for Gemma2ForCausalLM (#8156) | 1 year ago
Xuan Son Nguyen | a27aa50ab7 Add missing items in makefile (#8177) | 1 year ago
Olivier Chafik | cb0b06a8a6 `json`: update grammars/README w/ examples & note about additionalProperties (#8132) | 1 year ago
loonerin | 558f44bf83 CI: fix release build (Ubuntu+Mac) (#8170) | 1 year ago
slaren | 8172ee9da9 cmake : fix deprecated option names not working (#8171) | 1 year ago
Xuan Son Nguyen | 16791b8f0b Add chatml fallback for cpp `llama_chat_apply_template` (#8160) | 1 year ago
Georgi Gerganov | ab3679112d flake.lock: Update (#8071) | 1 year ago
jukofyork | 97877eb10b Control vector loading fixes (#8137) | 1 year ago
Raj Hammeer Singh Hada | 387952651a Delete examples/llama.android/llama/CMakeLists.txt (#8165) | 1 year ago
Sigbjørn Skjæret | 6030c61281 Add Qwen2MoE 57B-A14B model identifier (#8158) | 1 year ago
Johannes Gäßler | 85a267daaa CUDA: fix MMQ stream-k for --split-mode row (#8167) | 1 year ago
kustaaya | f675b20a3b Added support for Viking pre-tokenizer (#8135) | 1 year ago
Sigbjørn Skjæret | 911e35bb8b llama : fix CodeLlama FIM token checks (#8144) | 1 year ago