Georgi Gerganov | 9a818f7c42 | scripts : improve get-pg.sh (#4838) | 2 years ago
iohub | 18adb4e9bb | readme : add 3rd party collama reference to UI list (#4840) | 2 years ago
Georgi Gerganov | d9653894df | scripts : script to get Paul Graham essays in txt format (#4838) | 2 years ago
Behnam M | 128de3585b | server : update readme about token probs (#4777) | 2 years ago
Zsapi | 8c58330318 | server : add api-key flag to documentation (#4832) | 2 years ago
Georgi Gerganov | 18c2e1752c | ggml : fix vld1q_s8_x4 32-bit compat (#4828) | 2 years ago
Johannes Gäßler | 8f900abfc0 | CUDA: faster softmax via shared memory + fp16 math (#4742) | 2 years ago
howlger | 1fc2f265ff | common : fix the short form of `--grp-attn-w`, not `-gat` (#4825) | 2 years ago
Georgi Gerganov | a9a8c5de3d | readme : add link to SOTA models | 2 years ago
Kawrakow | dd5ae06405 | SOTA 2-bit quants (#4773) | 2 years ago
Georgi Gerganov | 668b31fc7d | swift : exclude ggml-metal.metal from the package (#4822) | 2 years ago
Georgi Gerganov | 42ea63c5a3 | llama.swiftui : update readme | 2 years ago
Georgi Gerganov | 52531fdff8 | main : add self-extend support (#4815) | 2 years ago
Georgi Gerganov | b0034d93ce | examples : add passkey test (#3856) | 2 years ago
Lars Grammel | b7e7982953 | readme : add lgrammel/modelfusion JS/TS client for llama.cpp (#4814) | 2 years ago
slaren | 226460cc0d | llama-bench : add no-kv-offload parameter (#4812) | 2 years ago
Johannes Gäßler | d5a410e855 | CUDA: fixed redundant value dequantization (#4809) | 2 years ago
Georgi Gerganov | 9dede37d81 | llama : remove unused vars (#4796) | 2 years ago
Georgi Gerganov | 3c36213df8 | llama : remove redundant GQA check (#4796) | 2 years ago
Alex Azarov | 72d8407b36 | llama.swiftui : use llama.cpp as SPM package (#4804) | 2 years ago
Georgi Gerganov | d117d4dc5d | llama : print tensor meta for debugging | 2 years ago
Alex Azarov | 3418c03ecc | llama.swiftui : add visionOS target (#4805) | 2 years ago
Konstantin Zhuravlyov | 63ee677efd | ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (#4787) | 2 years ago
Georgi Gerganov | 67984921a7 | server : fix n_predict check (#4798) | 2 years ago
Daniel Illescas Romero | c75ca5d96f | llama.swiftui : use correct pointer for llama_token_eos (#4797) | 2 years ago
Georgi Gerganov | 96e80dabc6 | examples : improve base-translate.sh script (#4783) | 2 years ago
a-n-n-a-l-e-e | eec22a1c63 | cmake : check for openblas64 (#4134) | 2 years ago
Ikko Eltociear Ashimine | be36bb946a | flake.nix : fix typo (#4700) | 2 years ago
Georgi Gerganov | 91d38876df | metal : switch back to default.metallib (ggml/681) | 2 years ago
Georgi Gerganov | d061bf9405 | ggml : fix q2_k bpw in comments (ggml/680) | 2 years ago