Author | Commit | Message | Date
Henrik Forstén | 6724ef1657 | Fix CudaMemcpy direction (#4599) | 2 years ago
slaren | 48b7ff193e | llama : fix platforms without mmap (#4578) | 2 years ago
Herman Semenov | 48b24b170e | ggml : add comment about backward GGML_OP_DIAG_MASK_INF (#4203) | 2 years ago
Michael Kesper | 28cb35a0ec | make : add LLAMA_HIP_UMA option (#4587) | 2 years ago
rhuddleston | f31b984898 | ci : tag docker image with build number (#4584) | 2 years ago
Deins | 2bb98279c5 | readme : add zig bindings (#4581) | 2 years ago
bobqianic | 0137ef88ea | ggml : extend `enum ggml_log_level` with `GGML_LOG_LEVEL_DEBUG` (#4579) | 2 years ago
crasm | c7e9701f86 | llama : add ability to cancel model loading (#4462) | 2 years ago
Georgi Gerganov | afefa319f1 | ggml : change ggml_scale to take a float instead of tensor (#4573) | 2 years ago
Georgi Gerganov | 769a7bc85e | gguf-py : fix broken link | 2 years ago
Georgi Gerganov | 32259b2dad | gguf : simplify example dependencies | 2 years ago
Samuel Maynard | 4a5f9d629e | ci : add `jlumbroso/free-disk-space` to docker workflow (#4150) | 2 years ago
slaren | d232aca5a7 | llama : initial ggml-backend integration (#4520) | 2 years ago
Marcus Dunn | 31f27758fa | llama : allow getting n_batch from llama_context in c api (#4540) | 2 years ago
Finn Voorhees | 56fa50819f | metal : fix `ggml_metal_log` vargs (#4373) | 2 years ago
Erik Garrison | 0f630fbc92 | cuda : ROCm AMD Unified Memory Architecture (UMA) handling (#4449) | 2 years ago
arlo-phoenix | 562cf222b5 | ggml-cuda : Fix HIP build by adding define for __trap (#4569) | 2 years ago
Jared Van Bortel | 8fe03ffdda | common : remove incorrect --model-draft default (#4568) | 2 years ago
Johannes Gäßler | 9154494808 | CUDA : mul_mat_id always on GPU for batches >= 32 (#4553) | 2 years ago
Georgi Gerganov | c083718c89 | readme : update coding guidelines | 2 years ago
howlger | 880e352277 | py : open merges file as 'utf-8' (#4566) | 2 years ago
bobqianic | 66f35a2f48 | cuda : better error message for ggml_get_rows (#4561) | 2 years ago
slaren | 1398823922 | cuda : replace asserts in wrong architecture checks with __trap (#4556) | 2 years ago
Johannes Gäßler | d3223afdad | llama : disable per-tensor info prints on model load (#4562) | 2 years ago
LoganDark | 1d7a1912ce | Fix access violation in ggml_cuda_free_data if tensor->extra is NULL (#4554) | 2 years ago
Johannes Gäßler | 799fc22689 | CUDA : faster Mixtral prompt processing (#4538) | 2 years ago
Eric Sommerlade | 328b83de23 | ggml : fixed check for _MSC_VER (#4535) | 2 years ago
arlo-phoenix | a7aee47b98 | ggml-cuda : Fix HIP build (#4528) | 2 years ago
Georgi Gerganov | 0e18b2e7d0 | llama.swiftui : add tinyllama 1.1B F16 | 2 years ago
Georgi Gerganov | 6ff39b129d | llama.swiftui : add more models | 2 years ago