Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | 5cb04dbc16 | llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) | 2 years ago |
| Georgi Gerganov | efb7bdbbd0 | metal : add im2col F32 dst support (#5132) | 2 years ago |
| JidongZhang-THU | 15606309a0 | llava : add MobileVLM support (#5132) | 2 years ago |
| Neo Zhang Jianyu | b2b9f025e7 | format license text, restore apache license by legal suggestion (#5233) | 2 years ago |
| slaren | dabcc5b471 | ggml : limit n_threads to the max n_tasks (#5238) | 2 years ago |
| 0cc4m | f8e9140cb4 | Vulkan Fixes (#5223) | 2 years ago |
| Yiming Cui | d62520eb2c | Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231) | 2 years ago |
| Neo Zhang Jianyu | 01684139c3 | support SYCL backend windows build (#5208) | 2 years ago |
| Jared Van Bortel | e8dc55d006 | kompute : llama-bench support and ggml_cpu_has_kompute() (#5226) | 2 years ago |
| Georgi Gerganov | e0085fdf7c | Revert "server : change deps.sh xxd files to string literals (#5221)" | 2 years ago |
| Georgi Gerganov | e6f291d158 | server : fix context shift (#5195) | 2 years ago |
| JohnnyB | 4003be0e5f | server : change deps.sh xxd files to string literals (#5221) | 2 years ago |
| Kawrakow | fea4fd4ba7 | ggml : fix IQ3_XXS on Metal (#5219) | 2 years ago |
| Georgi Gerganov | 8f8ddfcfad | sync : ggml (#0) | 2 years ago |
| Georgi Gerganov | 6fb50ebbf0 | gguf : fix comparison (ggml/715) | 2 years ago |
| John Balis | 625a699b54 | `ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686) | 2 years ago |
| Georgi Gerganov | a4b07c057a | gguf : add input validation, prevent integer overflows (ggml/709) | 2 years ago |
| Georgi Gerganov | 549a1e6cd5 | ci : fix yolo URLs + fix metal capture (ggml/712) | 2 years ago |
| Jack Mousseau | 5f14ee0b0c | metal : add debug capture backend function (ggml/694) | 2 years ago |
| Kawrakow | 8e14e3ddb3 | Faster AVX2 dot product for IQ2_XS (#5187) | 2 years ago |
| Kawrakow | f4d7e54974 | SOTA 3-bit quants (#5196) | 2 years ago |
| 0cc4m | 2256f36b79 | Vulkan Windows APU Memory Handling (#5199) | 2 years ago |
| Vladimir Malyutin | 7359016c7c | quantize : fix typo (#5211) | 2 years ago |
| divinity76 | 813416991a | main : allow empty --prompt-cache file (#5176) | 2 years ago |
| Romain Neutron | 5589921ef8 | readme : minor (#5204) | 2 years ago |
| Georgi Gerganov | 49f44b5c55 | readme : update hot topics | 2 years ago |
| Wu Jian Ping | 6685cc41c2 | server : improve README (#5209) | 2 years ago |
| Paul Tsochantaris | ceebbb5b21 | ggml alloc: Fix for null dereference on alloc failure (#5200) | 2 years ago |
| Jared Van Bortel | 6daa69ee81 | kompute : fix fallback to CPU (#5201) | 2 years ago |
| Jared Van Bortel | fbf1ddec69 | Nomic Vulkan backend (#4456) | 2 years ago |