RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list (#9387)

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Molly Sophia 1 year ago
parent
commit
0b4ac75772
2 changed files with 4 additions and 0 deletions
  1. convert_hf_to_gguf.py (+2 -0)
  2. src/llama.cpp (+2 -0)

+ 2 - 0
convert_hf_to_gguf.py

@@ -302,6 +302,8 @@ class Model:
                             gguf.MODEL_TENSOR.TIME_MIX_FIRST,
                             gguf.MODEL_TENSOR.TIME_MIX_W1,
                             gguf.MODEL_TENSOR.TIME_MIX_W2,
+                            gguf.MODEL_TENSOR.TIME_MIX_DECAY_W1,
+                            gguf.MODEL_TENSOR.TIME_MIX_DECAY_W2,
                         )
                     )
                     or not new_name.endswith(".weight")

+ 2 - 0
src/llama.cpp

@@ -17530,6 +17530,8 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
         quantize &= name.find("time_mix_first.weight") == std::string::npos;
         quantize &= name.find("time_mix_w1.weight") == std::string::npos;
         quantize &= name.find("time_mix_w2.weight") == std::string::npos;
+        quantize &= name.find("time_mix_decay_w1.weight") == std::string::npos;
+        quantize &= name.find("time_mix_decay_w2.weight") == std::string::npos;
 
         // do not quantize relative position bias (T5)
         quantize &= name.find("attn_rel_b.weight") == std::string::npos;
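
Both hunks apply the same idea: a tensor is kept out of quantization when its name contains one of a fixed set of substrings. A minimal Python sketch of that check follows. The helper `should_quantize`, the tuple `EXCLUDED_SUBSTRINGS`, and the example tensor names are hypothetical and for illustration only; they are not part of llama.cpp or convert_hf_to_gguf.py, and the real functions apply additional conditions not shown here.

```python
# Hypothetical illustration of the name-based exclusion shown in the diff.
# The substrings are copied from the src/llama.cpp hunk; everything else
# (names, structure) is an assumption made for this sketch.
EXCLUDED_SUBSTRINGS = (
    "time_mix_first.weight",
    "time_mix_w1.weight",
    "time_mix_w2.weight",
    "time_mix_decay_w1.weight",  # added by this commit
    "time_mix_decay_w2.weight",  # added by this commit
)


def should_quantize(name: str) -> bool:
    """Return False if the tensor name contains any excluded substring."""
    return not any(sub in name for sub in EXCLUDED_SUBSTRINGS)


if __name__ == "__main__":
    # Example tensor names are made up for demonstration.
    for tensor_name in ("blk.0.time_mix_decay_w1.weight", "blk.0.attn_q.weight"):
        print(tensor_name, should_quantize(tensor_name))
```

Like the chain of `name.find(...) == std::string::npos` checks in the C++ hunk, this matches substrings anywhere in the name, so the per-layer prefix does not need to be known in advance.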