
convert : use n_groups instead of hardcoded values in reshape (#18929)

* convert : use n_groups instead of hardcoded values in reshape

This commit modifies the conversion script for NemotronHModel to use
the 'n_groups' hyperparameter and to let the last dimension be
inferred, by passing -1, when reshaping the 'mixer.norm.weight' tensor.

* use self.n_group instead of self.hparams["n_groups"]
Daniel Bevenius 1 week ago
Parent
Commit
7dee9ff59a
1 changed file with 1 addition and 1 deletion

+ 1 - 1
convert_hf_to_gguf.py

@@ -9212,7 +9212,7 @@ class NemotronHModel(GraniteHybridModel):
                 return [(mapped_name, reshaped_data)]
 
             if name.endswith("mixer.norm.weight"):
-                reshaped_data = data_torch.reshape(8, 512)
+                reshaped_data = data_torch.reshape(self.n_group, -1)
                 mapped_name = self.map_tensor_name(name)
                 return [(mapped_name, reshaped_data)]
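
To illustrate what passing -1 does in the new reshape call, here is a minimal pure-Python sketch of how a single -1 entry is resolved from the element count. The helper name `infer_reshape` is hypothetical and not part of convert_hf_to_gguf.py. The 4096-element case is only inferred from the previously hardcoded (8, 512) shape, and the second case uses made-up sizes.

```python
from math import prod


def infer_reshape(numel: int, shape: tuple[int, ...]) -> tuple[int, ...]:
    """Resolve at most one -1 entry in `shape` so it holds `numel` elements.

    Hypothetical helper mirroring the usual reshape rule; it is not the
    code used by the conversion script.
    """
    if shape.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")
    # Product of the explicitly given dimensions (1 if there are none).
    known = prod(d for d in shape if d != -1)
    if -1 not in shape:
        if known != numel:
            raise ValueError(f"shape {shape} does not hold {numel} elements")
        return shape
    if known == 0 or numel % known != 0:
        raise ValueError(f"cannot split {numel} elements into shape {shape}")
    return tuple(numel // known if d == -1 else d for d in shape)


# Old hardcoded case: 8 groups over 8 * 512 elements.
print(infer_reshape(8 * 512, (8, -1)))  # (8, 512)

# A model with different (made-up) sizes no longer needs a code change.
print(infer_reshape(4 * 256, (4, -1)))  # (4, 256)
```

With the group count taken from the hyperparameters and the last dimension inferred, the reshape follows the tensor's actual size instead of assuming 8 groups of 512.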