فهرست منبع

ggml: Fix data race in ggml threadpool (#11736)

After the barrier in last iteration is executed, still the loop termination
condition will be executed. However main thread can destroy the cgraph object
and its nodes already, then another thread will access it, but the thing is already gone.
Also trouble can happen when n_nodes == 0 or abort is called, but I'm not sure if the
prior situation is possible.

Last syncronization should be done after the loop to ensure the cgraph/cplan won't be
accessed after the main thread exits from the function.
Karol Kontny 11 ماه پیش
والد
کامیت
4d3465c5ae
1فایلهای تغییر یافته به همراه5 افزوده شده و 1 حذف شده
  1. 5 1
      ggml/src/ggml-cpu/ggml-cpu.c

+ 5 - 1
ggml/src/ggml-cpu/ggml-cpu.c

@@ -13856,9 +13856,13 @@ static thread_ret_t ggml_graph_compute_thread(void * data) {
             tp->ec    = GGML_STATUS_ABORTED;
         }
 
-        ggml_barrier(state->threadpool);
+        if (node_n + 1 < cgraph->n_nodes) {
+            ggml_barrier(state->threadpool);
+        }
     }
 
+    ggml_barrier(state->threadpool);
+
     return 0;
 }