How to run TopDownBFS.cpp with hybrid in-socket multithreading:

1. Find your build file. If using CMake, look at the makefile that CMake generated. If using a plain makefile, open that makefile.
2. Either way, add these two compilation flags: -DTHREADED -fopenmp
3. Set OMP_NUM_THREADS to your NUMA-node concurrency.
4. Compile.
5. Run as before, but with a smaller grid: divide the process count you pass to -np by OMP_NUM_THREADS. For instance, -np 16 will run on 96 cores if OMP_NUM_THREADS=6.
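
The core-count arithmetic in the example above can be sketched as a small C++ helper. This is only an illustration and not part of TopDownBFS.cpp; the function names total_cores and procs_for_cores are hypothetical, and the sketch assumes every process runs exactly OMP_NUM_THREADS threads, one per core.

```cpp
#include <cassert>

// Hypothetical helper (not part of TopDownBFS.cpp): total cores occupied when
// each of `procs` processes runs `omp_threads` OpenMP threads, one per core.
constexpr int total_cores(int procs, int omp_threads) {
    return procs * omp_threads;
}

// Hypothetical helper: the value to pass to -np for a given core budget.
// Returns -1 when the budget is not a positive multiple of the thread count.
constexpr int procs_for_cores(int cores, int omp_threads) {
    return (omp_threads > 0 && cores > 0 && cores % omp_threads == 0)
               ? cores / omp_threads
               : -1;
}

// The example from the text: -np 16 with OMP_NUM_THREADS=6 uses 96 cores.
static_assert(total_cores(16, 6) == 96, "16 processes x 6 threads = 96 cores");
static_assert(procs_for_cores(96, 6) == 16, "96 cores / 6 threads = 16 processes");
```

Going the other way, procs_for_cores(96, 6) gives the 16 to pass to -np, and a core budget that does not divide evenly by the thread count is reported as -1 rather than rounded.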