How to run TopDownBFS.cpp with hybrid in-socket multithreading:

If using CMake: open the makefile that CMake generated.
If using a hand-written makefile: open that makefile.

Either way, add these two compilation flags: -DTHREADED -fopenmp
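
One way to check that both flags actually reached the compiler is a small test file built with the same flags. This is a sketch, not part of TopDownBFS.cpp: describe_build is a hypothetical helper name. It relies on THREADED being defined by -DTHREADED and on _OPENMP, which the compiler predefines when OpenMP is enabled (for example with -fopenmp).

```cpp
#include <string>

// Hypothetical helper: reports which of the two flags the compiler saw.
// THREADED is defined by -DTHREADED on the command line.
// _OPENMP is predefined by the compiler when OpenMP support is enabled.
std::string describe_build() {
    std::string s;
#ifdef THREADED
    s += "THREADED=on ";
#else
    s += "THREADED=off ";
#endif
#ifdef _OPENMP
    s += "OpenMP=on";
#else
    s += "OpenMP=off";
#endif
    return s;
}
```

Printing the returned string from main should show both entries as "on" when the flags are in place; if either shows "off", the flag was not added to the compile line that builds this file.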

Set OMP_NUM_THREADS to the number of cores in one NUMA node (socket), so that the threads of each MPI process stay within a single socket.
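
To confirm the value the OpenMP runtime will use, a check along these lines can help. It is a sketch under stated assumptions: effective_threads is a hypothetical helper name, and the non-OpenMP fallback of 1 is an illustrative choice. omp_get_max_threads() is the standard OpenMP call; it normally reflects OMP_NUM_THREADS unless the program overrides it.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: upper bound on the number of threads a parallel
// region would use. Falls back to 1 when built without OpenMP.
int effective_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

If this returns 1 on a build that was meant to be threaded, either -fopenmp is missing or OMP_NUM_THREADS was not exported to the processes launched by MPI.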

Compile as usual.
Run as usual, but with a smaller process grid: divide the number of MPI processes by OMP_NUM_THREADS. For instance, -np 16 will run on 96 cores if OMP_NUM_THREADS=6 (16 processes x 6 threads each).
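
The core-count arithmetic can be written out as a minimal sketch. The names cores_used and ranks_for_cores are hypothetical and do not come from the code base; the -1 return for an uneven split is an illustrative convention.

```cpp
// Total cores occupied by a hybrid run: each MPI process runs
// threads_per_rank OpenMP threads.
constexpr int cores_used(int mpi_ranks, int threads_per_rank) {
    return mpi_ranks * threads_per_rank;
}

// Value to pass to -np for a given core budget.
// Returns -1 when the cores cannot be split evenly among the threads.
constexpr int ranks_for_cores(int total_cores, int threads_per_rank) {
    return (threads_per_rank <= 0 || total_cores % threads_per_rank != 0)
               ? -1
               : total_cores / threads_per_rank;
}

// The example from the text: -np 16 with OMP_NUM_THREADS=6 covers 96 cores.
static_assert(cores_used(16, 6) == 96, "16 ranks x 6 threads = 96 cores");
static_assert(ranks_for_cores(96, 6) == 16, "96 cores / 6 threads = 16 ranks");
```

The static_assert lines are checked at compile time, so the file simply fails to build if the arithmetic is wrong.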