Monday, July 9, 2012

Optimizing Grid Engine for AMD Bulldozer Systems

The AMD Bulldozer series (including the recently released Piledriver) is very interesting from a micro-architecture point of view. A Bulldozer processor can have as many as 16 cores, and the cores are grouped into modules. In the current generation each module contains 2 cores, so an 8-module processor has 16 cores, a 4-module processor has 8 cores, and so on.

  • Traditional SMT (e.g. Intel Hyper-Threading) pretty much duplicates only the register file and the processor front-end. Because most of the execution pipeline is shared between the pair of SMT threads, performance can be greatly affected when the threads compete for hardware resources.
  • For Bulldozer, only the Floating-Point Unit, L1 instruction cache, L2 cache, and a small part of the execution pipeline are shared, making resource contention a much smaller concern.
A lot of HPC clusters turn SMT off completely, as performance is the main objective for those installations. Bulldozer processors, on the other hand, are less affected by an OS scheduler that is unaware of the processor topology, but it still helps if system software understands the architecture of the processor. For example, the patched Windows scheduler that understands the AMD hardware can boost performance by 10%.
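To make the module idea concrete, here is a minimal Python sketch of what a module-aware placement could look like: it hands out one core per module first, and only starts pairing threads inside a module (where they would share the FPU and L2 cache) once every module is in use. The function names and the assumption that the two cores of a module carry consecutive IDs are mine, for illustration only; this is not how any particular OS scheduler is implemented.

```python
def cores_by_module(num_modules, cores_per_module=2):
    """Return a list of modules, each a list of core IDs.

    Assumption: cores are numbered consecutively within a module
    (module 0 -> cores 0 and 1, module 1 -> cores 2 and 3, ...).
    """
    return [
        [m * cores_per_module + c for c in range(cores_per_module)]
        for m in range(num_modules)
    ]


def pick_cores(num_threads, num_modules, cores_per_module=2):
    """Choose cores for a job, using one core per module before doubling up."""
    total = num_modules * cores_per_module
    if not 0 <= num_threads <= total:
        raise ValueError(f"num_threads must be between 0 and {total}")
    modules = cores_by_module(num_modules, cores_per_module)
    chosen = []
    # Pass 0 takes the first core of every module, pass 1 the second, ...
    for slot in range(cores_per_module):
        for module in modules:
            if len(chosen) == num_threads:
                return chosen
            chosen.append(module[slot])
    return chosen


# A 4-thread job on an 8-module (16-core) processor gets 4 separate modules.
print(pick_cores(4, 8))   # [0, 2, 4, 6]
# A 10-thread job fills all 8 modules, then shares 2 of them.
print(pick_cores(10, 8))  # [0, 2, 4, 6, 8, 10, 12, 14, 1, 3]
```

A topology-unaware placement would simply use cores 0-3 for the 4-thread job, packing it into two modules and leaving six modules idle.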

So what does this mean for Grid Engine? The Open Grid Scheduler project implemented Grid Engine Job Binding with the hwloc library. This work was initially done for the AMD Magny-Cours Opteron 6100 series, because the original PLPA library used by Sun Grid Engine was not able to handle newer AMD processors. We also added Linux cpuset support, which is used when the Grid Engine cgroups Integration is enabled and the cpuset controller is present. In both cases, the execution daemon (execd) essentially becomes the local scheduler that dispatches processes to the execution cores. With a smarter execd, we can speed up job execution with no changes to any application code.
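The idea of binding a job to cores without touching the application can be illustrated with a few lines of Python. This is only a rough sketch using the Linux affinity call exposed by Python's standard library; it is not the execd implementation, which goes through hwloc or cpusets. The helper name run_bound is hypothetical.

```python
import os
import subprocess


def run_bound(argv, cores, **run_kwargs):
    """Run argv with its CPU affinity restricted to `cores` (Linux only).

    The application itself is unchanged: the binding is applied in the
    child process after fork() and before exec().
    """
    core_set = set(cores)

    def bind():
        # pid 0 means "the calling process", i.e. the freshly forked child.
        os.sched_setaffinity(0, core_set)

    return subprocess.run(argv, preexec_fn=bind, check=True, **run_kwargs)
```

For example, run_bound(["./my_solver"], [0, 2, 4, 6]) would start a (hypothetical) ./my_solver binary restricted to cores 0, 2, 4 and 6, which under the consecutive-numbering assumption are four separate Bulldozer modules.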

(And yes, this feature will be available in the GE 2011.11 update 1 release.)