Tuesday, July 31, 2012

Using Grid Engine in the Xeon Phi Environment

The Intel Xeon Phi (Codename: MIC - Many Integrated Core) is an interesting HPC hardware platform. While it is not in production yet, a lot of porting and optimization work has already been done for Xeon Phi. At the beginning of this year, we started a conversation with Intel. We had received requests for Xeon Phi support at SC11, and we wanted to make sure that Open Grid Scheduler/Grid Engine works in the Xeon Phi environment.

While a lot of information is still under NDA, many papers have already been published on the Internet, and Intel engineers themselves have disclosed information about the Xeon Phi architecture and programming models. For example, it is by now widely known that Xeon Phi runs an embedded Linux OS image on the board, and that users can in fact log on to the board and use it as a multi-core Linux machine.

One Xeon Phi execution model is the more traditional offload execution model, where applications running on the host processors offload sections of code to the Xeon Phi accelerator. Intel also defines the Language Extensions for Offload (LEO) compiler directives to ease porting. The other is the standalone execution model, where users execute code directly and natively on the board, and the host processors are not involved in the computation.
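To make the offload model a bit more concrete, here is a minimal sketch of what a LEO-style offload region can look like in C. The function name scale_on_coprocessor and its parameters are made up for illustration, and the pragma follows the form documented for Intel's compilers; a compiler that does not recognize the pragma simply ignores it and runs the loop on the host.

```c
#include <stddef.h>

/* Hypothetical example: multiply every element of src[] by factor and
 * store the result in dst[].
 *
 * With a compiler that supports LEO, the pragma asks the runtime to copy
 * src[] to the coprocessor, run the block there, and copy dst[] back.
 * Other compilers ignore the unknown pragma, so the loop runs on the host. */
void scale_on_coprocessor(double *src, double *dst, size_t n, double factor)
{
    #pragma offload target(mic) in(src : length(n)) out(dst : length(n))
    {
        for (size_t i = 0; i < n; i++) {
            dst[i] = src[i] * factor;
        }
    }
}
```

Calling scale_on_coprocessor(src, dst, 3, 2.0) with src holding 1.0, 2.0, 3.0 leaves 2.0, 4.0, 6.0 in dst, whether or not the work was actually offloaded. In the standalone model no such directive is needed, since the whole program is built for and started on the board.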

Open Grid Scheduler/Grid Engine can be easily configured to handle the offload execution model, as a Xeon Phi used this way is very similar to a GPU device. Grid Engine can schedule jobs to the hosts that have Xeon Phi boards and make sure that the hardware resources are not oversubscribed. The standalone execution model is more interesting: it offers the Linux OS environment that most HPC users are familiar with, but it adds a level of indirection to job execution. We don't have the design finalized yet, as the software environment has not been released to the public, but our plan is to support both execution models in a future version of Open Grid Scheduler/Grid Engine.
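The "not oversubscribed" part boils down to counting boards per host, much like a consumable resource. The sketch below is only a toy model of that bookkeeping and is not Grid Engine code; the struct phi_host and the functions reserve_boards and release_boards are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* Hypothetical host record: how many Xeon Phi boards the host has and how
 * many of them are currently reserved by running jobs. */
struct phi_host {
    const char *name;
    int boards_total;
    int boards_in_use;
};

/* Reserve `boards` boards on the first host that still has enough free ones.
 * Returns the index of the chosen host, or -1 if the request cannot be
 * satisfied right now (the job would have to stay pending). */
int reserve_boards(struct phi_host *hosts, size_t host_count, int boards)
{
    if (boards <= 0) {
        return -1;
    }
    for (size_t i = 0; i < host_count; i++) {
        int free_boards = hosts[i].boards_total - hosts[i].boards_in_use;
        if (free_boards >= boards) {
            hosts[i].boards_in_use += boards;
            return (int)i;
        }
    }
    return -1;
}

/* Give boards back when a job finishes; the count never drops below zero. */
void release_boards(struct phi_host *host, int boards)
{
    if (boards <= 0) {
        return;
    }
    host->boards_in_use -= boards;
    if (host->boards_in_use < 0) {
        host->boards_in_use = 0;
    }
}
```

With one host that has a single board and another that has two, a one-board job lands on the first host, a two-board job lands on the second, and a third request is refused until some job releases its boards.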