Wednesday, May 20, 2009

Wednesday: the afternoon

The Wednesday sessions continued in the afternoon with an overview of three top projects: Blue Waters, Sequoia, and Roadrunner.

Blue Waters is a project that aims to provide scientists with at least 1 PetaFlop of sustained compute power. Bill Kramer from NCSA stressed that, next to (peak) performance, effectiveness, reliability, and consistency are equally important when offering supercomputing services. But the Blue Waters project is much more than just a "PetaFlops and beyond" project. An important part is devoted to educating users in developing petascale applications. That also includes providing tools such as sophisticated compilers and performance analysis tools.
"Watch the word sustained", Bill Kramer points out. He compares peak performance values to the speedometer of a car that shows 180 mph, a speed you can never reach on a normal street - "Linpack is like NASCAR".
Blue Waters will be an open platform. The first users have not been selected yet, but that is expected to happen soon. They will have to demonstrate that their applications are able to run on such a facility.

In place of Tom Spelce, who unexpectedly could not make it to the conference, John Westlund from LLNL gave an update on the Sequoia project. This project focuses on creating a supercomputer with 20 PetaFlops peak. Six target applications have been chosen for the initial phase of the project. All of them are materials science codes, ranging from quantum molecular dynamics to a dislocation dynamics code for materials under high pressure. That choice is not surprising, since Sequoia is planned to contain 1.6 million cores and the MD method is known to scale out to these numbers. Nevertheless, the extremely high number of compute cores puts considerable pressure on the development of the MPI communication library in particular.

Ben Bergen from Los Alamos National Laboratory gave an update on the status of the Roadrunner system. He presented VPIC, a plasma physics code which implements a particle-in-cell algorithm. It reaches an extraordinary 11% of the peak performance of the Roadrunner system. Part of the secret behind it is a triple-buffer approach when using the Cell PCIe boards. Nevertheless, due to a missing instruction cache, it is necessary to use overlays for the program text. That means mimicking a cache in software, which eats up performance. The results were submitted for the Gordon Bell Prize. Unfortunately, the prize was narrowly missed.
Ben points out that during the development and porting phase his team was not satisfied with the performance of the PPE component of the Cell processor. Compared to the AMD processors on the main board, the PPE is simply not powerful enough, he argues. Although he is aware that it is difficult for IBM, he would like to see more support for dealing with PCIe-based Cell boards as used in Roadrunner.
Cerrillos is a Roadrunner-like system with 162 TF that has recently been installed at LANL to allow unclassified research outside the fence. A call for compute time has been issued, the evaluation of the proposals has just been finished, and allocations will be granted soon.
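
The triple-buffer approach mentioned for VPIC was not described in detail in the talk, so the following is only a generic sketch of the idea, not the VPIC implementation. It assumes three buffers that rotate through the roles "being loaded", "being computed on", and "being stored", so that on real hardware the data transfers could overlap with computation. The names `process` and `kernel` are made up for illustration, and the three stages run one after another here instead of concurrently.

```python
def process(chunks, kernel):
    """Push chunks through a load -> compute -> store pipeline using 3 buffers.

    In each step the three stages touch three different buffers, which is
    what would allow them to run at the same time on real hardware
    (e.g. DMA in, compute, DMA out). Here they are simply executed in turn.
    """
    buffers = [None, None, None]
    results = []
    n = len(chunks)

    # Two extra steps are needed to drain the pipeline at the end.
    for step in range(n + 2):
        # Store stage: write back the chunk that was computed one step ago.
        if 0 <= step - 2 < n:
            results.append(buffers[(step - 2) % 3])

        # Compute stage: work on the chunk that was loaded one step ago.
        if 0 <= step - 1 < n:
            slot = (step - 1) % 3
            buffers[slot] = kernel(buffers[slot])

        # Load stage: bring the next chunk into the buffer that is now free.
        if step < n:
            buffers[step % 3] = chunks[step]

    return results


if __name__ == "__main__":
    print(process([1, 2, 3, 4], lambda x: x * 2))
```

With only two buffers, either the load or the store would have to wait for the compute stage; the third buffer removes that dependency at the cost of extra memory.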

Keynote Wednesday

This year, Wednesday is the day of the joint ScicomP and SPXXL sessions. The day was opened by Francesc Subirada, Associate Director of BSC. He welcomed all participants of ScicomP and SPXXL and gave a brief overview of the role of BSC as a supercomputing center in Europe and of the role of supercomputing in Europe in general. He especially underlined the importance of the European PRACE project.

The first talk of the keynote session was given by Jesus Labarta. He introduced the project MareIncognito - the undiscovered, unconquered ocean. The aim of this project is to design a 10+ Petaflops supercomputer by 2010/11 in cooperation with IBM. This supercomputer is going to be the Spanish contribution to PRACE.
Based on the Cell processor, it will be usable for a broad spectrum of applications. The project is subdivided into six focus areas, ranging from programming models through performance analysis tools to hardware such as interconnect and processor.
Within this project, the StarSs (Star Superscalar) suite is being developed; CellSs and SMPSs are already available today.

John Romein of Stichting ASTRON (Netherlands Institute for Radio Astronomy) presented the LOFAR project - the LOw Frequency ARray. LOFAR is composed of antennas distributed over large parts of the Netherlands and North-Western Germany. Combined, they form the world's largest radio telescope. The data from those antennas are gathered and centrally processed using a BlueGene/P supercomputer (2.5 racks right now).

Thomas Lippert, head of the Jülich Supercomputing Center in Germany, talked about the ongoing changes at his center, which is about to become the most powerful supercomputing site in Europe. When stressing the importance of supercomputing and the race for ever more powerful systems, he stated that concerning climate change, "we need a crystal ball and our crystal ball is supercomputing."
In his opinion, it is unknown today whether applications will ever exist that can make efficient use of exascale compute power. Since there are scientific problems that demand even more compute power, the motivation must be to find ways to leverage this compute power with new and improved algorithms and applications.

Jülich is going to be a European tier-0 computing center in 2010. In preparation for that, the German Research School for Simulation Sciences has been founded, which now attracts top scientists from around the world.
The Blue Gene is currently being upgraded and will be in service within the next few weeks, offering 1020 TFlops peak performance - the first European PetaFlop system.

In addition, QPace was mentioned. QPace is a development project aimed at providing a computer specially suited for QCD applications. It is expected to deliver 25.6 TFlops per rack.