Tuesday, May 11, 2010
ScicomP 16
This year, the ScicomP and SP-XXL user groups are holding their annual summer meeting in San Francisco, CA, USA. During the ScicomP conference, natural scientists from various fields present work carried out with the help of the latest supercomputers from IBM.
The intention of this conference is to engage natural scientists in discussions with computer scientists, supercomputer administrators, and IBM representatives. This helps computing centers as well as IBM to better understand the needs and requirements of the scientific communities. At the same time, it highlights the extraordinary work that can be carried out on today’s most advanced computing systems.
Throughout the conference, there are plenty of opportunities for discussion and the exchange of ideas between scientists, supercomputing center staff, and IBM representatives.
Monday, May 25, 2009
Friday, May 22, 2009
ScicomP 16 - Next year's meeting location
ScicomP 16, next year's meeting, is going to take place in San Francisco, CA. It will be hosted by Lawrence Berkeley National Laboratory.
May 10th - 14th
The meeting venue is
Rooms will be $119+tax for ScicomP and the room block will be held until April 18th, 2010. There is a 72-hour cancellation policy.
The venue is easily reachable by public transit (BART).
Last day
For the last day I only have one photo, since I had to leave for the airport.
Piyush Chaudhary, aka PC, from IBM gave an entertaining overview and update on IBM's HPC software and hardware.
Thursday, May 21, 2009
IBM Updates
Thursday afternoon was primarily dedicated to updates on compilers and the MPI environment, including a feedback session.
Barna Bihari from ICON Consulting started the session with a look into IBM's transactional memory compiler. This alphaWorks project allows the developer to specify transactional memory regions (-> link).
Roch Archambault, IBM, giving his compiler update
Chulho Kim, IBM, who updated us on MPI and led the feedback session
Blue Gene Morning
Thursday morning was mainly focused on topics around IBM's BlueGene computer architecture.
Todd Inglett from IBM gave an overview of what has been changed and updated recently. (To keep the IBM speakers out of trouble in case they accidentally disclosed information that would otherwise require an NDA at this point in time, I will not cover the IBM talks in more detail.)
Pascal Vezolle, IBM France, talking about mixed OpenMP/MPI approaches on Blue Gene for CFD applications
Partitioned Global Address Space (PGAS) languages are without a doubt a hot topic.
Rajesh Nishtala from UC Berkeley gave an insight into the implementation of those languages, with a focus on Berkeley's own UPC on the BlueGene/P architecture. He finds that UPC outperforms MPI on the BlueGene/P.
Markus Blatt from IWR, University of Heidelberg, Germany, introduced DUNE - a partial differential equation solver framework which has recently been ported to the BlueGene/P architecture.
Memory debugging on parallel architectures is often a tedious task. Ed Hinckel from TotalView Technologies showed what can be achieved on this front using the memory debugging features of TotalView on the BlueGene architecture.
Finally, David Klepacki, IBM, gave an update on IBM's ACTC work and tools.
Wednesday, May 20, 2009
Wednesday: the afternoon
The Wednesday sessions continued in the afternoon with an overview of three top projects: Blue Waters, Sequoia, and Roadrunner.
Blue Waters is a project that aims to provide scientists with a compute power of at least 1 PetaFlop - sustained. Bill Kramer from NCSA stressed that, next to peak performance, effectiveness, reliability, and consistency are equally important when offering supercomputing services. But the Blue Waters project is much more than just a "PetaFlops and beyond" project. An important part is devoted to educating users in developing petascale applications. That also includes providing tools such as sophisticated compilers and performance analysis tools.
"Watch the word sustained", Bill Kramer points out. He compares peak performance values to the speedometer of a car which shows 180 mph, a speed you can never reach on a normal street - "Linpack is like NASCAR".
Blue Waters will be an open platform. The first users have not been selected yet, but that is expected to happen soon. They will have to demonstrate that their applications are able to run on such a facility.
In place of Tom Spelce, who unexpectedly could not make it to the conference, John Westlund from LLNL gave an update on the Sequoia project. This project focuses on creating a supercomputer with 20 PetaFlops peak. Six target applications have been chosen for the initial phase of the project. All of them are materials science codes, ranging from quantum molecular dynamics to a dislocation dynamics code for the strength of materials under high pressure. That choice is not surprising, since Sequoia is planned to contain 1.6 million cores, and the MD method is known to scale out to these numbers. Nevertheless, the extremely high number of compute cores puts considerable pressure on the development of the MPI communication library in particular.
Ben Bergen from Los Alamos National Laboratory gave an update on the status of the Roadrunner system. He presented VPIC, a plasma physics code which implements a particle-in-cell algorithm. It reaches an extraordinary 11% of the peak performance of the Roadrunner system. Part of the secret behind it is a triple buffer approach when using the Cell PCIe boards. Nevertheless, due to a missing instruction cache, it is necessary to use overlays for the program text. That means mimicking a cache in software, which eats up performance. The results were submitted for the Gordon Bell Prize. Unfortunately, they narrowly missed it.
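For readers unfamiliar with the triple buffer idea mentioned above, here is a minimal sketch in Python. It is not VPIC's code and says nothing about how the Roadrunner team implemented it. The function name `triple_buffered` and its parameters are made up for illustration. On real Cell hardware the three stages would run concurrently using asynchronous transfers, whereas this model runs them one after another and only shows how the three buffers rotate.

```python
def triple_buffered(chunks, compute):
    """Process chunks with three rotating buffers (sequential model).

    In every step one buffer receives new input, one is computed on,
    and one has its finished result written back. The three stages
    always touch different buffers, which is what would allow them
    to overlap on hardware with asynchronous transfers.
    """
    buffers = [None, None, None]
    results = []
    n = len(chunks)
    # Two extra steps drain the pipeline after the last chunk is loaded.
    for step in range(n + 2):
        load_slot = step % 3
        compute_slot = (step - 1) % 3
        store_slot = (step - 2) % 3
        # Write back the result that was computed in the previous step.
        if step >= 2:
            results.append(buffers[store_slot])
        # Compute on the chunk that was loaded in the previous step.
        if 1 <= step <= n:
            buffers[compute_slot] = compute(buffers[compute_slot])
        # Load the next chunk into the buffer that was just written back.
        if step < n:
            buffers[load_slot] = chunks[step]
    return results


if __name__ == "__main__":
    print(triple_buffered([1, 2, 3], lambda x: x * 2))
```

With only two buffers, loading and computing can overlap, but the write-back has to wait for a free buffer. The third buffer removes that wait, at the cost of more local memory.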
Ben points out that during the development and porting phase his team was not satisfied with the performance of the PPE component of the Cell processor. Compared to the AMD processors on the main board, the PPE is simply not powerful enough, he argues. Although he is aware that this is difficult for IBM, he would like to see more support for PCIe-based Cell boards as used in Roadrunner.
Cerrillos is a Roadrunner-like system with 162 TF which has recently been installed at LANL to allow unclassified research outside the fence. A call for compute time has been issued, the evaluation of the proposals has just been finished, and allocations will be granted soon.