A 30-second compilation of ScicomP 15 photos. Enjoy!
Monday, May 25, 2009
Friday, May 22, 2009
ScicomP 16 - Next year's meeting location
ScicomP 16, next year's meeting, will take place in San Francisco, CA. It will be hosted by Lawrence Berkeley National Laboratory.
May 10th - 14th
The meeting venue is
Rooms will be $119 plus tax for ScicomP attendees, and the room block will be held until April 18th, 2010. There is a 72-hour cancellation policy.
The venue is easily reachable by public transit (BART).
Last day
For the last day I have only one photo, since I had to leave for the airport.
Piyush Chaudhary, aka PC, from IBM gave an entertaining overview of and update on IBM's HPC software and hardware.
Thursday, May 21, 2009
IBM Updates
Thursday afternoon was primarily dedicated to updates on the compilers and the MPI environment, including a feedback session.
Barna Bihari from ICON Consulting started the session with a look at IBM's transactional memory compiler. This alphaWorks project allows the developer to specify transactional memory regions (-> link)
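To give an idea of the programming model, here is a minimal sketch in C. The #pragma tm_atomic spelling follows my recollection of IBM's alphaWorks examples, so treat the exact syntax as an assumption:

    /* Sketch of a transactional region: both updates commit together,
       or the transaction rolls back and retries. The pragma spelling
       may differ between compiler releases. */
    void transfer(long *accounts, int from, int to, long amount)
    {
        #pragma tm_atomic
        {
            accounts[from] -= amount;
            accounts[to]   += amount;
        }
    }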
Roch Archambault, IBM, giving his compiler update
Chulho Kim, IBM, who updated us on MPI and led the feedback session
Blue Gene Morning
Thursday morning was mainly focused on topics around IBM's BlueGene computer architecture.
Todd Inglett from IBM gave an overview of recent changes and updates. (To keep the IBM speakers out of trouble in case they accidentally provided information that would have required an NDA at this point in time, I will not cover the IBM talks in more detail.)
Pascal Vezolle, IBM France, talking about mixed OpenMP/MPI approaches on Blue Gene for CFD applications
Partitioned Global Address Space (PGAS) languages are without a doubt a hot topic.
Rajesh Nishtala from UC Berkeley gave an insight into the implementation of those languages, with a focus on Berkeley's own UPC on the BlueGene/P architecture. He finds that UPC outperforms MPI on the BlueGene/P.
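For readers unfamiliar with UPC: it extends C with a global address space partitioned across threads. A minimal sketch, assuming a standard UPC 1.2 compiler such as Berkeley UPC:

    /* UPC sketch: one logically shared array, physically distributed
       round-robin across threads. upc_forall hands iteration i to the
       thread with affinity to &a[i], so writes stay local. */
    #include <upc_relaxed.h>
    #include <stdio.h>

    #define N 1024
    shared double a[N];

    int main(void)
    {
        upc_forall (int i = 0; i < N; i++; &a[i])
            a[i] = 0.5 * i;            /* each thread writes its part */
        upc_barrier;
        if (MYTHREAD == 0)             /* thread 0 reads a remote element */
            printf("a[%d] = %g\n", N - 1, a[N - 1]);
        return 0;
    }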
Markus Blatt from IWR, University of Heidelberg, Germany, introduced DUNE, a framework for solving partial differential equations which has recently been ported to the BlueGene/P architecture.
Memory debugging on parallel architectures is often a tedious task. Ed Hinckel from TotalView Technologies showed what can be achieved on this front using the memory debugging features of TotalView on the BlueGene architecture.
Finally, David Klepacki, IBM, gave an update on IBM's ACTC work and tools.
Wednesday, May 20, 2009
Wednesday: the afternoon
The Wednesday sessions continued in the afternoon with overviews of three top projects: Blue Waters, Sequoia, and Roadrunner.
Blue Waters is a project that aims to provide scientists with a compute power of at least 1 PetaFlop - sustained. Bill Kramer from NCSA stressed that, next to (peak) performance, effectiveness, reliability, and consistency are equally important when offering supercomputing services. But the Blue Waters project is much more than just a PetaFlops-and-beyond project. An important part is the education of users in developing petascale applications. That also includes providing tools such as sophisticated compilers, performance analysis suites, etc.
"Watch the word sustained", Bill Kramer points out. He compares Peak performance values to the speedometer of a car which shows 180mph but you can never reach it on a normal street - "Linpack is like NASCAR".
Blue Waters will be an open platform. The first users have not been selected yet; that is expected to happen soon. They will have to demonstrate that their applications can run on such a facility.
In place of Tom Spelce, who unexpectedly could not make it to the conference, John Westlund from LLNL gave an update on the Sequoia project. This project aims to create a supercomputer with 20 PetaFlops peak. Six target applications have been chosen for the initial phase of the project, all of them materials science codes, ranging from quantum molecular dynamics to a dislocation dynamics code for the strength of materials under high pressure. That choice is not surprising, since Sequoia is planned to contain 1.6 million cores, and the MD method is known to scale out to these numbers. Nevertheless, the extremely high number of compute cores puts considerable pressure especially on the development of the MPI communication library.
Ben Bergen from Los Alamos National Laboratory gave an update on the status of the Roadrunner system. He presented VPIC, a plasma physics code that implements a particle-in-cell algorithm. It reaches an extraordinary 11% of peak on the Roadrunner system. Part of the secret behind this is a triple-buffer approach when using the Cell PCIe boards. Nevertheless, due to the missing instruction cache it is necessary to use overlays for the program text, that is, to mimic a cache in software, which eats up performance. The results were submitted for the Gordon Bell Prize; unfortunately, it was just missed.
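The triple-buffer idea is to overlap three stages: fetching chunk k+1, computing on chunk k, and writing back chunk k-1. A generic sketch in C, with dma_get/dma_put/dma_wait as hypothetical stand-ins for the platform's asynchronous transfer primitives (on Cell SPEs these would be the mfc_get/mfc_put intrinsics):

    /* Hypothetical async-transfer helpers; tag names a transfer group
       that dma_wait() blocks on until all its transfers complete. */
    void dma_get(void *dst, const void *src, int n, int tag);
    void dma_put(void *dst, const void *src, int n, int tag);
    void dma_wait(int tag);
    void compute(char *chunk, int n);

    enum { NBUF = 3, CHUNK = 16384 };
    static char buf[NBUF][CHUNK];

    void process_stream(const char *src, char *dst, int nchunks)
    {
        dma_get(buf[0], src, CHUNK, 0);              /* prefetch chunk 0     */
        for (int k = 0; k < nchunks; k++) {
            int cur = k % NBUF, nxt = (k + 1) % NBUF;
            if (k + 1 < nchunks) {
                dma_wait(nxt);                       /* old write-back done? */
                dma_get(buf[nxt], src + (k + 1) * CHUNK, CHUNK, nxt);
            }
            dma_wait(cur);                           /* chunk k has arrived  */
            compute(buf[cur], CHUNK);                /* overlaps transfers   */
            dma_put(dst + k * CHUNK, buf[cur], CHUNK, cur);
        }
        for (int t = 0; t < NBUF; t++)
            dma_wait(t);                             /* drain write-backs    */
    }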
Ben points out that during the development and porting phase his team was not satisfied with the performance of the PPE component of the Cell processor. Compared to the AMD processors on the main board, the PPE is simply not powerful enough, he argues. Although he is aware that it is difficult for IBM, he would like to see more support for dealing with the PCIe-based Cell boards used in Roadrunner.
Cerrillos is a Roadrunner-like system with 162 TFlops which has recently been installed at LANL to allow unclassified research outside the fence. A call for compute time had been issued; the evaluation of the proposals has just finished, and allocations will be granted soon.
Keynote Wednesday
This year, Wednesday was the day of the joint ScicomP and SPXXL sessions. The day was opened by Francesc Subirada, Associate Director of BSC. He welcomed all participants of ScicomP and SPXXL and gave a brief overview of the role of BSC as a supercomputing center in Europe and of the role of supercomputing in Europe in general. He especially underlined the importance of the European PRACE project.
The first talk of the keynote session was given by Jesus Labarta. He introduced the project MareIncognito - the undiscovered, unconquered ocean. The aim of this project is to design a 10+ PetaFlops supercomputer by 2010/11 in cooperation with IBM. This supercomputer is going to be the Spanish contribution to PRACE.
Based on the Cell processor, it is intended to be usable for a broad spectrum of applications. The project is subdivided into six focus areas, ranging from programming models, through performance analysis tools, to hardware such as the interconnect and the processor.
Within this project the StarSs (Star Superscalar) suite is being developed; CellSs and SMPSs are already available today.
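These programming models let the programmer annotate plain C functions as tasks; the runtime then builds a dependency graph from the declared inputs and outputs and schedules the tasks asynchronously. A sketch in SMPSs-style syntax, following BSC's published examples (details may vary between releases):

    /* SMPSs-style task: each call to scale() is spawned as a task,
       with dependencies derived from the input/output clauses.
       A real program brackets execution with #pragma css start/finish. */
    #pragma css task input(a) output(b)
    void scale(float a[256], float b[256])
    {
        for (int i = 0; i < 256; i++)
            b[i] = 2.0f * a[i];
    }

    void run(float in[4][256], float out[4][256])
    {
        for (int i = 0; i < 4; i++)
            scale(in[i], out[i]);   /* tasks run concurrently */
        #pragma css barrier          /* wait for all tasks     */
    }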
John Romein, Stichting ASTRON (Netherlands Institute for Radio Astronomy), presented the LOFAR project - the LOw Frequency ARray. LOFAR is composed of antennas distributed over large parts of the Netherlands and north-western Germany. Combined, they form the world's largest radio telescope. The data from those antennas are gathered and centrally processed on a BlueGene/P supercomputer (currently 2.5 racks).
Thomas Lippert, head of the Jülich Supercomputing Centre in Germany, talked about the ongoing changes at his center, which is about to become the most powerful supercomputing site in Europe. Stressing the importance of supercomputing and the race for ever more powerful systems, he stated that, concerning climate change, "we need a crystal ball and our crystal ball is supercomputing."
In his opinion, it is unknown today whether applications will ever exist that can make efficient use of exascale compute power. Since there are scientific problems that demand even more compute power, the motivation must be to find ways to leverage that compute power with new and improved algorithms and applications.
Jülich is going to become a European tier-0 computing center in 2010. In preparation, the German Research School for Simulation Sciences has been founded, which now attracts top scientists from around the world.
The Blue Gene is currently being upgraded and will be in service within the next weeks, offering 1020 TFlops peak performance - the first European PetaFlop system.
In addition, QPACE was mentioned. QPACE is a development project aiming to provide a computer specially suited to QCD applications. It is expected to deliver 25.6 TFlops per rack.
Tuesday, May 19, 2009
Hybrid Afternoon
Tuesday afternoon was essentially dedicated to topics around heterogeneous hardware, with a special focus on the Cell/B.E. architecture.
Wei Huang from NCAR introduced a new tool that helps to get organized with Fortran codes. It greatly supports a developer in understanding other people's code as well as in identifying design flaws in their own.
It makes it easy to visualize the call tree and module use. In addition, it provides things like variable and function use counts.
Holger Brunst from the Technical University of Dresden showed how performance analysis data gathered on Cell code can be visualized and what information is available from it.
Implementing the same mathematical method on both the Cell and GPU platforms gives, despite the amount of work involved, deep insight into the performance differences, especially considering that data has to be moved to and from the processing elements. Jose gave an analysis and comparison of how this affects his work.
Godehard Sutmann from the Jülich Supercomputing Centre (JSC) then gave an introduction to methods for simulating dilute polymer solutions while taking hydrodynamic interactions into account.
This was followed by a talk by Annika Schiller (also JSC) about an implementation of parts of the outlined algorithms. To realize her project she used BSC's superscalar framework for Cell.
Astronomy Morning
Honoring the International Year of Astronomy, Tuesday morning was dedicated to astronomy talks. Prof. Ibanez led us through the morning.
Miguel Angel Aloy tried to make plausible why it is worth spending 1.5 million CPU-hours on relativistic astrophysics.
Data processing for the GAIA mission is a challenging task. Xavier Luri of the University of Barcelona gave an interesting insight into how this challenge is going to be met.
Simulations of the inspiral and merger of unequal-mass neutron star binaries were the topic of the talk by José A. Font from the Universidad de Valencia.
Finding interesting phenomena in cosmological simulations that can be used to understand observations is an important methodology in astrophysics.
Steffen Knollmann from the Universidad Autonoma de Madrid highlighted the various aspects of such analysis tasks on the basis of cosmological N-body simulations.
Relativistic hydrodynamic flows were simulated by Manuel Perucho from the Universidad de Valencia. He presented methods and results of simulations of the interaction of hydrodynamic jets with stellar winds, as well as the evolution of such jets.
The morning ended with a lively discussion of what users would like to see improved by hardware and software vendors, including the tools providers from the various computing centers present.
Most of the concerns dealt with the advent of multicore processors exhibiting a large number of compute cores. Here, it was agreed that hybrid OpenMP/MPI codes seem to be the way forward for addressing the multicore challenge.
Nevertheless, various people pointed out that OpenMP/MPI hybridization is a challenge in itself. For most applications, employing this kind of hybridization means a complete redesign of the (MPI-centric) application.
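The hybrid model itself is simple - one MPI process per node (or socket) with an OpenMP team inside each process; the redesign effort lies in restructuring the application's data decomposition around it. A minimal skeleton in C:

    /* Hybrid MPI+OpenMP skeleton: MPI between nodes, threads within. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank;
        /* FUNNELED: only the master thread makes MPI calls */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            #pragma omp master
            printf("rank %d runs %d threads\n",
                   rank, omp_get_num_threads());
            /* thread-parallel compute phase goes here */
        }

        MPI_Finalize();   /* called by the master thread only */
        return 0;
    }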
Furthermore, it was noted that the current I/O libraries suffer from various problems. The most prominent one is certainly the lack of support for high CPU counts (>10000); here, most libraries have limits which render them unusable in a supercomputing context. Better means of handling large amounts of data in file systems were also formulated as a demand to parallel file system vendors.
ScicomP 15 in Barcelona
About a hundred leading scientists and experts in the field of scientific computing are currently gathering in Barcelona. They are taking part in the IBM-supported ScicomP meeting, which addresses various aspects of porting and optimizing applications for IBM's latest hardware systems. This year's event is hosted and organized by the Barcelona Supercomputing Center (BSC).
The meeting started with workshop and tutorial sessions on how to make efficient use of the Cell-based PowerXCell architecture; the Cell processor can also be found in the PlayStation 3.
In addition, BSC offers daily tours showing MareNostrum, one of Europe's fastest supercomputers. This supercomputer is very likely the first one to have been installed in a church.