What do climate modeling, astrophysics, and medicine have in common? Big Data. Data at this scale requires ‘big’ computers: systems too large to be a single machine, built instead from many computers tied together.
Managing all this data and making ‘sense’ of it puts huge demands on the performance, scalability, and energy efficiency of infrastructures. First, in many cases these high capacity demands are addressed by including one or more types of accelerators to speed up calculations (e.g. Graphics Processing Units), which leads to much more heterogeneous systems. Second, virtualization of computing, storage, and network resources, known as cloud computing, increases the flexibility to structure and arrange these resources. Virtualization techniques are well known in scientific computing, but applying cloud computing to scientific computing still poses significant challenges. Third, in recent years industry has found a way to adopt and commercialize cloud computing, which leads to different usage models for computing resources. Tools for scientific computing will have to take into account that computing resources will be spread over a multitude of different parties, computing environments, and interfaces.
The ICT challenge of this project is to ease the management of highly complex scientific computing infrastructures by effectively shielding the user from the low-level complexity. The project will investigate how to design a programmable e-Science architecture, describing the infrastructure components and optimizing them for typical usage scenarios. It will also investigate efficient methods to program data-intensive applications on heterogeneous systems and to build workflow-based collaborative problem-solving environments. By solving these challenging ICT problems around resource usage and optimization, P20 will enable much easier access to e-Infrastructures, despite their growing complexity.
We primarily target our research at the scientific community. We work together with researchers from climate modeling, astrophysics, and medicine, who have given us much valuable feedback. This collaboration in turn helps scientists bridge the gap to demanding e-Science applications.
A project paper describing important aspects of the design of such a system won the best-paper award at the Cloud Computing 2012 conference. The paper focuses on interoperability and integration issues for multi-provider, multi-domain heterogeneous cloud architectures. We have also helped set up a prototype infrastructure consisting of six geographically distributed heterogeneous clusters, which has enabled many computer scientists to do award-winning research. Our research has been presented in numerous papers and presentations, including a keynote at CCGrid’2012.