Clusters, Clouds, and Data for Scientific Computing

CCDSC 2014

 

September 2nd – 5th, 2014

Châteauform'

La Maison des Contes

427 Chemin de Chanzé, France

 

Sponsored by:

Grenoble Alps University, ICL/UT, AMD, ANR, Google, ParTec, The Portland Group, Intel, INRIA, Nvidia, CGG Veritas, HP

 

        



 

Clusters, Clouds, and Data for Scientific Computing

 2014

Châteauform'

La Maison des Contes

427 Chemin de Chanzé, France

September 2nd – 5th, 2014

 

CCDSC 2014 will be held at a resort outside of Lyon, France, called La Maison des Contes: http://www.chateauform.com/en/chateauform/maison/17/chateau-la-maison-des-contes

 

 

The address of the Chateau is:

Châteauform' La Maison des Contes

427 chemin de Chanzé

69490 Dareizé

 

Telephone: +33 1 30 28 69 69

 

1 hr 30 min from the Saint Exupéry Airport

45 minutes from Lyon

 

 

GPS coordinates: north latitude 45° 54' 20", east longitude 4° 30' 41"

 

Go to http://maps.google.com and type in: "427 chemin de Chanzé 69490 Dareizé".


Message from the Program Chairs

 

This proceedings volume gathers information about the participants of the Workshop on Clusters, Clouds, and Data for Scientific Computing, to be held at La Maison des Contes, 427 Chemin de Chanzé, France, on September 2nd – 5th, 2014. This workshop is a continuation of a series of workshops started in 1992 under the title Workshop on Environments and Tools for Parallel Scientific Computing. These workshops have been held every two years, alternating between the U.S. and France. The purpose of this workshop, which is by invitation only, is to evaluate the state of the art and future trends in cluster computing and the use of computational clouds for scientific computing.

This workshop addresses a number of themes for developing and using both clusters and computational clouds. In particular, the talks will:

•  Survey and analyze the key deployment, operational, and usage issues for clusters, clouds, and grids, especially focusing on the discontinuities produced by multicore and hybrid architectures, data-intensive science, and the increasing need for wide area/local area interaction.

•  Document the current state of the art in each of these areas, identifying interesting questions and limitations, and report experiences with clusters, clouds, and grids in the science research communities and domains that are benefiting from the technology.

•  Explore interoperability among disparate clouds, as well as interoperability between various clouds and grids, and the impact on the domain sciences.

•  Explore directions for future research and development against the background of disruptive trends and technologies and the recognized gaps in the current state of the art.

 

Speakers will present their research and interact with all participants on the future software technologies that will make parallel computers easier to use.

 

This workshop was made possible thanks to sponsorship from ANR, Google, Hewlett-Packard, The Portland Group, and the Rhône-Alpes Region, with the scientific support of the Innovative Computing Laboratory at the University of Tennessee, Knoxville (UTK) and the University Joseph Fourier of Grenoble.

 Thanks!

 

Jack Dongarra, Knoxville, Tennessee, USA.

Bernard Tourancheau, Grenoble, France


Draft agenda (1/15/16 11:54 AM)

September 2nd – 5th, 2014

 

 

 

 

 

Tuesday

September 2nd

Introduction and Welcome: Jack Dongarra, U of Tennessee, and Bernard Tourancheau, U Grenoble

 

6:30  – 7:45

Session Chair: Jack Dongarra

(2 talks – 25 minutes each)

7:00

Michael Wolfe

A Compiler Engineer's View of High Performance Technical Computing

7:30

Patrick Geoffray

Google Cloud HPC

8:00 pm – 9:00 pm

Dinner

 

9:00 pm -

 

 

 

 

 

Wednesday, September 3rd

 

 

7:30 - 8:30

Breakfast

 

8:30 - 10:35

Session Chair: Bernard Tourancheau, U Grenoble

 

(5 talks – 25 minutes each)

8:30

Pete Beckman

Cognitive Dissonance in HPC

8:55

Rosa Badia

Task-based programming with PyCOMPSs and its integration with data management activities at BSC

9:20

David Abramson

The WorkWays Problem Solving Environment

9:45

Jelena Pjesivac-Grbovic

Google Cloud Platform focusing on Data Processing and Analytics tools available in GCP

10:10

Patrick Demichel

The Machine

10:35 -11:00

Coffee

 

11:00  - 1:05

Session Chair: Patrick Demichel

 (5 talks – 25 minutes each)

11:00

Franck Cappello

Toward Approximate Detection of Silent Data Corruptions.

11:25

George Bosilca

Mixed resilience solutions     

11:50

Yves Robert

Algorithms for coping with silent errors

12:15

Frank Mueller

On Determining a Viable Path to Resilience at Exascale

12:40

Satoshi Matsuoka

Towards Billion-Way Resiliency

1:05  - 2:00

Lunch

 

2:30 – 3:00

Coffee

 

3:00 - 5:30

Panel Chair: Rusty Lusk

 

 

Jean-Yves Berthou

Settling the Important Questions, Once and for All

 

Geoffrey Fox

 

Al Geist

 

Thilo Kielmann

 

JL Philippe

 

Vaidy Sunderam

5:45 – 7:30

Wine tasting at Cellar Bruno

Travel time to Cellar Bruno is 15-20 minutes; two options: walking or cycling

8:00 – 9:00

Dinner

                                        

9:00 pm -

 

 

 

 

 

 

 

 

 

 

Thursday, September 4th

 

 

7:30 - 8:30

Breakfast

 

8:30 - 10:35

Session Chair: Emmanuel Jeannot

 (4 talks – 25 minutes each)

8:30

Bill Gropp

Computing at a cross-roads: Big Data, Big Compute, and the Long Tail

8:55

Barbara Chapman

Portable Application Development in an Age of Node Diversity

9:20

Marc Buffat

High Performance computing and Big Data for turbulent transition analysis

9:45

Joel Saltz

Exascale Challenges in Integrative Multi-scale  Spatio-Temporal Analyses

10:35 -11:00

Coffee

 

11:00  - 1:05

Session Chair: Laurent Lefevre

 (5 talks – 25 minutes each)

11:00

Dan Reed

Adaptive, Large-Scale Computing Systems

11:25

Ewa Deelman

Building Community Resources For Scientific Workflow Research

11:50

Christian Perez

Evaluation of an HPC Component Model on Jacobi and 3D FFT Kernels.

12:15

Jeff Hollingsworth

NEMO: Autotuning power and performance

12:40

Rajeev Thakur

Future Node Architectures and their Implications for MPI

1:05  - 2:00

Lunch

 

2:00 – 4:00

Session Chair: Xavier Vigouroux

(3 talks – 25 minutes each)

2:30

Dimitrios Nikolopoulos

The Challenges and Opportunities of Micro-servers in the HPC Ecosystem

2:55

Mary Hall

Leveraging HPC Expertise and Technology in Data Analytics

3:20

Torsten Hoefler

Slim Fly: A Cost Effective Low-Diameter Network Topology

4:00 – 5:00

Coffee

 

5:00 - 7:05

Session Chair: Christian Perez

 (5 talks – 25 minutes each)

5:00

Jeff Vetter

Exploring Emerging Memory Technologies in the Extreme Scale HPC Co-Design Space

5:25

Bernd Mohr

The Score-P Tool Universe

5:50

Padma Raghavan

Multilevel Data Structures for Accelerating Parallel Sparse Matrix Computations

6:15

Frederic Suter

Scalable Off-line Simulation of MPI applications

6:40

Christian Obrecht

Early attempts of implementing the lattice Boltzmann method on Intel's MIC architecture

8:00 – 9:00

Dinner

 

9:00 pm -

 

 

 

 

Friday, September 5th

 

 

7:30 - 8:30

Breakfast

 

8:30 - 10:35

Session Chair:  Rosa Badia

 (5 talks – 25 minutes each)

8:30

Anthony Danalis

Why PaRSEC is the right runtime for exascale computing

8:55

Michela Taufer

Performance and Cost Effectiveness of DAG-based Workflow Executions on the Cloud

9:20

Martin Swany

Network Acceleration for Data Logistics in Distributed Computing

9:45

Satoshi Sekiguchi

Dataflow-centric Warehouse-scale Computing

10:10

Frederic Vivien

Scheduling Tree-Shaped Task Graphs to Minimize Memory and Makespan

10:35 -11:00

Coffee

 

11:00  - 1:05

Session Chair: Frédéric Suter

 (3 talks – 25 minutes each)

11:00

David Walker

Algorithms for In-Place Matrix Transposition

11:25

Laurent Lefevre

Towards Energy Proportional HPC and Cloud Infrastructures

11:50

Emmanuel Jeannot

Topology-aware Resource Selection

12:30  - 2:00

Lunch

 

2:00

Depart

 

 

 

 


 

 

 

Attendee List:

David Abramson (U of Queensland)
Rosa Badia (BSC)
Pete Beckman (ANL)
Jean-Yves Berthou (ANR)
George Bosilca (UTK)
Bill Brantley (AMD)
Marc Buffat (U of Lyon)
Franck Cappello (ANL/INRIA)
Barbara Chapman (U of Houston)
Francois Courteille (Nvidia)
Joe Curley (Intel)
Anthony Danalis (UTK)
Ewa Deelman (ISI)
Patrick Demichel (HP)
Benoit Dinechin (Kalray)
Jack Dongarra (UTK/ORNL)
Geoffrey Fox (Indiana)
Al Geist (ORNL)
Patrick Geoffray (Google)
Andrew Grimshaw (U Virginia)
Bill Gropp (UIUC)
Mary Hall (Utah)
Torsten Hoefler (ETH)
Jeff Hollingsworth (U Maryland)
Emmanuel Jeannot (INRIA)
Thilo Kielmann (Vrije Universiteit)
Laurent Lefevre (INRIA)
Rusty Lusk (ANL)
Satoshi Matsuoka (Tokyo Institute of Technology)
Bernd Mohr (Juelich)
Frank Mueller (NC State)
Raymond Namyst (U Bordeaux & INRIA)
Dimitrios Nikolopoulos (Queen's University of Belfast)
Christian Obrecht (INSA Lyon)
Jean-Laurent Philippe (Intel)
Christian Perez (INRIA)
Jelena Pjesivac-Grbovic (Google)
Padma Raghavan (Penn State)
Dan Reed (U of Iowa)
Yves Robert (ENS & INRIA)
Joel Saltz (Emory U)
Satoshi Sekiguchi (Grid Technology Research Center, AIST)
Vaidy Sunderam (Emory U)
Frederic Suter (CNRS/IN2P3)
Martin Swany (Indiana U)
Michela Taufer (U of Delaware)
Marc Tchiboukdjian (CGG)
Rajeev Thakur (Argonne)
Bernard Tourancheau (University Grenoble)
Stéphane Ubéda (INRIA)
Jeff Vetter (ORNL)
Xavier Vigouroux (Bull)
Frederic Vivien (ENS & INRIA)
David Walker (Cardiff)
Michael Wolfe (PGI)

 

 

Arrival / Departure Information:

 

Here is some information on the meeting in Lyon.  We have updated the workshop webpage http://tiny.cc/ccdsc-2014 with the workshop agenda.

 

On Tuesday, September 2nd, there will be a bus to pick up participants at Lyon's Saint Exupéry airport (formerly called Satolas) at 3:00 pm. (Note that the Saint Exupéry airport has its own train station with direct TGV connections to Paris via Charles de Gaulle. If you arrive by train at the Saint Exupéry airport station, please go to the airport meeting point (point-rencontre), on the second floor, next to the shuttles, near the hallway between the two terminals; see http://www.lyonaeroports.com/eng/Access-maps-car-parks/Maps.)

 

 

The TGV station is reached via a long corridor from the airport terminal. The bus stop is near the station entrance, in the parking lot called "dépose minute".

 

 

The bus will then travel to pick up people at the Lyon Part Dieu railway station at 4:45 pm. (There are two train stations in Lyon; you want the Part Dieu station, not the Perrache station.) There will be someone with a sign at the "Meeting Point / point de rencontre" of the station to direct you to the bus.

 

The bus is expected to arrive at La Maison des Contes around 5:30 pm. We would like to hold the first session on Tuesday evening from 6:30 pm to 8:00 pm, with dinner following the session. La Maison des Contes is about 43 km from Lyon. For a map, go to http://maps.google.com and type in: "427 chemin de Chanzé 69490 Dareizé".


 

VERY IMPORTANT: Please send your arrival and departure times to Jack so we can arrange an appropriately sized bus for transportation. VERY VERY IMPORTANT: If your flight is such that you will miss the bus on Tuesday, September 2nd at 3:00 pm, send Bernard your flight arrival information so he can arrange for transportation to pick you up at the train station or the airport in Lyon. A taxi from Lyon to the Chateau can cost as much as 100 Euros, and the Chateau may be hard to find at night if you rent a car and are not a French driver :-).

 

At the end of the meeting on Friday afternoon, we will arrange for a bus to transport people to the train station and airport. If you are catching an early flight on the morning of Saturday, September 6th, you may want to stay at the hotel located at Lyon's Saint Exupéry Airport; see http://www.lyonaeroports.com/eng/Shops-facilities/Hotels for details.

There are also many hotels in the Lyon area; see http://www.en.lyon-france.com/

 

Due to room constraints at La Maison des Contes, you may have to share a room with another participant. Dress at the workshop is informal. Please tell us if you have special requirements (vegetarian food, etc.). We are expecting to have internet and wireless connections at the meeting.

 

Please send this information to Jack (dongarra@eecs.utk.edu) by July 18th.

Name:

Institute:

Title:

Abstract:

Participant's brief biography:

 


 

 

Arrival / Departure Details:

 

 

 

Arrival and departure times in Lyon:

David Abramson: arrival 9/2 Part Dieu 2:00 pm; departure 9/5 train to Paris at 4:00 pm
Rosa Badia: arrival 9/2 VY1220 11:15; departure 9/5 VY1223 19:00
Pete Beckman: arrival 9/2 UA8914 10:10 am; departure 9/5
Jean-Yves Berthou: arrival 9/3 (Wednesday) by car; departure 9/5
George Bosilca: arrival 9/2 DL8344 2:30 pm; departure 9/5
Bill Brantley: arrival 9/2 airport 10:00 am (late: 4:30 pm); departure 9/6 airport 8:15 am
Marc Buffat: arrival 9/3 by car; departure 9/4
Franck Cappello: arrival 9/2 Part Dieu 4:00 pm; departure 9/5
Barbara Chapman: arrival 9/2 Part Dieu 7:26 pm (taxi from the train station to the chateau); departure 9/5
Francois Courteille: arrival 9/2 Part Dieu; departure 9/4 by train
Joe Curley: arrival 9/2 UA8914 10:10 am; departure 9/5
Anthony Danalis: arrival 9/2 AF7644 2:30 pm; departure 9/5 St. Exupery 3:15 pm
Ewa Deelman: arrival 9/2 Part Dieu; departure 9/5
Patrick Demichel: arrival by car; departure by car
Benoit Dinechin: arrival by car; departure by car
Jack Dongarra: arrival 9/2 DL9288 11:15 am; departure 9/6 DL9521 6:35 am
Geoffrey Fox: arrival 9/2 BA360 11:00 am; departure 9/6 BA365 8:15 am
Al Geist: arrival 9/2 DL9515 1:20 pm; departure 9/6 DL8611 8:10 am
Patrick Geoffray: arrival 9/2 DL8344 2:30 pm; departure 9/5
Bill Gropp: arrival 9/2 Part Dieu 2:00 pm; departure 9/5 Part Dieu 4:00 pm
Mary Hall: arrival 9/2 airport via train; departure 9/5 train to Paris
Torsten Hoefler: arrival 9/2 Part Dieu 3:26 pm; departure 9/5
Jeff Hollingsworth: arrival 9/2 UA8914 10:10 am; departure 9/6 UA8881 6:55 am
Emmanuel Jeannot: arrival 9/2 airport 4:50 pm (pickup by Benoit Dinechin at 5:30 pm); departure 9/5 airport 4:25 pm
Thilo Kielmann: arrival 9/2 KL1417 13:20; departure 9/5 KL1416 18:15
Laurent Lefevre: arrival by car; departure 9/5
Rusty Lusk: arrival 9/2 UA8914 10:10 am; departure 9/5
Satoshi Matsuoka: arrival 9/2 at the airport (arriving 9/1); departure 9/6 airport LH1077 2:40 pm
Bernd Mohr: arrival 9/2 4U9414 3:15 pm (pickup by Benoit Dinechin at 5:30 pm); departure 9/5 4U9417 8:30 pm
Frank Mueller: arrival 9/1 Part Dieu; departure 9/5 Part Dieu 3:34 pm
Raymond Namyst: arrival by car; departure Thursday morning
Dimitrios Nikolopoulos: arrival 9/3 (Wednesday) by car; departure 9/5 by car
Christian Obrecht: arrival by car; departure 9/5
Christian Perez: arrival by car; departure 9/4 by car
Jean-Laurent Philippe: arrival by car; departure Wednesday evening
Jelena Pjesivac-Grbovic: arrival 9/2 BA360 11:00 am; departure 9/7 BA365 8:15 am
Padma Raghavan: arrival 9/2 CH532 1:50 pm; departure 9/6 AA8602 8:40 am
Dan Reed: arrival 9/2 AA6592 11:00 am (late: 6:40 pm); departure 9/6 AA8602 8:40 am
Yves Robert: arrival by car; departure 9/5
Joel Saltz: arrival 9/2 AF7644 2:30 pm; departure 9/6 AF7641 10:55 am
Satoshi Sekiguchi: arrival 9/2 Part Dieu 4:28 pm; departure 9/5 Part Dieu 4:00 pm
Vaidy Sunderam: arrival 9/2 airport by 3:00 pm; departure 9/5
Frederic Suter: arrival by car; departure 9/5
Martin Swany: arrival 9/2 DL8344 2:30 pm; departure 9/6 DL8611 8:10 am
Michela Taufer: arrival 9/2 TGV 4:45 pm; departure 9/5 TGV (pm)
Marc Tchiboukdjian: arrival 9/2 Part Dieu 4:00 pm; departure 9/5 Part Dieu 6:00 pm
Rajeev Thakur: arrival 9/2 LF1076 1:55 pm; departure 9/5
Bernard Tourancheau: arrival 9/2 airport; departure 9/5
Stéphane Ubéda: arrival 9/2 by car; departure 9/3
Jeff Vetter: arrival 9/2 Part Dieu from Paris; departure 9/6 airport
Xavier Vigouroux: arrival by car; departure 9/4
Frederic Vivien: arrival by car; departure 9/5
David Walker: arrival 9/2 KL1413 11:15 am; departure 9/5 KL1416 6:15 pm
Michael Wolfe: arrival 9/2 from AMS at 1:20 pm; departure 9/6 at 6:35 am to AMS

 

 

 

 

 

 

 

 


 

Abstracts:

 

David Abramson and Hoang Nguyen, University of Queensland

 

The WorkWays Problem Solving Environment

 

Science gateways allow computational scientists to interact with a complex mix of mathematical models, software tools and techniques, and high performance computers. Accordingly, various groups have built high-level problem-solving environments that allow these to be mixed freely. In this talk, we introduce an interactive workflow-based science gateway, called WorkWays. WorkWays integrates different domain specific tools, and at the same time is flexible enough to support user input, so that users can monitor and steer simulations as they execute. A benchmark design experiment is used to demonstrate WorkWays.

 

 

Rosa M Badia, Barcelona Supercomputing Center

 

Task-based programming with PyCOMPSs and its integration with data management activities at BSC

 

StarSs is a family of task-based programming models based on the idea of writing sequential code that is executed in parallel at runtime, taking into account the data dependences between tasks.

COMPSs is an instance of StarSs that aims to simplify the execution of Java applications on distributed infrastructures, including clusters and clouds. For that purpose, COMPSs provides both a straightforward Java-based programming model and a componentized runtime that is able to interact with a wide variety of distributed computing middleware (e.g., gLite, Globus) and cloud APIs (e.g., OpenStack, OpenNebula, Amazon EC2).

 

The talk will focus on recent extensions to COMPSs: PyCOMPSs, a binding for the Python language that will enable a larger number of scientific applications in fields such as the life sciences, and the integration of COMPSs with new Big Data resource management methodologies developed at BSC, such as the Wasabi self-contained objects library and Cassandra data management policies. These activities are performed under the Human Brain Project flagship and the Spanish BSC Severo Ochoa project.
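To make the task-based idea concrete, here is a minimal sketch in plain Python. It deliberately does not use the PyCOMPSs API (no PyCOMPSs decorators or runtime calls are reproduced here); it only mimics the StarSs principle that the code reads sequentially, independent tasks run in parallel, and a data dependence forces a synchronization point.

    # Minimal, generic task-based sketch (not PyCOMPSs): independent tasks are
    # submitted to a pool and run concurrently; the final reduction depends on
    # all of them, so the code waits for their results at that point.
    from concurrent.futures import ThreadPoolExecutor

    def increment(block):
        # A "task": operates on one data block, independently of the others.
        return [x + 1 for x in block]

    data = [list(range(i, i + 4)) for i in range(0, 16, 4)]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(increment, b) for b in data]   # tasks run in parallel
        blocks = [f.result() for f in futures]                # data dependence: wait here
    print(sum(sum(b) for b in blocks))                        # reduction over all blocks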

 

 

Pete Beckman, ANL                                    

 

Cognitive Dissonance in HPC

 

At extreme-scale, the gulf between what we want and what we can have becomes more pronounced.  The list of conflicting truths, wants, and needs within the HPC community is probably too long to analyze and enumerate, which of course means it is Big Data.  For extreme-scale hardware and system software we must re-examine our investments, designs, beliefs, and performance tradeoffs.  

 

 

 

George Bosilca, UTK

 

Mixed resilience solutions

 

For too long, sub-optimal resilience mechanisms have been praised as one-size-fits-all fault management approaches in production-grade applications. Moving to larger and more powerful computing platforms, we started to realize that these solutions, while valid at certain sizes, are only able to support our programming paradigms or applications at a prohibitive hardware cost. In this talk I will focus on a particular method to cope with these imperfect approaches: combining different resilience methodologies in order to capitalize on their benefits and create cheaper, more efficient, and more stable ways to deal with failures. More specifically, this talk will cover the mixed case of coordinated checkpoint/restart together with algorithmic fault tolerance.

 

 

Marc Buffat, Université Claude Bernard Lyon 1

 

High Performance Computing and Big Data for turbulent transition analysis

 

Understanding turbulent transition using numerical experiments is a computational challenge, because it requires very large, accurate simulations. In the past, studies in scientific simulation have mainly focused on the solver, because it was the most CPU-consuming part. Nowadays, highly accurate numerical solvers, such as the NadiaSpectral code developed in our group, allow very large turbulent transition simulations using billions of modes to be run on HPC systems. However, due to the size of such simulations, specific issues are emerging related to input/output and the analysis of the results. Particularly when large simulations are performed as experiments that must be analyzed in detail without a priori knowledge, saving the computed data to disk at regular time steps for post-processing is a source of worrisome overhead. Thus new trends emerge that consider analysis and visualization as part of a high-performance simulation, using "in-situ visualization". Tightly coupled in-situ processing using general-purpose visualization tools such as VisIt or ParaView is, however, not well adapted to our needs. In this talk, I will present a case study of hybrid in-situ concurrent processing that allows users to interact with the simulation and to analyze and visualize time-dependent results while preserving the accuracy of large simulations.

 

 

Franck Cappello, Univ Paris/ANL

 

Toward Approximate Detection of Silent Data Corruptions.

 

Exascale systems will suffer more frequent soft errors than current systems. Hardware protections will detect, and may correct, most of them. However, the probability that a soft error stays unnoticed will become significant. These errors, known as silent soft errors, may ultimately lead to wrong results. In this talk we will focus on the SDC detection problem and review existing system and algorithmic techniques. We will also introduce low-cost approximate detection approaches that are promising in the Exascale context and beyond.
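As a deliberately simplified illustration of what an approximate detector can look like (this is not one of the specific detectors of the talk; the prediction model and thresholds are invented for the example), the sketch below flags values that deviate too far from a linear extrapolation of the two previous time steps.

    # Toy approximate SDC detector: predict each value from the two previous
    # time steps and flag entries that fall far outside the prediction.
    def detect_sdc(prev2, prev1, current, rel_tol=0.05, abs_tol=1e-12):
        suspects = []
        for i, (a, b, c) in enumerate(zip(prev2, prev1, current)):
            predicted = 2.0 * b - a                            # linear extrapolation
            tolerance = rel_tol * max(abs(a), abs(b)) + abs_tol
            if abs(c - predicted) > tolerance:
                suspects.append(i)
        return suspects

    # A smooth field with one corrupted entry at index 3.
    prev2 = [1.0, 2.0, 3.0, 4.0]
    prev1 = [1.1, 2.1, 3.1, 4.1]
    curr  = [1.2, 2.2, 3.2, 40.2]
    print(detect_sdc(prev2, prev1, curr))                      # -> [3]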

 

 

Barbara Chapman, U of Houston         

 

Portable Application Development in an Age of Node Diversity                                  

 

 

Anthony Danalis, UTK

 

Why PaRSEC is the right runtime for exascale computing

 

Current HPC systems feature increasing core counts, accelerators, and unpredictable memory access times. Developing efficient applications for such systems requires new programming paradigms. Solutions must react and adapt quickly to unexpected contentions and delays, and have the flexibility to rearrange the load balance to improve the resource utilization.  In this talk, we demonstrate why PaRSEC is the right solution for this problem. We outline the dataflow-based task execution model of PaRSEC and describe the Parameterized Task Graph (PTG) that enables this model. Then the PTG is contrasted with the more traditional Bulk Synchronous and Coarse Grain Parallelism model that is embodied in applications that use MPI for explicit message passing. Also, the PTG model is contrasted with the alternative approach for task execution, where the entire dynamic DAG of tasks is created and maintained in memory.  We then showcase example success stories and discuss future directions.

 

Ewa Deelman, ISI

 

Building Community Resources For Scientific Workflow Research

 

A significant amount of recent research in scientific workflows aims to develop new techniques, algorithms, and systems that can overcome the challenges of efficient and robust execution of ever larger workflows on increasingly complex distributed infrastructures. Since the infrastructures, systems, and applications are complex, and their behavior is difficult to reproduce using physical experiments, much of this research is based on simulation. However, there exists a shortage of realistic datasets and tools that can be used for such simulations. This talk describes a collection of tools and data that have enabled research on new techniques, algorithms, and systems for scientific workflows. These resources include: 1) execution traces of real workflow applications from which workflow and system characteristics such as resource usage and failure profiles can be extracted, 2) a synthetic workflow generator that can produce realistic synthetic workflows based on profiles extracted from execution traces, and 3) a simulator framework that can simulate the execution of synthetic workflows on realistic distributed infrastructures. The talk describes how these resources have been used to investigate new techniques for efficient and robust workflow execution, and how they have provided the basis for improvements to the Pegasus Workflow Management System and other workflow tools. All the tools and data are freely available online for the community.

 

 

Patrick Demichel, HP

 

THE MACHINE

 

Our industry is challenged by the simultaneous end of regime of most of the old technologies we have developed for decades, and by the insatiable demand for 10X more every 3 years to process the tsunami of data coming at us. HP Labs identified this challenge many years ago and developed the technologies, and then a program, to disrupt the natural trends by at least 2 orders of magnitude and enable the Exascale story. This time it will be a radically more disruptive evolution of our systems; we are forced to holistically redesign most of our hardware and software components to achieve this goal and deliver the promise of extracting the value in the data. This program is called "THE MACHINE"; it is not just the design of a massive data center, but the redesign from scratch of a new infrastructure that will integrate the full ecosystem, from the data centers to the billions of connected intelligent objects.

 

 

 

Patrick Geoffray, Google

 

Google Cloud HPC

 

 

Bill Gropp, UIUC

 

Computing at a cross-roads: Big Data, Big Compute, and the Long Tail

 

The US National Science Foundation has commissioned a study on the future of advanced computing for NSF. The committee is soliciting input on the impact of computing, the tradeoffs between different kinds of computing and data capabilities, and alternative methods of providing cyberinfrastructure resources.  This talk will give an overview of the issues, pose questions for the audience, and invite input for the report.

 

 

Mary Hall, University of Utah

 

Leveraging HPC Expertise and Technology in Data Analytics

 

Scalable approaches to scientific simulation and to data analytics have mostly followed separate technology paths. In HPC, performance and simulation accuracy have been the principal drivers of technology, while data analytics research has primarily focused on programming tools and systems that are productive and resilient in the presence of frequent faults. This talk discusses how future large-scale systems for both HPC and data analytics will face similar challenges in addressing scalability, energy efficiency, resilience, and programmability. We make several observations about programming trends and future architectures by surveying contemporary work in both areas, with a particular emphasis on architectures, programming systems, and algorithms. We then discuss where research on HPC can be leveraged in data analytics and how applications that are both compute- and data-intensive can evolve.

 

 

Torsten Hoefler, ETH Zürich

 

Slim Fly: A Cost Effective Low-Diameter Network Topology

 

We introduce a high-performance cost-effective network topology called Slim Fly that approaches the theoretically optimal network diameter. Slim Fly is based on graphs that approximate the solution to the degree-diameter problem. We analyze Slim Fly and compare it to both traditional and state-of-the-art networks. Our analysis shows that Slim Fly has significant advantages over other topologies in latency, bandwidth, resiliency, cost, and power consumption. Finally, we propose deadlock-free routing schemes and physical layouts for large computing centers as well as a detailed cost and power model. Slim Fly enables constructing cost effective and highly resilient datacenter and HPC networks that offer low latency and high bandwidth under different HPC workloads such as stencil or graph computations.

 

 

Jeff Hollingsworth, U Maryland

 

NEMO: Autotuning power and performance

 

Autotuning has demonstrated its utility in many domains.   However, increasingly there is a need to autotune for multiple objective functions (such as power and performance).  In this talk I will describe NEMO, a system for multi-objective autotuning.   NEMO allows efficiently finding solutions near the Pareto front without having to explicitly build the full Pareto front.   I will present some preliminary results of using NEMO to autotune a GPU kernel.
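For readers unfamiliar with the multi-objective setting, the toy sketch below shows the Pareto-dominance test that underlies it. It is purely illustrative and is not NEMO code; the (runtime, power) measurements are made up.

    # Pareto dominance for two minimized objectives, e.g. (runtime, power).
    def dominates(a, b):
        # a dominates b if it is no worse in every objective and better in one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def pareto_front(points):
        # Keep only the non-dominated measurements.
        return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

    measurements = [(1.0, 250.0), (1.2, 180.0), (0.9, 300.0), (1.3, 260.0)]
    print(pareto_front(measurements))    # (1.3, 260.0) is dominated and dropped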

 

 

Emmanuel Jeannot, INRIA

 

Topology-aware Resource Selection

 

The way resources are allocated to an application plays a crucial role in the performance of its execution. It has been shown recently that a non-contiguous allocation can slow down performance by more than 30%. However, a batch scheduler cannot always provide a contiguous allocation, and even in the case of such an allocation, the way processes are mapped to the allocated resources has a big impact on performance. The reason is that the topology of an HPC machine is hierarchical and that the process affinity is not uniform (some pairs of processes exchange more data than other pairs). Hence, taking into account the topology of the machine and the process affinity is an effective way to increase application performance.

 

Nowadays, allocation and mapping are decoupled. For instance, in Zoltan, processors are first allocated to the application and then processes are mapped to the allocated resources depending on the topology and the communication pattern. Decoupling allocation and mapping can lead to sub-optimal solutions where a better mapping could have been found if the resource selection had taken the process affinity into account.

 

In this talk, we will present our work on coupling resource allocation and topology mapping. We have designed and implemented a new Slurm plug-in that takes as input the process affinity of the application and that, according to the machine topology, selects resources and maps processes taking both inputs (affinity and topology) into account. It is based on our process placement tool called TreeMatch, which provides the algorithmic engine to compute the solution. We will present our preliminary results, obtained by emulating traces of the Curie machine, which features 5040 nodes (2 sockets of 8 cores each), and comparing our solution with plain Slurm.
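As a toy illustration of what taking both affinity and topology into account means (this is neither TreeMatch nor the Slurm plug-in; the affinity matrix and node sizes are invented), the sketch below greedily places the most heavily communicating process pairs on the same node.

    # Greedy affinity-aware placement onto identical nodes of a fixed size.
    def greedy_pairwise_mapping(affinity, cores_per_node):
        n = len(affinity)
        # Process pairs sorted by decreasing communication volume.
        pairs = sorted(((affinity[i][j], i, j)
                        for i in range(n) for j in range(i + 1, n)), reverse=True)
        mapping, load = {}, {}
        for _, i, j in pairs:
            for p, partner in ((i, j), (j, i)):
                if p in mapping:
                    continue
                node = mapping.get(partner)          # prefer the partner's node
                if node is None or load[node] >= cores_per_node:
                    # otherwise the least-loaded node that still has a free core
                    node = min(range(n), key=lambda k: (load.get(k, 0) >= cores_per_node,
                                                        load.get(k, 0)))
                mapping[p] = node
                load[node] = load.get(node, 0) + 1
        return mapping

    # 4 processes, 2 cores per node; 0-1 and 2-3 communicate heavily.
    affinity = [[0, 9, 1, 0],
                [9, 0, 0, 1],
                [1, 0, 0, 8],
                [0, 1, 8, 0]]
    print(greedy_pairwise_mapping(affinity, 2))   # -> {0: 0, 1: 0, 2: 1, 3: 1}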

 

 

Laurent Lefevre, INRIA ENS

 

Towards Energy Proportional HPC and Cloud Infrastructures

 

Reducing energy consumption is one of the main concerns in cloud and HPC environments. Today, server energy consumption is far from ideal, mostly because it remains very high even in low-usage states. Energy consumption proportional to server load would bring important savings in electricity consumption, and hence in financial costs, for a datacenter infrastructure. This talk will present our first results in this domain.
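To make the potential savings concrete, here is a back-of-the-envelope sketch with invented numbers (they are not measurements from the talk), comparing a server whose idle power is a large fraction of its peak power with an ideal energy-proportional server over a lightly loaded day.

    # Daily energy under a linear power model: power = idle + (peak - idle) * load.
    def energy_wh(hourly_loads, idle_w, peak_w):
        return sum(idle_w + (peak_w - idle_w) * u for u in hourly_loads)

    loads = [0.1] * 18 + [0.8] * 6                 # a day that is mostly lightly loaded
    typical      = energy_wh(loads, idle_w=150.0, peak_w=250.0)   # high idle draw
    proportional = energy_wh(loads, idle_w=0.0,   peak_w=250.0)   # ideal proportional server
    print(typical, proportional, 1 - proportional / typical)      # about 61% saved here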

 

 

 

Satoshi Matsuoka and Kento Sato, Tokyo Institute of Technology

 

Towards Billion-Way Resiliency

 

Our "Billion-Way Resiliency" project aims at creating algorithms and software frameworks to achieve scalable resiliency in future exascale systems with high failure rates and limited I/O bandwidth. Currently, many future architectural plans assume burst-buffers to alleviate the I/O limitations; our modeling of resiliency I/O behavior demonstrates that, due to the burst buffer itself failing, there are various architectural tradeoffs. The good news is that, given the current failure rates we observe on todayÕs machines, controlling the reliability of exascale machines seem feasible, but the bad news is that it might not scale beyond. Also, issues such as fault detection, programming abstractions, as well as recovery protocols have been previously neglected in most research. While the recent UFLM proposal for MPI has been definitely a step forward, it is largely confined to the MPI layer, and the jury is out on whether such containment would be the formidable choice, or a software framework design that accommodates higher-level programming abstractions to the end-user, while communicating to lower-level system substrates such as batch-queue schedulers via a standardized interface at the same time, would be more powerful. We will touch upon the issue, along with other techniques such as checkpoint compression; all the technologies combined at this point seems to make billion-scale resiliency feasible for future exascale systems.

 

 

 

Bernd Mohr, Juelich

 

The Score-P Tool Universe

 

The talk will present an overview of Score-P, a community effort that provides a scalable and feature-rich run-time recording package for parallel performance monitoring. It supports profiling, event trace recording, and online monitoring; support for sampling is already on the roadmap. Score-P supports a variety of parallel programming models (MPI, OpenMP, CUDA, OpenSHMEM, GASPI, and others) and all common HPC architectures (Linux clusters, the Cray family, the Blue Gene family, and more). Unlike comparable run-time monitoring packages, it is not tied to a particular analysis tool nor to one of the involved groups. Instead, it works natively with the four well-established analysis tools Periscope, Scalasca, TAU, and Vampir, and thus leverages their complementary analysis methodologies. The presentation will highlight the major features of Score-P as well as of Periscope, Scalasca, TAU, and Vampir, give an outlook on their roadmaps, and showcase selected application scenarios.

 

 

Frank Mueller, NC State

 

On Determining a Viable Path to Resilience at Exascale

 

Exascale computing is projected to feature billion-core parallelism. At such large processor counts, faults will become more commonplace. Current techniques to tolerate faults focus on reactive schemes for recovery and generally rely on a simple checkpoint/restart mechanism. Yet they have a number of shortcomings: (1) they do not scale and require complete job restarts; (2) projections indicate that the mean time between failures is approaching the overhead required for checkpointing; and (3) existing approaches are application-centric, which increases the burden on application programmers and reduces portability.

 

To address these problems, we discuss a number of techniques and their level of maturity (or lack thereof). These include (a) scalable network overlays, (b) on-the-fly process recovery, (c) proactive process-level fault tolerance, (d) redundant execution, (e) the effect of SDCs on IEEE floating-point arithmetic, and (f) resilience modeling. In combination, these methods aim to pave the path to exascale computing.

 

 

 

Dimitrios Nikolopoulos, Queen's University of Belfast

 

The Challenges and Opportunities of Micro-servers in the HPC Ecosystem

 

 

Raymond Namyst, U Bordeaux & INRIA

 

Co-scheduling parallel codes over heterogeneous machines: a supervised approach

 

Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great challenge. Even if a uniform runtime system is used underneath, scheduling tasks or threads coming from different libraries over the same set of hardware resources introduces many issues, such as resource oversubscription, undesirable cache flushes or memory bus contention.  We present an extension of StarPU, a runtime system specifically designed for heterogeneous architectures, that allows multiple parallel codes to run concurrently with reduced interference. Such parallel codes run within scheduling contexts that provide confined execution environments which are used to partition computing resources. A hypervisor automatically expands or shrinks Scheduling Contexts using feedback from the runtime system to optimize resource utilization.

 

Christian Obrecht, CETHIL UMR 5008 (CNRS, INSA-Lyon, UCB-Lyon 1), Université de Lyon

 

Early attempts of implementing the lattice Boltzmann method on Intel's MIC architecture

 

Since its beginnings in the early 1990s, the lattice Boltzmann method (LBM) has become a well-acknowledged approach in computational fluid dynamics, used in numerous industry-grade software packages such as PowerFLOW, X-Flow, Fluidyna, and LaBS. From an algorithmic standpoint, the LBM operates on regular Cartesian grids (potentially with hierarchical refinement) with nearest-neighbour synchronisation constraints. It is therefore considered a representative example of stencil computations (see SPEC CPU2006 benchmark 470.lbm), and it proves to be well suited for high-performance implementations.

 

In this contribution, we present two attempts to implement a three-dimensional LBM solver on Intel's MIC processor. The first version is based on the OpenCL framework and shows strong analogies with CUDA implementations of the LBM. The second version takes advantage of Intel's MPI support for the MIC. We then report and discuss the performance of both solvers on the MIC, as well as on other target systems such as GPUs and distributed systems.

 

 

Jelena Pjesivac-Grbovic, Google

 

Google Cloud Platform focusing on Data Processing and Analytics tools available in GCP

 

 

Christian Perez, INRIA

 

Evaluation of an HPC Component Model on Jacobi and 3D FFT Kernels.

 

Scientific applications are becoming increasingly complex, e.g., to improve their accuracy by taking more phenomena into account. Meanwhile, computing infrastructures are continuing their fast evolution. Thus, software engineering is becoming a major issue in achieving portability while delivering high performance. Software component models are a promising approach, as they enable the software architecture of an application to be manipulated. However, existing models do not provide enough support for portability across different hardware architectures. This talk summarizes experience gained with L2C, a low-level component model targeting HPC in particular, on Jacobi and 3D FFT kernels.

 

 

 

Padma Raghavan, Penn State

 

Multilevel Data Structures for Accelerating Parallel Sparse Matrix Computations

 

We propose multilevel forms of the traditional compressed sparse row (CSR) representation for sparse matrices that map to the non-uniform memory architecture of multicore processors. We seek to reduce the latencies of data accesses by leveraging temporal locality to enhance cache performance. We discuss and provide results that demonstrate that our CSR-K forms can greatly accelerate sparse matrix-vector multiplication and sparse triangular solution on multicores. We will also comment on how CSR-K dovetails with dynamic scheduling to enable these sparse computations with multilevel data structures to approach the high execution rates commanded by their dense matrix counterparts.
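For reference, the sketch below shows the classic single-level CSR representation and a sparse matrix-vector product, which is the baseline the abstract starts from; the multilevel CSR-K reorganization discussed in the talk is not shown here.

    # y = A * x with A stored in compressed sparse row (CSR) form.
    def csr_spmv(row_ptr, col_idx, values, x):
        n = len(row_ptr) - 1
        y = [0.0] * n
        for i in range(n):                               # one pass per row
            for k in range(row_ptr[i], row_ptr[i + 1]):  # nonzeros of row i
                y[i] += values[k] * x[col_idx[k]]
        return y

    # 3x3 example:  [[4, 0, 1],
    #                [0, 3, 0],
    #                [2, 0, 5]]
    row_ptr = [0, 2, 3, 5]
    col_idx = [0, 2, 1, 0, 2]
    values  = [4.0, 1.0, 3.0, 2.0, 5.0]
    print(csr_spmv(row_ptr, col_idx, values, [1.0, 1.0, 1.0]))   # -> [5.0, 3.0, 7.0]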

 

 

Daniel A. Reed, University of Iowa

 

Adaptive, Large-Scale Computing Systems

 

HPC systems continue to grow in size and complexity. Today's leading-edge systems contain many thousands of multicore, accelerator-driven nodes, and proposed next-generation systems are likely to contain even more. At this scale, maintaining system operation when hardware components may fail every few minutes or hours is increasingly difficult. Increasing system sizes bring a complementary challenge surrounding energy availability and costs, with projected systems expected to consume twenty or more megawatts of power. For future HPC systems to be usable and cost effective, we must develop new design methodologies and operating principles that embody two important realities of large-scale systems: (a) frequent hardware component failures are a part of normal operation, and (b) energy consumption and power costs must be managed as carefully as performance and resilience. This talk will survey some of the challenges and current work on resilience and energy management.

 

 

Yves Robert, ENS Lyon

 

Algorithms for coping with silent errors

 

Silent errors have become a major problem for large-scale distributed systems. Detection is hard, and correction is even harder. This talk presents generic algorithms to achieve both detection and correction of silent errors by coupling verification mechanisms and checkpointing protocols. Application-specific techniques will also be investigated for sparse numerical linear algebra.
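A deliberately simplified sketch of the coupling idea follows: verification runs before each checkpoint is committed, and a detected silent error triggers a rollback to the last verified checkpoint. The step() and verify() functions and the injected-error model are stand-ins for illustration, not the algorithms of the talk.

    # Toy execution loop coupling verification with checkpointing.
    import copy, random

    def step(state):
        state["t"] += 1
        state["x"] += 1.0
        if random.random() < 0.05:      # inject a rare silent error
            state["x"] += 1000.0
        return state

    def verify(state):
        # Stand-in detector: in this toy model the invariant x == t must hold.
        return abs(state["x"] - state["t"]) < 1e-9

    def run(n_steps, checkpoint_period=5):
        state = {"t": 0, "x": 0.0}
        checkpoint = copy.deepcopy(state)
        while state["t"] < n_steps:
            state = step(state)
            if state["t"] % checkpoint_period == 0:
                if verify(state):
                    checkpoint = copy.deepcopy(state)   # verified: safe to commit
                else:
                    state = copy.deepcopy(checkpoint)   # corrupted: roll back and recompute
        return state

    print(run(50))                       # ends in a verified state {'t': 50, 'x': 50.0}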

 

 

Joel Saltz, SUNY Stony Brook

 

Exascale Challenges in Integrative Multi-scale  Spatio-Temporal Analyses

 

Integrative analyses of large-scale spatio-temporal datasets play increasingly important roles in many areas of science and engineering. Our recent work in this area is motivated by application scenarios involving complementary digital microscopy, radiology, and "omic" analyses in cancer research. In these scenarios, the objective is to use a coordinated set of image analysis, feature extraction, and machine learning methods to predict disease progression and to aid in targeting new therapies. I will describe tools and methods our group has developed for the extraction, management, and analysis of features, along with the systems software methods for optimizing execution on high-end CPU/GPU platforms. Having presented our current work as an introduction, I will then describe 1) related but much more ambitious exascale biomedical and non-biomedical use cases that also involve the complex interplay between multi-scale structure and molecular mechanism, and 2) concepts and requirements for methods and tools that address these challenges.

 

 

Satoshi Sekiguchi, Grid Technology Research Center, AIST

 

Dataflow-centric Warehouse-scale Computing

 

Foreseeing real-time big data processing in 2020, much more data will have to be processed to gain better understanding; however, simply scaling up current systems may not satisfy the requirement, due to the narrow bandwidth between I/O and CPUs when dealing with Big Data. Furthermore, technical trends in IT infrastructure suggest that no commodity-based servers will survive between gigantic data centers and trillions of edge devices. What will the data center of 2020 look like? We have started a small project to design a data processing infrastructure from scratch, considering applications so as to maximize use of the wide variety and multiple velocities of Big Data. This talk will introduce the concept and preliminary thoughts on its design.

 

 

 

Martin Swany, Indiana U

 

Network Acceleration for Data Logistics in Distributed Computing

 

Data movement is a key overhead in distributed computing environments, from HPC to big data applications in the cloud. Data logistics concerns having data where it needs to be, minimizing effective overheads. This talk will cover perspectives and specific examples from our recent work, including software-defined networks and programmable network interfaces.

 

 

 

Frédéric Suter, CNRS/IN2P3

 

Scalable Off-line Simulation of MPI applications

 

In this talk, I will present the latest developments related to the simulation of MPI applications with Time-Independent Traces and SimGrid. After an overview of the encouraging results we have achieved and the capabilities of this simulation framework, I will detail our ongoing work to further increase the scalability of our simulations.

 

 

Michela Taufer, U of Delaware

 

Performance and Cost Effectiveness of DAG-based Workflow Executions on the Cloud

 

When executing DAG-like workflows on a virtualized platform such as the Cloud, we always search for scheduling policies that assure performance and cost effectiveness. The fact that we know the platform's physical characteristics only imperfectly makes our goal hard to achieve. In this talk we address this challenge by performing an exhaustive performance and cost analysis of "oblivious" scheduling heuristics on a Cloud platform whose computational characteristics cannot be known reliably. Our study considers three scheduling policies (AO, Greedy, and Sidney) under static and dynamic resource allocation on an EC2 testing environment. Our results outline the strength of the AO policy and show how this policy can effectively reallocate workflows of up to 4000-task DAGs from 2 to 32 vCPUs while providing up to 90% performance gain for 70% additional cost. In contrast, the other policies provide only marginal performance gain for much higher cost. Our empirical observations therefore make a strong case for adopting AO on the Cloud.

 

 

Rajeev Thakur, ANL

 

Future Node Architectures and their Implications for MPI

 

 

Jeffrey Vetter, ORNL and Georgia Tech

 

Exploring Emerging Memory Technologies in the Extreme Scale HPC Co-Design Space

Concerns about energy efficiency and reliability have forced our community to reexamine the full spectrum of architectures, software, and algorithms that constitute our ecosystem. While architectures and programming models have remained relatively stable for almost two decades, new architectural features, such as heterogeneous processing, nonvolatile memory, and optical interconnection networks, will demand that software systems and applications be redesigned so that they expose massive amounts of hierarchical parallelism, carefully orchestrate data movement, and balance concerns over performance, power, resiliency, and productivity. In what DOE has termed 'co-design,' teams of architects, software designers, and applications scientists are working collectively to realize an integrated solution to these challenges. To tackle this challenge of power consumption and cost, we are investigating the design of future memory hierarchies, which include nonvolatile memory. In this talk, I will sample these emerging memory technologies and discuss how we are preparing applications and software for these upcoming systems with radically different memory hierarchies.

 

 

Frédéric Vivien, ENS & INRIA

 

Scheduling Tree-Shaped Task Graphs to Minimize Memory and Makespan

 

We investigate the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents a large IO file. A task can only be executed if all input and output files fit into memory, and a file can only be removed from memory after it has been consumed. Such trees arise, for instance, in the multifrontal method of sparse matrix factorization. The maximum amount of memory needed depends on the execution order of the tasks. With one processor the objective of the tree traversal is to minimize the required memory. This problem was well studied and optimal polynomial algorithms were proposed. Here, we extend the problem by considering multiple processors, which is of obvious interest in the application area of matrix factorization. With the multiple processors comes the additional objective to minimize the time needed to traverse the tree, i.e., to minimize the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study the computational complexity of this problem and provide an inapproximability result even for unit weight trees. Several heuristics are proposed, each with a different optimization focus, and they are analyzed in an extensive experimental evaluation using realistic trees.
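To make the role of the traversal order concrete, the sketch below computes the peak memory of a given sequential postorder schedule under the model of the abstract; the tree, file sizes, and schedules are invented, and no scheduling heuristic from the talk is implemented.

    # Peak memory of one sequential postorder traversal of a task tree.
    # To run a task, its input files (edges from its children) and its output
    # file (edge to its parent) must be in memory; inputs are freed afterwards.
    def peak_memory(children, out_size, order):
        resident = {}                    # node -> size of its output file in memory
        peak = 0
        for task in order:
            peak = max(peak, sum(resident.values()) + out_size[task])
            for c in children.get(task, []):     # inputs are consumed and freed
                del resident[c]
            resident[task] = out_size[task]      # output stays until the parent runs
        return peak

    # Root r with children a, b; a has children c, d. Edge sizes given by out_size.
    children = {"r": ["a", "b"], "a": ["c", "d"]}
    out_size = {"r": 0, "a": 3, "b": 2, "c": 4, "d": 1}
    print(peak_memory(children, out_size, ["c", "d", "a", "b", "r"]))   # -> 8
    print(peak_memory(children, out_size, ["b", "c", "d", "a", "r"]))   # -> 10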

 

 

David Walker and Fred G. Gustavson

 

Algorithms for In-Place Matrix Transposition

 

This talk presents an implementation of an in-place, swap-based algorithm for transposing rectangular matrices. The implementation is based on an algorithm described by Tretyakov and Tyrtyshnikov [Optimal in-place transposition of rectangular matrices. Journal of Complexity 25 (2009), pp. 377–384], but we have introduced a number of variations. In particular, we show how the original algorithm can be modified to require constant additional memory. We also identify opportunities for exploiting parallelism. Performance measurements for different algorithm variants are presented and discussed.
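For orientation, here is a minimal cycle-following in-place transpose in Python. It is not the Tretyakov-Tyrtyshnikov algorithm nor the constant-additional-memory variant presented in the talk: it tracks visited positions, so its extra memory grows with the matrix size.

    # In-place transpose of an m x n matrix stored row-major in a flat list.
    # Element at index i moves to index (i * m) mod (m * n - 1).
    def transpose_in_place(a, m, n):
        size = m * n
        if size <= 1:
            return a
        visited = [False] * size
        for start in range(1, size - 1):       # first and last entries never move
            if visited[start]:
                continue
            i, carried = start, a[start]
            while True:                        # follow one permutation cycle
                j = (i * m) % (size - 1)
                a[j], carried = carried, a[j]
                visited[j] = True
                i = j
                if i == start:
                    break
        return a

    # 2x3 example: [[1, 2, 3], [4, 5, 6]] becomes 3x2 [[1, 4], [2, 5], [3, 6]].
    print(transpose_in_place([1, 2, 3, 4, 5, 6], 2, 3))   # -> [1, 4, 2, 5, 3, 6]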

 

 

Michael Wolfe, PGI

 

A Compiler Engineer's View of High Performance Technical Computing

 

Looking through the lens of a compiler writer, I explore the past few decades, the present state and likely future of HPC.  What computer architectures have come and gone?  What has survived, and what will we be using ten years hence?  How have programming languages changed?  What requirements and expectations does all this place on a compiler?

 

 


 

 

Biographies of Attendees:

 

David Abramson and Hoang Nguyen

University of Queensland

 

Professor David Abramson has been involved in computer architecture and high performance computing research since 1979. He has held appointments at Griffith University, CSIRO, RMIT, and Monash University. Most recently, at Monash, he was the Director of the Monash e-Education Centre, Deputy Director of the Monash e-Research Centre, and a Professor of Computer Science in the Faculty of Information Technology. He held an Australian Research Council Professorial Fellowship from 2007 to 2011. He has worked on a variety of HPC middleware components, including the Nimrod family of tools and the Guard relative debugger.

 

Abramson is currently the Director of the Research Computing Centre at the University of Queensland. He is a Fellow of the Association for Computing Machinery (ACM) and of the Academy of Technological Sciences and Engineering (ATSE), and a Senior Member of the IEEE.

 

Nguyen is a PhD student in the School of Information Technology and Electrical Engineering at the University of Queensland.

 

 

Rosa M Badia

Barcelona Supercomputing Center

 

Rosa M. Badia holds a PhD in Computer Science (1994) from the Technical University of Catalonia (UPC). She is a Scientific Researcher at the Consejo Superior de Investigaciones Científicas (CSIC) and team leader of the Grid Computing and Cluster research group at the Barcelona Supercomputing Center (BSC). She was involved in teaching and research activities at UPC from 1989 to 2008, where she was an Associate Professor from 1997. From 1999 to 2005 she was involved in research and development activities at the European Center of Parallelism of Barcelona (CEPBA). Her current research interests are programming models for complex platforms (from multicore and GPUs to Grid/Cloud). The group led by Dr. Badia has been developing the StarSs programming model for more than 10 years, with high success in adoption by application developers. Currently the group focuses its efforts on two instances of StarSs: OmpSs for heterogeneous platforms and COMPSs for distributed computing (i.e., Cloud). Dr. Badia has published more than 120 papers in international conferences and journals on the topics of her research. She has participated in several European projects, for example BEinGRID, Brein, CoreGRID, OGF-Europe, SIENA, TEXT, and VENUS-C, and currently she is participating in the Severo Ochoa project (at the Spanish level), TERAFLUX, ASCETIC, the Human Brain Project, EU-Brazil CloudConnect, and TransPlant, and she is a member of the HiPEAC2 NoE.

 

 

 


Pete Beckman

Argonne National Laboratory

Director, Exascale Technology and Computing Institute

Co-Director, Northwestern-Argonne Institute for Science and Engineering

 

Pete Beckman is the founder and director of the Exascale Technology and Computing Institute at Argonne National Laboratory and the co-director of the Northwestern-Argonne Institute for Science and Engineering. From 2008-2010 he was the director of the Argonne Leadership Computing Facility, where he led the Argonne team working with IBM on the design of Mira, a 10 petaflop Blue Gene/Q, and helped found the International Exascale Software Project.

 

Pete joined Argonne in 2002, serving first as director of engineering and later as chief architect for the TeraGrid, where he led the design and deployment team that created the world's most powerful Grid computing system for linking production HPC computing centers for the National Science Foundation. After the TeraGrid became fully operational, Pete started a research team focusing on petascale high-performance system software.

 

As an industry leader, he founded a Turbolinux-sponsored research laboratory in 2000 that developed the world's first dynamic provisioning system for cloud computing and HPC clusters. The following year, Pete became vice president of Turbolinux's worldwide engineering efforts, managing development offices in the U.S., Japan, China, Korea, and Slovenia.

 

Dr. Beckman has a Ph.D. in computer science from Indiana University (1993) and a B.A. in Computer Science, Physics, and Math from Anderson University (1985).

 

 

Jean-Yves Berthou, ANR

 

Jean-Yves Berthou joined the French National Research Agency (ANR) in September 2011 as Director of the Department for Information and Communication Science and Technologies. Before that, he had been the Director of the EDF R&D Information Technologies program since 2008 and the coordinator of the EESI European Support Action (European Exascale Software Initiative, www.eesi-project.eu).

 

Jean-Yves joined EDF R&D in 1997 as a researcher. He was the head of the Applied Scientific Computing Group (High Performance Computing, Simulation Platforms Development, Scientific Software Architecture) at EDF R&D from 2002 to 2006. He was Chargé de Mission (Strategic Steering Manager for Simulation), in charge of the simulation program at EDF R&D, from 2006 to 2009.

 

Jean-Yves received a Ph.D. in computer science from Pierre et Marie Curie University (Paris VI) in 1993. His research deals mainly with parallelization, parallel programming, and software architecture for scientific computing.

 

 

George Bosilca, UTK

 

George Bosilca is a Research Director at the Innovative Computing Laboratory at the University of Tennessee, Knoxville. His areas of interest revolve around parallel computer architectures and systems, high performance computing, programming paradigms, distributed algorithms, and resilience.

 

 

Bill Brantley, AMD

 

Dr. Brantley is a Fellow Design Engineer in the Research Division of Advanced Micro Devices. He completed his Ph.D. in ECE at Carnegie Mellon University after working 3 years at Los Alamos National Laboratory. Next, he joined the T.J. Watson Research Center, where he began a project which led to the vector instruction set extensions for the Z-Series CPUs. He was one of the architects of the 64-CPU RP3 (a DARPA-supported HPC system development in the mid-80s) and led the processor design, including a hardware performance monitor. Later he contributed to a number of projects at IBM, mostly dealing with system-level performance of RISC 6000 and other systems, eventually joining the Linux Technology Center. In 2002, he joined Advanced Micro Devices and helped to launch the Opteron. He was leader of the performance team focused on HPC until 2012, when he joined the AMD Research Division, where he has led both the interconnect and programmability efforts of AMD's DoE Exascale Fast Forward contract.

 

 

Marc Buffat

Université Claude Bernard Lyon 1

 

Marc Buffat is a professor in the mechanical engineering department at the University Claude Bernard Lyon 1. His fields of expertise are fluid mechanics, computational fluid dynamics, and high performance computing. He heads the FLMSN ("Fédération Lyonnaise de Modélisation et Sciences Numériques"), which gathers the HPC mesocenter at Lyon, a member of the EQUIPEX EQUIP@MESO project, IXXI, and CBP.

 

 

Franck Cappello, Univ Paris/ANL

 

Franck Cappello has been a senior scientist and the project manager of research on resilience at the extreme scale at Argonne National Laboratory since 2013. He also holds a position as Adjunct Professor in the CS department of the University of Illinois at Urbana-Champaign. Cappello is the director of the Joint Laboratory on Extreme Scale Computing, gathering Inria, UIUC/NCSA, ANL, and BSC. He received his Ph.D. from the University of Paris XI in 1994 and joined CNRS, the French National Center for Scientific Research. In 2003, he joined INRIA, where he holds the position of permanent senior researcher. He has initiated and directed several R&D projects, including XtremWeb, MPICH-V, Grid5000, and FTI. He also initiated and directed the G8 "Enabling Climate Simulation at Exascale" project, gathering 7 partner institutions from 6 countries. As a member of the executive committee of the International Exascale Software Project and as leader of the resilience topic in the European Exascale Software Initiative 1 & 2, he led the roadmap and strategy efforts for projects related to resilience at the extreme scale.

 

 

Barbara Chapman, U of Houston

Dr. Chapman studied Mathematics and Computer Science in New Zealand and at Queen's University of Belfast, Northern Ireland, U.K. She is currently a Professor at the University of Houston, TX, where she is also the founding Director of the Center for Advanced Computing and Data Systems.

Professor Chapman performs research into programming models, compilers, and tools for parallel and distributed computations, as well as into program development tools. She has been involved in the development of the OpenMP industry standard for parallel programming for 15 years, and is moreover engaged in the development of the OpenACC programming interface, the Multicore Association's library interfaces, and OpenSHMEM. Her group has created a near-industry-strength compiler, OpenUH, which provides implementations of OpenMP and OpenACC, as well as of Fortran co-arrays. The group has also produced a reference implementation of OpenSHMEM.

 

 

Francois Courteille

NVIDIA Corp

 

"F. Courteille got a MS degree in Computer Science from Institut National des Sciences AppliquŽes (INSA) de Lyon in 1977.He worked for more than 35 years as technical leader in HPC, first as pre-sales application project leader at Control Data Corporation on the supercomputer vector lines (CYBER 2xx and ETA) before moving to Convex then NEC Corp. (SX line and Earth Simulator) to lead the Western Europe pre-sales and benchmark teams. He has specialized in high performance computing application software porting and tuning on large scale parallel and vector systems with a specific interest on dense and sparse linear algebra. Today as Solution Architect at NVIDIA he is helping to design and/or promote high performance computing solutions (hardware & software) using GPUs."

 

 

Joe Curley, Intel

 

Joe Curley is director of marketing in the Technical Computing Group at Intel Corporation in Hillsboro, OR, USA. Joe joined Intel in 2007 and has served in a series of technology planning and business leadership roles on what is now the Intel(r) Xeon Phi(tm) product line. Prior to joining Intel, Joe served in a series of business and engineering executive positions at Dell, Inc. from 1996 to 2007.  He started his career in technology in 1986 at computer graphics pioneer Tseng Labs, Inc., ultimately serving as general manager of advanced systems for the company.

 

 

Anthony Danalis, UTK

 

Anthony Danalis is currently a Research Scientist II with the Innovative Computing Laboratory at the University of Tennessee, Knoxville. His research interests come from the area of High Performance Computing. Recently, his work has been focused on the subjects of Compiler Analysis and Optimization, System Benchmarking, MPI, and Accelerators. He received his Ph.D. in Computer Science from the University of Delaware on Compiler Optimizations for HPC. Previously, he received an M.Sc. from the University of Delaware and an M.Sc. from the University of Crete, both on Computer Networks, and a B.Sc. in Physics from the University of Crete.

 

 

Ewa Deelman, ISI

 

Ewa Deelman is a Research Associate Professor at the USC Computer Science Department and an Assistant Director of Science Automation Technologies at the USC Information Sciences Institute. Dr. Deelman's research interests include the design and exploration of collaborative, distributed scientific environments, with particular emphasis on workflow management as well as the management of large amounts of data and metadata. At ISI, Dr. Deelman is leading the Pegasus project, which designs and implements workflow mapping techniques for large-scale applications running in distributed environments. Pegasus is being used today in a number of scientific disciplines, enabling researchers to formulate complex computations in a declarative way. Dr. Deelman received her PhD in Computer Science from the Rensselaer Polytechnic Institute in 1997.

 

 

Patrick Demichel, HP 

 

Patrick Demichel received an MS degree in computer architecture from the Control Data Corporation Institute in Paris in 1975 and has since worked for 34 years on scientific computing at Hewlett-Packard. He worked on real-time computing on the HP1000 (hardware and software), then on Linux on the HP9000 family, spent 5 years porting CATIA to HP platforms, and spent 5 years in HP Labs in Fort Collins on the development of the IA64 processor. For the past 10 years he has been a senior HPC architect focused on the largest and most innovative projects in EMEA. Now a Distinguished Technologist, he works with HP Labs on emerging technologies such as sensors, memristors, photonics, cognitive computing, and low-power technologies for the Moonshot and The Machine programs.

 

 

Benoit Dinechin, Kalray

 

Benoît Dupont de Dinechin is Chief Technology Officer of Kalray (http://www.kalray.eu), a company that manufactures integrated manycore processors for embedded and industrial applications. He is also the main architect of the Kalray VLIW core, and co-architect of the Kalray Multi Purpose Processing Array (MPPA). Before joining Kalray, Benoît was in charge of Research and Development of the STMicroelectronics Software, Tools, Services division, with special focus on compiler design, virtual machines for embedded systems, and component-based software development frameworks. He was promoted to STMicroelectronics National Fellow in 2008. Prior to his work at STMicroelectronics, Benoît worked part-time at the Cray Research Park (Minnesota, USA), where he developed the software pipeliner of the Cray T3E production compilers. Benoît earned an engineering degree in Radar and Telecommunications from the Ecole Nationale Supérieure de l'Aéronautique et de l'Espace (Toulouse, France), and a doctoral degree in computer systems from the University Pierre et Marie Curie (Paris) under the direction of Prof. P. Feautrier. He completed his post-doctoral studies at McGill University (Montreal, Canada) in the ACAPS laboratory led by Prof. G. R. Gao.

 

 

Jack Dongarra, UTK/ORNL

 

Jack Dongarra received a Bachelor of Science in Mathematics from Chicago State University in 1972 and a Master of Science in Computer Science from the Illinois Institute of Technology in 1973. He received his Ph.D. in Applied Mathematics from the University of New Mexico in 1980. He worked at the Argonne National Laboratory until 1989, becoming a senior scientist. He now holds an appointment as University Distinguished Professor of Computer Science in the Computer Science Department at the University of Tennessee and holds the title of Distinguished Research Staff in the Computer Science and Mathematics Division at Oak Ridge National Laboratory (ORNL), Turing Fellow at Manchester University, and an Adjunct Professor in the Computer Science Department at Rice University. He is the director of the Innovative Computing Laboratory at the University of Tennessee. He is also the director of the Center for Information Technology Research at the University of Tennessee which coordinates and facilitates IT research efforts at the University.

 

 

Geoffrey Fox, Indiana

 

Fox received a Ph.D. in Theoretical Physics from Cambridge University and is now Distinguished Professor of Informatics and Computing, and Physics at Indiana University, where he is director of the Digital Science Center and Senior Associate Dean for Research and Director of the Data Science program at the School of Informatics and Computing.  He previously held positions at Caltech, Syracuse University and Florida State University after being a postdoc at the Institute for Advanced Study at Princeton, Lawrence Berkeley Laboratory and Peterhouse College Cambridge. He has supervised the PhD of 66 students and published around 1000 papers in physics and computer science, with an h-index of 70 and over 25000 citations.

 

He currently works in applying computer science from infrastructure to analytics in Biology, Pathology, Sensor Clouds, Earthquake and Ice-sheet Science, Image processing, Deep Learning, Network Science and Particle Physics. The infrastructure work is built around Software Defined Systems on Clouds and Clusters. He is involved in several projects to enhance the capabilities of Minority Serving Institutions, including the eHumanity portal. He has experience in online education and its use in MOOCs for areas like Data and Computational Science. He is a Fellow of APS and ACM.

 

 

Al Geist, ORNL

 

Al Geist is a Corporate Research Fellow at Oak Ridge National Laboratory. He is the Chief Technology Officer of the Oak Ridge Leadership Computing Facility and also leads the Extreme-scale Algorithms and Solver Resilience project.  His recent research is on Exascale computing and resilience needs of the hardware and software.

 

In his 31 years at ORNL, he has published two books and over 200 papers in areas ranging from heterogeneous distributed computing, numerical linear algebra, parallel computing, and collaboration technologies to solar energy, materials science, biology, and solid-state physics.

 

 

Patrick Geoffray

Google

Patrick received his PhD from the University of Lyon in 2000 under the direction of Bernard Tourancheau. He worked at Myricom for 13 years implementing communication software and HPC interconnect technology. He joined Google in 2013 to work on amazing things.

 

 

Bill Gropp, UIUC

 

William Gropp is the Thomas M. Siebel Chair in the Department of Computer Science and Director of the Parallel Computing Institute at the University of Illinois in Urbana-Champaign.  He received his Ph.D. in Computer Science from Stanford University in 1982 and worked at Yale University and Argonne National Laboratory.  His research interests are in parallel computing, software for scientific computing, and numerical methods for partial differential equations.  He is a Fellow of ACM, IEEE, and SIAM and a member of the National Academy of Engineering.

 

 

Mary Hall

University of Utah

 

Mary Hall is a Professor in the School of Computing at the University of Utah. Her research focuses on compiler technology for exploiting performance-enhancing features of a variety of computer architectures, with a recent emphasis on compiler-based performance tuning technology targeting many-core graphics processors and multi-core nodes in supercomputers. Hall's prior work has focused on compiler techniques for exploiting parallelism and locality on a diversity of architectures: automatic parallelization for SMPs, superword-level parallelism, processing-in-memory architectures and FPGAs. Professor Hall is an ACM Distinguished Scientist. She has published over 70 refereed conference, journal and book chapter articles, and has given more than 50 invited presentations.  She has co-authored several reports for government agencies to establish the research agenda in compilers and high-performance computing.

 

 

 

 

Torsten Hoefler, ETH

 

Torsten is an Assistant Professor of Computer Science at ETH Zürich, Switzerland.  Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded Blue Waters project at NCSA/UIUC.  He is also a key member of the Message Passing Interface (MPI) Forum, where he chairs the "Collective Operations and Topologies" working group.  Torsten won best paper awards at the ACM/IEEE Supercomputing Conference 2010 (SC10), EuroMPI 2013, the ACM/IEEE Supercomputing Conference 2013 (SC13), and other conferences.  He has published numerous peer-reviewed scientific conference and journal articles and authored chapters of the MPI-2.2 and MPI-3.0 standards.  For his work, Torsten received the SIAM SIAG/Supercomputing Junior Scientist Prize in 2012 and the IEEE TCSC Young Achievers in Scalable Computing Award in 2013. Following his Ph.D., he received the Young Alumni Award 2014 from Indiana University.  Torsten was elected into the first steering committee of ACM's SIGHPC in 2013.  He was the first European to receive those honors. In addition, he received the Best Student Award 2005 of the Chemnitz University of Technology. His research interests revolve around the central topic of "Performance-centric Software Development" and include scalable networks, parallel programming techniques, and performance modeling.  Additional information about Torsten can be found on his homepage at htor.inf.ethz.ch.

 

 

Jeff Hollingsworth, U Maryland

Jeffrey K. Hollingsworth is a Professor of the Computer Science Department at the University of Maryland, College Park. He also has an appointment in the University of Maryland Institute for Advanced Computer Studies and the Electrical and Computer Engineering Department. He received his PhD and MS degrees in computer sciences from the University of Wisconsin. He received a B. S. in Electrical Engineering from the University of California at Berkeley.

 

Dr. Hollingsworth's research seeks to develop a unified framework to understand the performance of large systems and focuses in several areas. First, he developed a new approach, called dynamic instrumentation, to permit the efficient measurement of large parallel applications. Second, he has developed an auto-tuning framework called Active Harmony that can be used to tune kernels, libraries, or full applications. Third, he is investigating the interactions between different layers of software and hardware to understand how they influence performance. He is Editor-in-Chief of the journal Parallel Computing, was general chair of the SC12 conference, and is Vice Chair of ACM SIGHPC.

 

 

Emmanuel Jeannot, Inria

 

Emmanuel Jeannot is a senior research scientist at INRIA (Institut National de Recherche en Informatique et en Automatique) and has been conducting his research at INRIA Bordeaux Sud-Ouest and at the LaBRI laboratory since Sept. 2009. Before that, he held the same position at INRIA Nancy Grand-Est. From Jan. 2006 to Jul. 2006, he was a visiting researcher at the University of Tennessee, ICL laboratory. From Sept. 1999 to Sept. 2005, he was an assistant professor at the Université Henri Poincaré, Nancy 1. During the period 2000–2009, he did research at the LORIA laboratory. He received his Master's and PhD degrees in computer science in 1996 and 1999, respectively, both from the Ecole Normale Supérieure de Lyon, at the LIP laboratory. After his PhD, he spent one year as a postdoc at the LaBRI laboratory in Bordeaux. His main research interests are scheduling for heterogeneous environments and grids, data redistribution, algorithms and models for parallel machines, grid computing software, adaptive online compression and programming models.

 

 

Thilo Kielmann

VU University Amsterdam

 

Thilo Kielmann studied Computer Science at Darmstadt University of Technology, Germany. He received his Ph.D. in Computer Engineering in 1997, and his habilitation in Computer Science in 2001, both from Siegen University, Germany. Since 1998, he has been working at VU University Amsterdam, The Netherlands, where he is currently Associate Professor in the Computer Science Department. His research studies performability of large-scale HPC systems, especially the trade-offs between application performance and other properties like monetary cost, energy consumption, or failure resilience. Being a systems person, he favours running code and solid experimentation.

 

 

 

Laurent Lefevre

Inria Avalon, Ecole Normale Superieure of Lyon, France

 

Since 2001, Laurent Lefèvre has been a permanent researcher in computer science at Inria (the French Institute for Research in Computer Science and Control). He is a member of the Avalon team (Algorithms and Software Architectures for Distributed and HPC Platforms) of the LIP laboratory at the Ecole Normale Supérieure of Lyon, France. From 1997 to 2001, he was an assistant professor in computer science at Lyon 1 University. He has organized several conferences in high performance networking and computing (ICPP 2013, HPCC 2009, CCGrid 2008) and is a member of several program committees. He has co-authored more than 100 papers published in refereed journals and conference proceedings. His interests include: energy efficiency in large-scale distributed systems, high performance computing, distributed computing and networking, and high performance network protocols and services. He is a member of IEEE and takes part in several research projects. He led the Inria Action de Recherche Cooperative GREEN-NET project on power-aware software frameworks. Laurent Lefèvre was nominated as a Management Committee member and WG leader of the European COST action IC0804 on energy efficiency in large-scale distributed systems (2009-2013) and is co-WG leader of the European COST Action IC1305 NESUS on sustainable ultrascale computing (2014-2018). He was a work package leader in the PrimeEnergyIT project (Intelligent Energy in Europe European call, 2010-2012). He is the scientific representative for Inria and an executive board member in the GreenTouch consortium dedicated to energy efficiency in networks (2010-2015).

 

 

Rusty Lusk

Argonne National Laboratory

 

Ewing "Rusty" Lusk received his Ph.D. in mathematics from the University of Maryland in 1970.  He has published in mathematics (algebraic topology), automated theorem proving, database technology, logic programming, and parallel computing. He is best known for his work with the definition, implementation, and evangelization of the message-passing interface (MPI) standard.  He is currently the co-director for computer science of the NUCLEI (Nuclear Computational Low-Energy Initiative) SciDAC-3 project.  He has been Director of the Mathematics and Computer Science Division at Argonne National Laboratory and currently holds the title of Argonne Distinguished Fellow Emeritus.

 

 

Satoshi Matsuoka, Tokyo Institute of Technology

 

 

Bernd Mohr, Juelich

 

Bernd Mohr started to design and develop tools for performance analysis of parallel programs already with his diploma thesis (1987) at the University of Erlangen in Germany, and continued this in his Ph.D. work (1987 to 1992). During a three-year postdoc position at the University of Oregon, he designed and implemented the original TAU performance analysis framework. Since 1996 he has been a senior scientist at Forschungszentrum Jülich, Germany's largest multidisciplinary research center and home of one of Europe's most powerful HPC systems, a 28-rack BlueGene/Q. Since 2000, he has been the team leader of the group "Programming Environments and Performance Optimization". Besides being responsible for user support and training with regard to performance tools at the Jülich Supercomputing Centre (JSC), he is leading the KOJAK and Scalasca performance tools efforts in collaboration with Prof. Dr. Felix Wolf of GRS Aachen. Since 2007, he also serves as deputy head of the JSC division "Application support". In 2012, Bernd Mohr joined the ISC program team to help set up the programs of the ISC conference series. He is an active member in the International Exascale Software Project (IESP) and a work package leader in the European (EESI2) and Jülich (EIC, ECL) Exascale efforts. For the SC and ISC conference series, he serves on the Steering Committee. He is the author of several dozen conference and journal articles about performance analysis and tuning of parallel programs.

 

 

Frank Mueller, NC State

 

Frank Mueller (mueller@cs.ncsu.edu) is a Professor in Computer Science and a member of multiple research centers at North Carolina State University. Previously, he held positions at Lawrence Livermore National Laboratory and Humboldt University Berlin, Germany. He received his Ph.D. from Florida State University in 1994.  He has published papers in the areas of parallel and distributed systems, embedded and real-time systems and compilers.  He is a member of ACM SIGPLAN, ACM SIGBED and a senior member of the ACM and IEEE Computer Societies as well as an ACM Distinguished Scientist.  He is a recipient of an NSF Career Award, an IBM Faculty Award, a Google Research Award and a Fellowship from the Humboldt Foundation.

 

 

Raymond Namyst, University of Bordeaux

 

Raymond Namyst received his PhD from the University of Lille in 1997. He was a lecturer at the Ecole Normale Superieure de Lyon from 1998 to 2001 and became a full Professor at the University of Bordeaux in September 2002.

 

He is the scientific leader of the "Runtime" Inria Research Group, devoted to the design of high performance runtime systems for parallel architectures. His main research interests are parallel computing, scheduling on heterogeneous multiprocessor architectures (multicore, NUMA, accelerators), and communications over high speed networks. He has contributed to the development of many significant runtime systems (MPI, OpenMP) and most notably the StarPU software (http://runtime.bordeaux.inria.fr/StarPU/).

 

 

Dimitrios Nikolopoulos

Queen's University of Belfast

 

Dimitrios S. Nikolopoulos is Professor in the School of Electronics, Electrical Engineering and Computer Science at Queen's University of Belfast, where he holds the Chair in High Performance and Distributed Computing (HPDC) and is Director of Research in the HPDC Cluster.  His current research activity explores real-time data-intensive systems, energy-efficient computing and new computing paradigms at the limits of power and reliability. Professor Nikolopoulos has been awarded the NSF CAREER Award, the US DoE Early Career Principal Investigator Award, an IBM Faculty Award, a Marie Curie Fellowship, a Fellowship from HiPEAC and seven best paper awards. His research has been supported with over £20 million of highly competitive, external research funding. He is a Senior Member of the ACM and a Senior Member of the IEEE.

 

 

Christian Obrecht

CETHIL UMR 5008 (CNRS, INSA-Lyon, UCB-Lyon 1), Université de Lyon

 

Christian Obrecht is currently working as a post-doctoral researcher at the CETHIL laboratory in INSA-Lyon. He graduated in mathematics from ULP-Strasbourg in 1990 and taught at high school level until 2008. He received a MSc degree in computer science from UCB-Lyon in 2009 and afterwards served as a research engineer for EDF until 2013. He received a PhD degree from INSA-Lyon in 2012.  His research work focuses on implementation and optimization strategies for parallel CFD applications on emerging many-core architectures.

 

 

Jean-Laurent Philippe, Intel

 

Dr. Jean-Laurent Philippe is the Technical Sales Director for Enterprises at Intel Europe. His charter is to help large enterprises find the best solutions based on Intel platforms, technologies and products. Dr. Philippe has been with Intel for over 20 years, has held various positions in technical support and technical sales, and has since managed several teams and groups in technical pre-sales.

Dr. Philippe holds a PhD from INPG (Grenoble, France) in computer science (automatic parallelization for distributed-memory supercomputers) and applied mathematics (cryptography). Dr. Philippe holds 2 patents in Japan on automated parallelization techniques.

 

 

Christian Perez, INRIA

 

Dr Christian Perez is an Inria researcher. He received his Ph.D. from the Ecole Normale Supérieure de Lyon, France, in 1999. He is leading the Avalon research team at LIP (Lyon, France), a joint team between Inria, CNRS, ENS Lyon, and the University Lyon 1. Avalon deals with energy consumption, data management, programming models, and scheduling of parallel and distributed applications on distributed and HPC platforms. His research topics include parallel and distributed programming models, application deployment, and resource management. He is also leading the Inria project laboratory Héméra, which gathers more than 20 French research groups to demonstrate ambitious up-scaling techniques for large-scale distributed computing on the Grid'5000 experimental testbed.

 

 

Jelena Pjesivac-Grbovic

Google

 

Jelena Pjesivac-Grbovic is a staff software engineer in Systems Infrastructure at Google, focusing on building large-scale distributed data processing frameworks. 

 

 

Padma Raghavan, Penn State

 

 

Dan Reed

University of Iowa

 

Daniel A. Reed is Vice President for Research and Economic Development, as well as University Chair in Computational Science and Bioinformatics and Professor of Computer Science, Electrical and Computer Engineering and Medicine, at the University of Iowa.  Previously, he was Microsoft's Corporate Vice President for Technology Policy and Extreme Computing, where he helped shape Microsoft's long-term vision for technology innovations in cloud computing and the company's associated policy engagement with governments and institutions around the world.  Before joining Microsoft, he was the Chancellor's Eminent Professor at UNC Chapel Hill, as well as the Director of the Renaissance Computing Institute (RENCI) and the Chancellor's Senior Advisor for Strategy and Innovation for UNC Chapel Hill.  Prior to that, he was Gutgsell Professor and Head of the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC) and Director of the National Center for Supercomputing Applications (NCSA).

 

 

Yves Robert

ENS Lyon & Univ. Tenn. Knoxville

 

Yves Robert received his PhD degree from the Institut National Polytechnique de Grenoble. He is currently a full professor in the Computer Science Laboratory LIP at ENS Lyon. He is the author of 7 books, 130+ papers published in international journals, and 200+ papers published in international conferences. He is the editor of 11 book proceedings and 13 journal special issues, and the advisor of 26 PhD theses. His main research interests are scheduling techniques and resilient algorithms for large-scale platforms. Yves Robert has served on many editorial boards, including IEEE TPDS. He was the program chair of HiPC'2006 in Bangalore, IPDPS'2008 in Miami, ISPDC'2009 in Lisbon, ICPP'2013 in Lyon and HiPC'2013 in Bangalore. He is a Fellow of the IEEE. He was elected a Senior Member of the Institut Universitaire de France in 2007 and renewed in 2012. He was awarded the 2014 IEEE TCSC Award for Excellence in Scalable Computing. He has held a Visiting Scientist position at the University of Tennessee Knoxville since 2011.

 

 

Joel Saltz, Emory U

 

 

Satoshi Sekiguchi

National Institute of Advanced Industrial Science and Technology (AIST)

 

He received a BS from The University of Tokyo, an ME from the University of Tsukuba, and a Ph.D. in Information Science and Technology from The University of Tokyo. He joined the Electrotechnical Laboratory (ETL), Japan, in 1984 to engage in research on high-performance computing, ranging widely from system architecture to applications. He has extraordinary knowledge in applying IT-based solutions to many of society's problems related to global climate change, environmental management and resource efficiency. He served as Director of the Grid Technology Research Center and Director of the Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), and is currently Deputy General Director, Directorate for Information Technology and Electronics, AIST. He has contributed to the Open Grid Forum as a member of its board of directors, and is a member of the IEEE Computer Society and ACM, and a fellow of the Information Processing Society of Japan.

 

 

Vaidy Sunderam, Emory U

 

Vaidy Sunderam is Samuel Candler Dobbs Professor of Computer Science at Emory University. He is also Chair of the Department of Mathematics and Computer Science, and Director of the University's strategic initiative in Computational and Life Sciences. Professor Sunderam joined the Emory faculty in 1986 after receiving his PhD from the University of Kent, England where he was a Commonwealth Scholar. His research interests are in heterogeneous distributed systems and infrastructures for collaborative computing. He is the principal architect of several frameworks for metacomputing and collaboration, and his work is supported by grants from the National Science Foundation and the U.S. Department of Energy. Professor Sunderam teaches computer science courses at the beginning, advanced, and graduate levels, and advises graduate theses in the area of computer systems.  He is the recipient of several recognitions for teaching and research, including the Emory Williams Teaching award, the IEEE Gordon Bell prize for parallel processing, the IBM supercomputing award, and an R&D 100 research innovation award.

 

 

Frédéric Suter

IN2P3 Computing Center / CNRS, Lyon-Villeurbanne, France

 

Frédéric Suter has been a CNRS junior researcher at the IN2P3 Computing Center in Lyon, France, since October 2008. His research interests include scheduling, Grid computing, and platform and application simulation. He obtained his M.S. from the Université de Picardie Jules Verne, Amiens, France, in 1999 and his Ph.D. from the Ecole Normale Supérieure de Lyon, France, in 2002.

 

 

Martin Swany, Indiana U

 

Martin Swany is an Associate Professor of Computer Science in Indiana University's School of Informatics and Computing and the Associate Director of the Center for Research in Extreme Scale Technologies (CREST).  His research interests include high-performance parallel and distributed computing and networking.

 

 

Michela Taufer, U of Delaware

 

Michela Taufer joined the University of Delaware in 2007, where she was promoted to associate professor with tenure in 2012. She earned her M.S. degree in Computer Engineering from the University of Padova and her Ph.D. in Computer Science from the Swiss Federal Institute of Technology (ETH). She was a post-doctoral researcher supported by the La Jolla Interfaces in Science Training Program (also called LJIS) at UC San Diego and The Scripps Research Institute. Before she joined the University of Delaware, Michela was a faculty member in Computer Science at the University of Texas at El Paso.

 

Michela has a long history of interdisciplinary work with high-profile computational biophysics groups in several research and academic institutions. Her research interests include software applications and their advanced programmability in heterogeneous computing (i.e., multi-core platforms and GPUs); cloud computing and volunteer computing; and performance analysis, modeling and optimization of multi-scale applications.

 

She has been serving as the principal investigator of several NSF collaborative projects. She also has significant experience in mentoring a diverse population of students on interdisciplinary research. Michela's training expertise includes efforts to spread high-performance computing participation in undergraduate education and research as well as efforts to increase the interest and participation of diverse populations in interdisciplinary studies.

 

Michela has served on numerous IEEE program committees (SC and IPDPS among others) and has reviewed for most of the leading journals in parallel computing.

 

 

Marc Tchiboukdjian, CGG

 

Marc Tchiboukdjian is an HPC architect at CGG, where he is investigating new technologies for CGG's processing centers. He received his PhD in 2010 from the University of Grenoble and did his postdoc at the Exascale Computing Research center in Paris.

 

 

Rajeev Thakur, Argonne National Lab

 

Rajeev Thakur is the Deputy Director of the Mathematics and Computer Science Division at Argonne National Laboratory, where he is also a Senior Computer Scientist. He is also a Senior Fellow in the Computation Institute at the University of Chicago and an Adjunct Professor in the Department of Electrical Engineering and Computer Science at Northwestern University. He received a Ph.D. in Computer Engineering from Syracuse University in 1995.  His research interests are in the area of high-performance computing in general and particularly in parallel programming models, runtime systems, communication libraries, and scalable parallel I/O. He is a member of the MPI Forum that defines the Message Passing Interface (MPI) standard. He is also co-author of the MPICH implementation of MPI and the ROMIO implementation of MPI-IO, which have thousands of users all over the world and form the basis of commercial MPI implementations from IBM, Cray, Intel, Microsoft, and other vendors. MPICH received an R&D 100 Award in 2005. Rajeev is a co-author of the book "Using MPI-2: Advanced Features of the Message Passing Interface" published by MIT Press, which has also been translated into Japanese. He was an associate editor of IEEE Transactions on Parallel and Distributed Systems (2003-2007) and was Technical Program Chair of the SC12 conference.

 

 

Bernard Tourancheau, University Grenoble

 

Bernard Tourancheau received an MSc in Applied Maths from Grenoble University in 1986 and an MSc in Renewable Energy Science and Technology from Loughborough University in 2007. He was awarded the best Computer Science PhD by the Institut National Polytechnique of Grenoble in 1989 for his work on parallel computing for distributed memory architectures.

Working for the LIP laboratory, he was appointed assistant professor at the Ecole Normale Supérieure de Lyon in 1989 before joining CNRS as a junior researcher. After initiating a CNRS-NSF collaboration, he worked for two and a half years on leave at the University of Tennessee in a senior researcher position with the US Center for Research in Parallel Computation at the ICL laboratory.

He then took a Professor position at University of Lyon in 1995 where he created a research laboratory and the INRIA RESO team, specialized in High Speed Networking and HPC Clusters.

In 2001, he joined Sun Microsystems Laboratories for a 6-year sabbatical as a Principal Investigator in the DARPA HPCS project, where he led the backplane networking group.

Back in academia he oriented his research on sensor and actuator networks for building energy efficiency at ENS LIP and INSA CITI labs.

He was appointed Professor at University Joseph Fourier of Grenoble in 2012. Since then, he has been developing research in the LIG laboratory's Drakkar team on protocols and architectures for the Internet of Things and their applications to energy efficiency in buildings. He also continues research on communication algorithm optimization for HPC multicore and GPGPU systems, as well as scientific promotion of the renewable energy transition in the face of peak oil.

He has authored more than a hundred peer-reviewed publications and filed 10 patents.

 

 

 

Stéphane Ubeda, INRIA

 

After a PhD in Computer Science from the Ecole Normale Supérieure de Lyon in 1993, Stéphane Ubéda was an associate professor at the Swiss Federal Institute of Technology until 1994. He was an associate professor at Jean-Monnet University (Saint-Etienne) until 2000 and then joined the Institut National des Sciences Appliquées de Lyon (INSA Lyon) as a full professor in the Telecommunications department.  From 2000 to 2010, he was head of the CITI Lab, attached to INSA Lyon and associated with Inria.  His main interest concerns global mobility management architectures and protocols. In mobility management he is interested in self-organized networks, but also in sensitive issues like temporary addresses, multi-homing of mobile hosts and resource optimization (especially radio resources). He is also interested in fundamental studies of models of interaction of smart objects, such as the notion of trust in such an environment and what kind of security can be built on top of it.  Since 2010, he has been Director of Technological Development at Inria and a member of the national board of the institute. He is in charge of the coordination of software and technological developments at the national level, as well as large-scale research infrastructures.

 

 

Jeff Vetter, ORNL

 

Jeffrey Vetter, Ph.D., holds a joint appointment between Oak Ridge National Laboratory (ORNL) and the Georgia Institute of Technology (GT). At ORNL, Vetter is a Distinguished R&D Staff Member, and the founding group leader of the Future Technologies Group in the Computer Science and Mathematics Division. At GT, Vetter is a Joint Professor in the Computational Science and Engineering School, the Principal Investigator for the NSF-funded Keeneland Project that brings large scale GPU resources to NSF users through XSEDE, and the Director of the NVIDIA CUDA Center of Excellence. His papers have won awards at the International Parallel and Distributed Processing Symposium and EuroPar; he was awarded the ACM Gordon Bell Prize in 2010. His recent book "Contemporary High Performance Computing" surveys the international landscape of HPC. See his website for more information: http://ft.ornl.gov/~vetter/.

 

 

Xavier Vigouroux, Bull

 

Xavier Vigouroux, after a PhD in distributed computing from the Ecole Normale Supérieure de Lyon, worked for several major companies in different positions, from investigator at Sun Labs to support engineer for HP. He has now been working for Bull for eight years. He led the HPC benchmarking team for the first five years and was then in charge of the "Education and Research" market for HPC at Bull; he is now leading the "Center for Excellence in Parallel Programming" of Bull.

 

 

Frederic Vivien, ENS

 

 

David  Walker, Cardiff

 

David Walker is Professor of High Performance Computing in the School of Computer Science and Informatics at Cardiff University, where he heads the Distributed Collaborative Computing group. From 2002 to 2010 he was also Director of the Welsh e-Science Centre. He received a B.A. (Hons) in Mathematics from Jesus College, Cambridge in 1976, an M.Sc. in Astrophysics from Queen Mary College, London, in 1979, and a Ph.D. in Physics from the same institution in 1983. Professor Walker has conducted research into parallel and distributed algorithms and applications for the past 25 years in the UK and USA, and has published over 140 papers on these subjects. Professor Walker was instrumental in initiating and guiding the development of the MPI specification for message-passing, and has co-authored a book on MPI. He also contributed to the ScaLAPACK library for parallel numerical linear algebra computations. Professor Walker's research interests include software environments for distributed scientific computing, problem-solving environments and portals, and parallel applications and algorithms. Professor Walker is a Principal Editor of Computer Physics Communications, the co-editor of Concurrency and Computation: Practice and Experience, and serves on the editorial boards of the International Journal of High Performance Computing Applications, and the Journal of Computational Science.

 

 

 

Michael Wolfe

NVIDIA/PGI

 

Michael Wolfe has over 35 years of experience working on compilers in academia and industry.  He joined PGI in 1996 and has most recently been working on compilers for heterogeneous host+accelerator systems.  He was formerly an associate professor at OGI, and a cofounder and lead compiler engineer at KAI prior to that.  He has published one textbook, "High Performance Compilers for Parallel Computing."

 

 

 

 

 

 

 

 

CCGSC 1998 Participants, Blackberry Tennessee

 

 

 

CCGSC 2000 Participants, Faverges, France

 

 

 

CCGSC 2002 Participants, Faverges, France

 

 

 

CCGSC 2004 Participants, Faverges, France

 

 

 

CCGSC 2006 Participants, Flat Rock North Carolina

Some additional pictures can be found here.

http://web.eecs.utk.edu/~dongarra/ccgsc2006/

 

 

CCGSC 2008 Participants, Flat Rock North Carolina

http://web.eecs.utk.edu/~dongarra/ccgsc2008/

 

 

 

CCGSC 2010 Participants, Flat Rock North Carolina

http://web.eecs.utk.edu/~dongarra/ccgsc2010/

 

 

 

CCDSC 2012 Participants, Dareize, France

http://web.eecs.utk.edu/~dongarra/CCDSC-2012/index.htm

 

 

CCDSC 2014 Participants, Dareize, France