PMAA 2012

7th International Workshop on
Parallel Matrix Algorithms and Applications (PMAA 2012)
28-30 June 2012, Birkbeck, University of London, UK



PROGRAMME PMAA 2012


KEYNOTE TALKS


Keynote talk 1 Thursday 28.06.2012 09:10 - 10:00 Room: B33
Parallel multishift QR and QZ algorithms with advanced deflation strategies - recent progress
Speaker: B. Kagstrom Chair: Ahmed Sameh
Keynote talk 2 Friday 29.06.2012 14:00 - 14:50 Room: B33
Energy aware performance metrics
Speaker: C. Bekas Chair: Stratis Gallopoulos
Keynote talk 3 Saturday 30.06.2012 11:50 - 12:40 Room: B33
Hybrid methods for solving large sparse systems
Speaker: I. Duff Chair: Bernard Philippe


PARALLEL SESSIONS


Parallel session B: Thursday 28.06.2012 10:30 - 12:10

Session PS03 Room: B35
Scientific applications of heterogeneous CPU/GPU computing 1 Thursday 28.06.2012    10:30 - 12:10
Chair: Marc Baboulin Organizer: Marc Baboulin and Radek Stompor
  P062:   X. Andrade
  Real-space electronic structure calculations on graphical processing units
  P031:   M. Szydlarski, G. Fabbian, L. Grigori, R. Stompor
  Parallel spherical harmonic transforms on heterogeneous architectures (GPUs/multi-core CPUs)
  P028:   Y. Wang, M. Baboulin, Y. Fraigneau, O. Le Maitre
  A parallel solver for simulations of incompressible fluid flows
  P050:   M. Lefebvre, J. Le Gouez, T. Le
  Using GPU libraries for CFD solvers
Session PS15 Room: B33
Sparse matrix computations Thursday 28.06.2012    10:30 - 12:10
Chair: Jennifer Scott Organizer: Jennifer Scott and Jonathan Hogg
  P016:   J. Scott, J. Hogg
  Achieving bit-compatibility in a sparse direct solver
  P019:   J. Hogg
  Fast triangular solve on GPUs
  P069:   N. Dingle
  Inexact sparse matrix-vector products in the calculation of passage time distributions in large semi-Markov models
  P083:   G. Kollias, K. Kambatla, A. Grama
  Efficient large-scale graph analysis in MapReduce
Session PS16 Room: B36
Parallel methods for optimization Thursday 28.06.2012    10:30 - 12:10
Chair: Julian Hall Organizer: Julian Hall
  P015:   J. Hall, E. Smith
  Parallel revised simplex for primal block angular LP problems
  P027:   Q. Huangfu, J. Hall
  A high performance dual simplex solver
  P030:   M. Lubin, J. Hall, C. Petra, M. Anitescu
  Parallel linear-algebra decomposition methods in stochastic optimization
  P073:   J. Langguth, F. Manne, M. Halappanavar, B. Ucar
  Shared memory parallel maximum transversal algorithms
Parallel session C: Thursday 28.06.2012 13:35 - 14:50

Session PS02 Room: B35
Parallel Schur methods for solving large sparse systems of linear equations Thursday 28.06.2012    13:35 - 14:50
Chair: Achim Basermann Organizer: Achim Basermann
  P041:   M. Bollhoefer, J. Aliaga, A. Bodendiek, A. Martin, E. Quintana-Orti
  Parallel multilevel ILU for Maxwell equations
  P043:   F. Wubs, J. Thies
  A robust hybrid direct/iterative two-level solver for the 3D Navier-Stokes equations
  P061:   E. Boman, S. Rajamanickam, M. Heroux
  A hierarchical parallel Schur-complement preconditioner for sparse linear systems
Session PS21 Room: B36
Matrix methods in statistics Thursday 28.06.2012    13:35 - 14:50
Chair: George Loizou Organizer: George Loizou
  P066:   C. Gatu, C. Barna, E. Kontoghiorghes
  Directed graphs based strategies for solving the clustering problem
  P048:   S. Hadjiantoni, E. Kontoghiorghes
  Downdating the generalized least squares estimator
  P089:   S. Pollock
  On Kronecker products, tensor products and matrix differential calculus
  P065:   M. Cosbuc, C. Gatu, E. Kontoghiorghes
  A GSVD strategy for estimating the simultaneous equations model
Session PS05 Room: B33
Fast iterative solvers for large scale applications 1 Thursday 28.06.2012    13:35 - 14:50
Chair: Peter Arbenz Organizer: Peter Arbenz, Radim Blaheta and Maya Neytcheva
  P059:   P. Ghysels, T. Ashby, K. Meerbergen, W. Vanroose
  Hiding global synchronization costs in Krylov methods for systems of linear equations
  P070:   N. Budko, I. Flanegin, G. Zouros
  Accelerating the solution of a dense non-normal system arising in free-space electromagnetic scattering
  P051:   F. Dang, N. Emad, P. Fiorini
  Toward reusable numerical library for solving Hamilton-Jacobi-Bellman (HJB) equations
Parallel session D: Thursday 28.06.2012 15:20 - 17:00

Session PS07 Room: B35
Applications of multilevel methods Thursday 28.06.2012    15:20 - 17:00
Chair: Matthias Bollhoefer Organizer: Matthias Bollhoefer
  P011:   J. Thies, F. Wubs
  Robust parallel multilevel ILU for the 3D Navier-Stokes equations on structured grids
  P029:   A. Basermann, M. Zoellner
  Scalable two-level preconditioners for CFD computations on many-core systems
  P055:   M. Bolten
  A highly scalable multigrid solver for structured matrices and its application
  P018:   D. Gordon, E. Turkel, R. Gordon, S. Tsynkov
  Parallel implementation of compact sixth order schemes for the Helmholtz equation with variable wave number
Session PS08 Room: B33
Sparse matrix computations on multicore and manycore architectures Thursday 28.06.2012    15:20 - 17:00
Chair: Erik Boman Organizer: Erik Boman
  P002:   S. Zhu, T. Gu, X. Liu
  Inner product computation for sparse iterative solvers on heterogeneous supercomputers
  P017:   M. Martone
  An efficient sparse matrix storage scheme for shared memory parallel Sparse BLAS operations
  P063:   J. Fattebert, D. Osei-Kuffuor
  O(N) parallel algorithm for computing selected elements of the inverse of a Gram matrix in electronic structure calculations
  P075:   H. Thornquist, S. Rajamanickam, M. Heroux, E. Boman
  Sparse matrix techniques for next-generation parallel transistor-level circuit simulation
Session PS11 Room: B36
HPC and high-accuracy computing 1 Thursday 28.06.2012    15:20 - 17:00
Chair: Hidehiko Hasegawa Organizer: Hidehiko Hasegawa
  P045:   H. Murao, H. Hagiwara
  Exact linear-system solving via GPU-accelerated iterative method over finite-fields
  P020:   M. Nakata
  The MPACK - arbitrary accurate version of BLAS and LAPACK and acceleration on GPU
  P040:   K. Rojek, R. Wyrzykowski
  Autotuned adaptation of MPDATA to GPU accelerators
  P012:   E. Milovanovic, I. Milovanovic, M. Stojcev, T. Nikolic
  Hexagonal arrays for fault-tolerant matrix multiplication
Parallel session E: Friday 29.06.2012 08:45 - 10:25

Session PS01 Room: B36
Linear algebra on manycore architectures over runtime systems Friday 29.06.2012    08:45 - 10:25
Chair: Emmanuel Agullo Organizer: Emmanuel Agullo, Luc Giraud and Jean Roman
  P084:   H. Ltaief, A. Haidar, P. Luszczek, J. Dongarra
  Recent advances in dense matrix computations for two-sided reduction algorithms
  P078:   P. Ramet, G. Bosilca, M. Faverge, X. Lacoste, I. Yamazaki
  Toward a supernodal sparse direct solver over DAG runtimes
  P074:   G. Bosilca
  Linear algebra on distributed heterogeneous hardware with a symbolic DAG approach
  P088:   S. Nakov, E. Agullo, L. Giraud, A. Guermouche, J. Roman, S. Thibault
  Pipelining the conjugate gradient method over a runtime system
Session PS14 Room: B33
Energy aware matrix computations on multicore architectures Friday 29.06.2012    08:45 - 10:25
Chair: Daniel Kressner Organizer: Daniel Kressner and Costas Bekas
  P021:   E. Quintana-Orti, P. Alonso, M. Dolz, R. Mayo
  Energy-aware dense and sparse linear algebra
  P064:   F. Igual, M. Ali, R. van de Geijn
  Exploring the low-power and high-performance of multi-core DSPs for dense linear algebra
  P085:   H. Ltaief, P. Luszczek, J. Dongarra
  Energy footprint of advanced dense numerical linear algebra using tile algorithms on multicore architecture
  P090:   P. Raghavan
  Achieving energy-aware high performance for parallel sparse matrix and graph computations
Session PS20 Room: B35
Linear systems of equations and eigenproblem Friday 29.06.2012    08:45 - 10:25
Chair: Cristian Gatu Organizer: PMAA
  P006:   S. Fujino
  A proposal of MrsR method with one global synchronization per one iteration
  P035:   L. Karlsson, D. Kressner
  Optimally packed chains of bulges in multi-shift QR algorithms
  P025:   R. Fezzani, L. Grigori, F. Nataf
  Sparse/Low-rank block filtering decomposition preconditioner
  P077:   Y. Maeda, Y. Futamura, T. Sakurai
  Effective resource utilization for a contour integral based parallel eigensolver
Parallel session F: Friday 29.06.2012 10:55 - 12:35

Session PS04 Room: B36
Scientific applications of heterogeneous CPU/GPU computing 2 Friday 29.06.2012    10:55 - 12:35
Chair: Radek Stompor Organizer: Marc Baboulin and Radek Stompor
  P049:   J. Arnal, M. Sanchez, V. Vidal, E. Quintana
  An efficient image noise removal method on heterogeneous CPU-GPU configurations
  P057:   S. Glimberg, A. Engsig-Karup
  A generic library for large scale solution of PDEs on modern heterogeneous architectures
  P033:   L. Szustak, R. Wyrzykowski, K. Rojek
  Parallelization of MPDATA algorithm on heterogeneous CPU-GPU architectures
  P060:   F. Boillod Cerneux, S. Petiton, C. Calvin, J. Dubois
  Toward smart-tuned hybrid asynchronous Krylov eigenvalue computing with multi-restarted strategies
Session PS12 Room: B35
HPC and high-accuracy computing 2 Friday 29.06.2012    10:55 - 12:35
Chair: Hidehiko Hasegawa Organizer: Hidehiko Hasegawa
  P042:   S. Yamada, Y. Idomura, T. Imamura, M. Machida
  High performance Krylov subspace method for asymmetric linear system on fusion plasma simulation code GT5D
  P047:   T. Imamura, S. Yamada, M. Machida
  Eigen-K: high performance eigenvalue solver for symmetric matrices developed for K computer
  P004:   H. Ishigami, K. Kimura, Y. Nakamura
  Accelerating orthogonalization process of inverse iteration on multi-core CPU and GPGPU environment
  P038:   Y. Hirota, Y. Yamamoto
  An acceleration of backward transformation of singular vectors on a CPU and GPU heterogeneous environment
Session PS06 Room: B33
Fast iterative solvers for large scale applications 2 Friday 29.06.2012    10:55 - 12:35
Chair: Radim Blaheta Organizer: Peter Arbenz, Radim Blaheta and Maya Neytcheva
  P046:   P. Boyanova, O. Axelsson, M. Kronbichler, M. Neytcheva, X. Wu
  Parallelization aspects of multiphase flow simulations
  P024:   M. Mensik, T. Brzobohaty, M. Jarosova, A. Markopoulos
  Hybrid total FETI
  P081:   R. Hrtus, O. Axelsson, R. Blaheta, P. Byczanski
  Block preconditioners for poroelasticity with parallelization aspects
  P067:   P. Arbenz, E. Turan
  Preconditioning for large scale micro finite element analysis of 3D poroelasticity
Parallel session H: Friday 29.06.2012 15:20 - 17:00

Session PS09 Room: B33
Matrix functions Friday 29.06.2012    15:20 - 17:00
Chair: Costas Bekas Organizer: Costas Bekas and Stratis Gallopoulos
  P058:   A. Stathopoulos, J. Laeuchli, K. Orginos
  Hierarchical probing with applications to approximating the trace of the matrix inverse
  P013:   B. Philippe, E. Kamgnia, L. Nguenang
  A parallel method for counting eigenvalues of a large sparse matrix in the complex plane
  P079:   V. Kalantzis, C. Bekas, A. Curioni, E. Gallopoulos
  A parallel projection method for estimating the diagonal of the matrix inverse
  P086:   D. Kressner
  Reconstructing the diagonal of the matrix inverse
Session PS17 Room: B36
Automatic tuning and performance modeling Friday 29.06.2012    15:20 - 17:00
Chair: Cristian Gatu Organizer: PMAA
  P014:   J. Cuenca, J. Camara, L. Garcia, D. Gimenez
  Auto-tuned nested parallelism: a way to reduce the execution time of scientific software in NUMA systems
  P052:   E. Peise
  Performance modeling for ranking blocked algorithms
  P082:   J. Ramanujam, G. Baumgartner, P. Bhattacharya, A. Panyala
  A fusion-based optimization framework for a tensor contraction language
  P008:   M. Stojcev, T. Nikolic, E. Milovanovic, I. Milovanovic
  Communication architecture for interconnecting IP blocks in SoC design using LCDMA technique
Session PS19 Room: B35
Sparse matrix applications Friday 29.06.2012    15:20 - 17:00
Chair: Jonathan Hogg Organizer: PMAA
  P039:   A. Mansour, J. Goetze
  Sparse matrix-vector multiplication using network-on-chip
  P080:   E. Rudberg, E. Rubensson
  Chunks & Tasks -- simplifying parallelization of dynamic hierarchic algorithms aiming to enable large sparse matrix computations
  P076:   U. Borstnik, V. Weber, J. VandeVondele, I. Bethune, J. Hutter
  Sparse matrix multiplication for quantum chemical calculations
  P071:   A. Remon, P. Benner, M. Kohler, E. Quintana-Orti, J. Saak
  Acceleration of large scale Riccati equation solvers using GPUs
Parallel session I: Saturday 30.06.2012 09:15 - 11:20

Session PS10 Room: B33
Large dense eigenvalue solvers Saturday 30.06.2012    09:15 - 11:20
Chair: Inge Gutheil Organizer: Inge Gutheil, Thomas Huckle and Thomas Auckenthaler
  P005:   T. Auckenthaler, T. Huckle, R. Wittmann
  A blocked QR-decomposition for the parallel symmetric eigenvalue problem
  P032:   L. Kraemer
  The FEAST algorithm: Analysis and computational experience
  P044:   M. Petschow, P. Bientinesi
  Using mixed precisions in the solution of dense Hermitian eigenproblems
  P022:   E. Di Napoli
  Block iterative solvers for sequences of correlated dense eigenvalue problems
  P036:   I. Gutheil, F. Muenchhalfen, J. Grotendorst
  Performance of dense eigensolvers on BlueGene/Q - first results
Session PS13 Room: B35
Parallel Monte Carlo methods for matrix computations Saturday 30.06.2012    09:15 - 11:20
Chair: Aneta Karaivanova Organizer: Aneta Karaivanova
  P003:   J. Acebron, F. Bernal, A. Rodriguez-Rozas
  A parallel probabilistic-based preconditioner suited for the iterative solution of a large-scale linear system
  P007:   A. Karaivanova, E. Atanassov
  Randomized quasi-Monte Carlo for matrix computations
  P009:   T. Gurov, E. Atanassov
  Efficient Monte Carlo and quasi-Monte Carlo algorithms for inverse matrix problems
  P023:   V. Alexandrov, J. Strassburg
  Parallel algorithms for solving systems of linear algebraic equations through approximate Monte Carlo matrix inverse
  P034:   S. Pauli, P. Arbenz, C. Schwab
  Intrinsic fault tolerance of multi level Monte Carlo methods
Session PS18 Room: B36
Parallel matrix computations Saturday 30.06.2012    09:15 - 11:20
Chair: Nahid Emad Organizer: Nahid Emad
  P026:   M. Becka, G. Oksa, M. Vajtersic
  On efficient implementation of the parallel one-sided block-Jacobi SVD algorithm
  P068:   Y. Futamura, T. Sakurai, S. Furuya, J. Iwata
  Efficient algorithm for solving linear systems arising from a sparse eigensolver
  P010:   Z. Liu, M. Lamure, S. Ben Amor, N. Emad
  Modeling of epidemic spread and eigenvalue computation
  P054:   M. Pippig, D. Potts
  Massively parallel computation of nonequispaced fast Fourier transforms
  P037:   E. Kayaaslan, C. Aykanat, B. Ucar
  Sparse matrix partitioning for low communication latency and volume