Table of contents for issues of Parallel Computing

Last update: Fri Mar 6 14:13:36 MST 2026

Parallel Computing
Volume 1, Number 1, August, 1984

                    D. J. Evans   Parallel SOR iterative methods . . . . . 3--18
                    W. Gentzsch   Numerical algorithms in computational
                                  fluid dynamics on vector computers . . . 19--33
              M. J. Kascic, Jr.   Vorton dynamics: a case study of
                                  developing a fluid dynamics model for a
                                  vector processor . . . . . . . . . . . . 35--44
             P. N. Swarztrauber   FFT algorithms for vector computers  . . 45--63
               D. Parkinson and   
                  M. Wunderlich   A compact algorithm for Gaussian
                                  elimination over GF(2) implemented on
                                  highly parallel computers  . . . . . . . 65--73
                      W. Ronsch   Stability aspects in using parallel
                                  algorithms . . . . . . . . . . . . . . . 75--98
                   F. J. Peters   Parallel pivoting algorithms for sparse
                                  symmetric matrices . . . . . . . . . . . 99--110

Parallel Computing
Volume 1, Number 2, December, 1984

               C. C. Hsiung and   
                    W. Butscher   A numerical seismic $3$-D migration
                                  model for vector multiprocessors . . . . 113--120
                       M. Kratz   Vectorized finite-element stiffness
                                  generation: tuning the Noor-Lambiotte
                                  algorithm  . . . . . . . . . . . . . . . 121--132
             J. J. Dongarra and   
                 R. E. Hiromoto   A collection of parallel linear
                                  equations routines for the Denelcor HEP  133--142
                 D. C. Sorensen   Buffering for vector performance on a
                                  pipelined MIMD machine . . . . . . . . . 143--164
                      M. Bishop   The Ultracomputer as a vehicle for
                                  polymer simulations  . . . . . . . . . . 165--174
            P. Frederickson and   
                R. Hiromoto and   
               T. L. Jordan and   
                   B. Smith and   
                     T. Warnock   Pseudo-random trees in Monte Carlo . . . 175--180
                       J. Tappe   The minimal average latency of
                                  multiconfigurable pipelines  . . . . . . 181--183
                       J. Tappe   Algorithms for pipeline control  . . . . 185--188

Parallel Computing
Volume 1, Number 3--4, December, 1984

         Robert E. Hiromoto and   
             Olaf M. Lubeck and   
                    James Moore   Experiences with the Denelcor HEP  . . . 197--206
          Nisheeth R. Patel and   
                Harry F. Jordan   A parallelized point rowwise successive
                                  over-relaxation method on a
                                  multiprocessor . . . . . . . . . . . . . 207--222
           Jack J. Dongarra and   
                 Ahmed H. Sameh   On some parallel banded system solvers   223--235
                 T. Axelrod and   
                  P. Dubois and   
                    P. Eltgroth   A simulator for MIMD performance
                                  prediction: application to the S-1 MkIIa
                                  multiprocessor . . . . . . . . . . . . . 237--274
               Shao-Wen Mai and   
                    D. J. Evans   A parallel algorithm for the enumeration
                                  of the spanning trees of a graph . . . . 275--286
                  Celso Ribeiro   Performance evaluation of vector
                                  implementations of combinatorial
                                  algorithms . . . . . . . . . . . . . . . 287--294
            F. W. Bobrowicz and   
                J. E. Lynch and   
               K. J. Fisher and   
                    J. E. Tabor   Vectorized Monte Carlo photon transport  295--305
               B. L. Buzbee and   
                  H. J. Raveché   Conference on forefronts of large-scale
                                  computational problems . . . . . . . . . 307--315
                       Ad Emmen   International supercomputer applications
                                  symposium  . . . . . . . . . . . . . . . 317--319
                   Iain S. Duff   Supercomputers in Europe . . . . . . . . 321--324
               Marian Vajtersic   Parallel marching Poisson solvers  . . . 325--330
         Alberto Pettorossi and   
                Andrzej Skowron   Higher-order communications for
                                  concurrent programming . . . . . . . . . 331--336
                  Ondrej Sýkora   VLSI systems for some problems of
                                  computational geometry . . . . . . . . . 337--342
                      Anonymous   Calendar . . . . . . . . . . . . . . . . 343--344
                      Anonymous   Author index to volume 1 (1984)  . . . . 345--346

Parallel Computing
Volume 2, Number 1, March, 1985

               Roger W. Hockney   $(r_\infty,\,n_{1/2},\,s_{1/2})$
                                  measurements on the 2-CPU CRAY X-MP  . . 1--14
                     W. Handler   Dynamic computer structures for manifold
                                  utilization  . . . . . . . . . . . . . . 15--32
                       U. Meier   A parallel partition method for solving
                                  banded systems of linear equations . . . 33--43
             Daniel A. Reed and   
             Merrell L. Patrick   Parallel, iterative solution of sparse
                                  linear systems: models and architectures 45--67
                 J. J. Modi and   
                  J. S. Rollett   An algorithm for inverse square-roots    69--71
              Nikola K. Kasabov   A method for SIMD/MIMD functionally
                                  reconfigurable multimicroprocessor
                                  systems design and parallel data
                                  exchange algorithms  . . . . . . . . . . 73--78

Parallel Computing
Volume 2, Number 2, June, 1985

             Hiroshi Tamura and   
              Sachio Kamiya and   
               Takahiro Ishigai   FACOM VP-100/200: supercomputers with
                                  ease of use  . . . . . . . . . . . . . . 87--107
                  D. A. Calahan   Task granularity studies on a
                                  many-processor CRAY X-MP . . . . . . . . 109--118
                  R. W. Hockney   MIMD computing in the U.S.A.---1984  . . 119--136
             J. A. Clausing and   
                R. Hagstrom and   
                 E. L. Lusk and   
                 R. A. Overbeek   A technique for achieving portability
                                  among multiprocessors: Implementation on
                                  the Lemur  . . . . . . . . . . . . . . . 137--162
                N. C. Kalra and   
                 P. C. P. Bhatt   Parallel algorithms for tree traversals  163--171
             Wilhelm Oberaigner   Parallel algorithms for rounding exact
                                  evaluation of sums of products . . . . . 173--182

Parallel Computing
Volume 2, Number 3, November, 1985

                   G. S. Almasi   Overview of parallel processing  . . . . 191--203
                 Garry Rodrigue   Inner/outer iterative methods and
                                  numerical Schwarz algorithms . . . . . . 205--218
                     R. Ohbuchi   Overview of parallel processing research
                                  in Japan . . . . . . . . . . . . . . . . 219--228
                      C. Ghezzi   Concurrency in programming languages: a
                                  survey . . . . . . . . . . . . . . . . . 229--241
                    P. M. Kogge   Function-based computing and
                                  parallelism: a review  . . . . . . . . . 243--253
       Paul O. Frederickson and   
           Rondall E. Jones and   
                 Brian T. Smith   Synchronization and control of parallel
                                  algorithms . . . . . . . . . . . . . . . 255--264
               D. D. Gajski and   
                     J. K. Peir   Comparison of five multiprocessor
                                  systems  . . . . . . . . . . . . . . . . 265--282
                  S. E. Fahlman   Parallel processing in artificial
                                  intelligence . . . . . . . . . . . . . . 283--286
                P. C. Treleaven   Control-driven, data-driven and
                                  demand-driven computer architecture  . . 287--288

Parallel Computing
Volume 2, Number 4, December, 1985

                   Arthur Rizzi   Vector coding the finite-volume
                                  procedure for the CYBER 205  . . . . . . 295--312
                D. J. Evans and   
                         S. Mai   Two parallel algorithms for the convex
                                  hull problem in a two dimensional space  313--326
                     F. Seutter   CEPROL: a cellular programming language  327--333
              J. Staunstrup and   
            J. O. Jespersen and   
                 O. V. Johansen   Physical data representation in a
                                  multiprocessor database machine  . . . . 335--343
                 S. A. Williams   The transformation of collections of
                                  communicating sequential processes that
                                  represent pipeline configurations  . . . 345--351
                       S. Kutti   Taxonomy of parallel processing and
                                  definitions  . . . . . . . . . . . . . . 353--359

Parallel Computing
Volume 3, Number 1, March, 1986

                   J. C. Browne   Framework for formulation and analysis
                                  of parallel computation structures . . . 1--9
                  S. G. Akl and   
                     H. Schmeck   Systolic sorting in a sequential
                                  input/output environment . . . . . . . . 11--23
             J. J. Dongarra and   
                A. H. Sameh and   
                 D. C. Sorensen   Implementation of some concurrent
                                  algorithms for matrix factorization  . . 25--34
                   M. K. Seager   Parallelizing conjugate gradient for the
                                  Cray X-MP  . . . . . . . . . . . . . . . 35--47
            H. A. van der Vorst   The performance of FORTRAN
                                  implementations for preconditioned
                                  conjugate gradients on vector computers  49--58
                M. Sonnenschein   An extension of the language C for
                                  concurrent programming . . . . . . . . . 59--71
                O. Axelsson and   
                    V. Eijkhout   A note on the vectorization of scalar
                                  recursions . . . . . . . . . . . . . . . 73--83
                D. J. Evans and   
                   N. Y. Yousif   The parallel neighbor sort and $2$-way
                                  merge algorithm  . . . . . . . . . . . . 85--90

Parallel Computing
Volume 3, Number 2, May, 1986

                Harry F. Jordan   Structuring parallel algorithms in an
                                  MIMD, shared memory environment  . . . . 93--110
                Robert Hiromoto   Some issues in parallel processing as
                                  encountered on the Denelcor HEP  . . . . 111--127
                 Tim S. Axelrod   Effects of synchronization barriers on
                                  multiprocessor performance . . . . . . . 129--140
                     M. Goldapp   Fast scan-line conversion using
                                  vectorisation  . . . . . . . . . . . . . 141--152
                   Daniel Boley   Solving the generalized eigenvalue
                                  problem on a synchronous linear
                                  processor array  . . . . . . . . . . . . 153--166
                   A. Brass and   
                   G. S. Pawley   Two- and three-dimensional FFTs on
                                  highly parallel computers  . . . . . . . 167--184

Parallel Computing
Volume 3, Number 3, July, 1986

                   B. L. Buzbee   A strategy for vectorization . . . . . . 187--192
                   Iain S. Duff   Parallel implementation of multifrontal
                                  schemes  . . . . . . . . . . . . . . . . 193--204
                       U. Meier   Two parallel SOR variants of the Schwarz
                                  alternating procedure  . . . . . . . . . 205--215
                 C. B. Yang and   
                   R. C. T. Lee   The mapping of $2$-D array processors to
                                  $1$-D array processors . . . . . . . . . 217--229
               John P. Shen and   
              John P. Hayes and   
            Luigi Ciminiera and   
                   Angelo Serra   Fault-tolerance and performance analysis
                                  of beta-networks . . . . . . . . . . . . 231--249
                      E. Katona   A lattice model for cellular (systolic)
                                  algorithms . . . . . . . . . . . . . . . 251--258
                   V. Faber and   
             Olaf M. Lubeck and   
           Andrew B. White, Jr.   Superlinear speedup of an efficient
                                  sequential algorithm is not possible . . 259--260
                   D. Parkinson   Parallel efficiency can be greater than
                                  unity  . . . . . . . . . . . . . . . . . 261--262
                 J. J. Modi and   
                  J. S. Rollett   Some problems of exploiting a pipeline
                                  processor  . . . . . . . . . . . . . . . 263--265

Parallel Computing
Volume 3, Number 4, October, 1986

                H.-C. Hoppe and   
                  H. Mühlenbein   Parallel adaptive full-multigrid methods
                                  on message-based multiprocessors . . . . 269--287
                D. J. Evans and   
                   G. M. Megson   Romberg integration using systolic
                                  arrays . . . . . . . . . . . . . . . . . 289--304
                  D. Gannon and   
                     J. Panetta   Restructuring SIMPLE for the CHiP
                                  architecture . . . . . . . . . . . . . . 305--326
                   J. W. H. Liu   Computational models and task scheduling
                                  for parallel sparse Cholesky
                                  factorization  . . . . . . . . . . . . . 327--342
                     W. Oed and   
                       O. Lange   Modelling, measurement, and simulation
                                  of memory interference in the Cray X-MP  343--358

Parallel Computing
Volume 4, Number 1, February, 1987

                    T. Yuba and   
                   H. Kashiwagi   The Japanese national project for new
                                  generation supercomputing systems  . . . 1--16
                  G. C. Fox and   
                 S. W. Otto and   
                   A. J. G. Hey   Matrix algorithms on a hypercube. I.
                                  Matrix multiplication  . . . . . . . . . 17--31
                D. J. Evans and   
                   G. M. Megson   Construction of extrapolation tables by
                                  systolic arrays for solving ordinary
                                  differential equations . . . . . . . . . 33--48
                  W. Ronsch and   
                     H. Strauss   Timing results of some internal sorting
                                  algorithms on vector computers . . . . . 49--61
          P. Moller-Nielsen and   
                  J. Staunstrup   Problem-heap: a paradigm for
                                  multiprocessor algorithms  . . . . . . . 63--74
                H. Carlisle and   
                A. Crawford and   
                    S. Sheppard   ADA multitasking and the single source
                                  shortest path problem  . . . . . . . . . 75--91
                    I. Parberry   Some practical simulations of
                                  impractical parallel computers . . . . . 93--101
              E. D. Brooks, III   A butterfly processor-memory
                                  interconnection for a vector processing
                                  environment  . . . . . . . . . . . . . . 103--110

Parallel Computing
Volume 4, Number 2, April, 1987

                    D. Kamowitz   SOR and MGR$(\nu)$ experiments on the
                                  Crystal multicomputer  . . . . . . . . . 117--142
                 M. Louter-Nool   Basic linear algebra subprograms (BLAS)
                                  on the CDC Cyber 205 . . . . . . . . . . 143--165
                   R. Suros and   
                    E. Montagne   Optimizing systolic networks by fitting
                                  diagonals  . . . . . . . . . . . . . . . 167--174
              R. B. Simpson and   
                      A. Yazici   An organization of the extrapolation
                                  method for vector processing . . . . . . 175--188
                     Z. Strakos   Effectivity and optimizing of algorithms
                                  and programs on the
                                  host-computer/array-processor system . . 189--207
                   V. Faber and   
               O. M. Lubeck and   
               A. B. White, Jr.   Comments on the paper `Parallel
                                  efficiency can be greater than unity'    209--210
                     R. Janssen   A note on superlinear speedup  . . . . . 211--213
                    H. Umeo and   
                   I. Nakatsuka   A design of pipeline-interval-optimum
                                  systolic stack . . . . . . . . . . . . . 215--219
              M. P. Bekakos and   
                    D. J. Evans   A `rotating' and `folding' algorithm
                                  using a two-dimensional `systolic'
                                  communication geometry . . . . . . . . . 221--228
               Michael Kaps and   
                Michael Schlegl   A short proof for the existence of the
                                  ${WZ}$-factorisation . . . . . . . . . . 229--232

Parallel Computing
Volume 4, Number 3, June, 1987

                K. H. Cheng and   
                       S. Sahni   VLSI systems for band matrix
                                  multiplication . . . . . . . . . . . . . 239--258
                C. A. Pogue and   
                     P. Willett   Use of text signatures for document
                                  retrieval in a highly parallel
                                  environment  . . . . . . . . . . . . . . 259--268
              H. Mühlenbein and   
        M. Gorges-Schleuter and   
                      O. Kramer   New solutions to the mapping problem of
                                  parallel systems: the evolution approach 269--279
            P. Frederickson and   
                R. Hiromoto and   
                      J. Larson   A parallel Monte Carlo transport
                                  algorithm using a pseudo-random tree to
                                  guarantee reproducibility  . . . . . . . 281--290
              Y. N. Srikant and   
                     P. Shankar   A new parallel algorithm for parsing
                                  arithmetic infix expressions . . . . . . 291--304
                   Guang R. Gao   A stability classification method and
                                  its application to pipelined solution of
                                  linear recurrences . . . . . . . . . . . 305--321
               Hartmut Schwandt   An interval arithmetic method for the
                                  solution of nonlinear systems of
                                  equations on a vector computer . . . . . 323--337
                    Rami Melhem   Parallel Gauss--Jordan elimination for
                                  the solution of dense linear systems . . 339--343
                    J. Modi and   
                      R. Prager   Implementation of bubble sort and the
                                  odd-even transposition sort on a rack of
                                  transputers  . . . . . . . . . . . . . . 345--348
                    W. Gentzsch   A fully vectorizable SOR variant . . . . 349--353

Parallel Computing
Volume 5, Number 1--2, July, 1987

                      Anonymous   International Conference on Vector and
                                  Parallel Computing --- Issues in Applied
                                  Research and Development . . . . . . . . ??
             Petter E. Bjørstad   A large scale, sparse, secondary
                                  storage, direct linear equation solver
                                  for structural analysis and its
                                  implementation on vector and parallel
                                  architectures  . . . . . . . . . . . . . 3--12
                E. Clementi and   
                 J. Detrich and   
                    S. Chin and   
                G. Corongiu and   
                  D. Folsom and   
                   D. Logan and   
              R. Caltabiano and   
               A. Carnevali and   
                   J. Helin and   
                   M. Russo and   
                   A. Gnudi and   
                  P. Palamidese   Large-scale computations on a scalar,
                                  vector and parallel `supercomputer'  . . 13--44
          Henk A. van der Vorst   Large tridiagonal and block tridiagonal
                                  linear systems on vector and parallel
                                  computers  . . . . . . . . . . . . . . . 45--54
              Ameet K. Dave and   
                   Iain S. Duff   Sparse matrix calculations on the CRAY-2 55--64
                Eleanor Chu and   
                    Alan George   Gaussian elimination with partial
                                  pivoting and load balancing on a
                                  multiprocessor . . . . . . . . . . . . . 65--74
                   D. Parkinson   Organisational aspects of using parallel
                                  computers  . . . . . . . . . . . . . . . 75--83
                Alan George and   
           Michael T. Heath and   
                  Esmond Ng and   
                     Joseph Liu   Symbolic Cholesky factorization on a
                                  local-memory multiprocessor  . . . . . . 85--95
                  R. W. Hockney   Parametrization of computer performance  97--103
                    M. Itoh and   
                      K. Uchida   Trends in Fujitsu large scale computer
                                  technology . . . . . . . . . . . . . . . 105--115
          Oliver A. McBryan and   
           Eric F. Van de Velde   Matrix and vector operations on
                                  hypercube parallel processors  . . . . . 117--125
              Dianne P. O'Leary   Parallel implementation of the block
                                  conjugate gradient algorithm . . . . . . 127--139
       Catherine E. Houstis and   
           Elias N. Houstis and   
                   John R. Rice   Partitioning PDE computations: methods
                                  and performance evaluation . . . . . . . 141--163
               William D. Gropp   Solving PDEs on loosely-coupled parallel
                                  processors . . . . . . . . . . . . . . . 165--173
             J. J. Dongarra and   
                 D. C. Sorensen   A portable environment for developing
                                  parallel FORTRAN programs  . . . . . . . 175--186
                  G. W. Stewart   A parallel implementation of the
                                  $QR$-algorithm . . . . . . . . . . . . . 187--196
           Paul N. Swarztrauber   Multiprocessor FFTs  . . . . . . . . . . 197--210
         Merrell L. Patrick and   
             Daniel A. Reed and   
                Robert G. Voigt   The impact of domain partitioning on the
                                  performance of a shared memory
                                  multiprocessor . . . . . . . . . . . . . 211--217
           Jack J. Dongarra and   
               Lennart Johnsson   Solving banded systems on a parallel
                                  processor  . . . . . . . . . . . . . . . 219--246
                    T. Watanabe   Architecture and performance of NEC
                                  supercomputer SX system  . . . . . . . . 247--255
              Ruth Gonzalez and   
            Mary Fanett Wheeler   Domain decomposition for elliptic
                                  partial differential equations with
                                  Neumann boundary conditions  . . . . . . 257--263

Parallel Computing
Volume 5, Number 3, November, 1987

                 Gérard Meurant   Multitasking the conjugate gradient
                                  method on the CRAY X-MP/48 . . . . . . . 267--280
               Alan H. Karp and   
                John Greenstadt   An improved parallel Jacobi method for
                                  diagonalizing a symmetric matrix . . . . 281--294
          Nikolaos M. Missirlis   Scheduling parallel iterative methods on
                                  multiprocessor systems . . . . . . . . . 295--302
          Henk A. van der Vorst   Analysis of a parallel solution method
                                  for tridiagonal linear systems . . . . . 303--311
             Jian Ping Shao and   
                   Li Shan Kang   An asynchronous parallel mixed algorithm
                                  for linear and nonlinear equations . . . 313--321
                M. A. de Bruijn   EPS: an `elementary' programming system
                                  for the Delft Parallel Processor . . . . 323--337
            Piyush Mehrotra and   
             John Van Rosendale   The BLAZE language: a parallel language
                                  for scientific programming . . . . . . . 339--361
                 T. Hoshino and   
               T. Shirakawa and   
                      K. Tsuboi   Mesh-connected parallel computer PAX for
                                  scientific applications  . . . . . . . . 363--371
             I. Stojmenovic and   
                    D. J. Evans   Comments on two parallel algorithms for
                                  the planar convex hull problem . . . . . 373--375

Parallel Computing
Volume 6, Number 1, January, 1988

                 H. P. Zima and   
                 H.-J. Bast and   
                      M. Gerndt   SUPERB: a tool for semi-automatic
                                  MIMD/SIMD parallelization  . . . . . . . 1--18
              Joel H. Saltz and   
                  Vijay K. Naik   Towards developing robust algorithms for
                                  solving partial differential equations
                                  on MIMD machines . . . . . . . . . . . . 19--44
          Stavros A. Zenios and   
                 John M. Mulvey   A distributed algorithm for convex
                                  network optimization problems  . . . . . 45--56
                  M. Zubair and   
                    B. B. Madan   Efficient systolic algorithm for finding
                                  bridges in a connected graph . . . . . . 57--61
          David L. Cochrane and   
              Donald G. Truhlar   Strategies and performance norms for
                                  efficient utilization of vector pipeline
                                  computers as illustrated by the
                                  classical mechanical simulation of
                                  rotationally inelastic collisions  . . . 63--85
                  Zahari Zlatev   Treatment of some mathematical models
                                  describing long-range transport of air
                                  pollutants on vector processors  . . . . 87--98
                Clive Temperton   Implementation of a prime factor FFT
                                  algorithm on CRAY-1  . . . . . . . . . . 99--108
          Charles H. Romine and   
                James M. Ortega   Parallel solution of triangular systems
                                  of equations . . . . . . . . . . . . . . 109--114
                   P. Carnevali   Timing results of some internal sorting
                                  algorithms on the IBM 3090 . . . . . . . 115--117
                D. J. Evans and   
                  K. Margaritis   Optical processing of banded matrix
                                  algorithms using outer product concepts  119--125

Parallel Computing
Volume 6, Number 2, February, 1988

              F. A. Lootsma and   
                 K. M. Ragsdell   State-of-the-art in parallel nonlinear
                                  optimization . . . . . . . . . . . . . . 133--155
                D. J. Silvester   Optimising finite element matrix
                                  calculations using the general technique
                                  of element vectorisation . . . . . . . . 157--164
                    Rami Melhem   Parallel solution of linear systems with
                                  striped sparse matrices  . . . . . . . . 165--184
            Willi Schönauer and   
                   Eric Schnepf   FIDISOL: a ``black box'' solver for
                                  partial differential equations . . . . . 185--193
                   Yau Shu Wong   Solving large elliptic difference
                                  equations on CYBER 205 . . . . . . . . . 195--207
                   T. Asano and   
                        H. Umeo   Systolic algorithms for computing the
                                  visibility polygon and triangulation of
                                  a polygonal region . . . . . . . . . . . 209--216
              Mike Ashworth and   
                 Andrew G. Lyne   A segmented FFT algorithm for vector
                                  computers  . . . . . . . . . . . . . . . 217--224
              R. M. Chamberlain   Gray codes, fast Fourier transforms and
                                  hypercubes . . . . . . . . . . . . . . . 225--233
              E. D. Brooks, III   The shared memory hypercube  . . . . . . 235--245
                C. R. Askew and   
            D. B. Carpenter and   
              J. T. Chalker and   
               A. J. G. Hey and   
                   M. Moore and   
               D. A. Nicole and   
                D. J. Pritchard   Monte Carlo simulation on transputer
                                  arrays . . . . . . . . . . . . . . . . . 247--258
             M. Hatzopoulos and   
                    D. J. Evans   Comments on the paper: ``A short proof
                                  for the existence of the
                                  WZ-factorisation'' [Parallel Comput. \bf
                                  4 (1987), no. 2, 229--232, MR
                                  88j:65064a] by M. Kaps and M. Schlegl    259--259

Parallel Computing
Volume 6, Number 3, March, 1988

          William L. Briggs and   
                Thomas Turnbull   Fast Poisson solvers for MIMD computers  265--274
                 M. Cosnard and   
               M. Marrakchi and   
                  Y. Robert and   
                    D. Trystram   Parallel Gaussian elimination on an MIMD
                                  computer . . . . . . . . . . . . . . . . 275--296
                H. Y. Chang and   
                    S. Utku and   
                  M. Salama and   
                        D. Rapp   A parallel Householder
                                  tridiagonalization stratagem using
                                  scattered square decomposition . . . . . 297--311
                D. J. Evans and   
             Jian Ping Shao and   
                   Li Shan Kang   The convergence factor of the parallel
                                  Schwarz overrelaxation method for linear
                                  systems  . . . . . . . . . . . . . . . . 313--324
            B. W. Glickfeld and   
                 R. A. Overbeek   Geometric specification of scheduling
                                  constraints: a simplified approach to
                                  multiprocessing  . . . . . . . . . . . . 325--337
              E. D. Brooks, III   The indirect $k$-ary $n$-cube for a
                                  vector processing environment  . . . . . 339--348
                    M. J. Quinn   Parallel sorting algorithms for tightly
                                  coupled multiprocessors  . . . . . . . . 349--357
           Robert A. Wagner and   
             Merrell L. Patrick   A sparse matrix algorithm on the Boolean
                                  vector machine . . . . . . . . . . . . . 359--371
                   U. Harms and   
                  H. Luttermann   Experiences in benchmarking the three
                                  supercomputers CRAY-1M, CRAY-X/MP,
                                  FUJITSU VP-200 compared with the CYBER
                                  76 . . . . . . . . . . . . . . . . . . . 373--382
                      S. C. Kak   A two-layered mesh array for matrix
                                  multiplication . . . . . . . . . . . . . 383--385

Parallel Computing
Volume 7, Number 1, April, 1988

                D. A. Poplawski   Mapping rings and grids onto the FPS
                                  T-Series hypercube . . . . . . . . . . . 1--10
                  F. Darema and   
               D. A. George and   
               V. A. Norton and   
                  G. F. Pfister   A single-program-multiple-data
                                  computational model for EPEX/FORTRAN . . 11--24
              Manfred Kunde and   
           Hans-Werner Lang and   
          Manfred Schimmler and   
            Hartmut Schmeck and   
                 Heiko Schröder   The instruction systolic array and its
                                  relation to other models of parallel
                                  computers  . . . . . . . . . . . . . . . 25--39
                F. C. Kampe and   
                   T. M. Nguyen   Performance comparison of the Cray-2 and
                                  Cray X-MP on a class of seismic data
                                  processing algorithms  . . . . . . . . . 41--53
                     B. Steffen   Implementation of a resonant cavity
                                  package on MIMD computers  . . . . . . . 55--63
              H. Mühlenbein and   
        M. Gorges-Schleuter and   
                      O. Kramer   Evolution algorithms in combinatorial
                                  optimization . . . . . . . . . . . . . . 65--85
         Peter H. Michielse and   
          Henk A. van der Vorst   Data transport in Wang's partition
                                  method . . . . . . . . . . . . . . . . . 87--95
                  Mark Goldmann   Vectorisation of the multiple shooting
                                  method for the nonlinear boundary value
                                  problem in ordinary differential
                                  equations  . . . . . . . . . . . . . . . 97--110
                D. J. Evans and   
                  M. P. Bekakos   The solution of linear systems by the
                                  QIF algorithm on a wavefront array
                                  processor  . . . . . . . . . . . . . . . 111--130

Parallel Computing
Volume 7, Number 2, June, 1988

                   J. M. Ortega   The $ijk$ forms of factorization
                                  methods. I. Vector computers . . . . . . 135--147
               J. M. Ortega and   
                   C. H. Romine   The $ijk$ forms of factorization
                                  methods. II. Parallel systems  . . . . . 149--162
                      J. T. Feo   An analysis of the computational and
                                  parallel complexity of the Livermore
                                  loops  . . . . . . . . . . . . . . . . . 163--185
                 R. G. Babb and   
                   L. Storc and   
                    R. Hiromoto   Developing a parallel Monte Carlo
                                  transport algorithm using large-grain
                                  data flow  . . . . . . . . . . . . . . . 187--198
             Earl Zmijewski and   
                John R. Gilbert   A parallel algorithm for sparse symbolic
                                  Cholesky factorization on a
                                  multiprocessor . . . . . . . . . . . . . 199--210
              J. A. Kapenga and   
                  E. de Doncker   A parallelization of adaptive task
                                  partitioning algorithms  . . . . . . . . 211--225
              J. L. Gaudiot and   
                   J. I. Pi and   
                 M. L. Campbell   Program graph allocation in distributed
                                  multicomputers . . . . . . . . . . . . . 227--247
             I. Stojmenovic and   
                    M. Miyakawa   An optimal parallel algorithm for
                                  solving the maximal elements problem in
                                  the plane  . . . . . . . . . . . . . . . 249--251
                Yves Robert and   
                 Denis Trystram   Comments on scheduling parallel
                                  iterative methods on multiprocessor
                                  systems  . . . . . . . . . . . . . . . . 253--255

Parallel Computing
Volume 7, Number 3, September, 1988

                      Anonymous   2nd International SUPRENUM Colloquium    ??
             K. Solchenbach and   
                 U. Trottenberg   SUPRENUM: system essentials and grid
                                  applications . . . . . . . . . . . . . . 265--281
                    W. K. Giloi   SUPRENUM: a trendsetter in modern
                                  supercomputer development  . . . . . . . 283--296
                      K. Peinze   The SUPRENUM preprototype: status and
                                  experiences  . . . . . . . . . . . . . . 297--313
                      H. Kammer   The SUPRENUM vector floating-point unit  315--323
                    W. Schroder   PEACE: the distributed SUPRENUM
                                  operating system . . . . . . . . . . . . 325--333
                   G. Schaffler   Connecting PEACE to UNIX . . . . . . . . 335--339
                 K. Solchenbach   Grid applications on distributed memory
                                  architectures: implementation and
                                  evaluation . . . . . . . . . . . . . . . 341--356
                    O. Kolp and   
                 H. Mierendorff   Performance estimations for SUPRENUM
                                  systems  . . . . . . . . . . . . . . . . 357--366
                M. D. Ercegovac   Heterogeneity in supercomputer
                                  architectures  . . . . . . . . . . . . . 367--372
                    F. Hossfeld   Vector-supercomputers  . . . . . . . . . 373--385
                  U. Kremer and   
                 H.-J. Bast and   
                  M. Gerndt and   
                     H. P. Zima   Advanced tools and techniques for
                                  automatic parallelization  . . . . . . . 387--393
                 L. Lehmann and   
                       F. Hopfl   A model of distributed recovery for the
                                  SUPRENUM multiprocessor  . . . . . . . . 395--401
                  B. Franke and   
                 R. Harneit and   
                    A. Kern and   
                  H. C. Zeidler   The pipeline bus: an interconnection
                                  network for multiprocessor systems . . . 403--412
                  W. Ronsch and   
                     H. Strauss   A linear algebra package for a local
                                  memory multiprocessor: problems,
                                  proposals and solutions  . . . . . . . . 413--418
                     I. Gutheil   SUPRENUM software for the symmetric
                                  eigenvalue problem . . . . . . . . . . . 419--424
                      U. Herzog   Performance evaluation principles for
                                  vector- and multiprocessor systems . . . 425--438
                    R. Williams   Free-Lagrange hydrodynamics with a
                                  distributed-memory parallel processor    439--443
                 D. Seldner and   
                    M. Alef and   
              T. Westermann and   
                      E. Halter   Parallel particle simulation in high
                                  voltage diodes (algorithms and concepts
                                  for implementation on SUPRENUM)  . . . . 445--449
                   H. Capdevila   Solution of $2$-D Euler equations with a
                                  parallel code  . . . . . . . . . . . . . 451--460
                  J. Linden and   
                 B. Steckel and   
                      K. Stuben   Parallel multigrid solution of the
                                  Navier--Stokes equations on general 2D
                                  domains  . . . . . . . . . . . . . . . . 461--475
                  O. A. McBryan   New architectures: performance
                                  highlights and new algorithms  . . . . . 477--499

Parallel Computing
Volume 8, Number 1--3, October, 1988

                      Anonymous   International Conference on Vector and
                                  Parallel Processors in Computational
                                  Science III  . . . . . . . . . . . . . . ??
                  A. Kashko and   
                  H. Buxton and   
               B. F. Buxton and   
                 D. A. Castelow   Parallel matching and reconstruction
                                  algorithms in computer vision  . . . . . 3--17
                    C. Jesshope   Transputers and switches as objects in
                                  OCCAM  . . . . . . . . . . . . . . . . . 19--30
                   H. F. Jordan   Programming language concepts for
                                  multiprocessors  . . . . . . . . . . . . 31--40
             J. J. Dongarra and   
             D. C. Sorensen and   
                K. Connolly and   
                   J. Patterson   Programming methodology and performance
                                  issues for advanced computer
                                  architectures  . . . . . . . . . . . . . 41--58
                P. C. Treleaven   Parallel architecture overview . . . . . 59--70
              B. M. Forrest and   
                  D. Roweth and   
                  N. Stroud and   
              D. J. Wallace and   
                   G. V. Wilson   Neural network models  . . . . . . . . . 71--83
             R. G. Babb, II and   
                   L. Storc and   
                 P. G. Eltgroth   Parallelization schemes for $2$-D
                                  hydrodynamics codes using the
                                  independent time step method . . . . . . 85--89
                   K. Miura and   
                 R. G. Babb, II   Tradeoffs in granularity and
                                  parallelization for a Monte Carlo shower
                                  simulation code  . . . . . . . . . . . . 91--100
                  C. F. Baillie   Comparing shared and distributed memory
                                  computers  . . . . . . . . . . . . . . . 101--110
                 Thomas Brandes   Determination of dependencies in a
                                  knowledge-based parallelization tool . . 111--119
                      G. Carver   A spectral meteorological method on the
                                  ICL DAP  . . . . . . . . . . . . . . . . 121--126
                   M. Clint and   
                D. Roantree and   
                     A. Stewart   Towards the construction of an
                                  eigenvalue engine  . . . . . . . . . . . 127--132
                  A. Corona and   
                 C. Martini and   
                 M. Morando and   
                 S. Ridella and   
                     C. Rolando   Solving linear equation systems on
                                  vector computers with maximum efficiency 133--139
                 D. Crookes and   
               P. J. Morrow and   
                P. Milligan and   
           P. L. Kilpatrick and   
                    N. S. Scott   An array processing language for
                                  transputer networks  . . . . . . . . . . 141--148
                    D. Dent and   
                     M. O'Neill   Microtasking as a complement to
                                  macrotasking . . . . . . . . . . . . . . 149--154
          Peter G. Eltgroth and   
                 Mark K. Seager   The sub-implicit method: new
                                  multiprocessor algorithms for old
                                  implicit codes . . . . . . . . . . . . . 155--163
                 R. Francis and   
                   I. Mathieson   Synchronised execution on shared memory
                                  multiprocessors  . . . . . . . . . . . . 165--175
                       R. Gurke   The approximate solution of the
                                  Euclidean traveling salesman problem on
                                  a CRAY X-MP  . . . . . . . . . . . . . . 177--183
                   A. Inoue and   
                       A. Maeda   The architecture of a multi-vector
                                  processor system, VVP  . . . . . . . . . 185--193
                 T. Legendi and   
                  E. Katona and   
                    J. Toth and   
                      A. Zsoter   Megacell machine . . . . . . . . . . . . 195--199
              H. Mühlenbein and   
                  O. Kramer and   
               F. Limburger and   
               M. Mevenkamp and   
                     S. Streitz   MUPPET: a programming environment for
                                  message-based multiprocessors  . . . . . 201--221
                    W. E. Nagel   Using multiple CPUs for problem solving:
                                  experiences in multitasking on the CRAY
                                  X-MP/48  . . . . . . . . . . . . . . . . 223--230
                    S. Katz and   
                  W. A. Ray and   
                      G. Walder   Multiprocessor software for the
                                  CYBERPLUS high performance system  . . . 231--244
           J. B. G. Roberts and   
                 J. G. Harp and   
           B. C. Merrifield and   
               K. J. Palmer and   
                 P. Simpson and   
                 J. S. Ward and   
                   H. C. Webber   Evaluating parallel processors for
                                  real-time applications . . . . . . . . . 245--254
             D. F. Snelling and   
                 G.-R. Hoffmann   A comparative study of libraries for
                                  parallel processing  . . . . . . . . . . 255--266
            D. A. Tanqueray and   
                 D. F. Snelling   A distributed self-scheduler for
                                  partially ordered tasks  . . . . . . . . 267--273
                        R. Wait   Partitioning and preconditioning of
                                  finite element matrices on the DAP . . . 275--284
            H. J. Wasserman and   
              M. L. Simmons and   
                   O. M. Lubeck   The performance of minisupercomputers:
                                  Alliant FX/8, Convex C-1, and SCS-40 . . 285--293
                A. T. Brint and   
               V. J. Gillet and   
                M. F. Lynch and   
                 P. Willett and   
               G. A. Manson and   
                   G. A. Wilson   Chemical graph matching using transputer
                                  networks . . . . . . . . . . . . . . . . 295--300
                  Z. Zlatev and   
                  Phuong Vu and   
              J. Wasniewski and   
                  K. Schaumburg   Computations with symmetric, positive
                                  definite and band matrices on a parallel
                                  vector processor . . . . . . . . . . . . 301--312
                J. Berntsen and   
                  T. O. Espelid   A parallel global adaptive quadrature
                                  algorithm for hypercubes . . . . . . . . 313--323
                    R. Wait and   
                    N. G. Brown   Overlapping block methods for solving
                                  tridiagonal systems on transputer arrays 325--333
                   A. J. Davies   The boundary element method on the ICL
                                  DAP  . . . . . . . . . . . . . . . . . . 335--343
              J. J. Du Croz and   
             P. J. D. Mayes and   
              J. Wasniewski and   
                      S. Wilson   Applications of Level 2 BLAS in the NAG
                                  library  . . . . . . . . . . . . . . . . 345--350
                  C. H. Lai and   
                  H. M. Liddell   Finite elements using long vectors of
                                  the DAP  . . . . . . . . . . . . . . . . 351--361
               A. McKerrell and   
                   L. M. Delves   Monte Carlo simulation of neutron
                                  diffusion on SIMD architectures  . . . . 363--370
                      R. Reuter   Solving tridiagonal systems of linear
                                  equations on the IBM 3090 VF . . . . . . 371--376
                G. Radicati and   
                  Y. Robert and   
                   P. Sguazzero   Dense linear systems FORTRAN solvers on
                                  the IBM 3090 vector multiprocessor . . . 377--384
          C. Froese Fischer and   
                N. S. Scott and   
                         J. Yoo   Multitasking the calculation of angular
                                  integrals on the CRAY-2 and CRAY X-MP    385--390
               H. Finnemann and   
                   J. Brehm and   
                  E. Michel and   
                     J. Volkert   Solution of the neutron diffusion
                                  equation through multigrid methods
                                  implemented on a memory-coupled
                                  25-processor system  . . . . . . . . . . 391--398
                C. A. Pogue and   
            E. M. Rasmussen and   
                     P. Willett   Searching and clustering of databases
                                  using the ICL distributed array
                                  processor  . . . . . . . . . . . . . . . 399--407
                 D. F. Snelling   Standard FORTRAN 77 as a parallel
                                  language . . . . . . . . . . . . . . . . 409--414

Parallel Computing
Volume 9, Number 1, December, 1988

                  O. A. McBryan   The Connection Machine: PDE solution on
                                  65536 processors . . . . . . . . . . . . 1--24
                  O. Brewer and   
                J. Dongarra and   
                    D. Sorensen   Tools to aid in the analysis of memory
                                  access patterns for FORTRAN programs . . 25--35
               O. M. Lubeck and   
                       V. Faber   Modeling the performance of hypercubes:
                                  a case study using the particle-in-cell
                                  application  . . . . . . . . . . . . . . 37--52
                 T. Hoshino and   
                R. Hiromoto and   
               S. Sekiguchi and   
                      S. Majima   Mapping schemes of the particle-in-cell
                                  method implemented on the PAX computer   53--75
         Dieter Müller-Wichards   Performance estimates for applications:
                                  an algebraic framework . . . . . . . . . 77--106
                W. Gentzsch and   
                F. Szelenyi and   
                       V. Zecca   Use of parallel FORTRAN for engineering
                                  problems on the IBM 3090 vector
                                  multiprocessor . . . . . . . . . . . . . 107--115

Parallel Computing
Volume 9, Number 2, January, 1989

          Emile H. L. Aarts and   
                Jan H. M. Korst   Computations in massively parallel
                                  networks based on the Boltzmann machine:
                                  a review . . . . . . . . . . . . . . . . 129--145
                    J. K. Annot   A deadlock free and starvation free
                                  network of packet switching
                                  communication processors . . . . . . . . 147--162
           H. P. Barendregt and   
    M. C. J. D. Van Eekelen and   
           M. J. Plasmeijer and   
           J. R. W. Glauert and   
             J. R. Kennaway and   
                    M. R. Sleep   LEAN: an intermediate language based on
                                  graph rewriting  . . . . . . . . . . . . 163--177
                    D. I. Bevan   An efficient reference counting solution
                                  to the distributed garbage collection
                                  problem  . . . . . . . . . . . . . . . . 179--192
                    W. Damm and   
                      G. Dohmen   Specifying distributed computer
                                  architectures in AADL  . . . . . . . . . 193--211
                  O. Krämer and   
                  H. Mühlenbein   Mapping strategies in message-based
                                  multiprocessor systems . . . . . . . . . 213--225
               A. R. Martin and   
                   J. V. Tucker   The concurrent assignment representation
                                  of synchronous systems . . . . . . . . . 227--256
                    P. H. Welch   Emulating digital logic using transputer
                                  networks (very high
                                  parallelism=simplicity=performance)  . . 257--272

Parallel Computing
Volume 9, Number 3, February, 1989

                     R. Hockney   Synchronization and communication
                                  overheads on the LCAP multiple FPS-164
                                  computer system  . . . . . . . . . . . . 279--290
           Chandrika Kamath and   
                    Ahmed Sameh   A projection method for solving
                                  nonsymmetric linear systems on
                                  multiprocessors  . . . . . . . . . . . . 291--312
               Robert A. Wagner   Parallel solution of arbitrarily sparse
                                  linear systems . . . . . . . . . . . . . 313--331
             Loyce M. Adams and   
               Elizabeth G. Ong   Additive polynomial preconditioners for
                                  parallel computers . . . . . . . . . . . 333--345
                Israel Gottlieb   The partitioning of QSDF computation
                                  graphs . . . . . . . . . . . . . . . . . 347--358
             Ilio Galligani and   
               Valeria Ruggiero   Solving large systems of linear ordinary
                                  differential equations on a vector
                                  computer . . . . . . . . . . . . . . . . 359--365
    M. Bessenrodt-Weberpals and   
                   H. Weberpals   A fast vector algorithm for solving
                                  tridiagonal linear equations . . . . . . 367--372
                D. J. Evans and   
              K. Margaritis and   
                  M. P. Bekakos   Systolic and holographic pyramidical
                                  soft-systolic designs for successive
                                  matrix powers  . . . . . . . . . . . . . 373--384
                 M. Cosnard and   
             A. G. Ferreira and   
                    H. Herbelin   The two list algorithm for the knapsack
                                  problem on a FPS T20 . . . . . . . . . . 385--388

Parallel Computing
Volume 10, Number 1, March, 1989

                   A. Greenbaum   Synchronization costs on multiprocessors 3--14
                Th. Ruppelt and   
                       G. Wirtz   Automatic transformation of high-level
                                  object-oriented specifications into
                                  parallel programs  . . . . . . . . . . . 15--28
                    C. McCrosky   Realizing the parallelism of array-based
                                  computation  . . . . . . . . . . . . . . 29--43
                   Y. Wolfstahl   Mapping parallel programs to
                                  multiprocessors: a dynamic approach  . . 45--50
                    J. Gary and   
                     L. Fosdick   An optimizing precompiler for
                                  finite-difference computations on a
                                  vector computer  . . . . . . . . . . . . 51--64
                 J.-F. Hake and   
                     W. Homberg   Linear algebra software on a vector
                                  computer . . . . . . . . . . . . . . . . 65--81
               Aydin Üresin and   
                  Michel Dubois   Sufficient conditions for the
                                  convergence of asynchronous iterations   83--92
                V. Eijkhout and   
                 P. Vassilevski   Positive definiteness aspects of
                                  vectorizable preconditioners . . . . . . 93--100
           Susumu Horiguchi and   
            Willard L. Miranker   A parallel algorithm for finding the
                                  maximum value  . . . . . . . . . . . . . 101--108
              Kam-Hoi Cheng and   
                   Sartaj Sahni   A new VLSI system for adaptive recursive
                                  filtering  . . . . . . . . . . . . . . . 109--115
             Michel Cosnard and   
           Maurice Tchuente and   
            Bernard Tourancheau   Systolic Gauss--Jordan elimination for
                                  dense linear systems . . . . . . . . . . 117--122
                    H. M. Amman   Nonlinear control simulation on a vector
                                  machine  . . . . . . . . . . . . . . . . 123--127

Parallel Computing
Volume 10, Number 2, April, 1989

                     Nigel Dodd   Graph matching by stochastic
                                  optimisation applied to the
                                  implementation of multi layer
                                  perceptrons on transputer networks . . . 135--142
             E. Gallopoulos and   
                        Y. Saad   A parallel block cyclic reduction
                                  algorithm for the fast solution of
                                  elliptic equations . . . . . . . . . . . 143--159
              Wolfgang Pelz and   
                Layne T. Watson   Message length effects for solving
                                  polynomial systems on a hypercube  . . . 161--176
               Michael R. Leuze   Independent set orderings for parallel
                                  matrix factorization by Gaussian
                                  elimination  . . . . . . . . . . . . . . 177--191
                D. J. Evans and   
                 A. M. S. Rahma   The numerical solution of Fredholm
                                  integral equations on parallel computers 193--205
               G. M. Megson and   
                    D. J. Evans   Algorithmic fault tolerance for matrix
                                  operations on triangular arrays  . . . . 207--219
                      S. Storoy   Holistic algorithms: a paradigm for
                                  multiprocessor programming . . . . . . . 221--229
            Ferng-Ching Lin and   
                      R. Charng   Pin reduction through variable
                                  duplications and substitutions in a data
                                  dependence graph . . . . . . . . . . . . 231--238
                 John M. Conroy   A note on the parallel Cholesky
                                  factorization of wide banded matrices    239--246
                 M. Cosnard and   
                  Y. Robert and   
                 B. Tourancheau   Evaluating speedups on distributed
                                  memory architectures . . . . . . . . . . 247--253

Parallel Computing
Volume 10, Number 3, May, 1989

                     J. J. Hack   On the promise of general-purpose
                                  parallel computing . . . . . . . . . . . 261--275
              R. W. Hockney and   
                I. J. Curington   $f_{1/2}$: a parameter to characterize
                                  memory and communication bottlenecks . . 277--286
                Alan George and   
           Joseph W. H. Liu and   
                      Esmond Ng   Communication results for parallel
                                  sparse Cholesky factorization on a
                                  hypercube  . . . . . . . . . . . . . . . 287--298
                R. M. Hyatt and   
                B. W. Suter and   
                   H. L. Nelson   A parallel alpha/beta tree searching
                                  algorithm  . . . . . . . . . . . . . . . 299--308
                         Tao Li   Parallel implementation of rule-based
                                  expert systems for interactive
                                  applications . . . . . . . . . . . . . . 309--318
                   M. Malek and   
                       E. Opper   The cylindrical banyan multicomputer: a
                                  reconfigurable systolic architecture . . 319--327
                    C. Holt and   
                     A. Stewart   A parallel thinning algorithm with fine
                                  grain subtasking . . . . . . . . . . . . 329--334
      James R. A. Allwright and   
                D. B. Carpenter   A distributed implementation of
                                  simulated annealing for the travelling
                                  salesman problem . . . . . . . . . . . . 335--338
                G. de Biase and   
                  P. Ciucci and   
                     M. Cottone   Vectorized algorithms for astronomical
                                  image processing . . . . . . . . . . . . 339--346
           Jong-Chuang Tsay and   
               Yodung-Chang Hou   Generating function and equivalent
                                  transformation for systolic arrays . . . 347--356
              M. P. Bekakos and   
                    D. J. Evans   Relative performance comparisons for the
                                  group explicit class of methods on MIMD,
                                  SIMD and pipelined vector computers  . . 357--364

Parallel Computing
Volume 11, Number 1, July, 1989

                  K. Ohmaki and   
                  S. Tomura and   
                   K. Inoue and   
                     T. Ito and   
                     K. Ito and   
                       K. Torii   TERM: a parallel executable graph
                                  reduction machine for equational
                                  language . . . . . . . . . . . . . . . . 1--16
            M. E. Henderson and   
                 W. L. Miranker   Synergy in parallel algorithms . . . . . 17--35
         A. T. Chronopoulos and   
                     C. W. Gear   On the efficient implementation of
                                  preconditioned $s$-step conjugate
                                  gradient methods on multiprocessors with
                                  memory hierarchy . . . . . . . . . . . . 37--53
                Eleanor Chu and   
                    Alan George   $QR$ factorization of a dense matrix on
                                  a shared-memory multiprocessor . . . . . 55--71
               Joseph W. H. Liu   Reordering sparse matrices for parallel
                                  elimination  . . . . . . . . . . . . . . 73--91
           David A. Carlson and   
                    Binay Sugla   Adapting shuffle-exchange like parallel
                                  processing organizations to work as
                                  systolic arrays  . . . . . . . . . . . . 93--106
                   C. Temperton   Further measurements of
                                  $(r_\infty,n_{1/2})$ on the CRAY-1 and
                                  CRAY X-MP  . . . . . . . . . . . . . . . 107--111
                       C. Lecot   An algorithm for generating low
                                  discrepancy sequences on vector
                                  computers  . . . . . . . . . . . . . . . 113--116
               J. Moscinski and   
               Z. A. Rycerz and   
                P. W. M. Jacobs   Timing results of some internal sorting
                                  algorithms on the ETA 10-P . . . . . . . 117--119
                J. M. Troya and   
                      M. Ortega   A study of parallel branch-and-bound
                                  algorithms with best-bound-first search  121--126

Parallel Computing
Volume 11, Number 2, 1989

                Youcef Saad and   
              Martin H. Schultz   Data communication in parallel
                                  architectures  . . . . . . . . . . . . . 131--150
               A. M. Frieze and   
                 J. Yadegar and   
              S. El-Horbaty and   
                   D. Parkinson   Algorithms for assignment problems on an
                                  array processor  . . . . . . . . . . . . 151--162
                E. Adamides and   
               Ph. Tsalides and   
                 A. Thanailakis   Synchronization of asynchronous
                                  concurrent processes using cellular
                                  automata . . . . . . . . . . . . . . . . 163--169
           Christian H. Bischof   Computing the singular value
                                  decomposition on a distributed system of
                                  vector processors  . . . . . . . . . . . 171--186
               J. P. Bonomo and   
                   W. R. Dyksen   Pipelined iterative methods for shared
                                  memory machines  . . . . . . . . . . . . 187--199
                 Gita Alaghband   Parallel pivoting combined with parallel
                                  reduction and fill-in control  . . . . . 201--221
     G. Radicati di Brozolo and   
                      Y. Robert   Parallel conjugate gradient-like
                                  algorithms for solving sparse
                                  nonsymmetric linear systems on a vector
                                  multiprocessor . . . . . . . . . . . . . 223--239
           Stanley C. Eisenstat   Comments on scheduling parallel
                                  iterative methods on multiprocessor
                                  systems. II  . . . . . . . . . . . . . . 241--244
                D. J. Evans and   
                   B. B. Sanugi   A parallel Runge--Kutta integration
                                  method . . . . . . . . . . . . . . . . . 245--251

Parallel Computing
Volume 11, Number 3, 1989

            David E. Womble and   
      Richard C. Allen, Jr. and   
               Lorraine S. Baca   Invariant imbedding and the method of
                                  lines for parallel computers . . . . . . 263--273
                  Xiaobo Li and   
                    Zhi Xi Fang   Parallel clustering algorithms . . . . . 275--290
               E. L. Zapata and   
               F. F. Rivera and   
                O. G. Plata and   
                   M. A. Ismail   Parallel fuzzy clustering on fixed size
                                  hypercube SIMD computers . . . . . . . . 291--303
               D. W. Lozier and   
                     R. G. Rehm   Some performance comparisons for a fluid
                                  dynamics code  . . . . . . . . . . . . . 305--320
               G. S. Pawley and   
              C. F. Baillie and   
               E. Tenenbaum and   
                   W. Celmaster   The BBN Butterfly used to simulate a
                                  molecular liquid . . . . . . . . . . . . 321--329
                 J. Glasgow and   
                 M. Jenkins and   
                  H. Meijer and   
                    C. McCrosky   Expressing parallel algorithms in Nial   331--347
               V. K. Murthy and   
                    H. Schröder   Systolic arrays for parallel matrix
                                  $g$-inversion and finding Petri net
                                  invariants . . . . . . . . . . . . . . . 349--359
                 W. Ewinger and   
                    O. Haan and   
              E. Haupenthal and   
                     C. Siemers   Modelling and measurement of memory
                                  access in SIEMENS VP supercomputers  . . 361--365
                I.-C. Chang Jou   Linear rotation based algorithm and
                                  systolic architecture for solving linear
                                  system equations . . . . . . . . . . . . 367--379
             Jang-Ping Sheu and   
               Chun-lien Wu and   
                  Gen-Huey Chen   Selection of the first k largest
                                  processes in hypercubes  . . . . . . . . 381--384
                    D. J. Evans   A systolic design for the Aitken
                                  extrapolation formula  . . . . . . . . . 385--388

Parallel Computing
Volume 12, Number 1, October, 1989

            Horace P. Flatt and   
                    Ken Kennedy   Performance of parallel processors . . . 1--20
                    L. Brochard   Efficiency of some parallel numerical
                                  algorithms on distributed systems  . . . 21--44
                 R. S. Barr and   
             R. V. Helgason and   
               J. L. Kennington   Minimal spanning trees: an empirical
                                  investigation of parallel algorithms . . 45--52
              Kam Hoi Cheng and   
                       S. Sahni   VLSI architectures for back substitution 53--69
       Hussein M. Alnuweiri and   
           V. K. Prasanna Kumar   An efficient VLSI architecture with
                                  applications to geometric problems . . . 71--93
                E. Babolian and   
                   L. M. Delves   Parallel solution of Fredholm integral
                                  equations  . . . . . . . . . . . . . . . 95--106
          Manfred Schimmler and   
                 Heiko Schröder   A simple systolic method to find all
                                  bridges of an undirected graph . . . . . 107--111
                    H. Bohr and   
               K. S. Jensen and   
                T. Petersen and   
                 B. Rathjen and   
               E. Mosekilde and   
         N.-H. Holstein-Rathlou   Parallel computer simulation of
                                  nearest-neighbour interaction in a
                                  system of nephrons . . . . . . . . . . . 113--120
             David J. Evans and   
               Ivan Stojmenović   On parallel computation of Voronoĭ
                                  diagrams . . . . . . . . . . . . . . . . 121--125

Parallel Computing
Volume 12, Number 2, November, 1989

                    L. Hart and   
                   S. McCormick   Asynchronous multilevel adaptive methods
                                  for solving partial differential
                                  equations on multiprocessors: basic
                                  ideas  . . . . . . . . . . . . . . . . . 131--144
               S. McCormick and   
                     D. Quinlan   Asynchronous multilevel adaptive methods
                                  for solving partial differential
                                  equations on multiprocessors:
                                  performance results  . . . . . . . . . . 145--156
            N. S. Arenstorf and   
                   H. F. Jordan   Comparing barrier algorithms . . . . . . 157--170
  Theodore S. Papatheodorou and   
           Yiannis G. Saridakis   Parallel algorithms and architectures
                                  for multisplitting iterative methods . . 171--182
           Mounir Marrakchi and   
                    Yves Robert   Optimal algorithms for Gaussian
                                  elimination on an MIMD computer  . . . . 183--194
          Concettina Guerra and   
                    Rami Melhem   Synthesis of systolic algorithm design   195--207
              C. F. Baillie and   
                   G. S. Pawley   A comparison of the CM with the DAP for
                                  lattice gauge theory . . . . . . . . . . 209--220
                Frank Dehne and   
     Anne-Lise Hassenklover and   
              Jörg-Rüdiger Sack   Computing the configuration space for a
                                  robot on a mesh-of-processors  . . . . . 221--231
              Hans-Jürgen Hotop   New Kalman filter algorithms based on
                                  orthogonal transformations for serial
                                  and vector computers . . . . . . . . . . 233--247
                 A. Benaini and   
                      Y. Robert   An even faster systolic array for matrix
                                  multiplication . . . . . . . . . . . . . 249--254

Parallel Computing
Volume 12, Number 3, December, 1989

                F. Hossfeld and   
                  R. Knecht and   
                    W. E. Nagel   Multitasking: experience with
                                  applications on a CRAY X-MP  . . . . . . 259--283
                   Hiroshi Umeo   A design of time-optimum and
                                  register-number-minimum systolic
                                  convolvers . . . . . . . . . . . . . . . 285--299
                  N. Petkov and   
                     F. Sloboda   A bit-level systolic array for digital
                                  contour smoothing  . . . . . . . . . . . 301--313
                   E. Eskow and   
                 R. B. Schnabel   Mathematical modeling of a parallel
                                  global optimization algorithm  . . . . . 315--325
               P. Fernandes and   
                    P. Girdinio   A new storage scheme for an efficient
                                  implementation of the sparse
                                  matrix-vector product  . . . . . . . . . 327--333
                    J. Berntsen   Communication efficient matrix
                                  multiplication on hypercubes . . . . . . 335--342
                I. Gladwell and   
                      R. I. Hay   Vector- and parallelisation of ODE BVP
                                  codes  . . . . . . . . . . . . . . . . . 343--350
                  T. L. Freeman   Calculating polynomial zeros on a local
                                  memory parallel computer . . . . . . . . 351--358
  George T. Papaspyropoulos and   
                 D. G. Maritsas   Parallel discrete event simulation with
                                  SIMULA . . . . . . . . . . . . . . . . . 359--373
          Tsung Chuan Huang and   
              Jhing-Fa Wang and   
              Chu Sing Yang and   
                   Jau-Yien Lee   Graph theoretic characterization and
                                  reliability of the generalized Boolean
                                  $n$-cube network . . . . . . . . . . . . 375--385

Parallel Computing
Volume 13, Number 1, January, 1990

              P. Sadayappan and   
                   F. Ercal and   
                   J. Ramanujam   Cluster partitioning approaches to
                                  mapping parallel programs onto a
                                  hypercube  . . . . . . . . . . . . . . . 1--16
                 M. R. Exum and   
                  J. L. Gaudiot   Network design and allocation
                                  considerations in the Hughes data-flow
                                  machine  . . . . . . . . . . . . . . . . 17--34
               P. Carnevali and   
                    M. Kindelan   A simplified model to predict the
                                  performance of FORTRAN vector loops on
                                  the IBM 3090/VF  . . . . . . . . . . . . 35--46
                   H. Weberpals   Architectural approach to the IBM 3090E
                                  vector performance . . . . . . . . . . . 47--59
                      M. Zubair   An optimal speedup algorithm for the
                                  measure problem  . . . . . . . . . . . . 61--71
            Ronald J. Leach and   
           O. Michael Atogi and   
             Razeyah R. Stephen   The actual complexity of parallel
                                  evaluation of low degree polynomials . . 73--83
                   G. M. Megson   Rank annihilation on a ring of
                                  processors . . . . . . . . . . . . . . . 85--94
                    J. Zerovnik   A parallel variant of a heuristical
                                  algorithm for graph colouring  . . . . . 95--100
                Herbert Fischer   Automatic differentiation: parallel
                                  computation of function, gradient, and
                                  Hessian matrix . . . . . . . . . . . . . 101--110
              Gen-Huey Chen and   
            Maw-Sheng Chern and   
                 Jin-Hwang Jang   Pipeline architectures for dynamic
                                  programming algorithms . . . . . . . . . 111--117
                 J. C. Tsay and   
                      C. J. Lin   A systolic design for generating
                                  combinations in lexicographic order  . . 119--125

Parallel Computing
Volume 13, Number 2, February, 1990

                 Z. C. Shih and   
               R. C. T. Lee and   
                     S. N. Yang   A parallel algorithm for finding
                                  congruent regions  . . . . . . . . . . . 135--142
               Sajal K. Das and   
               Narsingh Deo and   
                  Sushil Prasad   Parallel graph algorithms for hypercube
                                  computers  . . . . . . . . . . . . . . . 143--158
                     H. Eckardt   System performance and execution of
                                  scientific algorithms on the parallel
                                  computer Parawell  . . . . . . . . . . . 159--173
            R. R. Oldehoeft and   
                   J. R. McGraw   Mixed applicative and imperative
                                  programs . . . . . . . . . . . . . . . . 175--191
              A. De Matteis and   
                    S. Pagnutti   A class of parallel random number
                                  generators . . . . . . . . . . . . . . . 193--198
                G. A. Geist and   
                    G. J. Davis   Finding eigenvalues and eigenvectors of
                                  unsymmetric matrices using a
                                  distributed-memory multiprocessor  . . . 199--209
              M. K. Stojčev and   
          E. I. Milovanović and   
              I. Ž. Milovanović   An algorithm for multiplication of
                                  concatenated matrices  . . . . . . . . . 211--223
                    W. E. Nagel   Exploiting autotasking on a CRAY Y-MP:
                                  an improved software interface to
                                  multitasking . . . . . . . . . . . . . . 225--233
              Gen Huey Chen and   
                 Hong Fa Ho and   
             Shieu Hong Lin and   
                 Jang-Ping Sheu   Data mapping of linear programming on
                                  fixed-size hypercubes  . . . . . . . . . 235--243
             Jang-Ping Sheu and   
               Nan-Ling Kuo and   
                  Gen-Huey Chen   Graph search algorithms and maximum
                                  bipartite matching algorithm on the
                                  hypercube network model  . . . . . . . . 245--251
                 Chii Huah Shyu   A parallel algorithm for finding a
                                  maximum weight clique of an interval
                                  graph  . . . . . . . . . . . . . . . . . 253--256

Parallel Computing
Volume 13, Number 3, March, 1990

          J. Dantas De Melo and   
               J. L. Calvet and   
                   J. M. Garcia   Vectorization and multitasking of
                                  dynamic programming in control:
                                  experiments on a CRAY-2  . . . . . . . . 261--269
                 R. Morandi and   
                    F. Sgallari   Parallel algorithms for the iterative
                                  solution of sparse least-squares
                                  problems . . . . . . . . . . . . . . . . 271--280
               J. S. Weston and   
                       M. Clint   Two algorithms for the parallel
                                  computation of eigenvalues and
                                  eigenvectors of large symmetric matrices
                                  using the ICL DAP  . . . . . . . . . . . 281--288
           Hyoung Joong Kim and   
                   Jang Gyu Lee   A parallel algorithm solving a
                                  tridiagonal Toeplitz linear system . . . 289--294
                 S. J. Shyu and   
                   R. C. T. Lee   Solving the set cover problem on a
                                  supercomputer  . . . . . . . . . . . . . 295--300
        E. V. Krishnamurthy and   
                   M. Kunde and   
               M. Schimmler and   
                    H. Schröder   Systolic algorithm for tensor products
                                  of matrices: implementation and
                                  applications . . . . . . . . . . . . . . 301--308
                      G. R. Gao   Exploiting fine-grain parallelism on
                                  dataflow architectures . . . . . . . . . 309--320
                  R. Doallo and   
                   E. L. Zapata   A VLSI systolic architecture for solving
                                  DBT-transformed fuzzy clustering
                                  problems of arbitrary size . . . . . . . 321--335
                 P. Lenders and   
                    H. Schröder   A programmable systolic device for image
                                  processing based on mathematical
                                  morphology . . . . . . . . . . . . . . . 337--344
             D. W. Heermann and   
                  A. N. Burkitt   Parallelization of the Ising model and
                                  its performance evaluation . . . . . . . 345--357
                   P. Michielse   Parallel adaptive reservoir simulation   359--368
              R. M. R. Page and   
                 S. F. Reddaway   The DAP as a filestore search engine . . 369--376
          Pierre Fraigniaud and   
               Serge Miguet and   
                    Yves Robert   Scattering on a ring of processors . . . 377--383

Parallel Computing
Volume 14, Number 1, May, 1990

               Pelle Olsson and   
            S. Lennart Johnsson   A dataparallel implementation of an
                                  explicit method for the
                                  three-dimensional compressible
                                  Navier--Stokes equations . . . . . . . . 1--30
               Arno Krechel and   
          Hans-Joachim Plum and   
                   Klaus Stüben   Parallelization and vectorization
                                  aspects of the solution of tridiagonal
                                  linear systems . . . . . . . . . . . . . 31--49
               F. F. Rivera and   
                  R. Doallo and   
             J. D. Bruguera and   
               E. L. Zapata and   
                      R. Peskin   Gaussian elimination with pivoting on
                                  hypercubes . . . . . . . . . . . . . . . 51--60
                   U. Block and   
                 A. Frommer and   
                       G. Mayer   Block colouring schemes for the SOR
                                  method on local memory parallel
                                  computers  . . . . . . . . . . . . . . . 61--75
                D. J. Evans and   
                  K. Margaritis   Systolic designs for
                                  eigenvalue-eigenvector computations
                                  using matrix powers  . . . . . . . . . . 77--87
           Jau-Hsiung Huang and   
              Leonard Kleinrock   Optimal parallel merging and sorting
                                  algorithms using $\sqrt {N}$ processors
                                  without memory contention  . . . . . . . 89--97
                 W. Hasselbring   CELIP: a Cellular Language for Image
                                  Processing . . . . . . . . . . . . . . . 99--109
                    D. J. Evans   A parallel sorting-merging algorithm for
                                  tightly coupled multiprocessors  . . . . 111--121

Parallel Computing
Volume 14, Number 2, June, 1990

               Ramesh Natarajan   A parallel algorithm for the generalized
                                  symmetric eigenvalue problem on a hybrid
                                  multiprocessor . . . . . . . . . . . . . 129--150
            John R. Gilbert and   
          Hjálmtýr Hafsteinsson   Parallel symbolic factorization of
                                  sparse linear systems  . . . . . . . . . 151--162
       Sanjay V. Rajopadhye and   
            Richard M. Fujimoto   Synthesizing systolic arrays from
                                  recurrence equations . . . . . . . . . . 163--189
                L. Brugnano and   
                     M. Marrone   Vectorization of some block
                                  preconditioned conjugate gradient
                                  methods  . . . . . . . . . . . . . . . . 191--198
                   G. M. Megson   A systolic helix for matrix
                                  triangularisation with partial pivoting  199--206
              A. de Matteis and   
                    S. Pagnutti   Long-range correlations in linear and
                                  nonlinear random number generators . . . 207--210
                      J. Li and   
                   A. Brass and   
                 D. J. Ward and   
                      B. Robson   A study of parallel molecular dynamics
                                  algorithms for $N$-body simulations on a
                                  transputer system  . . . . . . . . . . . 211--222
               Basile Louka and   
               Maurice Tchuente   Triangular matrix inversion on systolic
                                  arrays . . . . . . . . . . . . . . . . . 223--228
               T. Theoharis and   
                     J. J. Modi   Implementation of matrix multiplication
                                  on the T-RACK  . . . . . . . . . . . . . 229--233
                        Liwu Li   Systolic computation with fault
                                  diagnosis  . . . . . . . . . . . . . . . 235--243

Parallel Computing
Volume 14, Number 3, August, 1990

                  H. Mühlenbein   Limitations of multi-layer perceptron
                                  networks-steps towards genetic neural
                                  networks . . . . . . . . . . . . . . . . 249--260
               F. J. Smieja and   
                  H. Mühlenbein   The geometry of multi-layer perceptron
                                  solutions  . . . . . . . . . . . . . . . 261--275
              J. Kindermann and   
                      A. Linden   Inversion of neural networks by gradient
                                  descent  . . . . . . . . . . . . . . . . 277--286
                    T. E. Lange   Simulation of heterogeneous neural
                                  networks on serial and parallel machines 287--303
                      A. Singer   Implementations of artificial neural
                                  networks on the Connection Machine . . . 305--315
                 Xiru Zhang and   
                 M. McKenna and   
              J. P. Mesirov and   
                    D. L. Waltz   The backpropagation algorithm on grid
                                  and hypercube architectures  . . . . . . 317--327
                M. Witbrock and   
                       M. Zagha   An implementation of backpropagation
                                  learning on GF11, a large SIMD parallel
                                  computer . . . . . . . . . . . . . . . . 329--346
                 D. Whitley and   
            T. Starkweather and   
                      C. Bogart   Genetic algorithms and neural networks:
                                  optimizing connections and connectivity  347--361
          M. F. da Mota Tenorio   Topology synthesis networks:
                                  self-organization of structure and
                                  weight adjustment as a learning paradigm 363--380
               K. Obermayer and   
                  H. Ritter and   
                    K. Schulten   Large-scale simulations of
                                  self-organizing neural networks on
                                  parallel computers: application to
                                  biological modelling . . . . . . . . . . 381--404
                R. W. Kentridge   Neural networks for learning in the real
                                  world: representation, reinforcement and
                                  dynamics . . . . . . . . . . . . . . . . 405--414

Parallel Computing
Volume 15, Number 1--3, September, 1990

                  S. Knecht and   
                E. Laermann and   
                    W. E. Nagel   Parallelizing QCD with dynamical
                                  fermions on a Cray multiprocessor system 3--20
            Ibrahim N. Hajj and   
                   Stig Skelboe   A multilevel parallel solver for block
                                  tridiagonal and banded linear systems    21--45
        F. F. Van der Vlugt and   
            D. A. van Delft and   
               A. F. Bakker and   
             T. H. van der Meer   The implementation of a $3$D
                                  Navier--Stokes algorithm on an algorithm
                                  oriented processor . . . . . . . . . . . 47--60
              Amir Averbuch and   
                Eran Gabber and   
             Boaz Gordissky and   
                     Yoav Medan   A parallel FFT on an MIMD machine  . . . 61--74
             Michel Cosnard and   
              Pierre Fraigniaud   Finding the roots of a polynomial on an
                                  MIMD multicomputer . . . . . . . . . . . 75--85
                  I. Garcia and   
               J. J. Merelo and   
             J. D. Bruguera and   
                   E. L. Zapata   Parallel quadrant interlocking
                                  factorization on hypercube computers . . 87--100
              T. Z. Kalamboukis   The symmetric tridiagonal eigenvalue
                                  problem on a transputer network  . . . . 101--106
                 J. Boreddy and   
                     A. Paulraj   On the performance of transputer arrays
                                  for dense linear systems . . . . . . . . 107--117
                  L. Bomans and   
                   D. Roose and   
                      R. Hempel   The Argonne/GMD macros in FORTRAN for
                                  portable parallel programming and their
                                  implementation on the Intel iPSC/2 . . . 119--132
        Igor Ž. Milovanović and   
       Emina I. Milovanović and   
                Mile K. Stojčev   An optimal algorithm for Gaussian
                                  elimination of band matrices on an MIMD
                                  computer . . . . . . . . . . . . . . . . 133--145
                  Michael Thuné   A partitioning strategy for explicit
                                  difference methods . . . . . . . . . . . 147--154
                    István Deák   Uniform random number generators for
                                  parallel computers . . . . . . . . . . . 155--164
            Peter J. Varman and   
        Balakrishna R. Iyer and   
          Donald J. Haderle and   
                Stephen M. Dunn   Parallel merging: algorithm and
                                  implementation results . . . . . . . . . 165--177
               Sajal K. Das and   
               Narsingh Deo and   
                  Sushil Prasad   Two minimum spanning forest algorithms
                                  on fixed-size hypercube computers  . . . 179--187
            Ferng-Ching Lin and   
                Kuo Liang Chung   A cost-optimal parallel tridiagonal
                                  system solver  . . . . . . . . . . . . . 189--199
                   F. Dehne and   
             A. G. Ferreira and   
                 A. Rau-Chaplin   Parallel branch and bound on
                                  fine-grained hypercube multiprocessors   201--209
         Abdelhamid Benaini and   
                    Yves Robert   Spacetime-minimal systolic arrays for
                                  Gaussian elimination and the algebraic
                                  path problem . . . . . . . . . . . . . . 211--225
              K. Margaritis and   
                    D. J. Evans   Systolic designs for Bernoulli's method  227--240
                  Sung Kwon Kim   Parallel algorithms for planar dominance
                                  counting . . . . . . . . . . . . . . . . 241--246
                  D. Morris and   
              C. J. Theaker and   
                R. Phillips and   
                    D. G. Evans   An experimental parallel system (EPS)    247--259
        Evgenij E. Tyrtyshnikov   New approaches to deriving parallel
                                  algorithms . . . . . . . . . . . . . . . 261--265
                    Chau-Jy Lin   Parallel generation of permutations on
                                  systolic arrays  . . . . . . . . . . . . 267--276
                  S. R. Das and   
               N. H. Vaidya and   
                  L. M. Patnaik   A systolic algorithm for hidden surface
                                  removal  . . . . . . . . . . . . . . . . 277--289

Parallel Computing
Volume 16, Number 1, November, 1990

           Craig C. Douglas and   
            Willard L. Miranker   Beyond massive parallelism: numerical
                                  computation using associative tables . . 1--25
                  G. W. Stewart   Communication and matrix computations on
                                  large message passing systems  . . . . . 27--40
             Chien Min Wang and   
                  Sheng-De Wang   Structured partitioning of concurrent
                                  programs for execution on
                                  multiprocessors  . . . . . . . . . . . . 41--57
                   Feng Gao and   
           Beresford N. Parlett   A note on communication analysis of
                                  parallel sparse Cholesky factorization
                                  on a hypercube . . . . . . . . . . . . . 59--60
               Qian Ping Gu and   
                  Tadao Takaoka   A sharper analysis of a parallel
                                  algorithm for the all pairs shortest
                                  path problem . . . . . . . . . . . . . . 61--67
    Sathiamoorthy Manoharan and   
                Nigel P. Topham   A general bound on schedule length for
                                  independent tasks  . . . . . . . . . . . 69--73
                   F. Dehne and   
                    M. Gastaldo   A note on the load balancing problem for
                                  coarse grained hypercube dictionary
                                  machines . . . . . . . . . . . . . . . . 75--79
                D. J. Evans and   
                   W. S. Yousif   The implementation of the explicit block
                                  iterative methods on the Balance 8000
                                  parallel computer  . . . . . . . . . . . 81--97
              D. P. O'Leary and   
                     P. Whitman   Parallel $QR$ factorization by
                                  Householder and modified Gram--Schmidt
                                  algorithms . . . . . . . . . . . . . . . 99--112
     M. F. X. B. van Swaaij and   
          F. V. M. Catthoor and   
                   H. J. de Man   Deriving ASIC architectures for the
                                  Hough transform  . . . . . . . . . . . . 113--121

Parallel Computing
Volume 16, Number 2--3, December, 1990

           Eric F. Van de Velde   Data redistribution and concurrency  . . 125--138
                 John M. Conroy   Parallel nested dissection . . . . . . . 139--156
             Michael L. Dowling   Optimal code parallelization using
                                  unimodular transformations . . . . . . . 157--171
                 B. Veltman and   
              B. J. Lageweg and   
                  J. K. Lenstra   Multiprocessor scheduling with
                                  communication delays . . . . . . . . . . 173--182
           Jau Hsiung Huang and   
              Leonard Kleinrock   Distributed selectsort sorting
                                  algorithms on broadcast communication
                                  networks . . . . . . . . . . . . . . . . 183--190
               G. M. Megson and   
                    D. J. Evans   Systolic arrays for group explicit
                                  methods for solving first order
                                  hyperbolic equations . . . . . . . . . . 191--205
                D. J. Evans and   
                          C. Li   Successive underrelaxation (SUR) and
                                  generalised conjugate gradient (GCG)
                                  methods for hyperbolic difference
                                  equations on a parallel computer . . . . 207--220
              Stephen J. Wright   Solution of discrete-time optimal
                                  control problems on parallel computers   221--237
              M. C. Counilh and   
                       J. Roman   Expression for massively parallel
                                  algorithms --- description and
                                  illustrative example . . . . . . . . . . 239--251
               G. M. Megson and   
                    D. J. Evans   An orthogonal systolic design for the
                                  assignment problem . . . . . . . . . . . 253--267
                        N. Dodd   Slow annealing versus multiple fast
                                  annealing runs --- an empirical
                                  investigation  . . . . . . . . . . . . . 269--272
               Yen Chun Lin and   
                Ferng-Ching Lin   Parallel sorting with cooperating heaps
                                  in a linear array of processors  . . . . 273--278
                D. J. Evans and   
             M. Adamopoulos and   
                S. Kortesis and   
                     K. Tsouros   Searching sets of properties with neural
                                  networks . . . . . . . . . . . . . . . . 279--285
                   T. Samad and   
                      P. Harper   High-order Hopfield and Tank
                                  optimization networks  . . . . . . . . . 287--292
                Marc Garbey and   
                   David Levine   Massively parallel computation of
                                  conservation laws  . . . . . . . . . . . 293--304
                     K. Burrage   An adaptive numerical integration code
                                  for a chain of transputers . . . . . . . 305--312
                M. A. Baker and   
               K. C. Bowler and   
                   R. D. Kenway   MIMD implementations of linear solvers
                                  for oil reservoir simulation . . . . . . 313--334
                 A. Stewart and   
                     G. J. Shaw   A parallel multigrid FAS scheme for
                                  transputer networks  . . . . . . . . . . 335--342
                 S. J. Shyu and   
                   R. C. T. Lee   The vectorization of the partition
                                  problem  . . . . . . . . . . . . . . . . 343--350
                  Tanguy Risset   Implementing Gaussian elimination on a
                                  matrix-matrix multiplication systolic
                                  array  . . . . . . . . . . . . . . . . . 351--359
                       F. Reale   A tridiagonal solver for massively
                                  parallel computer systems  . . . . . . . 361--368
                    S. A. Levin   A fully vectorized quicksort . . . . . . 369--373
                  C. Kamath and   
                  S. Weeratunga   Implementation of two projection methods
                                  on a shared memory multiprocessor: DEC
                                  VAX 6240 . . . . . . . . . . . . . . . . 375--382

Parallel Computing
Volume 17, Number 1, April, 1991

                        M. Alef   Concepts for efficient multigrid
                                  implementation on SUPRENUM-like
                                  architectures  . . . . . . . . . . . . . 1--16
                 S. Heydorn and   
                     P. Weidner   Optimization and performance analysis of
                                  thinning algorithms on parallel
                                  computers  . . . . . . . . . . . . . . . 17--27
                   P. Senechaud   A MIMD Implementation of the Buchberger
                                  Algorithm for Boolean Polynomials  . . . 29--37
                 N. Kockler and   
                       M. Simon   Parallel singular value decomposition
                                  with cyclic storing  . . . . . . . . . . 39--47
                D. J. Evans and   
                    M. D. Levin   A matrix-squaring variant of the power
                                  method on the DAP  . . . . . . . . . . . 49--54
                  E. Bampis and   
                J. C. Konig and   
                    D. Trystram   Impact of communications on the
                                  complexity of the parallel Gaussian
                                  Elimination  . . . . . . . . . . . . . . 55--61
               S. Manoharan and   
                    P. Thanisch   Assigning dependency graphs onto
                                  processor networks . . . . . . . . . . . 63--73
                 C.-J. Wang and   
                   V. P. Nelson   Petri net performance modeling of a
                                  modified mesh-connected parallel
                                  computer . . . . . . . . . . . . . . . . 75--84
                    A. Torralba   A systolic array with applications to
                                  image processing and wire-routing in
                                  VLSI circuits  . . . . . . . . . . . . . 85--93
                     W. Dzwinel   The search for an optimal multiprocessor
                                  interconnection network  . . . . . . . . 95--100
                   M. Wheat and   
                    D. J. Evans   Maintenance of shared data structures on
                                  tightly coupled multiprocessors  . . . . 101--107
                      M. Simmen   Comments on broadcast algorithms for
                                  two-dimensional grids  . . . . . . . . . 109--112

Parallel Computing
Volume 17, Number 2--3, June, 1991

            Roland A. Sweet and   
          William L. Briggs and   
             Suely Oliveira and   
           Jules L. Porsche and   
                   Tom Turnbull   FFTs and three-dimensional Poisson
                                  solvers for hypercubes . . . . . . . . . 121--131
           Marcin Paprzycki and   
                   Ian Gladwell   Solving almost block diagonal systems on
                                  parallel computers . . . . . . . . . . . 133--153
                 P. Tervola and   
                       W. Yeung   Parallel Jacobi algorithm for matrix
                                  diagonalisation on transputer networks   155--163
                D. J. Evans and   
                     Wang Deren   An asynchronous parallel algorithm for
                                  solving a class of nonlinear
                                  simultaneous equations . . . . . . . . . 165--180
               S. M. Muller and   
                    D. Scheerer   A method to parallelize tridiagonal
                                  solvers  . . . . . . . . . . . . . . . . 181--188
                F. A. Rabhi and   
                   G. A. Manson   Divide-and-conquer and parallel graph
                                  reduction  . . . . . . . . . . . . . . . 189--205
                H. Schröder and   
                   P. Strazdins   Program compression on the instruction
                                  systolic array . . . . . . . . . . . . . 207--219
           Chang-Sung Jeong and   
                   Myung-Ho Kim   Fast parallel simulated annealing for
                                  traveling salesman problem on SIMD
                                  machines with linear interconnections    221--228
               Pao-Hsu Shih and   
                  Wu-Shung Feng   An application of neural networks on
                                  channel routing problem  . . . . . . . . 229--240
               Chang-Sung Jeong   Parallel Voronoĭ diagram in
                                  ${L}_1({L}_\infty)$ metric on a
                                  mesh-connected computer  . . . . . . . . 241--252
    L. Bacchelli Montefusco and   
                    C. Guerrini   A domain decomposition method for
                                  scattered data approximation on a
                                  distributed memory multiprocessor  . . . 253--263
                     Hong Zhang   On the accuracy of the parallel diagonal
                                  dominant algorithm . . . . . . . . . . . 265--272
                H. Schröder and   
            E. V. Krishnamurthy   Systolic computation of characteristic
                                  polynomials of Hessenberg matrices . . . 273--277
              Gen Huey Chen and   
                Maw Sheng Chern   Synthesis of algorithms on processor
                                  arrays . . . . . . . . . . . . . . . . . 279--284
          R. J. van der Pas and   
                 J. M. van Kats   Parallelism in a multi-user environment  285--296
                  N. Honjou and   
                 K. Ohtsuki and   
                  M. Sekiya and   
                      F. Sasaki   A parallelization technique for the
                                  speedup of configuration interaction
                                  computing  . . . . . . . . . . . . . . . 297--310
                J.-Fr. Hake and   
                     W. Homberg   The impact of memory organization on the
                                  performance of matrix calculations . . . 311--327
                    H. Schwandt   Memory access problems in block cyclic
                                  reduction on vector computers  . . . . . 329--346
                       M. Kiehl   A vector implementation of an ODE code
                                  for multi-point-boundary-value problems  347--352

Parallel Computing
Volume 17, Number 4--5, July, 1991

              T. Tollenaere and   
                    G. A. Orban   Simulating modular neural networks on
                                  message-passing multiprocessors  . . . . 361--379
                      Xiaobo Li   Nearest neighbor classification on two
                                  types of SIMD machines . . . . . . . . . 381--407
                    Ilan Bar-On   Efficient logarithmic time parallel
                                  algorithms for the Cholesky
                                  decomposition and Gram--Schmidt process  409--417
                     S. Bondeli   Divide and conquer: a parallel algorithm
                                  for the solution of a tridiagonal linear
                                  system of equations  . . . . . . . . . . 419--434
               Fridrich Sloboda   A projection method of the Cimmino type
                                  for linear algebraic systems . . . . . . 435--442
                    E. Taillard   Robust taboo search for the quadratic
                                  assignment problem . . . . . . . . . . . 443--455
                   Yen-Chun Lin   An FP-based tool for the synthesis of
                                  regular array algorithms . . . . . . . . 457--470
                 Z. Mahjoub and   
              F. Karoui-Sahtout   Parallel algorithms for redundant
                                  precedence relations elimination in task
                                  systems  . . . . . . . . . . . . . . . . 471--481
        E. V. Krishnamurthy and   
                    H. Schröder   Systolic algorithm for multivariable
                                  approximation using tensor products of
                                  basis functions  . . . . . . . . . . . . 483--492
                H. Schröder and   
               V. K. Murthy and   
            E. V. Krishnamurthy   Systolic algorithm for polynomial
                                  interpolation and related problems . . . 493--503
               Chang-Sung Jeong   An improved parallel algorithm for
                                  constructing Voronoĭ diagram on a
                                  mesh-connected computer  . . . . . . . . 505--514
                   Yen-Chun Lin   Array size anomaly of problem-size
                                  independent systolic arrays for
                                  matrix-vector multiplication . . . . . . 515--522
                  S. Storoy and   
                     T. Sorevik   A note on an orthogonal systolic design
                                  for the assignment problem . . . . . . . 523--525
               Sajal K. Das and   
                  Cui-Qing Yang   Performance of parallel spanning tree
                                  algorithms on linear arrays of
                                  transputers and Unix systems . . . . . . 527--551
                        G. Pini   A parallel algorithm for the partial
                                  eigensolution of sparse symmetric
                                  matrices on the CRAY Y-MP  . . . . . . . 553--561
                 I. Gohberg and   
               I. Koltracht and   
                A. Averbuch and   
                      B. Shoham   Timing analysis of a parallel algorithm
                                  for Toeplitz matrices on a MIMD parallel
                                  machine  . . . . . . . . . . . . . . . . 563--577
                  U. Detert and   
                    G. Hofemann   CRAY X-MP and Y-MP memory performance    579--590
                M. D. Levin and   
                    D. J. Evans   The inversion of matrices by the
                                  double-bordering algorithm on MIMD
                                  computers  . . . . . . . . . . . . . . . 591--602

Parallel Computing
Volume 17, Number 6--7, September, 1991

       Paul N. Swarztrauber and   
            Roland A. Sweet and   
          William L. Briggs and   
           Van Emden Henson and   
                     James Otto   Bluestein's FFT for arbitrary ${N}$ on
                                  the hypercube  . . . . . . . . . . . . . 607--617
              H. Mühlenbein and   
               M. Schomisch and   
                        J. Born   The parallel genetic algorithm as
                                  function optimizer . . . . . . . . . . . 619--632
            V. V. R. Prasad and   
             C. Siva Ram Murthy   Downloading node programs/data into
                                  hypercubes . . . . . . . . . . . . . . . 633--642
 Constantine N. K. Osiakwan and   
                   Selim G. Akl   Parallel computation of matchings in
                                  trees  . . . . . . . . . . . . . . . . . 643--656
              Manfred Schimmler   Parallel strong orientation on a mesh
                                  connected computer . . . . . . . . . . . 657--664
                  Michael Thuné   Straightforward partitioning of
                                  composite grids for explicit difference
                                  methods  . . . . . . . . . . . . . . . . 665--672
              T. L. Freeman and   
                     M. K. Bane   Asynchronous polynomial zero-finding
                                  algorithms . . . . . . . . . . . . . . . 673--681
             Stephan Olariu and   
               Zhaofang Wen and   
                Wei Xiong Zhang   A faster optimal algorithm for the
                                  measure problem  . . . . . . . . . . . . 683--687
                  S. Olariu and   
                         Z. Wen   An efficient parallel algorithm for
                                  multiselection . . . . . . . . . . . . . 689--693
                     D. Fischer   On superlinear speedups  . . . . . . . . 695--697
                    J. Hagemann   Combinatorial structures for
                                  multiprocessor-systems . . . . . . . . . 699--706
            D. P. Bertsekas and   
                 D. A. Castanon   Parallel synchronous and asynchronous
                                  implementations of the auction algorithm 707--732
               D. Moncrieff and   
             V. R. Saunders and   
                      S. Wilson   Parallel processing using macro-tasking
                                  in a multi-job environment on a CRAY
                                  Y-MP computer  . . . . . . . . . . . . . 733--750
                    C. Phillips   The performance of the BLAS and LAPACK
                                  on a shared memory scalar multiprocessor 751--761
                  S. K. Kim and   
             A. T. Chronopoulos   A class of Lanczos-like algorithms
                                  implemented on parallel computers  . . . 763--778
                      K. Wright   Parallel algorithms for $QR$
                                  decomposition on a shared memory
                                  multiprocessor . . . . . . . . . . . . . 779--790
                 F. Wiegand and   
                    B. S. Hoyle   Development and implementation of
                                  real-time ultrasound process tomography
                                  using a transputer network . . . . . . . 791--807
                  A. Corana and   
              A. Casaleggio and   
                 C. Rolando and   
                     S. Ridella   Efficient computation of the correlation
                                  dimension from a time series on a LIW
                                  computer . . . . . . . . . . . . . . . . 809--820
                   C.-H. Wu and   
               R. E. Hodges and   
                     C. J. Wang   Parallelizing the self-organizing
                                  feature map on multiprocessor systems    821--832
                D. J. Evans and   
                   S. Chikohora   The alternating group explicit (AGE)
                                  method on a transputer network . . . . . 833--843

Parallel Computing
Volume 17, Number 8, October, 1991

                  V. Topkar and   
                 O. Frieder and   
                     A. K. Sood   Duplicate removal on hypercube engines:
                                  an experimental analysis . . . . . . . . 845--871
             E. D. Chajakis and   
                   S. A. Zenios   Synchronous and asynchronous
                                  implementations of relaxation algorithms
                                  for nonlinear network optimization . . . 873--894
                   Y. Huang and   
                       Y. Paker   A parallel FFT algorithm for transputer
                                  networks . . . . . . . . . . . . . . . . 895--906
              E. Francomano and   
               A. Pecorella and   
          A. Tortorici Macaluso   Parallel experience on the inverse
                                  matrix computation . . . . . . . . . . . 907--912
                        H. Park   A parallel algorithm for the unbalanced
                                  orthogonal Procrustes problem  . . . . . 913--923
                    D. J. Evans   The parallel AGE method for the elliptic
                                  problem in two dimensions  . . . . . . . 925--940
                     Y.-H. Choi   Reconfigurable VLSI/WSI multipipelines   941--952

Parallel Computing
Volume 17, Number 9, November, 1991

              D. Hutchinson and   
                B. M. S. Khalaf   Parallel algorithms for solving initial
                                  value problems: front broadening and
                                  embedded parallelism . . . . . . . . . . 957--968
               A. De Gloria and   
                  P. Faraboschi   A Boltzmann Machine approach to code
                                  optimization . . . . . . . . . . . . . . 969--982
             Wen Tsuen Chen and   
                   Ming Yi Fang   An efficient procedure for theorem
                                  proving in propositional logic on vector
                                  computers  . . . . . . . . . . . . . . . 983--995
                   S. Horiguchi   Hybrid systolic sorters  . . . . . . . . 997--1007
              S. Selvakumar and   
             C. Siva Ram Murthy   An efficient algorithm for mapping VLSI
                                  circuit simulation programs onto
                                  multiprocessors  . . . . . . . . . . . . 1009--1016
                    L. Brugnano   A parallel solver for tridiagonal linear
                                  systems for distributed memory parallel
                                  computers  . . . . . . . . . . . . . . . 1017--1023
             V. R. Saunders and   
                      S. Wilson   ``Scavenger'' programming for the CRAY
                                  X-MP computer (Short communication)  . . 1025--1034
                   M. Wheat and   
                    D. J. Evans   Asynchronous parallel merging  . . . . . 1035--1041
               L. C. Waring and   
                       M. Clint   Parallel Gram--Schmidt orthogonalisation
                                  on a network of transputers  . . . . . . 1043--1050
                   J. Erhel and   
                A. Traynard and   
                    M. Vidrascu   An element-by-element preconditioned
                                  conjugate gradient method implemented on
                                  a vector computer  . . . . . . . . . . . 1051--1065

Parallel Computing
Volume 17, Number 10--11, December, 1991

                     J. Worlton   Toward a taxonomy of performance metrics 1073--1092
                Xian-He Sun and   
                J. L. Gustafson   Toward a better parallel performance
                                  metric . . . . . . . . . . . . . . . . . 1093--1109
                     R. Hockney   Performance parameters and benchmarking
                                  of supercomputers  . . . . . . . . . . . 1111--1130
               W. Schonauer and   
                      H. Hafner   Performance estimates for
                                  supercomputers: the responsibilities of
                                  the manufacturer and of the user . . . . 1131--1149
                  R. P. Weicker   A detailed look at some popular
                                  benchmarks . . . . . . . . . . . . . . . 1153--1172
                   M. Berry and   
                 G. Cybenko and   
                      J. Larson   Scientific benchmark characterizations   1173--1194
                    K. M. Dixit   The SPEC benchmarks  . . . . . . . . . . 1195--1209
            A. J. van der Steen   The benchmark of the EuroBen group . . . 1211--1221
                  D. Levine and   
                D. Callahan and   
                    J. Dongarra   A comparative study of automatic
                                  vectorizing compilers  . . . . . . . . . 1223--1244
                J. Dongarra and   
                 M. Furtney and   
               S. Reinhardt and   
                     J. Russell   Parallel loops --- a test suite for
                                  parallelizing compilers: description and
                                  example results  . . . . . . . . . . . . 1247--1255
                   C. M. Grassl   Parallel performance of applications on
                                  supercomputers . . . . . . . . . . . . . 1257--1273
                   A. J. G. Hey   The Genesis distributed memory
                                  benchmarks . . . . . . . . . . . . . . . 1275--1283
                  T. H. Dunigan   Performance of the Intel iPSC/860 and
                                  Ncube 6400 hypercubes  . . . . . . . . . 1285--1302
                W. E. Nagel and   
                     M. A. Linn   Benchmarking parallel programs in a
                                  multiprogramming environment: the
                                  PAR-Bench system . . . . . . . . . . . . 1303--1321

Parallel Computing
Volume 17, Number 12, December, 1991

                S. Arvindam and   
                   V. Kumar and   
          V. Nageshwara Rao and   
                       V. Singh   Automatic test pattern generation on
                                  parallel processors  . . . . . . . . . . 1323--1342
             Jenn Yang Tien and   
                  Wei Pang Yang   Hierarchical spanning trees and
                                  distributing on incomplete hypercubes    1343--1360
         Dieter Müller-Wichards   Problem size scaling in the presence of
                                  parallel overhead  . . . . . . . . . . . 1361--1376
                D. G. Feitelson   Deadlock detection without wait-for
                                  graphs . . . . . . . . . . . . . . . . . 1377--1383
             A. Chakraborty and   
           D. C. S. Allison and   
              C. J. Ribbens and   
                   L. T. Watson   Note on unit tangent vector computation
                                  for homotopy curve tracking on a
                                  hypercube  . . . . . . . . . . . . . . . 1385--1395
                   G. Bader and   
                      E. Gehrke   On the performance of transputer
                                  networks for solving linear systems of
                                  equations  . . . . . . . . . . . . . . . 1397--1407
                      A. Peters   Sparse matrix vector multiplication
                                  techniques on the IBM 3090 VF  . . . . . 1409--1424
                  Y. Escaig and   
                         W. Oed   Analysis tools for Micro- and
                                  Autotasking programs on CRAY
                                  multiprocessor systems . . . . . . . . . 1425--1433

Parallel Computing
Volume 18, Number 1, January, 1992

                     E. Chu and   
                      A. George   A balanced submatrix merging algorithm
                                  for multiprocessor architectures . . . . 1--10
                   G. Lotti and   
                   M. Vajtersic   The application of VLSI Poisson solvers
                                  to the biharmonic problem  . . . . . . . 11--19
                  G. Horton and   
                     R. Knirsch   A time-parallel multigrid-extrapolation
                                  method for parabolic partial
                                  differential equations . . . . . . . . . 21--29
                D. Conforti and   
             L. Grandinetti and   
                R. Musmanno and   
               M. Cannataro and   
                G. Spezzano and   
                       D. Talia   A model of efficient asynchronous
                                  parallel algorithms on multicomputer
                                  systems  . . . . . . . . . . . . . . . . 31--45
                 C. Neusius and   
               J. Olszewski and   
                    D. Scheerer   An efficient distributed thinning
                                  algorithm  . . . . . . . . . . . . . . . 47--55
               A. De Gloria and   
              P. Faraboschi and   
                     S. Ridella   A dedicated massively parallel
                                  architecture for the Boltzmann machine   57--73
               V. K. Murthy and   
        E. V. Krishnamurthy and   
                       Pin Chen   Systolic algorithm for rational
                                  interpolation and Padé approximation . . 75--83
            Anatol G. Filin and   
             Michael A. Frumkin   A systolic array for inversion of a
                                  finite Radon transform . . . . . . . . . 85--90
                   M. Wheat and   
                    D. J. Evans   An efficient parallel sorting algorithm
                                  for shared memory multiprocessors  . . . 91--102
     El-Sayed M. El-Horbaty and   
           A. El-Din H. Mohamed   A synchronous algorithm for shortest
                                  paths on a tree machine  . . . . . . . . 103--107
                  W. Erhard and   
                       A. Grefe   Improved parallel algorithms for the
                                  classification of electroencephalograms
                                  (EEGs) on the DAP510 . . . . . . . . . . 109--115

Parallel Computing
Volume 18, Number 2, February, 1992

           R. von Hanxleden and   
                    L. R. Scott   Correctness and determinism of Parallel
                                  Monte Carlo Processes  . . . . . . . . . 121--132
             Tzung-Pei Hong and   
             Shian-Shyong Tseng   Parallel perceptron learning on a
                                  single-channel broadcast communication
                                  model  . . . . . . . . . . . . . . . . . 133--148
                   D. Audet and   
                 Y. Savaria and   
                    J.-L. Houle   Performance improvements to VLSI
                                  parallel systems, using dynamic
                                  concatenation of processing resources    149--167
                   M. Marrakchi   Optimal parallel scheduling for the
                                  $2$-steps graph with constant task cost  169--176
                      Hong Shen   Improved universal $k$-selection in
                                  hypercubes . . . . . . . . . . . . . . . 177--184
                 Ph. Clauss and   
                C. Mongenet and   
                   G. R. Perrin   Synthesis of size-optimal toroidal
                                  arrays for the Algebraic Path Problem: a
                                  new contribution . . . . . . . . . . . . 185--194
                    D. J. Evans   A systolic array design for matrix
                                  system solution by the symmetric
                                  bordering method . . . . . . . . . . . . 195--205
              T. Z. Kalamboukis   A parallel algorithm for the dense
                                  symmetric eigenvalue problem on a
                                  transputer array . . . . . . . . . . . . 207--212
         Przemysław Stpiczyński   Parallel Cholesky factorization on
                                  orthogonal multiprocessors . . . . . . . 213--219
           Chang-Sung Jeong and   
               Jung-Ju Choi and   
                   Der Tsai Lee   Parallel enclosing rectangle on SIMD
                                  machines . . . . . . . . . . . . . . . . 221--229
                S. Kohlhoff and   
                       J. Krone   Performance evaluation of SUPRENUM for
                                  the LINPACK benchmark (Short
                                  communication) . . . . . . . . . . . . . 231--238

Parallel Computing
Volume 18, Number 3, March, 1992

                R. Hiromoto and   
               B. R. Wienke and   
                 R. G. Brickner   The performance of asynchronous
                                  iteration schemes applied to the
                                  linearized Boltzmann transport equation  241--268
                    A. Schuller   Parallelizing particle simulations based
                                  on the Boltzmann equation  . . . . . . . 269--279
            J. Andrew Holey and   
                Oscar H. Ibarra   Iterative algorithms for the planar
                                  convex hull problem on mesh-connected
                                  arrays . . . . . . . . . . . . . . . . . 281--296
                   B. Robic and   
                P. Kolbezen and   
                        J. Silc   Area optimization of dataflow-graph
                                  mappings . . . . . . . . . . . . . . . . 297--311
                P. Casiccia and   
               P. Castangia and   
                S. Cincotti and   
                      G. Parodi   Simulation of a molecular cellular array
                                  on a transputer-based parallel computer  313--324
           K. G. Margaritis and   
                    D. J. Evans   Systolic implementation of neural
                                  networks for searching sets of
                                  properties . . . . . . . . . . . . . . . 325--334
                   W. Loots and   
                 T. H. C. Smith   A parallel three phase sorting procedure
                                  for a $k$-dimensional hypercube and a
                                  transputer implementation  . . . . . . . 335--344
                 Eric Goles and   
                    Marcos Kiwi   A lower bound on the computational
                                  complexity of the $QR$ decomposition on
                                  a shared memory SIMD computer  . . . . . 345--354
               G. M. Megson and   
                    D. J. Evans   More on systolic line drawing  . . . . . 355--358

Parallel Computing
Volume 18, Number 4, April, 1992

                   P. H. Worley   The effect of multiprocessor radius on
                                  scaling  . . . . . . . . . . . . . . . . 361--376
                 Su Chu Hsu and   
            Hsien Fen Hsieh and   
              Shing Tsaan Huang   A fully-pipelined systolic algorithm for
                                  finding bridges on an undirected
                                  connected graph  . . . . . . . . . . . . 377--391
            Hong Chich Chou and   
               Chung Ping Chung   A bound analysis of scheduling
                                  instructions on pipelined processors
                                  with a maximal delay of one cycle  . . . 393--399
               I. Mahadevan and   
                  L. M. Patnaik   Performance evaluation of bidirectional
                                  associative memory on a transputer-based
                                  parallel system  . . . . . . . . . . . . 401--413
               G. M. Megson and   
                 O. Brudaru and   
                      D. Comish   Systolic designs for Aitken's root
                                  finding method . . . . . . . . . . . . . 415--429
           Pl. Iv. Piskoulijski   Error analysis of parallel algorithm for
                                  the solution of a tridiagonal Toeplitz
                                  linear system of equations . . . . . . . 431--438
              Gen-Huey Chen and   
                  Wei-Wen Liang   Conflict-free broadcasting algorithms
                                  for graph traversals and their
                                  applications . . . . . . . . . . . . . . 439--448
             C. P. Thompson and   
               W. R. Cowell and   
                     G. K. Leaf   On the parallelization of an adaptive
                                  multigrid algorithm for a class of flow
                                  problems . . . . . . . . . . . . . . . . 449--466
                 H. C. Burg and   
                       J. Helin   1991 International Conference on
                                  Supercomputing . . . . . . . . . . . . . 467
                 H.-C. Hege and   
                      R. Knecht   Parallel Computing 91  . . . . . . . . . 473

Parallel Computing
Volume 18, Number 5, May, 1992

                  Y. Robert and   
                     S. W. Song   Revisiting cycle shrinking . . . . . . . 481--496
            Yuh-Horng Shiau and   
               Chung-Ping Chung   Adoptability and effectiveness of
                                  microcode compaction algorithms in
                                  superscalar processing . . . . . . . . . 497--510
                     R. Lin and   
                      S. Olariu   A fast cost-optimal parallel algorithm
                                  for the lowest common ancestor problem   511--516
             E. D. Adamides and   
               Ph. Tsalides and   
                 A. Thanailakis   Hierarchical Cellular Automata
                                  structures . . . . . . . . . . . . . . . 517--524
                D. J. Evans and   
                       M. Gusev   Implementation of folding
                                  transformations on linear VLSI processor
                                  arrays . . . . . . . . . . . . . . . . . 525--542
              R. S. Francis and   
                L. J. H. Pannan   A parallel partition for enhanced
                                  parallel QuickSort . . . . . . . . . . . 543--550
               F. Suraweera and   
                P. Bhattacharya   A parallel cost-optimal algorithm to
                                  compute the supremum of max-min powers   551--556
               H. Schreiber and   
             O. Steinhauser and   
                    P. Schuster   Parallel molecular dynamics of
                                  biomolecules . . . . . . . . . . . . . . 557--573
                  T. Dontje and   
                Th. Lippert and   
                  N. Petkov and   
                   K. Schilling   Statistical analysis of
                                  simulation-generated time series:
                                  Systolic vs. semi-systolic correlation
                                  on the Connection Machine  . . . . . . . 575--588

Parallel Computing
Volume 18, Number 6, June, 1992

              Ajay K. Gupta and   
           Susanne E. Hambrusch   Load balanced tree embeddings  . . . . . 595--614
                  Y. P. Boglaev   Exact dynamic load balancing of MIMD
                                  architectures with linear programming
                                  algorithms . . . . . . . . . . . . . . . 615--623
             Chien-Min Wang and   
                  Sheng-De Wang   A hybrid scheme for efficiently
                                  executing nested loops on
                                  multiprocessors  . . . . . . . . . . . . 625--637
              J.-C. Bermond and   
               P. Michallon and   
                    D. Trystram   Broadcasting in wraparound meshes with
                                  parallel monodirectional links . . . . . 639--648
             Ömer Eğecioğlu and   
                   Çetin K. Koç   A parallel algorithm for generating
                                  discrete orthogonal polynomials  . . . . 649--659
            B. M. S. Khalaf and   
                  D. Hutchinson   Parallel algorithms for initial value
                                  problems: parallel shooting  . . . . . . 661--673
                J. Andersen and   
                   G. Mitra and   
                   D. Parkinson   The scheduling of sparse matrix-vector
                                  multiplication on a massively parallel
                                  DAP computer . . . . . . . . . . . . . . 675--697
                  J. M. D. Hill   Parallel lexical analysis and parsing on
                                  the AMT distributed array processor  . . 699--714

Parallel Computing
Volume 18, Number 7, July, 1992

                E. Rothberg and   
                       A. Gupta   Parallel ICCG on a hierarchical memory
                                  multiprocessor --- Addressing the
                                  triangular solve bottleneck  . . . . . . 719--741
                  T. Takeda and   
                    K. Tani and   
              T. Tsunematsu and   
               Y. Kishimoto and   
               G. I. Kurita and   
              S. Matsushita and   
                      T. Nakata   Plasma simulator METIS for tokamak
                                  confinement and heating studies  . . . . 743--765
                   L. Lopez and   
                      T. Politi   Parallel methods in the numerical
                                  treatment of population dynamic models   767--777
                  Jianjian Song   A distributed-termination experiment on
                                  a mesh-connected array of processors . . 779--791
                  D. Morris and   
                    D. G. Evans   Modelling distributed and parallel
                                  computer systems . . . . . . . . . . . . 793--806
                 Laurence Boxer   Finding congruent regions in parallel    807--810
              Gen Huey Chen and   
                 Jin Hwang Jang   An improved parallel algorithm for $0/1$
                                  knapsack problem . . . . . . . . . . . . 811--821
             Yung Chen Hung and   
                  Gen Huey Chen   Distributed algorithms for the quickest
                                  path problem . . . . . . . . . . . . . . 823--834

Parallel Computing
Volume 18, Number 8, August, 1992

             Srinivas Aluru and   
               G. M. Prabhu and   
                 John Gustafson   A random number generator for parallel
                                  computers  . . . . . . . . . . . . . . . 839--847
        P. Sreenivasa Kumar and   
           M. Kishore Kumar and   
                        A. Basu   A parallel algorithm for elimination
                                  tree computation and symbolic
                                  factorization  . . . . . . . . . . . . . 849--856
                   M. Gusev and   
                       J. Tasic   Comparative analysis of methods for
                                  broadcast elimination  . . . . . . . . . 857--866
                       M. Thune   The partitioning problem for a class of
                                  data parallel algorithms . . . . . . . . 867--878
              M. P. Bekakos and   
                    D. J. Evans   The double alternating group explicit
                                  method for nonlinear parabolic equations
                                  on MIMD parallel computers . . . . . . . 879--895
                J. Zerovnik and   
                     M. Kaufman   A parallel variant of a heuristical
                                  algorithm for graph coloring ---
                                  Corrigendum (Short communication)  . . . 897--900
                 K. Okamoto and   
                  Y. Kodama and   
                   S. Sakai and   
                   Y. Yamaguchi   Methodologies in development and testing
                                  of the dataflow machine EM-4 . . . . . . 901--912
                 K. R. Tout and   
                    D. J. Evans   Parallel forward chaining technique with
                                  dynamic scheduling, for rule-based
                                  expert systems . . . . . . . . . . . . . 913--930
                       R. Butel   A Cray-2 versus CM-2 comparison using
                                  several polynomial benchmarks  . . . . . 931--945
                         W. Oed   Cray Y-MP C90: System features and early
                                  benchmark results (Short communication)  947--954

Parallel Computing
Volume 18, Number 9, September, 1992

                   S. Stark and   
                    A. N. Beris   LU decomposition optimized for a
                                  parallel computer with a hierarchical
                                  distributed memory . . . . . . . . . . . 959--971
           Jack J. Dongarra and   
         Robert A. van de Geijn   Reduction to condensed form for the
                                  eigenvalue problem on distributed memory
                                  architectures  . . . . . . . . . . . . . 973--982
                  Y. P. Chu and   
                    C. M. Hsieh   An artificial neural network model with
                                  modified perceptron algorithm  . . . . . 983--996
                   M. Gusev and   
                    D. J. Evans   VLSI processor array IPS cells (Short
                                  communication) . . . . . . . . . . . . . 997--1007
                   G. Zhang and   
                    H. C. Elman   Parallel sparse Cholesky factorization
                                  on a shared memory multiprocessor  . . . 1009--1022
                 M. Bentley and   
              C. Froese Fischer   Hypercube conversion of serial codes for
                                  atomic structure calculations  . . . . . 1023--1031
              S. S. Nielsen and   
                   S. A. Zenios   Data structures for network algorithms
                                  on massively parallel architectures  . . 1033--1052
                   J. Tasic and   
                   M. Gusev and   
                    D. J. Evans   Systolic implementation of
                                  preconditioned conjugate gradient method
                                  in adaptive transversal filters  . . . . 1053--1065
              R. W. Hockney and   
                  E. A. Carmona   Comparison of communications on the
                                  Intel iPSC/860 and Touchstone Delta
                                  (Short communication)  . . . . . . . . . 1067--1072
                     H. Strauss   Parallel CFD'92  . . . . . . . . . . . . 1073

Parallel Computing
Volume 18, Number 10, October, 1992

             M. V. A. Hancu and   
                 K. Iwasaki and   
                    Y. Sato and   
                       M. Sugie   Experimental results on the error
                                  detection capability of a concurrent
                                  test architecture for massively-parallel
                                  computers  . . . . . . . . . . . . . . . 1079--1103
                   Peter Arbenz   Divide and conquer algorithms for the
                                  bandsymmetric eigenvalue problem . . . . 1105--1128
               A. Basermann and   
                     P. Weidner   A parallel algorithm for determining all
                                  eigenvalues of large real symmetric
                                  tridiagonal matrices . . . . . . . . . . 1129--1141
              Lih-Hsing Hsu and   
              Peng Fei Wang and   
                     Chu Tao Wu   Parallel algorithms for finding the most
                                  vital edge with respect to minimum
                                  spanning tree  . . . . . . . . . . . . . 1143--1155
            T. Chockalingam and   
                   S. Arunkumar   A randomized heuristics for the mapping
                                  problem: The genetic approach  . . . . . 1157--1165
                 E. Violard and   
                   G.-R. Perrin   PEI: a language and its refinement
                                  calculus for parallel programming  . . . 1167--1184
                     Y.-H. Choi   An easily-diagnosable fault-tolerant
                                  binary tree architecture (Short
                                  communication) . . . . . . . . . . . . . 1185--1195

Parallel Computing
Volume 18, Number 11, November, 1992

             S. L. Johnsson and   
                  R. L. Krawitz   Cooley--Tukey FFT on the Connection
                                  Machine  . . . . . . . . . . . . . . . . 1201--1221
                  M. Zubair and   
                S. N. Gupta and   
                   C. E. Grosch   A variable precision approach to speedup
                                  iterative schemes on fine grained
                                  parallel machines (Short communication)  1223--1231
    Emmanouel A. Varvarigos and   
           Dimitri P. Bertsekas   Communication algorithms for isotropic
                                  tasks in hypercubes and wraparound
                                  meshes . . . . . . . . . . . . . . . . . 1233--1257
          Roman G. Strongin and   
           Yaroslav D. Sergeyev   Global multidimensional optimization on
                                  parallel computer  . . . . . . . . . . . 1259--1273
                I. Vlahavas and   
                     P. Kefalas   A parallel Prolog resolution based on
                                  multiple unifications  . . . . . . . . . 1275--1283

Parallel Computing
Volume 18, Number 12, December, 1992

                   F. J. Peters   Preface  . . . . . . . . . . . . . . . . 1289
                 T. Lippert and   
               K. Schilling and   
                      N. Petkov   Quark propagator on the Connection
                                  Machine  . . . . . . . . . . . . . . . . 1291--1299
                      Mi Lu and   
                 Xiangzhen Qiao   Applying parallel computer systems to
                                  solve symmetric tridiagonal eigenvalue
                                  problems . . . . . . . . . . . . . . . . 1301--1315
               E. M. Daoudi and   
                       J. Lobry   Implementation of a boundary element
                                  method on distributed memory computers   1317--1324
                   M. Clint and   
               J. S. Weston and   
                 C. W. Bleakney   A comparison of two Fortran dialects for
                                  expressing parallel solutions for a
                                  problem in linear algebra  . . . . . . . 1325--1333
                    B. Khan and   
                   L. Hayes and   
                A. P. Cracknell   The optimisation of higher order
                                  resampling methods in a multiprocessor
                                  environment  . . . . . . . . . . . . . . 1335--1347
                    P. Spee and   
                 W. F. Wong and   
                    M. Sato and   
                        E. Goto   Evaluation of the continuation bit in
                                  the Cyclic Pipeline Computer . . . . . . 1349--1361
                   D. Sharp and   
                  M. Cripps and   
                  J. Darlington   Parallel-architecture-directed program
                                  transformation . . . . . . . . . . . . . 1363--1380
                   D. K. Arvind   On the detection of
                                  communication-related errors in
                                  concurrent programs  . . . . . . . . . . 1381--1392
                 C. Ribeiro and   
                      D. El Baz   A parallel optimal routing algorithm . . 1393--1402
            A. W. G. Duller and   
                      R. Storer   Simulation and verification of
                                  associative processor arrays . . . . . . 1403--1414
                   B. Quatember   Concept of a crossbar switch for
                                  large-scale multiple processor systems
                                  in the field of process control  . . . . 1415--1431

Parallel Computing
Volume 19, Number 1, January, 1993

                 S. Foresti and   
             S. Hassanzadeh and   
                H. Murakami and   
                      V. Sonnad   Parallel rapid operator for iterative
                                  finite element solvers on a shared
                                  memory machine . . . . . . . . . . . . . 1--7
                 P. Edmonds and   
                     E. Chu and   
                      A. George   Dynamic programming on a shared-memory
                                  multiprocessor . . . . . . . . . . . . . 9--22
                G. Lonsdale and   
                    A. Schuller   Multigrid efficiency for complex flow
                                  simulations on distributed memory
                                  machines . . . . . . . . . . . . . . . . 23--32
                  H. Barada and   
                    A. El-Amawy   A methodology for algorithm
                                  regularization and mapping into
                                  time-optimal VLSI arrays . . . . . . . . 33--61
                N. Funabiki and   
                    Y. Takefuji   A parallel multi-layer channel router on
                                  the HVH model  . . . . . . . . . . . . . 63--77
                D. J. Evans and   
                      C. R. Wan   Parallel direct solution for $P$-cyclic
                                  matrix systems . . . . . . . . . . . . . 79--93
                  S. G. Akl and   
                         Ke Qiu   A novel routing scheme on the star and
                                  pancake networks and its applications    95--101
             J. Struckmeier and   
                 F. J. Pfreundt   On the efficiency of simulation methods
                                  for the Boltzmann equation on parallel
                                  computers  . . . . . . . . . . . . . . . 103--119

Parallel Computing
Volume 19, Number 2, February, 1993

                   S. Sakai and   
                  Y. Kodama and   
                   Y. Yamaguchi   Design and implementation of a circular
                                  omega network in the EM-4  . . . . . . . 125--142
                  P. S. Laursen   Simple approaches to parallel Branch and
                                  Bound  . . . . . . . . . . . . . . . . . 143--152
                          E. Ng   Supernodal symbolic Cholesky
                                  factorization on a local-memory
                                  multiprocessor . . . . . . . . . . . . . 153--162
               A. De Gloria and   
              P. Faraboschi and   
                    M. Olivieri   Clustered Boltzmann Machines: Massively
                                  parallel architectures for constrained
                                  optimization problems  . . . . . . . . . 163--175
              G. P. Balboni and   
               G. P. Cabodi and   
                     S. Gai and   
                M. Sonza Reorda   A parallel system for test pattern
                                  generation . . . . . . . . . . . . . . . 177--185
        P. Sreenivasa Kumar and   
                M. K. Kumar and   
                        A. Basu   Parallel algorithms for sparse
                                  triangular system solution . . . . . . . 187--196
           M. Y. Mohd-Saman and   
                    D. J. Evans   Investigation of a set of Bernstein
                                  Tests for the detection of loop
                                  parallelization  . . . . . . . . . . . . 197--207
                      G. Horton   A multi-level diffusion method for
                                  dynamic load balancing . . . . . . . . . 209--218
                    Yi-Bing Lin   Parallel trace-driven simulation for
                                  packet loss in finite-buffered voice
                                  multiplexers . . . . . . . . . . . . . . 219--228
             Stephan Olariu and   
           James L. Schwing and   
                 Jingyuan Zhang   Applications of reconfigurable meshes to
                                  constant-time computations . . . . . . . 229--237

Parallel Computing
Volume 19, Number 3, March, 1993

                     E. Chu and   
                  A. George and   
                     D. Quesnel   Parallel matrix inversion on a
                                  subcube-grid . . . . . . . . . . . . . . 243--256
                Volker Mehrmann   Divide and conquer methods for block
                                  tridiagonal systems  . . . . . . . . . . 257--279
           Bassem F. Beidas and   
    George P. Papavassilopoulos   Convergence analysis of asynchronous
                                  linear iterations with stochastic delays 281--302
                  C. R. Wan and   
                    D. J. Evans   A systolic array architecture for linear
                                  and inverse matrix systems . . . . . . . 303--321
                Zhiyong Liu and   
                   Jia-Huai You   Conflict-free routing for
                                  BPC-permutations on synchronous
                                  hypercubes . . . . . . . . . . . . . . . 323--342
             A. G. Chalmers and   
                     S. Gregory   Constructing minimum path configurations
                                  for multiprocessor systems . . . . . . . 343--355

Parallel Computing
Volume 19, Number 4, April, 1993

          S. Lakshmivarahan and   
              Jung Sing Jwo and   
                    S. K. Dhall   Symmetry in interconnection networks
                                  based on Cayley graphs of permutation
                                  groups: a survey . . . . . . . . . . . . 361--407
              A. R. Krommer and   
               C. W. Ueberhuber   Architecture adaptive algorithms . . . . 409--435
               M. Mantharam and   
                 P. J. Eberlein   New Jacobi-sets for parallel
                                  computations . . . . . . . . . . . . . . 437--454
             M. Atiquzzaman and   
                    M. M. Banat   Effect of hot-spots on the performance
                                  of crossbar multiprocessor systems . . . 455--461
             M. Graca Ruano and   
       D. F. Garcia Nocetti and   
                 P. J. Fish and   
                  P. J. Fleming   Alternative parallel implementations of
                                  an AR-modified covariance spectral
                                  estimator for diagnostic ultrasonic
                                  blood flow studies . . . . . . . . . . . 463--476

Parallel Computing
Volume 19, Number 5, May, 1993

          Mythili Mantharam and   
                 P. J. Eberlein   Block recursive algorithm to generate
                                  Jacobi-sets  . . . . . . . . . . . . . . 481--496
           Mokhtar Aboelaze and   
                  De-Lei L. Lee   A method for data allocation and
                                  manipulation in hypercube computers  . . 497--510
                    M. Bahi and   
                  J. C. Miellou   Contractive mappings with maximum norms:
                                  comparison of constants of contraction
                                  and application to asynchronous
                                  iterations . . . . . . . . . . . . . . . 511--523
                   M. Misra and   
                 D. Nassimi and   
                 V. K. Prasanna   Efficient VLSI implementation of
                                  iterative solutions to sparse linear
                                  systems  . . . . . . . . . . . . . . . . 525--544
              M. P. Bekakos and   
                    D. J. Evans   Parallel cyclic odd-even reduction
                                  algorithms for solving Toeplitz
                                  tridiagonal equations on MIMD computers  545--561
                G. Spaletta and   
                    D. J. Evans   The Parallel Recursive Decoupling
                                  algorithm for solving tridiagonal linear
                                  systems  . . . . . . . . . . . . . . . . 563--576
        E. V. Krishnamurthy and   
                       Chen Pin   Data parallel evaluation-interpolation
                                  algorithm for polynomial matrix
                                  inversion  . . . . . . . . . . . . . . . 577--589

Parallel Computing
Volume 19, Number 6, June, 1993

                 S. Chandra and   
                    M. Jain and   
                    A. Basu and   
                    P. S. Kumar   Sorting algorithms on transputer arrays  595--607
               T. B. Boffey and   
                    W. A. Essah   Implementing a parallel constrained
                                  $\ell_1$ approximation algorithm . . . . 609--620
            A. N. Choudhary and   
                B. Narahari and   
                R. Krishnamurti   An efficient heuristic scheme for
                                  dynamic remapping of parallel
                                  computations (Short communication) . . . 621--632
                  H. Azaria and   
                     Y. Elovici   Modeling and evaluation of a new
                                  message-passing system for parallel
                                  multiprocessor systems . . . . . . . . . 633--649
               M. Paprzycki and   
                    I. Gladwell   A parallel chopping algorithm for ODE
                                  boundary value problems  . . . . . . . . 651--666
                  F. Pagano and   
                  G. Parodi and   
                      R. Zunino   Parallel implementation of associative
                                  memories for image classification  . . . 667--684
               R. Campanini and   
                I. D'Antone and   
                 G. Di Caro and   
                      G. Giusti   A transputer-based parallel expert
                                  diagnostic system  . . . . . . . . . . . 685--692
                    Y.-W. Leung   On-line fault identification in
                                  multistage interconnection networks  . . 693--702
       E. J. Kontoghiorghes and   
                M. R. B. Clarke   Parallel reorthogonalization of the $QR$
                                  decomposition after deleting columns
                                  (Short communication)  . . . . . . . . . 703--707

Parallel Computing
Volume 19, Number 7, July, 1993

                    S. J. Horng   Computing dominators on a cube-connected
                                  machine  . . . . . . . . . . . . . . . . 713--728
             J. D. Bruguera and   
                  E. Antelo and   
                   E. L. Zapata   Design of a pipelined radix 4 CORDIC
                                  processor  . . . . . . . . . . . . . . . 729--744
                C. N. Zhang and   
                   H. F. Li and   
                   R. Jayakumar   A systematic approach for designing
                                  concurrent error-detecting systolic
                                  arrays using redundancy  . . . . . . . . 745--764
            Ren-Lianq Cheng and   
               Chung-Ping Chung   Reaching approximate agreement on
                                  hypercube  . . . . . . . . . . . . . . . 765--775
                   P. A. Nelson   Hypercube matrix multiplication  . . . . 777--788
                A. El-Amawy and   
                        R. Raja   Split sequence generation algorithms for
                                  efficient identification of operational
                                  subcubes in faulty hypercubes  . . . . . 789--805
            Yung-Chang Wong and   
                 Shu-Yuen Hwang   On parallelizing the Dempster-Shafer
                                  method using transputer network  . . . . 807--822
              S. D. Altekar and   
                  A. K. Ray and   
                   B. R. Wienke   On the parallelization of a $S_n$
                                  transport algorithm on a CRAY Y MP . . . 823--834

Parallel Computing
Volume 19, Number 8, August, 1993

              B. L. Menezes and   
           I. L. M. Ricarte and   
                  R. Thurimella   Analysis of pipelined external sorting
                                  on a reconfigurable message-passing
                                  multicomputer  . . . . . . . . . . . . . 839--858
            Nicolas Boissin and   
                Jean-Luc Lutton   A parallel simulated annealing algorithm 859--872
               Louis Ibarra and   
                  Dana Richards   Efficient parallel graph algorithms
                                  based on open ear decomposition  . . . . 873--886
                    Jiawang Wei   Parallel asynchronous iterations of
                                  least fixed points . . . . . . . . . . . 887--895
                D. J. Evans and   
                  W. U. N. Butt   Dynamic load balancing using
                                  task-transfer probabilities  . . . . . . 897--916
         Przemysław Stpiczyński   Error analysis of two parallel
                                  algorithms for solving linear recurrence
                                  systems  . . . . . . . . . . . . . . . . 917--923
              John J. Buoni and   
            Paul A. Farrell and   
                   Arden Ruttan   Algorithms for ${LU}$ decomposition on a
                                  shared memory multiprocessor . . . . . . 925--937
                   Jianping Zhu   $QR$ factorization for the regularized
                                  least squares problem on hypercubes  . . 939--948
                 A. Matrone and   
                 P. Schiano and   
                       V. Puoti   LINDA and PVM: a comparison between two
                                  environments for parallel programming    949--957

Parallel Computing
Volume 19, Number 9, September, 1993

          Zbigniew J. Czech and   
              Marek Konopka and   
             Bohdan S. Majewski   Parallel algorithms for finding a
                                  suboptimal fundamental-cycle set in a
                                  graph  . . . . . . . . . . . . . . . . . 961--971
                 I. W. Chan and   
                  D. K. Friesen   Parallel algorithm for segment
                                  visibility reporting . . . . . . . . . . 973--978
                       L. Lopez   Methods based on boundary value
                                  techniques for solving parabolic
                                  equations on parallel computers  . . . . 979--991
                      Hong Shen   A high performance interconnection
                                  network for multiprocessor systems . . . 993--1001
                  H. Caffey and   
                 L. Z. Liao and   
                C. A. Shoemaker   Parallel processing of large scale
                                  discrete-time unconstrained differential
                                  dynamic programming  . . . . . . . . . . 1003--1017
                      D. El Baz   Asynchronous implementation of
                                  relaxation and gradient algorithms for
                                  convex network flow problems . . . . . . 1019--1028
                  R. Trobec and   
                 I. Jerebic and   
                     D. Janezic   Parallel algorithm for molecular
                                  dynamics integration . . . . . . . . . . 1029--1039
                P. Altevogt and   
                       A. Linke   Parallelization of the two-dimensional
                                  Ising model on a cluster of IBM RISC
                                  System/6000 workstations . . . . . . . . 1041--1052
             A. Nanayakkara and   
               D. Moncrieff and   
                      S. Wilson   Performance of IBM RISC System/6000
                                  workstation clusters in a quantum
                                  chemical application . . . . . . . . . . 1053--1062
                  A. Jakobs and   
                  R. W. Gerling   Scaling aspects for the performance of
                                  parallel algorithms  . . . . . . . . . . 1063--1073

Parallel Computing
Volume 19, Number 10, October, 1993

                  Xiaobo Li and   
                    Paul Lu and   
         Jonathan Schaeffer and   
           John Shillington and   
               Pok Sze Wong and   
                     Hanmao Shi   On the versatility of parallel sorting
                                  by regular sampling  . . . . . . . . . . 1079--1103
            Rajesh Aggarwal and   
            David R. Dellwo and   
             Morton B. Friedman   Parallel solution of Fredholm integral
                                  equations of the second kind by
                                  accelerated projection methods . . . . . 1105--1115
       Maria Antonietta Pirozzi   The fast numerical solution of mildly
                                  nonlinear elliptic boundary value
                                  problems on multiprocessors  . . . . . . 1117--1128
           Terry Bossomaier and   
                   Adrian Loeff   Parallel computation of the Hausdorff
                                  distance between images  . . . . . . . . 1129--1140
                     D. Busvine   Implementing recursive functions as
                                  processor farms  . . . . . . . . . . . . 1141--1153
                      Y. Kanada   A method of vector processing for shared
                                  symbolic data  . . . . . . . . . . . . . 1155--1175
                   M. Gusev and   
                    D. J. Evans   New linear systolic arrays for the
                                  string comparison algorithm  . . . . . . 1177--1193

Parallel Computing
Volume 19, Number 11, November, 1993

               J. De Keyser and   
                       D. Roose   Load balancing data parallel programs on
                                  distributed memory computers . . . . . . 1199--1219
                  C. H. Cap and   
                    V. Strumpen   Efficient parallel computing in
                                  distributed workstation environments . . 1221--1234
                 S. L. Johnsson   Minimizing the communication time for
                                  matrix multiplication on multiprocessors 1235--1257
                 B. Hendrickson   Parallel $QR$ factorization using the
                                  torus-wrap mapping . . . . . . . . . . . 1259--1271
                  P. Amodio and   
                 N. Mastronardi   A parallel version of the cyclic
                                  reduction algorithm on a hypercube . . . 1273--1281
                   H. Dhrif and   
                      D. Sarkar   Fuzzy arithmetic on systolic arrays  . . 1283--1301
             Çetin Kaya Koç and   
                 Peter Cappello   Systolic arrays for integer Chinese
                                  remaindering . . . . . . . . . . . . . . 1303--1311
                      S. Hurley   Taskgraph mapping using a genetic
                                  algorithm: a comparison of fitness
                                  functions  . . . . . . . . . . . . . . . 1313--1317

Parallel Computing
Volume 19, Number 12, December, 1993

                    T. Yang and   
                  A. Gerasoulis   List scheduling with and without
                                  communication delays . . . . . . . . . . 1321--1344
               F. B. Hanson and   
                  J.-D. Mei and   
                    C. Tier and   
                          H. Xu   PDAC: a data parallel algorithm for the
                                  performance analysis of closed queueing
                                  networks . . . . . . . . . . . . . . . . 1345--1358
                     H. B. Zhou   Two-stage $m$-way graph partitioning . . 1359--1373
             K.-H. Hoffmann and   
                         J. Zou   Parallel efficiency of domain
                                  decomposition methods  . . . . . . . . . 1375--1391
                   M. Kumar and   
                Y. Baransky and   
                     M. Denneau   The GF11 parallel computer . . . . . . . 1393--1412
                  U. Gartel and   
                 W. Joppich and   
                    A. Schuller   Parallelizing the ECMWF's weather
                                  forecast program: the 2D case  . . . . . 1413--1425
                  U. Gartel and   
                 W. Joppich and   
                    A. Schuller   First results with a parallelized $3$D
                                  weather prediction code  . . . . . . . . 1427--1429
                   F.-H Hebeker   Parallel CFD'93  . . . . . . . . . . . . 1431

Parallel Computing
Volume 20, Number 1, January 16, 1994

               Shen Shen Wu and   
                 David Sweeting   Heuristic algorithms for task assignment
                                  and scheduling in a processor network    1--14
               J. Błażewicz and   
              M. Drozdowski and   
                 G. Schmidt and   
                    D. de Werra   Scheduling independent multiprocessor
                                  tasks on a uniform $k$-processor system  15--28
                D. J. Evans and   
                       M. Gusev   New linear systolic arrays for digital
                                  filters and convolution  . . . . . . . . 29--61
           Thomas Schreiber and   
                 Peter Otto and   
               Fridolin Hofmann   A new efficient parallelization strategy
                                  for the $QR$ algorithm . . . . . . . . . 63--75 (or 63--76??)
               R. Calinescu and   
                    D. J. Evans   A parallel simulation model for load
                                  balancing in clustered distributed
                                  systems  . . . . . . . . . . . . . . . . 77--91
               Jaime Seguel and   
                Dorothy Bollman   Fast digit-reversal algorithms on a
                                  shared-memory machine  . . . . . . . . . 93--99
                Shyan-Ming Yuan   An efficient fault-tolerant
                                  decentralized commit protocol  . . . . . 101--114
                  Thomas Umland   Parallel sorting revisited . . . . . . . 115--124
                   K. Nagel and   
                  A. Schleicher   Microscopic traffic modeling on parallel
                                  high performance computers . . . . . . . 125--146

Parallel Computing
Volume 20, Number 2, February 24, 1994

                  G. W. Stewart   Updating URV decompositions in parallel  151--172
              D. M. Beazley and   
                  P. S. Lomdahl   Message-passing multi-cell molecular
                                  dynamics on the Connection Machine 5 . . 173--195
              M. Angelaccio and   
                   M. Colajanni   The row/column pivoting strategy on
                                  multicomputers . . . . . . . . . . . . . 197--213
             Michael Conner and   
              Richard Tolimieri   Special purpose hardware for Discrete
                                  Fourier Transform implementation . . . . 215--232
      Henry Ker-Chang Chang and   
     Jonathan Jen-Rong Chen and   
               Shyong-Jian Shyu   A parallel algorithm for the knapsack
                                  problem using a generation and searching
                                  technique  . . . . . . . . . . . . . . . 233--243
          Antonio d'Acierno and   
                Roberto Vaccaro   On parallelizing recursive neural
                                  networks on coarse-grained parallel
                                  computers: a general algorithm . . . . . 245--256
              M. Angelaccio and   
                   M. Colajanni   Subcube matrix decomposition: a unifying
                                  view for LU factorization on
                                  multicomputers . . . . . . . . . . . . . 257--270

Parallel Computing
Volume 20, Number 3, March 10, 1994

                       M. Kiehl   Parallel multiple shooting for the
                                  solution of initial value problems . . . 275--295
                Lujuan Chen and   
        E. V. Krishnamurthy and   
                   Iain Macleod   Generalised matrix inversion and rank
                                  computation by successive matrix
                                  powering . . . . . . . . . . . . . . . . 297--311
                    Jian-Jin Li   Multiscattering on the Cube-Connected
                                  Cycles . . . . . . . . . . . . . . . . . 313--324
                D. J. Evans and   
                  W. U. N. Butt   Load balancing with network partitioning
                                  using host groups  . . . . . . . . . . . 325--345
             Tzung-Pei Hong and   
             Shian-Shyong Tseng   An optimal parallel perceptron learning
                                  algorithm for a large training set . . . 347--352
           Jong-Chuang Tsay and   
                   Wei-Ping Lee   An optimal parallel algorithm for
                                  generating permutations in minimal
                                  change order . . . . . . . . . . . . . . 353--361
               M. L. Sawley and   
                  C. M. Bergman   A comparative study of the use of the
                                  data-parallel approach for compressible
                                  flow calculations  . . . . . . . . . . . 363--373
                  A. Asenov and   
                    D. Reid and   
                   J. R. Barker   Speed-up of scalable iterative linear
                                  solvers implemented on an array of
                                  transputers  . . . . . . . . . . . . . . 375--387
               Roger W. Hockney   The communication challenge for MPP:
                                  Intel Paragon and Meiko CS-2 . . . . . . 389--398
                   U. Kleis and   
               J. M. Singer and   
             I. Morgenstern and   
                Th. Hußlein and   
                 H.-G. Matuttis   Experiences with re-engineering and
                                  parallelizing a high-T$_c$
                                  superconductivity code . . . . . . . . . 399--407
                      Anonymous   Parallel Computing 93  . . . . . . . . . 409

Parallel Computing
Volume 20, Number 4, March 31, 1994

              Oliver A. McBryan   An overview of message passing
                                  environments . . . . . . . . . . . . . . 417--443 (or 417--444??)
               Vasanth Bala and   
             Jehoshua Bruck and   
             Raymond Bryant and   
              Robert Cypher and   
              Peter de Jong and   
            Pablo Elustondo and   
                    D. Frye and   
                    Alex Ho and   
              Ching-Tien Ho and   
                 Gail Irwin and   
              Shlomo Kipnis and   
           Richard Lawrence and   
                      Marc Snir   The IBM External User Interface for
                                  scalable parallel systems  . . . . . . . 445--462
                    Paul Pierce   The NX message passing interface . . . . 463--480
            Lewis W. Tucker and   
                Alan Mainwaring   CMMD: Active messages on the CM-5  . . . 481--496
                Eric Barton and   
               James Cownie and   
                  Moray McLaren   Message passing on the Meiko CS-2  . . . 497--507
               M. Schmidt-Voigt   Efficient parallel communication with
                                  the nCUBE 2S processor . . . . . . . . . 509--530
             V. S. Sunderam and   
                G. A. Geist and   
                J. Dongarra and   
                     R. Manchek   The PVM concurrent computing system:
                                  Evolution, experiences, and trends . . . 531--545
            Ralph M. Butler and   
                  Ewing L. Lusk   Monitors, messages, and clusters: The p4
                                  parallel programming system  . . . . . . 547--564
           Anthony Skjellum and   
            Steven G. Smith and   
             Nathan E. Doss and   
             Alvin P. Leung and   
                 Manfred Morari   The design and evolution of Zipcode  . . 565--596
                 Jon Flower and   
                    Adam Kolawa   Express is not just a message passing
                                  system: Current and future directions in
                                  Express  . . . . . . . . . . . . . . . . 597--614
                  R. Calkin and   
                  R. Hempel and   
                H.-C. Hoppe and   
                      P. Wypior   Portable programming with the PARMACS
                                  message-passing library  . . . . . . . . 615--632
       Nicholas J. Carriero and   
            David Gelernter and   
         Timothy G. Mattson and   
              Andrew H. Sherman   The Linda alternative to message-passing
                                  systems  . . . . . . . . . . . . . . . . 633--655
                David W. Walker   The design of a standard message passing
                                  interface for distributed memory
                                  concurrent computers . . . . . . . . . . 657--673

Parallel Computing
Volume 20, Number 5, May 11, 1994

                Alain Darte and   
                    Yves Robert   Mapping uniform loop nests onto
                                  distributed memory architectures . . . . 679--710
                   Jingling Xue   Automating non-unimodular loop
                                  transformations for massive parallelism  711--728
                 David J. Lilja   A multiprocessor architecture combining
                                  fine-grained and coarse-grained
                                  parallelism strategies . . . . . . . . . 729--751
              Mark T. Jones and   
              Paul E. Plassmann   Scalable iterative solution of sparse
                                  linear systems . . . . . . . . . . . . . 753--773
               Wei Ping Lee and   
               Jong Chuang Tsay   A systolic design for generating
                                  permutations in lexicographic order  . . 775--785
                D. J. Evans and   
                   W. S. Yousif   The solution of unsymmetric tridiagonal
                                  Toeplitz systems by the strides
                                  reduction algorithm  . . . . . . . . . . 787--798
        E. V. Krishnamurthy and   
           Vikram Krishnamurthy   An ANN model perceptron algorithm using
                                  generalized matrix inversion . . . . . . 799--806
                E. Montagne and   
                   M. Rukoz and   
                   R. Surós and   
                      F. Breant   Modeling optimal granularity when
                                  adapting systolic algorithms to
                                  transputer based supercomputers  . . . . 807--814
                   Y. F. Hu and   
                    R. J. Blake   Numerical experiences with partitioning
                                  of unstructured meshes . . . . . . . . . 815--829

Parallel Computing
Volume 20, Number 6, June 10, 1994

              S. Selvakumar and   
             C. Siva Ram Murthy   Static task allocation of concurrent
                                  programs for distributed computing
                                  systems with processor and resource
                                  heterogeneity  . . . . . . . . . . . . . 835--851
                  Jianjian Song   A partially asynchronous and iterative
                                  algorithm for distributed load balancing 853--868
              Dongseung Kim and   
                 Byung-Guoen Yi   A two-pass scheduling algorithm for
                                  parallel programs  . . . . . . . . . . . 869--885
              Tien-Yu Huang and   
                Jean-Lien C. Wu   Alternate resolution strategy in
                                  multistage interconnection networks  . . 887--896
              Bao Lin Zhang and   
                     Wen Zhi Li   On Alternating Segment Crank--Nicolson
                                  scheme (Short communication) . . . . . . 897--902
                  C. R. Wan and   
                    D. J. Evans   A systolic array architecture for $QR$
                                  decomposition of block structured sparse
                                  systems  . . . . . . . . . . . . . . . . 903--914

Parallel Computing
Volume 20, Number 7, July 12, 1994

            Kapil K. Mathur and   
            S. Lennart Johnsson   Multiplication of matrices of arbitrary
                                  shape on a data parallel computer  . . . 919--951
               Inge Gutheil and   
             Werner Krotz-Vogel   Performance of a parallel matrix
                                  multiplication routine on Intel iPSC/860 953--974
                   H. Suman and   
                   K. Schilling   A comparative study of gauge fixing
                                  procedures on the connection machines
                                  CM2 and CM5  . . . . . . . . . . . . . . 975--990
                  Chang-ming Ma   Implementation of a Monte Carlo code on
                                  a parallel computer system . . . . . . . 991--1005
             Hsiao-Hsi Wang and   
               Ruei-Chuan Chang   A distributed shared memory system with
                                  self-adjusting coherence scheme  . . . . 1007--1025
                Takenori Makino   Shift-net and power shift-net for
                                  parallel processor systems . . . . . . . 1027--1039
            Jean-Lien C. Wu and   
                    T.-Y. Huang   A new bus contention scheme in S/NET
                                  with dynamic priority  . . . . . . . . . 1041--1054
                D. J. Evans and   
                   E. Galligani   A parallel additive preconditioner for
                                  conjugate gradient method for $AX+XB=C$  1055--1064

Parallel Computing
Volume 20, Number 8, August 10, 1994

            Johan De Keyser and   
                  Kurt Lust and   
                     Dirk Roose   Run-time load balancing support for a
                                  parallel multiblock Euler/Navier--Stokes
                                  code with adaptive refinement on
                                  distributed memory computers . . . . . . 1069--1088
                 Hong Zhang and   
                William F. Moss   Using parallel banded linear system
                                  solvers in generalized eigenvalue
                                  problems . . . . . . . . . . . . . . . . 1089--1105
          Sabine Van Huffel and   
                    Haesun Park   Parallel tri- and bi-diagonalization of
                                  bordered bidiagonal matrices . . . . . . 1107--1128
                 T. F. Pena and   
               E. L. Zapata and   
                    D. J. Evans   Finite element simulation of
                                  semiconductor devices on multiprocessor
                                  computers  . . . . . . . . . . . . . . . 1129--1159
         Nicholas J. Higham and   
       Pythagoras Papadimitriou   A parallel algorithm for computing the
                                  polar decomposition  . . . . . . . . . . 1161--1173
                 P. Yalamov and   
                    D. J. Evans   On the forward stability of a modified
                                  `stride of $3$' reduction method . . . . 1175--1190
                   Amit J. Basu   A parallel algorithm for spectral
                                  solution of the three-dimensional
                                  Navier--Stokes equations . . . . . . . . 1191--1204
         Richard E. Overill and   
                 Stephen Wilson   Performance of parallel algorithms for
                                  the evaluation of power series . . . . . 1205--1213
                David W. Walker   Erratum to: ``The design of a standard
                                  message passing interface for
                                  distributed memory concurrent
                                  computers''  . . . . . . . . . . . . . . 1215--1215

Parallel Computing
Volume 20, Number 9, September 12, 1994

          L. C. Polymenakos and   
                D. P. Bertsekas   Parallel shortest path auction
                                  algorithms . . . . . . . . . . . . . . . 1221--1247
                     Qi Gan and   
                  Qing Yang and   
                     Chen-Yi Hu   Parallel all-row preconditioned interval
                                  linear solver for nonlinear equations on
                                  multiprocessors  . . . . . . . . . . . . 1249--1268
          Jeffrey T. Draper and   
                  Joydeep Ghosh   The M-cache: a message-handling
                                  mechanism for multicomputer systems  . . 1269--1288
              Abdel Aziz Farrag   Tolerating faulty edges in a
                                  multi-dimensional mesh . . . . . . . . . 1289--1301
                 Abhay Jain and   
                N. S. Chaudhari   Efficient parallel recognition of
                                  context-free languages . . . . . . . . . 1303--1321 (or 1303--1322??)
                   Yen Chun Lin   New systolic arrays for the longest
                                  common subsequence problem$^+$ . . . . . 1323--1334
         Saulo R. M. Barros and   
                 Tuomo Kauranne   On the parallelization of global
                                  spectral weather models  . . . . . . . . 1335--1356
                     Jun Makino   Lagged-Fibonacci random number
                                  generators on parallel computers . . . . 1357--1367
                Frank Dehne and   
            Afonso Ferreira and   
             Andrew Rau-Chaplin   A massively parallel knowledge-base
                                  server using a hypercube multiprocessor  1369--1382

Parallel Computing
Volume 20, Number 10--11, November 3, 1994

              Oliver A. McBryan   The SUPRENUM and GENESIS projects  . . . 1389--1396
             Ulrich Trottenberg   Some remarks on the SUPRENUM project . . 1397--1406
                    W. K. Giloi   The SUPRENUM supercomputer: Goals,
                                  achievements, and lessons learned  . . . 1407--1425
              Oliver A. McBryan   SUPRENUM: Perspectives and performance   1427--1442
              Wolfgang K. Giloi   Parallel supercomputer architectures and
                                  their programming models . . . . . . . . 1443--1470
   Wolfgang Schröder-Preikschat   PEACE --- a software backplane for
                                  parallel computing . . . . . . . . . . . 1471--1485
               Hans P. Zima and   
              Peter Brezany and   
             Barbara M. Chapman   SUPERB and Vienna Fortran  . . . . . . . 1487--1517
                      R. Hempel   Application programming interfaces for
                                  SUPRENUM . . . . . . . . . . . . . . . . 1519--1526
        Hermann Mierendorff and   
          Helmut Schwamborn and   
                 Maurizio Tazza   Performance modelling of grid problems
                                  --- a case study on the SUPRENUM system  1527--1546
                   Manfred Alef   Implementation of a multigrid algorithm
                                  on SUPRENUM and other systems  . . . . . 1547--1557
            Hubert Ritzdorf and   
             Anton Schüller and   
         Barbara A. Steckel and   
                   Klaus Stüben   $L_i$SS --- an environment for the
                                  parallel multigrid solution of partial
                                  differential equations on general 2D
                                  domains  . . . . . . . . . . . . . . . . 1559--1570
             Ortwin Pätzold and   
             Anton Schüller and   
           Horst Schwichtenberg   Parallel applications and performance
                                  measurements on SUPRENUM . . . . . . . . 1571--1582
          Georg Fleischmann and   
             Matthias Gente and   
           Fridolin Hofmann and   
                   Gunter Bolch   Performance analysis of parallel
                                  programs based on model calculations . . 1583--1603
                       Tony Hey   The Genesis Esprit project --- an
                                  overview . . . . . . . . . . . . . . . . 1605--1612
                      Otto Kolp   Performance estimation for a parallel
                                  system with a hierarchical switch
                                  network  . . . . . . . . . . . . . . . . 1613--1626
               Jon Beecroft and   
              Mark Homewood and   
                  Moray McLaren   Meiko CS-2 interconnect Elan-Elite
                                  design . . . . . . . . . . . . . . . . . 1627--1638
               L. M. Delves and   
              C. A. Addison and   
                     O. A. Aziz   The design and implementation of a
                                  portable parallel numerical library  . . 1639--1651
              C. A. Addison and   
                V. S. Getov and   
               A. J. G. Hey and   
              R. W. Hockney and   
                   I. C. Wolton   Benchmarking for distributed memory
                                  parallel systems: Gaining insight from
                                  numbers  . . . . . . . . . . . . . . . . 1653--1668
           Karl Solchenbach and   
       Clemens-August Thole and   
             Ulrich Trottenberg   GENESIS application software . . . . . . 1669--1673
             Edgar A. Gerteisen   Preliminary performance results of the
                                  massive parallel Aircraft Euler Method   1675--1683
                 Tuomo Kauranne   Summary of GENESIS work at the European
                                  Centre for Medium-range Weather
                                  Forecasts (ECMWF)  . . . . . . . . . . . 1685--1688
            J. J. H. Miller and   
                        S. Wang   On the implementation of a $3$-D
                                  semiconductor device simulator on
                                  distributed-memory MIMD/SIMD machines    1689--1691

Parallel Computing
Volume 20, Number 12, November 28, 1994

                   A. Dubey and   
                  M. Zubair and   
                   C. E. Grosch   A general purpose subroutine for fast
                                  Fourier transform on a distributed
                                  memory parallel machine  . . . . . . . . 1697--1710
             Ralf Östermark and   
                Martin Saarinen   Parallel implementation of a VARMAX
                                  algorithm  . . . . . . . . . . . . . . . 1711--1720
                 Shu Hua Hu and   
                Hsing Lung Chen   An effective routing algorithm in
                                  incomplete hypercubes  . . . . . . . . . 1721--1738
                M. S. Horng and   
                 D. J. Chen and   
                    Kuo Lung Ku   Parallel routing algorithms for
                                  incomplete hypercube interconnection
                                  networks . . . . . . . . . . . . . . . . 1739--1761
                  Kemal Efe and   
            P. K. Blackwell and   
                  W. Slough and   
                       T. Shiau   Topological properties of the crossed
                                  cube architecture  . . . . . . . . . . . 1763--1775

Parallel Computing
Volume 21, Number 1, January 10, 1995

           Samir W. Mahfoud and   
              David E. Goldberg   Parallel recombinative simulated
                                  annealing: a genetic algorithm . . . . . 1--28
           R. Van Driessche and   
                       D. Roose   An improved spectral bisection algorithm
                                  and its application to dynamic load
                                  balancing  . . . . . . . . . . . . . . . 29--48
             Claus Bendtsen and   
       Per Christian Hansen and   
                 Kaj Madsen and   
         Hans Bruun Nielsen and   
                  Mustafa Pinar   Implementation of $QR$ up- and
                                  downdating on a massively parallel
                                  computer . . . . . . . . . . . . . . . . 49--61
             T. H. C. Smith and   
                 G. L. Thompson   A parallel implementation of the column
                                  subtraction algorithm  . . . . . . . . . 63--71
              A. De Matteis and   
                    S. Pagnutti   Controlling correlations in parallel
                                  Monte Carlo  . . . . . . . . . . . . . . 73--84
    Sathiamoorthy Manoharan and   
                Nigel P. Topham   An assessment of assignment schemes for
                                  dependency graphs  . . . . . . . . . . . 85--107
                D. J. Evans and   
                     S. A. Amin   Systolic algorithms for digital image
                                  filtering  . . . . . . . . . . . . . . . 109--119
             Kuninobu Tanno and   
           Toshihiro Taketa and   
               Susumu Horiguchi   Parallel FFT algorithms using radix 4
                                  butterfly computation on an
                                  eight-neighbor processor array . . . . . 121--136
                Chi-kin Lee and   
                   Mounir Hamdi   Practical aspects and experiences:
                                  Parallel image processing applications
                                  on a network of workstations . . . . . . 137--160
            Howard C. Elman and   
               Dennis K.-Y. Lee   Use of linear algebra kernels to build
                                  an efficient finite element solver . . . 161--173

Parallel Computing
Volume 21, Number 2, February 17, 1995

               J. De Keyser and   
                       D. Roose   Run-time load balancing techniques for a
                                  parallel unstructured multi-grid Euler
                                  solver with adaptive grid refinement . . 179--198
           Tilmann Bönniger and   
              Rüdiger Esser and   
                Dietrich Krekel   CM-5E, KSR2, Paragon XP/S: a comparative
                                  description of massively parallel
                                  computers  . . . . . . . . . . . . . . . 199--232
               Juan C. Agüí and   
                 Javier Jiménez   A binary tree implementation of a
                                  parallel distributed tridiagonal solver  233--241
    Emmanouel A. Varvarigos and   
           Dimitri P. Bertsekas   Transposition of banded matrices in
                                  hypercubes: a nearly isotropic task  . . 243--264
                    E. Lega and   
                  H. Scholl and   
                J.-M. Alimi and   
                 A. Bijaoui and   
                        P. Bury   A parallel algorithm for structure
                                  detection based on wavelet and
                                  segmentation analysis  . . . . . . . . . 265--285
                F. J. Muniz and   
                  E. J. Zaluska   Parallel load-balancing: an extension to
                                  the gradient model . . . . . . . . . . . 287--301
                      Hong Shen   An efficient permutation-based parallel
                                  algorithm for range-join in hypercubes   303--313
           M. Y. Mohd-Saman and   
                    D. J. Evans   Inter-procedural analysis for parallel
                                  computing  . . . . . . . . . . . . . . . 315--338
              Zaher Mahjoub and   
                  Mohamed Jemni   Restructuring and parallelizing a static
                                  conditional loop . . . . . . . . . . . . 339--347

Parallel Computing
Volume 21, Number 3, March 10, 1995

                 F. Desprez and   
                 B. Tourancheau   Basic routines for the rank-$2k$ update:
                                  2D torus vs. reconfigurable network  . . 353--372
       Jörg-Thomas Pfenning and   
                 Christoph Moll   Optimized communication patterns on
                                  workstation clusters . . . . . . . . . . 373--388
                   Liu Yong and   
                Kang Lishan and   
                    D. J. Evans   The annealing evolution algorithm as
                                  function optimizer . . . . . . . . . . . 389--400
                S. Crivelli and   
                   E. R. Jessup   The cost of eigenvalue computation on
                                  distributed-memory MIMD multiprocessors  401--422
                L. Nicastro and   
                     N. D'Amico   An optimized mass storage FFT for vector
                                  computers  . . . . . . . . . . . . . . . 423--432
                 R. Sridhar and   
             N. Chandrasekharan   Highly parallelizable problems on sorted
                                  intervals  . . . . . . . . . . . . . . . 433--446
                K. G. Kumar and   
               D. B. Skillicorn   Data parallel geometric operations on
                                  lists  . . . . . . . . . . . . . . . . . 447--459
                   Zhaofang Wen   Fast parallel algorithms for the maximum
                                  sum problem  . . . . . . . . . . . . . . 461--466
               D. Moncrieff and   
              R. E. Overill and   
                      S. Wilson   $\alpha_{\mbox{critical}}$ for parallel
                                  processors . . . . . . . . . . . . . . . 467--471
                Pontus Matstoms   Parallel sparse $QR$ factorization on
                                  shared memory architectures  . . . . . . 473--486
             Pasqua D'Ambra and   
                  Giulio Giunta   Concurrent banded Cholesky factorization
                                  on workstation networks using PVM  . . . 487--494
           Frederic Desprez and   
                    Marc Garbey   Numerical simulation of a combustion
                                  problem on a Paragon machine . . . . . . 495--508
               Gerhard Globisch   PARMESH --- a parallel mesh generator    509--524

Parallel Computing
Volume 21, Number 4, April 1, 1995

                 David M. Nicol   Noncommittal barrier synchronization . . 529--549
              Rolf Borgeest and   
             Bernward Dimke and   
                    Olav Hansen   A trace based performance evaluation
                                  tool for parallel real time systems  . . 551--564
               Bai Zhongzhi and   
                 Wang Deren and   
                    D. J. Evans   Models of asynchronous parallel matrix
                                  multisplitting relaxed iterations  . . . 565--582
               L. F. Romero and   
                   E. L. Zapata   Data distributions for sparse matrix
                                  vector multiplication  . . . . . . . . . 583--605
              N. M. Bahoshy and   
                    D. J. Evans   A general harness for explicit parallel
                                  programming  . . . . . . . . . . . . . . 607--617
                  M. P. Bekakos   A notational approach to formulation of
                                  systolic array programs (Short
                                  communication) . . . . . . . . . . . . . 619--626
                 Xiaodong Zhang   Parallelizing an oil refining
                                  simulation: Numerical methods,
                                  implementations and experience . . . . . 627--647
               Albert Y. Zomaya   Parallel processing for robot dynamics
                                  computations . . . . . . . . . . . . . . 649--668
                  A. Asenov and   
                    D. Reid and   
                   J. R. Barker   Speed-up of scalable iterative linear
                                  solvers implemented on an array of
                                  transputers  . . . . . . . . . . . . . . 669--682
                  G. A. Kohring   Dynamic load balancing for parallelized
                                  particle simulations on MIMD computers   683--693

Parallel Computing
Volume 21, Number 5, May 10, 1995

            Takuya Terasawa and   
                Ou Yamamoto and   
             Tomohiro Kudoh and   
                 Hideharu Amano   A performance evaluation of the
                                  multiprocessor testbed ATTEMPT-0 . . . . 701--730
       Susanne E. Hambrusch and   
              Farooq Hameed and   
              Ashfaq A. Khokhar   Communication operations on
                                  coarse-grained mesh architectures  . . . 731--751 (or 731--752??)
              Shuichi Sakai and   
              Yuetsu Kodama and   
             Mitsuhisa Sato and   
                Andrew Shaw and   
           Hiroshi Matsuoka and   
               Hideo Hirono and   
            Kazuaki Okamoto and   
                 Takashi Yokota   Reduced interprocessor-communication
                                  architecture and its implementation on
                                  EM-4 . . . . . . . . . . . . . . . . . . 753--769 (or 753--770??)
            Dilip K. Saikia and   
                  Ranjan K. Sen   Order preserving communication on a star
                                  network  . . . . . . . . . . . . . . . . 771--782
              M. A. de Rosa and   
                  G. Giunta and   
                    M. Rizzardi   Parallel Talbot's algorithm for
                                  distributed memory machines  . . . . . . 783--801 (or 783--802??)
               M. Cannataro and   
             S. Di Gregorio and   
                   R. Rongo and   
                 W. Spataro and   
                G. Spezzano and   
                       D. Talia   A parallel cellular automata environment
                                  on multicomputers for computational
                                  science  . . . . . . . . . . . . . . . . 803--823 (or 803--824??)
               K. G. Margaritis   On the systolic implementation of
                                  associative memory artificial neural
                                  networks . . . . . . . . . . . . . . . . 825--840
                  Ling Chen and   
             Henry Y. H. Chuang   An efficient algorithm for complete
                                  Euclidean distance transform on
                                  mesh-connected SIMD (Short
                                  communication) . . . . . . . . . . . . . 841--852
       Marek T. Michalewicz and   
                Mark Priebatsch   Perfect scaling of the electronic
                                  structure problem on a SIMD architecture 853--870

Parallel Computing
Volume 21, Number 6, June 12, 1995

             Robert B. Schnabel   A view of the limitations,
                                  opportunities, and challenges in
                                  parallel nonlinear optimization  . . . . 875--905
                  Kai Rothe and   
                  Heinrich Voss   A fully parallel condensation method for
                                  generalized eigenvalue problems on
                                  distributed memory computers . . . . . . 907--921
            Arkady Kanevsky and   
                      Chao Feng   On the embedding of cycles in pancake
                                  graphs . . . . . . . . . . . . . . . . . 923--936
     Dieter Müller-Wichards and   
                Wolfgang Rönsch   Scalability of algorithms: an analytic
                                  approach . . . . . . . . . . . . . . . . 937--952
             Tzong Wann Kao and   
                 Shi Jinn Horng   Optimal algorithms for computing
                                  articulation points and some related
                                  problems on a circular-arc graph (Short
                                  communication) . . . . . . . . . . . . . 953--969
                 John Brown and   
           Jerzy Waśniewski and   
                  Zahari Zlatev   Practical aspects and experiences.
                                  Running air pollution models on
                                  massively parallel machines  . . . . . . 971--991
           Vamsee Lakamsani and   
            Laxmi N. Bhuyan and   
             D. Scott Linthicum   Practical aspects and experiences.
                                  Mapping molecular dynamics computations
                                  on to hypercubes . . . . . . . . . . . . 993--1013
                 Jun Makino and   
                 Osamu Miyamura   Parallelized feedback shift register
                                  generators of pseudorandom numbers . . . 1015--1028

Parallel Computing
Volume 21, Number 7, July 11, 1995

               Tony F. Chan and   
                 Jian Ping Shao   Parallel complexity of domain
                                  decomposition methods and optimal coarse
                                  grid size  . . . . . . . . . . . . . . . 1033--1049
             Hugo Embrechts and   
                     Dirk Roose   MIMD divide-and-conquer algorithms for
                                  the distance transformation. Part I:
                                  City Block distance  . . . . . . . . . . 1051--1076
             Hugo Embrechts and   
                     Dirk Roose   MIMD divide-and-conquer algorithms for
                                  the distance transformation. Part II.
                                  Chamfer $3$-$4$ distance . . . . . . . . 1077--1096
           Pierluigi Amodio and   
                 Luigi Brugnano   The parallel $QR$ factorization
                                  algorithm for tridiagonal linear systems 1097--1110
                 P. Yalamov and   
                    D. J. Evans   The $WZ$ matrix factorisation method . . 1111--1120
                Edward Rothberg   Alternatives for solving sparse
                                  triangular systems on distributed-memory
                                  multiprocessors  . . . . . . . . . . . . 1121--1136
                  N. Floros and   
                    J. S. Reeve   Evaluation of a spectral element CFD
                                  code on parallel architectures . . . . . 1137--1150
                A. Averbuch and   
                 M. Israeli and   
                     L. Vozovoi   Parallel implementation of non-linear
                                  evolution problems using parabolic
                                  domain decomposition . . . . . . . . . . 1151--1183

Parallel Computing
Volume 21, Number 8, August 10, 1995

           Michael W. Berry and   
           Jack J. Dongarra and   
                   Youngbae Kim   A parallel algorithm for the reduction
                                  of a nonsymmetric matrix to block
                                  upper-Hessenberg form  . . . . . . . . . 1189--1211
                 C. Trefftz and   
                C. C. Huang and   
             P. K. McKinley and   
                   T.-Y. Li and   
                        Z. Zeng   A scalable eigenvalue solver for
                                  symmetric tridiagonal matrices . . . . . 1213--1240
                    Xian-He Sun   Application and accuracy of the parallel
                                  diagonal dominant algorithm  . . . . . . 1241--1267
                   H. R. Barada   Modular matrix computations on
                                  multi-linear VLSI arrays . . . . . . . . 1269--1284
       Paraskevas Evripidou and   
               Jean-Luc Gaudiot   Incorporating input/output operations
                                  into dynamic data-flow graphs  . . . . . 1285--1311
                 Clark F. Olson   Parallel algorithms for hierarchical
                                  clustering . . . . . . . . . . . . . . . 1313--1325
                 Tom Altman and   
         Yoshihide Igarashi and   
                   Koji Obokata   Hyper-ring connection machines . . . . . 1327--1338
            J. P. Geschiere and   
              H. A. G. Wijshoff   Exploiting large grain parallelism in a
                                  sparse direct linear system solver . . . 1339--1364
                G. Casciola and   
                      S. Morigi   Graphics in parallel computation for
                                  rendering $3$D modelled scenes . . . . . 1365--1382

Parallel Computing
Volume 21, Number 9, September 12, 1995

              Jaeyoung Choi and   
           Jack J. Dongarra and   
                David W. Walker   Parallel matrix transpose algorithms on
                                  distributed memory concurrent computers  1387--1405
                 Gita Alaghband   Parallel sparse matrix solution and
                                  performance  . . . . . . . . . . . . . . 1407--1430
           Bassem F. Beidas and   
    George P. Papavassilopoulos   Distributed asynchronous algorithms with
                                  stochastic delays for constrained
                                  optimization problems with conditions of
                                  time drift . . . . . . . . . . . . . . . 1431--1450
               Fotis Barlos and   
                  Ophir Frieder   A load balanced multicomputer relational
                                  database system for highly skewed data   1451--1483
          Akiyoshi Wakatani and   
                  Michael Wolfe   Optimization of array redistribution for
                                  distributed memory multicomputers  . . . 1485--1490
            Umpei Nagashima and   
            Sachiko Hyugaji and   
          Satoshi Sekiguchi and   
             Mitsuhisa Sato and   
                   Haruo Hosoya   An experience with super-linear speedup
                                  achieved by parallel computing on a
                                  workstation cluster: Parallel
                                  calculation of density of states of
                                  large scale cyclic polyacenes  . . . . . 1491--1504
           Jesper Larsson Träff   An experimental comparison of two
                                  distributed single-source shortest path
                                  algorithms . . . . . . . . . . . . . . . 1505--1532

Parallel Computing
Volume 21, Number 10, November 29, 1995

                   J. Drake and   
                      I. Foster   Guest Editorial: Parallel computing in
                                  climate and weather modeling . . . . . . 1537
                   J. Drake and   
                      I. Foster   Introduction to the special issue on
                                  parallel computing in climate and
                                  weather modeling . . . . . . . . . . . . 1539--1544
              James J. Hack and   
          James M. Rosinski and   
        David L. Williamson and   
           Byron A. Boville and   
              John E. Truesdale   Computational design of the NCAR
                                  community climate model  . . . . . . . . 1545--1569
                 John Drake and   
                 Ian Foster and   
            John Michalakes and   
               Brian Toonen and   
                 Patrick Worley   Design and performance of a scalable
                                  parallel community climate model . . . . 1571--1591
          Steven W. Hammond and   
            Richard D. Loft and   
             John M. Dennis and   
                Richard K. Sato   Implementation and performance issues of
                                  a massively parallel atmospheric model   1593--1619
            S. R. M. Barros and   
                    D. Dent and   
                 L. Isaksen and   
                G. Robinson and   
              G. Mozdzynski and   
                 F. Wollenweber   The IFS model: a parallel production
                                  weather code . . . . . . . . . . . . . . 1621--1638
                     J. G. Sela   Weather forecasting on parallel
                                  architectures  . . . . . . . . . . . . . 1639--1654
               M. F. Wehner and   
                A. A. Mirin and   
             P. G. Eltgroth and   
             W. P. Dannevik and   
              C. R. Mechoso and   
              J. D. Farrara and   
                    J. A. Spahr   Performance of a distributed memory
                                  finite difference atmospheric general
                                  circulation model  . . . . . . . . . . . 1655--1675
            Philip W. Jones and   
        Christopher L. Kerr and   
              Richard S. Hemler   Practical considerations in development
                                  of a parallel SKYHI general circulation
                                  model  . . . . . . . . . . . . . . . . . 1677--1694
               Rainer Bleck and   
                Sumner Dean and   
            Matthew O'Keefe and   
                   Aaron Sawdey   A comparison of data-parallel and
                                  message-passing versions of the Miami
                                  Isopycnic Coordinate Ocean Model (MICOM) 1695--1720

Parallel Computing
Volume 21, Number 11, November 29, 1995

                        Y. Nota   An efficient parallel discrete PDE
                                  solver . . . . . . . . . . . . . . . . . 1725--1748
                  Chang Shu and   
                  Hilary Buxton   Parallel path planning on the
                                  distributed array processor  . . . . . . 1749--1767
              Nathan Mattor and   
        Timothy J. Williams and   
               Dennis W. Hewett   Algorithm for solving tridiagonal matrix
                                  problems in parallel . . . . . . . . . . 1769--1782
    Suchendra M. Bhandarkar and   
               Hamid R. Arabnia   The REFINE multiprocessor ---
                                  Theoretical properties and algorithms    1783--1805
  Ramachandran Vaidyanathan and   
              Anand Padmanabhan   Short communication: Bus-based networks
                                  for fan-in and uniform hypercube
                                  algorithms . . . . . . . . . . . . . . . 1807--1821
                  N. Floros and   
                J. S. Reeve and   
          J. Clinckemaillie and   
             S. Vlachoutsis and   
                    G. Lonsdale   Comparative efficiencies of domain
                                  decompositions . . . . . . . . . . . . . 1823--1835
                 Mats Holmström   Practical aspects and experiences:
                                  Parallelizing the fast wavelet transform 1837--1848
                  M. Briscolini   A parallel implementation of a $3$-D
                                  pseudospectral based code on the IBM
                                  9076 scalable POWER parallel system  . . 1849--1862

Parallel Computing
Volume 21, Number 12, December 12, 1995

                    T. Dehn and   
                M. Eiermann and   
              K. Giebermann and   
                    V. Sperling   Structured sparse matrix-vector
                                  multiplication on massively parallel
                                  SIMD architectures . . . . . . . . . . . 1867--1894
                 PeiZong Z. Lee   Techniques for compiling programs on
                                  distributed memory multicomputers  . . . 1895--1923
                 C. S. Yang and   
                 Y. M. Tsai and   
                  S. L. Chi and   
             Shepherd S. B. Shi   Adaptive wormhole routing in $k$-ary
                                  $n$-cubes  . . . . . . . . . . . . . . . 1925--1943
               J. Błażewicz and   
                  M. Drozdowski   Short Communication: Scheduling
                                  divisible jobs on hypercubes . . . . . . 1945--1956
             Sergio De Agostino   Short communication: a parallel decoding
                                  algorithm for LZ2 data compression . . . 1957--1961
        Chandra N. Sekharan and   
                Vineet Goel and   
                     R. Sridhar   Load balancing methods for ray tracing
                                  and binary tree computing using PVM  . . 1963--1978
               Gerhard Globisch   On an automatically parallel generation
                                  technique for tetrahedral meshes . . . . 1979--1995
                     Murray Dow   Transposing a matrix on a vector
                                  computer . . . . . . . . . . . . . . . . 1997--2005

Parallel Computing
Volume 22, Number 1, February 20, 1996

                     Bruno Lang   Parallel reduction of banded matrices to
                                  bidiagonal form  . . . . . . . . . . . . 1--18
         Francisco Argüello and   
             Margarita Amor and   
               Emilio L. Zapata   FFTs on mesh connected computers . . . . 19--38
               S. A. Savari and   
                D. P. Bertsekas   Finite termination of asynchronous
                                  iterative algorithms . . . . . . . . . . 39--56
                  E. de Sturler   A performance model for Krylov subspace
                                  methods on mesh-based parallel computers 57--74
             Himanshu Gupta and   
                  P. Sadayappan   Communication-efficient matrix
                                  multiplication on hypercubes . . . . . . 75--99
                 A. Baronio and   
                        F. Zama   A domain decomposition technique for
                                  spline image restoration on distributed
                                  memory systems . . . . . . . . . . . . . 101--110
              Donald Dabdub and   
               John H. Seinfeld   Parallel computation in atmospheric
                                  chemical modeling  . . . . . . . . . . . 111--130
                  R. Hempel and   
                  R. Calkin and   
                    R. Hess and   
                 W. Joppich and   
            C. W. Oosterlee and   
                H. Ritzdorf and   
                  P. Wypior and   
                 W. Ziegler and   
                   N. Koike and   
                  T. Washio and   
                      U. Keller   Real applications on the new parallel
                                  system NEC Cenju-3 . . . . . . . . . . . 131--148
                    Andreas Uhl   Wavelet packet best basis selection on
                                  moderate parallel MIMD architectures . . 149--158

Parallel Computing
Volume 22, Number 2, April 5, 1996

            C. S. Ierotheou and   
              S. P. Johnson and   
                   M. Cross and   
                  P. F. Leggett   Computer aided parallelisation tools
                                  (CAPTools) --- conceptual overview and
                                  performance on the parallelisation of
                                  structured mesh codes  . . . . . . . . . 163--195
              S. P. Johnson and   
                   M. Cross and   
                  M. G. Everett   Exploitation of symbolic information in
                                  interprocedural dependence analysis  . . 197--226
              S. P. Johnson and   
            C. S. Ierotheou and   
                       M. Cross   Automatic parallel code generation for
                                  message passing on distributed memory
                                  systems  . . . . . . . . . . . . . . . . 227--258
              P. F. Leggett and   
             A. T. J. Marsh and   
              S. P. Johnson and   
                       M. Cross   Integrating user knowledge with
                                  information from parallelisation tools
                                  to facilitate the automatic generation
                                  of efficient parallel FORTRAN code . . . 259--288
                L. Colombet and   
              Ph. Michallon and   
                    D. Trystram   Parallel matrix-vector product on rings
                                  with a minimum of communications . . . . 289--310
                 Yu-Hua Lee and   
             Shi-Jinn Horng and   
             Tzong-Wann Kao and   
            Ferng-Shi Jaung and   
             Yuung-Jih Chen and   
                 Horng-Ren Tsai   Parallel computation of exact Euclidean
                                  distance transform . . . . . . . . . . . 311--325
           Theodore Johnson and   
           Timothy A. Davis and   
             Steven M. Hadfield   A concurrent dynamic task graph  . . . . 327--333

Parallel Computing
Volume 22, Number 3, April 29, 1996

                   Jingling Xue   Transformations of nested loops with
                                  non-convex iteration spaces  . . . . . . 339--368
               Bruce Boldon and   
               Narsingh Deo and   
                   Nishit Kumar   Minimum-weight degree-constrained
                                  spanning tree problem: Heuristics and
                                  implementation on an SIMD parallel
                                  machine  . . . . . . . . . . . . . . . . 369--382
                  Peter Fiebach   Cyclic block-algorithms for solving
                                  triangular systems on distributed-memory
                                  multiprocessors with mesh topology . . . 383--393
               Imtiaz Ahmad and   
             Muhammad K. Dhodhi   Multiprocessor scheduling in a genetic
                                  paradigm . . . . . . . . . . . . . . . . 395--406
               D. Moncrieff and   
              R. E. Overill and   
                      S. Wilson   Heterogeneous computing machines and
                                  Amdahl's law . . . . . . . . . . . . . . 407--413
           Roland Wismüller and   
          Michael Oberhuber and   
             Johann Krammer and   
                    Olav Hansen   Interactive debugging and performance
                                  analysis of massively parallel
                                  applications . . . . . . . . . . . . . . 415--442
                 F. Gutbrod and   
                   N. Attig and   
                       M. Weber   The SU(2)-Lattice Gauge Theory
                                  simulation code on the Intel Paragon
                                  supercomputer  . . . . . . . . . . . . . 443--463
                  M. M. Shearer   Computational optimization of finite
                                  difference methods on the CM5  . . . . . 465--481

Parallel Computing
Volume 22, Number 4, June 11, 1996

              Samuel Kortas and   
                 Philippe Angot   A practical and portable model of
                                  programming for iterative solvers on
                                  distributed memory machines  . . . . . . 487--512
                    S. Oliveira   Parallel multigrid methods for transport
                                  equations: the anisotropic case  . . . . 513--537
                 Markus Hegland   Real and complex fast Fourier transforms
                                  on the Fujitsu VPP 500 . . . . . . . . . 539--553
               Roni Khardon and   
              Shlomit S. Pinter   Partitioning and scheduling to
                                  counteract overhead  . . . . . . . . . . 555--593
        Sotirios G. Ziavras and   
                 Arup Mukherjee   Data broadcasting and reduction, prefix
                                  computation, and sorting on reduced
                                  hypercube parallel computers . . . . . . 595--606
                       Lin Chen   Partitioning graphs into Hamiltonian
                                  ones . . . . . . . . . . . . . . . . . . 607--618

Parallel Computing
Volume 22, Number 5, August 8, 1996

         A. T. Chronopoulos and   
                  C. D. Swanson   Parallel iterative ${S}$-step methods
                                  for unsymmetric linear systems . . . . . 623--641
                D. Conforti and   
                 L. De Luca and   
             L. Grandinetti and   
                    R. Musmanno   A parallel implementation of automatic
                                  differentiation for partially separable
                                  functions using PVM  . . . . . . . . . . 643--656
                Y. Trémolet and   
                 F.-X. Le Dimet   Parallel algorithms for variational data
                                  assimilation and coupling models . . . . 657--674
                  Dugki Min and   
                  Matt W. Mutka   A model for analyzing interactions in
                                  $2$-D mesh wormhole-routed
                                  multicomputers . . . . . . . . . . . . . 675--699
                Borut Robič and   
                 Boštjan Vilfan   Improved schemes for mapping arbitrary
                                  algorithms onto processor meshes . . . . 701--724
               Klaus Stüben and   
        Hermann Mierendorff and   
       Clemens-August Thole and   
                    Owen Thomas   Industrial parallel computing with real
                                  codes  . . . . . . . . . . . . . . . . . 725--737
    Umakishore Ramachandran and   
                Gautam Shah and   
               S. Ravikumar and   
      Jeyakumar Muthukumarasamy   Scalability study of the KSR-1 . . . . . 739--759
               G. Fabbretti and   
                  A. Farina and   
               D. Laforenza and   
                     F. Vinelli   Mapping the synthetic aperture radar
                                  signal processor on a distributed-memory
                                  MIMD architecture  . . . . . . . . . . . 761--784

Parallel Computing
Volume 22, Number 6, September 20, 1996

              William Gropp and   
                 Ewing Lusk and   
                Nathan Doss and   
               Anthony Skjellum   High-performance, portable
                                  implementation of the MPI Message
                                  Passing Interface Standard . . . . . . . 789--828
                   Y. F. Hu and   
              D. R. Emerson and   
                    R. J. Blake   The communication performance of the
                                  Cray T3D and its effect on iterative
                                  solvers  . . . . . . . . . . . . . . . . 829--844
               M. Chandwani and   
                N. S. Chaudhari   Formulation and analysis of parallel
                                  context-free recognition and parsing on
                                  a PRAM model . . . . . . . . . . . . . . 845--868
              Mats Brorsson and   
                  Per Stenström   Characterising and modelling shared
                                  memory accesses in multiprocessor
                                  programs . . . . . . . . . . . . . . . . 869--893
         Sanjeev R. Rastogi and   
               Norman J. Wagner   A parallel algorithm for Lees-Edwards
                                  boundary conditions  . . . . . . . . . . 895--901
           Leszek Gąsieniec and   
                   Andrzej Pelc   Adaptive broadcasting with faulty nodes  903--912

Parallel Computing
Volume 22, Number 7, October 1, 1996

                  Zhiwei Xu and   
                      Kai Hwang   Early prediction of MPP performance: The
                                  SP2, T3D, and Paragon experiences  . . . 917--942
                     S. Lanteri   Parallel solutions of compressible flows
                                  using overlapping and non-overlapping
                                  mesh partitioning strategies . . . . . . 943--968
           Mark A. Franklin and   
               Vasudha Govindan   A general matrix iterative model for
                                  dynamic load balancing . . . . . . . . . 969--989
      Paraskevi Fragopoulou and   
                   Selim G. Akl   Spanning subgraphs with applications to
                                  communication on the multidimensional
                                  torus network  . . . . . . . . . . . . . 991--1015
             N. Bassiliades and   
                    I. Vlahavas   Hierarchical query execution in a
                                  parallel object-oriented database system 1017--1048

Parallel Computing
Volume 22, Number 8, October 28, 1996

                M. Surridge and   
            D. J. Tildesley and   
                 Y. C. Kong and   
                    D. B. Adolf   Practical aspects and experiences. A
                                  parallel molecular dynamics simulation
                                  code for dialkyl cationic surfactants    1053--1071
          Frank C. Wimberly and   
         Michael H. Lambert and   
        Nicholas A. Nystrom and   
            Alex Ropelewski and   
                  William Young   Porting third-party applications
                                  packages to the Cray T3D: Programming
                                  issues and scalability results . . . . . 1073--1089
    Josep-Lluis Larriba-Pey and   
            Juan J. Navarro and   
                Angel Jorba and   
                     Oriol Roig   Review of general and Toeplitz vector
                                  bidiagonal solvers . . . . . . . . . . . 1091--1125 (or 1091--1126??)
                Peter K. K. Loh   Artificial intelligence search
                                  techniques as fault-tolerant routing
                                  strategies . . . . . . . . . . . . . . . 1127--1147
             H. H. ten Cate and   
            E. A. H. Vollebregt   On the portability and efficiency of
                                  parallel algorithms and software . . . . 1149--1163

Parallel Computing
Volume 22, Number 9, November 22, 1996

    Ignacio Martín Llorente and   
           Francisco Tirado and   
                   Luis Vázquez   Some aspects about the scalability of
                                  scientific applications on parallel
                                  architectures  . . . . . . . . . . . . . 1169--1195
       Goran Lj. Djordjević and   
               Milorad B. Tošić   A heuristic for scheduling task graphs
                                  with communication delays onto
                                  multiprocessors  . . . . . . . . . . . . 1197--1214
               Jerry C. Yan and   
             Sekhar R. Sarukkai   Analyzing parallel program performance
                                  using normalized performance indices and
                                  trace transformation techniques  . . . . 1215--1237
              Abdel Aziz Farrag   New algorithm for constructing
                                  fault-tolerant solutions of the
                                  circulant graph configuration  . . . . . 1239--1253 (or 1239--1254??)
                      C. Calvin   Implementation of parallel FFT
                                  algorithms on distributed memory
                                  machines with a minimum overhead of
                                  communication  . . . . . . . . . . . . . 1255--1279
       Maria Antonietta Pirozzi   A fast numerical method for mildly
                                  nonlinear parabolic initial boundary
                                  value problems. II: The parallel
                                  implementation on the Intel Touchstone
                                  Delta system . . . . . . . . . . . . . . 1281--1285

Parallel Computing
Volume 22, Number 10, December 15, 1996

             K. A. Gallivan and   
              B. A. Marsolf and   
              H. A. G. Wijshoff   Solving large nonsymmetric sparse linear
                                  systems using MCSPARSE . . . . . . . . . 1291--1333
                       S. Hioki   Construction of staples in lattice gauge
                                  theory on a parallel computer  . . . . . 1335--1344
          Rabi N. Mahapatra and   
              Sudipta Mahapatra   Mapping of neural network models onto
                                  two-dimensional processor arrays . . . . 1345--1357
              Piyush Maheshwari   Improving granularity and locality of
                                  data in multiprocessor execution of
                                  functional programs  . . . . . . . . . . 1359--1372
               Michèle Dion and   
                    Yves Robert   Mapping affine loop nests  . . . . . . . 1373--1397
             Ingmar Neumann and   
              Wolfgang Wilhelmi   A parallel algorithm for achieving the
                                  Smith normal form of an integer matrix   1399--1412
                  C. Calvin and   
                    L. Colombet   Performance evaluation and modeling of
                                  collective communications on Cray T3D    1413--1427

Parallel Computing
Volume 22, Number 11, January 26, 1997

             Yasushi Shinjo and   
                 Yasushi Kiyoki   A lightweight process facility
                                  supporting meta-level programming  . . . 1429--1454
                A. Cichocki and   
                    A. Bargiela   Neural networks for solving linear
                                  inequality systems . . . . . . . . . . . 1455--1475
                   M. Hamdi and   
                      C. K. Lee   Dynamic load-balancing of image
                                  processing applications on clusters of
                                  workstations . . . . . . . . . . . . . . 1477--1492
                    N. P. Kruyt   A conjugate gradient method for the
                                  spectral partitioning of graphs  . . . . 1493--1502
                    R. Hess and   
                     W. Joppich   A comparison of parallel multigrid and a
                                  fast Fourier transform algorithm for the
                                  solution of the Helmholtz equation in
                                  numerical weather prediction . . . . . . 1503--1512
              William Gropp and   
                     Ewing Lusk   A high-performance MPI implementation on
                                  a shared-memory vector supercomputer . . 1513--1526
                 Bodo Heise and   
                   Michael Jung   Parallel solvers for nonlinear elliptic
                                  problems based on domain decomposition
                                  ideas  . . . . . . . . . . . . . . . . . 1527--1544
              Edward Walker and   
                Gary Morgan and   
                 Bruce Cass and   
              Zygmunt Ulanowski   A note on compiling FORTRAN loop kernels
                                  onto a dataflow architecture . . . . . . 1545--1557

Parallel Computing
Volume 22, Number 12, February 21, 1997

                Dominique Barth   Parallel matrix product algorithm in the
                                  de Bruijn network using emulation of
                                  meshes of trees  . . . . . . . . . . . . 1563--1578
                Jong-Uk Kim and   
              Kyu-Hyun Shim and   
                    Kyu Ho Park   A link-disjoint subcube for processor
                                  allocation in hypercube computers  . . . 1579--1595
              Dale M. Slone and   
              Garry H. Rodrigue   Efficient biased random bit generation
                                  for parallel lattice gas simulations . . 1597--1620
                   Jingling Xue   Unimodular transformations of
                                  non-perfectly nested loops . . . . . . . 1621--1645
           David J. Jackson and   
              Chris W. Humphres   A simple yet effective load balancing
                                  extension to the PVM software system . . 1647--1660
               S. Mahapatra and   
            R. N. Mahapatra and   
                B. N. Chatterji   A parallel formulation of
                                  back-propagation learning on distributed
                                  memory multiprocessors . . . . . . . . . 1661--1675
              Satoko Sakata and   
            Umpei Nagashima and   
             Mitsuhisa Sato and   
          Satoshi Sekiguchi and   
                   Haruo Hosoya   Performance evaluation of a workstation
                                  cluster, TMC CM-5, and Intel Paragon/XP
                                  using a parallel homology analysis
                                  program  . . . . . . . . . . . . . . . . 1677--1693

Parallel Computing
Volume 22, Number 13, February 28, 1997

                  G. Haring and   
                  P. Kacsuk and   
                      G. Kotsis   Distributed and parallel systems:
                                  Environments and tools . . . . . . . . . 1699--1701
                  G. Chiola and   
                     G. Ciaccio   Implementing a low cost, low latency
                                  parallel platform  . . . . . . . . . . . 1703--1717
               F. Bergadano and   
            A. Giallombardo and   
               A. Puliafito and   
                   G. Ruffo and   
                        L. Vita   Security agents for information
                                  retrieval in distributed systems . . . . 1719--1731
                Rushed Kanawati   LICRA: a replicated-data management
                                  algorithm for distributed synchronous
                                  group-ware applications  . . . . . . . . 1733--1746
               Péter Kacsuk and   
              José C. Cunha and   
                Gábor Dózsa and   
              João Lourenço and   
              Tibor Fadgyas and   
                    Tiago Antão   A graphical development and debugging
                                  environment for parallel programs  . . . 1747--1770
                Gabriele Kotsis   A systematic approach for workload
                                  modeling for parallel processing systems 1771--1787
                   J. Lüthi and   
                S. Majumdar and   
                  G. Kotsis and   
                      G. Haring   Performance bounds for distributed
                                  systems with workload variabilities and
                                  uncertainties  . . . . . . . . . . . . . 1789--1806
               Tamás Bartha and   
                  Endre Selényi   Probabilistic system-level fault
                                  diagnostic algorithms for
                                  multiprocessors  . . . . . . . . . . . . 1807--1821
                T. Delaitre and   
        G. R. Ribeiro-Justo and   
                   F. Spies and   
                   S. C. Winter   A graphical toolset for simulation
                                  modelling of parallel systems  . . . . . 1823--1836
                  H. Wabnig and   
                      G. Haring   PAPS --- a testbed for performance
                                  prediction of parallel applications  . . 1837--1851
               Péter Kacsuk and   
               Zsolt Németh and   
                   Zsolt Puskás   Tools for mapping, load balancing and
                                  monitoring in the LOGFLOW parallel
                                  Prolog project . . . . . . . . . . . . . 1853--1881
                   E. Morel and   
                   J. Briat and   
      J. Chassin de Kergommeaux   Cuts and side-effects in distributed
                                  memory OR-parallel Prolog  . . . . . . . 1883--1896
              Szabolcs Ferenczi   Parallel execution of object-oriented
                                  programs: Message handling strategies    1897--1912
         László Böszörményi and   
                Karl-Heinz Eder   M3Set --- a language for handling of
                                  distributed and persistent sets of
                                  objects  . . . . . . . . . . . . . . . . 1913--1925

Parallel Computing
Volume 22, Number 14, March 24, 1997

              Xiaodong Wang and   
      Vwani P. Roychowdhury and   
            Pratheep Balasingam   Practical aspects and experiences.
                                  Scalable massively parallel algorithms
                                  for computational nanoelectronics  . . . 1931--1963
Anthony Theodore Chronopoulos and   
                      Gang Wang   Practical aspects and experiences.
                                  Parallel solution of a traffic flow
                                  simulation problem . . . . . . . . . . . 1965--1983
             Der-Chyuan Lou and   
                Chin-Chen Chang   A parallel two-list algorithm for the
                                  knapsack problem . . . . . . . . . . . . 1985--1996
                 M. A. Amer and   
         B. A. Abdel-Hamida and   
                     D. Fausett   Parallel implementation of the Kronecker
                                  product technique for numerical solution
                                  of parabolic partial differential
                                  equations  . . . . . . . . . . . . . . . 1997--2005
          Edward A. Billard and   
             Joseph C. Pasquale   Load balancing to adjust for proximity
                                  in some network topologies . . . . . . . 2007--2023
            D. M. Dhamdhere and   
            Sridhar R. Iyer and   
         E. Kishore Kumar Reddy   Distributed termination detection for
                                  dynamic systems  . . . . . . . . . . . . 2025--2045
          Srabani Sen Gupta and   
               Rajib K. Das and   
   Krishnendu Mukhopadhyaya and   
               Bhabani P. Sinha   A family of network topologies with
                                  multiple loops and logarithmic diameter  2047--2064

Parallel Computing
Volume 23, Number 1--2, April 16, 1997

                J. Dongarra and   
                 B. Tourancheau   Workshop on environments and tools for
                                  parallel scientific computing  . . . . . 1
                   Tony Hey and   
            Alistair Dunlop and   
               Emilio Hernández   Realistic parallel performance
                                  estimation . . . . . . . . . . . . . . . 5--21
              Jesus Labarta and   
               Sergi Girona and   
                    Toni Cortes   Analyzing scheduling policies using
                                  Dimemas  . . . . . . . . . . . . . . . . 23--34
         Gilles Berger Sabbatel   Hardware solutions for efficient
                                  distributed computing on ATM networks    35--48
           Jack J. Dongarra and   
            Sven Hammarling and   
                David W. Walker   Key concepts for parallel out-of-core $L
                                  U$ factorization . . . . . . . . . . . . 49--70
                 T. Brandes and   
               S. Chaumette and   
              M. C. Counilh and   
                   J. Roman and   
                   A. Darte and   
                 F. Desprez and   
                   J. C. Mignot   HPFIT: a set of integrated tools for the
                                  parallelization of applications using
                                  High Performance Fortran. Part I: HPFIT
                                  and the TransTOOL environment  . . . . . 71--87
                 T. Brandes and   
               S. Chaumette and   
              M. C. Counilh and   
                   J. Roman and   
                 F. Desprez and   
                   J. C. Mignot   HPFIT: a set of integrated tools for the
                                  parallelization of applications using
                                  High Performance Fortran. Part II:
                                  Data-structure visualization and HPF
                                  extensions for irregular problems  . . . 89--105
                    Loïc Prylli   The CAPDYN environment and its
                                  message-passing library implementation   107--120
                 Vaidy Sunderam   Heterogeneous network computing: The
                                  next generation  . . . . . . . . . . . . 121--135
          El Mostafa Daoudi and   
             Abdelhak Lakhouaja   Exploiting the symmetry in the
                                  parallelization of the Jacobi method . . 137--151
            François Pellegrini   Graph partitioning based methods and
                                  tools for scientific computing . . . . . 153--164
          Jean-Yves Berthou and   
               Laurent Colombet   Which approach to parallelizing
                                  scientific codes --- That is the
                                  question . . . . . . . . . . . . . . . . 165--179
         Karen L. Karavanic and   
            Jussi Myllymaki and   
                Miron Livny and   
               Barton P. Miller   Integrated visualization of parallel
                                  program performance data . . . . . . . . 181--198
            D. Kranzlmüller and   
                 S. Grabner and   
                     J. Volkert   Debugging with the MAD environment . . . 199--217
               Bruno Gaujal and   
           Alain Jean-Marie and   
             Philippe Mussi and   
                 Gunther Siegel   High speed simulation of discrete event
                                  systems by mixing process oriented and
                                  equational approaches  . . . . . . . . . 219--233
                Laurent Lefèvre   Parallel programming on top of DSM
                                  system. An experimental study  . . . . . 235--249
        Pierre-Yves Calland and   
                Alain Darte and   
                Yves Robert and   
                Frederic Vivien   Plugging anti and output dependence
                                  removal techniques into loop
                                  parallelization algorithm  . . . . . . . 251--266

Parallel Computing
Volume 23, Number 3, May 15, 1997

            Timo Hamalainen and   
              Harri Klapuri and   
             Jukka Saarinen and   
                    Kimmo Kaski   Mapping of SOM and LVQ algorithms on a
                                  tree shape parallel computer system  . . 271--289
             Chao-Tung Yang and   
         Shian-Shyong Tseng and   
           Cheng-Der Chuang and   
                 Wen-Chung Shih   Using knowledge-based techniques on loop
                                  parallelization for parallelizing
                                  compilers  . . . . . . . . . . . . . . . 291--309
             Yuh-Shyan Chen and   
                 Jang-Ping Sheu   Tolerating faults in injured hypercubes
                                  using maximal fault-free subcube-ring  . 311--331
              Plamen Y. Yalamov   Stability of a partitioning algorithm
                                  for bidiagonal systems . . . . . . . . . 333--348
                  Sung Kwon Kim   Rectangulating rectilinear polygons in
                                  parallel . . . . . . . . . . . . . . . . 349--367
                     C. K. Yuen   Parallel programming --- a critique  . . 369--380
               A. Basermann and   
                 B. Reichel and   
                  C. Schelthoff   Preconditioned CG methods for sparse
                                  matrices on massively parallel machines  381--398

Parallel Computing
Volume 23, Number 4--5, May 23, 1997

            David E. Womble and   
             David S. Greenberg   Parallel I/O: an introduction  . . . . . 403--417
            Ethan L. Miller and   
                  Randy H. Katz   RAMA: An easy-to-use, high-performance
                                  parallel file system . . . . . . . . . . 419--446
            Nils Nieuwejaar and   
                     David Kotz   The Galley parallel file system  . . . . 447--476
             Jason A. Moore and   
               Michael J. Quinn   Enhancing disk-directed I/O for
                                  fine-grained redistribution of file data 477--499
            Eric J. Schwabe and   
          Ian M. Sutherland and   
                Bruce K. Holmer   Evaluating approximately balanced
                                  parity-declustered data layouts for disk
                                  arrays . . . . . . . . . . . . . . . . . 501--523
               J. Carretero and   
                   F. Pérez and   
               P. de Miguel and   
                  F. García and   
                      L. Alonso   Performance increase mechanisms for
                                  parallel and distributed file systems    525--542
                Ian Parsons and   
                  Ron Unrau and   
         Jonathan Schaeffer and   
                  Duane Szafron   PI/OT: Parallel I/O templates  . . . . . 543--570
           Thomas H. Cormen and   
                Melissa Hirschl   Early experiences in evaluating the
                                  parallel disk model with the ViC*
                                  implementation . . . . . . . . . . . . . 571--600
            Rakesh D. Barve and   
            Edward F. Grove and   
           Jeffrey Scott Vitter   Simple randomized mergesort on parallel
                                  disks  . . . . . . . . . . . . . . . . . 601--631

Parallel Computing
Volume 23, Number 6, June 20, 1997

                  M. Pakzad and   
                J. L. Lloyd and   
                    C. Phillips   Independent columns: a new parallel ILU
                                  preconditioner for the PCG method  . . . 637--647 (or 637--648??)
        Mohan K. Kadalbajoo and   
                  A. Appaji Rao   Parallel group explicit method for
                                  two-dimensional parabolic equations  . . 649--666
                   J. Lopez and   
                   O. Plata and   
                F. Arguello and   
                   E. L. Zapata   Unified framework for the
                                  parallelization of divide and conquer
                                  based tridiagonal systems  . . . . . . . 667--686
                Sergei Gorlatch   $N$-graphs: Scalable topology and design
                                  of balanced divide-and-conquer
                                  algorithms . . . . . . . . . . . . . . . 687--698
                 M. Cermele and   
                   M. Colajanni   Non-uniform and dynamic domain
                                  decompositions for hypercomputing  . . . 699--720
               Roman Trobec and   
                 Izidor Jerebic   Local diagnosis in massively parallel
                                  systems  . . . . . . . . . . . . . . . . 721--731
                   G. Mitra and   
                     I. Hai and   
                   M. T. Hajian   A distributed processing algorithm for
                                  solving integer programs using a cluster
                                  of workstations  . . . . . . . . . . . . 733--753
               Jiahong Wang and   
                     Jie Li and   
                   Hisao Kameda   Simulation studies on concurrency
                                  control in parallel transaction
                                  processing systems . . . . . . . . . . . 755--775
           Neeraj K. Sharma and   
          Madhusudhana R. Pinnu   An efficient implementation of bypass
                                  queue under bursty traffic . . . . . . . 777--781
                   Ishfaq Ahmad   Express versus PVM: a performance
                                  comparison . . . . . . . . . . . . . . . 783--812
                      Anonymous   Miscellaneous: Calendar of forthcoming
                                  conferences and events . . . . . . . . . 813

Parallel Computing
Volume 23, Number 7, July 14, 1997

                A. Chalmers and   
                   F. W. Jansen   Parallel graphics and visualisation  . . 817
             Thomas W. Crockett   An introduction to parallel rendering    819--843
               Alan Heirich and   
                     James Arvo   Scalable Monte Carlo image synthesis . . 845--859
              Hyeon-Ju Yoon and   
               Seongbae Eun and   
                   Jung Wan Cho   Image parallel ray tracing using static
                                  load balancing and data prefetching  . . 861--872
              Erik Reinhard and   
             Frederik W. Jansen   Rendering large scenes using parallel
                                  ray tracing  . . . . . . . . . . . . . . 873--885
              Bruno Arnaldi and   
              Thierry Priol and   
               Luc Renambot and   
                   Xavier Pueyo   Visibility masks for solving complex
                                  radiosity computations on
                                  multiprocessors  . . . . . . . . . . . . 887--897
          Christophe Renaud and   
             François Rousselle   Fast massively parallel progressive
                                  radiosity on the MP-1  . . . . . . . . . 899--913
         Anton H. J. Koning and   
        Karel J. Zuiderveld and   
               Max A. Viergever   Volume visualization on shared memory
                                  architectures  . . . . . . . . . . . . . 915--925
         Rüdiger Westermann and   
                    Thomas Ertl   Distributed volume visualization: a step
                                  towards integrated data analysis and
                                  image synthesis  . . . . . . . . . . . . 927--941
                 Cemal Köse and   
                  Alan Chalmers   Profiling for efficient parallel volume
                                  visualisation  . . . . . . . . . . . . . 943--952
                 David C. Banks   Screen-parallel determination of
                                  intersection curves  . . . . . . . . . . 953--960
              Michael Krogh and   
              James Painter and   
                 Charles Hansen   Parallel sphere rendering  . . . . . . . 961--974
              Malte Zöckler and   
            Detlev Stalling and   
            Hans-Christian Hege   Parallel line integral convolution . . . 975--989
               Shaun Bangay and   
                 James Gain and   
               Greg Watkins and   
                  Kevan Watkins   Building the second generation of
                                  parallel/distributed virtual reality
                                  systems  . . . . . . . . . . . . . . . . 991--1000

Parallel Computing
Volume 23, Number 8, July 25, 1997

                     Guangye Li   A block variant of the GMRES method on
                                  massively parallel processors  . . . . . 1005--1019
                 P. Beraldi and   
                   F. Guerriero   Parallel asynchronous implementation of
                                  the $\epsilon$-relaxation method for the
                                  linear minimum cost flow problem . . . . 1021--1044
                 Padma Raghavan   Parallel ordering using edge contraction 1045--1067
           Soren S. Nielsen and   
              Stavros A. Zenios   Scalable parallel Benders decomposition
                                  for stochastic linear programming  . . . 1069--1088
                 Ajit Singh and   
             Vincent Van Dongen   An integrated performance analysis tool
                                  for SPMD data-parallel programs  . . . . 1089--1112
              Svetozara Petrova   Parallel implementation of fast elliptic
                                  solver . . . . . . . . . . . . . . . . . 1113--1128
         S. Chandra Sekhara Rao   Existence and uniqueness of WZ
                                  factorization  . . . . . . . . . . . . . 1129--1139
                   Xin Wang and   
             Edward K. Blum and   
            D. Stott Parker and   
                  Daniel Massey   The dance party problem and its
                                  application to collective communication
                                  in computer networks . . . . . . . . . . 1141--1156
              D. C. Hodgson and   
                   P. K. Jimack   A domain decomposition preconditioner
                                  for a parallel finite element solver on
                                  distributed unstructured grids . . . . . 1157--1181
         Mouloud Oussaidène and   
            Bastien Chopard and   
          Olivier V. Pictet and   
                Marco Tomassini   Parallel genetic programming and its
                                  application to trading model induction   1183--1198
             Marco D'Apuzzo and   
              Marco Lapegna and   
                 Almerico Murli   Scalability and load balancing in
                                  adaptive algorithms for multidimensional
                                  integration  . . . . . . . . . . . . . . 1199--1210

Parallel Computing
Volume 23, Number 9, November 3, 1997

           Michael Eldredge and   
        Thomas J. R. Hughes and   
          Robert M. Ferencz and   
            Steven M. Rifai and   
             Arthur Raefsky and   
                  Bruce Herndon   High-performance parallel computing in
                                  industry . . . . . . . . . . . . . . . . 1217--1233
                   V. Kalro and   
                    T. Tezduyar   Parallel $3$D computation of unsteady
                                  flows around circular cylinders  . . . . 1235--1248
               Y. Matsumoto and   
                    T. Tokumasu   Parallel computing of diatomic molecular
                                  rarefied gas flows . . . . . . . . . . . 1249--1260
                L. Paglieri and   
                 D. Ambrosi and   
               L. Formaggia and   
              A. Quarteroni and   
                A. L. Scheinine   Parallel computation for shallow water
                                  flow: a domain decomposition approach    1261--1277
                  S. E. Ray and   
                 G. P. Wren and   
                 T. E. Tezduyar   Parallel implementations of a finite
                                  element formulation for fluid-structure
                                  interactions in interior flows . . . . . 1279--1292
                N. Satofuka and   
                   M. Obata and   
                      T. Suzuki   Parallel computation of
                                  super-/hypersonic flows on workstation
                                  network and Transputer arrays  . . . . . 1293--1305
                John Shadid and   
           Scott Hutchinson and   
              Gary Hennigan and   
               Harry Moffat and   
               Karen Devine and   
                 A. G. Salinger   Efficient parallel computation of
                                  unstructured finite element reacting
                                  flow solutions . . . . . . . . . . . . . 1307--1325
             M. S. Shephard and   
             J. E. Flaherty and   
             C. L. Bottasso and   
            H. L. de Cougny and   
                 C. Ozturan and   
                   M. L. Simone   Parallel automatic adaptive analysis . . 1327--1347
                T. Tezduyar and   
                   V. Kalro and   
                     W. Garrard   Parallel computational methods for $3$D
                                  simulation of a parafoil with prescribed
                                  shape changes  . . . . . . . . . . . . . 1349--1363
               Genki Yagawa and   
        Yasushi Nakabayashi and   
                  Hiroshi Okuda   Large-scale finite element fluid
                                  analysis by massively parallel
                                  processors . . . . . . . . . . . . . . . 1365--1377
              Andrew Yeckel and   
               Jeffrey J. Derby   Parallel computation of incompressible
                                  flows in materials processing: Numerical
                                  experiments in diagonal preconditioning  1379--1400

Parallel Computing
Volume 23, Number 10, November 7, 1997

            Mark J. Clement and   
               Michael J. Quinn   Automated performance prediction for
                                  scalable parallel computing  . . . . . . 1405--1420
                  P. Arbenz and   
                  W. Gander and   
                      M. Oettli   The remote computation system  . . . . . 1421--1428
              W. J. Gutjahr and   
                    M. Hitz and   
                    T. A. Mueck   Task assignment in Cayley
                                  interconnection topologies . . . . . . . 1429--1460
            Aiichiro Nakano and   
               Timothy Campbell   Adaptive curvilinear-coordinate approach
                                  to dynamic load balancing of parallel
                                  multiresolution molecular dynamics . . . 1461--1478
               Fabio Ancona and   
            Stefano Rovetta and   
                 Rodolfo Zunino   Transputer-based implementation of
                                  distributed associative memories . . . . 1479--1491
                E. W. Evans and   
              S. P. Johnson and   
              P. F. Leggett and   
                       M. Cross   Automatic code generation of overlapped
                                  communications in a parallelisation tool 1493--1523
                    X. Yuan and   
               C. Salisbury and   
                 D. Balsara and   
                      R. Melhem   Load balancing package on distributed
                                  memory systems and its application to
                                  particle-particle particle-mesh (P3M)
                                  methods  . . . . . . . . . . . . . . . . 1525--1544
               M. S. Bebbington   Parallel implementation of an
                                  aggregation/disaggregation method for
                                  evaluating quasi-stationary behavior in
                                  continuous-time Markov chains  . . . . . 1545--1559

Parallel Computing
Volume 23, Number 11, December 1, 1997

                  M. Kutrib and   
                 R. Vollmar and   
                     Th. Worsch   Introduction to the special issue on
                                  cellular automata  . . . . . . . . . . . 1567--1576
             J.-P. Allouche and   
             F. v. Haeseler and   
                   E. Lange and   
                A. Petersen and   
                     G. Skordev   Linear cellular automata and automatic
                                  sequences  . . . . . . . . . . . . . . . 1577--1592
                G. Cattaneo and   
                E. Formenti and   
                 L. Margara and   
                       G. Mauri   Transformations of the one-dimensional
                                  cellular automata rule space . . . . . . 1593--1611
                   Klaus Sutner   Linear cellular automata and Fischer
                                  automata . . . . . . . . . . . . . . . . 1613--1634
               Mario Markus and   
                 Tomas Hahn and   
                     Ingo Kusch   A novel quantification of cellular
                                  automata . . . . . . . . . . . . . . . . 1635--1642
            Thomas Buchholz and   
                  Martin Kutrib   Some relations between massively
                                  parallel arrays  . . . . . . . . . . . . 1643--1662
                   Olivier Heen   Efficient constant speed-up for one
                                  dimensional cellular automata
                                  calculators  . . . . . . . . . . . . . . 1663--1671
            Paola Flocchini and   
            Frédéric Geurts and   
                 Nicola Santoro   CA-like error propagation in fuzzy CA    1673--1682
                  Thomas Worsch   On parallel Turing machines with
                                  multi-head control units . . . . . . . . 1683--1697
                 Jörg R. Weimar   Cellular automata for
                                  reaction--diffusion systems  . . . . . . 1699--1715

Parallel Computing
Volume 23, Number 12, December 15, 1997

              Divyesh Jadav and   
          Chutimet Srinilta and   
                 Alok Choudhary   Batching and dynamic allocation
                                  techniques for increasing the stream
                                  capacity of an on-demand media server    1727--1742
                Jinsung Cho and   
                  Heonshik Shin   Scheduling video streams in a
                                  large-scale video-on-demand server . . . 1743--1755
          Valentin Rottmann and   
           Petra Berenbrink and   
                Reinhard Luling   Simple distributed scheduling policy for
                                  parallel interactive continuous media
                                  servers  . . . . . . . . . . . . . . . . 1757--1776
          Constantin Arapis and   
                Simon Gibbs and   
          Christian Breiteneder   Real-time segmentation of video on a
                                  multiprocessor platform  . . . . . . . . 1777--1792
         John A. Watlington and   
           V. Michael Bove, Jr.   A system for parallel media processing   1793--1809
              Eddy De Greef and   
           Francky Catthoor and   
                    Hugo De Man   Memory size reduction through storage
                                  order optimization for embedded parallel
                                  multimedia applications  . . . . . . . . 1811--1837 (or 1811--1838??)
                     Wei Li and   
               Xiaohu Huang and   
                  Nanning Zheng   Parallel implementing OpenGL on PVM  . . 1839--1850

Parallel Computing
Volume 23, Number 13, December 15, 1997

         Abdelsalam Heddaya and   
                    Kihong Park   Congestion control for asynchronous
                                  parallel computing on workstation
                                  networks . . . . . . . . . . . . . . . . 1855--1875
                  P. S. Rao and   
                      G. Mouney   Data communication in parallel block
                                  predictor-corrector methods for solving
                                  ODE's  . . . . . . . . . . . . . . . . . 1877--1888
                Weifa Liang and   
                   Xiaojun Shen   Finding the $k$ most vital edges in the
                                  minimum spanning tree problem  . . . . . 1889--1907
                  Yih Huang and   
             Philip K. McKinley   Adaptive global reduction algorithm for
                                  wormhole-routed 2D meshes  . . . . . . . 1909--1936
              Seong-Pyo Kim and   
                    Taisook Han   Fault-tolerant wormhole routing in mesh
                                  with overlapped solid fault regions  . . 1937--1962
           M.-Tahar Kechadi and   
                J.-Luc Dekeyser   Analysis and simulation of an
                                  out-of-order execution model in vector
                                  multiprocessor systems . . . . . . . . . 1963--1986
                      Hong Shen   Optimal parallel multiselection on EREW
                                  PRAM . . . . . . . . . . . . . . . . . . 1987--1992
                   Tong-Yee Lee   Exploitation of image parallelism for
                                  ray tracing $3$D scenes on $2$D mesh
                                  multicomputers . . . . . . . . . . . . . 1993--2015
               Jarmo Rantakokko   Strategies for parallel variational data
                                  assimilation . . . . . . . . . . . . . . 2017--2039
         Michael A. Lambert and   
          Garry H. Rodrigue and   
               Dennis W. Hewett   Parallel DSDADI method for solution of
                                  the steady state diffusion equation  . . 2041--2065
                      Ç. K. Koç   Parallel $p$-adic method for solving
                                  linear systems of equations  . . . . . . 2067--2074
                  Chunguang Sun   Parallel solution of sparse linear least
                                  squares problems on distributed-memory
                                  multiprocessors  . . . . . . . . . . . . 2075--2093
            Daniela di Serafino   Parallel implementation of a multigrid
                                  multiblock Euler solver on distributed
                                  memory machines  . . . . . . . . . . . . 2095--2113
              R. E. Overill and   
                      S. Wilson   Data parallel evaluation of univariate
                                  polynomials by the Knuth-Eve algorithm   2115--2127

Parallel Computing
Volume 23, Number 14, December 17, 1997

                 C. Baillie and   
              J. Michalakes and   
                      R. Skålin   Regional weather modeling on parallel
                                  computers  . . . . . . . . . . . . . . . 2135--2142
               S. J. Thomas and   
             A. V. Malevsky and   
                M. Desgagne and   
                  R. Benoit and   
                P. Pellerin and   
                       M. Valin   Massively parallel implementation of the
                                  mesoscale compressible community model   2143--2160
                  R. Skålin and   
                      D. Bjørge   Implementation and performance of a
                                  parallel version of the HIRLAM limited
                                  area atmospheric model . . . . . . . . . 2161--2172
                  J. Michalakes   MM90: a scalable parallel implementation
                                  of the Penn State/NCAR Mesoscale Model
                                  (MM5)  . . . . . . . . . . . . . . . . . 2173--2186
              Donald Dabdub and   
                  Rajit Manohar   Performance and portability of an air
                                  quality model  . . . . . . . . . . . . . 2187--2200
                M. Ashworth and   
                 F. Foelkel and   
                  V. Gülzow and   
                  K. Kleese and   
                D. P. Eppel and   
                 H. Kapitza and   
                       S. Unger   Parallelization of the GESIMA mesoscale
                                  atmospheric model  . . . . . . . . . . . 2201--2213
           Ulrich Schättler and   
             Elisabeth Krenzien   Parallel `Deutschland-Modell' --- a
                                  message-passing version for distributed
                                  memory computers . . . . . . . . . . . . 2215--2226
          Alan J. Wallcraft and   
                Daniel R. Moore   The NRL layered ocean model  . . . . . . 2227--2242
                  A. Sathye and   
                     M. Xue and   
                 G. Bassett and   
                 K. Droegemeier   Parallel weather modeling with the
                                  advanced regional prediction system  . . 2243--2256

Parallel Computing
Volume 24, Number 1, March 10, 1998

           Thomas H. Cormen and   
                 David M. Nicol   Performing out-of-core FFTs on parallel
                                  disk systems . . . . . . . . . . . . . . 5--20
        Peter Triantafillou and   
             Christos Faloutsos   Overlay striping and optimal parallel
                                  I/O for modern applications  . . . . . . 21--43
             Daniel A. Ford and   
        Robert J. T. Morris and   
                   Alan E. Bell   Redundant arrays of independent
                                  libraries (RAIL): the StarFish tertiary
                                  storage system . . . . . . . . . . . . . 45--64
            Carter T. Shock and   
              Chialin Chang and   
                Bongki Moon and   
             Anurag Acharya and   
                Larry Davis and   
                 Joel Saltz and   
                   Alan Sussman   Design and evaluation of a
                                  high-performance earth science database  65--89
    Shahram Ghandeharizadeh and   
                  Richard Muntz   Design and implementation of scalable
                                  continuous media servers . . . . . . . . 91--122
            Leana Golubchik and   
             John C. S. Lui and   
              Maria Papadopouli   Survey of approaches to fault tolerant
                                  design of VOD servers: techniques,
                                  analysis and comparison  . . . . . . . . 123--155
               Ann L. Chervenak   Challenges for tertiary storage in
                                  multimedia servers . . . . . . . . . . . 157--176

Parallel Computing
Volume 24, Number 2, February 1, 1998

              Manu Konchady and   
                  Arun Sood and   
                 Paul S. Schopf   Implementation and performance
                                  evaluation of a parallel ocean model . . 181--203
                Kangwoo Lee and   
                  Michel Dubois   Empirical models of miss rates . . . . . 205--219
         Luis Díaz de Cerio and   
       Miguel Valero-García and   
               Antonio González   Method for exploiting
                                  communication/computation overlap in
                                  hypercubes . . . . . . . . . . . . . . . 221--245
           Michael E. Houle and   
                   Gavin Turner   Dimension-exchange token distribution on
                                  the mesh and the torus . . . . . . . . . 247--265
                   Jelena Mišić   Unicast-based multicast algorithm in
                                  wormhole-routed star graph
                                  interconnection networks . . . . . . . . 267--286
               K. Sumiyoshi and   
                   T. Ebisuzaki   Performance of parallel solution of a
                                  block-tridiagonal linear system on
                                  Fujitsu VPP500 . . . . . . . . . . . . . 287--304
                S. V. Kuznetsov   Orthogonal reduction of dense matrices
                                  to bidiagonal form on computers with
                                  distributed memory architectures . . . . 305--313

Parallel Computing
Volume 24, Number 3--4, May 1, 1998

            Piyush Mehrotra and   
         John Van Rosendale and   
                      Hans Zima   High Performance Fortran: History,
                                  status and future  . . . . . . . . . . . 325--354
               Henk J. Sips and   
              Will Denissen and   
              Kees van Reeuwijk   Analysis of local enumeration and
                                  storage schemes in HPF . . . . . . . . . 355--382
                 Michael Gerndt   High-level programming of massively
                                  parallel computers based on shared
                                  virtual memory . . . . . . . . . . . . . 383--400
            Brian Armstrong and   
              Seon Wook Kim and   
                Insung Park and   
               Michael Voss and   
               Rudolf Eigenmann   Compiler-based tools for analyzing
                                  parallel programs  . . . . . . . . . . . 401--420
              Pierre Boulet and   
                Alain Darte and   
       Georges-André Silber and   
                Frédéric Vivien   Loop parallelization algorithms: From
                                  parallelism extraction to code
                                  generation . . . . . . . . . . . . . . . 421--444
                 Amy W. Lim and   
                  Monica S. Lam   Maximizing parallelism and minimizing
                                  synchronization with affine partitions   445--475
            Trung N. Nguyen and   
                     Zhiyuan Li   Interprocedural analysis for loop
                                  scheduling and data allocation . . . . . 477--504
               Wolfram Amme and   
             Eberhard Zehendner   Data dependence analysis in programs
                                  with pointers  . . . . . . . . . . . . . 505--525
           Lawrence Rauchwerger   Run-time parallelization: Its time has
                                  come . . . . . . . . . . . . . . . . . . 527--556
             Eduard Ayguadé and   
               Jordi Garcia and   
                  Ulrich Kremer   Tools and techniques for automatic data
                                  layout: a case study . . . . . . . . . . 557--578
          Hironori Kasahara and   
                Akimasa Yoshida   A data-localization compilation scheme
                                  using partial-static task assignment for
                                  Fortran coarse-grain parallel processing 579--596
                M. Kandemir and   
               A. Choudhary and   
               J. Ramanujam and   
                  R. Bordawekar   Compilation techniques for out-of-core
                                  parallel computations  . . . . . . . . . 597--628
              B. Creusillet and   
                     F. Irigoin   Interprocedural analyses of Fortran
                                  programs . . . . . . . . . . . . . . . . 629--648
           Vincent Lefebvre and   
                 Paul Feautrier   Automatic storage management for
                                  parallel programs  . . . . . . . . . . . 649--671

Parallel Computing
Volume 24, Number 5--6, June 1, 1998

                A. Averbuch and   
                   L. Ioffe and   
                 M. Israeli and   
                     L. Vozovoi   Two-dimensional parallel solver for the
                                  solution of Navier--Stokes equations
                                  with constant and variable coefficients
                                  using ADI on cells . . . . . . . . . . . 673--699
                   C. Ceron and   
                  J. Dopazo and   
               E. L. Zapata and   
               J. M. Carazo and   
                     O. Trelles   Parallel implementation of DNAml program
                                  on message-passing architectures . . . . 701--716
                 P. Fisette and   
               J. M. Péterkenne   Contribution to parallel and vector
                                  computation in multibody dynamics  . . . 717--728
                A. A. Mirin and   
             D. E. Shumaker and   
                   M. F. Wehner   Efficient filtering techniques for
                                  finite-difference atmospheric general
                                  circulation models on parallel
                                  processors . . . . . . . . . . . . . . . 729--740
                  R. Aversa and   
                  A. Mazzeo and   
                N. Mazzocca and   
                     U. Villano   Developing applications for
                                  heterogeneous computing environments
                                  using simulation: a case study . . . . . 741--761
            Mostafa M. Aref and   
             Mohammed A. Tayyib   Lana-Match algorithm: a parallel version
                                  of the Rete-Match algorithm  . . . . . . 763--775
                D. J. Evans and   
                     M. Barulli   BSP linear solvers for dense matrices    777--795
               Ananth Grama and   
                Vipin Kumar and   
                    Ahmed Sameh   Scalable parallel formulations of the
                                  Barnes--Hut method for $n$-body
                                  simulations  . . . . . . . . . . . . . . 797--822
                Zdeněk Hanzálek   A parallel algorithm for gradient
                                  training of feedforward neural networks  823--839
           Alain Jean-Marie and   
  Sophie Lefebvre-Barbaroux and   
                       Zhen Liu   An analytical approach to the
                                  performance evaluation of master--slave
                                  computational models . . . . . . . . . . 841--862
                       Zhen Liu   Worst-case analysis of scheduling
                                  heuristics of parallel systems . . . . . 863--891
          Piyush Maheshwari and   
                      Hong Shen   An efficient clustering algorithm for
                                  partitioning parallel programs . . . . . 893--909
                 M. Marrocu and   
             R. Scardovelli and   
                    P. Malguzzi   Parallelization and performance of a
                                  meteorological limited area model  . . . 911--922
               Michael Mascagni   Parallel linear congruential generators
                                  with prime moduli  . . . . . . . . . . . 923--936
              Tz. Ostromsky and   
               P. C. Hansen and   
                      Z. Zlatev   A coarse-grained parallel
                                  $QR$-factorization algorithm for sparse
                                  least squares problems . . . . . . . . . 937--964
                  Sung Kwon Kim   Constant-time RMESH algorithms for the
                                  range minima and co-minima problems  . . 965--977

Parallel Computing
Volume 24, Number 7, July 1, 1998

                   F. Arbab and   
              P. Ciancarini and   
                      C. Hankin   Coordination languages for parallel
                                  programming  . . . . . . . . . . . . . . 989--1004
              Nicholas Carriero   An implementation of Linda for a NUMA
                                  machine  . . . . . . . . . . . . . . . . 1005--1021
      Michel R. V. Chaudron and   
            Arno C. N. van Duin   The formal derivation of parallel
                                  triangular system solvers using a
                                  coordination-based design method . . . . 1023--1046
         Lorenzo Donatiello and   
              Alessandro Fabbri   Generative coordination environments
                                  supporting parallel discrete event
                                  simulation . . . . . . . . . . . . . . . 1047--1080
              Kees Everaars and   
                    Barry Koren   Using coordination to parallelize
                                  sparse-grid methods for $3$-D CFD
                                  problems . . . . . . . . . . . . . . . . 1081--1106
                Tom Holvoet and   
                 Thilo Kielmann   Behaviour specification of parallel
                                  active objects . . . . . . . . . . . . . 1107--1135
         George A. Papadopoulos   Distributed and parallel systems
                                  engineering in MANIFOLD  . . . . . . . . 1137--1160

Parallel Computing
Volume 24, Number 8, August 1, 1998

              Bouchaib Radi and   
          Jean-François Estrade   Adaptive parallelization techniques in
                                  global weather models  . . . . . . . . . 1167--1175
    Suchendra M. Bhandarkar and   
              Salem Machaka and   
         Sridhar Chirravuri and   
                Jonathan Arnold   Parallel computing for chromosome
                                  reconstruction via ordering of DNA
                                  sequences  . . . . . . . . . . . . . . . 1177--1204
                O. Benkahla and   
                  C. Aktouf and   
                      C. Robach   Performance evaluation of distributed
                                  diagnosis algorithms in parallel systems 1205--1222
           Elise de Doncker and   
                     Ajay Gupta   Multivariate integration on hypercubic
                                  and mesh networks  . . . . . . . . . . . 1223--1244
               Qian-Ping Gu and   
                  Shietung Peng   Node-to-set and set-to-set cluster fault
                                  tolerant routing in hypercubes . . . . . 1245--1261
             Shahram Latifi and   
              Pradip K. Srimani   Wormhole broadcast in star graph
                                  networks . . . . . . . . . . . . . . . . 1263--1276

Parallel Computing
Volume 24, Number 9--10, September 1, 1998

               Mahlon Stacy and   
              Dennis Hanson and   
                   Jon Camp and   
                Richard A. Robb   High performance computing in biomedical
                                  imaging research . . . . . . . . . . . . 1287--1321
         Robert L. Galloway and   
             W. Andrew Bass and   
          Christopher E. Hockey   Task-oriented asymmetric multiprocessing
                                  for interactive image-guided surgery . . 1323--1343
          Simon K. Warfield and   
           Ferenc A. Jolesz and   
                    Ron Kikinis   A high performance computing approach to
                                  the registration of medical imaging data 1345--1368
            Gary E. Christensen   MIMD vs. SIMD parallel processing: a
                                  case study in $3$D medical image
                                  registration . . . . . . . . . . . . . . 1369--1383
           Craig M. Wittenbrink   Extensions to permutation warping for
                                  parallel volume rendering  . . . . . . . 1385--1406
              Chris Basoglu and   
              Ravi Managuli and   
                George York and   
                    Yongmin Kim   Computing requirements of modern medical
                                  diagnostic ultrasound machines . . . . . 1407--1431
               Paul Schimpf and   
              Jens Haueisen and   
                 Ceon Ramon and   
                   Hannes Nowak   Realistic computer modelling of electric
                                  and magnetic fields of human head and
                                  torso  . . . . . . . . . . . . . . . . . 1433--1460
                 C. Laurent and   
                  F. Peyrin and   
               J-M Chassery and   
                       M. Amiel   Parallel image reconstruction on MIMD
                                  computers for three-dimensional
                                  cone-beam tomography . . . . . . . . . . 1461--1479
                Jens Gregor and   
                   Dean A. Huff   A computational study of the
                                  focus-of-attention EM-ML algorithm for
                                  PET reconstruction . . . . . . . . . . . 1481--1497
                Chung-Ming Chen   An efficient four-connected parallel
                                  system for PET image reconstruction  . . 1499--1522
                Habib Zaidi and   
               Claire Labbé and   
                Christian Morel   Implementation of an environment for
                                  Monte Carlo simulation of fully $3$-D
                                  positron tomography on a
                                  high-performance parallel platform . . . 1523--1536
            Bjorn De Sutter and   
           Mark Christiaens and   
          Koen De Bosschere and   
             Jan Van Campenhout   On the use of subword parallelism in
                                  medical image processing . . . . . . . . 1537--1556
             Yuan-Ping Pang and   
              Stephen Brimijoin   Supercomputing-based dimeric analog
                                  approach for drug optimization . . . . . 1557--1566
            Todd E. Scheetz and   
             Terry A. Braun and   
               Kyle J. Munn and   
             Edwin M. Stone and   
           Val C. Sheffield and   
             Thomas L. Casavant   GenoMap: a distributed system for
                                  unifying genotyping and genetic linkage
                                  analysis . . . . . . . . . . . . . . . . 1567--1592

Parallel Computing
Volume 24, Number 11, October 1, 1998

                Craig Chase and   
        Prakash Arunachalam and   
                  Jacob Abraham   Memory distribution: Techniques and
                                  practice for CAD applications  . . . . . 1597--1615
                Jih-H. Chen and   
                 Shu-Yun Le and   
           Bruce A. Shapiro and   
                Jacob V. Maizel   Optimization of an RNA folding algorithm
                                  for parallel architectures . . . . . . . 1617--1634
              Paul Caprioli and   
                 Mark H. Holmes   A parallel quasi-Newton method for
                                  Gaussian data fitting  . . . . . . . . . 1635--1651
                  E. Bampis and   
                 C. Delorme and   
                    J.-C. König   Optimal schedules for $d-D$ grid graphs
                                  with communication delays  . . . . . . . 1653--1664
              Cyril Fonlupt and   
           Philippe Marquet and   
              Jean-Luc Dekeyser   Data-parallel load balancing strategies  1665--1684
                       G. Haase   Parallel incomplete Cholesky
                                  preconditioners based on the
                                  non-overlapping data distribution  . . . 1685--1703

Parallel Computing
Volume 24, Number 12--13, November 1, 1998

            Greg Eisenhauer and   
                 Beth Plale and   
                 Karsten Schwan   DataExchange: high performance
                                  communications in distributed
                                  laboratories . . . . . . . . . . . . . . 1713--1733
                 Ian Foster and   
           Jonathan Geisler and   
              William Gropp and   
           Nicholas Karonis and   
                 Ewing Lusk and   
       George Thiruvathukal and   
                  Steven Tuecke   Wide-area implementation of the Message
                                  Passing Interface  . . . . . . . . . . . 1735--1749
             Matthias Brune and   
               Jorn Gehring and   
                Axel Keller and   
            Burkhard Monien and   
            Friedhelm Ramme and   
            Alexander Reinefeld   Specifying resources and services in
                                  metacomputing environments . . . . . . . 1751--1776
             Henri Casanova and   
                  Jack Dongarra   Using agent-based software for
                                  scientific computing in the NetSolve
                                  system . . . . . . . . . . . . . . . . . 1777--1790
               Roy Williams and   
                    Bruce Sears   A high-performance active digital
                                  library  . . . . . . . . . . . . . . . . 1791--1806
         A. W. van Halderen and   
           B. J. Overeinder and   
             P. M. A. Sloot and   
             R. van Dantzig and   
             D. H. J. Epema and   
                       M. Livny   Hierarchical resource management in the
                                  Polder Metacomputing Initiative  . . . . 1807--1825
         Timothy J. Sheehan and   
         William A. Shelton and   
            Thomas J. Pratt and   
     Philip M. Papadopoulos and   
            Philip LoCascio and   
              Thomas H. Dunigan   The locally self-consistent multiple
                                  scattering code in a geographically
                                  distributed linked MPP environment . . . 1827--1846
              Th Eickermann and   
                J. Henrichs and   
                   M. Resch and   
                    R. Stoy and   
                      R. Volpel   Metacomputing in gigabit environments:
                                  networks, tools, and applications  . . . 1847--1872
             Sharon Brunett and   
              Thomas Gottschalk   A large-scale metacomputing framework
                                  for the ModSAF real-time simulation  . . 1873--1900
             K. Mani Chandy and   
              Joseph Kiniry and   
                Adam Rifkin and   
               Daniel Zimmerman   A framework for structured distributed
                                  object computing . . . . . . . . . . . . 1901--1922

Parallel Computing
Volume 24, Number 14, December 1, 1998

                    C. Vuik and   
        R. R. P. van Nooyen and   
                   P. Wesseling   Parallelism in ILU-preconditioned GMRES  1927--1946
        Jonathan M. D. Hill and   
                Bill McColl and   
          Dan C. Stefanescu and   
           Mark W. Goudreau and   
                 Kevin Lang and   
              Satish B. Rao and   
               Torsten Suel and   
         Thanasis Tsantilas and   
               Rob H. Bisseling   BSPlib: The BSP programming library  . . 1947--1980
              Alina N. Moga and   
           Bogdan Cramariuc and   
                 Moncef Gabbouj   Parallel watershed transformation
                                  algorithms for image segmentation  . . . 1981--2001
                E. G. Talbi and   
                  Z. Hafidi and   
                      J-M. Geib   A parallel adaptive tabu search approach 2003--2019
                   L. K. Lundin   Computing the velocity of a rotating
                                  flow . . . . . . . . . . . . . . . . . . 2021--2034
               Ravi Prakash and   
           Dhabaleswar K. Panda   Designing communication strategies for
                                  heterogeneous parallel systems . . . . . 2035--2052
                 B. Ciciani and   
               M. Colajanni and   
                    C. Paolucci   Performance evaluation of deterministic
                                  wormhole routing in $k$-ary $n$-cubes    2053--2075
                Kuo-Pao Fan and   
                  Chung-Ta King   Efficient barrier synchronization in
                                  wormhole-routed mesh networks supporting
                                  turn model . . . . . . . . . . . . . . . 2077--2099
            Weng-Long Chang and   
                  Chih-Ping Chu   The extension of the $I$ test  . . . . . 2101--2127
      Jos B. T. M. Roerdink and   
           Michel A. Westenberg   Data-parallel tomographic
                                  reconstruction: a comparison of filtered
                                  backprojection and direct Fourier
                                  reconstruction . . . . . . . . . . . . . 2129--2142
         Joe Shang-Chieh Wu and   
                   Ying-Dar Lin   An efficient and orderly implementation
                                  of bypass queue under bursty traffic . . 2143--2148

Parallel Computing
Volume 25, Number 1, January 1, 1999

                Jacques Verriet   Scheduling interval-ordered tasks with
                                  non-uniform deadlines subject to
                                  non-zero communication delays  . . . . . 3--21
            Rolf H. Möhring and   
            Markus W. Schäffter   Scheduling series--parallel orders
                                  subject to $0/1$-communication delays    23--40
                    Alix Munier   Approximation algorithms for scheduling
                                  trees with general communication delays  41--48
               A. K. Amoura and   
                  E. Bampis and   
             Y. Manoussakis and   
                       Zs. Tuza   A comparison of heuristics for
                                  scheduling multiprocessor tasks on three
                                  dedicated processors . . . . . . . . . . 49--61
            Cristina Boeres and   
            Vinod E. F. Rebello   A versatile cost modelling approach for
                                  multicomputer task scheduling  . . . . . 63--86
            Jacek Błażewicz and   
          Maciej Drozdowski and   
             Mariusz Markiewicz   Divisible task scheduling --- Concept
                                  and verification . . . . . . . . . . . . 87--98

Parallel Computing
Volume 25, Number 2, February 1, 1999

        Christoph W. Keßler and   
           Jesper Larsson Träff   Language and library support for
                                  practical PRAM programming . . . . . . . 105--135
             Horng-Ren Tsai and   
             Shi-Jinn Horng and   
             Tzong-Wann Kao and   
            Shung-Shing Lee and   
                 Shun-Shan Tsai   Fundamental data movement operations and
                                  its applications on a hyper-bus
                                  broadcast network  . . . . . . . . . . . 137--157
              Danny Krizanc and   
                Anton Saarimaki   Bulk synchronous parallel: practical
                                  experience with a model for parallel
                                  computing  . . . . . . . . . . . . . . . 159--181
                 S. W. Chen and   
                 C. Y. Fang and   
                    K. E. Chang   Neural simulation of Petri nets  . . . . 183--207

Parallel Computing
Volume 25, Number 3, March 22, 1999

                 Ravi Murty and   
                 Daniel Okunbor   Efficient parallel algorithms for
                                  molecular dynamics simulations . . . . . 217--230
           Vikramaditya Sen and   
              Mrinal K. Sen and   
                 Paul L. Stoffa   PVM based $3$-D Kirchhoff depth
                                  migration using dynamically computed
                                  travel-times: an application in seismic
                                  data processing  . . . . . . . . . . . . 231--248
           Mohamed Benmaiza and   
              Abderezak Touzene   One-to-all broadcast algorithm for
                                  constant degree 4 Cayley graphs  . . . . 249--264
            Cristina Corral and   
             Isabel Giménez and   
                 José Marín and   
                       José Mas   Parallel $m$-step preconditioners for
                                  the conjugate gradient method  . . . . . 265--281
                  Sunil Kim and   
        Alexander V. Veidenbaum   Interconnection network organization and
                                  its impact on performance and cost in
                                  shared memory multiprocessors  . . . . . 283--309
                J. S. Reeve and   
                       M. Heath   An efficient parallel version of the
                                  householder-QL matrix diagonalisation
                                  algorithm  . . . . . . . . . . . . . . . 311--319
                I. Vlahavas and   
                 P. Kefalas and   
                    C. Halatsis   OASys: an AND/OR parallel logic
                                  programming system . . . . . . . . . . . 321--336

Parallel Computing
Volume 25, Number 4, April 1, 1999

                Edmund Chadwick   A hybrid parallel algorithm for the
                                  spectral transform method which uses
                                  functional parallelism . . . . . . . . . 345--360
                T. C. Clune and   
              J. R. Elliott and   
               M. S. Miesch and   
                  J. Toomre and   
               G. A. Glatzmaier   Computational aspects of a code to study
                                  rotating turbulent convection in
                                  spherical shells . . . . . . . . . . . . 361--380
          Maciej Drozdowski and   
             Włodzimierz Glazek   Scheduling divisible loads in a
                                  three-dimensional mesh of processors . . 381--404
           Akihiro Fujiwara and   
              Michiko Inoue and   
        Toshimitsu Masuzawa and   
                 Hideo Fujiwara   A cost optimal parallel algorithm for
                                  weighted distance transforms . . . . . . 405--416
                   Y. F. Hu and   
                    R. J. Blake   An improved diffusion algorithm for
                                  dynamic load balancing . . . . . . . . . 417--444
                Zhiyong Liu and   
                David W. Cheung   Oblivious routing for LC permutations on
                                  hypercubes . . . . . . . . . . . . . . . 445--460
         Roseli S. Wedemann and   
          Valmir C. Barbosa and   
                 Raul Donangelo   Defeasible time-stepping . . . . . . . . 461--489

Parallel Computing
Volume 25, Number 5, May 1, 1999

           Nicholas Giolmas and   
           Daniel W. Watson and   
          David M. Chelberg and   
          Peter V. Henstock and   
                 June Ho Yi and   
              Howard Jay Siegel   Aspects of computational mode and data
                                  distribution for parallel range image
                                  segmentation . . . . . . . . . . . . . . 499--523
                U. W. Rathe and   
                 P. Sanders and   
                   P. L. Knight   A case study in scalability: An ADI
                                  method for the two-dimensional
                                  time-dependent Dirac equation  . . . . . 525--533
          H. Schwichtenberg and   
                  G. Winter and   
                   H. Wallmeier   Acceleration of molecular mechanic
                                  simulation by parallelization and fast
                                  multipole techniques . . . . . . . . . . 535--546
              Pierre Boulet and   
              Jack Dongarra and   
                Yves Robert and   
                Frédéric Vivien   Static tiling for heterogeneous
                                  computing platforms  . . . . . . . . . . 547--568
                     W. Cai and   
                   K. Zhang and   
               S. J. Turner and   
                         C. Sun   Interlock avoidance in transparent and
                                  dynamic parallel program instrumentation
                                  using logical clocks . . . . . . . . . . 569--591
           Giuseppe Passoni and   
          Giancarlo Alfonsi and   
              Giovanni Tula and   
                  Umberto Cardu   A wavenumber parallel computational code
                                  for the numerical integration of the
                                  Navier--Stokes equations . . . . . . . . 593--611
                 M. Szularz and   
                  J. Weston and   
                       M. Clint   Explicitly restarted Lanczos algorithms
                                  in an MPP environment  . . . . . . . . . 613--631

Parallel Computing
Volume 25, Number 6, June 1, 1999

                  Angelo Corana   Parallel computation of the correlation
                                  dimension from a time series . . . . . . 639--666
        Hermann Mierendorff and   
              Helmut Schwamborn   Automatic model generation for
                                  performance estimation of parallel
                                  programs . . . . . . . . . . . . . . . . 667--680
                  Zhong-Zhi Bai   A class of asynchronous parallel
                                  multisplitting blockwise relaxation
                                  methods  . . . . . . . . . . . . . . . . 681--701
                      S. Ramesh   Implementation of communicating reactive
                                  processes  . . . . . . . . . . . . . . . 703--727
                 Reiji Suda and   
              Akira Nishida and   
                 Yoshio Oyanagi   A high performance parallelization
                                  scheme for the Hessenberg double shift
                                  $QR$ algorithm . . . . . . . . . . . . . 729--744
              Franco Zambonelli   Exploiting biased load information in
                                  direct-neighbour load balancing policies 745--766
             R. S. Wedemann and   
              V. C. Barbosa and   
                   R. Donangelo   Erratum to ``Defeasible time-stepping''
                                  [Parallel Computing 25 (4) (April 1999)
                                  pp. 461--489]  . . . . . . . . . . . . . 767--767

Parallel Computing
Volume 25, Number 7, August 13, 1999

                      Anonymous   Parallelization techniques for numerical
                                  modelling  . . . . . . . . . . . . . . . 775--776
                 Gerhard Adrian   Parallel processing in regional
                                  climatology: The parallel version of the
                                  ``Karlsruhe Atmospheric Mesoscale
                                  Model'' (KAMM) . . . . . . . . . . . . . 777--787
              Ralf Diekmann and   
            Andreas Frommer and   
                Burkhard Monien   Efficient schemes for nearest neighbor
                                  load balancing . . . . . . . . . . . . . 789--812
                 Ralf Ebner and   
               Christoph Zenger   A distributed functional framework for
                                  recursive finite element simulations . . 813--826
            Michael Griebel and   
               Gerhard Zumbusch   Parallel multigrid in an adaptive PDE
                                  solver based on hashing and
                                  space-filling curves . . . . . . . . . . 827--843
                     Bruno Lang   Efficient eigenvalue and singular value
                                  computations on shared memory machines   845--860
            Ingrid Lenhardt and   
                 Thomas Rottner   Krylov subspace methods for structural
                                  finite element analysis  . . . . . . . . 861--875
                 Thomas Lippert   Hyper-systolic algorithms for $N$-body
                                  computations and parallel level-$3$ BLAS
                                  libraries  . . . . . . . . . . . . . . . 877--891
           Wolfgang Mackens and   
                  Heinrich Voss   General masters in parallel condensation
                                  of eigenvalue problems . . . . . . . . . 893--903
                Reinhard Möller   A systolic implementation of the MLEM
                                  reconstruction algorithm for positron
                                  emission tomography images . . . . . . . 905--920

Parallel Computing
Volume 25, Number 8, September 1, 1999

               S. J. Dodson and   
               S. P. Walker and   
                    M. J. Bluck   Parallelisation issues for high speed
                                  time domain integral equation analysis   925--942
                  W.-Y. Lin and   
                     C.-L. Chen   Minimum communication cost reordering
                                  for parallel sparse Cholesky
                                  factorization  . . . . . . . . . . . . . 943--967
                  B. Großer and   
                        B. Lang   Efficient parallel reduction to
                                  bidiagonal form  . . . . . . . . . . . . 969--986
                   G. S. Brodal   Priority queues on parallel machines . . 987--1011
                     P. Sanders   Analysis of nearest neighbor load
                                  balancing algorithms for random loads    1013--1033
                   D. Barth and   
                    C. Laforest   Scattering and multi-scattering in trees
                                  and meshes, with local routing and
                                  without buffering  . . . . . . . . . . . 1035--1057

Parallel Computing
Volume 25, Number 9, September 1, 1999

              M. E. Barrows and   
              D. E. Gregory and   
                     L. Gao and   
            A. L. Rosenberg and   
                    P. R. Cohen   An empirical study of dynamic scheduling
                                  on rings of processors . . . . . . . . . 1063--1079
                J. Yamamoto and   
                         others   Performance evaluation of SNAIL: a
                                  multiprocessor based on the simple
                                  serial synchronized multistage
                                  interconnection network architecture . . 1081--1103
                G.-H. Hwang and   
                      J. K. Lee   Communication set generations with CSD
                                  calculus and expression-rewriting
                                  framework  . . . . . . . . . . . . . . . 1105--1130
                A. Clematis and   
                      A. Corana   Modeling performance of heterogeneous
                                  parallel computing systems . . . . . . . 1131--1145
       E. J. Kontoghiorghes and   
                   M. Clint and   
                  H.-H. Naegeli   Recursive least-squares using a hybrid
                                  Householder algorithm on massively
                                  parallel SIMD systems  . . . . . . . . . 1147--1159
                 G. Edjlali and   
                  M. Garbey and   
             D. Tromeur-Dervout   Interoperability parallel programs
                                  approach to simulate $3$D frontal
                                  polymerization processes . . . . . . . . 1161--1191

Parallel Computing
Volume 25, Number 10--11, September 1, 1999

                 N. Cabibbo and   
                 Y. Iwasaki and   
                   K. Schilling   High performance computing in lattice
                                  QCD  . . . . . . . . . . . . . . . . . . 1197--1198
                       R. Gupta   General physics motivations for
                                  numerical simulations of quantum field
                                  theory . . . . . . . . . . . . . . . . . 1199--1215
                     F. Rapuano   Quenched physics on APE computers  . . . 1217--1226
             Stephan Güsken and   
             Thomas Lippert and   
                Klaus Schilling   Lattice QCD with two dynamical Wilson
                                  fermions on APE100 parallel systems  . . 1227--1242
                    S. Aoki and   
                         others   Performance of lattice QCD programs on
                                  CP-PACS  . . . . . . . . . . . . . . . . 1243--1255
                    Akira Ukawa   Lattice QCD results from the CP-PACS
                                  computer . . . . . . . . . . . . . . . . 1257--1280
            Robert D. Mawhinney   The 1 Teraflops QCDSP computer . . . . . 1281--1296
                 R. Tripiccione   APEmille . . . . . . . . . . . . . . . . 1297--1309
                  A. D. Kennedy   The Hybrid Monte Carlo algorithm on
                                  parallel computers . . . . . . . . . . . 1311--1339
           Philippe de Forcrand   The MultiBoson method  . . . . . . . . . 1341--1355
                    Th. Lippert   Parallel SSOR preconditioning for
                                  lattice QCD  . . . . . . . . . . . . . . 1357--1370
                 Stephan Güsken   Stochastic estimator techniques and
                                  their implementation on distributed
                                  parallel computers . . . . . . . . . . . 1371--1381
                G. Peter Lepage   Improved discretizations for lattice QCD 1383--1393
          Robert G. Edwards and   
              Urs M. Heller and   
             Rajamani Narayanan   Chiral fermions on the lattice . . . . . 1395--1407

Parallel Computing
Volume 25, Number 12, November, 1999

               V. Annamalai and   
       C. S. Krishnamoorthy and   
                    V. Kamakoti   Adaptive finite element analysis on a
                                  parallel and distributed environment . . 1413--1434
                   G. Carré and   
                 S. Lanteri and   
                    Mark Loriot   High performance simulations of
                                  compressible flows inside car engine
                                  geometries using the N3S-NATUR parallel
                                  solver . . . . . . . . . . . . . . . . . 1435--1458
                 Myron Ginsberg   Influences, challenges, and strategies
                                  for automotive HPC benchmarking and
                                  performance improvement  . . . . . . . . 1459--1476
                  S. Loucif and   
             M. Ould-Khaoua and   
                L. M. Mackenzie   Analysis of fully adaptive wormhole
                                  routing in tori  . . . . . . . . . . . . 1477--1487
                  Max Geigl and   
              Martin Griebl and   
             Christian Lengauer   Termination detection in parallel loop
                                  nests with while loops . . . . . . . . . 1489--1510

Parallel Computing
Volume 25, Number 13--14, December, 1999

           Erich Strohmaier and   
           Jack J. Dongarra and   
              Hans W. Meuer and   
                 Horst D. Simon   The marketplace of high-performance
                                  computing  . . . . . . . . . . . . . . . 1517--1544
                 Yoshio Oyanagi   Development of supercomputers in Japan:
                                  Hardware and software  . . . . . . . . . 1545--1567
            Enrico Clementi and   
              Giorgina Corongiu   Early parallelism with a loosely coupled
                                  array of processors: The ICAP experiment 1583--1600
            Shunichi Uchida and   
                 Akira Aiba and   
           Kazuaki Rokusawa and   
          Takashi Chikayama and   
                 Ryuzo Hasegawa   The parallel logic programming system in
                                  the FGCS project and its future
                                  directions . . . . . . . . . . . . . . . 1601--1633
          Kisaburo Nakazawa and   
           Hiroshi Nakamura and   
               Taisuke Boku and   
                Ikuo Nakata and   
            Yoshiyuki Yamashita   CP-PACS: a massively parallel processor
                                  at the University of Tsukuba . . . . . . 1635--1661
                    D. Sugimoto   GRAPE: a parallel computer dedicated to
                                  astrophysical many-body problems . . . . 1663--1676
            Paolo Cremonesi and   
               Emilia Rosti and   
           Giuseppe Serazzi and   
                 Evgenia Smirni   Performance evaluation of parallel
                                  systems  . . . . . . . . . . . . . . . . 1677--1698
             V. S. Sunderam and   
                    G. A. Geist   Heterogeneous parallel and distributed
                                  computing  . . . . . . . . . . . . . . . 1699--1721
          A. P. Willem Böhm and   
          Jeffrey P. Hammes and   
                   Sumit S. Sur   On the performance of pure and impure
                                  parallel functional programs . . . . . . 1723--1740
                Rajiv Gupta and   
              Santosh Pande and   
          Kleanthis Psarris and   
                   Vivek Sarkar   Compilation techniques for parallel
                                  systems  . . . . . . . . . . . . . . . . 1741--1783
          Siegfried Benkner and   
                      Hans Zima   Compiling High Performance Fortran for
                                  distributed-memory architectures . . . . 1785--1825
                   B. Bacci and   
               M. Danelutto and   
               S. Pelagatti and   
                   M. Vanneschi   SkIE: a heterogeneous environment for
                                  HPC applications . . . . . . . . . . . . 1827--1852
            David E. Womble and   
                         others   Massively parallel computing: A Sandia
                                  perspective  . . . . . . . . . . . . . . 1853--1876
          S. Lakshmivarahan and   
             Sudarshan K. Dhall   Ring, torus and hypercube
                                  architectures/algorithms for parallel
                                  computing  . . . . . . . . . . . . . . . 1877--1906
            Walid A. Najjar and   
              Edward A. Lee and   
                   Guang R. Gao   Advances in the dataflow computational
                                  model  . . . . . . . . . . . . . . . . . 1907--1929
               Iain S. Duff and   
          Henk A. van der Vorst   Developments and trends in the parallel
                                  solution of linear systems . . . . . . . 1931--1970
               E. L. Zapata and   
                   O. Plata and   
                  R. Asenjo and   
                  G. P. Trabado   Data-parallel support for numerical
                                  irregular problems . . . . . . . . . . . 1971--1994
                   Shun Doi and   
                  Takumi Washio   Ordering strategies and related
                                  techniques to overcome the trade-off
                                  between parallelism and convergence in
                                  incomplete factorizations  . . . . . . . 1995--2014
       Clemens-August Thole and   
                   Klaus Stüben   Industrial simulation on parallel
                                  computers  . . . . . . . . . . . . . . . 2015--2037
            Tayfun Tezduyar and   
                    Yasuo Osawa   Methods for parallel computation of
                                  complex flow problems  . . . . . . . . . 2039--2066
                Richard A. Robb   Visualization in biomedical computing    2067--2110
          Kenneth C. Bowler and   
              Anthony J. G. Hey   Parallel computing and quantum
                                  chromodynamics . . . . . . . . . . . . . 2111--2134
        Hermann Mierendorff and   
               Wolfgang Joppich   Empirical performance modeling for
                                  parallel weather prediction codes  . . . 2135--2148
              Stavros A. Zenios   High-performance computing in finance:
                                  The last 10 years and the next . . . . . 2149--2175
                 Andreas Reuter   Methods for parallel execution of
                                  complex database queries . . . . . . . . 2177--2188
                      Anonymous   Index  . . . . . . . . . . . . . . . . . 2189--2196

Parallel Computing
Volume 26, Number 1, January, 2000

               G. Ch. Pflug and   
                A. Świętanowski   Selected parallel optimization methods
                                  for financial management under
                                  uncertainty  . . . . . . . . . . . . . . 3--25
            Benoît Bourbeau and   
     Teodor Gabriel Crainic and   
                Bernard Gendron   Branch-and-bound parallelization
                                  strategies applied to a depot location
                                  and container fleet management problem   27--46
              Ricardo C. Corrêa   A parallel approximation scheme for the
                                  multiprocessor scheduling problem  . . . 47--72
         Stella C. S. Porto and   
  João Paulo F. W. Kitajima and   
               Celso C. Ribeiro   Performance evaluation of a parallel
                                  tabu search task scheduling algorithm    73--90
            Michel Toulouse and   
     Teodor Gabriel Crainic and   
                K. Thulasiraman   Global optimization properties of
                                  parallel cooperative search algorithms:
                                  a simulation study . . . . . . . . . . . 91--112
              D. G. Morales and   
                         others   Parallel dynamic programming and
                                  automata theory  . . . . . . . . . . . . 113--134
               M. D. Durand and   
                 Steve R. White   Trading accuracy for speed in parallel
                                  simulated annealing with simultaneous
                                  moves  . . . . . . . . . . . . . . . . . 135--150
                   I. Maros and   
                       G. Mitra   Investigating the sparse simplex
                                  algorithm on a distributed memory
                                  multiprocessor . . . . . . . . . . . . . 151--170

Parallel Computing
Volume 26, Number 2--3, February, 2000

       Mohammed Atiquzzaman and   
              Pradip K. Srimani   Parallel computing on clusters of
                                  workstations . . . . . . . . . . . . . . 175--177
                  W.-M. Lin and   
                         W. Xie   Load-skewing task assignment to minimize
                                  communication conflicts on network of
                                  workstations . . . . . . . . . . . . . . 179--197
       Stephen R. Donaldson and   
        Jonathan M. D. Hill and   
            David B. Skillicorn   BSP clusters: High performance, reliable
                                  and very low cost  . . . . . . . . . . . 199--242
             Ron Brightwell and   
                         others   Massively parallel computing using
                                  commodity components . . . . . . . . . . 243--266
                   N. Melab and   
                    E.-G. Talbi   Parallel adaptive computing on
                                  meta-systems including NOWs  . . . . . . 267--284
                John C. Chu and   
                Patrick W. Dowd   Adaptive cache coherence over a high
                                  bandwidth broadband mesh network . . . . 285--311
             Edward K. Blum and   
                   Xin Wang and   
                  Patrick Leung   Architectures and message-passing
                                  algorithms for cluster computing: Design
                                  and performance  . . . . . . . . . . . . 313--332
                  G. Chiola and   
                     G. Ciaccio   Efficient parallel processing on
                                  low-cost clusters with GAMMA active
                                  ports  . . . . . . . . . . . . . . . . . 333--354
               Yung-Lin Liu and   
                  Chung-Ta King   EXPLORER: Supporting run-time
                                  parallelization of DOACROSS loops on
                                  general networks of workstations . . . . 355--375

Parallel Computing
Volume 26, Number 4, March, 2000

                   N. Marco and   
                     S. Lanteri   A two-level parallelization strategy for
                                  Genetic Algorithms applied to optimum
                                  shape design . . . . . . . . . . . . . . 377--397
                  Moez Ayed and   
               Jean-Luc Gaudiot   An efficient heuristic for code
                                  partitioning . . . . . . . . . . . . . . 399--426
            Peter K. K. Loh and   
                   Wen Jing Hsu   The Josephus cube: a novel
                                  interconnection network  . . . . . . . . 427--453
                Pao-Hwa Sui and   
                  Sheng-De Wang   A fault-tolerant routing algorithm for
                                  wormhole routed meshes . . . . . . . . . 455--465
               Taesoon Park and   
                   Heon Y. Yeom   Application controlled checkpointing
                                  coordination for fault-tolerant
                                  distributed computing systems  . . . . . 467--482
       Costas S. Iliopoulos and   
                  James F. Reid   Optimal parallel analysis and
                                  decomposition of partially occluded
                                  strings  . . . . . . . . . . . . . . . . 483--494
              A. Bevilacqua and   
            E. Loli Piccolomini   Parallel image restoration on parallel
                                  and distributed computers  . . . . . . . 495--506

Parallel Computing
Volume 26, Number 5, March, 2000

Erricos John Kontoghiorghes and   
              Anna Nagurney and   
                    Berç Rustem   Parallel computing in economics, finance
                                  and decision-making  . . . . . . . . . . 507--509
           S. A. MirHassani and   
                   C. Lucas and   
                   G. Mitra and   
                 E. Messina and   
                  C. A. Poojari   Computational solution of capacity
                                  planning models under uncertainty  . . . 511--538
              G. Zanghirati and   
                   F. Cocco and   
                 G. Paruolo and   
                      F. Taddei   A Cray T3E implementation of a parallel
                                  stochastic dynamic assets and
                                  liabilities management model . . . . . . 539--567
                   Cyril Godart   Parallel implementation of a two-factor
                                  Cheyette-beta model calibration  . . . . 569--586
          Rodolphe Chatagny and   
                Bastien Chopard   A parallel model for the foreign
                                  exchange market  . . . . . . . . . . . . 587--600
               F. O. Bunnin and   
                     Y. Guo and   
                     Y. Ren and   
                  J. Darlington   Design of high performance financial
                                  modelling environment  . . . . . . . . . 601--622
                S. C. Perry and   
             R. H. Grimwood and   
             D. J. Kerbyson and   
          E. Papaefstathiou and   
                     G. R. Nudd   Performance optimization of financial
                                  option calculations  . . . . . . . . . . 623--639
                Jenny X. Li and   
                 Gary L. Mullen   Parallel computing of a quasi-Monte
                                  Carlo algorithm for valuing derivatives  641--653
         Elias S. Manolakos and   
             Haris M. Stellakis   Systematic synthesis of parallel
                                  architectures for the computation of
                                  higher order cumulants . . . . . . . . . 655--676

Parallel Computing
Volume 26, Number 6, May, 2000

                E. W. Evans and   
              S. P. Johnson and   
              P. F. Leggett and   
                       M. Cross   Automatic and effective
                                  multi-dimensional parallelisation of
                                  structured mesh based codes  . . . . . . 677--703
                 R. Keppens and   
                        G. Tóth   Using high performance Fortran for
                                  magnetohydrodynamic simulations  . . . . 705--722
                   Keqin Li and   
                     Yi Pan and   
                   Mounir Hamdi   Solving graph theory problems using
                                  reconfigurable pipelined optical buses   723--735
           Arjen Schoneveld and   
          Peter M. A. Sloot and   
                Martin Lees and   
                  Erwan Karyadi   A framework for dynamic load balancing:
                                  a case study on explosive containment
                                  simulation . . . . . . . . . . . . . . . 737--751
               C. Rodríguez and   
                 J. L. Roda and   
                   F. Sande and   
              D. G. Morales and   
                     F. Almeida   A new parallel model for the analysis of
                                  asynchronous algorithms  . . . . . . . . 753--767
              Huan-Chao Keh and   
                   Jen-Chih Lin   On fault-tolerant embedding of
                                  Hamiltonian cycles, linear arrays and
                                  rings in a Flexible Hypercube  . . . . . 769--781
               Jan Trdlička and   
                   Pavel Tvrdík   Embedding complete $k$-ary trees into
                                  $k$-square $2$D meshes with optimal edge
                                  congestion . . . . . . . . . . . . . . . 783--790
                Shijun Diao and   
                    T. Fujiwara   Evaluation and strategy of different
                                  data parallel implementation methods of
                                  a stiff chemical non-equilibrium flow
                                  solver . . . . . . . . . . . . . . . . . 791--804
                  J. G. Liu and   
              F. H. Y. Chan and   
                  F. K. Lam and   
                       H. F. Li   A new approach to fast calculation of
                                  moments of $3$-D gray level images . . . 805--815

Parallel Computing
Volume 26, Number 7--8, July, 2000

              Jerzy Leszczynski   Computational chemistry  . . . . . . . . 817--818
             Wanda Andreoni and   
             Alessandro Curioni   New advances in chemistry and materials
                                  science with CPMD and parallel computing 819--842
                 C. P. Sosa and   
                G. Scalmani and   
                R. Gomperts and   
                   M. J. Frisch   Ab initio quantum chemistry on a ccNUMA
                                  architecture using openMP. III . . . . . 843--856
                  John D. Watts   Parallel algorithms for coupled-cluster
                                  methods  . . . . . . . . . . . . . . . . 857--867
              Ross H. Nobes and   
        Alistair P. Rendell and   
                Jarek Nieplocha   Computational chemistry on Fujitsu
                                  vector-parallel processors: Hardware and
                                  programming environment  . . . . . . . . 869--886
        Alistair P. Rendell and   
                         others   Computational chemistry on Fujitsu
                                  vector-parallel processors: Development
                                  and performance of applications software 887--911
              Piotr Piecuch and   
              Joseph I. Landman   Parallelization of multi-reference
                                  coupled-cluster method . . . . . . . . . 913--943
             David E. Bernholdt   Scalability of correlated electronic
                                  structure calculations on parallel
                                  computers: a case study of the RI-MP2
                                  method . . . . . . . . . . . . . . . . . 945--963
            Dennis M. Newns and   
                         others   Molecular dynamics study of structure
                                  and gating of low molecular weight ion
                                  channels . . . . . . . . . . . . . . . . 965--976
                   Barry Robson   Simplified models of protein folding
                                  exploiting the Lagrange radius of
                                  gyration of the hydrophobic component    977--998
               Jacek Komasa and   
               Jacek Rychlewski   Solving quantum-mechanical problems on
                                  parallel systems . . . . . . . . . . . . 999--1009
                  Jon Baker and   
                    Matt Shirel   Ab initio quantum chemistry on PC-based
                                  parallel supercomputers  . . . . . . . . 1011--1024
                Marc Pavese and   
               Soonmin Jang and   
                Gregory A. Voth   Centroid molecular dynamics: a quantum
                                  dynamics method suitable for the
                                  parallel computer  . . . . . . . . . . . 1025--1041
                Leonid Gorb and   
                 Ilya Yanov and   
              Jerzy Leszczynski   High performance computing on the Cray
                                  T3E and IBM SP2 systems with the
                                  parallel version of GAUSSIAN 94  . . . . 1043--1060

Parallel Computing
Volume 26, Number 9, July, 2000

            Jacek Błażewicz and   
             Klaus H. Ecker and   
                       Tao Yang   New trends on scheduling in parallel and
                                  distributed systems  . . . . . . . . . . 1061--1063
                Jacques Verriet   Scheduling outtrees of height one in the
                                  LogP model . . . . . . . . . . . . . . . 1065--1082
                  Welf Löwe and   
                Wolf Zimmermann   Scheduling balanced task-graphs to
                                  LogP-machines  . . . . . . . . . . . . . 1083--1108
          Tomasz Kalinowski and   
              Iskander Kort and   
                 Denis Trystram   List scheduling of general task graphs
                                  under LogP . . . . . . . . . . . . . . . 1109--1128
                   Chams Lahlou   Approximation algorithms for scheduling
                                  with a limited number of communications  1129--1162
            Philippe Chrétienne   On Graham's bound for cyclic scheduling  1163--1174
                    Alain Darte   On the complexity of loop fusion . . . . 1175--1193
            Jacek Błażewicz and   
          Maciej Drozdowski and   
          Piotr Formanowicz and   
             Wiesław Kubiak and   
                 Günter Schmidt   Scheduling preemptable tasks on parallel
                                  processors with limited availability . . 1195--1211
         Luis Miguel Campos and   
              Isaac D. Scherson   Rate of change load balancing in
                                  distributed and parallel systems . . . . 1213--1230

Parallel Computing
Volume 26, Number 10, August 15, 2000

             Alan D. George and   
              Jeff Markwell and   
                   Ryan Fogarty   Real-time sonar beamforming on
                                  high-performance distributed computers   1231--1252
  J. Chassin de Kergommeaux and   
                   B. Stein and   
                  P. E. Bernard   Pajé, an interactive visualization tool
                                  for tuning multi-threaded parallel
                                  applications . . . . . . . . . . . . . . 1253--1274
            Weng-Long Chang and   
                  Chih-Ping Chu   The infinity Lambda test: a
                                  multi-dimensional version of Banerjee
                                  infinity test  . . . . . . . . . . . . . 1275--1295
         David K. Lowenthal and   
               Vincent W. Freeh   Architecture-independent parallelism for
                                  both shared- and distributed-memory
                                  machines using the Filaments package . . 1297--1323
                  Minyi Guo and   
                Ikuo Nakata and   
            Yoshiyuki Yamashita   Contention-free communication scheduling
                                  for array redistribution . . . . . . . . 1325--1343
               Peter Benner and   
                Ralph Byers and   
   Enrique S. Quintana-Ortí and   
         Gregorio Quintana-Ortí   Solving algebraic Riccati equations on
                                  parallel computers using Newton's method
                                  with exact line search . . . . . . . . . 1345--1368

Parallel Computing
Volume 26, Number 11, October, 2000

                 Peiyi Tang and   
                   Jingling Xue   Generating efficient tiled code for
                                  distributed memory machines  . . . . . . 1369--1410
               Sajal K. Das and   
            M. Cristina Pinotti   Parallel priority queues based on
                                  binomial heaps . . . . . . . . . . . . . 1411--1428
   Clémentin Tayou Djamégni and   
            Patrice Quinton and   
          Sanjay Rajopadhye and   
                  Tanguy Risset   Derivation of systolic algorithms for
                                  the algebraic path problem by recurrence
                                  transformations  . . . . . . . . . . . . 1429--1445
          M. Manzur Murshed and   
               Richard P. Brent   Adaptive $AT^2$ optimal algorithms on
                                  reconfigurable meshes  . . . . . . . . . 1447--1458
             Tzung-Shi Chen and   
             Nen-Chung Wang and   
                  Chih-Ping Chu   Multicast communication in
                                  wormhole-routed star graph
                                  interconnection networks . . . . . . . . 1459--1490
                   J. A. Bakker   Semantic partitioning as a basis for
                                  parallel I/O in database management
                                  systems  . . . . . . . . . . . . . . . . 1491--1513

Parallel Computing
Volume 26, Number 12, November, 2000

               Rupak Biswas and   
          Bruce Hendrickson and   
                 George Karypis   Graph partitioning and parallel
                                  computing  . . . . . . . . . . . . . . . 1515--1517
          Bruce Hendrickson and   
                Tamara G. Kolda   Graph partitioning models for parallel
                                  computing  . . . . . . . . . . . . . . . 1519--1534
                 N. Touheed and   
                 P. Selwood and   
               P. K. Jimack and   
                     M. Berzins   A comparison of some dynamic
                                  load-balancing algorithms for a parallel
                                  adaptive flow solver . . . . . . . . . . 1535--1554
              Ralf Diekmann and   
               Robert Preis and   
           Frank Schlimbach and   
                  Chris Walshaw   Shape-optimized mesh partitioning and
                                  load balancing for parallel adaptive FEM 1555--1581
              Leonid Oliker and   
               Rupak Biswas and   
                Harold N. Gabow   Parallel tetrahedral mesh adaptation
                                  with dynamic load balancing  . . . . . . 1583--1608
            Burkhard Monien and   
               Robert Preis and   
                  Ralf Diekmann   Quality matching and local improvement
                                  for multilevel graph-partitioning  . . . 1609--1634
                 C. Walshaw and   
                       M. Cross   Parallel optimisation algorithms for
                                  multilevel mesh partitioning . . . . . . 1635--1660
                  J. Rantakokko   Partitioning strategies for structured
                                  multiblock grids . . . . . . . . . . . . 1661--1680

Parallel Computing
Volume 26, Number 13--14, December, 2000

  J. Chassin de Kergommeaux and   
              P. J. Hatcher and   
                 L. Rauchwerger   Parallel computing for irregular
                                  applications . . . . . . . . . . . . . . 1681--1684
            Manuel Hermenegildo   Parallelizing irregular and
                                  pointer-based computations
                                  automatically: Perspectives from logic
                                  and constraint programming . . . . . . . 1685--1708
               E. Gutiérrez and   
                  R. Asenjo and   
                   O. Plata and   
                   E. L. Zapata   Automatic parallelization of irregular
                                  applications . . . . . . . . . . . . . . 1709--1738
           F. Warren Burton and   
               David J. Simpson   Memory requirements for parallel
                                  programs . . . . . . . . . . . . . . . . 1739--1763
           Andras Laszloffy and   
              Jingping Long and   
                 Abani K. Patra   Simple data management, scheduling and
                                  solution strategies for managing the
                                  irregularities in parallel adaptive hp
                                  finite element simulations . . . . . . . 1765--1788
           Frédéric Brégier and   
    Marie-Christine Counilh and   
                     Jean Roman   Scheduling loops with partial
                                  loop-carried dependencies  . . . . . . . 1789--1806
             Thomas Brandes and   
          Cécile Germain-Renaud   A schedule cache for data parallel
                                  unstructured computations  . . . . . . . 1807--1823
                  Thomas Decker   Virtual data space --- load balancing
                                  for irregular applications . . . . . . . 1825--1860
                Hwansoo Han and   
                 Chau-Wen Tseng   Efficient compiler and run-time support
                                  for parallel irregular reductions  . . . 1861--1887
                 P. Beraldi and   
             L. Grandinetti and   
                R. Musmanno and   
                       C. Triki   Parallel algorithms to solve two-stage
                                  stochastic linear programs with
                                  robustness constraints . . . . . . . . . 1889--1908
                  C. S. Pua and   
             M. H. Williams and   
                  D. H. Marwick   Modelling parallel databases with
                                  process algebra  . . . . . . . . . . . . 1909--1924
               Ming-Yang Su and   
             Hui-Ling Huang and   
              Gen-Huey Chen and   
                   Dyi-Rong Duh   Node-disjoint paths in incomplete
                                  WK-recursive networks  . . . . . . . . . 1925--1944
                   Roman Trobec   Two-dimensional regular $d$-meshes . . . 1945--1953
                      Anonymous   Index  . . . . . . . . . . . . . . . . . 1955--1962

Parallel Computing
Volume 27, Number 1--2, January, 2001

                   O. Yaşar and   
                    Y. Deng and   
                R. E. Tuzun and   
                       D. Saltz   New trends in high performance computing 1--2
            R. Clint Whaley and   
            Antoine Petitet and   
               Jack J. Dongarra   Automated empirical optimizations of
                                  software and the ATLAS project . . . . . 3--35
         Dinshaw S. Balsara and   
              Charles D. Norton   Highly parallel structured adaptive mesh
                                  refinement using parallel language-based
                                  approaches . . . . . . . . . . . . . . . 37--70
             Reginald L. Walker   Search engine case study: searching the
                                  Web using genetic programming and MPI    71--89
                Yuefan Deng and   
                   Alex Korobka   The performance of a supercomputer built
                                  with commodity components  . . . . . . . 91--108
      Michael D. Letherwood and   
                David D. Gunter   Ground vehicle modeling and simulation
                                  of military vehicles using high
                                  performance computing  . . . . . . . . . 109--140
                  Ting Chen and   
            Vladimir Filkov and   
               Steven S. Skiena   Identifying gene regulatory networks
                                  from experimental data . . . . . . . . . 141--162
              Alfredo U. Luccio   Numerical simulation of particle
                                  accelerators . . . . . . . . . . . . . . 163--177
                       O. Yaşar   A new ignition model for spark-ignited
                                  engine simulations . . . . . . . . . . . 179--200

Parallel Computing
Volume 27, Number 3, February, 2001

                     César Rego   Node-ejection chains for the vehicle
                                  routing problem: Sequential and parallel
                                  algorithms . . . . . . . . . . . . . . . 201--222
            Antonio Corradi and   
           Letizia Leonardi and   
              Franco Zambonelli   Parallel object allocation via
                                  user-specified directives: a case study
                                  in traffic simulation  . . . . . . . . . 223--241
             Patrick Dymond and   
              Jieliang Zhou and   
                   Xiaotie Deng   A $2$-D parallel convex hull algorithm
                                  with optimal communication phases  . . . 243--255
        Sathiamoorthy Manoharan   Effect of task duplication on the
                                  assignment of dependency graphs  . . . . 257--268
         Masayoshi Aritsugi and   
             Hiroki Fukatsu and   
             Yoshinari Kanamori   Several partitioning strategies for
                                  parallel image convolution in a network
                                  of heterogeneous workstations  . . . . . 269--293
              B. Di Martino and   
               S. Briguglio and   
                    G. Vlad and   
                   P. Sguazzero   Parallel PIC plasma simulation through
                                  particle decomposition techniques  . . . 295--314
                  Avi Kavas and   
                David Er-El and   
              Dror G. Feitelson   Using multicast to pre-load jobs on the
                                  ParPar cluster . . . . . . . . . . . . . 315--327

Parallel Computing
Volume 27, Number 4, March, 2001

                    J. W. Manke   Parallel computing in aerospace  . . . . 329--336
           William D. Gropp and   
          Dinesh K. Kaushik and   
             David E. Keyes and   
                 Barry F. Smith   High-performance parallel implicit CFD   337--362
                  M. Garbey and   
             Yu. V. Vassilevski   A parallel solver for unsteady
                                  incompressible $3$D Navier--Stokes
                                  equations  . . . . . . . . . . . . . . . 363--389
             Jay Hoeflinger and   
            Prasad Alavilli and   
             Thomas Jackson and   
                       Bob Kuhn   Producing scalable performance with
                                  OpenMP: Experiments with two CFD
                                  applications . . . . . . . . . . . . . . 391--413
                  P. Aumann and   
                         others   MEGAFLOW: Parallel complete aircraft CFD 415--440
               M. S. Fisher and   
                    M. Mani and   
                D. Stookesberry   Parallel processing with the Wind CFD
                                  code at Boeing . . . . . . . . . . . . . 441--456
            Joseph W. Manke and   
           G. David Kerlick and   
               David Levine and   
         Subhankar Banerjee and   
                    Eric Dillon   Parallel performance of two applications
                                  in the Boeing high performance computing
                                  benchmark suite  . . . . . . . . . . . . 457--475
            Piyush Mehrotra and   
                      Hans Zima   High Performance Fortran for aerospace
                                  applications . . . . . . . . . . . . . . 477--501
            Paul D. Hovland and   
                Lois C. McInnes   Parallel simulation of compressible flow
                                  using automatic differentiation and
                                  PETSc  . . . . . . . . . . . . . . . . . 503--519
                  James R. Taft   Achieving 60 GFLOP/s on the production
                                  CFD code OVERFLOW-MLP  . . . . . . . . . 521--536

Parallel Computing
Volume 27, Number 5, April, 2001

           Stefania Bandini and   
            Giancarlo Mauri and   
                  Roberto Serra   Cellular automata: From modeling to
                                  applications . . . . . . . . . . . . . . 537--538
                 S. Bandini and   
                   G. Mauri and   
                       R. Serra   Cellular automata: From a theoretical
                                  parallel computational model to its
                                  application to complex systems . . . . . 539--553
            Andreas Beckers and   
                  Thomas Worsch   A perimeter-time CA for the queen bee
                                  problem  . . . . . . . . . . . . . . . . 555--569
         F. Jiménez Morales and   
          J. P. Crutchfield and   
                    M. Mitchell   Evolving two-dimensional cellular
                                  automata to perform density
                                  classification: a report on work in
                                  progress . . . . . . . . . . . . . . . . 571--585
                   Hiroshi Umeo   Linear-time recognition of connectivity
                                  of binary images on $1$-bit inter-cell
                                  communication cellular automaton . . . . 587--599
                 Jörg R. Weimar   Coupling microscopic and macroscopic
                                  cellular automata  . . . . . . . . . . . 601--611
               B. Ostrovsky and   
                  G. Crooks and   
                M. A. Smith and   
                     Y. Bar-Yam   Cellular automata for polymer simulation
                                  with application to polymer melts and
                                  polymer collapse including implications
                                  for protein folding  . . . . . . . . . . 613--641
           Stefania Bandini and   
         Massimiliano Magagnini   Parallel processing simulation of
                                  dynamic properties of filled rubber
                                  compounds based on cellular automata . . 643--661
              Roberto Serra and   
              Marco Villani and   
                 Anna Salvemini   Continuous genetic networks  . . . . . . 663--683
               R. Cappuccio and   
                G. Cattaneo and   
                 G. Erbacci and   
                      U. Jocher   A parallel implementation of a cellular
                                  automata based model for coffee
                                  percolation  . . . . . . . . . . . . . . 685--717
                   J. Wahle and   
                 L. Neubert and   
                   J. Esser and   
               M. Schreckenberg   A cellular automaton traffic flow model
                                  for online simulation of traffic . . . . 719--735

Parallel Computing
Volume 27, Number 6, May, 2001

                Th. Lippert and   
                  N. Petkov and   
               P. Palazzari and   
                   K. Schilling   Hyper-systolic matrix multiplication . . 737--759
              Gundolf Haase and   
               Michael Kuhn and   
                  Ulrich Langer   Parallel multigrid $3$D Maxwell solvers  761--775
                Yair Censor and   
                 Dan Gordon and   
                  Rachel Gordon   Component averaging: an efficient
                                  iterative parallel algorithm for large
                                  and sparse unstructured problems . . . . 777--808
 Alexandros V. Gerbessiotis and   
     Constantinos J. Siniolakis   Merging on the BSP model . . . . . . . . 809--822
               Ishfaq Ahmad and   
     Shahriar M. Akramullah and   
               Ming L. Liou and   
                 Muhammad Kafil   A scalable off-line MPEG-2 video
                                  encoding scheme using a multiprocessor
                                  system . . . . . . . . . . . . . . . . . 823--846
       Paul N. Swarztrauber and   
              Steven W. Hammond   A comparison of optimal FFTs on torus
                                  and hypercube multicomputers . . . . . . 847--859
         Muhammad H. Alsuwaiyel   An optimal parallel algorithm for the
                                  multiselection problem . . . . . . . . . 861--865

Parallel Computing
Volume 27, Number 7, June, 2001

               Henk J. Sips and   
          Ruud Sommerhalder and   
               Erik D'Hollander   Linear systems and associated problems   867--868
               A. Basermann and   
                J. Fingberg and   
                G. Lonsdale and   
                 B. Maerten and   
                     C. Walshaw   Dynamic multi-partitioning for parallel
                                  finite element applications  . . . . . . 869--881
                 Roman Geus and   
                  Stefan Röllin   Towards a fast parallel sparse symmetric
                                  matrix-vector multiplication . . . . . . 883--896
                D. B. Heras and   
            J. C. Cabaleiro and   
                   F. F. Rivera   Modeling data locality for the sparse
                                  matrix-vector product using distance
                                  measures . . . . . . . . . . . . . . . . 897--912
                  A. Cooper and   
                 M. Szularz and   
                      J. Weston   External selective orthogonalization for
                                  the Lanczos algorithm in distributed
                                  memory environments  . . . . . . . . . . 913--923
                      H. X. Lin   A unifying graph model for designing
                                  parallel algorithms for tridiagonal
                                  systems  . . . . . . . . . . . . . . . . 925--939
             Peter Christen and   
                         others   Scalable parallel algorithms for surface
                                  fitting and data mining  . . . . . . . . 941--961
           Luca Bergamaschi and   
               Giorgio Pini and   
              Flavio Sartoretto   Parallel preconditioning of a sparse
                                  eigensolver  . . . . . . . . . . . . . . 963--976

Parallel Computing
Volume 27, Number 8, July, 2001

               Yuto Komeiji and   
           Makoto Haraguchi and   
                Umpei Nagashima   Parallel molecular dynamics simulation
                                  of a protein . . . . . . . . . . . . . . 977--987
Mardochée Magolu monga Made and   
          Henk A. van der Vorst   Parallel incomplete factorizations with
                                  pseudo-overlapped subdomains . . . . . . 989--1008
             Arnold Krechel and   
                   Klaus Stüben   Parallel algebraic multigrid based on
                                  subdomain blocking . . . . . . . . . . . 1009--1031
         Azzedine Boukerche and   
                   Carl Tropper   Local versus global lookahead in
                                  conservative parallel simulations  . . . 1033--1055
               Byung S. Yoo and   
                   Chita R. Das   Efficient processor management schemes
                                  for mesh-connected multicomputers  . . . 1057--1078
           Constantine Katsinis   Performance analysis of the simultaneous
                                  optical multi-processor exchange bus . . 1079--1115
            Weng-Long Chang and   
                  Chih-Ping Chu   The generalized Direction Vector I test  1117--1144

Parallel Computing
Volume 27, Number 9, August, 2001

           M. Alabdulkareem and   
          S. Lakshmivarahan and   
                    S. K. Dhall   Scalability analysis of large codes
                                  using factorial designs  . . . . . . . . 1145--1171
               Daeyeon Park and   
           Byeong Hag Seong and   
             Rafael H. Saavedra   Adaptive software prefetching in
                                  scalable multiprocessors using cache
                                  information  . . . . . . . . . . . . . . 1173--1195
           Paraskevas Evripidou   $D^3$-Machine: a decoupled data-driven
                                  multithreaded architecture with variable
                                  resolution support . . . . . . . . . . . 1197--1225
       Vittorio Cortellessa and   
              Francesco Quaglia   A checkpointing-recovery scheme for Time
                                  Warp parallel simulation . . . . . . . . 1227--1252
                Dolors Royo and   
       Miguel Valero-García and   
               Antonio González   Implementing the one-sided Jacobi method
                                  on a $2$D/$3$D mesh multicomputer  . . . 1253--1271
              Gen-Huey Chen and   
          Shien-Ching Hwang and   
             Hui-Ling Huang and   
               Ming-Yang Su and   
                   Dyi-Rong Duh   A general broadcasting scheme for
                                  recursive networks with complete
                                  connection . . . . . . . . . . . . . . . 1273--1278

Parallel Computing
Volume 27, Number 10, September, 2001

            Gabriel Antoniu and   
                         others   The Hyperion system: Compiling
                                  multithreaded Java bytecode for
                                  distributed execution  . . . . . . . . . 1279--1297
               Eric Noulard and   
                     Nahid Emad   A key for reusable parallel linear
                                  algebra software . . . . . . . . . . . . 1299--1319
                Jeff Boleng and   
               Manavendra Misra   Load balanced parallel QR decomposition
                                  on shared memory multiprocessors . . . . 1321--1345
               L. F. Romero and   
             E. M. Ortigosa and   
                   E. L. Zapata   Data-task parallelism for the VMEC
                                  program  . . . . . . . . . . . . . . . . 1347--1364
               O. Yu. Milyukova   Parallel approximate factorization
                                  method for solving discrete elliptic
                                  equations  . . . . . . . . . . . . . . . 1365--1379
                 J. Al-Sadi and   
                     K. Day and   
                 M. Ould-Khaoua   Fault-tolerant routing in hypercubes
                                  using probability vectors  . . . . . . . 1381--1399

Parallel Computing
Volume 27, Number 11, October, 2001

              Jack Dongarra and   
          Masaaki Shimasaki and   
            Bernard Tourancheau   Clusters and computational grids for
                                  scientific computing . . . . . . . . . . 1401--1402
              Cherri M. Pancake   Performance tools for today's HPC: Are
                                  we addressing the right issues?  . . . . 1403--1415
               Ralph Butler and   
              William Gropp and   
                     Ewing Lusk   Components and interfaces of a process
                                  management system for parallel programs  1417--1429
             Thilo Kielmann and   
               Henri E. Bal and   
            Sergei Gorlatch and   
              Kees Verstoep and   
            Rutger F. H. Hofman   Network performance-aware collective
                                  communication for clustered wide-area
                                  systems  . . . . . . . . . . . . . . . . 1431--1456
          Michael D. Beynon and   
                         others   Distributed processing of very large
                                  datasets with DataCutter . . . . . . . . 1457--1478
             Graham E. Fagg and   
           Antonin Bukovsky and   
               Jack J. Dongarra   HARNESS and fault tolerant MPI . . . . . 1479--1495
                   E. Caron and   
                         others   Scilab to Scilab$_{//}$: The
                                  Ouragan project  . . . . . . . . . . . . 1497--1519

Parallel Computing
Volume 27, Number 12, November, 2001

            Michael Florian and   
                Michel Gendreau   Applications of parallel computing in
                                  transportation . . . . . . . . . . . . . 1521--1522
                 S. C. Wong and   
                 C. K. Wong and   
                     C. O. Tong   A parallelized genetic algorithm for the
                                  calibration of Lowry model . . . . . . . 1523--1536
         Michelle R. Hribar and   
          Valerie E. Taylor and   
                 David E. Boyce   Implementing parallel shortest path for
                                  parallel transportation applications . . 1537--1568
                N. Tremblay and   
                     M. Florian   Temporal shortest paths: Parallel
                                  computing implementations  . . . . . . . 1569--1609
                  Kai Nagel and   
                 Marcus Rickert   Parallel implementation of the TRANSIMS
                                  micro-simulation . . . . . . . . . . . . 1611--1639
            Michel Gendreau and   
            Gilbert Laporte and   
                 Frédéric Semet   A dynamic model and parallel tabu search
                                  heuristic for real-time ambulance
                                  relocation . . . . . . . . . . . . . . . 1641--1653

Parallel Computing
Volume 27, Number 13, December 1, 2001

                Laurent Hascoët   A method for automatic placement of
                                  communications in SPMD parallelisation   1655--1664
           Giuseppe Passoni and   
            Paolo Cremonesi and   
              Giancarlo Alfonsi   Analysis and implementation of a
                                  parallelization strategy on a
                                  Navier--Stokes solver for shear flow
                                  simulations  . . . . . . . . . . . . . . 1665--1685
        B. V. Rathish Kumar and   
               T. Yamaguchi and   
                     H. Liu and   
                      R. Himeno   A parallel $3$D unsteady incompressible
                                  flow solver on VPP700  . . . . . . . . . 1687--1713
        Ignacio M. Llorente and   
       Manuel Prieto-Matías and   
                   Boris Diskin   A parallel multigrid solver for $3$D
                                  convection and convection-diffusion
                                  problems . . . . . . . . . . . . . . . . 1715--1741
                  M. Arenaz and   
                  R. Doallo and   
                 J. Touriño and   
                     C. Vázquez   Efficient parallel numerical solver for
                                  the elastohydrodynamic Reynolds-Hertz
                                  problem  . . . . . . . . . . . . . . . . 1743--1765
                Wahid Nasri and   
                  Zaher Mahjoub   Optimal parallelization of a recursive
                                  algorithm for triangular matrix
                                  inversion on MIMD computers  . . . . . . 1767--1782
            Weng-Long Chang and   
              Chih-Ping Chu and   
                     Jia-Hwa Wu   A multi-dimensional version of the I
                                  test . . . . . . . . . . . . . . . . . . 1783--1799
            H. Sarbazi-Azad and   
             M. Ould-Khaoua and   
                L. M. Mackenzie   Communication delay in hypercubes in the
                                  presence of bit-reversal traffic . . . . 1801--1816
                   Jau-Der Shih   Wormhole routing for torus networks with
                                  faults . . . . . . . . . . . . . . . . . 1817--1829

Parallel Computing
Volume 27, Number 14, December 31, 2001

          Takahiro Katagiri and   
                Yasumasa Kanada   An efficient implementation of parallel
                                  eigenvalue computation for massively
                                  parallel processing  . . . . . . . . . . 1831--1845
             Márcia A. Inda and   
               Rob H. Bisseling   A simple and efficient parallel FFT
                                  algorithm using the BSP model  . . . . . 1847--1878
                   C. Bekas and   
                 E. Gallopoulos   Cobra: Parallel path following for
                                  computing the matrix pseudospectrum  . . 1879--1896
                    Wei Shi and   
              Pradip K. Srimani   A regular scalable fault tolerant
                                  interconnection network for distributed
                                  processing . . . . . . . . . . . . . . . 1897--1919
                 P. Dmitruk and   
                 L.-P. Wang and   
            W. H. Matthaeus and   
                   R. Zhang and   
                      D. Seckel   Scalable parallel FFT for spectral
                                  simulations on a Beowulf cluster . . . . 1921--1936
                      Anonymous   Author index to volume 27  . . . . . . . 1937--1944

Parallel Computing
Volume 28, Number 1, January, 2002

             Gerhard R. Joubert   Editorial  . . . . . . . . . . . . . . . 1--2
                Angela C. Sodan   Applications on a multithreaded
                                  architecture: a case study with
                                  EARTH-MANNA  . . . . . . . . . . . . . . 3--33
              Hendrik L. Tolman   Distributed-memory concepts in the wave
                                  model WAVEWATCH III  . . . . . . . . . . 35--52
                    P. Wang and   
               Karen Y. Liu and   
                   Tom Cwik and   
                   Robert Green   MODTRAN on supercomputers and parallel
                                  computers  . . . . . . . . . . . . . . . 53--64
                   Fusen He and   
                         Jie Wu   An efficient parallel implementation of
                                  the Everglades Landscape Fire Model
                                  using checkpointing  . . . . . . . . . . 65--82
              Rajeev Thakur and   
              William Gropp and   
                     Ewing Lusk   Optimizing noncontiguous accesses in
                                  MPI-IO . . . . . . . . . . . . . . . . . 83--105
           Hung-Chang Hsiao and   
                  Chung-Ta King   Implementation and evaluation of
                                  directory hints in CC-NUMA
                                  multiprocessors  . . . . . . . . . . . . 107--132
           Huei-Huang Chang and   
                   Ge-Ming Chiu   An improved fault-tolerant routing
                                  algorithm in meshes with convex faults   133--149

Parallel Computing
Volume 28, Number 2, February, 2002

Erricos John Kontoghiorghes and   
                Ahmed Sameh and   
                 Denis Trystram   Special issue on parallel matrix
                                  algorithms and applications  . . . . . . 151--153
           Olivier Beaumont and   
             Arnaud Legrand and   
           Fabrice Rastello and   
                    Yves Robert   Dense linear algebra kernels on
                                  heterogeneous platforms: Redistribution
                                  issues . . . . . . . . . . . . . . . . . 155--185
                Olaf Schenk and   
                  Klaus Gärtner   Two-level dynamic scheduling in PARDISO:
                                  Improved scalability on shared memory
                                  multiprocessing systems  . . . . . . . . 187--197
                Dany Mezher and   
               Bernard Philippe   Parallel computation of pseudospectra of
                                  large sparse matrices  . . . . . . . . . 199--221
                   C. Bekas and   
                 E. Gallopoulos   Parallel computation of pseudospectra by
                                  fast descent . . . . . . . . . . . . . . 223--242
                   M. Bečka and   
                    G. Okša and   
                   M. Vajteršic   Dynamic ordering for a parallel
                                  block-Jacobi SVD algorithm . . . . . . . 243--262
        Martin H. Gutknecht and   
                  Stefan Röllin   The Chebyshev iteration revisited  . . . 263--283
             Ahmed H. Sameh and   
                    Vivek Sarin   Parallel algorithms for indefinite
                                  linear systems . . . . . . . . . . . . . 285--299
                   P. Hénon and   
                   P. Ramet and   
                       J. Roman   PaStiX: a high-performance parallel
                                  direct solver for sparse symmetric
                                  positive definite systems  . . . . . . . 301--321
                   Y. Liang and   
                  J. Weston and   
                     M. Szularz   Generalized least-squares polynomial
                                  preconditioners for symmetric indefinite
                                  linear equations . . . . . . . . . . . . 323--341
                 Joël M. Malard   Parallel restricted maximum likelihood
                                  estimation for linear models with a
                                  dense exogenous matrix . . . . . . . . . 343--353
           Wojciech Owczarz and   
                  Zahari Zlatev   Parallel matrix computations in air
                                  pollution modelling  . . . . . . . . . . 355--368

Parallel Computing
Volume 28, Number 3, March, 2002

                  B. Nkonga and   
                    P. Charrier   Generalized parcel method for dispersed
                                  spray and message passing strategy on
                                  unstructured meshes  . . . . . . . . . . 369--398
           Stephen H. Brill and   
               George F. Pinder   Parallel implementation of the Bi-CGSTAB
                                  method with block red-black
                                  Gauss--Seidel preconditioner applied to
                                  the Hermite collocation discretization
                                  of partial differential equations  . . . 399--414
            Harald J. Ehold and   
      Wilfried N. Gansterer and   
        Dieter F. Kvasnicka and   
        Christoph W. Ueberhuber   Optimizing Local Performance in HPF  . . 415--432
                  Alain Girault   Elimination of redundant messages with a
                                  two-pass static analysis algorithm . . . 433--453
              Kleanthis Psarris   Program analysis techniques for
                                  transforming programs for parallel
                                  execution  . . . . . . . . . . . . . . . 455--469
               Jen-Chih Lin and   
                 Nan-Chen Hsien   Reconfiguring binary tree structures in
                                  a faulty supercube with unbounded
                                  expansion  . . . . . . . . . . . . . . . 471--483
                 F. Quaglia and   
                 B. Ciciani and   
                   M. Colajanni   Performance analysis of adaptive
                                  wormhole routing in a two-dimensional
                                  torus  . . . . . . . . . . . . . . . . . 485--501
                 Yosi Ben-Asher   The parallel client-server paradigm  . . 503--523

Parallel Computing
Volume 28, Number 4, April, 2002

      Abdelkader Hameurlain and   
                  Franck Morvan   CPU and incremental memory allocation in
                                  dynamic parallelization of SQL queries   525--556
               A. Goscinski and   
                   M. Hobbs and   
                     J. Silcock   GENESIS: an efficient, transparent and
                                  easy to use cluster operating system . . 557--606
             Olivier Aumage and   
                  Luc Bougé and   
       Jean-François Méhaut and   
                 Raymond Namyst   Madeleine II: a portable and efficient
                                  communication library for
                                  high-performance cluster computing . . . 607--626
              Petr Salinger and   
                   Pavel Tvrdík   Optimal broadcasting and gossiping in
                                  one-port meshes of trees with
                                  distance-insensitive routing . . . . . . 627--647
              Abderezak Touzene   Edges-disjoint spanning trees on the
                                  binary wrapped butterfly network with
                                  applications to fault tolerance  . . . . 649--666
              Roberto Serra and   
              Marco Villani and   
                 Anna Salvemini   Erratum to ``Continuous genetic
                                  networks'' [Parallel Comput. 27(5)
                                  (2001) 663--683] . . . . . . . . . . . . 667--667

Parallel Computing
Volume 28, Number 5, May, 2002

             Domenico Talia and   
              Pradip K. Srimani   Guest editorial: Parallel data-intensive
                                  algorithms and applications  . . . . . . 669--671
            Mario Cannataro and   
             Domenico Talia and   
              Pradip K. Srimani   Parallel data intensive computing in
                                  scientific and commercial applications   673--704
                     P. Sanders   Reconciling simplicity and realism in
                                  parallel disk models . . . . . . . . . . 705--723
            Renato Ferreira and   
              Gagan Agrawal and   
                     Joel Saltz   Data parallel language and compiler
                                  support for data intensive applications  725--748
               Bill Allcock and   
                         others   Data management and transfer in
                                  high-performance computational grid
                                  environments . . . . . . . . . . . . . . 749--771
                Yanyan Yang and   
                         others   Agent based data management in digital
                                  libraries  . . . . . . . . . . . . . . . 773--792
            Massimo Coppola and   
                Marco Vanneschi   High-performance data mining with
                                  skeleton-based structured parallel
                                  programming  . . . . . . . . . . . . . . 793--813
               D. B. Skillicorn   Parallel frequent set counting . . . . . 815--825
             Michael Beynon and   
                         others   Processing large-scale multi-dimensional
                                  data in parallel and distributed
                                  environments . . . . . . . . . . . . . . 827--859

Parallel Computing
Volume 28, Number 6, June, 2002

           Pablo A. Estévez and   
        Hélène Paugam-Moisy and   
             Didier Puzenat and   
                  Manuel Ugarte   A scalable parallel algorithm for
                                  training a hierarchical mixture of
                                  neural experts . . . . . . . . . . . . . 861--891
             Tyng-Yeu Liang and   
              Ce-Kuen Shieh and   
                      Jun-Qi Li   Selecting threads for workload migration
                                  in software distributed shared memory
                                  systems  . . . . . . . . . . . . . . . . 893--913
               Jingling Xue and   
                    Wentong Cai   Time-minimal tiling when rise is larger
                                  than zero  . . . . . . . . . . . . . . . 915--939

Parallel Computing
Volume 28, Number 7--8, August, 2002

                Andreas Uhl and   
                Peter Zinterhof   Guest editorial: Parallel computing in
                                  image and video processing . . . . . . . 941--943
         Cristina Nicolescu and   
                  Pieter Jonker   A data and task parallel image
                                  processing environment . . . . . . . . . 945--965
             F. J. Seinstra and   
                  D. Koelma and   
               J. M. Geusebroek   A software architecture for user
                                  transparent parallel image processing    967--993
               A. Biancardi and   
                     A. Mérigot   Extending the data parallel paradigm
                                  with data-dependent operators  . . . . . 995--1021
         Francisco Argüello and   
                 Juan López and   
            María A. Trenas and   
               Emilio L. Zapata   Architecture for wavelet packet
                                  transform based on lifting steps . . . . 1023--1037
               Ishfaq Ahmad and   
                    Yong He and   
                   Ming L. Liou   Video compression with parallel
                                  processing . . . . . . . . . . . . . . . 1039--1078
             Hazem M. Abbas and   
             Mohamed M. Bayoumi   Parallel codebook design for vector
                                  quantization on a message passing MIMD
                                  architecture . . . . . . . . . . . . . . 1079--1093
                     Rade Kutil   Approaches to zerotree image and video
                                  coding on MIMD architectures . . . . . . 1095--1109
               Aravind Dasu and   
        Sethuraman Panchanathan   Reconfigurable media processing  . . . . 1111--1139
                 K. Benkrid and   
                 D. Crookes and   
                     A. Benkrid   Towards a general framework for FPGA
                                  based image processing using hardware
                                  skeletons  . . . . . . . . . . . . . . . 1141--1154
               A. C. Zawada and   
                 N. L. Seed and   
                     P. A. Ivey   Continuous and high coverage
                                  self-testing of dynamically
                                  re-configurable systems  . . . . . . . . 1155--1178
            Virginie Fresse and   
               Olivier Deforges   ARIAL: Rapid Prototyping for
                                  Mixed and Parallel Platforms . . . . . . 1179--1202
           Edwige Pissaloux and   
               Franck Amiot and   
                  Tharam Dillon   A vision-application adaptable computer
                                  concept and its implementation in
                                  FreeTIV computer . . . . . . . . . . . . 1203--1219
                      Anonymous   IFC --- Inside Front Cover (Editorial
                                  Board) . . . . . . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 28, Number 9, September, 2002

           Mark Christiaens and   
             Michiel Ronsse and   
              Koen De Bosschere   Bounding the number of segment histories
                                  during data race detection . . . . . . . 1221--1238
          Mikhail S. Tarkov and   
              Youngsong Mun and   
              Jaeyoung Choi and   
                  Hyung-Il Choi   Mapping adaptive fuzzy Kohonen
                                  clustering network onto distributed
                                  image processing system  . . . . . . . . 1239--1256
    Erricos John Kontoghiorghes   Greedy Givens algorithms for computing
                                  the rank-$k$ updating of the QR
                                  decomposition  . . . . . . . . . . . . . 1257--1273
                    Ke Chen and   
                    Choi H. Lai   Parallel algorithms of the Purcell
                                  method for direct solution of linear
                                  systems  . . . . . . . . . . . . . . . . 1275--1291
             Shao Dong Chen and   
                  Hong Shen and   
                   Rodney Topor   An efficient algorithm for constructing
                                  Hamiltonian paths in meshes  . . . . . . 1293--1305
                Yuan-Shin Hwang   Parallelizing graph construction
                                  operations in programs with cyclic
                                  graphs . . . . . . . . . . . . . . . . . 1307--1328
                PeiZong Lee and   
                   Wen-Yao Chen   Generating communication sets of array
                                  assignment statements for block-cyclic
                                  distribution on distributed memory
                                  parallel computers . . . . . . . . . . . 1329--1368
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 28, Number 10, October ??, 2002

             Alexey Lastovetsky   Adaptive parallel computing on
                                  heterogeneous networks with mpC  . . . . 1369--1407
          Jeffrey Nesheiwat and   
          Boleslaw K. Szymanski   Instrumentation database system for
                                  performance analysis of parallel
                                  scientific applications  . . . . . . . . 1409--1449
                   Chi Shen and   
                      Jun Zhang   Parallel two level block ILU
                                  preconditioning techniques for solving
                                  large sparse linear systems  . . . . . . 1451--1475
                    Lili Ju and   
                   Qiang Du and   
                 Max Gunzburger   Probabilistic methods for centroidal
                                  Voronoi tessellations and their parallel
                                  implementations  . . . . . . . . . . . . 1477--1500
Carlos Alberto Alonso Sanches and   
         Nei Yoshihiro Soma and   
         Horacio Hideki Yanasse   Short communication: Comments on
                                  parallel algorithms for the knapsack
                                  problem  . . . . . . . . . . . . . . . . 1501--1505
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 28, Number 11, November ??, 2002

                Ian N. Dunn and   
             Gerard G. L. Meyer   QR factorization for shared memory and
                                  message passing  . . . . . . . . . . . . 1507--1530
       Jean-Guillaume Dumas and   
                Jean-Louis Roch   On parallel block algorithms for exact
                                  triangularizations . . . . . . . . . . . 1531--1548
               Taesoon Park and   
                 Inseon Lee and   
                   Heon Y. Yeom   An efficient causal logging scheme for
                                  recoverable distributed shared memory
                                  systems  . . . . . . . . . . . . . . . . 1549--1572
               Claire Hanen and   
             Alix Munier Kordon   Minimizing the volume in scheduling an
                                  out-tree with communication delays and
                                  duplication  . . . . . . . . . . . . . . 1573--1585
               S. A. Jarvis and   
              J. M. D. Hill and   
           C. J. Siniolakis and   
                  V. P. Vasilev   Portable and architecture independent
                                  parallel performance tuning using BSP    1587--1609
              Li-Chiu Chang and   
                  Fi-John Chang   An efficient parallel algorithm for
                                  LISSOM neural network  . . . . . . . . . 1611--1633
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 28, Number 12, December, 2002

             Pasqua D'Ambra and   
            Marco Danelutto and   
            Daniela di Serafino   Advanced environments for parallel and
                                  distributed computing  . . . . . . . . . 1635--1636
             Pasqua D'Ambra and   
            Marco Danelutto and   
        Daniela di Serafino and   
                  Marco Lapegna   Advanced environments for parallel and
                                  distributed applications: a view of
                                  current status . . . . . . . . . . . . . 1637--1662
               S. MacDonald and   
                   J. Anvik and   
                S. Bromling and   
               J. Schaeffer and   
                 D. Szafron and   
                         K. Tan   From patterns to frameworks to parallel
                                  programs . . . . . . . . . . . . . . . . 1663--1683
              Jocelyn Sérot and   
               Dominique Ginhac   Skeletons for parallel image processing:
                                  an overview of the SKIPPER project . . . 1685--1708
                Marco Vanneschi   The programming model of ASSIST, an
                                  environment for parallel and distributed
                                  portable applications  . . . . . . . . . 1709--1732
                   D. Laforenza   Grid programming: some indications where
                                  we are headed  . . . . . . . . . . . . . 1733--1752
          Nathalie Furmento and   
              Anthony Mayer and   
            Stephen McGough and   
            Steven Newhouse and   
                 Tony Field and   
                John Darlington   ICENI: Optimisation of component
                                  applications within a Grid environment   1753--1772
                 Micah Beck and   
              Dorian Arnold and   
           Alessandro Bassi and   
                Fran Berman and   
             Henri Casanova and   
              Jack Dongarra and   
                Terry Moore and   
         Graziano Obertelli and   
                James Plank and   
                   Martin Swany   Middleware for the use of storage in
                                  communication  . . . . . . . . . . . . . 1773--1787
                M. Di Santo and   
             F. Frattolillo and   
                   W. Russo and   
                       E. Zimeo   A component-based approach to build a
                                  portable and flexible middleware for
                                  metacomputing  . . . . . . . . . . . . . 1789--1810
              Boyana Norris and   
               Satish Balay and   
              Steven Benson and   
               Lori Freitag and   
               Paul Hovland and   
               Lois McInnes and   
                    Barry Smith   Parallel components for PDEs and
                                  optimization: some issues and
                                  experiences  . . . . . . . . . . . . . . 1811--1831
                      Anonymous   Author Index . . . . . . . . . . . . . . 1833--1839
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 1, January, 2003

        E. A. H. Vollebregt and   
             M. R. T. Roest and   
                J. W. M. Lander   Large scale computing at Rijkswaterstaat 1--20
               Leo Chin Sim and   
             Heiko Schroder and   
                 Graham Leedham   MIMD--SIMD hybrid system----towards a
                                  new low cost parallel system . . . . . . 21--36
                  Hon F. Li and   
                 Gabriel Girard   View consistencies and exact
                                  implementations  . . . . . . . . . . . . 37--67
           Ashok Srinivasan and   
           Michael Mascagni and   
                 David Ceperley   Testing parallel random number
                                  generators . . . . . . . . . . . . . . . 69--94
  Ramachandran Vaidyanathan and   
            Jerry L. Trahan and   
                   Chun-ming Lu   Degree of scalability: scalable
                                  reconfigurable mesh algorithms for
                                  multiple addition and matrix--vector
                                  multiplication . . . . . . . . . . . . . 95--109
           Salma A. Ghoneim and   
             Hossam M. A. Fahmy   Job preemption, fast subcube compaction,
                                  or waiting in hypercube systems? A
                                  selection methodology  . . . . . . . . . 111--134
                  Heejo Lee and   
                   Jong Kim and   
               Sung Je Hong and   
                     Sunggu Lee   Task scheduling using a block dependency
                                  DAG for block-oriented sparse Cholesky
                                  factorization  . . . . . . . . . . . . . 135--159
                Oh-Han Kang and   
                    Si-Gwan Kim   A task duplication based scheduling
                                  algorithm for shared memory
                                  multiprocessors  . . . . . . . . . . . . 161--166
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 2, February, 2003

             Hongzhang Shan and   
         Jaswinder P. Singh and   
              Leonid Oliker and   
                   Rupak Biswas   Message passing and shared address space
                                  parallelism on an SMP cluster  . . . . . 167--186
              Olaf Bonorden and   
               Ben Juurlink and   
              Ingo von Otte and   
                   Ingo Rieping   The Paderborn University BSP (PUB)
                                  library  . . . . . . . . . . . . . . . . 187--207
           Fabrice Rastello and   
                   Amit Rao and   
                  Santosh Pande   Optimal task scheduling at run time to
                                  exploit intra-tile parallelism . . . . . 209--239
                D. González and   
                 F. Almeida and   
                  L. Moreno and   
                   C. Rodríguez   Towards the automatic optimal mapping of
                                  pipeline algorithms  . . . . . . . . . . 241--254
             Cosimo Anglano and   
            Claudio Casetti and   
            Emilio Leonardi and   
                     Fabio Neri   Network interface multicast protocols
                                  for wormhole-based networks of
                                  workstations . . . . . . . . . . . . . . 255--283
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 3, March, 2003

              Erik Reinhard and   
                     Dirk Bartz   Parallel graphics and visualisation  . . 285--288
                     Toshi Kato   ``Kilauea''----parallel global
                                  illumination renderer  . . . . . . . . . 289--310
                   M. Isard and   
                   M. Shand and   
                     A. Heirich   Distributed rendering of interactive
                                  soft shadows . . . . . . . . . . . . . . 311--323
           Wagner T. Corrêa and   
         James T. Klosowski and   
               Cláudio T. Silva   Out-of-core sort-first parallel
                                  rendering for cluster-based tiled
                                  displays . . . . . . . . . . . . . . . . 325--338
          Jürgen P. Schulze and   
                    Ulrich Lang   The parallelized perspective shear-warp
                                  algorithm for volume rendering . . . . . 339--354
                    Li Chen and   
            Issei Fujishiro and   
                 Kengo Nakajima   Optimizing parallel performance of
                                  unstructured volume rendering for the
                                  Earth Simulator  . . . . . . . . . . . . 355--371
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 4, April, 2003

                A. Migdalas and   
                 G. Toraldo and   
                       V. Kumar   Parallel computing in numerical
                                  optimization . . . . . . . . . . . . . . 373--373
                A. Migdalas and   
                 G. Toraldo and   
                       V. Kumar   Nonlinear optimization and parallel
                                  computing  . . . . . . . . . . . . . . . 375--391
                 R. M. Aiex and   
                  S. Binato and   
               M. G. C. Resende   Parallel GRASP with path-relinking for
                                  job shop scheduling  . . . . . . . . . . 393--430
                Jörgen Blomvall   A multistage stochastic programming
                                  algorithm suitable for parallel
                                  computing  . . . . . . . . . . . . . . . 431--445
          Ricardo C. Corrêa and   
          Fernando C. Gomes and   
      Carlos A. S. Oliveira and   
              Panos M. Pardalos   A parallel implementation of an
                                  asynchronous team to the point-to-point
                                  connection problem . . . . . . . . . . . 447--466
                M. D'Apuzzo and   
                      M. Marino   Parallel computational issues of an
                                  interior point method for solving large
                                  bound-constrained quadratic programming
                                  problems . . . . . . . . . . . . . . . . 467--483
                 C. Durazzi and   
                    V. Ruggiero   Numerical solution of special linear and
                                  quadratic programs via a parallel
                                  interior-point method  . . . . . . . . . 485--503
              Cristian Gatu and   
      Erricos J. Kontoghiorghes   Parallel algorithms for computing all
                                  possible subset regression models using
                                  the QR decomposition . . . . . . . . . . 505--521
               Susana Gómez and   
        Nelson del Castillo and   
        Longina Castellanos and   
                   Julio Solano   The parallel tunneling method  . . . . . 523--533
              G. Zanghirati and   
                       L. Zanni   A parallel solver for large quadratic
                                  programs in training support vector
                                  machines . . . . . . . . . . . . . . . . 535--551
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2
                      Anonymous   Obituary: Harry F. Jordan  . . . . . . . iii--iii

Parallel Computing
Volume 29, Number 5, May, 2003

            Gilbert Laporte and   
               Roberto Musmanno   Parallel computing in logistics  . . . . 553--554
          James F. Campbell and   
                Gary Stiehr and   
           Andreas T. Ernst and   
           Mohan Krishnamoorthy   Solving hub arc location problems on a
                                  cluster of workstations  . . . . . . . . 555--574
         Félix García-López and   
       Belén Melián-Batista and   
       José A. Moreno-Pérez and   
          J. Marcos Moreno-Vega   Parallelization of the scatter search
                                  for the $p$-median problem . . . . . . . 575--589
            Bernard Gendron and   
           Jean-Yves Potvin and   
                Patrick Soriano   A parallel hybrid heuristic for the
                                  multicommodity capacitated location
                                  problem with balancing requirements  . . 591--606
                   T. K. Ralphs   Parallel branch and cut for capacitated
                                  vehicle routing  . . . . . . . . . . . . 607--629
         Pierpaolo Caricato and   
           Gianpaolo Ghiani and   
             Antonio Grieco and   
             Emanuela Guerriero   Parallel tabu search for a pickup and
                                  delivery problem under track contention  631--639
               A. Bortfeldt and   
                 H. Gehring and   
                        D. Mack   A parallel tabu search algorithm for
                                  solving the container loading problem    641--662
               F. Guerriero and   
                     M. Mancini   A cooperative parallel rollout algorithm
                                  for the sequential ordering problem  . . 663--677
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 6, June, 2003

              Daisuke Takahashi   A parallel $1$-D FFT algorithm for the
                                  Hitachi SR8000 . . . . . . . . . . . . . 679--690
              Coskun Mermer and   
                Donglok Kim and   
                    Yongmin Kim   Efficient 2D FFT implementation on
                                  mediaprocessors  . . . . . . . . . . . . 691--709
                 P. H. Muir and   
               R. N. Pancer and   
                  K. R. Jackson   PMIRKDC: a parallel mono-implicit
                                  Runge--Kutta code with defect control
                                  for boundary value ODEs  . . . . . . . . 711--741
                A. Plastino and   
              C. C. Ribeiro and   
                   N. Rodriguez   Developing SPMD applications with load
                                  balancing  . . . . . . . . . . . . . . . 743--766
                  Naya Nagy and   
                   Selim G. Akl   The maximum flow problem: a real-time
                                  approach . . . . . . . . . . . . . . . . 767--794
               Bassel R. Arafeh   A task duplication scheme for resolving
                                  deadlocks in clustered DAGs  . . . . . . 795--820
                  Jung-Sheng Fu   Fault-tolerant cycle embedding in the
                                  hypercube  . . . . . . . . . . . . . . . 821--832
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 7, July, 2003

         Patrick R. Amestoy and   
               Iain S. Duff and   
      Jean-Yves L'Excellent and   
                   Xiaoye S. Li   Impact of the implementation of MPI
                                  point-to-point communications on the
                                  performance of two general sparse
                                  solvers  . . . . . . . . . . . . . . . . 833--849
               James Kohout and   
                 Alan D. George   A high-performance communication service
                                  for parallel computing on distributed
                                  DSP systems  . . . . . . . . . . . . . . 851--878
     Christopher J. Freitas and   
          Derrick B. Coffin and   
              Richard L. Murphy   The characterization of a wide area
                                  network computation  . . . . . . . . . . 879--894
       Lúcia M. A. Drummond and   
              Valmir C. Barbosa   On reducing the complexity of matrix
                                  clocks . . . . . . . . . . . . . . . . . 895--905
              Manuel Prieto and   
           Ruben S. Montero and   
        Ignacio M. Llorente and   
               Francisco Tirado   A parallel multigrid solver for viscous
                                  flows on anisotropic structured grids    907--923
                Manuel Díaz and   
            Bartolomé Rubio and   
              Enrique Soler and   
                  José M. Troya   Domain interaction patterns to
                                  coordinate HPF tasks . . . . . . . . . . 925--951
                   Y. Tseng and   
               R. F. DeMara and   
                   P. J. Wilder   Distributed-sum termination detection
                                  supporting multithreaded execution . . . 953--968
        Wolfgang Blochinger and   
               Carsten Sinz and   
               Wolfgang Küchlin   Parallel propositional satisfiability
                                  checking with distributed dynamic
                                  learning . . . . . . . . . . . . . . . . 969--994
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 8, August, 2003

                  M. Govett and   
                    L. Hart and   
               T. Henderson and   
              J. Middlecoff and   
                    D. Schaffer   The Scalable Modeling System:
                                  directive-based code parallelization for
                                  distributed and shared memory computers  995--1020
      Jorge Buenabad-Chávez and   
             Henk L. Muller and   
        Paul W. A. Stallard and   
             David H. D. Warren   Virtual memory on data diffusion
                                  architectures  . . . . . . . . . . . . . 1021--1052
               M. Yamashita and   
                K. Fujisawa and   
                      M. Kojima   SDPARA: SemiDefinite Programming
                                  Algorithm paRAllel version . . . . . . . 1053--1067
                V. Teulière and   
                   Olivier Brun   Parallelisation of the particle
                                  filtering technique and application to
                                  Doppler-bearing tracking of maneuvering
                                  sources  . . . . . . . . . . . . . . . . 1069--1090
                 Liang Peng and   
              Weng-Fai Wong and   
               Chung-Kwong Yuen   SilkRoad II: mixed paradigm cluster
                                  computing with RC_dag consistency  . . . 1091--1115
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 9, September, 2003

               Peter Arbenz and   
     Efstratios Gallopoulos and   
           Bernard Philippe and   
                    Yousef Saad   Parallel Matrix Algorithms and
                                  Applications (PMAA '02)  . . . . . . . . 1117--1119
           Olivier Beaumont and   
             Arnaud Legrand and   
                    Yves Robert   Scheduling divisible workloads on
                                  heterogeneous platforms  . . . . . . . . 1121--1152
               Martin Bečka and   
                   Gabriel Okša   On variable blocking factor in a
                                  parallel dynamic block-Jacobi SVD
                                  algorithm  . . . . . . . . . . . . . . . 1153--1174
            Olivier Coulaud and   
            Michaël Dussere and   
               Pascal Hénon and   
              Erik Lefebvre and   
                     Jean Roman   Optimization of a kinetic laser--plasma
                                  interaction code for large parallel
                                  systems  . . . . . . . . . . . . . . . . 1175--1189
           Abdou Guermouche and   
      Jean-Yves L'Excellent and   
                      Gil Utard   Impact of reordering on the memory of a
                                  multifrontal solver  . . . . . . . . . . 1191--1218
             Hemant Mahawar and   
                    Vivek Sarin   Parallel iterative methods for dense
                                  linear systems in inductance extraction  1219--1235
           James R. McCombs and   
           Andreas Stathopoulos   Parallel, multigrain iterative solvers
                                  for hiding network latencies on MPPs and
                                  networks of clusters . . . . . . . . . . 1237--1259
    Sreekanth R. Sambavaram and   
                Vivek Sarin and   
                Ahmed Sameh and   
                   Ananth Grama   Multipole-based preconditioners for
                                  large sparse linear systems  . . . . . . 1261--1273
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 10, October, 2003

            Andrea Clematis and   
               Mike Mineter and   
               Richard Marciano   High performance computing with
                                  geographical data  . . . . . . . . . . . 1275--1279
                   K. C. Clarke   Geocomputation's future at the extremes:
                                  high performance computing and
                                  nanoclients  . . . . . . . . . . . . . . 1281--1295
          Kenneth A. Hawick and   
           P. D. Coddington and   
                    H. A. James   Distributed frameworks and parallel
                                  algorithms for processing large-scale
                                  geographic data  . . . . . . . . . . . . 1297--1333
              Ann Chervenak and   
                Ewa Deelman and   
             Carl Kesselman and   
               Bill Allcock and   
                 Ian Foster and   
          Veronika Nefedova and   
                  Jason Lee and   
                   Alex Sim and   
              Arie Shoshani and   
                  Bob Drach and   
                         others   High-performance remote access to
                                  climate simulation data: a challenge
                                  problem for data grid technologies . . . 1335--1356
           Giovanni Aloisio and   
                 Massimo Cafaro   A dynamic earth observation system . . . 1357--1362
       Asvin Ananthanarayan and   
         Rajiv Balachandran and   
            Robert Grossman and   
                 Yunhong Gu and   
                Xinwei Hong and   
               Jorge Levera and   
                 Marco Mazzucco   Data webs for earth science data . . . . 1363--1379
               Erik G. Hoel and   
                    Hanan Samet   Data-parallel polygonization . . . . . . 1381--1401
           Giuseppe Dattilo and   
          Giandomenico Spezzano   Simulation of a cellular landslide model
                                  with CAMELOT on high performance
                                  computers  . . . . . . . . . . . . . . . 1403--1418
     Apostolos Papadopoulos and   
            Yannis Manolopoulos   Parallel bulk-loading of spatial data    1419--1444
              Mark Lanthier and   
             Doron Nussbaum and   
              Jörg-Rüdiger Sack   Parallel implementation of geometric
                                  shortest path algorithms . . . . . . . . 1445--1479
               Shaowen Wang and   
              Marc P. Armstrong   A quadtree approach to domain
                                  decomposition for spatial interpolation
                                  in Grid computing environments . . . . . 1481--1504
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 29, Number 11--12, November / December, 2003

           Laurence T. Yang and   
                     Yi Pan and   
                      Minyi Guo   Parallel and distributed scientific and
                                  engineering computing  . . . . . . . . . 1505--1508
          Yoshiyuki Iwamoto and   
                Koichi Suga and   
            Kanemitsu Ootsu and   
             Takashi Yokota and   
                  Takanobu Baba   Receiving message prediction method  . . 1509--1538
                 Yudong Sun and   
                    Cho-Li Wang   Solving irregularly structured problems
                                  based on distributed object model  . . . 1539--1562
               Weijian Fang and   
                Cho-Li Wang and   
              Francis C. M. Lau   On the design of global object space for
                                  efficient multi-threading Java computing
                                  on clusters  . . . . . . . . . . . . . . 1563--1587
                   Fan Chan and   
               Jiannong Cao and   
                     Yudong Sun   High-level abstractions for
                                  message-passing parallel programming . . 1589--1621
               Xiaohui Shen and   
                 Alok Choudhary   A distributed multi-storage I/O system
                                  for data intensive scientific computing  1623--1643
         Patrick R. Amestoy and   
               Iain S. Duff and   
            Stéphane Pralet and   
                 Christof Vömel   Adapting a parallel sparse direct solver
                                  to architectures with clusters of SMPs   1645--1668

Parallel Computing
Volume 30, Number 1, January, 2004

               Suchuan Dong and   
          George Em Karniadakis   Dual-level parallelism for high-order
                                  CFD methods  . . . . . . . . . . . . . . 1--20
                 V. A. Pais and   
                N. Fournier and   
               M. A. Sutton and   
               K. J. Weston and   
                   U. Dragosits   Using High Performance Fortran to
                                  parallelise a multi-layer atmospheric
                                  transport model  . . . . . . . . . . . . 21--33
        Milan D. Mihajlović and   
             David J. Silvester   Efficient parallel solvers for the
                                  biharmonic equation  . . . . . . . . . . 35--55
            Michel Toulouse and   
     Teodor Gabriel Crainic and   
                 Brunilde Sansó   Systemic behavior of cooperative search
                                  algorithms . . . . . . . . . . . . . . . 57--79
              Oliver Sinnen and   
                   Leonel Sousa   List scheduling: extension for
                                  contention awareness and evaluation of
                                  node priorities for heterogeneous
                                  cluster architectures  . . . . . . . . . 81--101
           Frédéric Guinand and   
               Aziz Moukrim and   
                Eric Sanlaville   Sensitivity analysis of tree scheduling
                                  on two machines with communication
                                  delays . . . . . . . . . . . . . . . . . 103--120
               Yang-Suk Kee and   
                Jin-Soo Kim and   
                     Soonhoi Ha   Memory management for multi-threaded
                                  software DSM systems . . . . . . . . . . 121--138
                   Eric Violard   A semantic framework to address data
                                  locality in data parallel languages  . . 139--161
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 2, February, 2004

                Jörg Wensch and   
                  Ben Sommeijer   Parallel simulation of axon growth in
                                  the nervous system . . . . . . . . . . . 163--186
              Javier Cuenca and   
            Domingo Giménez and   
                  José González   Architecture of an automatically tuned
                                  linear algebra library . . . . . . . . . 187--210
           Maria Calzarossa and   
              Luisa Massari and   
                Daniele Tessera   A methodology towards automatic
                                  performance analysis of parallel
                                  applications . . . . . . . . . . . . . . 211--223
             B. B. Fraguela and   
                  R. Doallo and   
                 J. Touriño and   
                   E. L. Zapata   A compiler tool to predict memory
                                  hierarchy performance of scientific
                                  codes  . . . . . . . . . . . . . . . . . 225--248
                   N. Tomov and   
                E. Dempster and   
             M. H. Williams and   
                  A. Burger and   
                  H. Taylor and   
              P. J. B. King and   
                   P. Broughton   Analytical response time estimation in
                                  parallel relational database systems . . 249--283
               Kentaro Sano and   
           Yusuke Kobayashi and   
                 Tadao Nakamura   Differential coding scheme for efficient
                                  parallel image composition on a PC
                                  cluster system . . . . . . . . . . . . . 285--299
     Alexandros V. Gerbessiotis   Architecture independent parallel
                                  binomial tree option price valuations    301--316
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 3, March, 2004

            Lieven Eeckhout and   
              Koen De Bosschere   Efficient simulation of trace samples on
                                  parallel machines  . . . . . . . . . . . 317--335
                  V. Blanco and   
             J. A. González and   
                    C. León and   
               C. Rodríguez and   
               G. Rodríguez and   
                   M. Printista   Predicting the performance of parallel
                                  programs . . . . . . . . . . . . . . . . 337--356
                 Eddy Caron and   
                      Gil Utard   On the performance of parallel
                                  factorization of out-of-core matrices    357--375
           Andrea Attanasio and   
      Jean-François Cordeau and   
           Gianpaolo Ghiani and   
                Gilbert Laporte   Parallel Tabu search heuristics for the
                                  dynamic multi-vehicle dial-a-ride
                                  problem  . . . . . . . . . . . . . . . . 377--387
                    Murray Cole   Bringing skeletons out of the closet: a
                                  pragmatic manifesto for skeletal
                                  parallel programming . . . . . . . . . . 389--406
             Sun-Yuan Hsieh and   
                  Chun-Hua Chen   Pancyclicity on Möbius cubes with maximal
                                  edge faults  . . . . . . . . . . . . . . 407--421
                Jipeng Zhou and   
              Francis C. M. Lau   Multi-phase minimal fault-tolerant
                                  wormhole routing in meshes . . . . . . . 423--442
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 4, April, 2004

           Valerie Guralnik and   
                 George Karypis   Parallel tree-projection-based sequence
                                  mining algorithms  . . . . . . . . . . . 443--472
                Gwan-Hwan Hwang   An efficient algorithm for communication
                                  set generation of data parallel programs
                                  with block-cyclic distribution . . . . . 473--501
                  V. Dolean and   
                     S. Lanteri   Parallel multigrid methods for the
                                  calculation of unsteady flows on
                                  unstructured grids: algorithmic aspects
                                  and parallel performances on clusters of
                                  PCs  . . . . . . . . . . . . . . . . . . 503--525
            Rong-Guey Chang and   
           Tyng-Ruey Chuang and   
                  Jenq Kuen Lee   Support and optimization for parallel
                                  sparse programs with array intrinsics of
                                  Fortran 90 . . . . . . . . . . . . . . . 527--550
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 5--6, May / June, 2004

           Albert Y. Zomaya and   
               Fikret Ercal and   
               El-ghazali Talbi   Parallel and nature-inspired
                                  computational paradigms and applications 551--552
              V. Di Martino and   
                   M. Mililotti   Sub optimal scheduling in a grid using
                                  genetic algorithms . . . . . . . . . . . 553--565
                 Michelle Moore   An accurate parallel genetic algorithm
                                  to schedule tasks on a cluster . . . . . 567--583
                 P. Morillo and   
               J. M. Orduña and   
                   M. Fernández   A comparison study of evolutive
                                  algorithms for solving the partitioning
                                  problem in distributed virtual
                                  environment systems  . . . . . . . . . . 585--610
                    E. Alba and   
                   G. Luque and   
                    J. M. Troya   Parallel LAN/WAN heuristics for
                                  optimization . . . . . . . . . . . . . . 611--628
         Azzedine Boukerche and   
   Kathia Regina Lemos Jucá and   
          João Bosco Sobral and   
Mirela Sechi Moretti Annoni Notare   An artificial immune based intrusion
                                  detection model for computer and
                                  telecommunication systems  . . . . . . . 629--646
                 Sven E. Eklund   A massively parallel architecture for
                                  distributed genetic algorithms . . . . . 647--676
                   S. Cahon and   
                   N. Melab and   
                    E.-G. Talbi   Building with ParadisEO reusable
                                  parallel and distributed evolutionary
                                  algorithms . . . . . . . . . . . . . . . 677--697
                    E. Alba and   
                    F. Luna and   
                A. J. Nebro and   
                    J. M. Troya   Parallel heterogeneous genetic
                                  algorithms for continuous optimization   699--719
           F. de Toro Negro and   
                  J. Ortega and   
                     E. Ros and   
                    S. Mota and   
                B. Paechter and   
                   J. M. Martín   PSFGA: Parallel processing and
                                  evolutionary computation for
                                  multiobjective optimisation  . . . . . . 721--739
                   Xin-She Yang   Pattern formation in enzyme inhibition
                                  and cooperativity with parallel cellular
                                  automata . . . . . . . . . . . . . . . . 741--751
      Franciszek Seredynski and   
              Pascal Bouvry and   
               Albert Y. Zomaya   Cellular automata computations and
                                  secret key cryptography  . . . . . . . . 753--766
                Tiago Sousa and   
              Arlindo Silva and   
                      Ana Neves   Particle swarm-based data mining
                                  algorithms for classification tasks  . . 767--783
              Peter Korošec and   
                 Jurij Šilc and   
                    Borut Robič   Solving the mesh-partitioning problem
                                  with an ant-colony algorithm . . . . . . 785--801
            Forbes J. Burkowski   Proximity and priority: applying a gene
                                  expression algorithm to the Traveling
                                  Salesperson Problem  . . . . . . . . . . 803--816
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 7, July, 2004

          Matthew L. Massie and   
              Brent N. Chun and   
                David E. Culler   The ganglia distributed monitoring
                                  system: design, implementation, and
                                  experience . . . . . . . . . . . . . . . 817--840
          Gerassimos Barlas and   
           Bharadwaj Veeravalli   Quantized load distribution for tree and
                                  bus-connected processors . . . . . . . . 841--865
         Nihar R. Mahapatra and   
                  Shantanu Dutt   Adaptive Quality Equalizing:
                                  High-performance load balancing for
                                  parallel branch-and-bound across
                                  applications and computing systems . . . 867--881
             Ching-Wen Chen and   
                  Shih-Chang Fu   A minimal links traversed dynamic
                                  rerouting network  . . . . . . . . . . . 883--898
           Michael Mascagni and   
               Ashok Srinivasan   Parameterizing parallel multiplicative
                                  lagged-Fibonacci generators  . . . . . . 899--916
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 8, August, 2004

             Gerhard R. Joubert   Editorial note . . . . . . . . . . . . . 917--918
              Peter Korošec and   
                 Jurij Šilc and   
                    Borut Robič   ``Solving the mesh-partitioning problem
                                  with an ant-colony algorithm'' [Parallel
                                  Computing 30 (2004) 785--801]  . . . . . 919--921
            Stéphane Genaud and   
             Arnaud Giersch and   
                Frédéric Vivien   Load-balancing scatter operations for
                                  grid computing . . . . . . . . . . . . . 923--946
                   Ming Zhu and   
       Constantine Katsinis and   
                Wentong Cai and   
                    Bu-Sung Lee   Key Messaging on SOME-Bus clusters . . . 947--971
        Teofilo F. Gonzalez and   
                   David Serena   $n$-Cube network: node disjoint shortest
                                  paths for maximal distance pairs of
                                  vertices . . . . . . . . . . . . . . . . 973--998
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 9--10, September / October, 2004

             Chun-Hsi Huang and   
        Sanguthevar Rajasekaran   High-performance parallel bio-computing  999--1000
              Mark L. Green and   
                    Russ Miller   Molecular structure determination on a
                                  computational and data grid  . . . . . . 1001--1017
            Werner Dubitzky and   
             Damian McCourt and   
            Mykola Galushka and   
           Mathilde Romberg and   
                 Bernd Schuller   Grid-enabled data warehousing for
                                  molecular engineering  . . . . . . . . . 1019--1035
       Alfredo Tirado-Ramos and   
          Peter M. A. Sloot and   
         Alfons G. Hoekstra and   
                   Marian Bubak   An integrative approach to
                                  high-performance biomedical problem
                                  solving environments on the Grid . . . . 1037--1055
              Mark L. Green and   
                    Russ Miller   Evolutionary molecular structure
                                  determination using grid-enabled data
                                  mining . . . . . . . . . . . . . . . . . 1057--1071
              David Piggott and   
              Conor Teljeur and   
                     Alan Kelly   Exploring the potential for using the
                                  grid to support health impact assessment
                                  modelling  . . . . . . . . . . . . . . . 1073--1091
                    N. Jacq and   
                C. Blanchet and   
                  C. Combet and   
               E. Cornillot and   
                   L. Duret and   
                  K. Kurata and   
                H. Nakamura and   
               T. Silvestre and   
                      V. Breton   Grid as a bioinformatic tool . . . . . . 1093--1107
                  Minyi Guo and   
      Michael (Shan-Hui) Ho and   
                Weng-Long Chang   Fast parallel molecular solution to the
                                  dominating-set problem on massively
                                  parallel bio-computing . . . . . . . . . 1109--1125
               Chain-Wu Lee and   
                 Chun-Hsi Huang   Toward cooperative genomic knowledge
                                  inference  . . . . . . . . . . . . . . . 1127--1135
             John H. Miller and   
                     Fang Zheng   Large-scale simulations of cellular
                                  signaling processes  . . . . . . . . . . 1137--1149
            Peter K. K. Loh and   
                      W. J. Hsu   Fault-tolerant routing for complete
                                  Josephus Cubes . . . . . . . . . . . . . 1151--1167
                      Anonymous   Editorial Board  . . . . . . . . . . . . CO2--CO2

Parallel Computing
Volume 30, Number 11, November, 2004

      Jorge Buenabad-Chávez and   
             Henk L. Muller and   
        Paul W. A. Stallard and   
             David H. D. Warren   The diffusion space of data diffusion
                                  architectures  . . . . . . . . . . . . . 1169--1193
         Alexey Lastovetsky and   
                     Ravi Reddy   On performance analysis of heterogeneous
                                  parallel algorithms  . . . . . . . . . . 1195--1216
           Michael Mascagni and   
                    Hongmei Chi   Parallel linear congruential generators
                                  with Sophie-Germain moduli . . . . . . . 1217--1231
    Suchendra M. Bhandarkar and   
      Shankar R. Chandrasekaran   Parallel parsing of MPEG video on a
                                  shared-memory symmetric multiprocessor   1233--1276

Parallel Computing
Volume 30, Number 12, December, 2004

          Masaaki Shimasaki and   
                   Hans P. Zima   The Earth Simulator  . . . . . . . . . . 1277--1278
                   Tetsuya Sato   The Earth Simulator: roles and impacts   1279--1286
            Shinichi Habata and   
           Kazuhiko Umezawa and   
            Mitsuo Yokokawa and   
             Shigemune Kitawaki   Hardware system of the Earth Simulator   1287--1313
           Takashi Yanagawa and   
                  Kenji Suehiro   Software system of the Earth Simulator   1315--1327
                 K. Itakura and   
                     A. Uno and   
                M. Yokokawa and   
                T. Ishihara and   
                      Y. Kaneda   Scalability of hybrid programming for a
                                  CFD code on the Earth Simulator  . . . . 1329--1343
              Akiyoshi Wakatani   A parallel and scalable algorithm for
                                  ADI method with pre-propagation and
                                  message vectorization  . . . . . . . . . 1345--1359
               Kentaro Sano and   
            Shintaro Momose and   
          Hiroyuki Takizawa and   
          Hiroaki Kobayashi and   
                 Tadao Nakamura   Efficient parallel processing of
                                  competitive learning algorithms  . . . . 1361--1383

Parallel Computing
Volume 31, Number 1, January, 2005

                Jürg Hutter and   
             Alessandro Curioni   Dual-level parallelism for ab initio
                                  molecular dynamics: Reaching teraflop
                                  performance with the CPMD code . . . . . 1--17
               Fumihiko Ino and   
              Kanrou Ooyama and   
               Kenichi Hagihara   A data distributed parallel algorithm
                                  for nonrigid image registration  . . . . 19--43
                 M. Salomon and   
                   F. Heitz and   
               G.-R. Perrin and   
                 J.-P. Armspach   A massively parallel approach to
                                  deformable matching of $3$D medical
                                  images via stochastic differential
                                  equations  . . . . . . . . . . . . . . . 45--71
          Stéphane Guyetant and   
             Mathieu Giraud and   
            Ludovic L'Hours and   
             Steven Derrien and   
            Stéphane Rubini and   
         Dominique Lavenier and   
             Frédéric Raimbault   Cluster of re-configurable nodes for
                                  scanning large genomic banks . . . . . . 73--96
             Andrea Di Blas and   
                Arun Jagota and   
                 Richard Hughey   Optimizing neural networks on SIMD
                                  parallel computers . . . . . . . . . . . 97--115
         Michihiro Koibuchi and   
              Akiya Jouraku and   
                 Hideharu Amano   Path selection algorithm: the strategy
                                  for designing deterministic routing from
                                  alternative paths  . . . . . . . . . . . 117--130
              Hong-Chun Hsu and   
          Liang-Chih Chiang and   
            Jimmy J. M. Tan and   
                  Lih-Hsing Hsu   Fault hamiltonicity of augmented cubes   131--145

Parallel Computing
Volume 31, Number 2, February, 2005

               Bruno Raffin and   
               Han-Wei Shen and   
                     Dirk Bartz   Parallel graphics and visualization  . . 147--148
                T. Furumura and   
                        L. Chen   Parallel simulation of strong ground
                                  motions during recent and historical
                                  damaging earthquakes in Tokyo, Japan . . 149--165
                Hongfeng Yu and   
                    Kwan-Liu Ma   A study of I/O methods for parallel
                                  visualization of large-scale data  . . . 167--183
                 Jinzhu Gao and   
                Chaoli Wang and   
                    Liya Li and   
                   Han-Wei Shen   A parallel multiresolution volume
                                  rendering algorithm for large data
                                  visualization  . . . . . . . . . . . . . 185--204
               M. Strengert and   
                M. Magallón and   
                D. Weiskopf and   
               Stefan Guthe and   
                        T. Ertl   Large volume visualization of compressed
                                  time-dependent datasets on GPU clusters  205--219
           David E. DeMarle and   
      Christiaan P. Gribble and   
             Solomon Boulos and   
               Steven G. Parker   Memory sharing for interactive ray
                                  tracing on clusters  . . . . . . . . . . 221--242
                Kevin Liang and   
            Patricia Monger and   
                  Hugh Couchman   Interactive parallel visualization of
                                  large particle datasets  . . . . . . . . 243--260

Parallel Computing
Volume 31, Number 3--4, March / April, 2005

           Erich Strohmaier and   
           Jack J. Dongarra and   
              Hans W. Meuer and   
                 Horst D. Simon   Recent trends in the marketplace of high
                                  performance computing  . . . . . . . . . 261--273
               Iain S. Duff and   
              Jennifer A. Scott   Stabilized bordered block diagonal forms
                                  for parallel sparse solvers  . . . . . . 275--289
                Arijit Laha and   
                Amitava Sen and   
               Bhabani P. Sinha   Parallel algorithms for identifying
                                  convex and non-convex basis polygons in
                                  an image . . . . . . . . . . . . . . . . 290--310
            Bhanu Hariharan and   
                 Srinivas Aluru   Efficient parallel algorithms and
                                  software for compressed octrees with
                                  applications to hierarchical methods . . 311--331
                 Li Chunlin and   
                      Li Layuan   A distributed utility-based two level
                                  market solution for optimal resource
                                  scheduling in computational grid . . . . 332--351
         Takashi Midorikawa and   
          Daisuke Shiraishi and   
          Masayoshi Shigeno and   
              Yasuki Tanabe and   
           Toshihiro Hanawa and   
                 Hideharu Amano   The performance of SNAIL-2 (a S2SS-MIN
                                  connected multiprocessor with cache
                                  coherent mechanism)  . . . . . . . . . . 352--370
           Yuan-Hsiang Teng and   
            Jimmy J. M. Tan and   
                  Lih-Hsing Hsu   Honeycomb rectangular disks  . . . . . . 371--388
                 Dong Xiang and   
                    Ai Chen and   
                   Jiaguang Sun   Fault-tolerant routing and multicasting
                                  in hypercubes using a partial path
                                  set-up . . . . . . . . . . . . . . . . . 389--411

Parallel Computing
Volume 31, Number 5, May, 2005

             Daniel A. Reed and   
             Mitsuhisa Sato and   
                 Denis Trystram   Editorial  . . . . . . . . . . . . . . . 413--413
              Margreet Nool and   
            Michael M. J. Proot   A parallel least-squares spectral
                                  element solver for incompressible flow
                                  problems on unstructured grids . . . . . 414--438
            Jacques M. Bahi and   
   Sylvain Contassot-Vivier and   
              Raphaël Couturier   Evaluation of the asynchronous iterative
                                  algorithms in the context of distant
                                  heterogeneous clusters . . . . . . . . . 439--461
              Ghazi Al-Rawi and   
                John Cioffi and   
                  Mark Horowitz   On task mapping optimization for
                                  parallel decoding of low-density
                                  parity-check codes on message-passing
                                  architectures  . . . . . . . . . . . . . 462--490
               Josef Kohout and   
          Ivana Kolingerová and   
                      Jiří Žára   Parallel Delaunay triangulation in $E^2$
                                  and $E^3$ for computers with shared
                                  memory . . . . . . . . . . . . . . . . . 491--522
                      Z. Du and   
                         F. Lin   A novel parallelization approach for
                                  hierarchical clustering  . . . . . . . . 523--527

Parallel Computing
Volume 31, Number 6, June, 2005

       Sanya Tangpongprasit and   
          Takahiro Katagiri and   
                 Kenji Kise and   
               Hiroki Honda and   
                Toshitsugu Yuba   A time-to-live based reservation
                                  algorithm on fully decentralized
                                  resource discovery in Grid computing . . 529--543
                Oscar Plata and   
              Rafael Asenjo and   
           Eladio Gutiérrez and   
          Francisco Corbera and   
            Angeles Navarro and   
               Emilio L. Zapata   On the parallelization of irregular and
                                  dynamic programs . . . . . . . . . . . . 544--562
                 J. Verkaik and   
                      H. X. Lin   A class of novel parallel algorithms for
                                  the solution of tridiagonal systems  . . 563--587
              Robert W. Numrich   Parallel numerical algorithms based on
                                  tensor notation and Co-Array Fortran
                                  syntax . . . . . . . . . . . . . . . . . 588--607
        Marcello Balduccini and   
            Enrico Pontelli and   
              Omar Elkhatib and   
                        Hung Le   Issues in parallel execution of
                                  non-monotonic reasoning systems  . . . . 608--647

Parallel Computing
Volume 31, Number 7, July, 2005

             Alexey Kalinov and   
         Alexey Lastovetsky and   
                    Yves Robert   Heterogeneous computing  . . . . . . . . 649--652
                  T. Hagras and   
                     J. Janeček   A high performance, low complexity
                                  algorithm for compile-time task
                                  scheduling in heterogeneous systems  . . 653--670
                  S. Shivle and   
               P. Sugavanam and   
               H. J. Siegel and   
          A. A. Maciejewski and   
                   T. Banka and   
                 K. Chindam and   
               S. Dussinger and   
                 A. Kutruff and   
              P. Penumarthy and   
               P. Pichumani and   
            P. Satyasekaran and   
                  D. Sendek and   
                   J. Smith and   
                   J. Sousa and   
               J. Sridharan and   
                     J. Velazco   Mapping subtasks with multiple versions
                                  on an ad hoc grid  . . . . . . . . . . . 671--690
        Yoshinori Kishimoto and   
               Shuichi Ichikawa   Optimizing the configuration of a
                                  heterogeneous cluster with
                                  multiprocessing and execution-time
                                  estimation . . . . . . . . . . . . . . . 691--710
              Javier Cuenca and   
            Domingo Giménez and   
            Juan-Pedro Martínez   Heuristics for work distribution of a
                                  homogeneous parallel dynamic programming
                                  scheme on heterogeneous systems  . . . . 711--735
            Ioana Banicescu and   
        Ricolindo L. Cariño and   
         Jaderick P. Pabico and   
      Mahadevan Balasubramaniam   Design and implementation of a novel
                                  dynamic load balancing library for
                                  cluster computing  . . . . . . . . . . . 736--756
            M-Tahar Kechadi and   
                Ilias K. Savvas   Dynamic task scheduling for irregular
                                  network topologies . . . . . . . . . . . 757--776
              A. Srinivasan and   
                     N. Chandra   Latency tolerance through
                                  parallelization of time in scientific
                                  applications . . . . . . . . . . . . . . 777--796
                     Han Yu and   
                    Xin Bai and   
               Dan C. Marinescu   Workflow management and resource
                                  discovery for an intelligent grid  . . . 797--811

Parallel Computing
Volume 31, Number 8--9, August / September, 2005

              Bruno Richard and   
           Nicolas Maillard and   
        César A. F. De Rose and   
                Reynaldo Novaes   The I-Cluster Cloud: distributed
                                  management of idle resources for intense
                                  computing  . . . . . . . . . . . . . . . 813--838
                 Z. G. Wang and   
                 Y. S. Wong and   
                      M. Rahman   Development of a parallel optimization
                                  method based on genetic simulated
                                  annealing algorithm  . . . . . . . . . . 839--857
               J. C. Pichel and   
                D. B. Heras and   
            J. C. Cabaleiro and   
                   F. F. Rivera   Performance optimization of irregular
                                  codes based on the combination of
                                  reordering and blocking techniques . . . 858--876
               G. L. Reijns and   
            A. J. C. van Gemund   Predicting the execution times of
                                  parallel-independent programs using
                                  Pearson distributions  . . . . . . . . . 877--899
                 Uroš Čibej and   
            Boštjan Slivnik and   
                    Borut Robič   The complexity of static data
                                  replication in data grids  . . . . . . . 900--912
              Jürgen Dreher and   
                  Rainer Grauer   Racoon: a parallel mesh-adaptive
                                  framework for hyperbolic conservation
                                  laws . . . . . . . . . . . . . . . . . . 913--932
                       Tao Dong   A linear time pessimistic one-step
                                  diagnosis algorithm for hypercube
                                  multicomputer systems  . . . . . . . . . 933--947
           Hayedeh Ahrabian and   
           Abbas Nowzari-Dalini   Parallel generation of binary trees in
                                  $A$-order  . . . . . . . . . . . . . . . 948--955

Parallel Computing
Volume 31, Number 10--12, October / December, 2005

         Barbara M. Chapman and   
             Federico Massaioli   OpenMP . . . . . . . . . . . . . . . . . 957--959
                Xinmin Tian and   
          Jay P. Hoeflinger and   
                 Grant Haab and   
             Yen-Kuang Chen and   
              Milind Girkar and   
                    Sanjiv Shah   A compiler for exploiting nested
                                  parallelism in OpenMP programs . . . . . 960--983
                R. Blikberg and   
                     T. Sørevik   Load balancing and OpenMP implementation
                                  of nested parallelism  . . . . . . . . . 984--998
            C. S. Ierotheou and   
                     H. Jin and   
                G. Matthews and   
              S. P. Johnson and   
                        R. Hood   Generating OpenMP code using an
                                  interactive parallelization environment  999--1012
               Rocco Aversa and   
       Beniamino Di Martino and   
           Massimiliano Rak and   
      Salvatore Venticinque and   
                Umberto Villano   Performance prediction through
                                  simulation of a hybrid MPI/OpenMP
                                  application  . . . . . . . . . . . . . . 1013--1033
               Rocco Aversa and   
       Beniamino Di Martino and   
            Nicola Mazzocca and   
          Salvatore Venticinque   A hierarchical distributed-shared memory
                                  parallel Branch & Bound application with
                                  PVM and OpenMP for multiprocessor
                                  clusters . . . . . . . . . . . . . . . . 1034--1047
                 Kengo Nakajima   Parallel iterative solvers for
                                  finite-element methods using an
                                  OpenMP/MPI hybrid programming model on
                                  the Earth Simulator  . . . . . . . . . . 1048--1065
         Federico Massaioli and   
        Filippo Castiglione and   
              Massimo Bernaschi   OpenMP parallelization of agent-based
                                  models . . . . . . . . . . . . . . . . . 1066--1081
              Roland Norcen and   
                    Andreas Uhl   High performance JPEG 2000 and MPEG-4
                                  VTC on SMPs using OpenMP . . . . . . . . 1082--1098
                  Inho Park and   
                  Seon Wook Kim   Study of OpenMP applications on the
                                  InfiniBand-based software distributed
                                  shared-memory system . . . . . . . . . . 1099--1113
                  Lei Huang and   
            Barbara Chapman and   
                   Zhenying Liu   Towards a more efficient implementation
                                  of OpenMP for clusters via translation
                                  to global arrays . . . . . . . . . . . . 1114--1139
            Motonori Hirano and   
             Mitsuhisa Sato and   
                  Yoshio Tanaka   OpenGR: a directive-based grid
                                  programming environment  . . . . . . . . 1140--1154
          P. E. Hadjidoukas and   
            T. S. Papatheodorou   OpenMP extensions for master-slave
                                  message passing computing  . . . . . . . 1155--1167

Parallel Computing
Volume 32, Number 1, January, 2006

                      Anonymous   Editorial Board  . . . . . . . . . . . . iv--vi
                P. Wapperom and   
                A. N. Beris and   
                   M. A. Straka   A new transpose split method for
                                  three-dimensional FFTs: performance on
                                  an Origin2000 and Alphaserver cluster    1--13
             Chun-Hsi Huang and   
                     Xin He and   
                       Min Qian   Communication-optimal parallel
                                  parenthesis matching . . . . . . . . . . 14--23
            Kazuhide Nakata and   
           Makoto Yamashita and   
           Katsuki Fujisawa and   
                Masakazu Kojima   A parallel primal-dual interior-point
                                  method for semidefinite programs using
                                  positive definite matrix completion  . . 24--43
          Valmir C. Barbosa and   
     Fernando M. N. Miranda and   
         Matheus C. M. Agostini   Cell-centric heuristics for the
                                  classification of cellular automata  . . 44--66
            L. Carracciuolo and   
                 L. D'Amore and   
                       A. Murli   Towards a parallel component for imaging
                                  in PETSc programming environment: a case
                                  study in $3$-D echocardiography  . . . . 67--83
                 Sun-Yuan Hsieh   Fault-tolerant cycle embedding in the
                                  hypercube with more both faulty vertices
                                  and faulty edges . . . . . . . . . . . . 84--91
          Takahiro Katagiri and   
                 Kenji Kise and   
               Hiroki Honda and   
                Toshitsugu Yuba   ABCLibScript: a directive to support
                                  specification of an auto-tuning facility
                                  for numerical software . . . . . . . . . 92--112

Parallel Computing
Volume 32, Number 2, February, 2006

              Maurice Clint and   
     Efstratios Gallopoulos and   
                  Esmond Ng and   
                     Jean Roman   Parallel Matrix Algorithms and
                                  Applications (PMAA'04) . . . . . . . . . 113--114
                  Asad Awan and   
        Ronaldo A. Ferreira and   
         Suresh Jagannathan and   
                   Ananth Grama   Unstructured peer-to-peer networks for
                                  sharing processor cycles . . . . . . . . 115--135
         Patrick R. Amestoy and   
           Abdou Guermouche and   
      Jean-Yves L'Excellent and   
                Stéphane Pralet   Hybrid scheduling for the parallel
                                  solution of linear systems . . . . . . . 136--156
               Peter Arbenz and   
               Martin Bečka and   
                 Roman Geus and   
           Ulrich Hetmaniuk and   
               Tiziano Mengotti   On a parallel multilevel preconditioned
                                  Maxwell eigensolver  . . . . . . . . . . 157--165
               Gabriel Okša and   
               Marián Vajteršic   Efficient pre-processing in the parallel
                                  block-Jacobi SVD algorithm . . . . . . . 166--176
               Eric Polizzi and   
                 Ahmed H. Sameh   A parallel hybrid banded system solver:
                                  the SPIKE algorithm  . . . . . . . . . . 177--194
                Petko Yanev and   
    Erricos John Kontoghiorghes   Efficient algorithms for estimating the
                                  general linear model . . . . . . . . . . 195--204

Parallel Computing
Volume 32, Number 3, March, 2006

            P. Rajesh Kumar and   
               K. Sridharan and   
                  S. Srinivasan   A parallel algorithm, architecture and
                                  FPGA realization for landmark
                                  determination and map construction in a
                                  planar unknown environment . . . . . . . 205--221
               Marc Hofmann and   
    Erricos John Kontoghiorghes   Pipeline Givens sequences for computing
                                  the QR decomposition on a EREW PRAM  . . 222--230
          Takahiro Katagiri and   
                 Kenji Kise and   
               Hiroki Honda and   
                Toshitsugu Yuba   ABCLib_DRSSED: a parallel eigensolver
                                  with an auto-tuning facility . . . . . . 231--250
                 Nahid Emad and   
                  Ani Sedrakian   Toward the reusability for iterative
                                  linear algebra software in distributed
                                  environment  . . . . . . . . . . . . . . 251--266

Parallel Computing
Volume 32, Number 4, April, 2006

              R. S. Montero and   
                   E. Huedo and   
                 I. M. Llorente   Benchmarking of high throughput
                                  computing applications on Grids  . . . . 267--279
               Makoto Satoh and   
            Kiyoshi Negishi and   
              Atsushi Kobayashi   Analysis of two-level data mapping in an
                                  HPF compiler for distributed-memory
                                  machines . . . . . . . . . . . . . . . . 280--300
               Prasanta K. Jana   Polynomial interpolation and polynomial
                                  root finding on OTIS-mesh  . . . . . . . 301--312
             Silvia M. Figueira   Optimal partitioning of nodes to
                                  space-sharing parallel tasks . . . . . . 313--324
                      R. Hatzky   Domain cloning for a particle-in-cell
                                  (PIC) code on a cluster of
                                  symmetric-multiprocessor (SMP) computers 325--330

Parallel Computing
Volume 32, Number 5--6, June, 2006

                   Xiao Qin and   
                     Hong Jiang   A novel fault-tolerant scheduling
                                  algorithm for precedence constrained
                                  tasks in real-time heterogeneous systems 331--356
           Gianluigi Folino and   
         Giuseppe Mendicino and   
           Alfonso Senatore and   
      Giandomenico Spezzano and   
             Salvatore Straface   A model based on cellular automata for
                                  the parallel simulation of $3$D
                                  unsaturated flow . . . . . . . . . . . . 357--376
                    F. Luna and   
                A. J. Nebro and   
                        E. Alba   Observations in using Grid-enabled
                                  technologies for solving multi-objective
                                  optimization problems  . . . . . . . . . 377--393
                A. H. Baker and   
              R. D. Falgout and   
                     U. M. Yang   An assumed partition algorithm for
                                  determining processor
                                  inter-communication  . . . . . . . . . . 394--414
                    E. Alba and   
                 F. Almeida and   
                   M. Blesa and   
                   C. Cotta and   
                    M. Díaz and   
                   I. Dorta and   
                 J. Gabarró and   
                    C. León and   
                   G. Luque and   
                   J. Petit and   
               C. Rodríguez and   
                   A. Rojas and   
                       F. Xhafa   Efficient parallel LAN/WAN algorithms
                                  for optimization. The MALLBA Project . . 415--440
                  Zhihua Du and   
                       Feng Lin   pNJTree: a parallel program for
                                  reconstruction of neighbor-joining tree
                                  and its application in ClustalW  . . . . 441--446

Parallel Computing
Volume 32, Number 7--8, September, 2006

             Herbert Kuchen and   
                    Murray Cole   Editorial  . . . . . . . . . . . . . . . 447--448
            Marco Danelutto and   
                Marco Aldinucci   Algorithmic skeletons meeting grids  . . 449--462
              Xiao Yan Deng and   
            Greg Michaelson and   
                   Phil Trinder   Autonomous mobility skeletons  . . . . . 463--478
         Horacio González-Vélez   Self-adaptive skeletal task farm for
                                  computational grids  . . . . . . . . . . 479--490
              Antonio Dorta and   
                Pablo López and   
             Francisco de Sande   Basic skeletons in llc . . . . . . . . . 491--506
             Clemens Grelck and   
               Sven-Bodo Scholz   Merging compositions of array skeletons
                                  in SaC . . . . . . . . . . . . . . . . . 507--522
   Mercedes Hidalgo-Herrero and   
      Yolanda Ortega-Mallén and   
                 Fernando Rubio   Analyzing the influence of mixed
                                  evaluation on the performance of Eden
                                  skeletons  . . . . . . . . . . . . . . . 523--538
                 F. Clément and   
                  V. Martin and   
                 A. Vodicka and   
                R. Di Cosmo and   
                        P. Weis   Domain decomposition and skeleton
                                  programming with OCamlP3l  . . . . . . . 539--550
           Rob H. Bisseling and   
                  Ildikó Flesch   Mondriaan sparse matrix partitioning for
                                  attacking cryptosystems by a parallel
                                  block Lanczos algorithm --- a case study 551--567
                   E. Cesar and   
                  A. Moreno and   
                J. Sorribes and   
                       E. Luque   Modeling Master/Worker applications for
                                  automatic performance tuning . . . . . . 568--589
         Kiminori Matsuzaki and   
               Zhenjiang Hu and   
                Masato Takeichi   Parallel skeletons for manipulating
                                  general trees  . . . . . . . . . . . . . 590--603
                  J. Falcou and   
                   J. Sérot and   
                 T. Chateau and   
                 J. T. Lapresté   Quaff: efficient C++ design for parallel
                                  skeletons  . . . . . . . . . . . . . . . 604--615
                Paras Mehta and   
         José Nelson Amaral and   
                  Duane Szafron   Is MPI suitable for a generative
                                  design-pattern system? . . . . . . . . . 616--626

Parallel Computing
Volume 32, Number 9, October, 2006

             Jeff Linderoth and   
               Roberto Musmanno   Optimization on grids --- optimization
                                  for grids  . . . . . . . . . . . . . . . 627--628
       Lúcia M. A. Drummond and   
              Eduardo Uchoa and   
     Alexandre D. Gonçalves and   
        Juliana M. N. Silva and   
       Marcelo C. P. Santos and   
      Maria Clícia S. de Castro   A grid-enabled distributed
                                  branch-and-bound algorithm with
                                  application on the Steiner Problem in
                                  graphs . . . . . . . . . . . . . . . . . 629--642
                   N. Melab and   
                  M. Mezmaz and   
                    E.-G. Talbi   Parallel cooperative meta-heuristics on
                                  the computational grid: a case study:
                                  the bi-objective Flow-Shop problem . . . 643--659
             Wahid Chrabakh and   
                    Rich Wolski   GridSAT: a system for solving
                                  satisfiability problems using a
                                  computational grid . . . . . . . . . . . 660--687
            Demetrio Laganá and   
            Pasquale Legato and   
           Ornella Pisacane and   
             Francesca Vocaturo   Solving simulation optimization problems
                                  on grid computing systems  . . . . . . . 688--700
           Andrea Attanasio and   
           Gianpaolo Ghiani and   
          Lucio Grandinetti and   
            Francesca Guerriero   Auction algorithms for decentralized
                                  parallel machine scheduling  . . . . . . 701--709

Parallel Computing
Volume 32, Number 10, November, 2006

            Georgios Goumas and   
          Nikolaos Drosinos and   
           Maria Athanasaki and   
              Nectarios Koziris   Message-passing code generation for
                                  non-rectangular tiling transformations   711--732
                  Hon F. Li and   
                  Zunce Wei and   
            Dhrubajyoti Goswami   Quasi-atomic recovery for distributed
                                  agents . . . . . . . . . . . . . . . . . 733--758
              Savina Bansal and   
                Padam Kumar and   
                   Kuldip Singh   An improved two-step algorithm for task
                                  and data parallel scheduling in
                                  distributed memory machines  . . . . . . 759--774

Parallel Computing
Volume 32, Number 11--12, December, 2006

            H. Sarbazi-Azad and   
             M. Ould-Khaoua and   
                   A. Y. Zomaya   Performance evaluation of communication
                                  networks for parallel and distributed
                                  systems  . . . . . . . . . . . . . . . . 775--776
                Luca Gatani and   
             Giuseppe Lo Re and   
               Salvatore Gaglio   An efficient distributed algorithm for
                                  generating and updating multicast trees  777--793
                Rod Fatoohi and   
                 Ken Kardys and   
                 Sumy Koshy and   
 Soundarya Sivaramakrishnan and   
              Jeffrey S. Vetter   Performance evaluation of high-speed
                                  interconnects using dense communication
                                  patterns . . . . . . . . . . . . . . . . 794--807
              James Broberg and   
                 Zahir Tari and   
           Panlop Zeephongsekul   Task assignment with work-conserving
                                  migration  . . . . . . . . . . . . . . . 808--830
              Bahman Javadi and   
         Mohammad K. Akbari and   
               Jemal H. Abawajy   A performance model for analysis of
                                  heterogeneous multi-cluster systems  . . 831--851
                 Masaru Takesue   The psi-cube: a bus-based cube-type
                                  clustering network for high-performance
                                  on-chip systems  . . . . . . . . . . . . 852--869
                    A. Shahrabi   Performance comparison of routing
                                  algorithms in wormhole-switched networks 870--885
      M. Hoseiny Farahabady and   
                  F. Safaei and   
                A. Khonsari and   
                       M. Fathy   Characterization of spatial fault
                                  patterns in interconnection networks . . 886--901
         Azzedine Boukerche and   
            Caron Dzermajko and   
                     Lu Kaiyuan   An enhancement towards dynamic
                                  grid-based DDM protocol for distributed
                                  simulation using multiple levels of data
                                  filtering  . . . . . . . . . . . . . . . 902--919

Parallel Computing
Volume 33, Number 1, February, 2007

                       Dan Reed   Changes and updates  . . . . . . . . . . 1--1
             Jong Wook Kwak and   
                  Chu Shik Jhon   Torus Ring: improving performance of
                                  interconnection network by modifying
                                  hierarchical ring  . . . . . . . . . . . 2--20
           Celso C. Ribeiro and   
                 Isabel Rosseti   Efficient parallel cooperative
                                  implementations of GRASP heuristics  . . 21--35
                  Meijie Ma and   
                Guizhen Liu and   
                    Jun-Ming Xu   Panconnectivity and edge-fault-tolerant
                                  pancyclicity of augmented cubes  . . . . 36--42
          James S. Hammonds and   
               Faisal Saied and   
                Mark A. Shannon   Solving coupled $3$-D paraxial wave and
                                  thermal diffusion equations with
                                  mixed-mode parallel computations . . . . 43--53
           Gregorio Bernabé and   
          Ricardo Fernández and   
             Jose M. García and   
           Manuel E. Acacio and   
                  José González   An efficient implementation of a $3$D
                                  wavelet transform based encoder on
                                  hyper-threading technology . . . . . . . 54--72
           Jinn-Shyong Yang and   
            Shyue-Ming Tang and   
             Jou-Ming Chang and   
                    Yue-Li Wang   Parallel construction of optimal
                                  independent spanning trees on hypercubes 73--79

Parallel Computing
Volume 33, Number 2, March, 2007

                Osman Yaşar and   
                      Hasan Dağ   Trends in parallel computing . . . . . . 81--82
                      Hasan Dağ   An approximate inverse preconditioner
                                  and its implementation for conjugate
                                  gradient method  . . . . . . . . . . . . 83--91
                  Halis Sak and   
           Süleyman Özekici and   
                İlkay Boduroğlu   Parallel computing in Asian option
                                  pricing  . . . . . . . . . . . . . . . . 92--108
                   Omar Ramadan   Three dimensional MPI parallel
                                  implementation of the PML algorithm for
                                  truncating finite-difference time-domain
                                  Grids  . . . . . . . . . . . . . . . . . 109--115
             Peter Rissland and   
                    Yuefan Deng   Electrostatic force computation for
                                  bio-molecules on supercomputers with
                                  torus networks . . . . . . . . . . . . . 116--123
                Ferat Sahin and   
             M. Çetin Yavuz and   
               Ziya Arnavut and   
                   Önder Uluyol   Fault diagnosis for airplane engines
                                  using Bayesian networks and distributed
                                  particle swarm optimization  . . . . . . 124--143

Parallel Computing
Volume 33, Number 3, April, 2007

        César A. F. De Rose and   
          Hans-Ulrich Heiss and   
                  Barry Linnert   Distributed dynamic processor allocation
                                  for multicomputers . . . . . . . . . . . 145--158
         Alessia Gualandris and   
      Simon Portegies Zwart and   
           Alfredo Tirado-Ramos   Performance analysis of direct N . . . . 159--173
                   Zeyao Mo and   
                     Xiaowen Xu   Relaxed RS0 or CLJP coarsening strategy
                                  for parallel AMG . . . . . . . . . . . . 174--185
              D. D'Ambrosio and   
                     W. Spataro   Parallel evolutionary modelling of
                                  geological processes . . . . . . . . . . 186--212
             Walfredo Cirne and   
       Francisco Brasileiro and   
            Daniel Paranhos and   
      Luís Fabrício W. Góes and   
              William Voorsluys   On the efficacy, efficiency and emergent
                                  behavior of task replication in large
                                  distributed systems  . . . . . . . . . . 213--234

Parallel Computing
Volume 33, Number 4--5, May, 2007

               Christophe Cérin   Large scale grids  . . . . . . . . . . . 235--237
               Vandy Berten and   
                   Bruno Gaujal   Brokering strategies in computational
                                  grids using stochastic prediction models 238--249
        J. R. Bilbao-Castro and   
                  A. Merino and   
                  I. García and   
               J. M. Carazo and   
                J. J. Fernández   Parameter optimization in $3$D
                                  reconstruction on a large scale grid . . 250--263
           Benjamin Gaidioz and   
             Birger Koblitz and   
                    Nuno Santos   Exploring high performance distributed
                                  file storage using LDPC codes  . . . . . 264--274
              Denis Caromel and   
      Alexandre di Costanzo and   
                Clément Mathieu   Peer-to-peer for computational grids:
                                  mixing clusters and desktop machines . . 275--288
               Nicolas Jacq and   
             Vincent Breton and   
              Hsin-Yen Chen and   
                 Li-Yung Ho and   
             Martin Hofmann and   
                Vinod Kasam and   
             Hurng-Chun Lee and   
              Yannick Legré and   
               Simon C. Lin and   
                Astrid Maaß and   
         Emmanuel Medernach and   
               Ivan Merelli and   
           Luciano Milanesi and   
            Giulio Rastelli and   
        Matthieu Reichstadt and   
             Jean Salzemann and   
       Horst Schwichtenberg and   
                 Ying-Ta Wu and   
                Marc Zimmermann   Virtual screening on large scale grids   289--301
                  M. Mezmaz and   
                   N. Melab and   
                    E.-G. Talbi   An efficient load balancing strategy for
                                  grid-based branch and bound algorithm    302--313
           Hiroshi Yamauchi and   
                     Dongyan Xu   Portable virtual cycle accounting for
                                  large-scale distributed cycle sharing
                                  systems  . . . . . . . . . . . . . . . . 314--327
               Eun-Kyu Byun and   
                    Jin-Soo Kim   DynaGrid: a dynamic service deployment
                                  and resource migration framework for
                                  WSRF-compliant applications  . . . . . . 328--338
            Moreno Marzolla and   
         Matteo Mordacchini and   
              Salvatore Orlando   Peer-to-peer systems for discovering
                                  resources in a dynamic grid  . . . . . . 339--358

Parallel Computing
Volume 33, Number 6, June, 2007

           Luis Paulo Santo and   
               Bruno Raffin and   
                   Alan Heirich   Parallel graphics and visualization  . . 359--360
              K. Debattista and   
                A. Chalmers and   
              R. Gillibrand and   
               P. Longhurst and   
           G. Mastoropoulou and   
                   V. Sundstedt   Parallel selective rendering of
                                  high-fidelity virtual environments . . . 361--376
      Bernhard Thomaszewski and   
            Wolfgang Blochinger   Physically based simulation of cloth on
                                  distributed memory architectures . . . . 377--390
         Fábio F. Bernardon and   
         Steven P. Callahan and   
           João L. D. Comba and   
               Cláudio T. Silva   An adaptive framework for visualizing
                                  unstructured grids with time-varying
                                  scalar fields  . . . . . . . . . . . . . 391--405
                  C. Müller and   
               M. Strengert and   
                        T. Ertl   Adaptive load balancing for raycasting
                                  of non-uniformly bricked volumes . . . . 406--419
                 D. Cotting and   
              M. Waschbüsch and   
                  M. Duller and   
                       M. Gross   WinSGL: synchronizing displays in
                                  parallel graphics using cost-effective
                                  software genlocking  . . . . . . . . . . 420--437
               Mario Lorenz and   
             Guido Brunnett and   
                   Marcel Heinz   Driving tiled displays with an extended
                                  Chromium system based on stream cached
                                  multicast communication  . . . . . . . . 438--466

Parallel Computing
Volume 33, Number 7--8, August, 2007

             Chao-Tung Yang and   
             Kuan-Wei Cheng and   
                 Wen-Chung Shih   On development of an efficient parallel
                                  loop self-scheduling for grid computing
                                  environments . . . . . . . . . . . . . . 467--487
                  Jung-Sheng Fu   Conditional fault-tolerant hamiltonicity
                                  of star graphs . . . . . . . . . . . . . 488--496
           Henrique Andrade and   
                Tahsin Kurc and   
               Alan Sussman and   
                     Joel Saltz   Active semantic caching to optimize
                                  multidimensional data analysis in
                                  parallel and distributed environments    497--520
               V. Hernandez and   
                J. E. Roman and   
                       A. Tomas   Parallel Arnoldi eigensolvers with
                                  enhanced scalability via global
                                  communications rearrangement . . . . . . 521--540
          T. Esposti Ongaro and   
               C. Cavazzoni and   
                 G. Erbacci and   
                    A. Neri and   
                 M. V. Salvetti   A parallel multiphase flow code for the
                                  $3$D simulation of explosive volcanic
                                  eruptions  . . . . . . . . . . . . . . . 541--560
          Isaac D. Scherson and   
         Daniel S. Valencia and   
                 Enrique Cauich   Service address routing: a
                                  network-embedded resource management
                                  layer for cluster computing  . . . . . . 561--571
                    Wei Jie and   
                Wentong Cai and   
                 Lizhe Wang and   
                    Rob Procter   A secure information service for
                                  monitoring large scale grids . . . . . . 572--591

Parallel Computing
Volume 33, Number 9, September, 2007

                 Bernd Mohr and   
       Jesper Larsson Träff and   
              Joachim Worringen   Selected papers from EuroPVM/MPI 2006    593--594
              William Gropp and   
                  Rajeev Thakur   Thread-safety in an MPI implementation:
                                  Requirements and analysis  . . . . . . . 595--604
               Fabian Kulla and   
                  Peter Sanders   Scalable parallel suffix array
                                  construction . . . . . . . . . . . . . . 605--612
    Jelena Pješivac-Grbović and   
             George Bosilca and   
             Graham E. Fagg and   
              Thara Angskun and   
               Jack J. Dongarra   MPI collective algorithm selection and
                                  quadtree encoding  . . . . . . . . . . . 613--623
            Torsten Hoefler and   
          Peter Gottschling and   
           Andrew Lumsdaine and   
                  Wolfgang Rehm   Optimizing a conjugate gradient solver
                                  with non-blocking collective operations  624--633
            Darius Buntinas and   
          Guillaume Mercier and   
                  William Gropp   Implementation and evaluation of
                                  shared-memory communication and
                                  synchronization operations in MPICH2
                                  using the Nemesis communication
                                  subsystem  . . . . . . . . . . . . . . . 634--644

Parallel Computing
Volume 33, Number 10--11, November, 2007

               Wu-chun Feng and   
                 Dinesh Manocha   High-performance computing using
                                  accelerators . . . . . . . . . . . . . . 645--647
          Patrick McCormick and   
                 Jeff Inman and   
               James Ahrens and   
       Jamaludin Mohd-Yusof and   
                  Greg Roth and   
                 Sharen Cummins   Scout: a data-parallel programming
                                  language for graphics processors . . . . 648--662
        Naga K. Govindaraju and   
                 Dinesh Manocha   Cache-efficient numerical algorithms
                                  using graphics hardware  . . . . . . . . 663--684
            Dominik Göddeke and   
            Robert Strzodka and   
       Jamaludin Mohd-Yusof and   
          Patrick McCormick and   
        Sven H. M. Buijssen and   
         Matthias Grajewski and   
                   Stefan Turek   Exploring weak scalability for FEM
                                  calculations on a GPU-enhanced cluster   685--699
           Filip Blagojevic and   
  Dimitrios S. Nikolopoulos and   
      Alexandros Stamatakis and   
   Christos D. Antonopoulos and   
           Matthew Curtis-Maury   Runtime scheduling of dynamic
                                  parallelism on accelerator-based
                                  multi-core systems . . . . . . . . . . . 700--719
             David A. Bader and   
              Virat Agarwal and   
             Kamesh Madduri and   
                  Seunghwa Kang   High performance combinatorial algorithm
                                  design on the Cell Broadband Engine
                                  processor  . . . . . . . . . . . . . . . 720--740
         Martin C. Herbordt and   
                 Josh Model and   
            Bharat Sukhwani and   
                Yongfeng Gu and   
                   Tom VanCourt   Single pass streaming BLAST on FPGAs . . 741--756

Parallel Computing
Volume 33, Number 12, December, 2007

         Alexey Lastovetsky and   
                     Ravi Reddy   Data distribution for dense
                                  factorization on computers with memory
                                  heterogeneity  . . . . . . . . . . . . . 757--779
                          J. Xu   Benchmarks on tera-scalable models for
                                  DNS of turbulent channel flow  . . . . . 780--794
                   N. Botta and   
                     C. Ionescu   Relation-based computations in a monadic
                                  BSP model  . . . . . . . . . . . . . . . 795--821
               M. Vanneschi and   
                     L. Veraldi   Dynamicity in distributed applications:
                                  issues, problems and the ASSIST approach 822--845

Parallel Computing
Volume 34, Number 1, January, 2008

          V. Santhosh Kumar and   
              R. Nanjundiah and   
      M. J. Thazhuthaveetil and   
                R. Govindarajan   Impact of message compression on the
                                  scalability of an atmospheric modeling
                                  application on clusters  . . . . . . . . 1--16
                 Yuhui Deng and   
                 Frank Wang and   
                  Na Helian and   
                  Sining Wu and   
                   Chenhan Liao   Dynamic and scalable storage management
                                  architecture for Grid Oriented Storage
                                  devices  . . . . . . . . . . . . . . . . 17--31
              Jason Brazile and   
             Rudolf Richter and   
           Daniel Schläpfer and   
       Michael E. Schaepman and   
                 Klaus I. Itten   Cluster versus grid for operational
                                  generation of ATCOR's MODTRAN-based
                                  look up tables . . . . . . . . . . . . . 32--46
                Albert Chan and   
                Frank Dehne and   
             Prosenjit Bose and   
                  Markus Latzel   Coarse grained parallel algorithms for
                                  graph matching . . . . . . . . . . . . . 47--62
                Fouad B. Chedid   An optimal parallelization of the
                                  two-list algorithm of cost
                                  ${O}(2^{n/2})$ . . . . . . . . . . . . . 63--65
                      Anonymous   Acknowledgement to reviewers . . . . . . 66--68

Parallel Computing
Volume 34, Number 2, February, 2008

       Andrzej M. Goscinski and   
                Adam K. L. Wong   A study of the concurrent execution of
                                  parallel and sequential applications on
                                  a non-dedicated cluster  . . . . . . . . 69--91
              Antonio Plaza and   
             David Valencia and   
                   Javier Plaza   An experimental comparison of parallel
                                  algorithms for hyperspectral analysis
                                  using heterogeneous and homogeneous
                                  networks of workstations . . . . . . . . 92--114
                   A. Murli and   
                 L. D'Amore and   
            L. Carracciuolo and   
              M. Ceccarelli and   
                   L. Antonelli   High performance edge-preserving
                                  regularization in $3$D SPECT imaging . . 115--132

Parallel Computing
Volume 34, Number 3, March, 2008

           Eladio Gutiérrez and   
                Oscar Plata and   
               Emilio L. Zapata   An analytical model of locality-based
                                  parallel irregular reductions  . . . . . 133--157
       Jean-François Pineau and   
                Yves Robert and   
                Frédéric Vivien   The impact of heterogeneity on
                                  master-slave scheduling  . . . . . . . . 158--176
     S. Chandra Sekhara Rao and   
                         Sarita   Parallel solution of large symmetric
                                  tridiagonal linear systems . . . . . . . 177--197

Parallel Computing
Volume 34, Number 4--5, May, 2008

      Volodymyr Kindratenko and   
                   Duncan Buell   Reconfigurable Systems Summer Institute
                                  2007 . . . . . . . . . . . . . . . . . . 199--200
       Roger D. Chamberlain and   
        Joseph M. Lancaster and   
                  Ron K. Cytron   Visions for application development on
                                  hybrid computing systems . . . . . . . . 201--216
               Seth Koehler and   
               John Curreri and   
                 Alan D. George   Performance analysis challenges and
                                  framework for high-performance
                                  reconfigurable computing . . . . . . . . 217--230
                M. Wirthlin and   
              D. Poznanovic and   
            P. Sundararajan and   
                 A. Coppola and   
                D. Pellerin and   
                  W. Najjar and   
                   R. Bruce and   
                   M. Babst and   
               O. Pritchard and   
               P. Palazzari and   
                    G. Kuzmanov   OpenFPGA CoreLib core library
                                  interoperability effort  . . . . . . . . 231--244
             Proshanta Saha and   
              Esam El-Araby and   
             Miaoqing Huang and   
              Mohamed Taher and   
         Sergio Lopez-Buedo and   
           Tarek El-Ghazawi and   
                  Chang Shu and   
                   Kris Gaj and   
             Alan Michalski and   
                   Duncan Buell   Portable library development for
                                  reconfigurable computing systems: a case
                                  study  . . . . . . . . . . . . . . . . . 245--260
                Yongfeng Gu and   
               Tom VanCourt and   
             Martin C. Herbordt   Explicit design of FPGA-based
                                  coprocessors for short-range force
                                  computations in molecular dynamics
                                  simulations  . . . . . . . . . . . . . . 261--277
        Akila Gothandaraman and   
        Gregory D. Peterson and   
               G. L. Warren and   
            Robert J. Hinde and   
             Robert J. Harrison   FPGA acceleration of a quantum Monte
                                  Carlo application  . . . . . . . . . . . 278--291

Parallel Computing
Volume 34, Number 6--8, July, 2008

              Laura Grigori and   
           Bernard Philippe and   
                Ahmed Sameh and   
     Damien Tromeur-Dervout and   
               Marian Vajtersic   Parallel matrix algorithms and
                                  applications . . . . . . . . . . . . . . 293--295
            Emmanuel Agullo and   
           Abdou Guermouche and   
          Jean-Yves L'Excellent   A parallel out-of-core multifrontal
                                  method: Storage of factors on disk and
                                  analysis of models for an out-of-core
                                  active memory  . . . . . . . . . . . . . 296--317
               C. Chevalier and   
                  F. Pellegrini   PT-Scotch: a tool for efficient parallel
                                  graph ordering . . . . . . . . . . . . . 318--331
Guy Antoine Atenekeng Kahou and   
              Laura Grigori and   
                Masha Sosonkina   A partitioning algorithm for
                                  block-diagonal matrices with overlap . . 332--344
               Pascal Hénon and   
               Pierre Ramet and   
                     Jean Roman   On finding approximate supernodes for an
                                  efficient block-ILU$(k)$ factorization   345--362
                  L. Giraud and   
                  A. Haidar and   
                   L. T. Watson   Parallel scalability study of hybrid
                                  preconditioners in three dimensions  . . 363--379
          Raphaël Couturier and   
           Christophe Denis and   
              Fabienne Jézéquel   GREMLINS: a large sparse linear solver
                                  for grid environment . . . . . . . . . . 380--391
                N. Yamanaka and   
                   T. Ogita and   
                 S. M. Rump and   
                       S. Oishi   A parallel algorithm for accurate dot
                                  product  . . . . . . . . . . . . . . . . 392--410
                  S. Hunold and   
                  T. Rauber and   
                      G. Rünger   Combining building blocks for parallel
                                  multi-level matrix multiplication  . . . 411--426
                  Kok Fu Ng and   
       Norhashidah Hj. Mohd Ali   Performance analysis of explicit group
                                  parallel algorithms for distributed
                                  memory multicomputer . . . . . . . . . . 427--440
                   C. Bekas and   
                 A. Curioni and   
                    W. Andreoni   Atomic wavefunction initialization in ab
                                  initio molecular dynamics using
                                  distributed Lanczos  . . . . . . . . . . 441--450
             Petko I. Yanev and   
      Erricos J. Kontoghiorghes   Parallel algorithms for downdating the
                                  least squares estimator of the
                                  regression model . . . . . . . . . . . . 451--468
                Maria Lucka and   
           Igor Melichercik and   
                Ladislav Halada   Application of multistage stochastic
                                  programs solved in parallel in portfolio
                                  management . . . . . . . . . . . . . . . 469--485

Parallel Computing
Volume 34, Number 9, September, 2008

                     Dajin Wang   A linear-time algorithm for computing
                                  collision-free path on reconfigurable
                                  mesh . . . . . . . . . . . . . . . . . . 487--496
      Yasheng Maimaitijiang and   
         Mohammed Ali Roula and   
              Stuart Watson and   
                  Ralf Patz and   
         Robert J. Williams and   
                  Huw Griffiths   Parallelization methods for
                                  implementation of a magnetic induction
                                  tomography forward model in symmetric
                                  multiprocessor systems . . . . . . . . . 497--507
                Lee Kee Goh and   
           Bharadwaj Veeravalli   Design and performance evaluation of
                                  combined first-fit task allocation and
                                  migration strategies in mesh
                                  multiprocessor systems . . . . . . . . . 508--520
                   Wei-Ming Lin   Performance modeling and analysis of
                                  correlated parallel computations . . . . 521--538
           J. Sánchez-Curto and   
             P. Chamorro-Posada   On a faster parallel implementation of
                                  the split-step Fourier method  . . . . . 539--549

Parallel Computing
Volume 34, Number 10, October, 2008

              Julien Straubhaar   Parallel preconditioners for the
                                  conjugate gradient algorithm using
                                  Gram--Schmidt and least squares methods  551--569
              Woo-Chul Jeun and   
               Yang-Suk Kee and   
                 Soonhoi Ha and   
                   Changdon Kee   Overcoming performance bottlenecks in
                                  using OpenMP on SMP clusters . . . . . . 570--592
          Carlo Mastroianni and   
             Domenico Talia and   
                   Oreste Verta   Designing an information system for
                                  Grids: Comparing hierarchical,
                                  decentralized P2P and super-peer models  593--611

Parallel Computing
Volume 34, Number 11, November, 2008

             David A. Bader and   
                 Srinivas Aluru   High-performance computational biology   613--615
             Vipin Sachdeva and   
            Michael Kistler and   
               Evan Speight and   
            Tzy-Hwa Kathy Tzeng   Exploring the viability of the Cell
                                  Broadband Engine for bioinformatics
                                  applications . . . . . . . . . . . . . . 616--626
             David A. Bader and   
                 Kamesh Madduri   A graph-theoretic analysis of the human
                                  protein-interaction network using
                                  multicore parallel algorithms  . . . . . 627--639
              Sadaf R. Alam and   
          Pratul K. Agarwal and   
              Jeffrey S. Vetter   Performance characteristics of
                                  biomolecular simulations on high-end
                                  systems with multi-core processors . . . 640--651
                 P. Brenner and   
              J. M. Wozniak and   
                   D. Thain and   
                A. Striegel and   
                 J. W. Peng and   
                J. A. Izaguirre   Biomolecular committor probability
                                  calculation enabled by processing in
                                  network storage  . . . . . . . . . . . . 652--660
             Michela Taufer and   
            Ming-Ying Leung and   
             Thamar Solorio and   
                 Abel Licon and   
              David Mireles and   
             Roberto Araiza and   
                Kyle L. Johnson   RNAVLab: a virtual laboratory for
                                  studying RNA secondary structures based
                                  on grid computing technology . . . . . . 661--680
                 Tim Oliver and   
             Leow Yuan Yeow and   
                 Bertil Schmidt   Integrating FPGA acceleration into HMMer 681--691

Parallel Computing
Volume 34, Number 12, December, 2008

              Alain Merigot and   
              Alfredo Petrosino   Parallel processing for image and video
                                  processing . . . . . . . . . . . . . . . 693--693
              Alain Merigot and   
              Alfredo Petrosino   Parallel processing for image and video
                                  processing: Issues and challenges  . . . 694--699
                         O. Kao   On parallel image retrieval with
                                  dynamically extracted features . . . . . 700--709
               Myeongsoo Oh and   
                Kiyoharu Aizawa   Large-scale image sensing by a group of
                                  smart image sensors  . . . . . . . . . . 710--717
                 C. Colombo and   
               A. Del Bimbo and   
                       A. Valli   A real-time full body tracking and
                                  humanoid animation system  . . . . . . . 718--726
            Francesco Isgrò and   
                Domenico Tegolo   A distributed genetic algorithm for
                                  restoration of vertical line scratches   727--734
               P. P. Jonker and   
               J. G. E. Olk and   
                   C. Nicolescu   Distributed bucket processing: a
                                  paradigm embedded in a framework for the
                                  parallel processing of pixel sets  . . . 735--746
          Radhika S. Grover and   
                   Qiang Li and   
                   H.-P. Dommel   Performance study of data layout schemes
                                  for a SAN-based video server . . . . . . 747--756
                Paolo Gamba and   
              Luca Lombardi and   
                    Marco Porta   Log-map analysis . . . . . . . . . . . . 757--764

Parallel Computing
Volume 35, Number 1, January, 2009

                    X. Meng and   
                   V. Chaudhary   Boosting data throughput for sequence
                                  database similarity searches on FPGAs
                                  using an adaptive buffering scheme . . . 1--11
          Ricardo C. Corrêa and   
              Valmir C. Barbosa   Partially ordered distributed
                                  computations on asynchronous
                                  point-to-point networks  . . . . . . . . 12--28
              Lih-Yuan Deng and   
                Huajiang Li and   
            Jyh-Jen Horng Shiau   Scalable parallel multiple recursive
                                  generators of large order  . . . . . . . 29--37
            Alfredo Buttari and   
              Julien Langou and   
               Jakub Kurzak and   
                  Jack Dongarra   A class of parallel tiled linear algebra
                                  algorithms for multicore architectures   38--53
                      Anonymous   Acknowledgement to reviewers . . . . . . 54--55

Parallel Computing
Volume ??, Number ??, January, 2009

                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 35, Number 2, February, 2009

    Fabrício A. B. da Silva and   
                  Hermes Senger   Improving scalability of Bag-of-Tasks
                                  applications running on master-slave
                                  platforms  . . . . . . . . . . . . . . . 57--71
                   Yuh-Rau Wang   A novel $O(1)$ time algorithm for $3$D
                                  block-based medial axis transform by
                                  peeling corner shells  . . . . . . . . . 72--82
                Anne Benoit and   
               Mourad Hakem and   
                    Yves Robert   Contention awareness and fault-tolerant
                                  scheduling for precedence constrained
                                  tasks in heterogeneous systems . . . . . 83--108
          L. K. S. Daldorff and   
                    B. Eliasson   Parallelization of a Vlasov--Maxwell
                                  solver in four-dimensional phase space   109--115

Parallel Computing
Volume 35, Number 3, March, 2009

               Rupak Biswas and   
              Leonid Oliker and   
                 Jeffrey Vetter   Revolutionary technologies for
                                  acceleration of emerging petascale
                                  applications . . . . . . . . . . . . . . 117--118
             David A. Bader and   
              Virat Agarwal and   
                  Seunghwa Kang   Computing discrete transforms on the
                                  Cell Broadband Engine  . . . . . . . . . 119--137
               Jakub Kurzak and   
              Wesley Alvaro and   
                  Jack Dongarra   Optimizing matrix multiplication for a
                                  short-vector SIMD architecture --- CELL
                                  processor  . . . . . . . . . . . . . . . 138--150
         Jeremy S. Meredith and   
            Gonzalo Alvarez and   
            Thomas A. Maier and   
       Thomas C. Schulthess and   
              Jeffrey S. Vetter   Accuracy and performance of graphics
                                  processors: a Quantum Monte Carlo
                                  application case study . . . . . . . . . 151--163
             David J. Hardy and   
              John E. Stone and   
                 Klaus Schulten   Multilevel summation of electrostatic
                                  potentials using graphics processing
                                  units  . . . . . . . . . . . . . . . . . 164--177
            Samuel Williams and   
              Leonid Oliker and   
              Richard Vuduc and   
                 John Shalf and   
           Katherine Yelick and   
                   James Demmel   Optimization of sparse matrix-vector
                                  multiplication on emerging multicore
                                  platforms  . . . . . . . . . . . . . . . 178--194

Parallel Computing
Volume 35, Number 4, April, 2009

              Suresh Behara and   
                  Sanjay Mittal   Parallel finite element computation of
                                  incompressible flows . . . . . . . . . . 195--212
          Arquimedes Canedo and   
           Ben A. Abderazek and   
                  Masahiro Sowa   Efficient compilation for queue size
                                  constrained queue processors . . . . . . 213--225
               Tien-Yien Li and   
               Chih-Hsiung Tsai   HOM4PS-2.0para: Parallelization of
                                  HOM4PS-2.0 for solving polynomial
                                  systems  . . . . . . . . . . . . . . . . 226--238
       Sid-Ahmed-Ali Touati and   
                    Zsolt Mathe   Periodic register saturation in
                                  innermost loops  . . . . . . . . . . . . 239--254

Parallel Computing
Volume 35, Number 5, May, 2009

                  Won W. Ro and   
               Jean-Luc Gaudiot   A complexity-effective microprocessor
                                  design with decoupled dispatch queues
                                  and prefetching  . . . . . . . . . . . . 255--268
                 Yaohang Li and   
           Michael Mascagni and   
                   Andrey Gorin   A decentralized parallel implementation
                                  for parallel tempering algorithm . . . . 269--283
                L. Grinberg and   
              D. Pekurovsky and   
              S. J. Sherwin and   
              G. E. Karniadakis   Parallel performance of the coarse space
                                  linear vertex solver and low energy
                                  basis preconditioner for spectral/hp
                                  elements . . . . . . . . . . . . . . . . 284--304
       Antonio Robles-Gómez and   
           Aurelio Bermúdez and   
              Rafael Casado and   
        Åshild Grønstad Solheim   A dynamic distributed mechanism for
                                  reconfiguring high-performance networks  305--312

Parallel Computing
Volume 35, Number 6, June, 2009

             Ching-Wen Chen and   
             Chuan-Chi Weng and   
                  Chang-Jung Ku   An overlapping and pipelining data
                                  transmission MAC protocol with multiple
                                  channels in ad hoc networks  . . . . . . 313--330
                 Taro Konda and   
             Yoshimasa Nakamura   A new algorithm for singular value
                                  decomposition and its parallelization    331--344
               Gerold Jäger and   
                 Clemens Wagner   Efficient parallelizations of Hermite
                                  and Smith normal form algorithms . . . . 345--357
             Julian Borrill and   
              Leonid Oliker and   
                 John Shalf and   
             Hongzhang Shan and   
                 Andrew Uselton   HPC global file system performance
                                  analysis using a scientific-application
                                  derived benchmark  . . . . . . . . . . . 358--373

Parallel Computing
Volume 35, Number 7, July, 2009

              Markus Geimer and   
                 Felix Wolf and   
          Brian J. N. Wylie and   
                     Bernd Mohr   A scalable tool architecture for
                                  diagnosing wait states in massively
                                  parallel applications  . . . . . . . . . 375--388
                  Jay Smith and   
           Vladimir Shestak and   
          Howard Jay Siegel and   
                 Suzy Price and   
              Larry Teklits and   
             Prasanna Sugavanam   Robust resource allocation in a cluster
                                  based imaging system . . . . . . . . . . 389--400
                  Yang Wang and   
                   Ming Zhu and   
                         Hua Li   A distributed Key Message algorithm to
                                  optimize the communication in clusters   401--415
               Hatem Ltaief and   
                    Marc Garbey   A parallel Aitken-additive Schwarz
                                  waveform relaxation suitable for the
                                  grid . . . . . . . . . . . . . . . . . . 416--428

Parallel Computing
Volume 35, Number 8--9, August / September, 2009

              Cole Trapnell and   
              Michael C. Schatz   Optimizing data intensive GPGPU
                                  computations for DNA sequence alignment  429--440
             Tz-Liang Kueng and   
             Cheng-Kuan Lin and   
                 Tyne Liang and   
            Jimmy J. M. Tan and   
                  Lih-Hsing Hsu   Embedding paths of variable lengths into
                                  hypercubes with conditional link-faults  441--454
  Arturo González-Escribano and   
     Arjan J. C. van Gemund and   
        Valentín Cardeñoso-Payo   Performance implications of
                                  synchronization structure in parallel
                                  programming  . . . . . . . . . . . . . . 455--474
              Ananta Tiwari and   
           Vahid Tabatabaee and   
       Jeffrey K. Hollingsworth   Tuning parallel applications in parallel 475--492

Parallel Computing
Volume 35, Number 10--11, October / November, 2009

             Diane Lingrand and   
            Tristan Glatard and   
                Johan Montagnat   Modeling the latency on production grids
                                  with respect to the execution context    493--511
                Anshu Dubey and   
              Katie Antypas and   
        Murali K. Ganapathy and   
               Lynn B. Reid and   
            Katherine Riley and   
                Dan Sheeler and   
              Andrew Siegel and   
                    Klaus Weide   Extensible component-based architecture
                                  for FLASH, a massively parallel,
                                  multiphysics simulation code . . . . . . 512--522
           I. Marín Carrión and   
           E. Arias Antúnez and   
     M. M. Artigao Castillo and   
      J. J. Águila Guerrero and   
          J. J. Miralles Canals   Thread-based implementations of the
                                  false nearest neighbors method . . . . . 523--534
               Hamid Mahini and   
             Hamid Sarbazi-Azad   Resource placement in three-dimensional
                                  tori . . . . . . . . . . . . . . . . . . 535--543
         Henning Meyerhenke and   
            Burkhard Monien and   
             Stefan Schamberger   Graph partitioning and disturbed
                                  diffusion  . . . . . . . . . . . . . . . 544--569

Parallel Computing
Volume 35, Number 12, December, 2009

            Franck Cappello and   
             Thomas Herault and   
                  Jack Dongarra   Foreword . . . . . . . . . . . . . . . . 571--571
                        Bin Jia   Process cooperation in multiple message
                                  broadcast  . . . . . . . . . . . . . . . 572--580
              Peter Sanders and   
               Jochen Speck and   
           Jesper Larsson Träff   Two-tree algorithms for full bandwidth
                                  broadcast, reduction and scan  . . . . . 581--594
              Daniel Becker and   
          Rolf Rabenseifner and   
                 Felix Wolf and   
                John C. Linford   Scalable timestamp synchronization for
                                  event traces of message-passing
                                  applications . . . . . . . . . . . . . . 595--607
              Rajeev Thakur and   
                  William Gropp   Test suite for evaluating performance of
                                  multithreaded MPI communication  . . . . 608--617

Parallel Computing
Volume 36, Number 1, January, 2010

       Jeffrey K. Hollingsworth   Editorial  . . . . . . . . . . . . . . . 1--2
                 P. Amestoy and   
                 I. S. Duff and   
              A. Guermouche and   
                    Tz. Slavova   Analysis of the solution phase of a
                                  parallel multifrontal approach . . . . . 3--15
                    Shigeo Orii   Metrics for evaluation of parallel
                                  efficiency toward highly parallel
                                  processing . . . . . . . . . . . . . . . 16--25
       Juan Piernas-Canovas and   
                Jarek Nieplocha   Implementation and evaluation of active
                                  storage in modern parallel file systems  26--47
            Rajesh Sudarsan and   
              Calvin J. Ribbens   Design and performance of a scheduling
                                  framework for resizable parallel
                                  applications . . . . . . . . . . . . . . 48--64
Carlos Alberto Alonso Sanches and   
         Nei Yoshihiro Soma and   
         Horacio Hideki Yanasse   Observations on optimal parallelizations
                                  of two-list  . . . . . . . . . . . . . . 65--67
                      Anonymous   Acknowledgment to Reviewers  . . . . . . 68--69
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 36, Number 2--3, February / March, 2010

           Javier Navaridas and   
         Jose Miguel-Alonso and   
  Francisco Javier Ridruejo and   
                Wolfgang Denzel   Reducing complexity in tree-like
                                  computer interconnection networks  . . . 71--85
       Hinde Lilia Bouziane and   
            Christian Pérez and   
                  Thierry Priol   Extending software component models with
                                  the master-worker paradigm . . . . . . . 86--103
                Yi-Neng Lin and   
               Ying-Dar Lin and   
                 Yuan-Cheng Lai   Thread allocation in CMP-based
                                  multithreaded network processors . . . . 104--116
            Mathieu Luisier and   
                Gerhard Klimeck   Numerical strategies towards peta-scale
                                  simulations of nanoelectronics devices   117--128
              Yusuke Okitsu and   
               Fumihiko Ino and   
               Kenichi Hagihara   High-performance cone beam
                                  reconstruction using CUDA compatible
                                  GPUs . . . . . . . . . . . . . . . . . . 129--141
                    J. Götz and   
               K. Iglberger and   
             C. Feichtinger and   
                  S. Donath and   
                        U. Rüde   Coupling multibody dynamics and
                                  computational fluid dynamics on 8192
                                  processor cores  . . . . . . . . . . . . 142--151

Parallel Computing
Volume 36, Number 4, April, 2010

             Mauricio Marin and   
         Veronica Gil-Costa and   
           Carolina Bonacic and   
        Ricardo Baeza-Yates and   
              Isaac D. Scherson   Sync/Async parallel search for the
                                  efficient design and construction of Web
                                  search engines . . . . . . . . . . . . . 153--168
          Andrzej Karbowski and   
             Maciej Remiszewski   Assessment of the Cell Broadband Engine
                                  Architecture as a platform to solve
                                  closed-loop optimal control problems . . 169--180
             M. Krotkiewski and   
                   M. Dabrowski   Parallel symmetric sparse matrix-vector
                                  product on scalar multi-core CPUs  . . . 181--198
               J. Berlińska and   
                  M. Drozdowski   Heuristics for multi-round divisible
                                  loads scheduling with limited memory . . 199--211

Parallel Computing
Volume 36, Number 5--6, June, 2010

               Costas Bekas and   
             Pasqua D'Ambra and   
               Ananth Grama and   
                Yousef Saad and   
                    Petko Yanev   Special issue on Parallel Matrix
                                  Algorithms and Applications  . . . . . . 213--214
            Joseph M. Elble and   
      Nikolaos V. Sahinidis and   
              Panagiotis Vouzis   GPU computing with Kaczmarz's and other
                                  iterative algorithms for linear systems  215--231
            Stanimire Tomov and   
              Jack Dongarra and   
                  Marc Baboulin   Towards dense linear algebra for hybrid
                                  GPU accelerated manycore systems . . . . 232--240
                Aydın Buluç and   
            John R. Gilbert and   
                    Ceren Budak   Solving path problems on the GPU . . . . 241--253
                  Bora Uçar and   
         Ümit V. Çatalyürek and   
                 Cevdet Aykanat   A Matrix Partitioning Interface to PaToH
                                  in MATLAB  . . . . . . . . . . . . . . . 254--272
                  T. Huckle and   
              A. Kallischko and   
                     A. Roy and   
                M. Sedlacek and   
                   T. Weinzierl   An efficient parallel implementation of
                                  the MSPAI preconditioner . . . . . . . . 273--284
                  L. Giraud and   
                  A. Haidar and   
                      S. Pralet   Using multiple levels of parallelism to
                                  enhance the performance of domain
                                  decomposition solvers  . . . . . . . . . 285--296
               Martin Bečka and   
               Gabriel Okša and   
           Marián Vajteršic and   
                  Laura Grigori   On iterative QR pre-processing in the
                                  parallel block-Jacobi SVD algorithm  . . 297--307
             Fabrice Dupros and   
          Florent De Martin and   
           Evelyne Foerster and   
         Dimitri Komatitsch and   
                     Jean Roman   High-performance finite-element
                                  simulations of seismic wave propagation
                                  in three-dimensional nonlinear inelastic
                                  geological media . . . . . . . . . . . . 308--325
               Maximilian Emans   Performance of parallel
                                  AMG-preconditioners in CFD-codes for
                                  weakly compressible flows  . . . . . . . 326--338
              Jose E. Roman and   
          Matthias Kammerer and   
               Florian Merz and   
                    Frank Jenko   Fast eigenvalue calculations in a
                                  massively parallel plasma turbulence
                                  code . . . . . . . . . . . . . . . . . . 339--358
            T. Auckenthaler and   
                   M. Bader and   
                  T. Huckle and   
                   A. Spörl and   
                    K. Waldherr   Matrix exponentials and parallel prefix
                                  computation in a quantum control problem 359--369

Parallel Computing
Volume 36, Number 7, July, 2010

            Ruppa K. Thulasiram   Preface  . . . . . . . . . . . . . . . . 371--371
                Vladimir Surkov   Parallel option pricing with Fourier
                                  space time-stepping method on graphics
                                  processing units . . . . . . . . . . . . 372--380
              Manfred Gilli and   
                Enrico Schumann   Distributed optimisation of a
                                  portfolio's Omega  . . . . . . . . . . . 381--389
                 S. Corsaro and   
           P. L. De Angelis and   
                  Z. Marino and   
                   F. Perla and   
                     P. Zanetti   On parallel asset-liability management
                                  in life insurance: a forward
                                  risk-neutral approach  . . . . . . . . . 390--402
             Gianluca Fusai and   
          Daniele Marazzina and   
                  Marina Marena   Option pricing, maturity randomization
                                  and distributed computing  . . . . . . . 403--414
                Giray Ökten and   
               Matthew Willyard   Parameterization based on randomized
                                  quasi-Monte Carlo methods  . . . . . . . 415--422

Parallel Computing
Volume 36, Number 8, August, 2010

             Andrew V. Terekhov   Parallel Dichotomy Algorithm for solving
                                  tridiagonal system of linear equations
                                  with multiple right-hand sides . . . . . 423--438
              Daisuke Takahashi   Parallel implementation of
                                  multiple-precision arithmetic and
                                  $2,576,980,370,000$ decimal digits of
                                  $\pi$ calculation  . . . . . . . . . . . 439--448
         Pavan Yalamanchili and   
                Sumod Mohan and   
          Rommel Jalasutram and   
                     Tarek Taha   Acceleration of hierarchical Bayesian
                                  network based cortical models on
                                  multicore architectures  . . . . . . . . 449--468
                 Tomas Hruz and   
           Stefan Geisseler and   
               Marcel Schöngens   Parallelism in simulation and modeling
                                  of scale-free complex networks . . . . . 469--485

Parallel Computing
Volume 36, Number 9, September, 2010

               Qiankun Miao and   
             Guangzhong Sun and   
               Jiulong Shan and   
                  Guoliang Chen   Parallelization and optimization of
                                  Mfold on shared memory system  . . . . . 487--494
                 Dan Gordon and   
                  Rachel Gordon   CARP--CG: a robust and efficient
                                  parallel solver for linear systems,
                                  applied to strongly convection dominated
                                  PDEs . . . . . . . . . . . . . . . . . . 495--515
                    Fei Xia and   
                   Yong Dou and   
                   Dan Zhou and   
                         Xin Li   Fine-grained parallel RNA secondary
                                  structure prediction using SCFGs on FPGA 516--530
                   Sean Rul and   
        Hans Vandierendonck and   
              Koen De Bosschere   A profile-based tool for finding
                                  pipeline parallelism in sequential
                                  programs . . . . . . . . . . . . . . . . 531--551

Parallel Computing
Volume 36, Number 10--11, October / November, 2010

         J. Ignacio Hidalgo and   
        Francisco Fernandez and   
             Juan Lanchares and   
            Erick Cantú-Paz and   
                  Albert Zomaya   Parallel Architectures and Bioinspired
                                  Algorithms . . . . . . . . . . . . . . . 553--554
                M. Ruciński and   
                    D. Izzo and   
                     F. Biscani   On the impact of the migration topology
                                  on the Island Model  . . . . . . . . . . 555--571
       José L. Risco-Martín and   
              David Atienza and   
         J. Manuel Colmenar and   
                  Oscar Garnica   A parallel evolutionary algorithm to
                                  optimize dynamic memory managers in
                                  embedded systems . . . . . . . . . . . . 572--590
           Marjan Rouhipour and   
           Peter J. Bentley and   
                 Hooman Shayani   Fast bio-inspired computation using a
                                  GPU-based systemic computer  . . . . . . 591--617
        Carlos Pérez-Miguel and   
         Jose Miguel-Alonso and   
            Alexander Mendiburu   Porting Estimation of Distribution
                                  Algorithms to the Cell Broadband Engine  618--634
           Una-May O'Reilly and   
              Eric Robinson and   
           Sanjeev Mohindra and   
               Julie Mullen and   
                    Nadya Bliss   Hogs and slackers: Using operations
                                  balance in a genetic algorithm to
                                  optimize sparse algebra computation on
                                  distributed architectures  . . . . . . . 635--644

Parallel Computing
Volume 36, Number 12, December, 2010

            Stanimire Tomov and   
                 Rajib Nath and   
                  Jack Dongarra   Accelerating the reduction to upper
                                  Hessenberg, tridiagonal, and bidiagonal
                                  forms through hybrid GPU-based computing 645--654
               K. A. Hawick and   
                   A. Leist and   
                   D. P. Playne   Parallel graph component labelling with
                                  GPUs and CUDA  . . . . . . . . . . . . . 655--678
          T. E. Athanaileas and   
         G. E. Athanasiadou and   
              G. V. Tsoulos and   
                D. I. Kaklamani   Parallel radio-wave propagation modeling
                                  with image-based ray tracing techniques  679--695
              Marina Alonso and   
              Salvador Coll and   
       Juan-Miguel Martínez and   
           Vicente Santonja and   
                Pedro López and   
                     José Duato   Power saving in regular interconnection
                                  networks . . . . . . . . . . . . . . . . 696--712

Parallel Computing
Volume 37, Number 1, January, 2011

                      Bo Li and   
                    Koichi Wada   Communication latency tolerant parallel
                                  algorithm for particle swarm
                                  optimization . . . . . . . . . . . . . . 1--10
            Yung-Chang Chiu and   
              Ce-Kuen Shieh and   
              Tzu-Chi Huang and   
             Tyng-Yeu Liang and   
                   Kuo-Chih Chu   Data race avoidance and replay scheme
                                  for developing and debugging parallel
                                  programs on distributed shared memory
                                  systems  . . . . . . . . . . . . . . . . 11--25
              Sevin Varoglu and   
                  Stephen Jenks   Architectural support for thread
                                  communications in multi-core processors  26--41
               Rahul Nagpal and   
                  Y. N. Srikant   Compiler-assisted power optimization for
                                  clustered VLIW architectures . . . . . . 42--59
              Oleg V. Shylo and   
         Timothy Middelkoop and   
              Panos M. Pardalos   Restart strategies in optimization:
                                  parallel and serial cases  . . . . . . . 60--68

Parallel Computing
Volume 37, Number 2, February, 2011

          Robert W. Numrich and   
              Michael A. Heroux   Self-similarity of parallel machines . . 69--84
                   Brice Goglin   High-performance message-passing over
                                  generic Ethernet hardware with Open-MX   85--100
                Anshu Dubey and   
              Katie Antypas and   
              Christopher Daley   Parallel algorithms for moving
                                  Lagrangian data on block structured
                                  Eulerian meshes  . . . . . . . . . . . . 101--113
          Alireza Poshtkohi and   
       M. B. Ghaznavi-Ghoushchi   DotDFS: a Grid-based high-throughput
                                  file transfer system . . . . . . . . . . 114--136
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc

Parallel Computing
Volume 37, Number 3, March, 2011

       Antonio Robles-Gómez and   
           Aurelio Bermúdez and   
                  Rafael Casado   Efficient network management applied to
                                  source routed networks . . . . . . . . . 137--156
               Liangxiu Han and   
              Chee Sun Liew and   
            Jano van Hemert and   
               Malcolm Atkinson   A generic parallel processing model for
                                  facilitating data mining and integration 157--171
                   Eric Aubanel   Scheduling of tasks in the parareal
                                  algorithm  . . . . . . . . . . . . . . . 172--182
             José I. Aliaga and   
         Matthias Bollhöfer and   
          Alberto F. Martín and   
       Enrique S. Quintana-Ortí   Exploiting thread-level parallelism in
                                  the iterative solution of sparse linear
                                  systems  . . . . . . . . . . . . . . . . 183--202
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 37, Number 4--5, April / May, 2011

               Christian Konrad   Two-constraint domain decomposition with
                                  Space Filling Curves . . . . . . . . . . 203--216
            Robert W. Robey and   
          Jonathan M. Robey and   
                     Rob Aulwes   In search of numerical consistency in
                                  parallel programming . . . . . . . . . . 217--229
             Omar Bouattane and   
          Bouchaib Cherradi and   
            Mohamed Youssfi and   
            Mohamed O. Bensalah   Parallel $c$-means algorithm for image
                                  segmentation on a reconfigurable mesh
                                  computer . . . . . . . . . . . . . . . . 230--243
                 David Díaz and   
     Francisco José Esteban and   
            Pilar Hernández and   
     Juan Antonio Caballero and   
             Gabriel Dorado and   
                  Sergio Gálvez   Parallelizing and optimizing a
                                  bioinformatics pairwise sequence
                                  alignment algorithm for many-core
                                  architecture . . . . . . . . . . . . . . 244--259
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 37, Number 6--7, June / July, 2011

       Dimitrije Jevremović and   
              Cong T. Trinh and   
           Friedrich Srienc and   
             Carlos P. Sosa and   
                   Daniel Boley   Parallelization of Nullspace Algorithm
                                  for the computation of metabolic
                                  pathways . . . . . . . . . . . . . . . . 261--278
               Fangzhou Wei and   
                  Ali E. Yilmaz   A hybrid message passing/shared memory
                                  parallelization of the adaptive integral
                                  method for multi-core clusters . . . . . 279--301
                   Hao Wang and   
                  Xudong Fu and   
             Guangqian Wang and   
                 Tiejian Li and   
                        Jie Gao   A common parallel computing framework
                                  for modeling hydrological processes of
                                  river basins . . . . . . . . . . . . . . 302--315
           Pablo D. Mininni and   
            Duane Rosenberg and   
                Raghu Reddy and   
                 Annick Pouquet   A hybrid MPI--OpenMP scheme for scalable
                                  parallel pseudospectral computations for
                                  fluid turbulence . . . . . . . . . . . . 316--326
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 37, Number 8, August, 2011

       Jeffrey K. Hollingsworth   In Memoriam: Angela C. Sodan, PhD
                                  (August 30, 1955--April 21, 2011)  . . . 327--327
                Yves Robert and   
               Leonel Sousa and   
                 Denis Trystram   Parallel Computing --- Special Issue . . 329--330
                Anne Benoit and   
             Henri Casanova and   
       Veronika Rehn-Sonigo and   
                    Yves Robert   Resource allocation for multiple
                                  concurrent in-network stream-processing
                                  applications . . . . . . . . . . . . . . 331--348
            Cristina Boeres and   
     Idalmis Milián Sardiña and   
           Lúcia M. A. Drummond   An efficient weighted bi-objective
                                  scheduling algorithm for heterogeneous
                                  systems  . . . . . . . . . . . . . . . . 349--364
                Anne Benoit and   
                Yves Robert and   
           Arnold Rosenberg and   
                Frédéric Vivien   Static worksharing strategies for
                                  heterogeneous computers with
                                  unrecoverable interruptions  . . . . . . 365--378
              Luis Garcés-Erice   Admission control for a responsive
                                  distributed middleware using decision
                                  trees to model run-time parameters . . . 379--391
                 M. M. Khan and   
                 A. D. Rast and   
               J. Navaridas and   
                     X. Jin and   
                L. A. Plana and   
                   M. Luján and   
                  S. Temple and   
               C. Patterson and   
                D. Richards and   
                J. V. Woods and   
           J. Miguel-Alonso and   
                   S. B. Furber   Event-driven configuration of a neural
                                  network CMP system over an homogeneous
                                  interconnect fabric  . . . . . . . . . . 392--409
                Anne Benoit and   
          Alexandru Dobrila and   
            Jean-Marc Nicod and   
               Laurent Philippe   Mapping workflow applications with types
                                  on heterogeneous specialized platforms   410--427
           Jorge G. Barbosa and   
                Belmiro Moreira   Dynamic scheduling of a batch of
                                  parallel task jobs on heterogeneous
                                  clusters . . . . . . . . . . . . . . . . 428--438
               Peter Benner and   
              Pablo Ezzatti and   
            Daniel Kressner and   
   Enrique S. Quintana-Ortí and   
                  Alfredo Remón   A mixed-precision algorithm for the
                                  solution of Lyapunov equations on hybrid
                                  CPU--GPU platforms . . . . . . . . . . . 439--450
                Chenqi Wang and   
             Neil Cafferkey and   
              James Kennedy and   
               John P. Morrison   CG3DR: Coordination of icosahedral virus
                                  reconstruction using Condensed Graphs    451--465
             Mathieu Giraud and   
            Jean-Stéphane Varré   Parallel Position Weight Matrices
                                  algorithms . . . . . . . . . . . . . . . 466--478
              Anna Beletska and   
       Włodzimierz Bielecki and   
               Albert Cohen and   
            Marek Palkowski and   
            Krzysztof Siedlecki   Coarse-grained loop parallelization:
                                  Iteration Space Slicing vs affine
                                  transformations  . . . . . . . . . . . . 479--497
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 37, Number 9, September, 2011

              Leonid Oliker and   
            Rajesh Nishtala and   
                   Rupak Biswas   Emerging programming paradigms for
                                  large-scale scientific computing . . . . 499--500
             Kamesh Madduri and   
                 Eun-Jin Im and   
          Khaled Z. Ibrahim and   
            Samuel Williams and   
            Stéphane Ethier and   
                  Leonid Oliker   Gyrokinetic particle-in-cell
                                  optimization on emerging multi- and
                                  manycore platforms . . . . . . . . . . . 501--520
                  Wang Xian and   
                  Aoki Takayuki   Multi-GPU performance of incompressible
                                  flow computation by lattice Boltzmann
                                  method on GPU cluster  . . . . . . . . . 521--535
      Christian Feichtinger and   
            Johannes Habich and   
             Harald Köstler and   
                Georg Hager and   
                Ulrich Rüde and   
                Gerhard Wellein   A flexible Patch-based lattice Boltzmann
                                  parallelization approach for
                                  heterogeneous GPU--CPU clusters  . . . . 536--549
         Darren J. Kerbyson and   
               Michael Lang and   
                    Scott Pakin   Adapting wave-front algorithms to
                                  efficiently utilize systems with deep
                                  communication hierarchies  . . . . . . . 550--561
               Haoqiang Jin and   
           Dennis Jespersen and   
            Piyush Mehrotra and   
               Rupak Biswas and   
                  Lei Huang and   
                Barbara Chapman   High performance computing using MPI and
                                  OpenMP on multi-core parallel systems    562--575
            Rajesh Nishtala and   
                 Yili Zheng and   
           Paul H. Hargrove and   
            Katherine A. Yelick   Tuning collective communication for
                                  Partitioned Global Address Space
                                  programming models . . . . . . . . . . . 576--591
                  David Gay and   
              Joel Galenson and   
                 Mayur Naik and   
                   Kathy Yelick   Yada: Straightforward parallel
                                  programming  . . . . . . . . . . . . . . 592--609
         Steven J. Plimpton and   
                Karen D. Devine   MapReduce in MPI for large-scale graph
                                  algorithms . . . . . . . . . . . . . . . 610--632
              Michael Wilde and   
             Mihael Hategan and   
          Justin M. Wozniak and   
               Ben Clifford and   
             Daniel S. Katz and   
                     Ian Foster   Swift: a language for distributed
                                  parallel scripting . . . . . . . . . . . 633--652
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 37, Number 10--11, October / November, 2011

                 Lizhi Peng and   
                    Bo Yang and   
                  Lei Zhang and   
                    Yuehui Chen   A parallel evolving algorithm for
                                  flexible neural tree . . . . . . . . . . 653--666
               Min Yeol Lim and   
           Vincent W. Freeh and   
             David K. Lowenthal   Adaptive, transparent CPU scaling
                                  algorithms leveraging inter-node MPI
                                  communication regions  . . . . . . . . . 667--683
            Tristan Glatard and   
            Sorina Camarasu-Pop   A model of pilot-job resource
                                  provisioning on production grids . . . . 684--692
              Loris Marchal and   
                Frédéric Vivien   Editorial  . . . . . . . . . . . . . . . 693--693
           Naga Vydyanathan and   
            Umit Catalyurek and   
                Tahsin Kurc and   
      Ponnuswamy Sadayappan and   
                     Joel Saltz   Optimizing latency and throughput of
                                  application workflows on clusters  . . . 694--712
        Ioannis Riakiotakis and   
          Florina M. Ciorba and   
        Theodore Andronikos and   
        George Papakonstantinou   Distributed dynamic load balancing for
                                  pipelined computations on heterogeneous
                                  systems  . . . . . . . . . . . . . . . . 713--729
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 37, Number 12, December, 2011

               Peter Arbenz and   
                Yousef Saad and   
                Ahmed Sameh and   
                    Olaf Schenk   Special issue on Parallel Matrix
                                  Algorithms and Applications (PMAA'10)    731--732
           Karan Mendiratta and   
                   Eric Polizzi   A threaded SPIKE algorithm for solving
                                  general banded systems . . . . . . . . . 733--741
              Daniel Maurer and   
              Christian Wieners   A parallel block LU decomposition method
                                  for distributed finite element matrices  742--758
              Chenhan D. Yu and   
              Weichung Wang and   
                   Dan'l Pierce   A CPU--GPU hybrid approach for the
                                  unsymmetric multifrontal method  . . . . 759--770
                L. Karlsson and   
                    B. Kågström   Parallel two-stage reduction to
                                  Hessenberg form using dynamic scheduling
                                  on shared-memory architectures . . . . . 771--782
            T. Auckenthaler and   
                    V. Blum and   
             H.-J. Bungartz and   
                  T. Huckle and   
                 R. Johanni and   
                  L. Krämer and   
                    B. Lang and   
                 H. Lederer and   
                  P. R. Willems   Parallel solution of partial symmetric
                                  eigenvalue problems from electronic
                                  structure calculations . . . . . . . . . 783--794
                M. Petschow and   
                  P. Bientinesi   MR$^3$-SMP: a symmetric tridiagonal
                                  eigensolver for multi-core architectures 795--805
              A. N. Yzelman and   
               Rob H. Bisseling   Two-dimensional cache-oblivious sparse
                                  matrix-vector multiplication . . . . . . 806--819
          Johannes Langguth and   
    Md. Mostofa Ali Patwary and   
                  Fredrik Manne   Parallel algorithms for bipartite
                                  matching problems on distributed memory
                                  computers  . . . . . . . . . . . . . . . 820--845
                Cyril Flaig and   
                   Peter Arbenz   A scalable memory efficient multigrid
                                  solver for micro-finite element analyses
                                  based on CT images . . . . . . . . . . . 846--854
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 1--2, January / February, 2012

                Torsten Hoefler   Extensions for next-generation parallel
                                  programming models . . . . . . . . . . . 1--1
                 Nick Rutar and   
       Jeffrey K. Hollingsworth   Data centric techniques for mapping
                                  performance data to program variables    2--14
              Joshua Hursey and   
              Richard L. Graham   Analyzing fault aware collective
                                  performance in a process fault tolerant
                                  MPI  . . . . . . . . . . . . . . . . . . 15--25
           Jesper Larsson Träff   Alternative, uniformly expressive and
                                  more scalable interfaces for collective
                                  communication in MPI . . . . . . . . . . 26--36
             George Bosilca and   
        Aurelien Bouteiller and   
            Anthony Danalis and   
             Thomas Herault and   
          Pierre Lemarinier and   
                  Jack Dongarra   DAGuE: a generic distributed DAG engine
                                  for High Performance Computing . . . . . 37--51
          Martin Sandrieser and   
          Siegfried Benkner and   
                   Sabri Pllana   Using explicit platform descriptions to
                                  support programming of heterogeneous
                                  many-core systems  . . . . . . . . . . . 52--65
                Phil Miller and   
               Aaron Becker and   
                 Laxmikant Kalé   Using shared arrays in message-driven
                                  parallel programs  . . . . . . . . . . . 66--74
               Pieter Hijma and   
      Rob V. van Nieuwpoort and   
        Ceriel J. H. Jacobs and   
                   Henri E. Bal   Generating synchronization statements in
                                  divide-and-conquer programs  . . . . . . 75--89
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 3, March, 2012

        Lucas Mello Schnorr and   
            Guillaume Huard and   
Philippe Olivier Alexandre Navaux   A hierarchical aggregation model to
                                  achieve visualization scalability in the
                                  analysis of parallel applications  . . . 91--110
              Holger Scherl and   
          Markus Kowarschik and   
          Hannes G. Hofmann and   
              Benjamin Keck and   
              Joachim Hornegger   Evaluation of state-of-the-art hardware
                                  architectures for fast cone-beam CT
                                  reconstruction . . . . . . . . . . . . . 111--124
                  A. Moreno and   
                   E. Cesar and   
                 A. Guevara and   
                J. Sorribes and   
                    T. Margalef   Load balancing in homogeneous pipeline
                                  based applications . . . . . . . . . . . 125--139
       Aleksandr Ovcharenko and   
              Daniel Ibanez and   
          Fabien Delalondre and   
                Onkar Sahni and   
          Kenneth E. Jansen and   
   Christopher D. Carothers and   
               Mark S. Shephard   Neighborhood communication paradigm to
                                  increase scalability in large-scale
                                  dynamic scientific applications  . . . . 140--156
           Andreas Klöckner and   
              Nicolas Pinto and   
                 Yunsup Lee and   
            Bryan Catanzaro and   
                Paul Ivanov and   
                    Ahmed Fasih   PyCUDA and PyOpenCL: a scripting-based
                                  approach to GPU run-time code generation 157--174
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 4--5, April / May, 2012

              Minhaj Ahmad Khan   Scheduling for heterogeneous systems
                                  using constrained critical paths . . . . 175--193
             Kathryn Mohror and   
             Karen L. Karavanic   Trace profiling: Scalable event tracing
                                  on high-end parallel systems . . . . . . 194--225
              Gerassimos Barlas   Cluster-based optimized parallel video
                                  transcoding  . . . . . . . . . . . . . . 226--244
              H. M. Aktulga and   
              J. C. Fogarty and   
               S. A. Pandit and   
                    A. Y. Grama   Parallel reactive molecular dynamics:
                                  Numerical methods and algorithmic
                                  techniques . . . . . . . . . . . . . . . 245--259
          Roman Wyrzykowski and   
            Krzysztof Rojek and   
                 Lukasz Szustak   Model-driven adaptation of
                                  double-precision matrix multiplication
                                  to the Cell processor architecture . . . 260--276
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 6--7, June / July, 2012

                F. Argüello and   
                D. B. Heras and   
                     M. Bóo and   
             J. Lamas-Rodríguez   The split-and-merge method in general
                                  purpose computation on GPUs  . . . . . . 277--288
      Timothy D. R. Hartley and   
                 Erik Saule and   
             Ümit V. Çatalyürek   Improving performance of adaptive
                                  component-based dataflow middleware  . . 289--309
                    Peng Di and   
                     Hui Wu and   
               Jingling Xue and   
                  Feng Wang and   
                    Canqun Yang   Parallelizing SOR for GPGPUs using
                                  alternate loop tiling  . . . . . . . . . 310--328
               Rahul Nagpal and   
                 Anasua Bhowmik   Criticality guided energy aware
                                  speculation for speculative
                                  multithreaded processors . . . . . . . . 329--341
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 8, August, 2012

      Volodymyr Kindratenko and   
            Gregory D. Peterson   Application accelerators in HPC ---
                                  Editorial introduction . . . . . . . . . 343--343
          Andrew G. Schmidt and   
           Siddhartha Datta and   
           Ashwin A. Mendon and   
                       Ron Sass   Investigation into scaling I/O bound
                                  streaming applications productively with
                                  an all-FPGA cluster  . . . . . . . . . . 344--364
           Frederico Pratas and   
             Pedro Trancoso and   
               Leonel Sousa and   
      Alexandros Stamatakis and   
                Guochun Shi and   
          Volodymyr Kindratenko   Fine-grain parallelism using multi-core,
                                  Cell/BE, and GPU Systems . . . . . . . . 365--390
                    Peng Du and   
                 Rick Weber and   
             Piotr Luszczek and   
            Stanimire Tomov and   
           Gregory Peterson and   
                  Jack Dongarra   From CUDA to OpenCL: Towards a
                                  performance-portable solution for
                                  multi-platform GPU programming . . . . . 391--407
          Francisco Vázquez and   
       José Jesús Fernández and   
                Ester M. Garzón   Automatic tuning of the sparse matrix
                                  vector product on GPUs based on the
                                  ELLR-T approach  . . . . . . . . . . . . 408--420
                Depeng Yang and   
        Gregory D. Peterson and   
                     Husheng Li   Compressed sensing and Cholesky
                                  decomposition on FPGAs and GPUs  . . . . 421--437
           John R. Wernsing and   
                     Greg Stitt   Elastic computing: a portable
                                  optimization framework for hybrid
                                  computers  . . . . . . . . . . . . . . . 438--464
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 9, September, 2012

        Basilio B. Fraguela and   
           Ganesh Bikshandi and   
                    Jia Guo and   
          María J. Garzarán and   
                David Padua and   
            Christoph von Praun   Optimization techniques for efficient
                                  HTA programs . . . . . . . . . . . . . . 465--484
           Takeshi Iwashita and   
                Yu Hirotani and   
             Takeshi Mifune and   
            Toshio Murayama and   
                  Hideki Ohtani   Large-scale time-harmonic
                                  electromagnetic field analysis using a
                                  multigrid solver on a distributed memory
                                  parallel computer  . . . . . . . . . . . 485--500
              Amit Amritkar and   
               Danesh Tafti and   
                    Rui Liu and   
                Rick Kufrin and   
                Barbara Chapman   OpenMP parallelism for fluid and
                                  fluid-particulate systems  . . . . . . . 501--517
       Wlodzimierz Bielecki and   
            Marek Palkowski and   
                  Tomasz Klimek   Free scheduling for statement instances
                                  of parameterized arbitrarily nested
                                  affine loops . . . . . . . . . . . . . . 518--532
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 10--11, October / November, 2012

                  Yong Chen and   
                 Huaiyu Zhu and   
                    Hui Jin and   
                    Xian-He Sun   Algorithm-level Feedback-controlled
                                  Adaptive data prefetcher: Accelerating
                                  data access for high-performance
                                  processors . . . . . . . . . . . . . . . 533--551
          Mickeal Verschoor and   
                Andrei C. Jalba   Analysis and performance estimation of
                                  the Conjugate Gradient method on
                                  multiple GPUs  . . . . . . . . . . . . . 552--575
         Ümit V. Çatalyürek and   
                   John Feo and   
     Assefaw H. Gebremedhin and   
     Mahantesh Halappanavar and   
                    Alex Pothen   Graph coloring algorithms for multi-core
                                  and massively multithreaded
                                  architectures  . . . . . . . . . . . . . 576--594
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 38, Number 12, December, 2012

                Madan Sathe and   
                Olaf Schenk and   
                Helmar Burkhart   An auction-based weighted matching
                                  implementation on massively parallel
                                  architectures  . . . . . . . . . . . . . 595--614
                 M. Etinski and   
                J. Corbalan and   
                 J. Labarta and   
                      M. Valero   Parallel job scheduling for power
                                  constrained HPC systems  . . . . . . . . 615--630
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 1, January, 2013

           Dana A. Jacobsen and   
                  Inanc Senocak   Multi-level parallelism for
                                  incompressible flow computations on GPU
                                  clusters . . . . . . . . . . . . . . . . 1--20
            Masha Sosonkina and   
            Layne T. Watson and   
      Nicholas R. Radcliffe and   
           Rafael T. Haftka and   
             Michael W. Trosset   Adjusting process count on demand for
                                  petascale global optimization  . . . . . 21--35
              Diego Andrade and   
        Basilio B. Fraguela and   
                   Ramón Doallo   Accurate prediction of the behavior of
                                  multithreaded applications in shared
                                  caches . . . . . . . . . . . . . . . . . 36--57
              Orlando Ayala and   
                 Lian-Ping Wang   Parallel implementation and scalability
                                  analysis of $3$D Fast Fourier Transform
                                  using $2$D domain decomposition  . . . . 58--77
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 2, February, 2013

              Abhinav Sarje and   
                 Srinivas Aluru   All-pairs computations on many-core
                                  graphics processors  . . . . . . . . . . 79--93
          Ferit Büyükkeçeci and   
                 Omar Awile and   
              Ivo F. Sbalzarini   A portable OpenCL implementation of
                                  generic particle-mesh and mesh-particle
                                  interpolation in $2$D and $3$D . . . . . 94--111
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 3, March, 2013

                      Anonymous   Preface: Infrastructure for scalable
                                  tools  . . . . . . . . . . . . . . . . . 113--113
                Mark W. Krentel   Libmonitor: a tool for first-party
                                  monitoring . . . . . . . . . . . . . . . 114--119
                 Nick Rutar and   
       Jeffrey K. Hollingsworth   Software techniques for negating skid
                                  and approximating cache miss
                                  measurements . . . . . . . . . . . . . . 120--131
        Marc-André Hermanns and   
      Sriram Krishnamoorthy and   
                     Felix Wolf   A scalable infrastructure for the
                                  performance analysis of passive target
                                  synchronization  . . . . . . . . . . . . 132--145
             Michael O. Lam and   
   Jeffrey K. Hollingsworth and   
                  G. W. Stewart   Dynamic floating-point cancellation
                                  detection  . . . . . . . . . . . . . . . 146--155
             Barry Rountree and   
               Todd Gamblin and   
      Bronis R. de Supinski and   
              Martin Schulz and   
         David K. Lowenthal and   
                   Guy Cobb and   
                     Henry Tufo   Parallelizing heavyweight debugging
                                  tools with mpiecho . . . . . . . . . . . 156--166
              J. D. Goehner and   
               D. C. Arnold and   
                  D. H. Ahn and   
                  G. L. Lee and   
          B. R. de Supinski and   
             M. P. LeGendre and   
               B. P. Miller and   
                      M. Schulz   LIBI: a framework for bootstrapping
                                  extreme scale software systems . . . . . 167--176
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 4--5, April / May, 2013

                     Sen Su and   
                    Jian Li and   
              Qingjia Huang and   
                 Xiao Huang and   
                 Kai Shuang and   
                       Jie Wang   Cost-efficient task scheduling for
                                  executing large programs in the cloud    177--188
             George Teodoro and   
                   Tony Pan and   
             Tahsin M. Kurc and   
                   Jun Kong and   
           Lee A. D. Cooper and   
                  Joel H. Saltz   Efficient irregular wavefront
                                  propagation algorithms on hybrid
                                  CPU--GPU machines  . . . . . . . . . . . 189--211
              Jack Dongarra and   
            Mathieu Faverge and   
             Thomas Hérault and   
          Mathias Jacquelin and   
              Julien Langou and   
                    Yves Robert   Hierarchical QR factorization algorithms
                                  for multi-core clusters  . . . . . . . . 212--232
             Wagner Kolberg and   
         Pedro de B. Marcos and   
          Julio C. S. Anjos and   
   Alexandre K. S. Miyazaki and   
           Claudio R. Geyer and   
             Luciana B. Arantes   MRSG --- a MapReduce simulator over
                                  SimGrid  . . . . . . . . . . . . . . . . 233--244
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 6--7, June / July, 2013

             Andrew V. Terekhov   A fast parallel algorithm for solving
                                  block-tridiagonal systems of linear
                                  equations including the domain
                                  decomposition method . . . . . . . . . . 245--258
          Christian Obrecht and   
            Frédéric Kuznik and   
        Bernard Tourancheau and   
              Jean-Jacques Roux   Scalable lattice Boltzmann solvers for
                                  CUDA GPU clusters  . . . . . . . . . . . 259--270
                Yuefan Deng and   
                 Peng Zhang and   
             Carlos Marques and   
                Reid Powell and   
                       Li Zhang   Analysis of Linpack and power
                                  efficiencies of the world's TOP500
                                  supercomputers . . . . . . . . . . . . . 271--279
          Ichitaro Yamazaki and   
              Hiroto Tadano and   
            Tetsuya Sakurai and   
                Tsutomu Ikegami   Performance comparison of parallel
                                  eigensolvers based on a contour integral
                                  method and a Lanczos method  . . . . . . 280--290
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 8, August, 2013

                  Yang Wang and   
                        Paul Lu   DDS: a deadlock detection-based
                                  scheduling algorithm for workflow
                                  computations in HPC systems with storage
                                  constraints  . . . . . . . . . . . . . . 291--305
                A. Sandroos and   
                I. Honkonen and   
             S. von Alfthan and   
                    M. Palmroth   Multi-GPU simulations of Vlasov's
                                  equation using Vlasiator . . . . . . . . 306--318
               O. Fortmeier and   
               H. M. Bücker and   
       B. O. Fagginger Auer and   
                R. H. Bisseling   A new metric enabling an exact
                                  hypergraph model for the communication
                                  volume in distributed-memory parallel
                                  applications . . . . . . . . . . . . . . 319--335
              Harald Servat and   
               Germán Llort and   
                 Kevin Huck and   
              Judit Giménez and   
                  Jesús Labarta   Framework for a productive performance
                                  optimization . . . . . . . . . . . . . . 336--353
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 9, September, 2013

              Fangyang Shen and   
                   Mei Yang and   
                Maurizio Palesi   Guest Editors' Introduction to the
                                  Special Issue on ``Novel On-Chip
                                  Parallel Architectures and Software
                                  Support''  . . . . . . . . . . . . . . . 355--356
              Sandeep Pande and   
            Fearghal Morgan and   
                Gerard Smit and   
              Tom Bruintjes and   
             Jochem Rutgers and   
             Brian McGinley and   
              Seamus Cawley and   
                 Jim Harkin and   
                    Liam McDaid   Fixed latency on-chip interconnect for
                                  hardware spiking neural network
                                  architectures  . . . . . . . . . . . . . 357--371
                Junghee Lee and   
    Chrysostomos Nicopoulos and   
              Hyung Gyu Lee and   
                    Jongman Kim   Sharded Router: a novel on-chip router
                                  architecture employing bandwidth
                                  sharding and stealing  . . . . . . . . . 372--388
      Michael Opoku Agyeman and   
              Ali Ahmadinia and   
               Alireza Shahrabi   Efficient routing techniques in
                                  heterogeneous $3$D Networks-on-Chip  . . 389--407
              Xiaohang Wang and   
                   Peng Liu and   
                   Mei Yang and   
                  Yingtao Jiang   Avoiding request-request type
                                  message-dependent deadlocks in
                                  networks-on-chips  . . . . . . . . . . . 408--423
    Ashkan Beyranvand Nejad and   
                Anca Molnos and   
   Matias Escudero Martinez and   
                  Kees Goossens   A hardware/software platform for QoS
                                  bridging over multi-chip NoC-based
                                  systems  . . . . . . . . . . . . . . . . 424--441
             José M. Andión and   
              Manuel Arenaz and   
          Gabriel Rodríguez and   
                   Juan Touriño   A novel compiler support for automatic
                                  parallelization on multicore systems . . 442--460
                  Jiyang Yu and   
                   Peng Liu and   
               Weidong Wang and   
             Chunming Huang and   
                   Jie Yang and   
              Yingtao Jiang and   
                   Qingdong Yao   An efficient protocol with
                                  synchronization accelerator for
                                  multi-processor embedded systems . . . . 461--474
         Carlos H. González and   
            Basilio B. Fraguela   A framework for argument-based task
                                  synchronization with automatic detection
                                  of dependencies  . . . . . . . . . . . . 475--489
              Guiyuan Jiang and   
                  Jigang Wu and   
                     Jizhou Sun   Efficient reconfiguration algorithms for
                                  communication-aware three-dimensional
                                  processor arrays . . . . . . . . . . . . 490--503
           Giovanni Mariani and   
           Gianluca Palermo and   
          Vittorio Zaccaria and   
               Cristina Silvano   ARTE: an Application-specific Run-Time
                                  managEment framework for multi-cores
                                  based on queuing models  . . . . . . . . 504--519
             Jingweijia Tan and   
                    Yang Yi and   
              Fangyang Shen and   
                         Xin Fu   Modeling and characterizing GPGPU
                                  reliability in the presence of soft
                                  errors . . . . . . . . . . . . . . . . . 520--532
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 10, October, 2013

         Marcin Krotkiewski and   
               Marcin Dabrowski   Efficient $3$D stencil computations
                                  using CUDA . . . . . . . . . . . . . . . 533--548
                   J. Joven and   
                A. Marongiu and   
               F. Angiolini and   
                  L. Benini and   
                  G. De Micheli   An integrated, programming model-driven
                                  framework for NoC--QoS support in
                                  cluster-based embedded many-cores  . . . 549--566
               Laiping Zhao and   
                  Yizhi Ren and   
                Kouichi Sakurai   Reliable workflow scheduling with less
                                  resource redundancy  . . . . . . . . . . 567--585
                 Libo Huang and   
                  Nong Xiao and   
               Zhiying Wang and   
               Yongwen Wang and   
                    Mingche Lai   Efficient multimedia coprocessor with
                                  enhanced SIMD engines for exploiting ILP
                                  and DLP  . . . . . . . . . . . . . . . . 586--602
          Dimitris Saougkos and   
                   George Manis   Self adaptive run time scheduling for
                                  the automatic parallelization of loops
                                  with the C2$ \mu $TC/SL compiler . . . . 603--614
        Agustín C. Caminero and   
       Antonio Robles-Gómez and   
               Salvador Ros and   
          Roberto Hernández and   
                 Llanos Tobarra   P2P-based resource discovery in dynamic
                                  grids allowing multi-attribute and range
                                  queries  . . . . . . . . . . . . . . . . 615--637
              Xiaoliang Wan and   
                      Guang Lin   Hybrid parallel computing of minimum
                                  action method  . . . . . . . . . . . . . 638--651
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??

Parallel Computing
Volume 39, Number 11, November, 2013

              Gregory Tauer and   
                    Rakesh Nagi   A map-reduce Lagrangian heuristic for
                                  multidimensional assignment problems
                                  with decomposable costs  . . . . . . . . 653--668
             G. R. Mudalige and   
                M. B. Giles and   
           J. Thiyagalingam and   
               I. Z. Reguly and   
                C. Bertolli and   
             P. H. J. Kelly and   
                A. E. Trefethen   Design and initial performance of a
                                  high-level unstructured mesh framework
                                  on heterogeneous parallel systems  . . . 669--692
           Javier Navaridas and   
               Steve Furber and   
                Jim Garside and   
                    Xin Jin and   
               Mukaram Khan and   
               David Lester and   
                Mikel Luján and   
         José Miguel-Alonso and   
           Eustace Painkras and   
          Cameron Patterson and   
              Luis A. Plana and   
             Alexander Rast and   
           Dominic Richards and   
                  Yebin Shi and   
               Steve Temple and   
                    Jian Wu and   
                    Shufan Yang   SpiNNaker: Fault tolerance in a power-
                                  and area- constrained large-scale
                                  neuromimetic architecture  . . . . . . . 693--708
             Hameed Hussain and   
       Saif Ur Rehman Malik and   
               Abdul Hameed and   
           Samee Ullah Khan and   
               Gage Bickler and   
            Nasro Min-Allah and   
     Muhammad Bilal Qureshi and   
                Limin Zhang and   
                Wang Yongji and   
                Nasir Ghani and   
           Joanna Kolodziej and   
           Albert Y. Zomaya and   
             Cheng-Zhong Xu and   
               Pavan Balaji and   
             Abhinav Vishnu and   
              Fredric Pinel and   
         Johnatan E. Pecero and   
         Dzmitry Kliazovich and   
              Pascal Bouvry and   
               Hongxiang Li and   
                 Lizhe Wang and   
                   Dan Chen and   
                    Ammar Rayes   A survey on resource allocation in high
                                  performance distributed computing
                                  systems  . . . . . . . . . . . . . . . . 709--736
              Hoang-Vu Dang and   
                 Bertil Schmidt   CUDA-enabled Sparse Matrix-Vector
                                  Multiplication on GPUs using atomic
                                  operations . . . . . . . . . . . . . . . 737--750

Parallel Computing
Volume 39, Number 12, December, 2013

                  Yong Chen and   
               Pavan Balaji and   
                 Abhinav Vishnu   Special issue on programming models,
                                  systems software, and tools for High-End
                                  Computing  . . . . . . . . . . . . . . . 751--752
                   Wei Tang and   
                 Dongxu Ren and   
                Zhiling Lan and   
                  Narayan Desai   Toward balanced and sustainable job
                                  scheduling for production supercomputers 753--768
               Mark Gardner and   
                Paul Sathre and   
               Wu-chun Feng and   
               Gabriel Martinez   Characterizing the challenges and
                                  evaluating the efficacy of a
                                  CUDA-to-OpenCL translator  . . . . . . . 769--786
                Zhiyi Huang and   
               Kai-Cheung Leung   Performance evaluation of View-Oriented
                                  Transactional Memory . . . . . . . . . . 787--801
                 E. J. Otoo and   
              Gideon Nimako and   
            Daniel Ohene-Kwofie   Chunked extendible dense arrays for
                                  scientific data storage  . . . . . . . . 802--818
              Shannon Steinfadt   Fine-grained parallel implementations
                                  for SWAMP+ Smith--Waterman alignment . . 819--833
                   Jie Shen and   
               Jianbin Fang and   
                  Henk Sips and   
           Ana Lucia Varbanescu   An application-centric evaluation of
                                  OpenCL on multi-core CPUs  . . . . . . . 834--850
             Hisham Mohamed and   
      Stéphane Marchand-Maillet   MRO-MPI: MapReduce overlapping using MPI
                                  and an optimized data exchange policy    851--866
        Omer Erdil Albayrak and   
              Ismail Akturk and   
                   Ozcan Ozturk   Improving application behavior on
                                  heterogeneous manycore systems through
                                  kernel mapping . . . . . . . . . . . . . 867--878
        Alexander Reinefeld and   
            Robert Döbbelin and   
                Thorsten Schütt   Analyzing the performance of SMP memory
                                  allocators with iterative MapReduce
                                  applications . . . . . . . . . . . . . . 879--889

Parallel Computing
Volume 40, Number 1, January, 2014

                  L. Yavits and   
                   A. Morad and   
                     R. Ginosar   The effect of communication and
                                  synchronization on Amdahl's law in
                                  multicore systems  . . . . . . . . . . . 1--16
       Lois Curfman McInnes and   
                Barry Smith and   
                 Hong Zhang and   
             Richard Tran Mills   Hierarchical Krylov and nested Krylov
                                  methods for extreme-scale computing  . . 17--31

Parallel Computing
Volume 40, Number 2, February, 2014

               Pavan Balaji and   
                    Zhiyi Huang   Special issue on programming models and
                                  applications for multicores and
                                  manycores --- Guest Editors'
                                  introduction . . . . . . . . . . . . . . 33--34
                Mark Utting and   
             Min-Hsien Weng and   
                 John G. Cleary   The JStar language philosophy  . . . . . 35--50
               Weihua Sheng and   
           Stefan Schürmans and   
        Maximilian Odendahl and   
               Mark Bertsch and   
           Vitaliy Volevach and   
             Rainer Leupers and   
                   Gerd Ascheid   A compiler infrastructure for embedded
                                  heterogeneous MPSoCs . . . . . . . . . . 51--68
                      Vikas and   
            Nasser Giacaman and   
                  Oliver Sinnen   Multiprocessing with GUI-awareness using
                                  OpenMP-like directives in Java . . . . . 69--89
                 Oded Green and   
                   Yitzhak Birk   Scheduling directives: Accelerating
                                  shared-memory many-core processor
                                  execution  . . . . . . . . . . . . . . . 90--106
              Zhenning Wang and   
                 Long Zheng and   
                  Quan Chen and   
                      Minyi Guo   CPU + GPU scheduling with asymptotic
                                  profiling  . . . . . . . . . . . . . . . 107--115
                     Yu Liu and   
                Kento Emoto and   
                   Zhenjiang Hu   A Generate-Test-Aggregate parallel
                                  programming library for systematic
                                  parallel programming . . . . . . . . . . 116--135
                 Zhijun Hao and   
               Chenning Xie and   
                 Haibo Chen and   
                     Binyu Zang   X10-FT: Transparent fault tolerance for
                                  APGAS language and runtime . . . . . . . 136--156

Parallel Computing
Volume 40, Number 3--4, March, 2014

        Mohammad Reza Selim and   
          Mohammed Ziaur Rahman   Carrying on the legacy of imperative
                                  languages in the future parallel
                                  computing era  . . . . . . . . . . . . . 1--33
      Jean-Yves L'Excellent and   
          Wissam M. Sid-Lakhdar   A study of shared-memory parallelism in
                                  a multifrontal solver  . . . . . . . . . 34--46

Parallel Computing
Volume 40, Number 5--6, May, 2014

             Urban Borstnik and   
         Joost VandeVondele and   
               Valéry Weber and   
                    Jürg Hutter   Sparse matrix multiplication: the
                                  distributed block-compressed sparse row
                                  library  . . . . . . . . . . . . . . . . 47--58
              Yuki Sugimoto and   
               Fumihiko Ino and   
               Kenichi Hagihara   Improving cache locality for GPU-based
                                  volume rendering . . . . . . . . . . . . 59--69
              Ray-Bing Chen and   
            Yaohung M. Tsai and   
                  Weichung Wang   Adaptive block size for dense $ Q R $
                                  factorization in hybrid CPU--GPU systems
                                  via statistical modeling . . . . . . . . 70--85
         Michael J. Hallock and   
              John E. Stone and   
             Elijah Roberts and   
                  Corey Fry and   
          Zaida Luthey-Schulten   Simulation of reaction diffusion
                                  processes over biologically relevant
                                  size and time scales using multi-GPU
                                  workstations . . . . . . . . . . . . . . 86--99
               Ivan Teixidó and   
              Francesc Sebé and   
                Josep Conde and   
               Francesc Solsona   MPI-based implementation of an enhanced
                                  algorithm to solve the LPN problem in a
                                  memory-constrained environment . . . . . 100--112
          Alberto F. Martín and   
               Ruymán Reyes and   
              Rosa M. Badia and   
       Enrique S. Quintana-Ortí   Leveraging task-parallelism in
                                  message-passing dense matrix
                                  factorizations using SMPSs . . . . . . . 113--128
            Jose A. Pascual and   
         Jose Miguel-Alonso and   
                 Jose A. Lozano   Application-aware metrics for partition
                                  selection in cube-shaped topologies  . . 129--139
            Robert Hallberg and   
               Alistair Adcroft   An order-invariant real-to-integer
                                  conversion sum . . . . . . . . . . . . . 140--143
               Oscar Peredo and   
            Julián M. Ortiz and   
            José R. Herrero and   
            Cristóbal Samaniego   Tuning and hybrid parallelization of a
                                  genetic-based multi-point statistics
                                  simulation code  . . . . . . . . . . . . 144--158
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 40, Number 7, July, 2014

               Costas Bekas and   
               Ananth Grama and   
                Yousef Saad and   
                    Olaf Schenk   Parallel matrix algorithms . . . . . . . 159--160
              Robert Andrew and   
                Nicholas Dingle   Implementing $ Q R $ factorization
                                  updating algorithms on GPUs  . . . . . . 161--172
           Yiannis Cotronis and   
       Elias Konstantinidis and   
             Maria A. Louka and   
          Nikolaos M. Missirlis   A comparison of CPU and GPU
                                  implementations for solving the
                                  convection diffusion equation using the
                                  local modified SOR method  . . . . . . . 173--185
            T. Auckenthaler and   
                  T. Huckle and   
                    R. Wittmann   A blocked $ Q R $-decomposition for the
                                  parallel symmetric eigenvalue problem    186--194
        Hasan Metin Aktulga and   
                    Lin Lin and   
          Christopher Haine and   
               Esmond G. Ng and   
                      Chao Yang   Parallel eigenvalue calculation based on
                                  multiple shift-invert Lanczos and
                                  contour integral based spectral
                                  projection method  . . . . . . . . . . . 195--212
              Marc Baboulin and   
           Dulceneia Becker and   
             George Bosilca and   
            Anthony Danalis and   
                  Jack Dongarra   An efficient distributed randomized
                                  algorithm for solving large dense
                                  symmetric indefinite linear systems  . . 213--223
                 P. Ghysels and   
                    W. Vanroose   Hiding global synchronization latency in
                                  the preconditioned conjugate gradient
                                  algorithm  . . . . . . . . . . . . . . . 224--238
                Erhan Turan and   
                   Peter Arbenz   Large scale micro finite element
                                  analysis of $3$D bone poroelasticity . . 239--250
                Michele Martone   Efficient multithreaded untransposed,
                                  transposed or symmetric sparse
                                  matrix-vector multiplication with the
                                  Recursive Sparse Blocks format . . . . . 251--270
                L. Karlsson and   
                B. Kågström and   
                      E. Wadbro   Fine-grained bulge-chasing kernels for
                                  strongly scalable parallel $ Q R $
                                  algorithms . . . . . . . . . . . . . . . 271--288
                J. Langguth and   
                    A. Azad and   
            M. Halappanavar and   
                       F. Manne   On parallel push-relabel based
                                  algorithms for bipartite maximum
                                  matching . . . . . . . . . . . . . . . . 289--308
               Jesús Cámara and   
              Javier Cuenca and   
          Luis-Pedro García and   
                Domingo Giménez   Auto-tuned nested parallelism: a way to
                                  reduce the execution time of scientific
                                  software in NUMA systems . . . . . . . . 309--327
       Emanuel H. Rubensson and   
                  Elias Rudberg   Chunks and Tasks: a programming model
                                  for parallelization of dynamic
                                  algorithms . . . . . . . . . . . . . . . 328--343
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 40, Number 8, August, 2014

      María Botón-Fernández and   
   Miguel A. Vega-Rodríguez and   
     Francisco Prieto Castrillo   Self-adaptivity for grid applications.
                                  An Efficient Resources Selection model
                                  based on evolutionary computation
                                  algorithms . . . . . . . . . . . . . . . 345--361
             Chihiro Kodama and   
              Masaaki Terai and   
              Akira T. Noda and   
               Yohei Yamada and   
               Masaki Satoh and   
              Tatsuya Seiki and   
              Shin-ichi Iga and   
            Hisashi Yashiro and   
            Hirofumi Tomita and   
                   Kazuo Minami   Scalable rank-mapping algorithm for an
                                  icosahedral grid system on the massive
                                  parallel computer with a $3$-D torus
                                  network  . . . . . . . . . . . . . . . . 362--373
            Angeles Navarro and   
              Rafael Asenjo and   
          Francisco Corbera and   
            Antonio J. Dios and   
               Emilio L. Zapata   A case study of different task
                                  implementations for multioutput stages
                                  in non-trivial parallel pipeline
                                  applications . . . . . . . . . . . . . . 374--393
           J. Sánchez-Curto and   
         P. Chamorro-Posada and   
                 G. S. McDonald   Efficient parallel implementation of the
                                  nonparaxial beam propagation method  . . 394--407
                   Jie Chen and   
               Tom L. H. Li and   
                 Mihai Anitescu   A parallel linear solver for multilevel
                                  Toeplitz systems with possibly several
                                  right-hand sides . . . . . . . . . . . . 408--424
          Roman Wyrzykowski and   
             Lukasz Szustak and   
                Krzysztof Rojek   Parallelization of $2$D MPDATA EULAG
                                  algorithm on hybrid architectures with
                                  GPU accelerators . . . . . . . . . . . . 425--447
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 40, Number 9, October, 2014

               Joao Andrade and   
             Gabriel Falcao and   
                    Vitor Silva   Optimized Fast Walsh--Hadamard Transform
                                  on GPUs for non-binary LDPC decoding . . 449--453
               Ehsan Totoni and   
           Michael T. Heath and   
              Laxmikant V. Kale   Structure-adaptive parallel solution of
                                  sparse triangular linear systems . . . . 454--470
            Diego Arroyuelo and   
           Carolina Bonacic and   
         Veronica Gil-Costa and   
             Mauricio Marin and   
                Gonzalo Navarro   Distributed text search using suffix
                                  arrays . . . . . . . . . . . . . . . . . 471--495
             Yingchong Situ and   
          Chandra S. Martha and   
           Matthew E. Louis and   
                 Zhiyuan Li and   
             Ahmed H. Sameh and   
       Gregory A. Blaisdell and   
        Anastasios S. Lyrintzis   Petascale large eddy simulation of jet
                                  engine noise based on the truncated
                                  SPIKE algorithm  . . . . . . . . . . . . 496--511
        Lucas Mello Schnorr and   
Philippe Olivier Alexandre Navaux   Best of SBAC--PAD 2012 . . . . . . . . . 512--513
                 Luiz Ramos and   
              Ricardo Bianchini   Robust performance in hybrid-memory
                                  cooperative caches . . . . . . . . . . . 514--525
                Joefon Jann and   
          R. Sarma Burugula and   
           Ching-Farn E. Wu and   
           Kaoutar El Maghraoui   Towards an immortal operating system in
                                  virtual environments . . . . . . . . . . 526--535
            Esteban Meneses and   
               Osman Sarood and   
              Laxmikant V. Kalé   Energy profile of rollback-recovery
                                  strategies in high performance computing 536--547
                Teo Milanez and   
           Sylvain Collange and   
Fernando Magno Quintão Pereira and   
          Wagner Meira, Jr. and   
                Renato Ferreira   Thread scheduling and memory coalescing
                                  for dynamic vectorization of SPMD
                                  workloads  . . . . . . . . . . . . . . . 548--558
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 40, Number 10, December, 2014

                     Li Tan and   
        Shashank Kothapalli and   
             Longxiang Chen and   
              Omar Hussaini and   
               Ryan Bissiri and   
                   Zizhong Chen   A survey of power and energy efficient
                                  techniques for high performance
                                  numerical linear algebra operations  . . 559--573
            Antonio J. Peña and   
               Carlos Reaño and   
             Federico Silla and   
                Rafael Mayo and   
   Enrique S. Quintana-Ortí and   
                     José Duato   A complete and efficient CUDA-sharing
                                  solution for HPC clusters  . . . . . . . 574--588
             George Teodoro and   
                   Tony Pan and   
                Tahsin Kurc and   
                   Jun Kong and   
                 Lee Cooper and   
               Scott Klasky and   
                     Joel Saltz   Region templates: Data representation
                                  and management for high-throughput image
                                  analysis . . . . . . . . . . . . . . . . 589--610
                Yizhuo Wang and   
                 Yang Zhang and   
                     Yan Su and   
               Xiaojun Wang and   
                    Xu Chen and   
                 Weixing Ji and   
                       Feng Shi   An adaptive and hierarchical task
                                  scheduling scheme for multi-core
                                  clusters . . . . . . . . . . . . . . . . 611--627
               Andrew White and   
                  Soo-Young Lee   Derivation of optimal input parameters
                                  for minimizing execution time of
                                  matrix-based computations on a GPU . . . 628--645
           Nicholas Horelik and   
              Andrew Siegel and   
              Benoit Forget and   
                     Kord Smith   Monte Carlo domain decomposition for
                                  robust nuclear reactor analysis  . . . . 646--660
      Leandro A. J. Marzulo and   
          Tiago A. O. Alves and   
        Felipe M. G. França and   
             Vítor Santos Costa   Couillard: Parallel programming via
                                  coarse-grained Data-flow Compilation . . 661--680
             Philip C. Roth and   
                      Yong Chen   Guest Editors' introduction to the
                                  special issue on ``DISCS-2013''  . . . . 681--681
               Jesse Weaver and   
   Vito Giovanni Castellana and   
          Alessandro Morari and   
             Antonino Tumeo and   
              Sumit Purohit and   
              Alan Chappell and   
               David Haglin and   
               Oreste Villa and   
          Sutanay Choudhury and   
           Karen Schuchardt and   
                       John Feo   Toward a data scalable solution for
                                  facilitating discovery of science
                                  resources  . . . . . . . . . . . . . . . 682--696
              Jiangling Yin and   
               Junyao Zhang and   
                   Jun Wang and   
                   Wu-chun Feng   SDAFT: a novel scalable data access
                                  framework for parallel BLAST . . . . . . 697--709
                    Yong Li and   
                   Dan Feng and   
                       Zhan Shi   Heterogeneous-aware cache partitioning:
                                  Improving the fairness of shared storage
                                  cache  . . . . . . . . . . . . . . . . . 710--721
             Joong-Yeon Cho and   
              Hyun-Wook Jin and   
                    Min Lee and   
                 Karsten Schwan   Dynamic core affinity for
                                  high-performance file upload on Hadoop
                                  Distributed File System  . . . . . . . . 722--737
                 P. Coetzee and   
                   M. Leeke and   
                      S. Jarvis   Towards unified secure on- and off-line
                                  analytics at scale . . . . . . . . . . . 738--753
          Dominique LaSalle and   
                 George Karypis   MPI for Big Data: New tricks for an old
                                  dog  . . . . . . . . . . . . . . . . . . 754--767
                     Lan Vu and   
                 Gita Alaghband   Novel parallel method for association
                                  rule mining on multi-core shared memory
                                  systems  . . . . . . . . . . . . . . . . 768--785
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 41, Number ??, January, 2015

                Saiqin Long and   
               Yuelong Zhao and   
                   Wei Chen and   
                   Yuanbin Tang   A prediction-based dynamic file
                                  assignment strategy for parallel file
                                  systems  . . . . . . . . . . . . . . . . 1--13
           Tassadaq Hussain and   
                Amna Haider and   
          Shakaib A. Gursal and   
                 Eduard Ayguadé   AMC: Advanced Multi-accelerator
                                  Controller . . . . . . . . . . . . . . . 14--30
                  Hugo Rito and   
                   João Cachopo   Adaptive transaction scheduling for
                                  mixed transactional workloads  . . . . . 31--49
              Ren Xiaoguang and   
                  Xu Xinhai and   
                  Wang Qian and   
                  Chen Juan and   
                  Wang Miao and   
                    Yang Xuejun   GS-DMR: Low-overhead soft error
                                  detection scheme for stencil-based
                                  computation  . . . . . . . . . . . . . . 50--65
              Dounia Khaldi and   
            Pierre Jouvelot and   
                Corinne Ancourt   Parallelizing with BDSC, a
                                  resource-constrained scheduling
                                  algorithm for shared and distributed
                                  memory systems . . . . . . . . . . . . . 66--89
     Alexandros V. Gerbessiotis   Extending the BSP model for multi-core
                                  and out-of-core computing: MBSP  . . . . 90--102
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 42, Number ??, February, 2015

   Miguel A. Vega-Rodríguez and   
      David L. González-Álvarez   Parallelism in bioinformatics: a view
                                  from different parallelism-based
                                  technologies . . . . . . . . . . . . . . 1--3
         Michael Bromberger and   
               Fabian Nowak and   
                  Wolfgang Karl   Combined hardware-software
                                  multi-parallel prefiltering on the
                                  Convey HC-1 for fast homology detection  4--17
             Miquel Orobitg and   
           Fernando Guirado and   
             Fernando Cores and   
               Jordi Llados and   
               Cedric Notredame   High performance computing improvements
                                  on bioinformatics consistency-based
                                  multiple sequence alignment tools  . . . 18--34
          Sérgio E. D. Dias and   
               Abel J. P. Gomes   Triangulating molecular surfaces over a
                                  LAN of GPU-enabled computers . . . . . . 35--47
             Romain Vasseur and   
             Stéphanie Baud and   
      Luiz Angelo Steffenel and   
           Xavier Vigouroux and   
            Laurent Martiny and   
           Michaël Krajecki and   
                 Manuel Dauchez   Inverse docking method for new proteins
                                  targets identification: a parallel
                                  approach . . . . . . . . . . . . . . . . 48--59
             Marco Ferretti and   
                    Mirto Musci   Geometrical motifs search in proteins: a
                                  parallel approach  . . . . . . . . . . . 60--74
                Elmar Peise and   
      Diego Fabregat-Traver and   
               Paolo Bientinesi   High performance solutions for big-data
                                  GWAS . . . . . . . . . . . . . . . . . . 75--87
             Gonzalo Martín and   
             David E. Singh and   
   Maria-Cristina Marinescu and   
                Jesús Carretero   Towards efficient large scale
                                  epidemiological simulations in EpiGraph  88--102
                      Anonymous   Editorial Board  . . . . . . . . . . . . IFC

Parallel Computing
Volume 43, Number ??, March, 2015

   Daniel Chavarría-Miranda and   
               Ajay Panyala and   
                 Wenjing Ma and   
              Adrian Prantl and   
          Sriram Krishnamoorthy   Global transformations for legacy
                                  parallel applications via structural
                                  analysis and rewriting . . . . . . . . . 1--26
                   Kenli Li and   
                   Jing Liu and   
                 Lanjun Wan and   
                    Shu Yin and   
                       Keqin Li   A cost-optimal parallel algorithm for
                                  the $0$--$1$ knapsack problem and its
                                  performance on multicore CPU and GPU
                                  implementations  . . . . . . . . . . . . 27--42
            Matthias Diener and   
         Eduardo H. M. Cruz and   
      Philippe O. A. Navaux and   
               Anselm Busse and   
               Hans-Ulrich Heiß   Communication-aware process and thread
                                  mapping using online communication
                                  detection  . . . . . . . . . . . . . . . 43--63
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 44, Number ??, May, 2015

                    Jian Li and   
                     Sen Su and   
                Xiang Cheng and   
                 Meina Song and   
                    Liyu Ma and   
                       Jie Wang   Cost-efficient coordinated scheduling
                                  for leasing cloud resources on hybrid
                                  workloads  . . . . . . . . . . . . . . . 1--17
               Haifeng Wang and   
                    Yunpeng Cao   Predicting power consumption of GPUs
                                  with fuzzy wavelet neural networks . . . 18--36
            João V. F. Lima and   
            Thierry Gautier and   
            Vincent Danjean and   
               Bruno Raffin and   
               Nicolas Maillard   Design and analysis of scheduling
                                  strategies for multi-CPU and multi-GPU
                                  architectures  . . . . . . . . . . . . . 37--52
                 J. Iverson and   
                  C. Kamath and   
                     G. Karypis   Evaluation of connected-component
                                  labeling algorithms for
                                  distributed-memory systems . . . . . . . 53--68
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 45, Number ??, June, 2015

                      Anonymous   Best papers from ACM Computing Frontiers
                                  2014 Conference  . . . . . . . . . . . . 1
           Ali JavadiAbhari and   
               Shruti Patil and   
              Daniel Kudrow and   
                Jeff Heckey and   
                Alexey Lvov and   
          Frederic T. Chong and   
             Margaret Martonosi   ScaffCC: Scalable compilation and
                                  analysis of quantum programs . . . . . . 2--17
           Vladimir Gajinov and   
              Srdjan Stipić and   
                  Igor Erić and   
             Osman S. Unsal and   
             Eduard Ayguadé and   
                 Adrian Cristal   DaSH: a benchmark suite for hybrid
                                  dataflow and shared memory programming
                                  models . . . . . . . . . . . . . . . . . 18--48
           Javier Navaridas and   
                Mikel Luján and   
              Luis A. Plana and   
               Steve Temple and   
                Steve B. Furber   SpiNNaker: Enhanced multicast routing    49--66
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 46, Number ??, July, 2015

      Christian Feichtinger and   
            Johannes Habich and   
             Harald Köstler and   
                Ulrich Rüde and   
                  Takayuki Aoki   Performance modeling and analysis of
                                  heterogeneous lattice Boltzmann
                                  simulations on CPU--GPU clusters . . . . 1--13
  Juan-Antonio Rico-Gallego and   
        Juan-Carlos Díaz-Martín   $ \tau $-Lop: Modeling performance of
                                  shared memory MPI  . . . . . . . . . . . 14--31
              Javier Prades and   
             Federico Silla and   
             Holger Fröning and   
            Mondrian Nüssle and   
                     José Duato   On the design of a new dynamic
                                  credit-based end-to-end flow control
                                  mechanism for HPC clusters . . . . . . . 32--59
             Gonzalo Martín and   
             David E. Singh and   
   Maria-Cristina Marinescu and   
                Jesús Carretero   Enhancing the performance of malleable
                                  MPI applications by using
                                  performance-aware dynamic
                                  reconfiguration  . . . . . . . . . . . . 60--77
              Siew Yin Chan and   
             Teck Chaw Ling and   
                   Eric Aubanel   Performance modeling for hierarchical
                                  graph partitioning in heterogeneous
                                  multi-core environment . . . . . . . . . 78--97
                 Yan Y. Liu and   
                   Shaowen Wang   A scalable parallel genetic algorithm
                                  for the Generalized Assignment Problem   98--119
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 47, Number ??, August, 2015

                Aydin Buluç and   
              Leonid Oliker and   
                   John Gilbert   Special issue ``Graph analysis for
                                  scientific discovery'' . . . . . . . . . 1--2
       Ahmet Erdem Sariyüce and   
                 Erik Saule and   
                 Kamer Kaya and   
             Ümit V. Çatalyürek   Incremental closeness centrality in
                                  distributed memory . . . . . . . . . . . 3--18
                     Hao Lu and   
     Mahantesh Halappanavar and   
            Ananth Kalyanaraman   Parallel heuristics for scalable
                                  community detection  . . . . . . . . . . 19--37
         James P. Fairbanks and   
        Ramakrishnan Kannan and   
                Haesun Park and   
                 David A. Bader   Behavioral clusters in dynamic graphs    38--50
            George M. Slota and   
                 Kamesh Madduri   Parallel color-coding  . . . . . . . . . 51--69
             Vince Lyzinski and   
          Daniel L. Sussman and   
       Donniell E. Fishkind and   
                  Henry Pao and   
                    Li Chen and   
       Joshua T. Vogelstein and   
              Youngser Park and   
                Carey E. Priebe   Spectral clustering for
                                  divide-and-conquer graph matching  . . . 70--87
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 48, Number ??, October, 2015

          Tatjana Davidović and   
         Teodor Gabriel Crainic   Parallel Local Search to schedule
                                  communicating tasks on identical
                                  processors . . . . . . . . . . . . . . . 1--14
              Ryan E. Grant and   
         Mohammad J. Rashti and   
               Pavan Balaji and   
                   Ahmad Afsahi   Scalable connectionless RDMA over
                                  unreliable datagrams . . . . . . . . . . 15--39
            Ivanoe De Falco and   
            Umberto Scafuri and   
              Ernesto Tarantino   Mapping of time-consuming multitask
                                  applications on a cloud system by
                                  multiobjective Differential Evolution    40--58
                  M. Alonso and   
                    S. Coll and   
             J. M. Martínez and   
                V. Santonja and   
                       P. López   Power consumption management in fat-tree
                                  interconnection networks . . . . . . . . 59--80
                Anna Sikora and   
             Tomàs Margalef and   
                    Josep Jorba   Online root-cause performance analysis
                                  of parallel applications . . . . . . . . 81--107
                 Peng Zhang and   
                   Ling Liu and   
                    Yuefan Deng   A data-driven paradigm for mapping
                                  problems . . . . . . . . . . . . . . . . 108--124
                   Ryo Asai and   
              Andrey Vladimirov   Intel Cilk Plus for complex parallel
                                  algorithms: ``Enormous Fast Fourier
                                  Transforms'' (EFFT) library  . . . . . . 125--142
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 49, Number ??, November, 2015

             Shinya Maeyama and   
          Tomohiko Watanabe and   
           Yasuhiro Idomura and   
              Motoki Nakata and   
            Masanori Nunami and   
               Akihiro Ishizawa   Improved strong scaling of a
                                  spectral/finite difference gyrokinetic
                                  code for multi-scale plasma turbulence   1--12
                 Pablo Abad and   
               Pablo Prieto and   
            Valentin Puente and   
            Jose-Angel Gregorio   Improving last level shared cache
                                  performance through mobile insertion
                                  policies (MIP) . . . . . . . . . . . . . 13--27
                Xiaohua Shi and   
              Fredrick Park and   
                  Lina Wang and   
                   Jack Xin and   
                    Yingyong Qi   Parallelization of a color-entropy
                                  preprocessed Chan--Vese model for face
                                  contour detection on multi-core CPU and
                                  GPU  . . . . . . . . . . . . . . . . . . 28--49
                   S. Weise and   
                       C. Hasse   Reducing the memory footprint in Large
                                  Eddy Simulations of reactive flows . . . 50--65
            Berenger Bramas and   
            Olivier Coulaud and   
              Guillaume Sylvand   Time-domain BEM for the wave equation on
                                  distributed-heterogeneous architectures:
                                  a blocking approach  . . . . . . . . . . 66--82
           Sylvain Collange and   
               David Defour and   
              Stef Graillat and   
                Roman Iakymchuk   Numerical reproducibility for the
                                  parallel reduction on multi- and
                                  many-core architectures  . . . . . . . . 83--97
               Peter Arbenz and   
              Laura Grigori and   
                Rolf Krause and   
                    Olaf Schenk   Special issue on Parallel Matrix
                                  Algorithms and Applications (PMAA'14)    99--100
              I. E. Venetis and   
                  A. Kouris and   
                 A. Sobczyk and   
             E. Gallopoulos and   
                    A. H. Sameh   A direct tridiagonal solver based on
                                  Givens rotations for GPU architectures   101--116
            Dominik Göddeke and   
           Mirco Altenbernd and   
                  Dirk Ribbrock   Fault-tolerant finite-element multigrid
                                  algorithms with hierarchically
                                  compressed asynchronous checkpointing    117--135
           Vedran Novaković and   
               Sanja Singer and   
                    Sasa Singer   Blocking and parallelization of the
                                  Hari--Zimmermann variant of the
                                  Falk--Langemeyer algorithm for the
                                  generalized SVD  . . . . . . . . . . . . 136--152
              Martin Galgon and   
               Lukas Krämer and   
                Jonas Thies and   
            Achim Basermann and   
                     Bruno Lang   On the parallel iterative solution of
                                  linear systems arising in the FEAST
                                  algorithm for computing inner
                                  eigenvalues  . . . . . . . . . . . . . . 153--163
              Ali Dorostkar and   
             Maya Neytcheva and   
                     Björn Lund   Numerical and computational aspects of
                                  some block-preconditioners for saddle
                                  point systems  . . . . . . . . . . . . . 164--178
                Weifeng Liu and   
                   Brian Vinter   Speculative segmented sum for sparse
                                  matrix-vector multiplication on
                                  heterogeneous processors . . . . . . . . 179--193
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 50, Number ??, December, 2015

             Santiago Badia and   
          Alberto F. Martín and   
                Javier Principe   On the scalability of inexact balancing
                                  domain decomposition by constraints with
                                  overlapped coarse/fine corrections . . . 1--24
               Nuno Diegues and   
                   Paolo Romano   Self-tuning Intel Restricted
                                  Transactional Memory . . . . . . . . . . 25--52
        Eike Hermann Müller and   
            Robert Scheichl and   
                  Eero Vainikko   Petascale solvers for anisotropic PDEs
                                  in atmospheric modelling on GPU clusters 53--69
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 51, Number ??, January, 2016

               Pavan Balaji and   
             Abhinav Vishnu and   
                      Yong Chen   Special Issue on Parallel Programming
                                  Models and Systems Software for High-End
                                  Computing  . . . . . . . . . . . . . . . 1--2
                    Huy Bui and   
              Eun-Sung Jung and   
       Venkatram Vishwanath and   
             Andrew Johnson and   
                Jason Leigh and   
               Michael E. Papka   Improving sparse data movement
                                  performance using multiple paths on the
                                  Blue Gene/Q supercomputer  . . . . . . . 3--16
                 S. Herbein and   
                S. McDaniel and   
              N. Podhorszki and   
                   J. Logan and   
                  S. Klasky and   
                      M. Taufer   Performance characterization of
                                  irregular I/O at the extreme scale . . . 17--36
                      Lu Li and   
             Usman Dastgeer and   
              Christoph Kessler   Pruning strategies in adaptive off-line
                                  tuning for optimized composition of
                                  components on heterogeneous systems  . . 37--45
            Antonio J. Peña and   
                   Pavan Balaji   A data-oriented profiler to assist in
                                  data partitioning and distribution for
                                  heterogeneous memory in HPC  . . . . . . 46--55
               Jiangzhou He and   
              Wenguang Chen and   
                  Zhizhong Tang   NestedMP: Enabling cache-aware thread
                                  mapping for nested parallel shared
                                  memory applications  . . . . . . . . . . 56--66
             Evan Balzuweit and   
             David P. Bunde and   
             Vitus J. Leung and   
              Austin Finley and   
                 Alan C. S. Lee   Local search to improve coordinate-based
                                  task mapping . . . . . . . . . . . . . . 67--78
            Lucas A. Wilson and   
              Jeffery von Ronne   A task-uncoordinated distributed
                                  dataflow model for scalable high
                                  performance parallel program execution   79--87
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 52, Number ??, February, 2016

       Hariswaran Sitaraman and   
                      Ray Grout   Balancing conflicting requirements for
                                  grid and particle decomposition in
                                  continuum-Lagrangian solvers . . . . . . 1--21
            Julien Herrmann and   
             George Bosilca and   
             Thomas Hérault and   
              Loris Marchal and   
                Yves Robert and   
                  Jack Dongarra   Assessing the cost of redistribution
                                  followed by a computational kernel:
                                  Complexity and performance results . . . 22--41
               T. Weinzierl and   
                 B. Verleye and   
                   P. Henri and   
                       D. Roose   Two particle-in-grid realisations on
                                  spacetrees . . . . . . . . . . . . . . . 42--64
           Jorge F. Fabeiro and   
              Diego Andrade and   
            Basilio B. Fraguela   Writing a performance-portable matrix
                                  multiplication . . . . . . . . . . . . . 65--77
               Philipp Hupp and   
                Mario Heene and   
                 Riko Jacob and   
                   Dirk Pflüger   Global communication schemes for the
                                  numerical solution of high-dimensional
                                  PDEs . . . . . . . . . . . . . . . . . . 78--105
               Xiongwei Fei and   
                   Kenli Li and   
              Wangdong Yang and   
                       Keqin Li   A secure and efficient file protecting
                                  system based on SHA3 and parallel AES    106--132
                 Dan Ibanez and   
                   Ian Dunn and   
               Mark S. Shephard   Hybrid MPI-thread parallelization of
                                  adaptive mesh operations . . . . . . . . 133--143
           Mahmoud Meribout and   
                  Ahmad Firadus   A new systolic multiprocessor
                                  architecture for real-time soft
                                  tomography algorithms  . . . . . . . . . 144--155
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 53, Number ??, April, 2016

                 M. Llorens and   
                  J. Oliver and   
                   J. Silva and   
                     S. Tamarit   Dynamic slicing of concurrent
                                  specification languages  . . . . . . . . 1--22
                 Zhihao Lou and   
                   John Reinitz   Parallel simulated annealing using an
                                  adaptive resampling interval . . . . . . 23--31
      Michelle Mills Strout and   
              Alan LaMielle and   
               Larry Carter and   
            Jeanne Ferrante and   
           Barbara Kreaseck and   
         Catherine Olschanowsky   An approach for code generation in the
                                  Sparse Polyhedral Framework  . . . . . . 32--57
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 54, Number ??, May, 2016

                      Anonymous   Preface: 26th International Symposium on
                                  Computer Architecture and High
                                  Performance Computing  . . . . . . . . . 1
            Michail Alvanos and   
              Ettore Tiotto and   
         José Nelson Amaral and   
            Montse Farreras and   
               Xavier Martorell   Using shared-data localization to reduce
                                  the cost of inspector-execution in
                                  unified-parallel-C programs  . . . . . . 2--14
         Francis B. Moreira and   
          Marco A. Z. Alves and   
            Matthias Diener and   
      Philippe O. A. Navaux and   
                   Israel Koren   A dynamic block-level execution profiler 15--28
    Rachata Ausavarungnirun and   
               Chris Fallin and   
                Xiangyao Yu and   
        Kevin Kai-Wei Chang and   
               Greg Nazario and   
             Reetuparna Das and   
             Gabriel H. Loh and   
                     Onur Mutlu   A case for hierarchical rings with
                                  deflection routing: an energy-efficient
                                  on-chip communication substrate  . . . . 29--45
     Marcio Machado Pereira and   
             Matthew Gaudet and   
           J. Nelson Amaral and   
                   Guido Araujo   Study of hardware transactional memory
                                  characteristics and serialization
                                  policies on Haswell  . . . . . . . . . . 46--58
         Eduardo H. M. Cruz and   
            Matthias Diener and   
          Marco A. Z. Alves and   
           Laércio L. Pilla and   
          Philippe O. A. Navaux   LAPT: a locality-aware page table for
                                  thread and data mapping  . . . . . . . . 59--71
                 Iván Cores and   
           Mónica Rodríguez and   
          Patricia González and   
                María J. Martín   Reducing the overhead of an MPI
                                  application-level migration approach . . 72--82
           Olivier Beaumont and   
       Lionel Eyraud-Dubois and   
Juan-Angel Lorenzo-del-Castillo   Analyzing real cluster data for
                                  formulating allocation algorithms in
                                  cloud platforms  . . . . . . . . . . . . 83--96
             José I. Aliaga and   
              Rosa M. Badia and   
              Maria Barreda and   
         Matthias Bollhöfer and   
          Ernesto Dufrechou and   
              Pablo Ezzatti and   
       Enrique S. Quintana-Ortí   Exploiting task and data parallelism in
                                  ILUPACK's preconditioned CG solver on
                                  NUMA architectures and many-core
                                  accelerators . . . . . . . . . . . . . . 97--107
              Márcio Castro and   
        Emilio Francesquini and   
             Fabrice Dupros and   
                Hideo Aochi and   
      Philippe O. A. Navaux and   
           Jean-François Méhaut   Seismic wave propagation simulations on
                                  low-power and performance-centric
                                  manycores  . . . . . . . . . . . . . . . 108--120
                  Yun R. Qu and   
             Viktor K. Prasanna   Compact hash tables for decision-trees   121--127
               Tuan Tu Tran and   
               Yongchao Liu and   
                 Bertil Schmidt   Bit-parallel approximate pattern
                                  matching: Kepler GPU versus Xeon Phi . . 128--138
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 55, Number ??, July, 2016

                Hank Childs and   
                Franck Cappello   Preface: Visualization and data
                                  analytics for scientific discovery . . . 1
          William M. Putman and   
                 Lesley Ott and   
             Anton Darmenov and   
                Arlindo daSilva   A global perspective of atmospheric
                                  carbon dioxide concentrations  . . . . . 2--8
           Paris Perdikaris and   
           Joseph A. Insley and   
           Leopold Grinberg and   
                     Yue Yu and   
           Michael E. Papka and   
         George Em. Karniadakis   Visualizing multiphysics,
                                  fluid-structure interaction phenomena in
                                  intracranial aneurysms . . . . . . . . . 9--16
              John E. Stone and   
                Melih Sener and   
         Kirby L. Vandivort and   
            Angela Barragan and   
         Abhishek Singharoy and   
                   Ivan Teo and   
            João V. Ribeiro and   
           Barry Isralewitz and   
                     Bo Liu and   
             Boon Chong Goh and   
          James C. Phillips and   
    Craig MacGregor-Chatwin and   
         Matthew P. Johnson and   
         Lena F. Kourkoutis and   
             C. Neil Hunter and   
                 Klaus Schulten   Atomic detail visualization of
                                  photosynthetic membranes with
                                  GPU-accelerated ray tracing  . . . . . . 17--27
                  Leigh Orf and   
          Robert Wilhelmson and   
                   Louis Wicker   Visualization of a simulated long-track
                                  EF5 tornado embedded within a supercell
                                  thunderstorm . . . . . . . . . . . . . . 28--34
          Christopher Lewis and   
          Miguel Valenciano and   
               Charles Cornwell   Visualizations of molecular dynamics
                                  simulations of high-performance
                                  polycrystalline structural ceramics  . . 35--42
            Patrick O'Leary and   
               James Ahrens and   
         Sébastien Jourdain and   
           Scott Wittenburg and   
            David H. Rogers and   
                  Mark Petersen   Cinema image-based in situ analysis and
                                  visualization of MPAS-ocean simulations  43--48
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 56, Number ??, August, 2016

             Stefan Engblom and   
               Dimitar Lukarski   Fast Matlab compatible sparse assembly
                                  on multicore computers . . . . . . . . . 1--17
            Souley Madougou and   
             Ana Varbanescu and   
               Cees de Laat and   
             Rob van Nieuwpoort   The landscape of GPGPU performance
                                  modeling tools . . . . . . . . . . . . . 18--33
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 57, Number ??, September, 2016

             Oguz Selvitopi and   
                 Cevdet Aykanat   Reducing latency cost in $2$D sparse
                                  matrix partitioning models . . . . . . . 1--24
           Alejandro Acosta and   
              Sergio Afonso and   
              Francisco Almeida   Extending Paralldroid with object
                                  oriented annotations . . . . . . . . . . 25--36
            Keisuke Tsugane and   
               Taisuke Boku and   
              Hitoshi Murai and   
             Mitsuhisa Sato and   
               William Tang and   
                       Bei Wang   Hybrid-view programming of nuclear
                                  fusion simulation code in the PGAS
                                  parallel programming language XcalableMP 37--51
                 Ketan Date and   
                    Rakesh Nagi   GPU-accelerated Hungarian algorithms for
                                  the Linear Assignment Problem  . . . . . 52--72
               Sean Wallace and   
                  Zhou Zhou and   
       Venkatram Vishwanath and   
              Susan Coghlan and   
                 John Tramm and   
                Zhiling Lan and   
               Michael E. Papka   Application power profiling on IBM Blue
                                  Gene/Q . . . . . . . . . . . . . . . . . 73--86
       Emanuel H. Rubensson and   
                  Elias Rudberg   Locality-aware parallel block-sparse
                                  matrix-matrix multiplication using the
                                  Chunks and Tasks programming model . . . 87--106
             Abhinav Vishnu and   
             Andres Marquez and   
          Dimitris Nikolopoulos   Editorial of the Special issue: SI: E2SC 107
               Ziming Zhang and   
               Michael Lang and   
                Scott Pakin and   
                        Song Fu   TracSim: Simulating and scheduling
                                  trapped power capacity to maximize
                                  machine room throughput  . . . . . . . . 108--124
                  Lena Oden and   
             Benjamin Klenk and   
                 Holger Fröning   Analyzing GPU-controlled communication
                                  with dynamic parallelism in terms of
                                  performance and energy . . . . . . . . . 125--134
               Peter Arbenz and   
              Laura Grigori and   
                Rolf Krause and   
                    Olaf Schenk   Special issue on Parallel Matrix
                                  Algorithms and Applications (PMAA'14)    135--136
               Radu Popescu and   
          Michael A. Heroux and   
                 Simone Deparis   Parallel subdomain solver strategies for
                                  the algebraic additive Schwarz
                                  preconditioner . . . . . . . . . . . . . 137--153
               Lubomír Ríha and   
           Tomás Brzobohatý and   
     Alexandros Markopoulos and   
             Marta Jarosová and   
              Tomás Kozubek and   
                David Horák and   
                   Václav Hapla   Implementation of the efficient
                                  communication layer for the highly
                                  parallel total FETI and hybrid total
                                  FETI solvers . . . . . . . . . . . . . . 154--166
            Karl E. Prikopa and   
      Wilfried N. Gansterer and   
                   Elias Wimmer   Parallel iterative refinement linear
                                  least squares solvers based on
                                  all-reduce operations  . . . . . . . . . 167--184
              Gemma Sanjuan and   
             Tomàs Margalef and   
                     Ana Cortés   Applying domain decomposition to wind
                                  field calculation  . . . . . . . . . . . 185--196
          Bruno Carpentieri and   
                   Jia Liao and   
            Masha Sosonkina and   
           Aldo Bonfiglioli and   
                     Sven Baars   Using the VBARMS method in parallel
                                  computing  . . . . . . . . . . . . . . . 197--211
              Martin Köhler and   
                      Jens Saak   On GPU acceleration of common solvers
                                  for (quasi-) triangular generalized
                                  Lyapunov equations . . . . . . . . . . . 212--221
              Lars Karlsson and   
            Daniel Kressner and   
                André Uschmajew   Parallel algorithms for tensor
                                  completion in the CP format  . . . . . . 222--234
       Jean-Guillaume Dumas and   
            Thierry Gautier and   
             Clément Pernet and   
            Jean-Louis Roch and   
                    Ziad Sultan   Recursion based parallelization of exact
                                  dense linear algebra routines for
                                  Gaussian elimination . . . . . . . . . . 235--249
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 58, Number ??, October, 2016

                  E. Calore and   
                 A. Gabbana and   
                   J. Kraus and   
              E. Pellegrini and   
             S. F. Schifano and   
                 R. Tripiccione   Massively parallel lattice-Boltzmann
                                  codes on large GPU clusters  . . . . . . 1--24
             Michela Taufer and   
               Pavan Balaji and   
               Satoshi Matsuoka   Special Issue on Cluster Computing . . . 25--26
          Khaled Hamidouche and   
           Akshay Venkatesh and   
           Ammar Ahmad Awan and   
             Hari Subramoni and   
           Ching-Hsiang Chu and   
           Dhabaleswar K. Panda   CUDA-Aware OpenSHMEM: Extensions and
                                  Designs for High Performance OpenSHMEM
                                  on GPU Clusters  . . . . . . . . . . . . 27--36
              Ashwin M. Aji and   
            Antonio J. Peña and   
               Pavan Balaji and   
                   Wu-chun Feng   MultiCL: Enabling automatic scheduling
                                  for task-parallel workloads in OpenCL    37--55
              Edgar A. León and   
                 Ian Karlin and   
              Ryan E. Grant and   
                Matthew Dosanjh   Program optimizations: the interplay
                                  between power, performance, and energy   56--75
                 Jiaan Zeng and   
                     Beth Plale   Argus: a Multi-tenancy NoSQL store with
                                  workload-aware resource reservation  . . 76--89
          Anthony Agelastos and   
             Benjamin Allan and   
                 Jim Brandt and   
                Ann Gentile and   
            Sophia Lefantzi and   
                 Steve Monk and   
                 Jeff Ogden and   
               Mahesh Rajan and   
                 Joel Stevenson   Continuous whole-system monitoring
                                  toward rapid understanding of production
                                  HPC applications and systems . . . . . . 90--106
                  Zhou Zhou and   
                    Xu Yang and   
              Dongfang Zhao and   
                  Paul Rich and   
                   Wei Tang and   
                   Jia Wang and   
                    Zhiling Lan   I/O-aware bandwidth allocation for
                                  petascale computing systems  . . . . . . 107--116
                Ariful Azad and   
                    Aydin Buluç   A matrix-algebraic formulation of
                                  distributed-memory maximal cardinality
                                  matching algorithms in bipartite graphs  117--130
              Jianping Zeng and   
                    Hongfeng Yu   A study of graph partitioning schemes
                                  for parallel graph community detection   131--139
                   Dong Dai and   
               Philip Carns and   
             Robert B. Ross and   
               John Jenkins and   
          Nicholas Muirhead and   
                      Yong Chen   An asynchronous traversal engine for
                                  graph-based rich metadata management . . 140--156
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 59, Number ??, November, 2016

         Oscar Vega-Gisbert and   
              Jose E. Roman and   
             Jeffrey M. Squyres   Design and implementation of Java
                                  bindings in Open MPI . . . . . . . . . . 1--20
             Antonino Tumeo and   
                   John Feo and   
                   Oreste Villa   Special Issue on Theory and Practice of
                                  Irregular Applications (TaPIA) . . . . . 21--23
            Andrea Marongiu and   
       Alessandro Capotondi and   
                    Luca Benini   Controlling NUMA effects in embedded
                                  manycore applications with lightweight
                                  nested parallelism support . . . . . . . 24--42
                    Yao Zhu and   
                David F. Gleich   A parallel min-cut algorithm using
                                  iteratively reweighted least squares
                                  targeting at problems with
                                  floating-point edge weights  . . . . . . 43--59
                Daming Feng and   
        Andrey N. Chernikov and   
         Nikos P. Chrisochoides   Two-level locality-aware parallel
                                  Delaunay image-to-mesh conversion  . . . 60--70
                 Seher Acer and   
             Oguz Selvitopi and   
                 Cevdet Aykanat   Improving performance of sparse matrix
                                  dense matrix multiplication on
                                  large-scale parallel systems . . . . . . 71--96
      Mohammed A. Al Farhan and   
          Dinesh K. Kaushik and   
                 David E. Keyes   Unstructured computational aerodynamics
                                  on many integrated core architecture . . 97--118
                    J. Gmys and   
                  M. Mezmaz and   
                   N. Melab and   
                    D. Tuyttens   A GPU-based Branch-and-Bound algorithm
                                  using Integer--Vector--Matrix data
                                  structure  . . . . . . . . . . . . . . . 119--139
          Steven C. Rennich and   
               Darko Stosic and   
               Timothy A. Davis   Accelerating sparse Cholesky
                                  factorization on GPUs  . . . . . . . . . 140--150
   Cristina Montañola-Sales and   
         Bhakti S. S. Onggo and   
     Josep Casanovas-Garcia and   
      Jose María Cela-Espín and   
        Adriana Kaplan-Marcusán   Approaching parallel computing to
                                  simulating population dynamics in
                                  demography . . . . . . . . . . . . . . . 151--170
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 60, Number ??, December, 2016

                  Chen Wang and   
                      Ce Yu and   
             Shanjiang Tang and   
                  Jian Xiao and   
                 Jizhou Sun and   
                  Xiangfei Meng   A general and fast distributed system
                                  for large-scale dynamic programming
                                  applications . . . . . . . . . . . . . . 1--21
            Martina Prugger and   
            Lukas Einkemmer and   
            Alexander Ostermann   Evaluation of the partitioned global
                                  address space (PGAS) model for an
                                  inviscid Euler solver  . . . . . . . . . 22--40
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 61, Number ??, January, 2017

             Philip C. Roth and   
                 R. Shane Canon   Special Issue on Data-Intensive Scalable
                                  Computing Systems  . . . . . . . . . . . 1--2
                    Wei Xie and   
                  Yong Chen and   
                 Philip C. Roth   ASA-FTL: an adaptive separation aware
                                  flash translation layer for solid state
                                  drives . . . . . . . . . . . . . . . . . 3--17
               Pengfei Xuan and   
            Walter B. Ligon and   
          Pradip K. Srimani and   
                    Rong Ge and   
                       Feng Luo   Accelerating big data analytics on HPC
                                  clusters using two-level storage . . . . 18--34
             Preeti Malakar and   
           Venkatram Vishwanath   Data movement optimizations for
                                  independent MPI I/O on the Blue Gene/Q   35--51
     Francisco Rodrigo Duro and   
         Javier Garcia Blas and   
              Florin Isaila and   
            Jesus Carretero and   
          Justin M. Wozniak and   
                       Rob Ross   Experimental evaluation of a flexible
                                  I/O architecture for accelerating
                                  workflow engines in ultrascale
                                  environments . . . . . . . . . . . . . . 52--67
                Huansong Fu and   
               Haiquan Chen and   
                    Yue Zhu and   
                     Weikuan Yu   FARMS: Efficient MapReduce speculation
                                  for failure recovery in short jobs . . . 68--82
                 Lizhen Shi and   
                 Zhong Wang and   
                 Weikuan Yu and   
                  Xiandong Meng   A case study of tuning MapReduce for
                                  efficient bioinformatics in the cloud    83--95
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 62, Number ??, February, 2017

             Amandeep Verma and   
                 Sakshi Kaushal   A hybrid multi-objective Particle Swarm
                                  Optimization for scientific workflow
                                  scheduling . . . . . . . . . . . . . . . 1--19
               Robert Speck and   
                Daniel Ruprecht   Toward fault-tolerant parallel-in-time
                                  integration with PFASST  . . . . . . . . 20--37
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 63, Number ??, April, 2017

                Jiaquan Gao and   
              Yuanshen Zhou and   
                  Guixia He and   
                      Yifei Xia   A multi-GPU parallel optimization model
                                  for the preconditioned conjugate
                                  gradient algorithm . . . . . . . . . . . 1--16
            Andrew Giuliani and   
              Lilia Krivodonova   Face coloring in unstructured CFD codes  17--37
                 Boyu Zhang and   
             Trilce Estrada and   
             Pietro Cicotti and   
               Pavan Balaji and   
                 Michela Taufer   Enabling scalable and accurate
                                  clustering of distributed ligand
                                  geometries on supercomputers . . . . . . 38--60
            Douglas Otstott and   
           Latchesar Ionkov and   
               Michael Lang and   
                      Ming Zhao   TCASM: an asynchronous shared memory
                                  interface for high-performance
                                  application composition  . . . . . . . . 61--78
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 64, Number ??, May, 2017

               Rupak Biswas and   
             David Donofrio and   
                  Leonid Oliker   High-End Computing for Next-Generation
                                  Scientific Discovery . . . . . . . . . . 1--2
             Thomas Bönisch and   
              Michael Resch and   
          Thomas Schwitalla and   
            Matthias Meinke and   
           Volker Wulfmeyer and   
           Kirsten Warrach-Sagi   Hazel Hen --- leading HPC technology and
                                  its impact on science in Germany and
                                  Europe . . . . . . . . . . . . . . . . . 3--11
                Fumiyoshi Shoji   Lessons learned from development and
                                  operation of the K computer  . . . . . . 12--19
            Eric J. Nielsen and   
                   Boris Diskin   High-performance aerodynamic
                                  computations for aerospace applications  20--32
               Vincent Cavé and   
              Romain Clédat and   
               Paul Griffin and   
                 Ankit More and   
            Bala Seshasayee and   
             Shekhar Borkar and   
          Sanjay Chatterjee and   
               Dave Dunning and   
                  Joshua Fryman   Traleika Glacier: a hardware--software
                                  co-designed approach to exascale
                                  computing  . . . . . . . . . . . . . . . 33--49
               Protonu Basu and   
            Samuel Williams and   
         Brian Van Straalen and   
              Leonid Oliker and   
            Phillip Colella and   
                      Mary Hall   Compiler-based code generation and
                                  autotuning for geometric multigrid on
                                  GPU-accelerated supercomputers . . . . . 50--64
           Sébastien Rumley and   
            Meisam Bahadori and   
             Robert Polster and   
           Simon D. Hammond and   
           David M. Calhoun and   
                     Ke Wen and   
             Arun Rodrigues and   
                  Keren Bergman   Optical interconnects for extreme scale
                                  computing systems  . . . . . . . . . . . 65--80
               Rupak Biswas and   
                Zhang Jiang and   
            Kostya Kechezhi and   
               Sergey Knysh and   
           Salvatore Mandrà and   
             Bryan O'Gorman and   
    Alejandro Perdomo-Ortiz and   
             Andre Petukhov and   
          John Realpe-Gómez and   
            Eleanor Rieffel and   
          Davide Venturelli and   
                Fedir Vasko and   
                    Zhihui Wang   A NASA perspective on quantum computing:
                                  Opportunities and challenges . . . . . . 81--98
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 65, Number ??, July, 2017

                   S. Cools and   
                    W. Vanroose   The communication-hiding pipelined
                                  BiCGstab method for the parallel
                                  solution of large unsymmetric linear
                                  systems  . . . . . . . . . . . . . . . . 1--20
              Ryuji Yoshida and   
            Seiya Nishizawa and   
            Hisashi Yashiro and   
          Sachiho A. Adachi and   
               Yousuke Sato and   
           Tsuyoshi Yamaura and   
                Hirofumi Tomita   CONeP: a cost-effective online nesting
                                  procedure for regional atmospheric
                                  models . . . . . . . . . . . . . . . . . 21--31
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 66, Number ??, August, 2017

           Giovanni Mariani and   
             Andreea Anghel and   
              Rik Jongerius and   
                  Gero Dittmann   Classification of thread profiles for
                                  scaling application behavior . . . . . . 1--21
              Maria Predari and   
            Aurélien Esnard and   
                     Jean Roman   Comparison of initial partitioning
                                  methods for multilevel direct $k$-way
                                  graph partitioning with fixed vertices   22--39
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 67, Number ??, September, 2017

         Milan B. Radulović and   
             Sylvain Girbal and   
              Milo V. Tomasević   Low-level implementation of the SISC
                                  protocol for thread-level speculation on
                                  a multi-core architecture  . . . . . . . 1--19
            Nicholas Geneva and   
                 Cheng Peng and   
                Xiaoming Li and   
                 Lian-Ping Wang   A scalable interface-resolved simulation
                                  of particle-laden flow using the lattice
                                  Boltzmann method . . . . . . . . . . . . 20--37
                 Marc Casas and   
               Greg Bronevetsky   Prediction of the impact of network
                                  switch utilization on application
                                  performance via active measurement . . . 38--56
          Roberto Peñaranda and   
              Crispín Gómez and   
       María Engracia Gómez and   
                    Pedro López   XOR-based HoL-blocking reduction routing
                                  mechanisms for direct networks . . . . . 57--74
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 68, Number ??, October, 2017

      Sunita Chandrasekaran and   
                Antonio J. Peña   Special Issue on Topics on Heterogeneous
                                  Computing  . . . . . . . . . . . . . . . 1--2
                Iman Faraji and   
        Seyed H. Mirsadeghi and   
                   Ahmad Afsahi   Exploiting heterogeneity of
                                  communication channels for efficient GPU
                                  selection on multi-GPU nodes . . . . . . 3--16
            Joshua D. Booth and   
       Nathan D. Ellingwood and   
        Heidi K. Thornquist and   
      Sivasankaran Rajamanickam   Basker: Parallel sparse $ L U $
                                  factorization utilizing hierarchical
                                  parallelism and data layouts . . . . . . 17--31
               Hartwig Anzt and   
                 Mark Gates and   
              Jack Dongarra and   
            Moritz Kreutzer and   
            Gerhard Wellein and   
                  Martin Köhler   Preconditioned Krylov solvers on GPUs    32--44
         Valeria Cardellini and   
      Alessandro Fanfarillo and   
            Salvatore Filippone   Coarray-based load balancing on
                                  heterogeneous and many-core
                                  architectures  . . . . . . . . . . . . . 45--58
               Luis Costero and   
         Francisco D. Igual and   
             Katzalin Olcoz and   
             Sandra Catalán and   
   Rafael Rodríguez-Sánchez and   
       Enrique S. Quintana-Ortí   Revisiting conventional task schedulers
                                  to exploit asymmetry in multi-core
                                  architectures for dense linear algebra
                                  operations . . . . . . . . . . . . . . . 59--76
             John D. Leidel and   
                      Yong Chen   HMC-Sim-2.0: a co-design infrastructure
                                  for exploring custom memory cube
                                  operations . . . . . . . . . . . . . . . 77--88
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 69, Number ??, November, 2017

              Hoang-Vu Dang and   
                  Marc Snir and   
                  William Gropp   Eliminating contention bottlenecks in
                                  multithreaded MPI  . . . . . . . . . . . 1--23
          Martin Ruefenacht and   
                  Mark Bull and   
                  Stephen Booth   Generalisation of recursive doubling for
                                  AllReduce: Now with simulation . . . . . 24--44
      Ana Moreton-Fernandez and   
  Arturo Gonzalez-Escribano and   
                Diego R. Llanos   A technique to automatically determine
                                  Ad-hoc communication patterns at runtime 45--62
                 Weiming Lu and   
              Yaoguang Wang and   
             Jingyuan Jiang and   
                   Jian Liu and   
                Yapeng Shen and   
                    Baogang Wei   Hybrid storage architecture and
                                  efficient MapReduce processing for
                                  unstructured data  . . . . . . . . . . . 63--77
             Ramzi Mahmoudi and   
               Mohamed Akil and   
            Mohamed Hédi Bedoui   Concurrent computation of topological
                                  watershed on shared memory parallel
                                  machines . . . . . . . . . . . . . . . . 78--97
    Alexandra Carpen-Amarie and   
              Sascha Hunold and   
           Jesper Larsson Träff   On expected and observed communication
                                  performance with MPI derived datatypes   98--117
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 70, Number ??, December, 2017

             Jeff Hollingsworth   Editorial  . . . . . . . . . . . . . . . 1--1
          Michael A. Heroux and   
          C. Kristopher Garrett   Special Issue on SC16 Student Cluster
                                  Competition Reproducibility Initiative   3--4
             Rainier Ababao and   
              Joe A. Garcia and   
                Joseph Voss and   
           W. Cyrus Proctor and   
                  R. Todd Evans   Student Cluster Competition 2016
                                  reproducibility challenge: Genomic
                                  partitioning with ParConnect . . . . . . 5--10
               Ying Hao Tan and   
                Yiyang Shao and   
                 Siyuan Liu and   
                    Bu-Sung Lee   Student cluster competition: ParConnect
                                  reproducibility task report  . . . . . . 11--17
           Marek Baranowski and   
             Braden Caywood and   
                Hannah Eyre and   
                Janaan Lake and   
               Kevin Parker and   
             Kincaid Savoie and   
                Hari Sundar and   
                      Mary Hall   Reproducing ParConnect for SC16  . . . . 18--21
                   Lei Yang and   
                  Yilong Li and   
                 Zhenxin Fu and   
                 Zhuohan Li and   
                 Wenbin Hou and   
                   Haoze Wu and   
               Xiaolin Wang and   
                      Yun Liang   ParConnect reproducibility report  . . . 22--26
             G. R. Williams and   
                 G. P. Behm and   
                  T. Nguyen and   
                 A. Esparza and   
                 V. G. Haka and   
                   A. Ramos and   
                  B. Wright and   
                 J. C. Otto and   
              C. P. Paolini and   
                   M. P. Thomas   SC16 student cluster competition
                                  challenge: Investigating the
                                  reproducibility of results for the
                                  ParConnect application . . . . . . . . . 27--34
              Edward Hutter and   
               Chung-Ting Huang   ParConnect: Results from the student
                                  cluster competition at SC16  . . . . . . 35--40
           Alexander Ditter and   
              Jan Laukemann and   
              Benedikt Oehlrich   Reproducibility report: Team SegFAUlt @
                                  SCC 2016 . . . . . . . . . . . . . . . . 41--45
         Maximilian Hornung and   
            Svilen Stefanov and   
            David Schneller and   
                  Sharru Mòller   Analysis of the ParConnect algorithm ran
                                  on Intel Xeon Phi Knights Landing  . . . 46--53
              Patrick Flick and   
                Chirag Jain and   
                   Tony Pan and   
                 Srinivas Aluru   Reprint of ``A parallel connectivity
                                  algorithm for de Bruijn graphs in
                                  metagenomic applications'' . . . . . . . 54--65
                      Anonymous   Editorial Board  . . . . . . . . . . . . ifc--ifc

Parallel Computing
Volume 71, Number ??, January, 2018

                      Anonymous   Reviewer acknowledgement 2017  . . . . . I--II
                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
               Hartwig Anzt and   
           Thomas K. Huckle and   
             Jürgen Bräckle and   
                  Jack Dongarra   Incomplete Sparse Approximate Inverses
                                  for Parallel Preconditioning . . . . . . 1--22
               Minquan Fang and   
               Jianbin Fang and   
               Weimin Zhang and   
               Haifang Zhou and   
              Jianxing Liao and   
                  Yuangang Wang   Benchmarking the GPU memory at the warp
                                  level  . . . . . . . . . . . . . . . . . 23--41
        Hassan Salehe Matar and   
                Erdal Mutlu and   
             Serdar Tasiran and   
                     Didem Unat   Output nondeterminism detection for
                                  programming models combining dataflow
                                  with shared memory . . . . . . . . . . . 42--57

Parallel Computing
Volume 72, Number ??, February, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
               Hongzhi Wang and   
                 Feng Xiong and   
                 Jianing Li and   
               Shengfei Shi and   
               Jianzhong Li and   
                       Hong Gao   Data management on new processors: A
                                  survey . . . . . . . . . . . . . . . . . 1--13
           Joanna Berlińska and   
              Maciej Drozdowski   Comparing load-balancing algorithms for
                                  MapReduce under Zipfian data skews . . . 14--28
              Junxiong Wang and   
               Hongzhi Wang and   
                Chenxu Zhao and   
               Jianzhong Li and   
                       Hong Gao   Iteration acceleration for distributed
                                  learning systems . . . . . . . . . . . . 29--41
                      Anonymous   Foreword for the special issue on the
                                  best papers from the EuroMPI 2016
                                  conference . . . . . . . . . . . . . . . 42--42

Parallel Computing
Volume 74, Number ??, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
   Christos D. Antonopoulos and   
       Enrique S. Quintana-Ortí   Parallel programming for resilience and
                                  energy efficiency  . . . . . . . . . . . 1--2
                     Li Tan and   
        Nathan DeBardeleben and   
                 Qiang Guan and   
             Sean Blanchard and   
                   Michael Lang   Using virtualization to quantify power
                                  conservation via near-threshold voltage
                                  reduction for inherently resilient
                                  applications . . . . . . . . . . . . . . 3--15
                   F. Rizzi and   
                  K. Morris and   
                K. Sargsyan and   
                   P. Mycek and   
                   C. Safta and   
               O. Le Maître and   
                 O. M. Knio and   
              B. J. Debusschere   Exploring the interplay of resilience
                                  and energy consumption for a task-based
                                  partial differential equations
                                  preconditioner . . . . . . . . . . . . . 16--27
             Sandra Catalán and   
            José R. Herrero and   
   Enrique S. Quintana-Ortí and   
       Rafael Rodríguez-Sánchez   Energy balance between voltage-frequency
                                  scaling and resilience for linear
                                  algebra routines on low-power multicore
                                  architectures  . . . . . . . . . . . . . 28--39
               Patrick Judd and   
            Jorge Albericio and   
        Tayler Hetherington and   
                 Tor Aamodt and   
     Natalie Enright Jerger and   
             Raquel Urtasun and   
               Andreas Moshovos   Proteus: Exploiting precision
                                  variability in deep neural networks  . . 40--51
        Panos Koutsovasilis and   
         Christos Kalogirou and   
        Christos Konstantas and   
           Manolis Maroudas and   
            Michalis Spyrou and   
       Christos D. Antonopoulos   AcHEe: Evaluating approximate computing
                                  and heterogeneity for energy efficiency  52--67
                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
            Emmanuel Agullo and   
               Peter Arbenz and   
                 Luc Giraud and   
                    Olaf Schenk   Special issue on parallel matrix
                                  algorithms and applications (PMAA'16)    1--2
                 Mark Gates and   
            Stanimire Tomov and   
                  Jack Dongarra   Accelerating the SVD two stage
                                  bidiagonal reduction and divide and
                                  conquer using GPUs . . . . . . . . . . . 3--18
       Wajih Halim Boukaram and   
           George Turkiyyah and   
               Hatem Ltaief and   
                 David E. Keyes   Batched $ Q R $ and SVD algorithms on
                                  GPUs with applications in hierarchical
                                  matrix compression . . . . . . . . . . . 19--33
              Akira Imakura and   
                Tetsuya Sakurai   Block SS-CAA: A complex moment-based
                                  parallel nonlinear eigensolver using the
                                  block communication-avoiding Arnoldi
                                  procedure  . . . . . . . . . . . . . . . 34--48
                  Chao Chen and   
            Hadi Pouransari and   
  Sivasankaran Rajamanickam and   
              Erik G. Boman and   
                     Eric Darve   A distributed-memory hierarchical solver
                                  for general sparse linear systems  . . . 49--64
             Gustavo Chávez and   
           George Turkiyyah and   
            Stefano Zampini and   
               Hatem Ltaief and   
                    David Keyes   Accelerated Cyclic Reduction: A
                                  distributed-memory fast solver for
                                  structured linear systems  . . . . . . . 65--83
          Mathias Jacquelin and   
                    Lin Lin and   
                      Chao Yang   PSelInv --- A distributed memory
                                  parallel algorithm for selected
                                  inversion: The non-symmetric case  . . . 84--98
               Shaden Smith and   
               Jongsoo Park and   
                 George Karypis   HPC formulations of optimization
                                  algorithms for tensor completion . . . . 99--117
            A. Lamas Daviña and   
                    J. E. Roman   MPI-CUDA parallel linear solvers for
                                  block-tridiagonal matrices in the
                                  context of SLEPc's eigensolvers  . . . . 118--135
         Vassilis Kalantzis and   
    A. Cristiano I. Malossi and   
               Costas Bekas and   
         Alessandro Curioni and   
     Efstratios Gallopoulos and   
                    Yousef Saad   A scalable iterative dense linear system
                                  solver for multiple right-hand sides in
                                  data analytics . . . . . . . . . . . . . 136--153

Parallel Computing
Volume 75, Number ??, July, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
              Daisuke Takahashi   Computation of the 100 quadrillionth
                                  hexadecimal digit of $ \pi $ on a
                                  cluster of Intel Xeon Phi processors . . 1--10
            Germán Ceballos and   
               Thomas Grass and   
                 Andra Hugo and   
           David Black-Schaffer   Analyzing performance variation of task
                                  schedulers with TaskInsight  . . . . . . 11--27
                     János Végh   Introducing the explicitly
                                  many-processor approach  . . . . . . . . 28--40
             Henri Casanova and   
            Julien Herrmann and   
                    Yves Robert   Computing the expected makespan of task
                                  graphs in the presence of silent errors  41--60
               Beichuan Yan and   
            Richard A. Regueiro   Superlinear speedup phenomenon in
                                  parallel $3$D Discrete Element Method
                                  (DEM) simulations of complex-shaped
                                  particles  . . . . . . . . . . . . . . . 61--87
      Przemyslaw Spychalski and   
                 Ryszard Arendt   Machine Learning in Multi-Agent Systems
                                  using Associative Arrays . . . . . . . . 88--99
           Jesper Larsson Träff   Practical, distributed, low overhead
                                  algorithms for irregular gather and
                                  scatter collectives  . . . . . . . . . . 100--117
      Alessandro Fanfarillo and   
               Davide Del Vento   Notified access in coarray-based
                                  hydrodynamics applications on many-core
                                  architectures: Design and performance    118--129
              Wayne Joubert and   
                James Nance and   
           Deborah Weighill and   
                Daniel Jacobson   Parallel accelerated vector similarity
                                  calculations for genomics applications   130--145

Parallel Computing
Volume 76, Number ??, August, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
  David L. González-Álvarez and   
   Miguel A. Vega-Rodríguez and   
             Álvaro Rubio-Largo   Searching for common patterns on protein
                                  sequences by means of a parallel hybrid
                                  honey-bee mating optimization algorithm  1--17
             Sandra Catalán and   
            José R. Herrero and   
   Enrique S. Quintana-Ortí and   
       Rafael Rodríguez-Sánchez   Static scheduling of the $ L U $
                                  factorization with look-ahead on
                                  asymmetric multicore processors  . . . . 18--27
     Antonio Gómez-Iglesias and   
         Miguel Cárdenas-Montes   Performance evaluation of the
                                  three-point angular correlation function 28--41
            Alcides Fonseca and   
                   Bruno Cabral   Overcoming the No Free Lunch Theorem in
                                  Cut-off Algorithms for Fork--Join
                                  programs . . . . . . . . . . . . . . . . 42--56
                Ivy Bo Peng and   
            Roberto Gioiosa and   
              Gokcen Kestor and   
          Jeffrey S. Vetter and   
             Pietro Cicotti and   
                Erwin Laure and   
               Stefano Markidis   Characterizing the performance benefit
                                  of hybrid memory system for HPC
                                  applications . . . . . . . . . . . . . . 57--69
               Brice Goglin and   
           Emmanuel Jeannot and   
            Farouk Mansouri and   
              Guillaume Mercier   Hardware topology management in MPI
                                  applications through hierarchical
                                  communicators  . . . . . . . . . . . . . 70--90
              Xuechen Zhang and   
                 Song Jiang and   
              Alseny Diallo and   
                       Lei Wang   IR+: Removing parallel I/O interference
                                  of MPI programs via data replication
                                  over heterogeneous storage devices . . . 91--105

Parallel Computing
Volume 77, Number ??, September, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
             Simon Pickartz and   
             Carsten Clauss and   
              Stefan Lankes and   
                Antonello Monti   Revisiting locality-awareness in view of
                                  dynamically changing topologies  . . . . 1--18
          Srinivasan Ramesh and   
               Aurèle Mahéo and   
              Sameer Shende and   
            Allen D. Malony and   
             Hari Subramoni and   
                Amit Ruhela and   
      Dhabaleswar K. (DK) Panda   MPI performance engineering with the MPI
                                  tool interface: the integration of
                                  MVAPICH and TAU  . . . . . . . . . . . . 19--37
         Sergio Rivas-Gomez and   
            Roberto Gioiosa and   
                Ivy Bo Peng and   
              Gokcen Kestor and   
        Sai Narasimhamurthy and   
                Erwin Laure and   
               Stefano Markidis   MPI windows on storage for HPC
                                  applications . . . . . . . . . . . . . . 38--56
           Kurt B. Ferreira and   
                 Scott Levy and   
             Kevin Pedretti and   
                  Ryan E. Grant   Characterizing MPI matching via
                                  trace-based simulation . . . . . . . . . 57--83
           Alejandro Estaña and   
               Kevin Molloy and   
               Marc Vaisset and   
           Nathalie Sibille and   
             Thierry Siméon and   
                Pau Bernadó and   
                    Juan Cortés   Hybrid parallelization of a multi-tree
                                  path search algorithm: Application to
                                  highly-flexible biomolecules . . . . . . 84--100
               Yonggang Che and   
               Meifang Yang and   
                 Chuanfu Xu and   
                      Yutong Lu   Petascale scramjet combustion simulation
                                  on the Tianhe-2 heterogeneous
                                  supercomputer  . . . . . . . . . . . . . 101--117
                 Siyuan Liu and   
                  Meiru Hao and   
                    Bu-Sung Lee   Student Cluster Competition 2017, team
                                  Nanyang Technological University:
                                  Reproducing vectorization of the Tersoff
                                  multi-body potential on the Intel
                                  Broadwell architecture . . . . . . . . . 118--124
      Sunita Chandrasekaran and   
                Antonio J. Peña   Special issue on applications for the
                                  heterogeneous computing era 2017 . . . . 125--127
                  James Lin and   
                 Zhigeng Xu and   
                 Linjin Cai and   
               Akira Nukada and   
               Satoshi Matsuoka   Evaluating the SW26010 many-core
                                  processor with a micro-benchmark suite
                                  for performance optimizations  . . . . . 128--143

Parallel Computing
Volume 78, Number ??, October, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
              Harald Servat and   
              Jesús Labarta and   
       Hans-Christian Hoppe and   
              Judit Giménez and   
                Antonio J. Peña   Understanding memory access patterns
                                  using the BSC performance tools  . . . . 1--14
              Michael Wolfe and   
                 Seyong Lee and   
                Jungwon Kim and   
               Xiaonan Tian and   
                  Rengan Xu and   
            Barbara Chapman and   
          Sunita Chandrasekaran   The OpenACC data model: Preliminary
                                  study on its major challenges and
                                  implementations  . . . . . . . . . . . . 15--27
                 Zhenxin Fu and   
                   Lei Yang and   
                 Wenbin Hou and   
                 Zhuohan Li and   
                   Yifan Wu and   
                Yihua Cheng and   
               Xiaolin Wang and   
                      Yun Liang   Student Cluster Competition 2017, Team
                                  Peking University: Reproducing
                                  vectorization of the Tersoff multi-body
                                  potential on the Intel Broadwell
                                  architecture . . . . . . . . . . . . . . 28--32
              Mehmet Deveci and   
            Christian Trott and   
      Sivasankaran Rajamanickam   Multithreaded sparse matrix--matrix
                                  multiplication for many-core and GPU
                                  architectures  . . . . . . . . . . . . . 33--46
        Ka Cheong Jason Lau and   
                  Yuxuan Li and   
                    Lei Xie and   
                   Qian Xie and   
                 Beichen Li and   
                    Yu Chen and   
                Guanyu Feng and   
                  Jiping Yu and   
                 Xinjian Yu and   
                  Miao Wang and   
                 Wentao Han and   
                    Jidong Zhai   Student cluster competition 2017, team
                                  Tsinghua University: Reproducing
                                  vectorization of the Tersoff multi-body
                                  potential on the Intel Skylake and
                                  NVIDIA Volta architectures . . . . . . . 47--53
              Sergio Iserte and   
                Rafael Mayo and   
   Enrique S. Quintana-Ortí and   
             Vicenç Beltran and   
                Antonio J. Peña   DMR API: Improving cluster productivity
                                  by turning applications into malleable   54--66
                  Z. Marcus and   
                   J. Booth and   
                    C. Bunn and   
                   M. Leger and   
                   S. Hance and   
                 T. Sweeney and   
              C. McCardwell and   
                       D. Kaeli   Student cluster competition 2017, team
                                  Northeastern University: Reproducing
                                  vectorization of the Tersoff multi-body
                                  potential on the NVIDIA V100 . . . . . . 67--71
             ChanJung Chang and   
              YungChing Lin and   
              YuHsuan Cheng and   
               YuCheng Wang and   
                    LiYu Yu and   
               TienChi Yang and   
                     Jerry Chou   Student cluster competition 2017, team
                                  NTHU: Reproducing vectorization of the
                                  Tersoff multi-body potential on the
                                  Intel Skylake and Nvidia P100
                                  architecture . . . . . . . . . . . . . . 72--78
          Lisa Marie Dreier and   
            Svilen Stefanov and   
            David Schneller and   
               Alexander Ditter   SC17 student cluster competition, Team
                                  Technical University of Munich and
                                  Friedrich-Alexander University
                                  Erlangen--Nürnberg: Reproducing
                                  vectorization of the Tersoff multi-body
                                  potential on the Intel Broadwell
                                  architecture . . . . . . . . . . . . . . 79--83
            Abdelhalim Amer and   
               Pavan Balaji and   
                    Zhiyi Huang   8th International Workshop on
                                  Programming Models and Applications for
                                  Multicores and Manycores (PMAM'17) . . . 84--84
               Pedro Alonso and   
             Sandra Catalán and   
            José R. Herrero and   
   Enrique S. Quintana-Ortí and   
       Rafael Rodríguez-Sánchez   Two-sided orthogonal reductions to
                                  condensed forms on asymmetric multicore
                                  processors . . . . . . . . . . . . . . . 85--100
                 Xuhao Chen and   
                 Cheng Chen and   
                   Jie Shen and   
               Jianbin Fang and   
                   Tao Tang and   
                Canqun Yang and   
                   Zhiying Wang   Orchestrating parallel detection of
                                  strongly connected components on GPUs    101--114

Parallel Computing
Volume 79, Number ??, November, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
                Janaan Lake and   
               Qixiang Chao and   
                Hannah Eyre and   
               Emerson Ford and   
               Kevin Parker and   
                 Kincaid Savoie   Student Cluster Competition 2017, Team
                                  University of Utah: Reproducing
                                  Vectorization of the Tersoff Multi-Body
                                  Potential on the Intel Broadwell and
                                  Intel Skylake Platforms  . . . . . . . . 1--8
           Ralph H. Castain and   
              Joshua Hursey and   
        Aurelien Bouteiller and   
                     David Solt   PMIx: Process management for exascale
                                  environments . . . . . . . . . . . . . . 9--29
             James Sullivan and   
                Collin Weir and   
            Austin Reichert and   
              R. Todd Evans and   
           W. Cyrus Proctor and   
                 Nicolas Thorne   Student cluster competition 2017, Team
                                  University of Texas at Austin/Texas
                                  State University: Reproducing
                                  vectorization of the Tersoff multi-body
                                  potential on the Intel Skylake and
                                  NVIDIA V100 architectures  . . . . . . . 30--35
                Yuan Yirang and   
                  Chang Luo and   
               Li Changfeng and   
                    Sun Tongjun   Domain decomposition modified with
                                  characteristic mixed finite element of
                                  compressible oil-water seepage
                                  displacement and its numerical analysis  36--47
            C. Kris Garrett and   
       Stephen Lien Harrell and   
              Michael A. Heroux   Special Issue on SCC'17 Reproducibility
                                  Initiative . . . . . . . . . . . . . . . 48--49

Parallel Computing
Volume 80, Number ??, December, 2018

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
           Maciej Paszyński and   
               Leszek Siwik and   
                 Maciej Woźniak   Concurrency of three-dimensional refined
                                  isogeometric analysis  . . . . . . . . . 1--22
                 Mario Badr and   
         Natalie Enright Jerger   A high-level model for exploring
                                  multi-core architectures . . . . . . . . 23--35
             Sebastian Eibl and   
                    Ulrich Rüde   A Local Parallel Communication Algorithm
                                  for Polydisperse Rigid Body Dynamics . . 36--48

Parallel Computing
Volume 81, Number ??, January, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
                 I. Masliah and   
             A. Abdelfattah and   
                  A. Haidar and   
                   S. Tomov and   
                M. Baboulin and   
                  J. Falcou and   
                    J. Dongarra   Algorithms and optimization techniques
                                  for high-performance matrix--matrix
                                  multiplications of very small matrices   1--21
            Andrés E. Tomás and   
   Rafael Rodríguez-Sánchez and   
             Sandra Catalán and   
       Rocío Carratalá-Sáez and   
       Enrique S. Quintana-Ortí   Dynamic look-ahead in the reduction to
                                  band form for the singular value
                                  decomposition  . . . . . . . . . . . . . 22--31
           Daniel J. Holmes and   
             Bradley Morgan and   
           Anthony Skjellum and   
   Purushotham V. Bangalore and   
             Srinivas Sridharan   Planning for performance: Enhancing
                                  achievable performance for MPI through
                                  persistent collective operations . . . . 32--57
      Alessandro Fanfarillo and   
         Sudip Kumar Garain and   
            Dinshaw Balsara and   
                   Daniel Nagle   Resilient computational applications
                                  using Coarray Fortran  . . . . . . . . . 58--67
         Chaitanya Talnikar and   
                      Qiqi Wang   A two-level computational graph method
                                  for the adjoint of a finite volume based
                                  compressible unsteady flow solver  . . . 68--84
                 Yanhua Cao and   
                      Li Lu and   
                   Jiadi Yu and   
                Shiyou Qian and   
                 Yanmin Zhu and   
                      Minglu Li   Online cost-rejection rate scheduling
                                  for resource requests in hybrid clouds   85--103
             J. W. Buurlage and   
            R. H. Bisseling and   
                K. J. Batenburg   A geometric partitioning method for
                                  distributed tomographic reconstruction   104--121
       Remko van Wagensveld and   
            Tobias Wägemann and   
                Ralph Mader and   
    Ramin Tavakoli Kolagari and   
                 Ulrich Margull   Evaluation and modeling of the supercore
                                  parallelization pattern in automotive
                                  real-time systems  . . . . . . . . . . . 122--130
               Hartwig Anzt and   
              Jack Dongarra and   
               Goran Flegar and   
       Enrique S. Quintana-Ortí   Variable-size batched Gauss--Jordan
                                  elimination for block-Jacobi
                                  preconditioning on graphics processors   131--146

Parallel Computing
Volume 82, Number ??, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
             Abhinav Vishnu and   
               Pavan Balaji and   
                      Yong Chen   Guest Editor's Introduction: P2S2: SI
                                  2016 . . . . . . . . . . . . . . . . . . 1--2
              Neda Tavakoli and   
                   Dong Dai and   
                      Yong Chen   Client-side straggler-aware I/O
                                  scheduler for object-based parallel file
                                  systems  . . . . . . . . . . . . . . . . 3--18
           Hiroshi Yoritaka and   
                 Ken Matsui and   
            Masahiro Yasugi and   
            Tasuku Hiraishi and   
                  Seiji Umatani   Probabilistic guards: a mechanism for
                                  increasing the granularity of
                                  work-stealing programs . . . . . . . . . 19--36
                 Jason Mair and   
                Zhiyi Huang and   
                    David Eyers   Manila: Using a densely populated
                                  PMC-space for power modelling within
                                  large-scale systems  . . . . . . . . . . 37--56
                   Xing Fan and   
              Oliver Sinnen and   
                Nasser Giacaman   Supporting asynchronization in OpenMP
                                  for event-driven programming . . . . . . 57--74
                   Dong Dai and   
          Forrest Sheng Bao and   
                 Jiang Zhou and   
                Xuanhua Shi and   
                      Yong Chen   Vectorizing disks blocks for efficient
                                  storage system via deep learning . . . . 75--90

Parallel Computing
Volume 83, Number ??, April, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
               Pavan Balaji and   
             Abhinav Vishnu and   
                      Yong Chen   Foreword to the special issue for the
                                  Workshop on Parallel Programming Models
                                  and Systems Software for High-End
                                  Computing (P2S2 2017)  . . . . . . . . . 1--2
            Juan J. Durillo and   
       Philipp Gschwandtner and   
               Klaus Kofler and   
               Thomas Fahringer   Multi-Objective region-Aware
                                  optimization of parallel programs  . . . 3--21
        Sai Narasimhamurthy and   
             Nikita Danilov and   
                  Sining Wu and   
           Ganesan Umanesan and   
           Stefano Markidis and   
         Sergio Rivas-Gomez and   
                Ivy Bo Peng and   
                Erwin Laure and   
               Dirk Pleiter and   
                  Shaun de Witt   SAGE: Percipient Storage for Exascale
                                  Data Centric Computing . . . . . . . . . 22--33
                  Zixi Quan and   
                Volker Haarslev   A parallel computing architecture for
                                  high-performance OWL reasoning . . . . . 34--46
              Loris Marchal and   
                 Erik Saule and   
                  Oliver Sinnen   Special Issue Proposal for the
                                  Parallel Computing Journal:
                                  HeteroPar 2016 and HCW 2016 Workshops    47--47
             Dylan Machovec and   
             Bhavesh Khemka and   
            Nirmal Kumbhare and   
            Sudeep Pasricha and   
     Anthony A. Maciejewski and   
          Howard Jay Siegel and   
                 Ali Akoglu and   
          Gregory A. Koenig and   
               Salim Hariri and   
                 Cihan Tunc and   
             Michael Wright and   
              Marcia Hilton and   
         Rajendra Rambharos and   
        Christopher Blandin and   
                Farah Fargo and   
                Ahmed Louri and   
                     Neena Imam   Utility-based resource management in an
                                  oversubscribed energy-constrained
                                  heterogeneous environment executing
                                  parallel applications  . . . . . . . . . 48--72
                  T. Cojean and   
              A. Guermouche and   
                    A. Hugo and   
                  R. Namyst and   
                P. A. Wacrenier   Resource aggregation for task-based
                                  Cholesky Factorization on top of modern
                                  architectures  . . . . . . . . . . . . . 73--92
             João Guerreiro and   
            Aleksandar Ilic and   
                  Nuno Roma and   
                    Pedro Tomás   DVFS-aware application classification to
                                  improve GPGPUs energy efficiency . . . . 93--117
               Julio Proaño and   
             Carmen Carrión and   
                Blanca Caminero   Empirical modeling and simulation of an
                                  heterogeneous Cloud computing
                                  environment  . . . . . . . . . . . . . . 118--134

Parallel Computing
Volume 84, Number ??, May, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
             Nawrin Sultana and   
           Martin Rüfenacht and   
           Anthony Skjellum and   
             Ignacio Laguna and   
                 Kathryn Mohror   Failure recovery for bulk synchronous
                                  applications with MPI stages . . . . . . 1--14
              Wayne Joubert and   
                James Nance and   
             Sharlee Climer and   
           Deborah Weighill and   
                Daniel Jacobson   Parallel accelerated Custom Correlation
                                  Coefficient calculations for genomics
                                  applications . . . . . . . . . . . . . . 15--23
         Javier López-Gómez and   
     Javier Fernández Muñoz and   
      David del Rio Astorga and   
             Manuel F. Dolz and   
               J. Daniel Garcia   Exploring stream parallel patterns in
                                  distributed MPI environments . . . . . . 24--36
         Anton Shterenlikht and   
                 Luis Cebamanos   MPI vs Fortran coarrays beyond 100k
                                  cores: $3$D cellular automata  . . . . . 37--49
          Pedro Valero-Lara and   
               Raül Sirvent and   
            Antonio J. Peña and   
                  Jesús Labarta   MPI + OpenMP tasking scalability for
                                  multi-morphology simulations of the
                                  human brain  . . . . . . . . . . . . . . 50--61
              William Gropp and   
                  Rajeev Thakur   Guest Editor's introduction: Special
                                  issue on best papers from EuroMPI/USA
                                  2017 . . . . . . . . . . . . . . . . . . 62--62
                 Scott Levy and   
           Kurt B. Ferreira and   
             Whit Schonbein and   
              Ryan E. Grant and   
          Matthew G. F. Dosanjh   Using simulation to examine the effect
                                  of MPI message matching costs on
                                  application performance  . . . . . . . . 63--74

Parallel Computing
Volume 85, Number ??, July, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
          Valentin Le Fèvre and   
             Thomas Herault and   
                Yves Robert and   
        Aurelien Bouteiller and   
               Atsushi Hori and   
             George Bosilca and   
                  Jack Dongarra   Comparing the performance of rigid,
                                  moldable and grid-shaped applications on
                                  failure-prone HPC platforms  . . . . . . 1--12
                Amit Ruhela and   
             Hari Subramoni and   
         Sourav Chakraborty and   
     Mohammadreza Bayatpour and   
               Pouya Kousha and   
      Dhabaleswar K. (DK) Panda   Efficient design for MPI asynchronous
                                  progress without dedicated resources . . 13--26
               Joel Matějka and   
             Björn Forsberg and   
               Michal Sojka and   
              Přemysl Šůcha and   
                Luca Benini and   
            Andrea Marongiu and   
                Zdeněk Hanzálek   Combining PREM compilation and static
                                  scheduling for high-performance and
                                  predictable MPSoC execution  . . . . . . 27--44
          Michael Gowanlock and   
                     Ben Karsin   A hybrid CPU/GPU approach for optimizing
                                  sorting throughput . . . . . . . . . . . 45--55
           Martin Schreiber and   
        Nathanaël Schaeffer and   
                   Richard Loft   Exponential integrators with
                                  parallel-in-time rational approximations
                                  for the shallow-water equations on the
                                  rotating sphere  . . . . . . . . . . . . 56--65
                   Zhuo Liu and   
            Amit Kumar Nath and   
              Xiaoning Ding and   
                Huansong Fu and   
             Md. Muhib Khan and   
                     Weikuan Yu   Multivariate modeling and two-level
                                  scheduling of analytic queries . . . . . 66--78
             José I. Aliaga and   
          Ernesto Dufrechou and   
              Pablo Ezzatti and   
       Enrique S. Quintana-Ortí   Accelerating the task/data-parallel
                                  version of ILUPACK's BiCG in
                                  multi-CPU/GPU configurations . . . . . . 79--87
             P.-H. Tournier and   
                I. Aliferis and   
               M. Bonazzoli and   
                M. de Buhan and   
                  M. Darbas and   
                  V. Dolean and   
                   F. Hecht and   
                 P. Jolivet and   
              I. El Kanfoud and   
              C. Migliaccio and   
                   F. Nataf and   
                 Ch. Pichot and   
                     S. Semenov   Microwave tomographic imaging of
                                  cerebrovascular accidents by using
                                  high-performance computing . . . . . . . 88--97
               William D. Gropp   Using node and socket information to
                                  implement MPI Cartesian topologies . . . 98--108
                  Liyang Xu and   
              Xiaoguang Ren and   
                  Qian Wang and   
                  Xinhai Xu and   
                    Xuejun Yang   Full-neighbor-list based numerical
                                  reproducibility method for parallel
                                  molecular dynamics simulations . . . . . 109--118
        Marc-André Hermanns and   
            Nathan T. Hjelm and   
           Michael Knobloch and   
             Kathryn Mohror and   
                  Martin Schulz   The MPI_T events interface: an early
                                  evaluation and overview of the interface 119--130
           Angelika Schwarz and   
                  Lars Karlsson   Scalable eigenvector computation for the
                                  non-symmetric eigenvalue problem . . . . 131--140
           Ammar Ahmad Awan and   
Karthik Vadambacheri Manian and   
           Ching-Hsiang Chu and   
             Hari Subramoni and   
           Dhabaleswar K. Panda   Optimized large-message broadcast for
                                  deep learning workloads: MPI, MPI +
                                  NCCL, or NCCL2?  . . . . . . . . . . . . 141--152
                 Kevin Sala and   
              Xavier Teruel and   
             Josep M. Perez and   
            Antonio J. Peña and   
             Vicenç Beltran and   
                  Jesus Labarta   Integrating blocking and non-blocking
                                  MPI primitives with task-based
                                  programming models . . . . . . . . . . . 153--166
                     P. Kůs and   
                   A. Marek and   
               S. S. Köcher and   
             H.-H. Kowalski and   
                C. Carbogno and   
               Ch. Scheurer and   
                  K. Reuter and   
               M. Scheffler and   
                     H. Lederer   Optimizations of the eigensolvers in the
                                  ELPA library . . . . . . . . . . . . . . 167--177
        Aristeidis Mastoras and   
                Thomas R. Gross   Load-balancing for load-imbalanced
                                  fine-grained linear pipelines  . . . . . 178--189
               Millad Ghane and   
      Sunita Chandrasekaran and   
             Margaret S. Cheung   pointerchain: Tracing pointers to their
                                  roots --- A case study in molecular
                                  dynamics simulations . . . . . . . . . . 190--203
                Julien Adam and   
          Maxime Kermarquer and   
      Jean-Baptiste Besnard and   
    Leonardo Bautista-Gomez and   
               Marc Pérache and   
         Patrick Carribault and   
              Julien Jaeger and   
            Allen D. Malony and   
                  Sameer Shende   Checkpoint/restart approaches for a
                                  thread-based MPI runtime . . . . . . . . 204--219
                  Qiao Kang and   
       Jesper Larsson Träff and   
            Reda Al-Bahrani and   
              Ankit Agrawal and   
             Alok Choudhary and   
                  Wei-keng Liao   Scalable Algorithms for MPI Intergroup
                                  Allgather and Allgatherv . . . . . . . . 220--230
                    Shi Sha and   
                  Wujie Wen and   
Gustavo A. Chaparro-Baquero and   
                      Gang Quan   Thermal-constrained energy efficient
                                  real-time scheduling on multi-core
                                  platforms  . . . . . . . . . . . . . . . 231--242

Parallel Computing
Volume 86, Number ??, August, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
            Dumitrel Loghin and   
                  Yong Meng Teo   The time and energy efficiency of modern
                                  multicore systems  . . . . . . . . . . . 1--13
               Pavan Balaji and   
                     Marc Casas   Special issue on the Message Passing
                                  Interface  . . . . . . . . . . . . . . . 14--15
                       S. Cools   Analyzing and improving maximal
                                  attainable accuracy in the communication
                                  hiding pipelined BiCGStab method . . . . 16--35
            Thanh-Dang Diep and   
            Kien Trung Pham and   
             Karl Fürlinger and   
                      Nam Thoai   A time-stamping system to detect memory
                                  consistency errors in MPI one-sided
                                  applications . . . . . . . . . . . . . . 36--44
      Ludovic A. R. Capelli and   
               Zhenjiang Hu and   
       Timothy A. K. Zakian and   
                 Nick Brown and   
                   J. Mark Bull   iPregel: Vertex-centric programmability
                                  vs memory efficiency and performance,
                                  why choose?  . . . . . . . . . . . . . . 45--56
                Shengguo Li and   
                    Jie Liu and   
                      Yunfei Du   A high performance implementation of
                                  Zolo-SVD algorithm on distributed memory
                                  systems  . . . . . . . . . . . . . . . . 57--65
          Ichitaro Yamazaki and   
                Edmond Chow and   
        Aurelien Bouteiller and   
                  Jack Dongarra   Performance of asynchronous optimized
                                  Schwarz with one-sided communication . . 66--81
             George Matheou and   
            Vassos Soteriou and   
           Paraskevas Evripidou   Toward data-driven architectural support
                                  in improving the performance of future
                                  HPC architectures  . . . . . . . . . . . 82--106

Parallel Computing
Volume 87, Number ??, September, 2019

                      Anonymous   Editorial Board  . . . . . . . . . . . . ii--ii
              Hajime Fujita and   
              Chongxiao Cao and   
               Sayantan Sur and   
             Charles Archer and   
               Erik Paulson and   
                 Maria Garzaran   Efficient implementation of MPI-3 RMA
                                  over openFabrics interfaces  . . . . . . 1--10
              Shaolong Chen and   
             Miquel Angel Senar   Exploring efficient data parallelism for
                                  genome read mapping on multicore and
                                  manycore architectures . . . . . . . . . 11--24
               Weihao Liang and   
                  Yong Chen and   
                 Jialin Liu and   
                        Hong An   CARS: a contention-aware scheduler for
                                  efficient resource management of HPC
                                  storage systems  . . . . . . . . . . . . 25--34
                    Olga Pearce   Exploring utilization options of
                                  heterogeneous architectures for
                                  multi-physics simulations  . . . . . . . 35--45
                    Ting Yu and   
                    Mengchi Liu   A memory efficient maximal clique
                                  enumeration method for sparse graphs
                                  with a parallel implementation . . . . . 46--59
           Jeffrey S. Young and   
                  Eric Hein and   
             Srinivas Eswar and   
              Patrick Lavin and   
                  Jiajia Li and   
                Jason Riedy and   
              Richard Vuduc and   
                      Tom Conte   A microbenchmark characterization of the
                                  Emu Chick  . . . . . . . . . . . . . . . 60--69
                  Fang Zhou and   
                    Song Wu and   
              Youchuang Jia and   
                  Xiang Gao and   
                    Hai Jin and   
               Xiaofei Liao and   
                  Pingpeng Yuan   VAIL: a Victim-Aware Cache Policy to
                                  improve NVM Lifetime for hybrid memory
                                  system . . . . . . . . . . . . . . . . . 70--76
            J. Austin Ellis and   
            Thomas M. Evans and   
         Steven P. Hamilton and   
               C. T. Kelley and   
                 Tara M. Pandya   Optimization of processor allocation for
                                  domain decomposed Monte Carlo
                                  calculations . . . . . . . . . . . . . . 77--86
               Guanghui Zhu and   
                   Chen Guo and   
                      Le Lu and   
                  Zhi Huang and   
              Chunfeng Yuan and   
                    Rong Gu and   
                    Yihua Huang   DGST: Efficient and scalable suffix tree
                                  construction on distributed
                                  data-parallel platforms  . . . . . . . . 87--102

Parallel Computing
Volume 88, Number ??, 2019

                     Min Si and   
                Zhiyi Huang and   
                   Pavan Balaji   International workshop on programming
                                  models and applications for multicores
                                  and manycores (PMAM 2018)  . . . . . . . ??
              Michael Rippl and   
                 Bruno Lang and   
                  Thomas Huckle   Parallel eigenvalue computation for
                                  banded generalized eigenvalue problems   ??
              Jean M. Favre and   
                Alexander Blass   A comparative evaluation of three volume
                                  rendering libraries for the
                                  visualization of sheared thermal
                                  convection . . . . . . . . . . . . . . . ??
        Reuben D. Budiardja and   
           Christian Y. Cardall   Targeting GPUs with OpenMP directives on
                                  Summit: a simple and effective Fortran
                                  experience . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   Publisher's Note . . . . . . . . . . . . ??

Parallel Computing
Volume 89, Number ??, November, 2019

         Jose Monsalve Diaz and   
             Kyle Friedline and   
            Swaroop Pophale and   
            Oscar Hernandez and   
         David E. Bernholdt and   
          Sunita Chandrasekaran   Analysis of OpenMP 4.5 Offloading in
                                  Implementations: Correctness and
                                  Overhead . . . . . . . . . . . . . . . . ??
   S. Mahdieh Ghazimirsaeed and   
              Ryan E. Grant and   
                   Ahmad Afsahi   A dynamic, unified design for dedicated
                                  message matching engines for collective
                                  and point-to-point communications  . . . ??
           Anton G. Artemov and   
              Elias Rudberg and   
           Emanuel H. Rubensson   Parallelization and scalability analysis
                                  of inverse factorization using the
                                  chunks and tasks programming model . . . ??
                     Min Si and   
             Abhinav Vishnu and   
                      Yong Chen   Parallel programming models and systems
                                  software for high-end computing (P2S2
                                  2018)  . . . . . . . . . . . . . . . . . ??
         Alexander Heinecke and   
           Alexander Breuer and   
                     Yifeng Cui   Tensor-optimized hardware accelerates
                                  fused discontinuous Galerkin simulations ??
                  Xinzhe Wu and   
               Serge G. Petiton   A distributed and parallel asynchronous
                                  unite and conquer method to solve large
                                  scale non-Hermitian linear systems with
                                  multiple right-hand sides  . . . . . . . ??
   Dimitris Palyvos-Giannas and   
          Vincenzo Gulisano and   
        Marina Papatriantafilou   GeneaLog: Fine-grained data streaming
                                  provenance in cyber-physical systems . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   November 2019  . . . . . . . . . . . . . ??

Parallel Computing
Volume 90, Number ??, December, 2019

            Yusuke Nagasaka and   
           Satoshi Matsuoka and   
                Ariful Azad and   
                    Aydin Buluç   Performance optimization, modeling and
                                  analysis of sparse matrix-matrix
                                  products on multi-core and many-core
                                  processors . . . . . . . . . . . . . . . ??
                    Bian Wu and   
              Weiliang Heng and   
                    Bu-Sung Lee   Student Cluster Competition 2018, Team
                                  Nanyang Technological University:
                                  Reproducing performance of a
                                  Multi-Physics Simulations of the
                                  Tsunamigenic 2004 Sumatra Megathrust
                                  Earthquake on the Intel Skylake
                                  architecture . . . . . . . . . . . . . . ??
              Nicole Brewer and   
                 HyeJin Kim and   
                 Claudia Li and   
             Heidi Anderson and   
              Jessica Lanum and   
                  Jia Cheoh and   
              Betsy Hillery and   
               Trinity Overmyer   Student cluster competition 2018, team
                                  Ada Six of Purdue University:
                                  Reproducing Extreme Scale Multi-Physics
                                  Simulations of Tsunamigenic 2004 Sumatra
                                  Megathrust Earthquake on Intel Skylake
                                  architecture . . . . . . . . . . . . . . ??
             Thaddeus Koehn and   
                  Peter Athanas   Data staging for efficient high
                                  throughput stream processing . . . . . . ??
             Julia Bazińska and   
           Maciej Korpalski and   
               Maciej Szpindler   Student Cluster Competition 2018, Team
                                  University of Warsaw, University of
                                  Wroclaw, Warsaw University of
                                  Technology: Reproducing performance of a
                                  multi-physics simulations of the
                                  Tsunamigenic 2004 Sumatra megathrust
                                  earthquake on the Intel Skylake
                                  architecture . . . . . . . . . . . . . . ??
                    C. Bunn and   
                 H. Barclay and   
                 A. Lazarev and   
                   F. Yusuf and   
                   J. Fitch and   
                   J. Booth and   
               K. Shivdikar and   
                       D. Kaeli   Student cluster competition 2018, team
                                  northeastern university: Reproducing
                                  performance of a multi-physics
                                  simulations of the Tsunamigenic 2004
                                  Sumatra Megathrust earthquake on the AMD
                                  EPYC 7551 architecture . . . . . . . . . ??
                 ShaoFu Lin and   
               ChiChen Yang and   
              YuHsuan Cheng and   
                KengJui Hsu and   
              HungHsin Chen and   
              YuanChing Lin and   
                     Jerry Chou   Student Cluster Competition 2018, team
                                  NTHU: Reproducing performance of
                                  multi-physics simulations of the
                                  tsunamigenic 2004 Sumatra megathrust
                                  earthquake on the Intel Skylake
                                  architecture . . . . . . . . . . . . . . ??
                   Jiaao He and   
             Chenggang Zhao and   
                  Jiping Yu and   
                 Xinjian Yu and   
                Liyan Zheng and   
                Chenyao Lou and   
                Shizhi Tang and   
                 Wentao Han and   
                    Jidong Zhai   Student Cluster Competition 2018, Team
                                  Tsinghua University: Reproducing
                                  performance of multi-physics simulations
                                  of the Tsunamigenic 2004 Sumatra
                                  megathrust earthquake on the Intel
                                  Skylake Architecture . . . . . . . . . . ??
                 Hai Ah Nam and   
          Elsa Gonsiorowski and   
                  Scott Michael   Special Issue on the SC'18 Student
                                  Cluster Competition Reproducibility
                                  Initiative . . . . . . . . . . . . . . . ??
            Mateusz Starzec and   
            Grazyna Starzec and   
          Aleksander Byrski and   
                 Wojciech Turek   Distributed ant colony optimization
                                  based on actor model . . . . . . . . . . ??
              Afshin Zafari and   
          Elisabeth Larsson and   
               Martin Tillenius   DuctTeip: an efficient programming model
                                  for distributed task-based parallel
                                  computing  . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   December 2019  . . . . . . . . . . . . . ??

Parallel Computing
Volume 91, Number ??, March, 2020

            Antonio J. Peña and   
                         Min Si   Guest editorial: Special Issue on
                                  Applications and System Software for
                                  Hybrid Exascale Systems  . . . . . . . . ??
               Vasco Amaral and   
           Beatriz Norberto and   
              Miguel Goulão and   
            Marco Aldinucci and   
          Siegfried Benkner and   
           Andrea Bracciali and   
             Paulo Carreira and   
               Edgars Celms and   
               Luís Correia and   
             Clemens Grelck and   
              Helen Karatza and   
          Christoph Kessler and   
           Peter Kilpatrick and   
            Hugo Martiniano and   
             Ilias Mavridis and   
               Sabri Pllana and   
               Ana Respício and   
                 José Simão and   
                 Luís Veiga and   
                       Ari Visa   Programming languages for data-Intensive
                                  HPC applications: a systematic mapping
                                  study  . . . . . . . . . . . . . . . . . ??
                   Yu Huang and   
                   Kai Gong and   
                    Eric Mercer   An efficient algorithm for match pair
                                  approximation in message passing . . . . ??
             Tobias Ribizel and   
                   Hartwig Anzt   Parallel selection on GPUs . . . . . . . ??
              Valeriy Manin and   
                     Bruno Lang   Cannon-type triangular matrix
                                  multiplication for the reduction of
                                  generalized HPD eigenproblems to
                                  standard form  . . . . . . . . . . . . . ??
    Vianney Kengne Tchendji and   
      Armel Nkonjoh Ngomade and   
       Jerry Lacmou Zeutouo and   
           Jean Frédéric Myoupo   Efficient CGM-based parallel algorithms
                                  for the longest common subsequence
                                  problem with multiple
                                  substring-exclusion constraints  . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   March 2020 . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 92, Number ??, April, 2020

              Takeshi Terao and   
            Katsuhisa Ozaki and   
                  Takeshi Ogita   $ L U$-Cholesky $ Q R$ algorithms for
                                  thin $ Q R$ decomposition  . . . . . . . ??
           Md Maruf Hussain and   
              Noriyuki Fujimoto   GPU-based parallel multi-objective
                                  particle swarm optimization for large
                                  swarms and high dimensional problems . . ??
          Massimo Bernaschi and   
             Pasqua D'Ambra and   
                 Dario Pasquini   AMG based on compatible weighted
                                  matching for GPUs  . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   April 2020 . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 93, Number ??, May, 2020

               Jakub Kruzik and   
                David Horak and   
               Vaclav Hapla and   
                  Martin Cermak   Comparison of selected FETI coarse space
                                  projector implementation strategies  . . ??
    Carlos Junqueira-Junior and   
       João Luiz F. Azevedo and   
              Jairo Panetta and   
            William R. Wolf and   
                   Sami Yamouni   On the scalability of CFD tool for
                                  supersonic jet flow configurations . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   May 2020 . . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 96, Number ??, August, 2020

              Hitoshi Murai and   
                 Mitsuhisa Sato   Design and evaluation of efficient
                                  global data movement in partitioned
                                  global address space . . . . . . . . . . ??
               Bengisu Elis and   
                   Dai Yang and   
                Olga Pearce and   
             Kathryn Mohror and   
                  Martin Schulz   QMPI: a next generation MPI profiling
                                  interface for modern HPC platforms . . . ??
              Carolin Penke and   
              Andreas Marek and   
          Christian Vorwerk and   
              Claudia Draxl and   
                   Peter Benner   High performance solution of
                                  skew-symmetric eigenvalue problems with
                                  applications in solving the
                                  Bethe--Salpeter eigenvalue problem . . . ??
            Timon E. Knigge and   
               Rob H. Bisseling   An improved exact algorithm and an
                                  NP-completeness proof for sparse matrix
                                  bipartitioning . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   August 2020  . . . . . . . . . . . . . . ??

Parallel Computing
Volume 97, Number ??, September, 2020

             Seiya Watanabe and   
              Takayuki Aoki and   
                Tomohiro Takaki   A domain partitioning method using a
                                  multi-phase-field model for block-based
                                  AMR applications . . . . . . . . . . . . ??
                 JunKyu Lee and   
        Gregory D. Peterson and   
  Dimitrios S. Nikolopoulos and   
            Hans Vandierendonck   AIR: Iterative refinement acceleration
                                  using arbitrary dynamic precision  . . . ??
                Jaume Bosch and   
             Carlos Álvarez and   
    Daniel Jiménez-González and   
           Xavier Martorell and   
                 Eduard Ayguadé   Asynchronous runtime with distributed
                                  manager for task-based programming
                                  models . . . . . . . . . . . . . . . . . ??
                Dunwei Gong and   
                  Tian Tian and   
                Jinxin Wang and   
                    Ying Du and   
                       Zheng Li   A novel method of grouping target paths
                                  for parallel programs  . . . . . . . . . ??
               Shardul Natu and   
                 Ketan Date and   
                    Rakesh Nagi   GPU-accelerated Lagrangian heuristic for
                                  multidimensional assignment problems
                                  with decomposable costs  . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   September 2020 . . . . . . . . . . . . . ??

Parallel Computing
Volume 98, Number ??, October, 2020

       Nusrat Sharmin Islam and   
              Gengbin Zheng and   
               Sayantan Sur and   
               Akhil Langer and   
                 Maria Garzaran   Minimizing the usage of hardware
                                  counters for collective communication
                                  using triggered operations . . . . . . . ??
     Rodolfo Pereira Araujo and   
        Igor Machado Coelho and   
 Leandro Augusto Justen Marzulo   A multi-improvement local search using
                                  dataflow and GPU to solve the minimum
                                  latency problem  . . . . . . . . . . . . ??
                    Yi Zhou and   
                Yuanqi Chen and   
             Shubbhi Taneja and   
                Ajit Chavan and   
                   Xiao Qin and   
                     Jifu Zhang   ThermoBench: a thermal efficiency
                                  benchmark for clusters in data centers   ??
            Cuong M. Nguyen and   
               Philip J. Rhodes   Delaunay triangulation of large-scale
                                  datasets using two-level parallelism . . ??
                  Jian Xiao and   
                   Min Long and   
                      Ce Yu and   
                   Xin Zhou and   
                          Li Ji   Performance optimization of
                                  non-equilibrium ionization simulations
                                  from MapReduce and GPU acceleration  . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
            Marco Aldinucci and   
         Valeria Cardellini and   
          Gabriele Mencagli and   
               Massimo Torquati   Data stream processing in HPC systems:
                                  New frameworks and architectures for
                                  high-frequency streaming . . . . . . . . ??
                      Anonymous   October 2020 . . . . . . . . . . . . . . ??

Parallel Computing
Volume 99, Number ??, November, 2020

             Herbert Jordan and   
       Philipp Gschwandtner and   
               Peter Thoman and   
              Peter Zangerl and   
           Alexander Hirsch and   
           Thomas Fahringer and   
              Thomas Heller and   
                    Dietmar Fey   The allscale framework architecture  . . ??
              Jianguo Liang and   
                   Rong Hua and   
                  Hao Zhang and   
               Wenqiang Zhu and   
                         You Fu   Accelerated molecular dynamics
                                  simulation of Silicon Crystals on
                                  TaihuLight using OpenACC . . . . . . . . ??
                  Huan Zhou and   
                José Gracia and   
              Naweiluo Zhou and   
                 Ralf Schneider   Collectives in hybrid MPI+MPI code:
                                  Design, practice and performance . . . . ??
            Nirmal Kumbhare and   
                 Ali Akoglu and   
          Aniruddha Marathe and   
               Salim Hariri and   
                 Ghaleb Abdulla   Dynamic power management for
                                  value-oriented schedulers in
                                  power-constrained HPC system . . . . . . ??
       Jesper Larsson Träff and   
                Torsten Hoefler   Special issue: Selected papers from
                                  EuroMPI 2019 . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   November 2020  . . . . . . . . . . . . . ??

Parallel Computing
Volume 100, Number ??, December, 2020

               Zhongming Fu and   
                  Zhuo Tang and   
                    Li Yang and   
                   Kenli Li and   
                       Keqin Li   ImRP: a Predictive Partition Method for
                                  Data Skew Alleviation in Spark Streaming
                                  Environment  . . . . . . . . . . . . . . ??
                   Shi Dong and   
                    Pu Zhao and   
                    Xue Lin and   
                    David Kaeli   Exploring GPU acceleration of Deep
                                  Neural Networks using Block Circulant
                                  Matrices . . . . . . . . . . . . . . . . ??
   Massimiliano Lupo Pasini and   
             Bruno Turcksin and   
                  Wenjun Ge and   
             Jean-Luc Fattebert   A parallel strategy for density
                                  functional theory computations on
                                  accelerated nodes  . . . . . . . . . . . ??
             Andrew Reisner and   
              Markus Berndt and   
           J. David Moulton and   
                  Luke N. Olson   Scalable line and plane relaxation in a
                                  parallel structured multigrid solver . . ??
           Angelika Schwarz and   
Carl Christian Kjelgaard Mikkelsen and   
                  Lars Karlsson   Robust parallel eigenvector computation
                                  for the non-symmetric eigenvalue problem ??
           Mohammad Almasri and   
                Walid Abu-Sufah   CCF: an efficient SpMV storage format
                                  for AVX512 platforms . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   December 2020  . . . . . . . . . . . . . ??

Parallel Computing
Volume 94--95, Number ??, June, 2020

               Xiongwei Fei and   
                   Kenli Li and   
              Wangdong Yang and   
                       Keqin Li   Analysis of energy efficiency of a
                                  parallel AES algorithm for CPU--GPU
                                  heterogeneous platforms  . . . . . . . . ??
        Joshua Dennis Booth and   
                  Gregory Bolet   An on-node scalable sparse incomplete LU
                                  factorization for a many-core iterative
                                  solver with Javelin  . . . . . . . . . . ??
               Baicheng Yan and   
                 Limin Xiao and   
               Guangjun Qin and   
                 Zhang Yang and   
                   Bin Dong and   
                  Haonan Yu and   
                      Hongyu Wu   QTMS: a quadratic time complexity
                                  topology-aware process mapping method
                                  for large-scale parallel applications on
                                  shared HPC system  . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   June 2020  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 101, Number ??, April, 2021

                  Benbo Zha and   
                      Hong Shen   Improved probabilistic I/O scheduling
                                  for limited-size Burst-Buffers deployed
                                  HPC  . . . . . . . . . . . . . . . . . . ??
              Qianqian Tong and   
              Guannan Liang and   
                 Xingyu Cai and   
              Chunjiang Zhu and   
                       Jinbo Bi   Asynchronous parallel stochastic
                                  Quasi-Newton methods . . . . . . . . . . ??
         Mohammad K. Fallah and   
                Mahmood Fazlali   Parallel branch and bound algorithm for
                                  solving integer linear programming
                                  models derived from behavioral synthesis ??
                Jiaquan Gao and   
                    Qi Chen and   
                      Guixia He   A thread-adaptive sparse approximate
                                  inverse preconditioning algorithm on
                                  multi-GPUs . . . . . . . . . . . . . . . ??
        Esra Ruzgar Ateskan and   
             Kayhan Erciyes and   
           Mehmet Emin Dalkilic   Parallelization of network motif
                                  discovery using star contraction . . . . ??
      Fatéma Zahra Benchara and   
                Mohamed Youssfi   A new scalable distributed $k$-means
                                  algorithm based on Cloud micro-services
                                  for High-performance computing . . . . . ??
                 Yaling Xun and   
                 Jifu Zhang and   
               Haifeng Yang and   
                       Xiao Qin   HBPFP-DC: a parallel frequent itemset
                                  mining using Spark . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   April 2021 . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 102, Number ??, May, 2021

                Melih Sener and   
                Stuart Levy and   
              John E. Stone and   
             AJ Christensen and   
           Barry Isralewitz and   
           Robert Patterson and   
          Kalina Borkiewicz and   
          Jeffrey Carpenter and   
             C. Neil Hunter and   
      Zaida Luthey-Schulten and   
                      Donna Cox   Multiscale modeling and cinematic
                                  visualization of photosynthetic energy
                                  conversion processes from electronic to
                                  cell scales  . . . . . . . . . . . . . . ??
                Olaf Schenk and   
               Peter Arbenz and   
                 Luc Giraud and   
                   Wim Vanroose   Guest editorial: Virtual special issue
                                  on parallel matrix algorithms and
                                  applications (PMAA'18) . . . . . . . . . ??
     Chiheb-Eddine Ben Ncir and   
             Abdallah Hamza and   
                  Waad Bouaguel   Parallel and scalable Dunn Index for the
                                  validation of big data clusters  . . . . ??
                   Xin Long and   
                  Jigang Wu and   
                   Yalan Wu and   
                  Long Chen and   
                      Yidong Li   Context switch cost aware joint task
                                  merging and scheduling for deep learning
                                  applications . . . . . . . . . . . . . . ??
          Hiroyuki Takizawa and   
           Shinji Shiotsuki and   
                Naoki Ebata and   
                  Ryusuke Egawa   OpenCL-like offloading with
                                  metaprogramming for SX-Aurora TSUBASA    ??
            Salvatore Cielo and   
            Luigi Iapichino and   
           Johannes Günther and   
        Christoph Federrath and   
            Elisabeth Mayer and   
               Markus Wiedemann   Visualizing the world's largest
                                  turbulence simulation  . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   May 2021 . . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 103, Number ??, June, 2021

      Mansour Khelghatdoust and   
                Vincent Gramoli   A scalable and low latency probe-based
                                  scheduler for data analytics frameworks  ??
                Ryoma Ohira and   
           Md. Saiful Islam and   
                 Humayun Kayesh   Speedup vs. quality: Asynchronous and
                                  cluster-based distributed adaptive
                                  genetic algorithms for ordered problems  ??
            Ronald Gonzales and   
               Yury Gryazin and   
                   Yun Teck Lee   Parallel FFT algorithms for high-order
                                  approximations on three-dimensional
                                  compact stencils . . . . . . . . . . . . ??
               Akemi Shioya and   
                Yusaku Yamamoto   Block red-black MILU(0) preconditioner
                                  with relaxation on GPU . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   June 2021  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 104--105, Number ??, July, 2021

            Stephanie Brink and   
             Matthew Larsen and   
                Hank Childs and   
                 Barry Rountree   Evaluating adaptive and predictive power
                                  management strategies for optimizing
                                  visualization performance on
                                  supercomputers . . . . . . . . . . . . . ??
               Chahak Mehta and   
            Amarnath Karthi and   
              Vishrut Jetly and   
              Bhaskar Chaudhury   Parallel Fast Multipole Method
                                  accelerated FFT on HPC clusters  . . . . ??
              Jacob Lambert and   
                 Seyong Lee and   
          Jeffrey S. Vetter and   
                Allen D. Malony   Optimization with the OpenACC-to-FPGA
                                  framework on the Arria 10 and Stratix 10
                                  FPGAs  . . . . . . . . . . . . . . . . . ??
             Jared Brzenski and   
        Christopher Paolini and   
               Jose E. Castillo   Improving the I/O of large geophysical
                                  models using PnetCDF and BeeGFS  . . . . ??
   Massimiliano Lupo Pasini and   
                  Junqi Yin and   
                Ying Wai Li and   
               Markus Eisenbach   A scalable algorithm for the
                                  optimization of neural network
                                  architectures  . . . . . . . . . . . . . ??
            Christos Bellas and   
            Anastasios Gounaris   HySet: a hybrid framework for exact set
                                  similarity join using a GPU  . . . . . . ??
            Fareed Qararyah and   
              Mohamed Wahib and   
              Doga Dikbayir and   
     Mehmet Esat Belviranli and   
                     Didem Unat   A computational-graph partitioning
                                  method for training memory-constrained
                                  DNNs . . . . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   July 2021  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 106, Number ??, September, 2021

                 Seher Acer and   
              Erik G. Boman and   
         Christian A. Glusa and   
      Sivasankaran Rajamanickam   Sphynx: a parallel multi-GPU graph
                                  partitioner for distributed-memory
                                  systems  . . . . . . . . . . . . . . . . ??
           Joseph Schuchart and   
            Philipp Samfass and   
       Christoph Niethammer and   
                José Gracia and   
                 George Bosilca   Callback-based completion notification
                                  using MPI Continuations  . . . . . . . . ??
                Cherifa Dad and   
      Jean-Philippe Tavella and   
                Stéphane Vialle   Synthesis and feedback on the
                                  distribution and parallelization of
                                  FMI-CS-based co-simulations with the
                                  DACCOSIM platform  . . . . . . . . . . . ??
              Mellila Bouam and   
        Charles Bouillaguet and   
           Claire Delaplace and   
                   Camille Noûs   Computational records with aging
                                  hardware: Controlling half the output of
                                  SHA-256  . . . . . . . . . . . . . . . . ??
             Masahiro Nakao and   
                Maaki Sakai and   
             Yoshiko Hanada and   
              Hitoshi Murai and   
                 Mitsuhisa Sato   Graph optimization algorithm for
                                  low-latency interconnection networks . . ??
                 Zhixing Yu and   
                  Kejing He and   
                    Xiuhong Zou   PEAB: a pool-based distributed
                                  evolutionary algorithm model with buffer ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   September 2021 . . . . . . . . . . . . . ??

Parallel Computing
Volume 107, Number ??, October, 2021

             Evan Schneider and   
            Brant Robertson and   
             Alexander Kuhn and   
            Christopher Lux and   
                  Marc Nienhaus   NVIDIA IndeX accelerated computing for
                                  visualizing Cholla's galactic winds  . . ??
                Bipin Kumar and   
                 Matt Rehme and   
              Neethi Suresh and   
          Nihanth Cherukuru and   
      Stanislaw Jaroszynski and   
                  Samual Li and   
               Scott Pearse and   
              Tim Scheitlin and   
        Suryachandra A. Rao and   
             Ravi S. Nanjundiah   Optimization of DNS code and
                                  visualization of entrainment and mixing
                                  phenomena at cloud edges . . . . . . . . ??
            Andreas Jocksch and   
                  Noé Ohana and   
             Emmanuel Lanti and   
          Eirini Koutsaniti and   
        Vasileios Karakasis and   
                Laurent Villard   An optimisation of allreduce
                                  communication in message-passing systems ??
                John Lawson and   
                     Mehdi Goli   Performance portability through machine
                                  learning guided kernel selection in SYCL
                                  libraries  . . . . . . . . . . . . . . . ??
                Michael Orr and   
                  Oliver Sinnen   Optimal task scheduling for partially
                                  heterogeneous systems  . . . . . . . . . ??
       Jesper Larsson Träff and   
              Sascha Hunold and   
          Guillaume Mercier and   
               Daniel J. Holmes   MPI collective communication through a
                                  single set of interfaces: a case for
                                  orthogonality  . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   October 2021 . . . . . . . . . . . . . . ??

Parallel Computing
Volume 108, Number ??, December, 2021

                Linchao Cai and   
               Junrong Yang and   
               Shoubin Dong and   
                   Zhenyu Jiang   GPU accelerated parallel
                                  reliability-guided digital volume
                                  correlation with automatic seed
                                  selection based on $3$D SIFT . . . . . . ??
           Kurt B. Ferreira and   
                     Scott Levy   Evaluating MPI resource usage summary
                                  statistics . . . . . . . . . . . . . . . ??
      Matthew G. F. Dosanjh and   
              Andrew Worley and   
              Derek Schafer and   
        Prema Soundararajan and   
             Sheikh Ghafoor and   
           Anthony Skjellum and   
   Purushotham V. Bangalore and   
                  Ryan E. Grant   Implementation and evaluation of MPI 4.0
                                  partitioned communication libraries  . . ??
 Mirsaeid Hosseini Shirvani and   
           Reza Noorian Talouki   A novel hybrid heuristic-based list
                                  scheduling algorithm in heterogeneous
                                  cloud computing environment for
                                  makespan optimization  . . . . . . . . . ??
    David B. Williams-Young and   
         Abhishek Bagusetty and   
            Wibe A. de Jong and   
           Douglas Doerfler and   
     Hubertus J. J. van Dam and   
  Álvaro Vázquez-Mayagoitia and   
          Theresa L. Windus and   
                      Chao Yang   Achieving performance portability in
                                  Gaussian basis set density functional
                                  theory on accelerator based
                                  architectures in NWChemEx  . . . . . . . ??
              Sean M. Couch and   
              Jared Carlson and   
             Michael Pajkos and   
            Brian W. O'Shea and   
                Anshu Dubey and   
                 Tom Klosterman   Towards performance portability in the
                                  Spark astrophysical magnetohydrodynamics
                                  solver in the Flash-X simulation
                                  framework  . . . . . . . . . . . . . . . ??
         Richard Tran Mills and   
              Mark F. Adams and   
               Satish Balay and   
                  Jed Brown and   
                  Alp Dener and   
            Matthew Knepley and   
            Scott E. Kruger and   
              Hannah Morgan and   
                Todd Munson and   
                  Karl Rupp and   
             Barry F. Smith and   
            Stefano Zampini and   
                 Hong Zhang and   
                  Junchao Zhang   Toward performance-portable PETSc for
                                  GPU-based exascale systems . . . . . . . ??
              John R. Tramm and   
               Andrew R. Siegel   Immortal rays: Rethinking random ray
                                  neutron transport on GPU architectures   ??
                   A. Myers and   
                 A. Almgren and   
               L. D. Amorim and   
                    J. Bell and   
                  L. Fedeli and   
                      L. Ge and   
                    K. Gott and   
                D. P. Grote and   
                   M. Hogan and   
                   A. Huebl and   
             R. Jambunathan and   
                    R. Lehe and   
                      C. Ng and   
                   M. Rowan and   
                O. Shapoval and   
                M. Thévenet and   
                  J.-L. Vay and   
                H. Vincenti and   
                    E. Yang and   
                    N. Zaïm and   
                   W. Zhang and   
                    Y. Zhao and   
                        E. Zoni   Porting WarpX to GPU-accelerated
                                  platforms  . . . . . . . . . . . . . . . ??
           Kenneth Moreland and   
             Robert Maynard and   
              David Pugmire and   
           Abhishek Yenpure and   
            Allison Vacanti and   
             Matthew Larsen and   
                    Hank Childs   Minimizing development costs for
                                  efficient many-core visualization using
                                  MCD$^3$  . . . . . . . . . . . . . . . . ??
              Cody J. Balos and   
           David J. Gardner and   
          Carol S. Woodward and   
             Daniel R. Reynolds   Enabling GPU accelerated computing in
                                  the SUNDIALS time integration
                                  library  . . . . . . . . . . . . . . . . ??
                 Keren Zhou and   
           Laksono Adhianto and   
          Jonathon Anderson and   
              Aaron Cherian and   
             Dejan Grubisic and   
               Mark Krentel and   
                 Yumeng Liu and   
               Xiaozhu Meng and   
            John Mellor-Crummey   Measurement and analysis of
                                  GPU-accelerated applications with
                                  HPCToolkit . . . . . . . . . . . . . . . ??
              Fabian Czappa and   
         Alexandru Calotoiu and   
                Thomas Höhl and   
               Heiko Mantel and   
                Toni Nguyen and   
                     Felix Wolf   Design-time performance modeling of
                                  compositional parallel programs  . . . . ??
          Robert D. Falgout and   
                 Ruipeng Li and   
             Björn Sjögreen and   
                    Lu Wang and   
              Ulrike Meier Yang   Porting hypre to heterogeneous
                                  computer architectures: Strategies and
                                  experiences  . . . . . . . . . . . . . . ??
          Ahmad Abdelfattah and   
              Valeria Barra and   
              Natalie Beams and   
                Ryan Bleile and   
                  Jed Brown and   
        Jean-Sylvain Camier and   
              Robert Carson and   
              Noel Chalmers and   
             Veselin Dobrev and   
             Yohann Dudouit and   
               Paul Fischer and   
                Ali Karakus and   
          Stefan Kerkemeier and   
               Tzanio Kolev and   
              Yu-Hsiang Lan and   
               Elia Merzari and   
                  Misun Min and   
           Malachi Phillips and   
         Thilina Rathnayake and   
              Robert Rieben and   
               Thomas Stitt and   
        Ananias Tomboulides and   
            Stanimire Tomov and   
             Vladimir Tomov and   
              Arturo Vargas and   
              Tim Warburton and   
                  Kenneth Weiss   GPU algorithms for Efficient Exascale
                                  Discretizations  . . . . . . . . . . . . ??
              Yuta Hasegawa and   
              Takayuki Aoki and   
        Hiromichi Kobayashi and   
           Yasuhiro Idomura and   
                Naoyuki Onodera   Tree cutting approach for domain
                                  partitioning on forest-of-octrees-based
                                  block-structured static adaptive mesh
                                  refinement with lattice Boltzmann method ??
               Atsushi Hori and   
           Emmanuel Jeannot and   
             George Bosilca and   
             Takahiro Ogura and   
              Balazs Gerofi and   
                    Jie Yin and   
                Yutaka Ishikawa   An international survey on MPI users . . ??
                  Wen Cheng and   
                Shijun Deng and   
              Lingfang Zeng and   
                  Yang Wang and   
                André Brinkmann   AIOC$^2$: a deep Q-learning approach to
                                  autonomic I/O congestion control in
                                  Lustre . . . . . . . . . . . . . . . . . ??
     Igor Fontana de Nardin and   
      Rodrigo da Rosa Righi and   
  Thiago Roberto Lima Lopes and   
   Cristiano André da Costa and   
            Heon Young Yeom and   
                 Harald Köstler   On revisiting energy and performance in
                                  microservices applications: a cloud
                                  elasticity-driven approach . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   December 2021  . . . . . . . . . . . . . ??

Parallel Computing
Volume 109, Number ??, March, 2022

             Seonmyeong Bak and   
            Colleen Bertoni and   
                 Swen Boehm and   
           Reuben Budiardja and   
         Barbara M. Chapman and   
          Johannes Doerfert and   
           Markus Eisenbach and   
                 Hal Finkel and   
            Oscar Hernandez and   
               Joseph Huber and   
           Shintaro Iwasaki and   
                 Vivek Kale and   
            Paul R. C. Kent and   
              JaeHyuk Kwack and   
                Meifeng Lin and   
             Piotr Luszczek and   
                     Ye Luo and   
                   Buu Pham and   
            Swaroop Pophale and   
            Kiran Ravikumar and   
               Vivek Sarkar and   
            Thomas Scogland and   
                Shilei Tian and   
                    P. K. Yeung   OpenMP application experiences: Porting
                                  to accelerated nodes . . . . . . . . . . ??
             Joachim Protze and   
        Marc-André Hermanns and   
         Matthias S. Müller and   
             Van Man Nguyen and   
              Julien Jaeger and   
        Emmanuelle Saillard and   
         Patrick Carribault and   
                  Denis Barthou   MPI detach --- Towards automatic
                                  asynchronous local completion  . . . . . ??
          Stephane Bouhrour and   
              Thibaut Pepin and   
                  Julien Jaeger   Towards leveraging collective
                                  performance with the support of MPI 4.0
                                  features in MPC  . . . . . . . . . . . . ??
     Leonardo Solis-Vasquez and   
         Andreas F. Tillack and   
       Diogo Santos-Martins and   
               Andreas Koch and   
              Scott LeGrand and   
                  Stefano Forli   Benchmarking the performance of
                                  irregular computations in AutoDock--GPU
                                  molecular docking  . . . . . . . . . . . ??
           Stephen Timcheck and   
                  Jeremy Buhler   Reducing queuing impact in streaming
                                  applications with irregular dataflow . . ??
                 Dong Zhong and   
                Qinglei Cao and   
             George Bosilca and   
                  Jack Dongarra   Using long vector extensions for MPI
                                  reductions . . . . . . . . . . . . . . . ??
             Mirko Mariotti and   
           Daniel Magalotti and   
              Daniele Spiga and   
                Loriano Storchi   The BondMachine, a moldable computer
                                  architecture . . . . . . . . . . . . . . ??
              Boro Sofranac and   
            Ambros Gleixner and   
              Sebastian Pokutta   Accelerating domain propagation: an
                                  efficient GPU-parallel algorithm over
                                  sparse matrices  . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   March 2022 . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 110, Number ??, May, 2022

                 Sunwoo Lee and   
               Kai-yuan Hou and   
                 Kewei Wang and   
               Saba Sehrish and   
               Marc Paterno and   
          James Kowalkowski and   
             Quincey Koziol and   
             Robert B. Ross and   
              Ankit Agrawal and   
             Alok Choudhary and   
                  Wei-keng Liao   A case study on parallel HDF5
                                  dataset concatenation for high energy
                                  physics data analysis  . . . . . . . . . ??
                    Rong Gu and   
                    Jun Shi and   
               Xiaofei Chen and   
              Zhaokang Wang and   
                   Yang Che and   
                  Kai Zhang and   
                    Yihua Huang   Octopus-DF: Unified DataFrame-based
                                  cross-platform data analytic system  . . ??
           Rafael F. Schmid and   
              Flávia Pisani and   
           Edson N. Cáceres and   
                    Edson Borin   An evaluation of fast segmented sorting
                                  implementations on GPUs  . . . . . . . . ??
                   Fei Teng and   
                     Lei Yu and   
                   Xiao Liu and   
                        Pei Lai   Tight Lower bound on power consumption
                                  for scheduling real-time periodic tasks
                                  in core-level DVFS systems . . . . . . . ??
          Spiros N. Agathos and   
  Vassilios V. Dimakopoulos and   
            Ilias K. Kasmeridis   Compiler-assisted, adaptive runtime
                                  system for the support of OpenMP in
                                  embedded multicores  . . . . . . . . . . ??
                  Ian Bogle and   
            George M. Slota and   
              Erik G. Boman and   
            Karen D. Devine and   
      Sivasankaran Rajamanickam   Parallel graph coloring algorithms for
                                  distributed GPU environments . . . . . . ??
             Pieter Ghysels and   
                      Ryan Synk   High performance sparse multifrontal
                                  solvers on modern GPUs . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   May 2022 . . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 111, Number ??, July, 2022

          Kasia Świrydowicz and   
                 Eric Darve and   
               Wesley Jones and   
             Jonathan Maack and   
               Shaked Regev and   
        Michael A. Saunders and   
          Stephen J. Thomas and   
                   Slaven Peleš   Linear solvers for power grid
                                  optimization problems: a review of
                                  GPU-accelerated linear solvers . . . . . ??
                      Anonymous   Special issue of Selected Papers from
                                  EuroMPI/USA 2020 . . . . . . . . . . . . ??
              Jianguo Liang and   
                   Rong Hua and   
               Wenqiang Zhu and   
                    Yuxi Ye and   
                     You Fu and   
                      Hao Zhang   OpenACC + Athread collaborative
                                  optimization of Silicon-Crystal
                                  application on Sunway TaihuLight . . . . ??
              Nitin Gawande and   
                Sayan Ghosh and   
     Mahantesh Halappanavar and   
             Antonino Tumeo and   
            Ananth Kalyanaraman   Towards scaling community detection on
                                  distributed-memory heterogeneous systems ??
               Zhongyu Shen and   
                Jilin Zhang and   
                Tomohiro Suzuki   Task-parallel tiled direct solver for
                                  dense symmetric indefinite systems . . . ??
                   Yalan Wu and   
                  Jigang Wu and   
                   Peng Liu and   
                  Yinhe Han and   
        Thambipillai Srikanthan   Reconfiguration algorithms for
                                  synchronous communication on switch
                                  based degradable arrays  . . . . . . . . ??
               Terry Cojean and   
        Yu-Hsiang Mike Tsai and   
                   Hartwig Anzt   Ginkgo --- a math library designed for
                                  platform portability . . . . . . . . . . ??
           Johannes Pekkilä and   
          Miikka S. Väisälä and   
           Maarit J. Käpylä and   
        Matthias Rheinhardt and   
                    Oskar Lappi   Scalable communication for high-order
                                  stencil computations using CUDA-aware
                                  MPI  . . . . . . . . . . . . . . . . . . ??
             Keita Iwabuchi and   
              Karim Youssef and   
           Kaushik Velusamy and   
               Maya Gokhale and   
                   Roger Pearce   Metall: a persistent memory allocator
                                  for data-centric analytics . . . . . . . ??
            Yassine Ramdane and   
              Omar Boussaid and   
          Doulkifli Boukraà and   
              Nadia Kabachi and   
                Fadila Bentayeb   Building a novel physical design of a
                                  distributed big data warehouse over a
                                  Hadoop cluster to enhance OLAP cube
                                  query performance  . . . . . . . . . . . ??
              Robert Schade and   
              Tobias Kenter and   
           Hossam Elgabarty and   
               Michael Lass and   
                 Ole Schütt and   
              Alfio Lazzaro and   
                 Hans Pabst and   
               Stephan Mohr and   
                Jürg Hutter and   
            Thomas D. Kühne and   
               Christian Plessl   Towards electronic structure-based
                                  ab-initio molecular dynamics
                                  simulations with hundreds of millions of
                                  atoms  . . . . . . . . . . . . . . . . . ??
           Tetsuro Nakamura and   
                Shogo Saito and   
               Kei Fujimoto and   
             Masashi Kaneko and   
                Akinori Shiraga   Spatial- and time- division multiplexing
                                  in CNN accelerator . . . . . . . . . . . ??
   Nuntipat Phisutthangkoon and   
              Jeeraporn Werapun   Optimal ATAPE task scheduling on
                                  reconfigurable and partitionable
                                  hierarchical hypercube networks  . . . . ??
                Ziheng Wang and   
                  Heng Chen and   
                Weiling Cai and   
               Xiaoshe Dong and   
                  Xingjun Zhang   $C$-Lop: Accurate contention-based
                                  modeling of MPI concurrent communication ??
    Vianney Kengne Tchendji and   
    Hermann Bogning Tepiele and   
       Mathias Akong Onabid and   
       Jean Frédéric Myoupo and   
           Jerry Lacmou Zeutouo   A coarse-grained multicomputer parallel
                                  algorithm for the sequential substring
                                  constrained longest common subsequence
                                  problem  . . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   July 2022  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 112, Number ??, September, 2022

              Andrew Garmon and   
       Vinay Ramakrishnaiah and   
                    Danny Perez   Resource allocation for task-level
                                  speculative scientific applications: a
                                  proof of concept using Parallel
                                  Trajectory Splicing  . . . . . . . . . . ??
             Daniel Bielich and   
              Julien Langou and   
             Stephen Thomas and   
          Kasia Świrydowicz and   
          Ichitaro Yamazaki and   
                  Erik G. Boman   Low-synch Gram--Schmidt with delayed
                                  reorthogonalization for Krylov solvers   ??
            Busenur Aktilav and   
                        Isil Öz   Performance and accuracy predictions of
                                  approximation methods for shortest-path
                                  algorithms on GPUs . . . . . . . . . . . ??
                  Lena Oden and   
                    Jörg Keller   Improving cryptanalytic applications
                                  with stochastic runtimes on GPUs and
                                  multicores . . . . . . . . . . . . . . . ??
                  Zhong Liu and   
                   Xin Xiao and   
                    Chen Li and   
                   Sheng Ma and   
                    Deng Rangyu   Optimizing convolutional neural networks
                                  on multi-core vector accelerator . . . . ??
                  Wenhu Shi and   
                Hongjian Li and   
                Junzhe Guan and   
                  Hang Zeng and   
             Rafe Misskat jahan   Energy-efficient scheduling algorithms
                                  based on task clustering in
                                  heterogeneous Spark clusters . . . . . . ??
               Seo Jin Jang and   
                    Wei Liu and   
                     Wei Li and   
                  Yong Beom Cho   Parallel multi-view HEVC for
                                  heterogeneously embedded cluster system  ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   September 2022 . . . . . . . . . . . . . ??

Parallel Computing
Volume 113, Number ??, October, 2022

              Alessio Netti and   
                Michael Ott and   
              Carla Guillen and   
             Daniele Tafani and   
                  Martin Schulz   Operational Data Analytics in practice:
                                  Experiences from design to deployment in
                                  production HPC environments  . . . . . . ??
                 J. Pronold and   
                  J. Jordan and   
             B. J. N. Wylie and   
                I. Kitayama and   
                M. Diesmann and   
                      S. Kunkel   Routing brain traffic through the von
                                  Neumann bottleneck: Efficient cache
                                  usage in spiking neural network
                                  simulation code on general purpose
                                  computers  . . . . . . . . . . . . . . . ??
               Jiazhi Jiang and   
                  Dan Huang and   
                 Jiangsu Du and   
                  Yutong Lu and   
                   Xiangke Liao   Optimizing small channel $3$D
                                  convolution on GPU with tensor core  . . ??
                Gizen Mutlu and   
                Çigdem Inan Aci   SVM-SMO-SGD: a hybrid-parallel support
                                  vector machine algorithm using
                                  sequential minimal optimization with
                                  stochastic gradient descent  . . . . . . ??
                 Tianshi Xu and   
         Vassilis Kalantzis and   
                 Ruipeng Li and   
                 Yuanzhe Xi and   
            Geoffrey Dillon and   
                    Yousef Saad   parGeMSLR: a parallel multilevel Schur
                                  complement low-rank preconditioning and
                                  solution package for general sparse
                                  matrices . . . . . . . . . . . . . . . . ??
               Qingxiao Sun and   
                     Liu Yi and   
               Hailong Yang and   
                Mingzhen Li and   
              Zhongzhi Luan and   
                     Depei Qian   QoS-aware dynamic resource allocation
                                  with improved utilization and energy
                                  efficiency on GPU  . . . . . . . . . . . ??
                Jaemin Choi and   
                  Zane Fink and   
                  Sam White and   
                 Nitin Bhat and   
          David F. Richards and   
              Laxmikant V. Kale   Accelerating communication for parallel
                                  programming models on GPU systems  . . . ??
                  Yan Huang and   
               Qingbin Wang and   
                 Minghao Lv and   
             Xingguang Song and   
                Jinkai Feng and   
                   Xuli Tan and   
                Ziyan Huang and   
                   Chuyuan Zhou   Fast calculation of isostatic
                                  compensation correction using the
                                  GPU-parallel prism method  . . . . . . . ??
                   Hao Wang and   
                      Ce Yu and   
                  Jian Xiao and   
             Shanjiang Tang and   
                      Yu Lu and   
                     Hao Fu and   
                    Bo Kang and   
                 Gang Zheng and   
                   Chenzhou Cui   A method for efficient radio
                                  astronomical data gridding on multi-core
                                  vector processor . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   October 2022 . . . . . . . . . . . . . . ??

Parallel Computing
Volume 114, Number ??, December, 2022

                Lukas Spies and   
               Amanda Bienz and   
              David Moulton and   
                 Luke Olson and   
                 Andrew Reisner   Tausch: a halo exchange library for
                                  large heterogeneous computing systems
                                  using MPI, OpenCL, and CUDA  . . . . . . ??
               Xinyuan Wang and   
                   Hejiao Huang   SGPM: a coroutine framework for
                                  transaction processing . . . . . . . . . ??
               Paul Fischer and   
          Stefan Kerkemeier and   
                  Misun Min and   
              Yu-Hsiang Lan and   
           Malachi Phillips and   
         Thilina Rathnayake and   
               Elia Merzari and   
        Ananias Tomboulides and   
                Ali Karakus and   
              Noel Chalmers and   
                  Tim Warburton   NekRS, a GPU-accelerated spectral
                                  element Navier--Stokes solver  . . . . . ??
             Masahiro Nakao and   
           Masaki Tsukamoto and   
             Yoshiko Hanada and   
                 Keiji Yamamoto   Graph optimization algorithm using
                                  symmetry and host bias for low-latency
                                  indirect network . . . . . . . . . . . . ??
                 Adel Dabah and   
           Ibrahim Chegrane and   
              Saïd Yahiaoui and   
           Ahcene Bendjoudi and   
       Nadia Nouali-Taboudjemat   Efficient parallel branch-and-bound
                                  approaches for exact graph edit distance
                                  problem  . . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   December 2022  . . . . . . . . . . . . . ??

Parallel Computing
Volume 115, Number ??, February, 2023

          Guilherme Andrade and   
            Renato Ferreira and   
                 George Teodoro   Spatial-aware data partition for
                                  distributed memory parallelization of
                                  ANN search in multimedia retrieval . . . ??
                G. Patronas and   
            N. Vlassopoulos and   
                 Ph. Bellos and   
                      D. Reisis   Accelerating the scheduling of the
                                  network resources of the next-generation
                                  optical data centers . . . . . . . . . . ??
               Özcan Dülger and   
            Halit Oguztüzün and   
            Mübeccel Demirekler   Uphill resampling for particle filter
                                  and its implementation on graphics
                                  processing unit  . . . . . . . . . . . . ??
                 Guoqing Wu and   
               Hongyun Tian and   
                     Guo Lu and   
                       Wei Wang   ParVoro++: a scalable parallel algorithm
                                  for constructing $3$D Voronoi
                                  tessellations based on $kd$-tree
                                  decomposition  . . . . . . . . . . . . . ??
                    Kuan Li and   
                    Kang He and   
              Stef Graillat and   
                  Hao Jiang and   
               Tongxiang Gu and   
                        Jie Liu   Multi-level parallel multi-layer block
                                  reproducible summation algorithm . . . . ??
         Phillip Allen Lane and   
            Joshua Dennis Booth   Heterogeneous sparse matrix--vector
                                  multiplication via compressed sparse row
                                  format . . . . . . . . . . . . . . . . . ??
              Valeriy Manin and   
                     Bruno Lang   Efficient parallel reduction of
                                  bandwidth for symmetric matrices . . . . ??
                      Anonymous   Reviewer acknowledgment  . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   February 2023  . . . . . . . . . . . . . ??

Parallel Computing
Volume 116, Number ??, July, 2023

                Yidong Chen and   
                    Chen Li and   
                Yonghong Hu and   
                    Zhonghua Lu   A parallel non-convex approximation
                                  framework for risk parity portfolio
                                  design . . . . . . . . . . . . . . . . . ??
            Marek Palkowski and   
           Wlodzimierz Bielecki   NPDP benchmark suite for the evaluation
                                  of the effectiveness of automatic
                                  optimizing compilers . . . . . . . . . . ??
                  Zeshi Liu and   
                   Zhen Xie and   
               Wenqian Dong and   
              Mengting Yuan and   
                Haihang You and   
                        Dong Li   A heterogeneous processing-in-memory
                                  approach to accelerate quantum chemistry
                                  simulation . . . . . . . . . . . . . . . ??
               Akira Nukada and   
            Taichiro Suzuki and   
               Satoshi Matsuoka   Efficient checkpoint/restart of CUDA
                                  applications . . . . . . . . . . . . . . ??
           David Castells-Rufas   GPU acceleration of Levenshtein distance
                                  computation between long strings . . . . ??
                Lukas Reitz and   
           Kai Hardenbicker and   
              Tobias Werner and   
                  Claudia Fohry   Lifeline-based load balancing schemes
                                  for Asynchronous Many-Task runtimes in
                                  clusters . . . . . . . . . . . . . . . . ??
            Shelby Lockhart and   
               Amanda Bienz and   
           William D. Gropp and   
                  Luke N. Olson   Characterizing the performance of
                                  node-aware strategies for irregular
                                  point-to-point communication on
                                  heterogeneous architectures  . . . . . . ??
                     Lei Yu and   
               Tianqi Zhong and   
                    Peng Bi and   
                   Lan Wang and   
                       Fei Teng   Segment based power-efficient scheduling
                                  for real-time DAG tasks on edge devices  ??
              Clément Foyer and   
               Brice Goglin and   
            Andrès Rubio Proaño   A survey of software techniques to
                                  emulate heterogeneous memory systems in
                                  high-performance computing . . . . . . . ??
       Andres Pastrana-Cruz and   
                  Manuel Lafond   A lightweight semi-centralized strategy
                                  for the massive parallelization of
                                  branching algorithms . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   July 2023  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 117, Number ??, September, 2023

         Srdan Daniel Simić and   
            Nikola Tanković and   
                  Darko Etinger   Big data BPMN workflow resource
                                  optimization in the cloud  . . . . . . . ??
                Rene Halver and   
         Christoph Junghans and   
               Godehard Sutmann   Using heterogeneous GPU nodes with a
                                  Cabana-based implementation of MPCD  . . ??
                     Bin Yu and   
                      Xu Lu and   
                  Cong Tian and   
                  Meng Wang and   
                   Chu Chen and   
                   Ming Lei and   
                   Zhenhua Duan   Adaptively parallel runtime verification
                                  based on distributed network for
                                  temporal properties  . . . . . . . . . . ??
                Jiang Zheng and   
               Jiazhi Jiang and   
                 Jiangsu Du and   
                  Dan Huang and   
                      Yutong Lu   Optimizing massively parallel sparse
                                  matrix computing on ARM many-core
                                  processor  . . . . . . . . . . . . . . . ??
              Lih-Yuan Deng and   
            Bryan R. Winter and   
        Jyh-Jen Horng Shiau and   
       Henry Horng-Shing Lu and   
               Nirman Kumar and   
                 Ching-Chi Yang   Parallelizable efficient large order
                                  multiple recursive generators  . . . . . ??
                Ami Marowka and   
         Przemysław Stpiczyński   Editorial on Advances in High
                                  Performance Programming  . . . . . . . . ??
               Jinliang Shi and   
                  Dewu Chen and   
                Jiabi Liang and   
                     Lin Li and   
                    Yue Lin and   
                   Jianjiang Li   New YARN sharing GPU based on graphics
                                  memory granularity scheduling  . . . . . ??
                   Adam Sky and   
            César Polindara and   
                Ingo Muench and   
                   Carolin Birk   A flexible sparse matrix data format and
                                  parallel algorithms for the assembly of
                                  finite element matrices on shared memory
                                  systems  . . . . . . . . . . . . . . . . ??
            Muhammad Kabeer and   
              Ibrahim Yusuf and   
               Nasir Ahmad Sufi   Distributed software defined
                                  network-based fog to fog collaboration
                                  scheme . . . . . . . . . . . . . . . . . ??
                      Ou Wu and   
                Shanshan Li and   
                   He Zhang and   
                  Liwen Liu and   
                 Haoming Li and   
                 Yanze Wang and   
                     Ziyi Zhang   An optimal scheduling algorithm
                                  considering the transactions worst-case
                                  delay for multi-channel hyperledger
                                  fabric network . . . . . . . . . . . . . ??
             Ignacio Laguna and   
                   Anh Tran and   
          Ganesh Gopalakrishnan   Finding inputs that trigger
                                  floating-point exceptions in
                                  heterogeneous computing via Bayesian
                                  optimization . . . . . . . . . . . . . . 103042:1--103042:13
                  Hao Zhang and   
                Zhiyi Huang and   
                 Yawen Chen and   
              Jianguo Liang and   
                      Xiran Gao   ESA: an efficient sequence alignment
                                  algorithm for biological database search
                                  on Sunway TaihuLight . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   September 2023 . . . . . . . . . . . . . ??

Parallel Computing
Volume 118, Number ??, November, 2023

            Matthias Bolten and   
        Stephanie Friedhoff and   
                     Jens Hahne   Task graph-based performance analysis of
                                  parallel-in-time methods . . . . . . . . ??
           James D. Trotter and   
          Johannes Langguth and   
                       Xing Cai   Targeting performance and
                                  user-friendliness: GPU-accelerated
                                  finite element computation with
                                  automated code generation in FEniCS  . . ??
                  Zhexu Liu and   
               Shaofeng Liu and   
                Zhiyong Fan and   
                      Zhen Zhao   Low consumption automatic discovery
                                  protocol for DDS-based large-scale
                                  distributed parallel computing . . . . . ??
                  Yunqi Gao and   
               Zechao Zhang and   
                    Bing Hu and   
                 A-Long Jin and   
                    Chunming Wu   OF-WFBP: a near-optimal communication
                                  mechanism for tensor fusion in
                                  distributed deep learning  . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   November 2023  . . . . . . . . . . . . . ??

Parallel Computing
Volume 119, Number ??, February, 2024

                 Shushan Li and   
                  Meng Wang and   
                 Hong Zhang and   
                        Yao Liu   Program partitioning and deadlock
                                  analysis for MPI based on logical clocks ??
                     Xi Liu and   
                Gizem Kayar and   
                     Ken Perlin   A GPU-based hydrodynamic simulator with
                                  boid interactions  . . . . . . . . . . . ??
           Mohammad Norouzi and   
              Nicolas Morew and   
                Qamar Ilias and   
         Lukas Rothenberger and   
              Ali Jannesari and   
                     Felix Wolf   Fast data-dependence profiling through
                                  prior static analysis  . . . . . . . . . ??
                     Ke Liu and   
                Haonan Tong and   
             Zhongxiang Sun and   
                 Zhixin Ren and   
             Guangkui Huang and   
                Hongyin Zhu and   
                 Luyang Liu and   
                Qunyang Lin and   
                   Chuang Zhang   Integrating FPGA-based hardware
                                  acceleration with relational databases   ??
                    Anne Benoit   Editorial for Parallel
                                  Computing  . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   February 2024  . . . . . . . . . . . . . ??

Parallel Computing
Volume 120, Number ??, June, 2024

               Jianjiang Li and   
                     Lin Li and   
               Qingwei Wang and   
                    Wei Xue and   
                Jiabi Liang and   
                   Jinliang Shi   Parallel optimization and application of
                                  unstructured sparse triangular solver on
                                  new generation of Sunway architecture    ??
              Kohei Yoshida and   
               Shinobu Miwa and   
              Hayato Yamaki and   
                   Hiroki Honda   Analyzing the impact of CUDA versions on
                                  GPU applications . . . . . . . . . . . . ??
                  Kaihao Ma and   
                Zhenkun Cai and   
                   Xiao Yan and   
                 Yang Zhang and   
                    Zhi Liu and   
                 Yihui Feng and   
                    Chao Li and   
                    Wei Lin and   
                    James Cheng   PPS: Fair and efficient black-box
                                  scheduling for multi-tenant GPU clusters ??
            Sanjay Bhardwaj and   
                 Da-Hye Kim and   
                 Dong-Seong Kim   Federated learning based modulation
                                  classification for multipath channels    ??
        Fahimeh Yazdanpanah and   
                 Mohammad Alaei   An approach for low-power heterogeneous
                                  parallel implementation of ALC-PSO
                                  algorithm using OmpSs and CUDA . . . . . ??
              Qingcai Jiang and   
                Zhenwei Cao and   
                 Xinhui Cui and   
                Lingyun Wan and   
                Xinming Qin and   
                 Huanqi Cao and   
                    Hong An and   
                Junshi Chen and   
                    Jie Liu and   
                     Wei Hu and   
                   Jinlong Yang   Extending the limit of LR-TDDFT on two
                                  different approaches: Numerical
                                  algorithms and new Sunway heterogeneous
                                  supercomputer  . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   June 2024  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 121, Number ??, September, 2024

                   Duo Yang and   
                    Bing Hu and   
                     An Liu and   
                 A-Long Jin and   
              Kwan L. Yeung and   
                       Yang You   WBSP: Addressing stragglers in
                                  distributed machine learning with
                                  worker-busy synchronous parallel . . . . ??
          Alexander Agathos and   
               Philip Azariadis   Multi-GPU $3$D $k$-nearest neighbors
                                  computation with application to ICP,
                                  point cloud smoothing and normals
                                  computation  . . . . . . . . . . . . . . ??
                Chunfeng Li and   
              Karim Soliman and   
                    Fei Yin and   
                    Jin Wei and   
                       Feng Shi   NxtSPR: a deadlock-free shortest path
                                  routing dedicated to relaying for
                                  Triplet-Based many-core Architecture . . ??
                  Gang Xian and   
              Wenxiang Yang and   
                 Yusong Tan and   
               Jinghua Feng and   
                    Yuqi Li and   
                 Jian Zhang and   
                         Jie Yu   Mobilizing underutilized storage nodes
                                  via job path: a job-aware file striping
                                  approach . . . . . . . . . . . . . . . . ??
                 Jiří Klepl and   
                Adam Smelko and   
             Lukáš Rozsypal and   
                  Martin Krulis   Abstractions for C++ code optimizations
                                  in parallel high-performance
                                  applications . . . . . . . . . . . . . . ??
               Dolores Miao and   
             Ignacio Laguna and   
       Giorgis Georgakoudis and   
     Konstantinos Parasyris and   
           Cindy Rubio-González   An automated OpenMP mutation testing
                                  framework for performance optimization   ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   September 2024 . . . . . . . . . . . . . ??

Parallel Computing
Volume 122, Number ??, November, 2024

             Xingwang Huang and   
                    Min Xie and   
                    Dong An and   
                  Shubin Su and   
                Zongliang Zhang   Task scheduling in cloud computing based
                                  on grey wolf optimization with a new
                                  encoding mechanism . . . . . . . . . . . ??
             Adrian Schmitz and   
                Semih Burak and   
              Julian Miller and   
             Matthias S. Müller   Parallel Pattern Compiler for Automatic
                                  Global Optimizations . . . . . . . . . . ??
             Rahim Alizadeh and   
            Shahriar Bijani and   
                Fatemeh Shakeri   Distributed consensus-based estimation
                                  of the leading eigenvalue of a
                                  non-negative irreducible matrix  . . . . ??
               Fenglong Cai and   
                  Dong Yuan and   
                   Zhe Yang and   
                 Yonghui Xu and   
                     Wei He and   
                    Wei Guo and   
                     Lizhen Cui   FastPTM: Fast weights loading of
                                  pre-trained models for parallel
                                  inference service provisioning . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   November 2024  . . . . . . . . . . . . . ??

Parallel Computing
Volume 123, Number ??, March, 2025

               Xiaofeng Zou and   
                Yuanxi Peng and   
                     Tuo Li and   
               Lingjun Kong and   
                       Lu Zhang   Seesaw: a 4096-bit vector processor for
                                  accelerating Kyber based on RISC-V ISA
                                  extensions . . . . . . . . . . . . . . . ??
                 Zheng Miao and   
             Jon C. Calhoun and   
                        Rong Ge   Towards resilient and energy efficient
                                  scalable Krylov solvers  . . . . . . . . ??
          Kasia Świrydowicz and   
       Nicholson Koukpaizan and   
              Maksudul Alam and   
               Shaked Regev and   
           Michael Saunders and   
                   Slaven Peleš   Iterative methods in GPU-resident linear
                                  solvers for nonlinear constrained
                                  optimization . . . . . . . . . . . . . . ??
                  Xiran Gao and   
                    Li Chen and   
                 Haoyu Wang and   
                 Huimin Cui and   
                  Xiaobing Feng   Scalable tasking runtime with
                                  parallelized builders for explicit
                                  message passing architectures  . . . . . ??
             Henri Casanova and   
             Arnaud Giersch and   
             Arnaud Legrand and   
             Martin Quinson and   
                 Frédéric Suter   Lowering entry barriers to developing
                                  custom simulators of distributed
                                  applications and platforms with SimGrid  ??
              Jaroslav Olha and   
               Jana Hozzová and   
                Matej Antol and   
                 Jiří Filipovič   Estimating resource budgets to ensure
                                  autotuning efficiency  . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   March 2025 . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 124, Number ??, June, 2025

             Anshuman Misra and   
           Ajay D. Kshemkalyani   Byzantine-tolerant detection of
                                  causality: There is no holy grail  . . . ??
                Siyang Xing and   
                 Youmeng Li and   
                 Zikun Deng and   
                Qijun Zheng and   
                    Zeyu Lu and   
                   Qinglin Wang   Multi-level parallelism optimization for
                                  two-dimensional convolution
                                  vectorization method on multi-core
                                  vector accelerator . . . . . . . . . . . ??
                   Wei Qian and   
               Zhengwei Zhu and   
               Chenyang Zhu and   
                    Yanping Zhu   FPGA-based accelerator for YOLOv5 object
                                  detection with optimized computation and
                                  data access for edge deployment  . . . . ??
              Rupinder Kaur and   
             Gurjinder Kaur and   
             Major Singh Goraya   EESF: Energy-efficient scheduling
                                  framework for deadline-constrained
                                  workflows with computation speed
                                  estimation method in cloud . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   June 2025  . . . . . . . . . . . . . . . ??

Parallel Computing
Volume 125, Number ??, September, 2025

          Harish Padmanaban and   
          Nurkasym Arkabaev and   
            Maher Ali Rusho and   
            Vladyslav Kozub and   
                    Yurii Kozub   Using Java to create and analyze models
                                  of parallel computing systems  . . . . . ??
                  Yuyao Niu and   
                     Marc Casas   ALBBA: an efficient ALgebraic Bypass BFS
                                  Algorithm on long vector architectures   ??
                   Hui Zhao and   
                 Wentao Zhi and   
                 Xiaoqin Lu and   
                  Jing Wang and   
                    Nan Luo and   
                     Bo Wan and   
                      Quan Wang   Multi-workflow fault-tolerance
                                  scheduling strategy considering
                                  resources supply delay in WaaS platforms ??
                 Xiang Zhao and   
                  Haitao Du and   
                        Yi Kang   Enable cross-iteration parallelism for
                                  PIM-based graph processing with
                                  vertex-level synchronization . . . . . . ??
                   Ali Nada and   
           Hazem Ismail Ali and   
                  Liang Liu and   
                Yousra Alkabani   Software acceleration of multi-user MIMO
                                  uplink detection on GPU  . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   September 2025 . . . . . . . . . . . . . ??

Parallel Computing
Volume 126, Number ??, November, 2025

               Qingke Zhang and   
              Wenliang Chen and   
               Shuzhao Pang and   
                 Sichen Tao and   
                 Conglin Li and   
                        Xin Yin   GPU/CUDA-Accelerated gradient growth
                                  optimizer for efficient complex
                                  numerical global optimization  . . . . . ??
        Shiva Shankar Reddy and   
          Silpa Nrusimhadri and   
            Gadiraju Mahesh and   
Veeranki Venkata Rama Maheswara Rao   A dependency-aware task offloading in
                                  IoT-based edge computing system using an
                                  optimized deep learning approach . . . . ??
               Pedro Moreno and   
              Miguel Areias and   
                  Ricardo Rocha   A sleek lock-free hash map in an ERA of
                                  safe memory reclamation methods  . . . . ??
        Athanasios Margaris and   
              Stavros Souravlas   Detecting chaotic regions of recurrent
                                  equations in parallel environments . . . ??
          Motahhare Mirzaei and   
           Mehrdad Ashtiani and   
     Mohammad Javad Pirhadi and   
                Sauleh Eetemadi   LSHDP: Locally sharded heterogeneous
                                  data parallel for distributed deep
                                  learning . . . . . . . . . . . . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??
                      Anonymous   November 2025  . . . . . . . . . . . . . ??

Parallel Computing
Volume 127, Number ??, March, 2026

         S.-Kazem Shekofteh and   
              Daniel Bogacz and   
            Christian Alles and   
                 Holger Fröning   Butterfly factorization for vision
                                  transformers on multi-IPU systems  . . . ??
               Tomas Vondra and   
                    David Sebek   Benchmark of classical disk array and
                                  software-defined storage on
                                  near-identical hardware  . . . . . . . . ??
 Challa Muralikrishna Yadav and   
          B. Naresh Kumar Reddy   Machine learning-driven fault-tolerant
                                  core mapping in Network-on-Chip
                                  architectures for advanced computing
                                  networks . . . . . . . . . . . . . . . . ??
            Kazutomo Yoshii and   
              John R. Tramm and   
                Bryce Allen and   
              Tomohiro Ueno and   
               Kentaro Sano and   
              Andrew Siegel and   
                   Pete Beckman   A case study in hardware specialization
                                  for Monte Carlo cross-section lookup . . ??
             Sergej Breiter and   
           James D. Trotter and   
                 Karl Fürlinger   Cache partitioning for sparse
                                  matrix--vector multiplication on the
                                  A64FX  . . . . . . . . . . . . . . . . . ??
                 Yubiao Pan and   
                Ailing Tian and   
                  Huizhen Zhang   PROAD: Boosting Caffe Training via
                                  improving LevelDB I/O performance with
                                  Parallel Read, Out-of-Order
                                  Optimization, and Adaptive Design  . . . ??
           Sergey Malkovsky and   
            Aleksei Sorokin and   
                 Sergey Korolev   Analysis of the impact of NUMA node
                                  configuration on the performance of
                                  offloading computations to GPUs  . . . . ??
              Jan Laukemann and   
                Georg Hager and   
                Gerhard Wellein   Microarchitectural comparison, in-core
                                  modeling, and memory hierarchy analysis
                                  of state-of-the-art CPUs: Grace,
                                  Sapphire Rapids, and Genoa . . . . . . . ??
             Jiří Filipovič and   
Suren Harutyunyan Gevorgyan and   
              Eduardo César and   
                    Anna Sikora   Towards analysis and refinement of
                                  auto-tuning spaces . . . . . . . . . . . ??
              Yongxiang Cao and   
               Hongxu Jiang and   
              Guocheng Zhao and   
              Dongcheng Shi and   
               Runhua Zhang and   
                       Wei Wang   LSAF: A load-balancing SpGEMM
                                  acceleration framework with dynamic
                                  package and static partition for
                                  multi-core systolic arrays . . . . . . . ??
                Yizhuo Wang and   
                  Bowen Liu and   
                Senhao Shao and   
                Jianhua Gao and   
                 Weixing Ji and   
                    Hongbo Xing   HRPF: A parallel programming framework
                                  for recursive algorithms on
                                  heterogeneous CPU-GPU systems  . . . . . ??
                      Anonymous   Editorial Board  . . . . . . . . . . . . ??