Table of contents for issues of International Journal of Parallel Programming

Last update: Wed Oct 8 06:46:45 MDT 2025

International Journal of Parallel Programming
Volume 15, Number 1, February, 1986

                      Anonymous   Important announcement . . . . . . . . . 1--1
                      Anonymous   Editorial: a journal transformed . . . . 3--4
            Edward J. Krall and   
          Patrick F. McGehearty   A case study of parallel execution of a
                                  rule-based expert system . . . . . . . . 5--32
               Vaughan R. Pratt   Modeling concurrency with partial orders 33--71
                       S. Kasif   Control and data driven execution of
                                  logic programs: a comparison . . . . . . 73--99
                       Parallax   How are parallel systems invented? . . . 101--102

International Journal of Parallel Programming
Volume 15, Number 2, April, 1986

                  Paul R. Hudak   The Denotational Semantics of a
                                  Para-Functional Programming Language . . 103--125
                   Guang R. Gao   Maximum pipelining linear recurrence on
                                  static data flow computers . . . . . . . 127--149
        Donald M. Chiarulli and   
                Duncan A. Buell   Parallel microprogramming tools for a
                                  horizontally reconfigurable architecture 151--162
                     D. Nau and   
                  P. Purdom and   
                Chun-Hung Tzeng   Experiments on alternatives to minimax   163--183
                       Parallax   When is pull better than push? (parallel
                                  programming) . . . . . . . . . . . . . . 185--188

International Journal of Parallel Programming
Volume 15, Number 3, June, 1986

               Khayri A. M. Ali   OR-parallel execution of PROLOG on a
                                  multi-sequential machine . . . . . . . . 189--214
           Bharat Jayaraman and   
               Robert M. Keller   Primitives for resource management in a
                                  demand-driven reduction model  . . . . . 215--244
                  S. Taylor and   
                   S. Safra and   
                     E. Shapiro   A parallel implementation of Flat
                                  Concurrent Prolog  . . . . . . . . . . . 245--275
                       Parallax   The bards on parallel programming  . . . 277--277

International Journal of Parallel Programming
Volume 15, Number 4, August, 1986

                  Michael Wolfe   Loop skewing: The wavefront method
                                  revisited  . . . . . . . . . . . . . . . 279--293
          Eugene D. Brooks, III   The butterfly barrier (multiprocessing)  295--307
                Alan George and   
           Michael T. Heath and   
                 Joseph Liu and   
                      Esmond Ng   Solution of sparse positive definite
                                  systems on a shared-memory
                                  multiprocessor . . . . . . . . . . . . . 309--325
                 S. P. Rana and   
                  D. K. Banerji   An optimal distributed solution to the
                                  dining philosophers problem  . . . . . . 327--335
                      Anonymous   Hotspotting  . . . . . . . . . . . . . . 337--337

International Journal of Parallel Programming
Volume 15, Number 5, October, 1986

           Khayri A. M. Ali and   
                    Seif Haridi   Global garbage collection for
                                  distributed heap storage systems . . . . 339--387
                Hossam El-Gindy   An optimal speed-up parallel algorithm
                                  for triangulating simplicial point sets
                                  in space . . . . . . . . . . . . . . . . 389--398
                       Ed Merks   An Optimal Parallel Algorithm for
                                  Triangulating a Set of Points in the
                                  Plane  . . . . . . . . . . . . . . . . . 399--411
                 B. Grošelj and   
                     C. Tropper   Pseudosimulation: an algorithm for
                                  distributed simulation with limited
                                  memory . . . . . . . . . . . . . . . . . 413--456
                      Anonymous   The church of the least fixed point  . . 457--457

International Journal of Parallel Programming
Volume 15, Number 6, December, 1986

        Robert H. Halstead, Jr.   An Assessment of Multilisp --- Lessons
                                  from Experience  . . . . . . . . . . . . 459--501
              Eliezer Dekel and   
              Shietung Peng and   
           S. Sitharama Iyengar   Optimal parallel algorithms for
                                  constructing and maintaining a balanced
                                  m-way search tree  . . . . . . . . . . . 503--528
     Virgilio A. F. Almeida and   
              Lawrence W. Dowdy   Performance analysis of a scheme for
                                  concurrency/synchronization using
                                  queueing network models  . . . . . . . . 529--550
Venkatramana G. Ajjanagadde and   
                  L. M. Patnaik   Systolic Architecture for B-Spline
                                  Surfaces . . . . . . . . . . . . . . . . 551--565
                 Gary Lindstrom   Sans pareil: Referees  . . . . . . . . . 567--568

International Journal of Parallel Programming
Volume 16, Number 1, February, 1987

          Shlomit S. Pinter and   
                Yaron Wolfstahl   On mapping processes to processors in
                                  distributed systems  . . . . . . . . . . 1--15
     Kristine Stougaard Thomsen   Inheritance on processes, exemplified on
                                  distributed termination detection  . . . 17--52
             E. P. DeBenedictis   A Multiprocessor Using Protocol-Based
                                  Programming Primitives . . . . . . . . . 53--84
                      Anonymous   Amdahl's law . . . . . . . . . . . . . . 85--85

International Journal of Parallel Programming
Volume 16, Number 2, April, 1987

                 Ian Foster and   
                 Stephen Taylor   Flat Parlog: a basis for comparison  . . 87--125
                Henk Meijer and   
                   Selim G. Akl   Optimal computation of prefix sums on a
                                  binary tree of processors  . . . . . . . 127--136
              Michael Wolfe and   
                 Utpal Banerjee   Data dependence and its application to
                                  parallel processing  . . . . . . . . . . 137--178
                      Anonymous   Isomorphic Computers Inc.: With
                                  Isomorphic Computers, more is more™  . . 179--182

International Journal of Parallel Programming
Volume 16, Number 3, June, 1987

              Adolfo Guzman and   
            Edward J. Krall and   
      Patrick F. McGehearty and   
              Nader Bagherzadeh   Performance of symbolic applications on
                                  a parallel architecture  . . . . . . . . 183--214
        Richard M. Fujimoto and   
                 Hwa-chung Feng   A shared memory algorithm and proof for
                                  the generalized alternative construct in
                                  CSP  . . . . . . . . . . . . . . . . . . 215--241
               R. L. Wainwright   Deriving parallel computations from
                                  functional specifications: a seismic
                                  example on a hypercube . . . . . . . . . 243--260
                      Anonymous   Systolic processing  . . . . . . . . . . 261--261

International Journal of Parallel Programming
Volume 16, Number 4, August, 1987

             Nissim Francez and   
                    Shmuel Katz   Fairness and the axioms of control
                                  predicates . . . . . . . . . . . . . . . 263--278
                Frances E. Hunt   Experiments with applicative updating:
                                  practical results  . . . . . . . . . . . 279--303
                 E. Bradley and   
            R. H. Halstead, Jr.   Simulating logic circuits: a
                                  multiprocessor application . . . . . . . 305--338
                      Anonymous   Connectionism  . . . . . . . . . . . . . 339--339

International Journal of Parallel Programming
Volume 16, Number 5, October, 1987

                Ashok Samal and   
                  Tom Henderson   Parallel Consistent Labeling Algorithms  341--364
            Charles Koelbel and   
            Piyush Mehrotra and   
             John Van Rosendale   Semi-automatic process partitioning for
                                  parallel computation . . . . . . . . . . 365--382
                Michael G. Main   Trace, failure and testing equivalences
                                  for communicating processes  . . . . . . 383--400
                     A. Davison   Blackboard systems in Polka  . . . . . . 401--424
                      Anonymous   Fixpoints in Daily Life  . . . . . . . . 425--425

International Journal of Parallel Programming
Volume 16, Number 6, December, 1987

            John R. Gilbert and   
                 Earl Zmijewski   A parallel graph partitioning algorithm
                                  for a message-passing multiprocessor . . 427--449
           Pierpaolo Degano and   
               Sergio Marchetti   Partial ordering models for concurrency
                                  can be defined operationally . . . . . . 451--478
          V. Nageshwara Rao and   
                    Vipin Kumar   Parallel depth first search. Part I.
                                  Implementation . . . . . . . . . . . . . 479--499
                Vipin Kumar and   
              V. Nageshwara Rao   Parallel depth first search. Part II.
                                  Analysis . . . . . . . . . . . . . . . . 501--519
                 Gary Lindstrom   Sans pareil: Referees  . . . . . . . . . 521--522

International Journal of Parallel Programming
Volume 17, Number 1, February, 1988

              Debra Hensgen and   
             Raphael Finkel and   
                     Udi Manber   Two algorithms for barrier
                                  synchronization  . . . . . . . . . . . . 1--17
          Patrick Valduriez and   
              Setrag Khoshafian   Parallel evaluation of the transitive
                                  closure of a database relation . . . . . 19--42
        Stephen L. Stepoway and   
           Michael Christiansen   Parallel Rendering of Fractal Surfaces   43--58
                   P. A. Tinker   Performance of an OR-parallel logic
                                  programming system . . . . . . . . . . . 59--92
                 Gary Lindstrom   Sage commentary  . . . . . . . . . . . . 93--93

International Journal of Parallel Programming
Volume 17, Number 2, April, 1988

                Anoop Gupta and   
               Milind Tambe and   
                  Dirk Kalp and   
              Charles Forgy and   
                   Allen Newell   Parallel implementation of OPS5 on the
                                  Encore multiprocessor: results and
                                  analysis . . . . . . . . . . . . . . . . 95--124
                 John S. Conery   Binding environments for parallel logic
                                  programs in non-shared memory
                                  multiprocessors  . . . . . . . . . . . . 125--152
           Rance Cleaveland and   
             Prakash Panangaden   Type theory and concurrency  . . . . . . 153--206

International Journal of Parallel Programming
Volume 17, Number 3, June, 1988

                 Z. Somogyi and   
           K. Ramamohanarao and   
                     J. Vaghani   A backtracking algorithm for the stream
                                  AND-parallel execution of logic programs 207--257
      Elizabeth W. Edmiston and   
              Nolan G. Core and   
              Joel H. Saltz and   
                 Roger M. Smith   Parallel processing of biological
                                  sequence comparison algorithms . . . . . 259--275
            V. K. Janakiram and   
            E. F. Gehringer and   
              D. P. Agrawal and   
                    R. Mehrotra   A randomized parallel branch-and-bound
                                  algorithm  . . . . . . . . . . . . . . . 277--301

International Journal of Parallel Programming
Volume 17, Number 4, August, 1988

      Carla Schlatter Ellis and   
                Thomas J. Olson   Algorithms for parallel memory
                                  allocation . . . . . . . . . . . . . . . 303--345
        Mark T. Vandevoorde and   
                Eric S. Roberts   WorkCrews: an abstraction for
                                  controlling parallelism  . . . . . . . . 347--366

International Journal of Parallel Programming
Volume 17, Number 5, October, 1988

                James S. Miller   Implementing a Scheme-Based Parallel
                                  Processing System  . . . . . . . . . . . 367--402
                 G. Cybenko and   
                T. G. Allen and   
                   J. E. Polito   Practical Parallel Union-Find Algorithms
                                  for Transitive Closure and Clustering    403--423
              Benjamin Goldberg   Multiprocessor execution of functional
                                  programs . . . . . . . . . . . . . . . . 425--473

International Journal of Parallel Programming
Volume 17, Number 6, December, 1988

               Lionel M. Ni and   
                  Chung-Ta King   On partitioning and mapping for
                                  hypercube computing  . . . . . . . . . . 475--495
                   Jim Crammond   A Garbage Collection Algorithm for
                                  Shared Memory Parallel Processors  . . . 497--522
               Michael J. Swain   Comments on A. Samal and T. Henderson:
                                  "Parallel consistent labeling
                                  algorithms" [Internat. J. Parallel
                                  Programming 16 (1987), no. 5,
                                  341--364]  . . . . . . . . . . . . . . . 523--528
                 Gary Lindstrom   Sans pareil: referees  . . . . . . . . . 529--530

International Journal of Parallel Programming
Volume 18, Number 1, February, 1989

              Anne Neirynck and   
         Prakash Panangaden and   
                 Alan J. Demers   Effect analysis in higher-order
                                  languages  . . . . . . . . . . . . . . . 1--36
                Ran Ginosar and   
                    David Egozi   Topological comparison of perfect
                                  shuffle and hypercube  . . . . . . . . . 37--68
             David M. Nicol and   
              Joel H. Saltz and   
              James C. Townsend   Delay Point Schedules for Irregular
                                  Parallel Computations  . . . . . . . . . 69--90

International Journal of Parallel Programming
Volume 18, Number 2, April, 1989

              Kee-Hyun Park and   
              Lawrence W. Dowdy   Dynamic partitioning of multiprocessor
                                  systems  . . . . . . . . . . . . . . . . 91--120
       Alessandro Giacalone and   
             Prateek Mishra and   
                 Sanjiva Prasad   FACILE: a Symmetric Integration of
                                  Concurrent and Functional Programming    121--160

International Journal of Parallel Programming
Volume 18, Number 3, June, 1989

                Rajiv Gupta and   
                Charles R. Hill   A Scalable Implementation of Barrier
                                  Synchronization Using An Adaptive
                                  Combining Tree . . . . . . . . . . . . . 161--180
                     Ian Foster   A Multicomputer Garbage Collector for a
                                  Single Assignment Language . . . . . . . 181--203
                   Yi Xin Zhang   Parallel algorithms for minimal spanning
                                  trees of directed graphs . . . . . . . . 205--221
                  Xiaoqiu Huang   A space-efficient parallel sequence
                                  comparison algorithm for a
                                  message-passing multiprocessor . . . . . 223--239

International Journal of Parallel Programming
Volume 18, Number 4, August, 1989

             David Hemmendinger   Initializing memory shared by several
                                  processors . . . . . . . . . . . . . . . 241--253
            Gadi Taubenfeld and   
                Shmuel Katz and   
                   Shlomo Moran   Initial failures in distributed
                                  computations . . . . . . . . . . . . . . 255--276
                     Jason Gait   Speedup and optimality in pipeline
                                  programs . . . . . . . . . . . . . . . . 277--290
                G. A. Geist and   
                          E. Ng   Task scheduling for parallel sparse
                                  Cholesky factorization . . . . . . . . . 291--314

International Journal of Parallel Programming
Volume 18, Number 5, October, 1989

              Jeannette M. Wing   Verifying atomic data types  . . . . . . 315--357
               Selim G. Akl and   
                    Frank Dehne   Pipelined search on coarse grained
                                  networks . . . . . . . . . . . . . . . . 359--364
              Juanito Camilleri   An Operational Semantics for occam . . . 365--400 (or 149--167??)
           Arvind K. Bansal and   
               Leon S. Sterling   Transforming generate-and-test programs
                                  to execute under committed-choice
                                  AND-parallelism  . . . . . . . . . . . . 401--446

International Journal of Parallel Programming
Volume 18, Number 6, December, 1989

             Ambuj K. Singh and   
                  Ross Overbeek   Derivation of Efficient Parallel
                                  Programs: an Example From Genetic
                                  Sequence Analysis  . . . . . . . . . . . 447--484
      Frederick Springsteel and   
               Ivan Stojmenović   Parallel general prefix computations
                                  with geometric, algebraic, and other
                                  applications . . . . . . . . . . . . . . 485--503
              Woei-Kae Chen and   
   Matthias F. M. Stallmann and   
            Edward F. Gehringer   Hypercube embedding heuristics: an
                                  evaluation . . . . . . . . . . . . . . . 505--549
                 Gary Lindstrom   Sans pareil: Referees  . . . . . . . . . 551--552

International Journal of Parallel Programming
Volume 19, Number 1, February, 1990

               John H. Reif and   
                Scott A. Smolka   Data flow analysis of distributed
                                  communicating processes  . . . . . . . . 1--30
           Russell M. Clapp and   
            Trevor N. Mudge and   
               Donald C. Winsor   Cache Coherence Requirements for
                                  Interprocess Rendezvous  . . . . . . . . 31--51
                Rajiv Gupta and   
                Michael Epstein   High Speed Synchronization of Processors
                                  Using Fuzzy Barriers . . . . . . . . . . 53--73

International Journal of Parallel Programming
Volume 19, Number 2, April, 1990

            Duane A. Bailey and   
             Janice E. Cuny and   
                Craig P. Loomis   ParaGraph: Graph editor support for
                                  parallel programming environments  . . . 75--110
           Raymond Greenlaw and   
                Lawrence Snyder   Achieving speedups for APL on an SIMD
                                  distributed memory machine . . . . . . . 111--127
           Khayri A. M. Ali and   
                Roland Karlsson   The Muse Approach to OR-Parallel Prolog  129--162 (or 129--160??)

International Journal of Parallel Programming
Volume 19, Number 3, June, 1990

         Manuel E. Bermudez and   
       Richard Newman-Wolfe and   
              George Logothetis   Parallel Construction of SLR(1) and
                                  LALR(1) Parsers  . . . . . . . . . . . . 163--184
          Soumitra Sengupta and   
            Arthur J. Bernstein   Concurrency Control Optimizations in a
                                  Prolog Database  . . . . . . . . . . . . 185--211
                Frank Dehne and   
               Quoc T. Pham and   
               Ivan Stojmenović   Optimal Visibility Algorithms for Binary
                                  Images on the Hypercube  . . . . . . . . 213--224
           Boris D. Lubachevsky   Synchronization Barrier and Related
                                  Tools for Shared Memory Parallel
                                  Programming  . . . . . . . . . . . . . . 225--250

International Journal of Parallel Programming
Volume 19, Number 4, August, 1990

                 L. V. Kalé and   
             Vikram A. Saletore   Parallel State-Space Search for a First
                                  Solution with Consistent Linear Speedups 251--293
            Oscar H. Ibarra and   
               Michael A. Palis   An Efficient All-Parses Systolic
                                  Algorithm for General Context-free
                                  Parsing  . . . . . . . . . . . . . . . . 295--331
               Laurent Langlois   Systolic Parsing of Context-free
                                  Languages  . . . . . . . . . . . . . . . 333--355

International Journal of Parallel Programming
Volume 19, Number 5, October, 1990

          Carole M. McNamee and   
               Ronald A. Olsson   Transformations for optimizing
                                  interprocess communication and
                                  synchronization mechanisms . . . . . . . 357--387
                  Rok Sosic and   
          Richard F. Riesenfeld   Parallel Algorithms for Line Generation  389--404
          Douglas M. Blough and   
              Nader Bagherzadeh   Near-Optimal Message Routing and
                                  Broadcasting in Faulty Hypercubes  . . . 405--423

International Journal of Parallel Programming
Volume 19, Number 6, December, 1990

                        E. Tick   Execution Characteristics of Layered
                                  Streams  . . . . . . . . . . . . . . . . 425--443
           Khayri A. M. Ali and   
                Roland Karlsson   Full Prolog and Scheduling
                                  OR-Parallelism in Muse . . . . . . . . . 445--475
                Michael D. Rice   Semantics for Data Parallel Computation  477--509
                 Gary Lindstrom   Sans pareil: Referees  . . . . . . . . . 511--512

International Journal of Parallel Programming
Volume 20, Number 1, February, 1991

               Manfred Broy and   
               Thomas Streicher   Specification and Design of Shared
                                  Resource Arbitration . . . . . . . . . . 1--22
                 Paul Feautrier   Dataflow Analysis of Array and Scalar
                                  References . . . . . . . . . . . . . . . 23--53 (or 23--52??)
                   Mike Livesey   A Network Model of Barrier
                                  Synchronization Algorithms . . . . . . . 55--74

International Journal of Parallel Programming
Volume 20, Number 2, April, 1991

                    R. Mall and   
                  L. M. Patnaik   Formal Timing Analysis of Distributed
                                  Systems  . . . . . . . . . . . . . . . . 75--94
                   V. Singh and   
                   V. Kumar and   
                    G. Agha and   
                   C. Tomlinson   Efficient Algorithms for Parallel
                                  Sorting on Mesh Multicomputers . . . . . 95--131
               D. B. Skillicorn   Models for Practical Parallel
                                  Computation  . . . . . . . . . . . . . . 133--158

International Journal of Parallel Programming
Volume 20, Number 3, June, 1991

                     Kai Li and   
        Jeffrey F. Naughton and   
                 James S. Plank   An Efficient Checkpointing Method for
                                  Multicomputers with Wormhole Routing . . 159--180
          Carole M. McNamee and   
               Ronald A. Olsson   An Attribute Grammar Approach to
                                  Compiler Optimization of IntraModule
                                  Interprocess Communication . . . . . . . 181--202
               Gurdip Singh and   
            Arthur J. Bernstein   On the Relative Execution Times of
                                  Distributed Protocols  . . . . . . . . . 203--235
             Virginia M. Lo and   
          Sanjay Rajopadhye and   
                Samik Gupta and   
              David Keldsen and   
          Moataz A. Mohamed and   
              Bill Nitzberg and   
              Jan Arne Tell and   
                Xiaoxiong Zhong   OREGAMI: Tools for mapping parallel
                                  computations to parallel architectures   237--270

International Journal of Parallel Programming
Volume 20, Number 4, August, 1991

                 P. Adamson and   
                        E. Tick   Greedy Partitioned Algorithms for the
                                  Shortest-Path Problem  . . . . . . . . . 271--298
               Matthew Huntbach   Parallel Branch-and-Bound Search in
                                  Parlog . . . . . . . . . . . . . . . . . 299--314
                      Zheng Lin   A Distributed Fair Polling Scheme
                                  Applied to OR-Parallel Logic Programming 315--339

International Journal of Parallel Programming
Volume 20, Number 5, October, 1991

          Mohammad Ashraf Iqbal   Approximate Algorithms for Partitioning
                                  Problems . . . . . . . . . . . . . . . . 341--361
                 Calvin Lin and   
                Lawrence Snyder   A Portable Implementation of SIMPLE  . . 363--401
               Amitabha Das and   
            Louise E. Moser and   
            P. M. Melliar-Smith   A Parallel Sorting Algorithm for a Novel
                                  Model of Computation . . . . . . . . . . 403--419

International Journal of Parallel Programming
Volume 20, Number 6, December, 1991

           Andrzej Ciepielewski   Scheduling in OR-parallel Prolog
                                  systems: survey and open problems  . . . 421--451
         Steven Y. Susswein and   
        Thomas C. Henderson and   
          Joseph L. Zachary and   
               Chuck Hansen and   
                Paul Hinker and   
                Gary C. Marsden   Parallel Path Consistency  . . . . . . . 453--473
                Frank Dehne and   
                Russ Miller and   
             Andrew Rau-Chaplin   Optical Clustering on a Mesh-Connected
                                  Computer . . . . . . . . . . . . . . . . 475--486
                 Gary Lindstrom   Sans pareil: Referees  . . . . . . . . . 487--488

International Journal of Parallel Programming
Volume 21, Number 1, February, 1992

           Michael A. Palis and   
                David S. L. Wei   Parallel Parsing of Tree Adjoining
                                  Grammars on the Connection Machine . . . 1--38
              Stephen A. Schwab   Extended parallelism in the Gröbner basis
                                  algorithm  . . . . . . . . . . . . . . . 39--66
        Balkrishna Ramkumar and   
              Laxmikant V. Kalé   A Join Algorithm for Combining AND
                                  Parallel Solutions in AND/OR Parallel
                                  Systems  . . . . . . . . . . . . . . . . 67--107

International Journal of Parallel Programming
Volume 21, Number 2, April, 1992

               Dilip Sarkar and   
               Ivan Stojmenović   Parallel Algorithms for Separation of
                                  Two Sets of Points and Recognition of
                                  Digital Convex Polygons  . . . . . . . . 109--121
                  Xining Li and   
                John Cleary and   
                    Brian Unger   Virtual Time and Virtual Space . . . . . 123--150
           Michael A. Palis and   
                Sunil M. Shende   An NC Algorithm for Recognizing Tree
                                  Adjoining Languages  . . . . . . . . . . 151--167

International Journal of Parallel Programming
Volume 21, Number 3, June, 1992

                Rajiv Gupta and   
                      Sunah Lee   Exploiting Parallelism on a Fine-Grained
                                  MIMD Architecture Based Upon Channel
                                  Queues . . . . . . . . . . . . . . . . . 169--192
             Ling-Yu Chuang and   
                Vernon Rego and   
                  Aditya Mathur   An application of program unification to
                                  priority queue vectorization . . . . . . 193--224

International Journal of Parallel Programming
Volume 21, Number 4, August, 1992

            R. Govindarajan and   
                      S. Yu and   
               V. S. Lakshmanan   Attempting Guards in Parallel: a Data
                                  Flow Approach to Execute Generalized
                                  Guarded Commands . . . . . . . . . . . . 225--268
               Ouri Wolfson and   
              Weining Zhang and   
              Harish Butani and   
            Akira Kawaguchi and   
                        Kui Mok   Parallel Processing of Graph
                                  Reachability in Databases  . . . . . . . 269--302
                Alan P. Sprague   A Parallel Algorithm to Construct a
                                  Dominance Graph on Non-overlapping
                                  Rectangles . . . . . . . . . . . . . . . 303--312

International Journal of Parallel Programming
Volume 21, Number 5, October, 1992

                 Paul Feautrier   Some efficient solutions to the affine
                                  scheduling problem. I. One-dimensional
                                  time . . . . . . . . . . . . . . . . . . 313--347
                   W. Loots and   
                 T. H. C. Smith   A parallel algorithm for the 0-1
                                  knapsack problem . . . . . . . . . . . . 349--362
         Bradley K. Seevers and   
           Michael J. Quinn and   
              Philip J. Hatcher   A Parallel Programming Environment
                                  Supporting Multiple Data-Parallel
                                  Modules  . . . . . . . . . . . . . . . . 363--386

International Journal of Parallel Programming
Volume 21, Number 6, December, 1992

                      Anonymous   Important announcement to subscribers    387--387
                 Paul Feautrier   Some Efficient Solutions to the Affine
                                  Scheduling Problem. Part II.
                                  Multidimensional Time  . . . . . . . . . 389--420
                    Qi Ning and   
                   Guang R. Gao   Optimal Loop Storage Allocation for
                                  Argument-Fetching Dataflow Machines  . . 421--448
           Khayri A. M. Ali and   
                Roland Karlsson   Scheduling Speculative Work in MUSE and
                                  Performance Results  . . . . . . . . . . 449--476
                 Gary Lindstrom   Referees and Valedictory . . . . . . . . 477--479

International Journal of Parallel Programming
Volume 22, Number 1, February, 1994

                    Gordon Bell   Scalable, Parallel Computers:
                                  Alternatives, Issues, and Challenges . . 3--46
                 Jack B. Dennis   Machines and Models for Parallel
                                  Computing  . . . . . . . . . . . . . . . 47--77
                    Ken Kennedy   Compiler technology for
                                  machine-independent parallel programming 79--98
                  David J. Kuck   What Do Users of Parallel Computer
                                  Systems Really Need? . . . . . . . . . . 99--127

International Journal of Parallel Programming
Volume 22, Number 2, April, 1994

          Nicholas Carriero and   
                David Gelernter   Case studies in asynchronous data
                                  parallelism  . . . . . . . . . . . . . . 129--149
            William Y. Chen and   
            Scott A. Mahlke and   
            Nancy J. Warter and   
                 Sadun Anik and   
                 Wen-Mei W. Hwu   Profile-assisted instruction scheduling  151--181
                     Wei Li and   
                 Keshav Pingali   A singular loop transformation framework
                                  based on non-singular matrices . . . . . 183--205

International Journal of Parallel Programming
Volume 22, Number 3, June, 1994

                Wen-Mei Hwu and   
                   Alex Nicolau   From the Guest Editors . . . . . . . . . 207
            Walid A. Najjar and   
                  Lucas Roh and   
                 A. P. Wim Böhm   An Evaluation of Medium-Grain Dataflow
                                  Code . . . . . . . . . . . . . . . . . . 209--242
                 Gary Tyson and   
                Matthew Farrens   Code Scheduling for Multiple Instruction
                                  Stream Architectures . . . . . . . . . . 243--272
             M. Rajagopalan and   
                    V. H. Allan   Specification of Software Pipelining
                                  Using Petri Nets . . . . . . . . . . . . 273--301
             Mark R. Gilder and   
       Mukkai S. Krishnamoorthy   Automatic Source-Code Parallelization
                                  Using HICOR Objects  . . . . . . . . . . 303--350
                  Jian Wang and   
        Christine Eisenbeis and   
             Martin Jourdan and   
                      Bogong Su   Decomposed software pipelining: a new
                                  perspective and a new approach . . . . . 351--373

International Journal of Parallel Programming
Volume 22, Number 4, August, 1994

             Yosi Ben-Asher and   
                   Eitan Farchi   Using True Concurrency to Model
                                  Execution of Parallel Programs . . . . . 375--407
                 Feipei Lai and   
            Yung-kuang Chao and   
                Chia-Jung Hsieh   The Complementary Relationship of
                                  Interprocedural Register Allocation and
                                  Inlining . . . . . . . . . . . . . . . . 409--434
              M. K. Stojčev and   
          E. I. Milovanović and   
              I. Ž. Milovanović   An Optimal Scheduling Procedure for
                                  Matrix Inversion on Linear Array at a
                                  Processor Level  . . . . . . . . . . . . 435--448
           Michael L. Scott and   
         John M. Mellor-Crummey   Fast, Contention-Free Combining Tree
                                  Barriers for Shared-Memory
                                  Multiprocessors  . . . . . . . . . . . . 449--481

International Journal of Parallel Programming
Volume 22, Number 5, October, 1994

                 Utpal Banerjee   Editor's Introduction  . . . . . . . . . 483
               Larry Carter and   
            Jeanne Ferrante and   
                   Vasanth Bala   XDP: a Compiler Intermediate Language
                                  Extension for the Representation and
                                  Optimization of Data Movement  . . . . . 485--518
              Milind Girkar and   
Constantine D. Polychronopoulos   The Hierarchical Task Graph as a
                                  Universal Intermediate Representation    519--551
            Keith A. Faigin and   
     Stephen A. Weatherford and   
          Jay P. Hoeflinger and   
             David A. Padua and   
               Paul M. Petersen   The Polaris Internal Representation  . . 553--586

International Journal of Parallel Programming
Volume 22, Number 6, December, 1994

                    Jie Liu and   
         Vikram A. Saletore and   
                   Ted G. Lewis   Safe Self-Scheduling: a Parallel Loop
                                  Scheduling Scheme for Shared-Memory
                                  Multiprocessors  . . . . . . . . . . . . 589--616
               Theodore Johnson   Parallel-Access Memory Management Using
                                  Fast-Fits  . . . . . . . . . . . . . . . 617--648

International Journal of Parallel Programming
Volume 23, Number 1, February, 1995

              Shlomit S. Pinter   Introduction . . . . . . . . . . . . . . 3
          Nicholas Carriero and   
            David Gelernter and   
            Marc Jourdenais and   
                 David Kaminsky   Piranha Scheduling: Strategies and Their
                                  Implementation . . . . . . . . . . . . . 5--33
              Steven Novack and   
              Alexandru Nicolau   A Hierarchical Approach to
                                  Instruction-level Parallelization  . . . 35--62
             Dror E. Maydan and   
           John L. Hennessy and   
                  Monica S. Lam   Effectiveness of Data Dependence
                                  Analysis . . . . . . . . . . . . . . . . 63--81
            David Bernstein and   
   Mauricio Breternitz, Jr. and   
            Ahmed M. Gheith and   
                Bilha Mendelson   Solutions and Debugging for Data
                                  Consistency in Multiprocessors with
                                  Noncoherent Caches . . . . . . . . . . . 83--103

International Journal of Parallel Programming
Volume 23, Number 2, April, 1995

             David Abramson and   
                       A. McKay   Evaluating the Performance of a SISAL
                                  Implementation of the Abingdon Cross
                                  Image Processing Benchmark . . . . . . . 105--134
          Dror G. Feitelson and   
                  Larry Rudolph   Coscheduling Based on Runtime
                                  Identification of Activity Working Sets  135--160
               Wei-Ming Lin and   
                        Bo Yang   Probabilistic Performance Analysis for
                                  Parallel Search Techniques . . . . . . . 161--189
          Jean-François Collard   Automatic Parallelization of while-Loops
                                  Using Speculative Execution  . . . . . . 191--219

International Journal of Parallel Programming
Volume 23, Number 3, June, 1995

             Stephen Melvin and   
                      Yale Patt   Enhancing Instruction Scheduling with a
                                  Block-Structured ISA . . . . . . . . . . 221--243
               Heng-Yi Chao and   
                 Mary P. Harper   Minimizing Redundant Dependencies and
                                  Interprocessor Synchronizations  . . . . 245--262

International Journal of Parallel Programming
Volume 23, Number 4, August, 1995

          Elana D. Granston and   
            Thierry Montaut and   
                 François Bodin   Loop Transformations to Prevent False
                                  Sharing  . . . . . . . . . . . . . . . . 263--301
                Wayne Kelly and   
                   William Pugh   Using Affine Closure to Find Legal
                                  Reordering Transformations . . . . . . . 303--325
                Eric Stoltz and   
                  Michael Wolfe   Detecting Value-Based Scalar Dependence  327--358
               Yi-Qing Yang and   
            Corinne Ancourt and   
               François Irigoin   Minimal Data Dependence Abstractions for
                                  Loop Transformations: Extended Version   359--388

International Journal of Parallel Programming
Volume 23, Number 5, October, 1995

             Yosi Ben-Asher and   
              Gudula Runger and   
             Assaf Schuster and   
               Reinhard Wilhelm   2DT-FP: a Parallel Functional
                                  Programming Language on Two-Dimensional
                                  Data . . . . . . . . . . . . . . . . . . 389--422
          Elana D. Granston and   
        Alexander V. Veidenbaum   Combining Flow and Dependence Analyses
                                  to Expose Redundant Array Accesses . . . 423--470
              Martin Griebl and   
             Christian Lengauer   A Communication Scheme for the
                                  Distributed Execution of Loop Nests with
                                  while Loops  . . . . . . . . . . . . . . 471--496

International Journal of Parallel Programming
Volume 23, Number 6, December, 1995

            Mario Mango Furnari   Guest Editor's Introduction  . . . . . . 497
           Andrea Capitanio and   
          Alexandru Nicolau and   
                     Nikil Dutt   A Hypergraph-Based Model for Port
                                  Allocation on Multiple-Register-File
                                  VLIW Architectures . . . . . . . . . . . 499--513
             Eduard Ayguade and   
              Jesus Labarta and   
               Jordi Garcia and   
              Merce Girones and   
                   Mateo Valero   Analyzing Reference Patterns in
                                  Automatic Data Distribution Tools  . . . 515--535
       Lawrence Rauchwerger and   
             Nancy M. Amato and   
                 David A. Padua   A Scalable Method for Run-Time Loop
                                  Parallelization  . . . . . . . . . . . . 537--576

International Journal of Parallel Programming
Volume 24, Number 1, February, 1996

            Matthew Farrens and   
                    Wen-mei Hwu   Guest Editors' Introduction  . . . . . . 1
             B. Ramakrishna Rau   Iterative Modulo Scheduling  . . . . . . 3--64
         Michael Schlansker and   
              Vinod Kathail and   
                     Sadun Anik   Parallelization of Control Recurrences
                                  for ILP Processors . . . . . . . . . . . 65--102

International Journal of Parallel Programming
Volume 24, Number 2, April, 1996

  Alexandre E. Eichenberger and   
         Edward S. Davidson and   
             Santosh G. Abraham   Minimizing Register Requirements of a
                                  Modulo Schedule via Optimum Stage
                                  Scheduling . . . . . . . . . . . . . . . 103--132
              Po-Yung Chang and   
                   Eric Hao and   
                 Tse-Yu Yeh and   
                      Yale Patt   Branch Classification: a New Mechanism
                                  for Improving Branch Predictor
                                  Performance  . . . . . . . . . . . . . . 133--158
                 Gary Tyson and   
                Matthew Farrens   Evaluating the Effects of Predicated
                                  Execution on Branch Prediction . . . . . 159--186
            Thomas M. Conte and   
            Burzin A. Patel and   
         Kishore N. Menezes and   
                    J. Stan Cox   Hardware-Based Profiling: an Effective
                                  Technique for Profile-Driven
                                  Optimization . . . . . . . . . . . . . . 187--206

International Journal of Parallel Programming
Volume 24, Number 3, June, 1996

               Jean-Luc Gaudiot   Guest Editor's Introduction  . . . . . . 207
              Po-Yung Chang and   
                   Eric Hao and   
               Yale N. Patt and   
                 Pohua P. Chang   Using Predicated Execution to Improve
                                  the Performance of a Dynamically
                                  Scheduled Machine with Speculative
                                  Execution  . . . . . . . . . . . . . . . 209--234
          David H. Albonesi and   
                   Israel Koren   A Mean Value Analysis Multiprocessor
                                  Model Incorporating Super-scalar
                                  Processors and Latency Tolerating
                                  Techniques . . . . . . . . . . . . . . . 235--263
                 M. Cosnard and   
                         M. Loi   A Simple Algorithm for the Generation of
                                  Efficient Loop Structures  . . . . . . . 265--289

International Journal of Parallel Programming
Volume 24, Number 4, August, 1996

            Dean Engelhardt and   
              Andrew Wendelborn   A Partitioning-Independent Paradigm for
                                  Nested Data Parallelism  . . . . . . . . 291--317
          Herbert H. J. Hum and   
           Olivier Maquelin and   
          Kevin B. Theobald and   
                Xinmin Tian and   
               Guang R. Gao and   
              Laurie J. Hendren   A Study of the EARTH-MANNA Multithreaded
                                  System . . . . . . . . . . . . . . . . . 319--348
                Evan Torrie and   
         Margaret Martonosi and   
               Mary W. Hall and   
                 Chau-Wen Tseng   Memory Referencing Behavior in
                                  Compiler-Parallelized Applications . . . 349--376
            Thomas Sterling and   
            Daniel Savarese and   
             Phillip Merkey and   
                    Kevin Olson   An Empirical Evaluation of the Convex
                                  SPP-1000 Hierarchical Shared Memory
                                  System . . . . . . . . . . . . . . . . . 377--396

International Journal of Parallel Programming
Volume 24, Number 5, October, 1996

         Lesley R. Matheson and   
               Robert E. Tarjan   Parallelism in Multigrid Methods: How
                                  Much Is Too Much?  . . . . . . . . . . . 397--432
                  Kish Shen and   
         Manuel V. Hermenegildo   High-Level Characteristics of OR- and
                                  Independent AND-Parallelism in Prolog    433--478

International Journal of Parallel Programming
Volume 24, Number 6, December, 1996

            Rastislav Bodik and   
                    Rajiv Gupta   Array Data Flow Analysis for Load-Store
                                  Optimizations in Fine-Grain
                                  Architectures  . . . . . . . . . . . . . 481--512
        Beatrice Creusillet and   
               François Irigoin   Interprocedural Array Region Analyses    513--546
               Rakesh Ghiya and   
              Laurie J. Hendren   Connection Analysis: a Practical
                                  Interprocedural Heap Analysis for C  . . 547--578
                Wayne Kelly and   
               William Pugh and   
                Evan Rosser and   
              Tatiana Shpeisman   Transitive Closure of Infinite Graphs
                                  and its Applications . . . . . . . . . . 579--598
         Thomas J. Sheffler and   
           Robert Schreiber and   
               William Pugh and   
            John R. Gilbert and   
          Siddhartha Chatterjee   Efficient Distribution Analysis via
                                  Graph Contraction  . . . . . . . . . . . 599--620

International Journal of Parallel Programming
Volume 25, Number 1, February, 1997

                Frank Dehne and   
                  Siang W. Song   Randomized Parallel List Ranking for
                                  Distributed Memory Multi-processors  . . 1--16
       Christoph W. Kessler and   
                   Helmut Seidl   The Fork95 Parallel Programming
                                  Language: Design, Implementation,
                                  Application  . . . . . . . . . . . . . . 17--50

International Journal of Parallel Programming
Volume 25, Number 2, April, 1997

             Kemal Ebcioğlu and   
                    Wen-mei Hwu   Guest Editors' Introduction  . . . . . . 51
               Vasanth Bala and   
                   Norman Rubin   Efficient Instruction Scheduling Using
                                  Finite State Automata  . . . . . . . . . 53--82
            Thomas M. Conte and   
              Sumedh W. Sathaye   Optimization of VLIW Compatibility
                                  Systems Employing Dynamic Rescheduling   83--112
            Richard E. Hank and   
             Wen-mei W. Hwu and   
             B. Ramakrishna Rau   Region-Based Compilation: Introduction,
                                  Motivation, and Initial Experience . . . 113--146

International Journal of Parallel Programming
Volume 25, Number 3, June, 1997

         Michael Schlansker and   
                  Vinod Kathail   Techniques for Critical Path Reduction
                                  of Scalar Programs . . . . . . . . . . . 147--181
                Marco Fillo and   
         Stephen W. Keckler and   
           William J. Dally and   
         Nicholas P. Carter and   
               Andrew Chang and   
           Yevgeny Gurevich and   
                    Whay S. Lee   The M-Machine Multicomputer  . . . . . . 183--212
                 Gary Tyson and   
            Matthew Farrens and   
              John Matthews and   
             Andrew R. Pleszkun   Managing Data Caches Using Selective
                                  Cache Line Replacement . . . . . . . . . 213--242

International Journal of Parallel Programming
Volume 25, Number 4, August, 1997

            Walid A. Najjar and   
           Gabriel M. Silberman   Foreword to the Special Issues . . . . . 243
           Chris J. Newburn and   
                 John Paul Shen   Post-Pass Partitioning of Signal
                                  Processing Programs  . . . . . . . . . . 245--280
              Stephen Jenks and   
               Jean-Luc Gaudiot   Exploiting Locality and Tolerating
                                  Remote Memory Access Latency Using
                                  Thread Migration . . . . . . . . . . . . 281--304
          Laurie J. Hendren and   
                 Xinan Tang and   
               Yingchun Zhu and   
           Shereen Ghobrial and   
               Guang R. Gao and   
                    Xun Xue and   
                Haiying Cai and   
                 Pierre Ouellet   Compiling C for the EARTH Multithreaded
                                  Architecture . . . . . . . . . . . . . . 305--338

International Journal of Parallel Programming
Volume 25, Number 5, October, 1997

              Po-Yung Chang and   
               Marius Evers and   
                   Yale N. Patt   Improving Branch Prediction Accuracy by
                                  Reducing Pattern History Table
                                  Interference . . . . . . . . . . . . . . 339--362
            Stephan Jourdan and   
                Jared Stark and   
              Tse-Hao Hsing and   
                   Yale N. Patt   Recovery Requirements of Branch
                                  Prediction Storage Structures in the
                                  Presence of Mispredicted-Path Execution  363--383
             Lorenz Huelsbergen   Dynamic Resolution: a Runtime Technique
                                  for the Parallelization of Modifications
                                  to Directed Acyclic Graphs . . . . . . . 385--417
               Daeyeon Park and   
         Rafael H. Saavedra and   
                    Sungdo Moon   Adaptive Granularity: Transparent
                                  Integration of Fine- and Coarse-Grain
                                  Communication  . . . . . . . . . . . . . 419--446

International Journal of Parallel Programming
Volume 25, Number 6, December, 1997

                Alain Darte and   
                Frédéric Vivien   Optimal Fine and Medium Grain
                                  Parallelism Detection in Polyhedral
                                  Reduced Dependence Graphs  . . . . . . . 447--496
             Catherine Mongenet   Affine Dependence Classification for
                                  Communications Minimization  . . . . . . 497--524
           Vincent Loechner and   
                 Doran K. Wilde   Parameterized Polyhedra and Their
                                  Vertices . . . . . . . . . . . . . . . . 525--549

International Journal of Parallel Programming
Volume 26, Number 1, February, 1998

         Editorial Introduction   Editor's Announcement  . . . . . . . . . 1--2
                     David Sehr   Guest Editor's Introduction  . . . . . . 3--4
              Val Donaldson and   
                Jeanne Ferrante   Analyzing Asynchronous Pipeline
                                  Schedules  . . . . . . . . . . . . . . . 5--42
                Tito Autrey and   
                  Michael Wolfe   Initial Results for Glacial Variable
                                  Analysis . . . . . . . . . . . . . . . . 43--64
                 Ajita John and   
                James C. Browne   Compilation of constraint programs with
                                  noncyclic and cyclic dependencies to
                                  procedural parallel programs . . . . . . 65--119

International Journal of Parallel Programming
Volume 26, Number 2, April, 1998

                Josep Llosa and   
             Eduard Ayguadé and   
                   Mateo Valero   Quantitative evaluation of register
                                  pressure on software pipelined loops . . 121--142
          Ricardo Bianchini and   
         Enrique V. Carrera and   
        Leonidas Kontothanassis   Evaluating the effect of coherence
                                  protocols on the performance of parallel
                                  programming constructs . . . . . . . . . 143--181
                 John E. So and   
           Thomas J. Downar and   
      Raghunandan Janardhan and   
              Howard Jay Siegel   Mapping conjugate gradient algorithms
                                  for neutron diffusion applications onto
                                  SIMD, MIMD, and mixed-mode machines  . . 183--207

International Journal of Parallel Programming
Volume 26, Number 3, June, 1998

                Thomas Grün and   
              Thomas Rauber and   
                  Jochen Röhrig   Support for Efficient Programming on the
                                  SB-PRAM  . . . . . . . . . . . . . . . . 209--240
               Cindy Norris and   
                Lori L. Pollock   Experiences with cooperating register
                                  allocation and instruction scheduling    241--283
        Pierre-Yves Calland and   
                Alain Darte and   
                Yves Robert and   
                Frederic Vivien   On the Removal of Anti- and
                                  Output-Dependences . . . . . . . . . . . 285--312
             Erik R. Altman and   
                   Guang R. Gao   Optimal Modulo Scheduling Through
                                  Enumeration  . . . . . . . . . . . . . . 313--344

International Journal of Parallel Programming
Volume 26, Number 4, August, 1998

                Steve Beaty and   
                    Wen-mei Hwu   Foreword to the Special Issue  . . . . . 345--347
         Santosh G. Abraham and   
              Vinod Kathail and   
              Brian L. Deitrich   Meld Scheduling: a Technique for
                                  Relaxing Scheduling Constraints  . . . . 349--381
           Ashwini K. Nanda and   
             James O. Bondi and   
                 Simonjit Dutta   The Misprediction Recovery Cache . . . . 383--415
         John C. Gyllenhaal and   
             Wen-mei W. Hwu and   
             B. Ramakrishna Rau   Optimization of Machine Descriptions for
                                  Efficient Use  . . . . . . . . . . . . . 417--447
                   Eric Hao and   
              Po-Yung Chang and   
               Marius Evers and   
                   Yale N. Patt   Increasing the Instruction Fetch Rate
                                  via Block-Structured Instruction Set
                                  Architectures  . . . . . . . . . . . . . 449--478
            Michael E. Wolf and   
             Dror E. Maydan and   
                  Ding-Kai Chen   Combining Loop Transformations
                                  Considering Caches and Scheduling  . . . 479--503
           Mikko H. Lipasti and   
                 John Paul Shen   Exploiting Value Locality to Exceed the
                                  Dataflow Limit . . . . . . . . . . . . . 505--538

International Journal of Parallel Programming
Volume 26, Number 5, October, 1998

                 Zhiyuan Li and   
                  Pen-Chung Yew   Introduction . . . . . . . . . . . . . . 539--540
                Insung Park and   
               Michael Voss and   
            Brian Armstrong and   
               Rudolf Eigenmann   Parallel Programming and Performance
                                  Evaluation with the URSA Tool Family . . 541--561
                 Jaejin Lee and   
          Samuel P. Midkiff and   
                 David A. Padua   A Constant Propagation Algorithm for
                                  Explicitly Parallel Programs . . . . . . 563--589
                Hwansoo Han and   
             Chau-Wen Tseng and   
                   Pete Keleher   Eliminating Barrier Synchronization for
                                  Compiler-Parallelized Codes on Software
                                  DSMs . . . . . . . . . . . . . . . . . . 591--612
        John Mellor-Crummey and   
                    Vikram Adve   Simplifying Control Flow in
                                  Compiler-Generated Parallel Code . . . . 613--638

International Journal of Parallel Programming
Volume 26, Number 6, December, 1998

                 Zhiyuan Li and   
                  Pen-Chung Yew   Introduction . . . . . . . . . . . . . . 639--640
          Nicholas Mitchell and   
             Karin Högstedt and   
               Larry Carter and   
                Jeanne Ferrante   Quantifying the Multi-Level Nature of
                                  Tiling Interactions  . . . . . . . . . . 641--670
               Jingling Xue and   
               Chua-Huang Huang   Reuse-Driven Tiling for Improving Data
                                  Locality . . . . . . . . . . . . . . . . 671--696

International Journal of Parallel Programming
Volume 27, Number 1, February, 1999

             Jenn-Yuan Tsai and   
             Zhenzhen Jiang and   
                  Pen-Chung Yew   Compiler Techniques for the
                                  Superthreaded Architectures  . . . . . . 1--19
             Thomas Kistler and   
                  Michael Franz   A Tree-Based Alternative to Java
                                  Byte-Codes . . . . . . . . . . . . . . . 21--33
          Edward H. Gornish and   
           Alexander Veidenbaum   An Integrated Hardware/Software Data
                                  Prefetching Scheme for Shared-Memory
                                  Multiprocessors  . . . . . . . . . . . . 35--70

International Journal of Parallel Programming
Volume 27, Number 2, April, 1999

                     Kazuki Joe   Guest Editor's Introduction  . . . . . . 71--72
            Bret A. Marsolf and   
           Kyle A. Gallivan and   
           Harry A. G. Wijshoff   The Utilization of Matrix Structure to
                                  Generate Optimized Code from MATLAB
                                  Programs . . . . . . . . . . . . . . . . 73--96
             Atsushi Kubota and   
              Shogo Tatsumi and   
           Toshihiko Tanaka and   
           Masahiro Goshima and   
           Shin-ichiro Mori and   
          Hiroshi Nakashima and   
                  Shinji Tomita   A Technique to Eliminate Redundant
                                  Inter-Processor Communication on
                                  Parallelizing Compiler TINPAR  . . . . . 97--109
            Mariko Sasakura and   
                 Kazuki Joe and   
         Yoshitoshi Kunieda and   
                  Keijiro Araki   NaraView: an Interactive 3D
                                  Visualization System for Parallelization
                                  of Programs  . . . . . . . . . . . . . . 111--129

International Journal of Parallel Programming
Volume 27, Number 3, June, 1999

      Michael F. P. O'Boyle and   
        Peter M. W. Knijnenburg   Nonsingular Data Transformations:
                                  Definition, Validity, and Applications   131--159
              Avi Mendelson and   
               Michael Bekerman   Design Alternatives of Multithreaded
                                  Architecture . . . . . . . . . . . . . . 161--193
                    Min Tan and   
            Janet M. Siegel and   
              Howard Jay Siegel   Parallel Implementations of Block-Based
                                  Motion Vector Estimation for Video
                                  Compression on Four Parallel Processing
                                  Systems  . . . . . . . . . . . . . . . . 195--225

International Journal of Parallel Programming
Volume 27, Number 4, August, 1999

              Shlomit S. Pinter   Introduction . . . . . . . . . . . . . . 227--228
         Yiannakis Sazeides and   
                 James E. Smith   Limits of Data Value Predictability  . . 229--256
            Steven Phillips and   
                    Anne Rogers   Parallel Speech Recognition  . . . . . . 257--288
          Ragini Narasimhan and   
      Daniel J. Rosenkrantz and   
                     S. S. Ravi   Using Data Flow Information to Obtain
                                  Efficient Check Sets for Algorithm-Based
                                  Fault Tolerance  . . . . . . . . . . . . 289--323

International Journal of Parallel Programming
Volume 27, Number 5, October, 1999

               Thomas Conte and   
                Wen-Mei Hwu and   
                Mark Smotherman   Editors' Introduction  . . . . . . . . . 325--326
            Keith I. Farkas and   
                  Paul Chow and   
           Norman P. Jouppi and   
                Zvonko Vranesic   The Multicluster Architecture: Reducing
                                  Processor Cycle Time Through
                                  Partitioning . . . . . . . . . . . . . . 327--356
              Gary S. Tyson and   
                 Todd M. Austin   Memory Renaming: Fast, Early and
                                  Accurate Processing of Memory
                                  Communication  . . . . . . . . . . . . . 357--380
            David I. August and   
             Wen-mei W. Hwu and   
                Scott A. Mahlke   The Partial Reverse If-Conversion
                                  Framework for Balancing Control Flow and
                                  Predication  . . . . . . . . . . . . . . 381--423

International Journal of Parallel Programming
Volume 27, Number 6, December, 1999

               Thomas Conte and   
                Wen-mei Hwu and   
                Mark Smotherman   Editors' Introduction  . . . . . . . . . 425--426
           Andreas Moshovos and   
               Gurindar S. Sohi   Speculative Memory Cloaking and
                                  Bypassing  . . . . . . . . . . . . . . . 427--456
             Darko Kirovski and   
                Johnson Kin and   
      William H. Mangione-Smith   Procedure Based Program Compression  . . 457--475
                 Jack L. Lo and   
            Susan J. Eggers and   
              Henry M. Levy and   
            Sujay S. Parekh and   
                Dean M. Tullsen   Tuning Compiler Optimizations for
                                  Simultaneous Multithreading  . . . . . . 477--503

International Journal of Parallel Programming
Volume 28, Number 1, February, 2000

            R. Govindarajan and   
     N. S. S. Narasimha Rao and   
               E. R. Altman and   
                   Guang R. Gao   Enhanced Co-Scheduling: a Software
                                  Pipelining Method Using Modulo-Scheduled
                                  Pipeline Theory  . . . . . . . . . . . . 1--46
           Vincent Loechner and   
             Catherine Mongenet   Communication Optimization for Affine
                                  Recurrence Equations Using Broadcast and
                                  Locality . . . . . . . . . . . . . . . . 47--102
                Marc Daumas and   
           Paraskevas Evripidou   Parallel Implementations of the
                                  Selection Problem: a Case Study  . . . . 103--131

International Journal of Parallel Programming
Volume 28, Number 2, April, 2000

                      Anonymous   Guest Editor's Introduction  . . . . . . 133--134
           Kazuaki Ishizaki and   
            Hideaki Komatsu and   
                Toshio Nakatani   A Loop Transformation Algorithm for
                                  Communication Overlapping  . . . . . . . 135--154
            Naoshi Uchihira and   
              Hideji Kawata and   
                Fumitaka Tamura   Scenario-Based Hypersequential
                                  Programming  . . . . . . . . . . . . . . 155--157
            Hironori Nakajo and   
           Akihiro Ichikawa and   
                   Yukio Kaneda   A Distributed Shared-Memory System on a
                                  Workstation Cluster Using Fast Serial
                                  Links  . . . . . . . . . . . . . . . . . 179--194
               Hideki Saito and   
      Nicholas J. Stavrakos and   
Constantine D. Polychronopoulos and   
                         others   The Design of the PROMIS
                                  Compiler-Towards Multi-Level
                                  Parallelization  . . . . . . . . . . . . 195--212

International Journal of Parallel Programming
Volume 28, Number 3, June, 2000

              Denis Barthou and   
               Albert Cohen and   
          Jean-François Collard   Maximal Static Expansion . . . . . . . . 213--243
             David K. Lowenthal   Accurately Selecting Block Size at
                                  Runtime in Pipelined Parallel Programs   245--274
        Ramiro Varela Arias and   
      Camino Rodríguez Vela and   
      Jorge Puente Peinador and   
          Cesar Alonso Gonzalez   Parallel Logic Programming for Problem
                                  Solving  . . . . . . . . . . . . . . . . 275--319

International Journal of Parallel Programming
Volume 28, Number 4, August, 2000

                      Anonymous   Introduction . . . . . . . . . . . . . . 321--323
                Erven Rohou and   
             François Bodin and   
        Christine Eisenbeis and   
                   Andre Seznec   Handling Global Constraints in Compiler
                                  Strategy . . . . . . . . . . . . . . . . 325--345
              Andreas Krall and   
                 Sylvain Lelait   Compilation Techniques for Multimedia
                                  Processors . . . . . . . . . . . . . . . 347--361
                N. Sreraman and   
                R. Govindarajan   A Vectorizing Compiler for Multimedia
                                  Extensions . . . . . . . . . . . . . . . 363--400
             Henk Corporaal and   
              Johan Janssen and   
                  Marnix Arnold   Computation in the Context of Transport
                                  Triggered Architectures  . . . . . . . . 401--427

International Journal of Parallel Programming
Volume 28, Number 5, October, 2000

                      Anonymous   Introduction . . . . . . . . . . . . . . 429--430
               Wolfram Amme and   
                Peter Braun and   
         François Thomasset and   
             Eberhard Zehendner   Data Dependence Analysis of Assembly
                                  Code . . . . . . . . . . . . . . . . . . 431--467
            Fabien Quillere and   
          Sanjay Rajopadhye and   
                    Doran Wilde   Generation of Efficient Nested Loops
                                  from Polyhedra . . . . . . . . . . . . . 469--498
                Alain Darte and   
                Guillaume Huard   Loop Shifting for Loop Compaction  . . . 499--534

International Journal of Parallel Programming
Volume 28, Number 6, December, 2000

           Paraskevas Evripidou   Introduction . . . . . . . . . . . . . . 535--536
               Manish Gupta and   
         Sayak Mukhopadhyay and   
                    Navin Sinha   Automatic Parallelization of Recursive
                                  Procedures . . . . . . . . . . . . . . . 537--562
                Lori Carter and   
                 Beth Simon and   
                Brad Calder and   
               Larry Carter and   
                Jeanne Ferrante   Path Analysis and Renaming for
                                  Predicated Instruction Scheduling  . . . 563--588
                    Peng Wu and   
                    David Padua   Containers on the Parallelization of
                                  General-Purpose Java Programs  . . . . . 589--605
              Martin Griebl and   
             Paul Feautrier and   
             Christian Lengauer   Index Set Splitting  . . . . . . . . . . 607--631

International Journal of Parallel Programming
Volume 29, Number 1, February, 2001

                      Anonymous   Introduction . . . . . . . . . . . . . . 1--2
           Venkata Krishnan and   
                Josep Torrellas   The Need for Fast Communication in
                                  Hardware-Based Speculative Chip
                                  Multiprocessors  . . . . . . . . . . . . 3--33
             Pierre Michaud and   
               André Seznec and   
                Stéphan Jourdan   An Exploration of Instruction Fetch
                                  Requirement in Out-of-Order Superscalar
                                  Processors . . . . . . . . . . . . . . . 35--58
                Ramon Canal and   
      Joan-Manuel Parcerisa and   
               Antonio González   Dynamic Code Partitioning for Clustered
                                  Architectures  . . . . . . . . . . . . . 59--79
              Artur Klauser and   
             Srilatha Manne and   
                  Dirk Grunwald   Selective Branch Inversion: Confidence
                                  Estimation for Branch Predictors . . . . 81--110

International Journal of Parallel Programming
Volume 29, Number 2, April, 2001

             Matthew Arnold and   
              Michael Hsiao and   
              Ulrich Kremer and   
               Barbara G. Ryder   Exploring the Interaction between Java's
                                  Implicitly Thrown Exceptions and
                                  Instruction Scheduling . . . . . . . . . 111--137
      Dhruva R. Chakrabarti and   
            Prithviraj Banerjee   Static Single Assignment Form for
                                  Message-Passing Programs . . . . . . . . 139--184
          Jay P. Hoeflinger and   
              Yunheung Paek and   
                       Kwang Yi   Unified Interprocedural Parallelism
                                  Detection  . . . . . . . . . . . . . . . 185--215

International Journal of Parallel Programming
Volume 29, Number 3, June, 2001

        John Mellor-Crummey and   
              David Whalley and   
                    Ken Kennedy   Improving Memory Hierarchy Performance
                                  for Irregular Applications Using Data
                                  and Computation Reorderings  . . . . . . 217--247
  Dimitrios S. Nikolopoulos and   
      Theodore S. Papatheodorou   The Architectural and Operating System
                                  Implications on the Performance of
                                  Synchronization on ccNUMA
                                  Multiprocessors  . . . . . . . . . . . . 249--282
             Hongzhang Shan and   
            Jaswinder Pal Singh   A Comparison of MPI, SHMEM and
                                  Cache-Coherent Shared Address Space
                                  Programming Models on a Tightly-Coupled
                                  Multiprocessors  . . . . . . . . . . . . 283--318
        Induprakas Kodukula and   
                 Keshav Pingali   Data-Centric Transformations for
                                  Locality Enhancement . . . . . . . . . . 319--364

International Journal of Parallel Programming
Volume 29, Number 4, August, 2001

          Mayez Al-Mouhamed and   
              Hussam Abu-Haimed   Evaluation of Neural and Genetic
                                  Algorithms for Synthesizing Parallel
                                  Storage Schemes  . . . . . . . . . . . . 365--399
                Raju Pandey and   
                James C. Browne   Support for Implementation of
                                  Evolutionary Concurrent Systems  . . . . 401--431
            Isabelle Attali and   
              Denis Caromel and   
             Yung-Syau Chen and   
           Jean-Luc Gaudiot and   
           Andrew L. Wendelborn   Enhancing Functional and Irregular
                                  Parallelism: Stateful Functions and
                                  their Semantics  . . . . . . . . . . . . 433--460

International Journal of Parallel Programming
Volume 29, Number 5, October, 2001

                Alex Veidenbaum   Guest Editor's Introduction  . . . . . . 461--462
                    Ken Kennedy   Fast Greedy Weighted Fusion  . . . . . . 463--491
               Nawaaz Ahmed and   
             Nikolay Mateev and   
                 Keshav Pingali   Synthesizing Transformations for
                                  Locality Enhancement of
                                  Imperfectly-Nested Loop Nests  . . . . . 493--544
                   Vivek Sarkar   Optimized Unrolling of Nested Loops  . . 545--581

International Journal of Parallel Programming
Volume 29, Number 6, December, 2001

             Yosi Ben-Asher and   
               Dimitry Podvolny   Y-Invalidate: a New Protocol for
                                  Implementing Weak Consistency in DSM
                                  Systems  . . . . . . . . . . . . . . . . 583--606
                 Inbum Jung and   
             Jongwoong Hyun and   
                Joonwon Lee and   
                    Joongsoo Ma   Two-Phase Barrier: a Synchronization
                                  Primitive for Improving the Processor
                                  Utilization  . . . . . . . . . . . . . . 607--627

International Journal of Parallel Programming
Volume 30, Number 1, February, 2002

             Tracy D. Braun and   
               Renard Ulrey and   
     Anthony A. Maciejewski and   
              Howard Jay Siegel   Parallel Approaches for Singular Value
                                  Decomposition as Applied to Robotic
                                  Manipulator Jacobians  . . . . . . . . . 1--35
          Francisco Corbera and   
              Rafael Asenjo and   
                  Emilio Zapata   New Shape Analysis and Interprocedural
                                  Techniques for Automatic Parallelization
                                  of C Codes . . . . . . . . . . . . . . . 37--63

International Journal of Parallel Programming
Volume 30, Number 2, April, 2002

             Aart J. C. Bik and   
              Milind Girkar and   
               Paul M. Grey and   
                    Xinmin Tian   Automatic Intra-Register Vectorization
                                  for the Intel® Architecture  . . . . . . 65--98
        Jose M. Mantas Ruiz and   
        Julio Ortega Lopera and   
   Jose A. Carrillo de la Plata   Component-Based Derivation of a Parallel
                                  Stiff ODE Solver Implemented in a
                                  Cluster of Computers . . . . . . . . . . 99--148

International Journal of Parallel Programming
Volume 30, Number 3, June, 2002

             Dragan Milicev and   
                Zoran Jovanovic   Control Flow Regeneration for Software
                                  Pipelined Loops with Conditions  . . . . 149--179
                David Wonnacott   Achieving Scalable Locality with Time
                                  Skewing  . . . . . . . . . . . . . . . . 181--221

International Journal of Parallel Programming
Volume 30, Number 4, August, 2002

                Alex Veidenbaum   Guest Editor's Introduction  . . . . . . 223--224
  Dimitrios S. Nikolopoulos and   
             Eduard Ayguadé and   
Constantine D. Polychronopoulos   Runtime vs. Manual Data Distribution for
                                  Architecture-Agnostic Shared-Memory
                                  Programming Models . . . . . . . . . . . 225--255
           Pramod G. Joisha and   
          Samuel P. Midkiff and   
        Mauricio J. Serrano and   
                   Manish Gupta   Efficiently Adapting Java Binaries in
                                  Limited Memory Contexts  . . . . . . . . 257--289
               Arun Chauhan and   
                    Ken Kennedy   Reducing and Vectorizing Procedures for
                                  Telescoping Languages  . . . . . . . . . 291--315
           George S. Almasi and   
             Călin Caşcaval and   
           José G. Castaños and   
              Monty Denneau and   
                Wilm Donath and   
          Maria Eleftheriou and   
              Mark Giampapa and   
                  Howard Ho and   
               Derek Lieber and   
            José E. Moreira and   
               Dennis Newns and   
                  Marc Snir and   
           Henry S. Warren, Jr.   Demonstrating the Scalability of a
                                  Molecular Dynamics Application on a
                                  Petaflops Computer . . . . . . . . . . . 317--351

International Journal of Parallel Programming
Volume 30, Number 5, October, 2002

            Krishna M. Kavi and   
          Alireza Moshtaghi and   
                  Deng-jyi Chen   Modeling Multithreaded Applications
                                  Using Petri Nets . . . . . . . . . . . . 353--371
               Alex Ramirez and   
      Josep Ll. Larriba-Pey and   
             Carlos Navarro and   
               Mateo Valero and   
                Josep Torrellas   Software Trace Cache for Commercial
                                  Applications . . . . . . . . . . . . . . 373--395

International Journal of Parallel Programming
Volume 30, Number 6, December, 2002

               Ivan D. Baev and   
           Waleed M. Meleis and   
             Santosh G. Abraham   Backtracking-Based Instruction
                                  Scheduling to Fill Branch Delay Slots    397--418
               Paola Favati and   
               Grazia Lotti and   
             Ornella Menchi and   
               Francesco Romani   Railway Computation for Infinite Linear
                                  Systems  . . . . . . . . . . . . . . . . 419--439

International Journal of Parallel Programming
Volume 31, Number 1, February, 2003

                     Kazuki Joe   Guest Editor's Introduction  . . . . . . 1--2
          Siegfried Benkner and   
                  Viera Sipkova   Exploiting Distributed-Memory and
                                  Shared-Memory Parallelism on Clusters of
                                  SMPs with Data Parallel Programs . . . . 3--19
                Minsoo Jeon and   
                  Dongseung Kim   Parallel Merge Sort with Load Balancing  21--33
   J. Davison de St.Germain and   
                Alan Morris and   
           Steven G. Parker and   
            Allen D. Malony and   
                  Sameer Shende   Performance Analysis Integration in the
                                  Uintah Software Development Cycle  . . . 35--53
           Takeshi Iwashita and   
              Masaaki Shimasaki   Block Red-Black Ordering: a New Ordering
                                  Strategy for Parallelization of ICCG
                                  Method . . . . . . . . . . . . . . . . . 55--75

International Journal of Parallel Programming
Volume 31, Number 2, April, 2003

    Alfredo Cristobal-Salas and   
           Andrei Tchernykh and   
           Jean-Luc Gaudiot and   
                    Wen-Yen Lin   Non-Strict Execution in Parallel and
                                  Distributed Computing  . . . . . . . . . 77--105
             Patricio Bulić and   
                 Veselko Guštin   An Extended ANSI C for Processors with a
                                  Multimedia Extension . . . . . . . . . . 107--136
                 Zhijian Lu and   
                  John Lach and   
             Mircea R. Stan and   
                  Kevin Skadron   Alloyed Branch History: Combining Global
                                  and Local Branch History for Robust
                                  Performance  . . . . . . . . . . . . . . 137--177

International Journal of Parallel Programming
Volume 31, Number 3, June, 2003

                      Anonymous   Erratum  . . . . . . . . . . . . . . . . 179--179
                 Eduard Ayguade   Guest Editor's Introduction  . . . . . . 181--183
          Daisuke Takahashi and   
             Mitsuhisa Sato and   
                   Taisuke Boku   Performance Evaluation of the Hitachi
                                  SR8000 Using SPEC OMP2001 Benchmarks . . 185--196
               Hideki Saito and   
              Greg Gaertner and   
               Wesley Jones and   
           Rudolf Eigenmann and   
         Hidetoshi Iwashita and   
              Ron Lieberman and   
       Matthijs van Waveren and   
                  Brian Whitney   Large System Performance of SPEC OMP
                                  Benchmark Suites . . . . . . . . . . . . 197--209
            Hirofumi Nakano and   
          Kazuhisa Ishizaka and   
               Motoki Obata and   
               Keiji Kimura and   
              Hironori Kasahara   Static Coarse Grain Task Scheduling with
                                  Cache Optimization Using OpenMP  . . . . 211--223
              Seung-Jai Min and   
            Ayon Basumallik and   
               Rudolf Eigenmann   Optimizing OpenMP Programs on Software
                                  Distributed Shared Memory Systems  . . . 225--249

International Journal of Parallel Programming
Volume 31, Number 4, August, 2003

                Silvius Rus and   
       Lawrence Rauchwerger and   
                 Jay Hoeflinger   Hybrid Analysis: Static & Dynamic Memory
                                  Reference Analysis . . . . . . . . . . . 251--283
          Richard L. Graham and   
              Sung-Eun Choi and   
            David J. Daniel and   
             Nehal N. Desai and   
          Ronald G. Minnich and   
         Craig E. Rasmussen and   
           L. Dean Risinger and   
            Mitchel W. Sukalski   A Network-Failure-Tolerant
                                  Message-Passing System for Terascale
                                  Clusters . . . . . . . . . . . . . . . . 285--303
         Venkata K. Pingali and   
             Sally A. McKee and   
            Wilson C. Hsieh and   
                 John B. Carter   Restructuring Computations for Temporal
                                  Data Cache Locality  . . . . . . . . . . 305--338

International Journal of Parallel Programming
Volume 31, Number 5, October, 2003

               Han-Saem Yun and   
                 Jihong Kim and   
                  Soo-Mook Moon   Time Optimal Software Pipelining of
                                  Loops with Control Flows . . . . . . . . 339--391
                       Keqin Li   On the Performance of Randomized
                                  Embedding of Reproduction Trees in
                                  Static Networks  . . . . . . . . . . . . 393--406

International Journal of Parallel Programming
Volume 31, Number 6, December, 2003

                 Alex Orailoglu   Guest Editor's Introduction  . . . . . . 407--409
              Kubilay Atasu and   
                Laura Pozzi and   
                    Paolo Ienne   Automatic Application-Specific
                                  Instruction-Set Extensions Under
                                  Microarchitectural Constraints . . . . . 411--428
               Nathan Clark and   
              Hongtao Zhong and   
                Wilkin Tang and   
                   Scott Mahlke   Automatic Design of Application Specific
                                  Instruction Set Extensions Through
                                  Dataflow Graph Exploration . . . . . . . 429--449
              José L. Ayala and   
       Alexander Veidenbaum and   
           Marisa López-Vallejo   Power-Aware Compilation for Register
                                  File Energy Reduction  . . . . . . . . . 451--467
                G. Surendra and   
                S. Banerjee and   
                    S. K. Nandy   On the Effectiveness of Flow Aggregation
                                  in Improving Instruction Reuse in
                                  Network Processing Applications  . . . . 469--487
                 C. Kachris and   
               N. Bourbakis and   
                      A. Dollas   A Reconfigurable Logic-Based Processor
                                  for the SCAN Image and Video Encryption
                                  Algorithm  . . . . . . . . . . . . . . . 489--506

International Journal of Parallel Programming
Volume 32, Number 1, February, 2004

                    Lei Pan and   
                MingKin Lai and   
               Koji Noguchi and   
          Javid J. Huseynov and   
             Lubomir F. Bic and   
         Michael B. Dillencourt   Distributed Parallel Computing Using
                                  Navigational Programming . . . . . . . . 1--37
               Jongwook Woo and   
           Jean-Luc Gaudiot and   
           Andrew L. Wendelborn   Alias Analysis in Java with
                                  Reference-Set Representation for
                                  High-Performance Computing . . . . . . . 39--76

International Journal of Parallel Programming
Volume 32, Number 2, April, 2004

                N. P. Manoj and   
            K. V. Manjunath and   
                R. Govindarajan   CAS-DSM: a Compiler Assisted Software
                                  Distributed Shared Memory  . . . . . . . 77--122
              Mayez Al-Mouhamed   Array Organization in Parallel Memories  123--163

International Journal of Parallel Programming
Volume 32, Number 3, June, 2004

                 Utpal Banerjee   Guest Editor's Introduction  . . . . . . 165--166
                Jiuxing Liu and   
                Jiesheng Wu and   
           Dhabaleswar K. Panda   High Performance RDMA-Based MPI
                                  Implementation over InfiniBand . . . . . 167--198
              Daniel Ortega and   
               Mateo Valero and   
                 Eduard Ayguadé   Dynamic Memory Instruction Bypassing . . 199--224
                Ravi Rajwar and   
                 Alain Kägi and   
               James R. Goodman   Inferential Queueing and Speculative
                                  Push . . . . . . . . . . . . . . . . . . 225--258

International Journal of Parallel Programming
Volume 32, Number 4, August, 2004

                 Utpal Banerjee   Guest Editor's Introduction  . . . . . . 259--261
            Julita Corbalan and   
           Xavier Martorell and   
                  Jesus Labarta   Page Migration with Dynamic
                                  Space-Sharing Scheduling Policies: The
                                  Case of the SGI O2000  . . . . . . . . . 263--288
             Steven Carroll and   
   Constantine Polychronopoulos   A Framework for Incremental Extensible
                                  Compiler Construction  . . . . . . . . . 289--316
 Konstantinos Kyriakopoulos and   
              Kleanthis Psarris   Data Dependence Analysis Techniques for
                                  Increased Accuracy and Extracted
                                  Parallelism  . . . . . . . . . . . . . . 317--359

International Journal of Parallel Programming
Volume 32, Number 5, October, 2004

          Stavros Souravlas and   
              Manos Roumeliotis   A Pipeline Technique for Dynamic Data
                                  Transfer on a Multiprocessor Grid  . . . 361--388
             Hideya Iwasaki and   
                   Zhenjiang Hu   A New Parallel Skeleton for General
                                  Accumulative Computations  . . . . . . . 389--414
              H. Sarojadevi and   
                S. K. Nandy and   
                S. Balakrishnan   On the Correctness of Program Execution
                                  When Cache Coherence Is Maintained
                                  Locally at Data-Sharing Boundaries in
                                  Distributed Shared Memory
                                  Multiprocessors  . . . . . . . . . . . . 415--446

International Journal of Parallel Programming
Volume 32, Number 6, December, 2004

             Javier Zalamea and   
                Josep Llosa and   
             Eduard Ayguadé and   
                   Mateo Valero   Software and Hardware Techniques to
                                  Optimize Register File Utilization in
                                  VLIW Architectures . . . . . . . . . . . 447--474
           Virgil Palanciuc and   
                   Dragos Badea   A Spill Code Minimization
                                  Technique-Application in the Metrowerks
                                  StarCore C Compiler  . . . . . . . . . . 475--499
                Vijay Menon and   
                 Keshav Pingali   Look Left, Look Right, Look Left Again:
                                  an Application of Fractal Symbolic
                                  Analysis to Linear Algebra Code
                                  Restructuring  . . . . . . . . . . . . . 501--523

International Journal of Parallel Programming
Volume 33, Number 1, February, 2005

              Yonghong Song and   
                 Cheng Wang and   
                     Zhiyuan Li   A Polynomial-Time Algorithm for Memory
                                  Space Reduction  . . . . . . . . . . . . 1--33
         Eric Hung-Yu Tseng and   
               Jean-Luc Gaudiot   Automatic Array Partitioning Based on
                                  the Smith Normal Form  . . . . . . . . . 35--56
                       Mo Zeyao   Concatenation Algorithms for Parallel
                                  Numerical Simulation of Radiation
                                  Hydrodynamics coupled with Neutron
                                  Transport  . . . . . . . . . . . . . . . 57--71

International Journal of Parallel Programming
Volume 33, Number 2--3, June, 2005

               Frederica Darema   The Next Generation Software Program . . 73--79
            David I. August and   
               Sharad Malik and   
              Li-Shiuan Peh and   
                  Vijay Pai and   
        Manish Vachharajani and   
                  Paul Willmann   Achieving Structural and Composable
                                  Modeling of Complex Systems  . . . . . . 81--101
               Naveen Kumar and   
          Bruce R. Childers and   
            Daniel Williams and   
           Jack W. Davidson and   
                 Mary Lou Soffa   Compile-Time Planning for Overhead
                                  Reduction in Software Dynamic
                                  Translators  . . . . . . . . . . . . . . 103--114
        Shobana Padmanabhan and   
              Phillip Jones and   
         David V. Schuehler and   
          Scott J. Friedman and   
      Praveen Krishnamurthy and   
               Huakai Zhang and   
          Roger Chamberlain and   
              Ron K. Cytron and   
               Jason Fritts and   
               John W. Lockwood   Extracting and Improving
                                  Microarchitecture Performance on
                                  Reconfigurable Architectures . . . . . . 115--136
            Victor Eijkhout and   
              Erika Fuentes and   
              Thomas Eidson and   
                  Jack Dongarra   The Component Structure of a
                                  Self-Adapting Numerical Software System  137--143
             Douglas Gregor and   
               Jaakko Järvi and   
          Mayuresh Kulkarni and   
           Andrew Lumsdaine and   
               David Musser and   
                 Sibylle Schupp   Generic Programming and High-Performance
                                  Libraries  . . . . . . . . . . . . . . . 145--164
                Yoon-Ju Lee and   
             Pedro C. Diniz and   
               Mary W. Hall and   
                   Robert Lucas   Empirical Optimization for a Sparse
                                  Linear Solver: a Case Study  . . . . . . 165--181
              Gengbin Zheng and   
             Terry Wilmarth and   
     Praveen Jagadishprasad and   
              Laxmikant V. Kalé   Simulation-Based Performance Prediction
                                  for Large Parallel Machines  . . . . . . 183--207
                  F. Berman and   
                H. Casanova and   
                   A. Chien and   
                  K. Cooper and   
                    H. Dail and   
                A. Dasgupta and   
                    W. Deng and   
                J. Dongarra and   
                L. Johnsson and   
                 K. Kennedy and   
                 C. Koelbel and   
                     B. Liu and   
                     X. Liu and   
                  A. Mandal and   
                   G. Marin and   
                  M. Mazina and   
          J. Mellor-Crummey and   
                  C. Mendes and   
                A. Olugbile and   
                   M. Patel and   
                    D. Reed and   
                     Z. Shi and   
                 O. Sievert and   
                     H. Xia and   
                     A. YarKhan   New Grid Scheduling and Rescheduling
                                  Methods in the GrADS Project . . . . . . 209--229
           J. Eliot B. Moss and   
                Trek Palmer and   
           Timothy Richards and   
          Edward K. Walters and   
               Charles C. Weems   CISL: a Class-Based Machine Description
                                  Language for Co-Generation of Compilers
                                  and Simulators . . . . . . . . . . . . . 231--246

International Journal of Parallel Programming
Volume 33, Number 4, August, 2005

                  Ravi Iyer and   
                Jack Perdue and   
       Lawrence Rauchwerger and   
             Nancy M. Amato and   
                   Laxmi Bhuyan   An Experimental Evaluation of the HP
                                  V-Class and SGI Origin 2000
                                  Multiprocessors using Microbenchmarks
                                  and Scientific Applications  . . . . . . 307--350
                   Chao Lin and   
                 Jang-Ping Sheu   Efficient Broadcast in Heterogeneous
                                  Networks of Workstations Using Two
                                  Sub-Networks . . . . . . . . . . . . . . 351--391
           Sid-Ahmed-Ali Touati   Register Saturation in Instruction Level
                                  Parallelism  . . . . . . . . . . . . . . 393--449

International Journal of Parallel Programming
Volume 33, Number 5, October, 2005

           Jean-Luc Gaudiot and   
                 Siang Wun Song   Message from the Guest Editors . . . . . 451--452
            Rodolfo Azevedo and   
                Sandro Rigo and   
         Marcus Bartholomeu and   
               Guido Araujo and   
           Cristiano Araujo and   
                    Edna Barros   The ArchC Architecture Description
                                  Language and Tools . . . . . . . . . . . 453--484
          Debora R. Roberti and   
           Roberto P. Souto and   
    Haroldo F. Campos Velho and   
       Gervasio A. Degrazia and   
               Domenico Anfossi   Parallel Implementation of a Lagrangian
                                  Stochastic Model for Pollutant
                                  Dispersion . . . . . . . . . . . . . . . 485--498
   Edson Toshimi Midorikawa and   
       Helio Marci Oliveira and   
              Jean Marcos Laine   PEMPIs: a New Methodology for Modeling
                                  and Prediction of MPI Programs
                                  Performance  . . . . . . . . . . . . . . 499--527
                 Onur Mutlu and   
                Hyesoon Kim and   
         David N. Armstrong and   
                   Yale N. Patt   Using the First-Level Caches as Filters
                                  to Reduce the Pollution Caused by
                                  Speculative Memory References  . . . . . 529--559
                    Yue Luo and   
               Lizy K. John and   
                Lieven Eeckhout   SMA: a Self-Monitored Adaptive Cache
                                  Warm-Up Scheme for Microprocessor
                                  Simulation . . . . . . . . . . . . . . . 561--581

International Journal of Parallel Programming
Volume 33, Number 6, December, 2005

               Franco Fummi and   
                  Ian G. Harris   Editorial  . . . . . . . . . . . . . . . 583--584
                Mirko Loghi and   
           Tiziana Margaria and   
        Graziano Pravadelli and   
               Bernhard Steffen   Dynamic and Formal Verification of
                                  Embedded Systems: a Comparative Survey   585--611
         Jean-Pierre Talpin and   
            Paul Le Guernic and   
       Sandeep Kumar Shukla and   
                   Rajesh Gupta   A Compositional Behavioral Modeling
                                  Framework for Embedded System Design and
                                  Conformance Checking . . . . . . . . . . 613--643
              Alfred Koelbl and   
                    Carl Pixley   Constructing Efficient Formal Models
                                  from High-Level Descriptions Using
                                  Symbolic Simulation  . . . . . . . . . . 645--666
          Francesco Bruschi and   
          Fabrizio Ferrandi and   
               Donatella Sciuto   A Framework for the Functional
                                  Verification of SystemC Models . . . . . 667--695
               Iñigo Ugarte and   
                  Pablo Sanchez   Verification of Embedded Systems Based
                                  on Interval Analysis . . . . . . . . . . 697--720

International Journal of Parallel Programming
Volume 34, Number 1, March, 2006

              Ian G. Harris and   
                   Franco Fummi   Guest Editor's Introduction  . . . . . . 1--2
                    Xi Chen and   
                Harry Hsieh and   
                 Felice Balarin   Verification Approach of Metropolis
                                  Design Framework for Embedded Systems    3--27
                 Samar Abdi and   
                  Daniel Gajski   Verification of System Level Model
                                  Transformations  . . . . . . . . . . . . 29--59
               David Currie and   
               Xiushan Feng and   
            Masahiro Fujita and   
                 Alan J. Hu and   
                  Mark Kwan and   
                Sreeranga Rajan   Embedded Software Verification Using
                                  Symbolic Execution and Uninterpreted
                                  Functions  . . . . . . . . . . . . . . . 61--91
            Ernesto Sánchez and   
        Matteo Sonza Reorda and   
             Giovanni Squillero   Efficient Techniques for Automatic
                                  Verification-Oriented Test Set
                                  Optimization . . . . . . . . . . . . . . 93--109

International Journal of Parallel Programming
Volume 34, Number 2, April, 2006

            Bilha Mendelson and   
          Shlomit S. Pinter and   
                      Ayal Zaks   Introduction . . . . . . . . . . . . . . 111--112
             Michael Factor and   
             Assaf Schuster and   
              Konstantin Shagin   A Platform-Independent Distributed
                                  Runtime for Standard Multithreaded Java  113--142
           Gregory Chockler and   
                  Dahlia Malkhi   Light-Weight Leases for Storage-Centric
                                  Coordination . . . . . . . . . . . . . . 143--170
          Alexander Gendler and   
              Avi Mendelson and   
                   Yitzhak Birk   A PAB-Based Multi-Prefetcher Mechanism   171--188

International Journal of Parallel Programming
Volume 34, Number 3, June, 2006

             Chris Jesshope and   
                Alex Shafarenko   Special issue on Micro-grids --- Guest
                                  Editor Introduction  . . . . . . . . . . 189--192
            Carmen Martínez and   
            Enrique Vallejo and   
              Ramón Beivide and   
                   Cruz Izu and   
                  Miquel Moretó   Dense Gaussian Networks: Suitable
                                  Topologies for On-Chip Multiprocessors   193--211
             Pedro Trancoso and   
       Paraskevas Evripidou and   
           Kyriakos Stavrou and   
                Costas Kyriacou   A Case for Chip Multiprocessors Based on
                                  the Data-Driven Multithreading Model . . 213--235
      Asadollah Shahbahrami and   
               Ben Juurlink and   
              Demid Borodin and   
           Stamatis Vassiliadis   Avoiding Conversion and Rearrangement
                                  Overhead in SIMD Architectures . . . . . 237--260
             Sylvain Girbal and   
          Nicolas Vasilache and   
             Cédric Bastoul and   
               Albert Cohen and   
              David Parello and   
                Marc Sigler and   
                  Olivier Temam   Semi-Automatic Composition of Loop
                                  Transformations for Deep Parallelism and
                                  Memory Hierarchies . . . . . . . . . . . 261--317

International Journal of Parallel Programming
Volume 34, Number 4, August, 2006

             Chris Jesshope and   
                Alex Shafarenko   Guest Editor's Introduction <Part 2> . . 319--322
           Gajinder Panesar and   
              Daniel Towner and   
              Andrew Duller and   
                  Alan Gray and   
                   Will Robbins   Deterministic Parallel Processing  . . . 323--341
                   Ian Bell and   
             Nabil Hasasneh and   
                 Chris Jesshope   Supporting Microthread Scheduling and
                                  Synchronisation in CMPs  . . . . . . . . 343--381
             Clemens Grelck and   
               Sven-Bodo Scholz   SAC --- a Functional Array Language for
                                  Efficient Multi-threaded Execution . . . 383--427

International Journal of Parallel Programming
Volume 34, Number 5, October, 2006

       Paraskevas Evripidou and   
                 George Samaras   Metacomputing with Mobile Agents . . . . 429--458
                 Paul Feautrier   Scalable and Structured Scheduling . . . 459--487

International Journal of Parallel Programming
Volume 34, Number 6, December, 2006

                  A. Aiello and   
           M. Mango Furnari and   
              A. Massarotti and   
                  S. Brandi and   
                  V. Caputo and   
                      V. Barone   An Experimental Ontology Server for an
                                  Information Grid Environment . . . . . . 489--508
               Ales Holobar and   
            Milan Ojstersek and   
                  Damjan Zazula   Distributed Jacobi Joint Diagonalization
                                  on Clusters of Personal Computers  . . . 509--530

International Journal of Parallel Programming
Volume 35, Number 1, February, 2007

                 Rajani Pai and   
                R. Govindarajan   FEADS: a Framework for Exploring the
                                  Application Design Space on Network
                                  Processors . . . . . . . . . . . . . . . 1--31
                Ender Özcan and   
                Esin Onbasioglu   Memetic Algorithms for Parallel Code
                                  Optimization . . . . . . . . . . . . . . 33--61
              Chunhui Zhang and   
                   Fadi Kurdahi   Reducing Off-Chip Memory Access via
                                  Stream-Conscious Tiling on Multimedia
                                  Applications . . . . . . . . . . . . . . 63--98

International Journal of Parallel Programming
Volume 35, Number 2, April, 2007

                  Tony Givargis   Special Issue On Embedded Processors ---
                                  Guest Editor Introduction  . . . . . . . 99--100
              JoAnn M. Paul and   
                 Brett H. Meyer   Amdahl's Law Revisited for Single Chip
                                  Systems  . . . . . . . . . . . . . . . . 101--123
            Sorin Manolache and   
                 Petru Eles and   
                      Zebo Peng   Fault-aware Communication Mapping for
                                  NoCs with Guaranteed Latency . . . . . . 125--156
               Peter Petrov and   
                 Alex Orailoglu   Dynamic Tag Reduction for Low-Power
                                  Caches in Embedded Systems with Virtual
                                  Memory . . . . . . . . . . . . . . . . . 157--177

International Journal of Parallel Programming
Volume 35, Number 3, June, 2007

                 Sally A. McKee   Guest Editor's Introduction  . . . . . . 179--180
            José E. Moreira and   
         Valentina Salapura and   
              George Almasi and   
             Charles Archer and   
           Ralph Bellofatto and   
              Peter Bergner and   
             Randy Bickford and   
           Mathias Blumrich and   
         José R. Brunheroto and   
           Arthur A. Bright and   
            Michael Brutman and   
           José G. Castaños and   
                  Dong Chen and   
                Paul Coteus and   
               Paul Crumley and   
                  Sam Ellis and   
         Thomas Engelsiepen and   
                  Alan Gara and   
              Mark Giampapa and   
                Tom Gooding and   
                 Shawn Hall and   
             Ruud A. Haring and   
               Roger Haskin and   
        Philip Heidelberger and   
              Dirk Hoenicke and   
               Todd Inglett and   
         Gerrard V. Kopcsay and   
               Derek Lieber and   
              David Limpert and   
               Pat McCarthy and   
              Mark Megerian and   
                 Mike Mundy and   
             Martin Ohmacht and   
                Jeff Parker and   
               Rick A. Rand and   
                   Don Reed and   
             Ramendra Sahoo and   
              Alda Sanomiya and   
               Richard Shok and   
                Brian Smith and   
          Gordon G. Stewart and   
                Todd Takken and   
              Pavlos Vranas and   
           Brian Wallenfelt and   
          Michael Blocksome and   
                  Joe Ratterman   The Blue Gene/L Supercomputer: a
                                  Hardware and Software Story  . . . . . . 181--206
             Gregory L. Lee and   
              Martin Schulz and   
                Dong H. Ahn and   
              Andrew Bernat and   
      Bronis R. de Supinski and   
               Steven Y. Ko and   
                 Barry Rountree   Dynamic Binary Instrumentation and Data
                                  Aggregation on Large Scale Systems . . . 207--232
               Michael Gschwind   The Cell Broadband Engine: Exploiting
                                  Multiple Levels of Parallelism in a Chip
                                  Multiprocessor . . . . . . . . . . . . . 233--262
            Samuel Williams and   
                 John Shalf and   
              Leonid Oliker and   
               Shoaib Kamil and   
             Parry Husbands and   
               Katherine Yelick   Scientific Computing Kernels on the Cell
                                  Processor  . . . . . . . . . . . . . . . 263--298
               James Laudon and   
             Lawrence Spracklen   The Coming Wave of Multithreaded Chip
                                  Multiprocessors  . . . . . . . . . . . . 299--330

International Journal of Parallel Programming
Volume 35, Number 4, August, 2007

             Eduard Ayguadé and   
            Matthias S. Mueller   Special Issue on OpenMP --- Guest
                                  Editors' Introduction  . . . . . . . . . 331--333
           Greg Bronevetsky and   
          Bronis R. de Supinski   Complete Formal Specification of the
                                  OpenMP Memory Model  . . . . . . . . . . 335--392
            Alejandro Duran and   
               Roger Ferrer and   
            Juan José Costa and   
              Marc Gonzàlez and   
           Xavier Martorell and   
             Eduard Ayguadé and   
                  Jesús Labarta   A Proposal for Error Handling in OpenMP  393--416
                Alan Morris and   
            Allen D. Malony and   
               Sameer S. Shende   Supporting Nested OpenMP Parallelism in
                                  the TAU Performance System . . . . . . . 417--436

International Journal of Parallel Programming
Volume 35, Number 5, October, 2007

             Eduard Ayguadé and   
            Matthias S. Mueller   Introduction . . . . . . . . . . . . . . 437--439
              Russell Brown and   
                  Ilya Sharapov   High-Scalability Parallelization of a
                                  Molecular Modeling Application:
                                  Performance and Productivity Comparison
                                  Between OpenMP and MPI Implementations   441--458
              Dieter an Mey and   
             Samuel Sarholz and   
             Christian Terboven   Nested Parallelization with OpenMP . . . 459--476
              Markus Nordén and   
                 Henrik Löf and   
           Jarmo Rantakokko and   
               Sverker Holmgren   Dynamic Data Migration for Structured
                                  AMR Solvers  . . . . . . . . . . . . . . 477--491
           Tien-Hsiung Weng and   
            Ruey-Kuen Perng and   
                Barbara Chapman   OpenMP Implementation of SPICE3 Circuit
                                  Simulator  . . . . . . . . . . . . . . . 493--505

International Journal of Parallel Programming
Volume 35, Number 6, December, 2007

               Anup Gangwar and   
            M. Balakrishnan and   
        Preeti Ranjan Panda and   
                   Anshul Kumar   Evaluation of Bus Based Interconnect
                                  Mechanisms in Clustered VLIW
                                  Architectures  . . . . . . . . . . . . . 507--527
                 Issam W. Damaj   Parallel Algorithms Development for
                                  Programmable Devices with Application
                                  from Cryptography  . . . . . . . . . . . 529--572
             Laurent Baduel and   
            Françoise Baude and   
                  Denis Caromel   Asynchronous Typed Object Groups for
                                  Grid Programming . . . . . . . . . . . . 573--614
                Kento Emoto and   
               Zhenjiang Hu and   
            Kazuhiko Kakehi and   
                Masato Takeichi   A Compositional Framework for Developing
                                  Parallel Programs on Two-Dimensional
                                  Arrays . . . . . . . . . . . . . . . . . 615--658

International Journal of Parallel Programming
Volume 36, Number 1, February, 2008

            Preeti Ranjan Panda   Guest Editor Introduction: Special Issue
                                  on Multiprocessor-based Embedded Systems 1--2
           Martino Ruggiero and   
             Alessio Guerri and   
            Davide Bertozzi and   
             Michela Milano and   
                    Luca Benini   A Fast and Accurate Technique for
                                  Mapping Parallel Applications on
                                  Stream-Oriented MPSoC Platforms with
                                  Communication Awareness  . . . . . . . . 3--36
                 Traian Pop and   
                   Paul Pop and   
                 Petru Eles and   
                      Zebo Peng   Analysis and Optimisation of
                                  Hierarchically Scheduled Multiprocessor
                                  Embedded Systems . . . . . . . . . . . . 37--67
                Lobna Kriaa and   
            Aimen Bouchhima and   
              Marius Gligor and   
       Anne-Marie Fouillart and   
            Fréderic Pétrot and   
            Ahmed-Amine Jerraya   Parallel Programming of Multi-processor
                                  SoC: a HW--SW Interface Perspective  . . 68--92
               Ilya Issenin and   
                     Nikil Dutt   Using FORAY Models to Enable MPSoC
                                  Memory Optimizations . . . . . . . . . . 93--113
Mohammad Abdullah Al Faruque and   
                    Jörg Henkel   QoS-supported On-chip Communication for
                                  Multi-processors . . . . . . . . . . . . 114--139
              Seng Lin Shee and   
               Andrea Erdos and   
               Sri Parameswaran   Architectural Exploration of
                                  Heterogeneous Multiprocessor Systems for
                                  JPEG . . . . . . . . . . . . . . . . . . 140--162

International Journal of Parallel Programming
Volume 36, Number 2, April, 2008

        Alberto F. De Souza and   
                 Rajkumar Buyya   Introduction to the Special Issue on the
                                  18th International Symposium on Computer
                                  Architecture and High Performance
                                  Computing  . . . . . . . . . . . . . . . 163--165
               Fredrik Warg and   
                  Per Stenstrom   Dual-thread Speculation: a Simple
                                  Approach to Uncover Thread-level
                                  Parallelism on a Simultaneous
                                  Multithreaded Processor  . . . . . . . . 166--183
            Peter A. Rounce and   
            Alberto F. De Souza   Dynamic Instruction Scheduling in a
                                  Trace-based Multi-threaded Architecture  184--205
        Wessam M. Hassanein and   
           Layali K. Rashid and   
             Moustafa A. Hammad   Analyzing the Effects of Hyperthreading
                                  on the Performance of Data Management
                                  Systems  . . . . . . . . . . . . . . . . 206--225
        Renata Braga Araújo and   
Guilherme Henrique Trielli Ferreira and   
     Gustavo Henrique Orair and   
               Wagner Meira and   
Renato Antônio Celso Ferreira and   
 Dorgival Olavo Guedes Neto and   
           Mohammed Javeed Zaki   The ParTriCluster Algorithm for Gene
                                  Expression Analysis  . . . . . . . . . . 226--249
             George Teodoro and   
              Tulio Tavares and   
            Renato Ferreira and   
                Tahsin Kurc and   
               Wagner Meira and   
            Dorgival Guedes and   
                   Tony Pan and   
                     Joel Saltz   A Run-time System for Efficient
                                  Execution of Scientific Workflows on
                                  Distributed Environments . . . . . . . . 250--266
             Gabriel H. Loh and   
              Daniel A. Jiménez   Modulo Path History for the Reduction of
                                  Pipeline Overheads in Path-based Neural
                                  Branch Predictors  . . . . . . . . . . . 267--286

International Journal of Parallel Programming
Volume 36, Number 3, June, 2008

               Guang R. Gao and   
             Mitsuhisa Sato and   
                 Eduard Ayguadé   Guest Editors Introduction: Special
                                  Issue on OpenMP  . . . . . . . . . . . . 287--288
              Kevin O'Brien and   
            Kathryn O'Brien and   
                 Zehra Sura and   
                  Tong Chen and   
                      Tao Zhang   Supporting OpenMP on Cell  . . . . . . . 289--311
               Haoqiang Jin and   
            Barbara Chapman and   
                  Lei Huang and   
              Dieter an Mey and   
              Thomas Reichstein   Performance Evaluation of a Multi-Zone
                                  Application in Different OpenMP
                                  Approaches . . . . . . . . . . . . . . . 312--325
          Milos Milovanović and   
               Roger Ferrer and   
           Vladimir Gajinov and   
             Osman S. Unsal and   
             Adrian Cristal and   
             Eduard Ayguadé and   
                   Mateo Valero   Nebelung: Execution Environment for
                                  Transactional OpenMP . . . . . . . . . . 326--346
                    Jie Tao and   
               Marcel Kunze and   
               Fabian Nowak and   
              Rainer Buchty and   
                  Wolfgang Karl   Performance Advantage of Reconfigurable
                                  Cache Design on Multicore Processor
                                  Systems  . . . . . . . . . . . . . . . . 347--360

International Journal of Parallel Programming
Volume 36, Number 4, August, 2008

               Dongsoo Kang and   
                   Chen Liu and   
               Jean-Luc Gaudiot   The Impact of Speculative Execution on
                                  SMT Processors . . . . . . . . . . . . . 361--385
               K. Subramani and   
             Kiran Yellajyosula   On the Design and Implementation of a
                                  Shared Memory Dispatcher for Partially
                                  Clairvoyant Schedulers . . . . . . . . . 386--411
   Mariana Luderitz Kolberg and   
     Luiz Gustavo Fernandes and   
        Dalcidio Moraes Claudio   Dense Linear System: a Parallel
                                  Self-verified Solver . . . . . . . . . . 412--425
                Ahmad Faraj and   
            Pitch Patarasuk and   
                       Xin Yuan   Bandwidth Efficient All-to-All Broadcast
                                  on Switched Clusters . . . . . . . . . . 426--453

International Journal of Parallel Programming
Volume 36, Number 5, October, 2008

                  Tony Givargis   Guest Editor Introduction: Special Issue
                                  on Embedded Processors . . . . . . . . . 455--456
              Praveen Kalla and   
               X. Sharon Hu and   
                    Jörg Henkel   A Flexible Framework for Communication
                                  Evaluation in SoC Design . . . . . . . . 457--477
                  Roman Lysecky   Scalability and Parallel Execution of
                                  Warp Processing: Dynamic
                                  Hardware/Software Partitioning . . . . . 478--492
                    Zhi Guo and   
            Betul Buyukkurt and   
                John Cortes and   
             Abhishek Mitra and   
                   Walid Najjar   A Compiler Intermediate Representation
                                  for Reconfigurable Fabrics . . . . . . . 493--520

International Journal of Parallel Programming
Volume 36, Number 6, December, 2008

             Hsiao-Hsi Wang and   
              Kuan-Ching Li and   
               Ssu-Hsuan Lu and   
            Chun-Chieh Yang and   
               Jean-Luc Gaudiot   Design and Implementation of an Agent
                                  Home Scheme Strategy for Prefetch-Based
                                  DSM Systems  . . . . . . . . . . . . . . 521--542
                Ahmad Faraj and   
            Pitch Patarasuk and   
                       Xin Yuan   A Study of Process Arrival Patterns for
                                  MPI Collective Operations  . . . . . . . 543--570
             Aart J. C. Bik and   
          David L. Kreitzer and   
                    Xinmin Tian   A Case Study on Compiler Optimizations
                                  for the Intel® Core™ 2 Duo
                                  Processor  . . . . . . . . . . . . . . . 571--591
      H. L. A. van der Spek and   
                   S. Groot and   
               E. M. Bakker and   
              H. A. G. Wijshoff   A Compile/Run-time Environment for the
                                  Automatic Transformation of Linked List
                                  Data Structures  . . . . . . . . . . . . 592--623

International Journal of Parallel Programming
Volume 37, Number 1, February, 2009

              Nicholas Carriero   Guest Editor Introduction: Special Issue
                                  on High Performance Computing for High
                                  Productivity Environments  . . . . . . . 1--2
              Gaurav Sharma and   
                     Jos Martin   MATLAB®: a Language for Parallel
                                  Computing  . . . . . . . . . . . . . . . 3--36
                 Masatoshi Seki   dRuby and Rinda: Implementation and
                                  Application of Distributed Ruby and its
                                  Parallel Coordination Mechanism  . . . . 37--57
        L. Anthony Drummond and   
            Vicente Galiano and   
           Violeta Migallón and   
                   Jose Penadés   PyACTS: a Python Based Interface to ACTS
                                  Tools and Parallel Scientific
                                  Applications . . . . . . . . . . . . . . 58--77
               Luke Tierney and   
              A. J. Rossini and   
                          Na Li   Snow: a Parallel Computing Framework for
                                  the R System . . . . . . . . . . . . . . 78--90
             David E. Hudak and   
                Neil Ludban and   
        Ashok Krishnamurthy and   
            Vijay Gadepally and   
            Siddharth Samsi and   
                         others   A Computational Science IDE for HPC
                                  Systems: Design and Applications . . . . 91--105
         Robert D. Bjornson and   
       Nicholas J. Carriero and   
          Martin H. Schultz and   
         Patrick M. Shields and   
              Stephen B. Weston   NetWorkSpace: a Coordination System for
                                  High-Productivity Environments . . . . . 106--125

International Journal of Parallel Programming
Volume 37, Number 2, April, 2009

                    Jun Cao and   
                Ayush Goyal and   
         Krista A. Novstrup and   
          Samuel P. Midkiff and   
             James M. Caruthers   An Optimizing Compiler for Parallel
                                  Chemistry Simulations  . . . . . . . . . 127--152
           J. Miguel-Alonso and   
               J. Navaridas and   
                 F. J. Ridruejo   Interconnection Network Simulation Using
                                  Traces of MPI Applications . . . . . . . 153--174
              Joahyoung Lee and   
                     Inbum Jung   Recovery Strategies for Streaming Media
                                  Service in a Cluster-Based VOD Server
                                  with a Fault Node  . . . . . . . . . . . 175--194
         Athanasios I. Margaris   Log File Formats for Parallel
                                  Applications: a Review . . . . . . . . . 195--222
         Mohammad J. Rashti and   
                   Ahmad Afsahi   A Speculative and Adaptive MPI
                                  Rendezvous Protocol Over RDMA-enabled
                                  Interconnects  . . . . . . . . . . . . . 223--246

International Journal of Parallel Programming
Volume 37, Number 3, June, 2009

           Rudolf Eigenmann and   
          Eduard Ayguadé   Guest Editors' Introduction  . . . . . . 247--249
           Greg Bronevetsky and   
            John Gyllenhaal and   
          Bronis R. de Supinski   CLOMP: Accurately Characterizing OpenMP
                                  Application Overheads  . . . . . . . . . 250--265
             Karl Fürlinger and   
                  Shirley Moore   Capturing and Analyzing the Execution
                                  Control Flow of OpenMP Applications  . . 266--276
            Tobias Hilbrich and   
         Matthias S. Müller and   
                Bettina Krammer   MPI Correctness Checking for OpenMP/MPI
                                  Applications . . . . . . . . . . . . . . 277--291
            Alejandro Duran and   
               Roger Ferrer and   
      Eduard Ayguadé and   
              Rosa M. Badia and   
                  Jesus Labarta   A Proposal to Extend the OpenMP Tasking
                                  Model with Dependent Tasks . . . . . . . 292--305
        Morten S. Rasmussen and   
         Matthias B. Stuart and   
                  Sven Karlsson   Parallelism and Scalability in an Image
                                  Processing Application . . . . . . . . . 306--323
      Pascal Vander-Swalmen and   
              Gilles Dequen and   
               Michaël Krajecki   A Collaborative Approach for
                                  Multi-Threaded SAT Solving . . . . . . . 324--342

International Journal of Parallel Programming
Volume 37, Number 4, August, 2009

                 Prabhat Mishra   Guest Editor Introduction: Special Issue
                                  on Nano/Bio-Inspired Applications and
                                  Architectures  . . . . . . . . . . . . . 343--344
Jayram Moorkanikara Nageswaran and   
               Andrew Felch and   
        Ashok Chandrasekhar and   
                 Nikil Dutt and   
            Richard Granger and   
                         others   Brain Derived Vision Algorithm on High
                                  Performance Architectures  . . . . . . . 345--369
                  Yang Zhao and   
         Krishnendu Chakrabarty   On-Line Testing of Lab-on-Chip Using
                                  Reconfigurable Digital-Microfluidic
                                  Compactors . . . . . . . . . . . . . . . 370--388
            Scott Chilstedt and   
                  Chen Dong and   
                    Deming Chen   Design and Evaluation of a Carbon
                                  Nanotube-Based Programmable Architecture 389--416
             Michael DeBole and   
      Ramakrishnan Krishnan and   
        Varsha Balakrishnan and   
               Wenping Wang and   
                   Hong Luo and   
                         others   New-Age: a Negative Bias Temperature
                                  Instability-Estimation Framework for
                                  Microarchitectural Components  . . . . . 417--431

International Journal of Parallel Programming
Volume 37, Number 5, October, 2009

            Stéphane Genaud and   
           Emmanuel Jeannot and   
            Choopan Rattanapoka   Fault-Management in P2P-MPI  . . . . . . 433--461
      Mohammad Reza Bonyadi and   
      Mohsen Ebrahimi Moghaddam   A Bipartite Genetic Algorithm for
                                  Multi-processor Task Scheduling  . . . . 462--487
                Guochun Shi and   
      Volodymyr Kindratenko and   
                Steven Gottlieb   The Bottom-Up Implementation of One MILC
                                  Lattice QCD Application on the Cell
                                  Blade  . . . . . . . . . . . . . . . . . 488--507
                  Chen Tian and   
                   Min Feng and   
            Vijay Nagarajan and   
                    Rajiv Gupta   Speculative Parallelization of
                                  Sequential Loops on Multicores . . . . . 508--535

International Journal of Parallel Programming
Volume 37, Number 6, December, 2009

               Nadia Nedjah and   
       Luiza de Macedo Mourelle   High-Performance Hardware of the
                                  Sliding-Window Method for Parallel
                                  Computation of Modular Exponentiations   537--555
               Steen Larsen and   
     Parthasarathy Sarangam and   
             Ram Huggahalli and   
             Siddharth Kulkarni   Architectural Breakdown of End-to-End
                                  Latency in a TCP/IP Network  . . . . . . 556--571
    Carolina Ribeiro Xavier and   
   Rafael Sachetto Oliveira and   
 Vinicius da Fonseca Vieira and   
   Rodrigo Weber dos Santos and   
                   Wagner Meira   Multi-Level Parallelism for the Cardiac
                                  Bidomain Equations . . . . . . . . . . . 572--592
            Claudio Schepke and   
           Nicolas Maillard and   
          Philippe O. A. Navaux   Parallel Lattice Boltzmann Method with
                                  Blocked Partitioning . . . . . . . . . . 593--611

International Journal of Parallel Programming
Volume 38, Number 1, February, 2010

           Sven-Bodo Scholz and   
                Alex Shafarenko   Guest Editors' Editorial: Special Issue
                                  on the Second International Workshop on
                                  Microgrids . . . . . . . . . . . . . . . 1--3
         Benedict R. Gaster and   
             Tim Bainbridge and   
                David Lacey and   
                  David Gardner   Compilation Techniques for High Level
                                  Parallel Code  . . . . . . . . . . . . . 4--18
                  Jan Haase and   
            Andreas Hofmann and   
              Klaus Waldschmidt   A Self Distributing Virtual Machine for
                                  Adaptive Multicore Environments  . . . . 19--37
             Clemens Grelck and   
           Sven-Bodo Scholz and   
                Alex Shafarenko   Asynchronous Stream Processing with
                                  S-Net  . . . . . . . . . . . . . . . . . 38--67
   Philip K. F. Hölzenspies and   
         Timon D. ter Braak and   
                  Jan Kuper and   
          Gerard J. M. Smit and   
               Johann M. Hurink   Run-time Spatial Mapping of Streaming
                                  Applications to Heterogeneous
                                  Multi-Processor Systems  . . . . . . . . 68--83

International Journal of Parallel Programming
Volume 38, Number 2, April, 2010

                 Xiaobin Li and   
               Jean-Luc Gaudiot   Tolerating Radiation-Induced Transient
                                  Faults in Modern Processors  . . . . . . 85--116
                  Chao Dong and   
                Huijie Zhao and   
                       Wei Wang   Parallel Nonnegative Matrix
                                  Factorization Algorithm on the
                                  Distributed Memory Platform  . . . . . . 117--137
                      Nan Zhang   Computing Optimised Parallel Speeded-Up
                                  Robust Features (P-SURF) on Multi-Core
                                  Processors . . . . . . . . . . . . . . . 138--158
     Alexandros V. Gerbessiotis   Parallel Option Price Valuations with
                                  the Explicit Finite Difference Method    159--182

International Journal of Parallel Programming
Volume 38, Number 3--4, June, 2010

        Preeti Ranjan Panda and   
                Rajendran Panda   Guest Editorial: Special Issue on VLSI
                                  Design and Embedded Systems  . . . . . . 183--184
           Alexander Czutro and   
                Ilia Polian and   
              Matthew Lewis and   
               Piet Engelke and   
          Sudhakar M. Reddy and   
                         others   Thread-Parallel Integrated Test Pattern
                                  Generator Utilizing Satisfiability
                                  Analysis . . . . . . . . . . . . . . . . 185--202
               Tameesh Suri and   
                Aneesh Aggarwal   Improving Adaptability and Per-Core
                                  Performance of Many-Core Processors
                                  Through Reconfiguration  . . . . . . . . 203--224
         Unmesh D. Bordoloi and   
           Samarjit Chakraborty   GPU-based Acceleration of System-level
                                  Design Tasks . . . . . . . . . . . . . . 225--253
            Reiley Jeyapaul and   
             Aviral Shrivastava   Code Transformations for TLB Power
                                  Reduction  . . . . . . . . . . . . . . . 254--276
                     Sourav Roy   H-NMRU: an Efficient Cache Replacement
                                  Policy with Low Area . . . . . . . . . . 277--287
         Spyros Apostolakos and   
         Apostolos Meliones and   
             George Lykakis and   
         Emmanuel Touloupis and   
             Vassilis Vlagoulis   Design, Implementation and Validation of
                                  an Open Source IP-PBX/VoIP Gateway
                                  Multi-Core SoC . . . . . . . . . . . . . 288--302
                   T. Kempf and   
            S. Wallentowitz and   
                 G. Ascheid and   
                 R. Leupers and   
                        H. Meyr   Analytical and Simulation-based Design
                                  Space Exploration of Software Defined
                                  Radios . . . . . . . . . . . . . . . . . 303--321
          Vinay B. Y. Kumar and   
            Siddharth Joshi and   
           Sachin B. Patkar and   
                   H. Narayanan   FPGA Based High Performance
                                  Double-Precision Matrix Multiplication   322--338

International Journal of Parallel Programming
Volume 38, Number 5--6, October, 2010

         Matthias S. Müller and   
                 Eduard Ayguadé   Guest Editors' Introduction  . . . . . . 339--340
         Stephen L. Olivier and   
                   Jan F. Prins   Comparison of OpenMP 3.0 and Other Task
                                  Parallel Frameworks on Unbalanced Task
                                  Graphs . . . . . . . . . . . . . . . . . 341--360
               Chunhua Liao and   
          Daniel J. Quinlan and   
       Jeremiah J. Willcock and   
                   Thomas Panas   Semantic-Aware Automatic Parallelization
                                  of Modern Applications Using High-Level
                                  Abstractions . . . . . . . . . . . . . . 361--378
               Paul Kapinos and   
                  Dieter an Mey   Productivity and Performance Portability
                                  of the OpenMP 3.0 Tasking Concept When
                                  Applied to an Engineering Code Written
                                  in Fortran 95  . . . . . . . . . . . . . 379--395
               J. Mark Bull and   
              James Enright and   
                     Xu Guo and   
              Chris Maynard and   
                     Fiona Reid   Performance Evaluation of Mixed-Mode
                                  OpenMP/MPI Implementations . . . . . . . 396--417
         François Broquedis and   
          Nathalie Furmento and   
               Brice Goglin and   
     Pierre-André Wacrenier and   
                 Raymond Namyst   ForestGOMP: an Efficient OpenMP
                                  Environment for NUMA Architectures . . . 418--439
             Eduard Ayguadé and   
              Rosa M. Badia and   
             Pieter Bellens and   
             Daniel Cabrera and   
            Alejandro Duran and   
               Roger Ferrer and   
              Marc González and   
            Francisco Igual and   
    Daniel Jiménez-González and   
              Jesús Labarta and   
             Luis Martinell and   
           Xavier Martorell and   
                Rafael Mayo and   
             Josep M. Pérez and   
               Judit Planas and   
       Enrique S. Quintana-Ortí   Extending OpenMP to Survive the
                                  Heterogeneous Multi-Core Era . . . . . . 440--459

International Journal of Parallel Programming
Volume 39, Number 1, February, 2011

         Valentina Salapura and   
            José E. Moreira and   
                 Sally A. McKee   Guest Editors Introduction . . . . . . . 1--2
        Daniele Paolo Scarpazza   Top-Performance Tokenization and
                                  Small-Ruleset Regular Expression
                                  Matching: a Quantitative Performance
                                  Analysis and Optimization Study on the
                                  Cell/B.E. Processor  . . . . . . . . . . 3--32
         Arrvindh Shriraman and   
              Sandhya Dwarkadas   Analyzing Conflicts in
                                  Hardware-Supported Memory Transactions   33--61
              Mehmet Belgin and   
                Godmar Back and   
              Calvin J. Ribbens   A Library for Pattern-based Sparse
                                  Matrix Vector Multiply . . . . . . . . . 62--87
      Rob V. van Nieuwpoort and   
                 John W. Romein   Correlating Radio Astronomy Signals with
                                  Many-Core Hardware . . . . . . . . . . . 88--114
               Jiayuan Meng and   
                  Kevin Skadron   A Performance Study for Iterative
                                  Stencil Loops on GPUs with Ghost Zone
                                  Optimizations  . . . . . . . . . . . . . 115--142

International Journal of Parallel Programming
Volume 39, Number 2, April, 2011

        Ghada F. El Kabbany and   
             Nayer M. Wanas and   
            Nadia H. Hegazi and   
               Samir I. Shaheen   A Dynamic Load Balancing Framework for
                                  Real-time Applications in Message
                                  Passing Systems  . . . . . . . . . . . . 143--182
               K. A. Hawick and   
                   A. Leist and   
                   D. P. Playne   Regular Lattice and Small-World Spin
                                  Model Simulations Using CUDA and GPUs    183--201
        Simon Uzezi Ewedafe and   
       Rio Hirowati Shariffudin   Parallel Implementation of 2-D
                                  Telegraphic Equation on MPI/PVM Cluster  202--231
            Nasser Giacaman and   
                  Oliver Sinnen   Parallel Iterator for Parallelizing
                                  Object-Oriented Applications . . . . . . 232--269

International Journal of Parallel Programming
Volume 39, Number 3, June, 2011

           Christian Fensch and   
                 Marcelo Cintra   An Evaluation of an OS-Based Coherence
                                  Scheme for Tiled CMPs  . . . . . . . . . 271--295
             Grigori Fursin and   
            Yuriy Kashnikov and   
          Abdul Wahid Memon and   
           Zbigniew Chamski and   
              Olivier Temam and   
                         others   Milepost GCC: Machine Learning Enabled
                                  Self-tuning Compiler . . . . . . . . . . 296--327
             Arnaud Grasset and   
            Philippe Millet and   
            Philippe Bonnot and   
                 Sami Yehia and   
     Wolfram Putzke-Roeming and   
                         others   The MORPHEUS Heterogeneous Dynamically
                                  Reconfigurable Platform  . . . . . . . . 328--356
                 R. Tornero and   
               J. M. Orduña and   
                   A. Mejia and   
                   J. Flich and   
                       J. Duato   A Communication-Driven Routing Technique
                                  for Application-Specific NoCs  . . . . . 357--374
            Enrique Vallejo and   
            Sutirtha Sanyal and   
                 Tim Harris and   
           Fernando Vallejo and   
              Ramón Beivide and   
                         others   Hybrid Transactional Memory with
                                  Pessimistic Concurrency Control  . . . . 375--396
                  Harm Munk and   
             Eduard Ayguadé and   
             Cédric Bastoul and   
             Paul Carpenter and   
           Zbigniew Chamski and   
                         others   ACOTES Project: Advanced Compiler
                                  Technologies for Embedded Streaming  . . 397--450

International Journal of Parallel Programming
Volume 39, Number 4, August, 2011

               Shaoshan Liu and   
                Ligang Wang and   
               Xiao-Feng Li and   
               Jean-Luc Gaudiot   Space-and-Time Efficient Parallel
                                  Garbage Collector for Data-Intensive
                                  Applications . . . . . . . . . . . . . . 451--472
                  Ying Qian and   
                   Ahmad Afsahi   Process Arrival Pattern Aware Alltoall
                                  and Allgather on InfiniBand Clusters . . 473--493
                  L. Benini and   
                R. Grottesi and   
                  S. Morigi and   
                    M. Ruggiero   Parallel Rendering and Animation of
                                  Subdivision Surfaces on the Cell BE
                                  Processor  . . . . . . . . . . . . . . . 494--521
              Kush K. Kella and   
                   Aasia Khanum   APCFS: Autonomous and Parallel
                                  Compressed File System . . . . . . . . . 522--532

International Journal of Parallel Programming
Volume 39, Number 5, October, 2011

               Shaoshan Liu and   
        Christine Eisenbeis and   
               Jean-Luc Gaudiot   Value Prediction and Speculative
                                  Execution on GPU . . . . . . . . . . . . 533--552
              Ralf Hoffmann and   
                  Thomas Rauber   Adaptive Task Pools: Efficiently
                                  Balancing Large Number of Tasks on
                                  Shared-address Spaces  . . . . . . . . . 553--581
                Can Ozturan and   
                   Dan Grigoras   Guest Editorial: Parallel and
                                  Distributed Computing  . . . . . . . . . 582--583
                Anne Benoit and   
       Hinde Lilia Bouziane and   
                    Yves Robert   Optimizing the Reliability of Streaming
                                  Applications Under Throughput
                                  Constraints  . . . . . . . . . . . . . . 584--614
          George C. Caragea and   
         Alexandros Tzannes and   
                Fuat Keceli and   
               Rajeev Barua and   
                    Uzi Vishkin   Resource-Aware Compiler Prefetching for
                                  Fine-Grained Many-Cores  . . . . . . . . 615--638
                  Alper Sen and   
              Baris Aksanli and   
                  Murat Bozkurt   Speeding Up Cycle Based Logic Simulation
                                  Using Graphics Processing Units  . . . . 639--661

International Journal of Parallel Programming
Volume 39, Number 6, December, 2011

                  Yu-Min Lu and   
                Peng-Sheng Chen   Probabilistic Alias Analysis of
                                  Executable Code  . . . . . . . . . . . . 663--693
                  Håkan Sundell   Wait-Free Multi-Word Compare-and-Swap
                                  Using Greedy Helping and Grabbing  . . . 694--716
            Masroor Hussain and   
              Muhammad Abid and   
              Mushtaq Ahmad and   
             Ashfaq Khokhar and   
                     Arif Masud   A Parallel Implementation of ALE Moving
                                  Mesh Technique for FSI Problems using
                                  OpenMP . . . . . . . . . . . . . . . . . 717--745
             Kayhan M. Imre and   
             Cesur Baransel and   
                  Harun Artuner   Efficient and Scalable Routing
                                  Algorithms for Collective Communication
                                  Operations on 2D All-Port Torus
                                  Networks . . . . . . . . . . . . . . . . 746--782
                   Brian Demsky   Using Discrete Event Simulation to
                                  Analyze Contention Managers  . . . . . . 783--808
               Seçkin Sanci and   
                    Veysi Isler   A Parallel Algorithm for UAV Flight
                                  Route Planning on GPU  . . . . . . . . . 809--837

International Journal of Parallel Programming
Volume 40, Number 1, February, 2012

         Valentina Salapura and   
           Michael Gschwind and   
                     Jens Knoop   Guest Editorial: Parallel Systems and
                                  Compilers  . . . . . . . . . . . . . . . 1--3
                 I-Jui Sung and   
             Nasser Anssari and   
           John A. Stratton and   
                 Wen-Mei W. Hwu   Data Layout Transformation Exploiting
                                  Memory-Level Parallelism in Structured
                                  Grid Many-Core Applications  . . . . . . 4--24
           Ferad Zyulkyarov and   
              Srdjan Stipic and   
                 Tim Harris and   
             Osman S. Unsal and   
             Adrián Cristal and   
                Ibrahim Hur and   
                   Mateo Valero   Profiling and Optimizing Transactional
                                  Memory Applications  . . . . . . . . . . 25--56
                 M. Awasthi and   
                 D. Nellans and   
                   K. Sudan and   
         R. Balasubramonian and   
                       A. Davis   Managing Data Placement in Memory
                                  Systems with Multiple Memory Controllers 57--83
               Changhui Lin and   
            Vijay Nagarajan and   
                    Rajiv Gupta   Efficient Sequential Consistency Using
                                  Conditional Fences . . . . . . . . . . . 84--117
                  Yun Zhang and   
                 Jae W. Lee and   
            Nick P. Johnson and   
                David I. August   DAFT: Decoupled Acyclic Fault Tolerance  118--140

International Journal of Parallel Programming
Volume 40, Number 2, April, 2012

                  Yan Huang and   
                   Jie Tang and   
                 Zhi-min Gu and   
                    Min Cai and   
              Jianxun Zhang and   
                  Ninghan Zheng   The Performance Optimization of Threaded
                                  Prefetching for Linked Data Structures   141--163
          Jean-Claude Charr and   
          Raphaël Couturier and   
                 David Laiymani   Adaptation and Evaluation of the
                                  Multisplitting-Newton and Waveform
                                  Relaxation Methods Over Distributed
                                  Volatile Environments  . . . . . . . . . 164--183
              Mwaffaq Otoom and   
                  JoAnn M. Paul   Workload Mode Identification for Chip
                                  Heterogeneous Multiprocessors  . . . . . 184--224
  Mohsen Ebrahimi Moghaddam and   
          Mohammad Reza Bonyadi   An Immune-based Genetic Algorithm with
                                  Reduced Search Space Coding for
                                  Multiprocessor Task Scheduling Problem   225--257

International Journal of Parallel Programming
Volume 40, Number 3, June, 2012

               Wagner Meira and   
              Ricardo Bianchini   Special Issue on Computer Architecture
                                  and High-Performance Computing . . . . . 259--261
            Ricardo Menotti and   
         João M. P. Cardoso and   
        Marcio M. Fernandes and   
                Eduardo Marques   LALP: a Language to Program Custom
                                  FPGA-Based Acceleration Engines  . . . . 262--289
              Jairo Panetta and   
            Thiago Teixeira and   
 Paulo R. P. de Souza Filho and   
   Carlos A. da Cunha Filho and   
               David Sotelo and   
  Fernando M. Roxo da Motta and   
   Silvio Sinedino Pinheiro and   
    Andre L. Romanelli Rosa and   
           Luiz R. Monnerat and   
        Leandro T. Carneiro and   
       Carlos H. B. de Albrecht   Accelerating Time and Depth Seismic
                                  Migration by CPU and GPU Cooperation . . 290--312
                Pedro Leite and   
      João Marcelo Teixeira and   
              Thiago Farias and   
              Bernardo Reis and   
         Veronica Teichrieb and   
                  Judith Kelner   Nearest Neighbor Searches on the GPU: a
                                  Massively Parallel Approach for Dynamic
                                  Point Clouds . . . . . . . . . . . . . . 313--330
               Artur Santos and   
      João Marcelo Teixeira and   
              Thiago Farias and   
         Veronica Teichrieb and   
                  Judith Kelner   Understanding the Efficiency of kD-tree
                                  Ray-Traversal Techniques over a GPGPU
                                  Architecture . . . . . . . . . . . . . . 331--352
  Girish Venkatasubramanian and   
       Renato J. Figueiredo and   
            Ramesh Illikkal and   
                  Donald Newell   TMT: a TLB Tag Management Framework for
                                  Virtualized Platforms  . . . . . . . . . 353--380

International Journal of Parallel Programming
Volume 40, Number 4, August, 2012

                 Ákos Dudás and   
              Sándor Juhász and   
                  Tamás Schrádi   Software Controlled Adaptive
                                  Pre-Execution for Data Prefetching . . . 381--396
          Giuliano Laccetti and   
              Marco Lapegna and   
               Valeria Mele and   
               Diego Romano and   
                 Almerico Murli   A Double Adaptive Algorithm for
                                  Multidimensional Integration on
                                  Multicore Based HPC Systems  . . . . . . 397--409
                Rohit Jalan and   
                 Arun Kejariwal   Trin--Trin: Who's Calling? A Pin-Based
                                  Dynamic Call Graph Extraction Framework  410--442
          John M. Neuberger and   
              Nándor Sieben and   
                 James W. Swift   An MPI Implementation of a
                                  Self-Submitting Parallel Job Queue . . . 443--464

International Journal of Parallel Programming
Volume 40, Number 5, October, 2012

                  Yan Huang and   
                 Zhi-Min Gu and   
                   Jie Tang and   
                    Min Cai and   
              Jianxun Zhang and   
                         others   Estimating Effective Prefetch Distance
                                  in Threaded Prefetching for Linked Data
                                  Structures . . . . . . . . . . . . . . . 465--487
                Fadi Abboud and   
             Yosi Ben-Asher and   
            Yousef Shajrawi and   
                     Esti Stein   Combining Height Reduction and
                                  Scheduling for VLIW Machines Enhanced
                                  with Three-Argument Arithmetic
                                  Operations . . . . . . . . . . . . . . . 488--513
              Wai-Mee Ching and   
                       Da Zheng   Automatic Parallelization of
                                  Array-oriented Programs for a Multi-core
                                  Machine  . . . . . . . . . . . . . . . . 514--531
                   Joppe W. Bos   Low-Latency Elliptic Curve Scalar
                                  Multiplication . . . . . . . . . . . . . 532--550

International Journal of Parallel Programming
Volume 40, Number 6, December, 2012

            Hubertus Franke and   
           Paul H. J. Kelly and   
                 Pedro Trancoso   Guest Editorial: Computing Frontiers . . 551--552
          Alexander D. Rast and   
           Javier Navaridas and   
                    Xin Jin and   
         Francesco Galluppi and   
              Luis A. Plana and   
                         others   Managing Burstiness and Scalability in
                                  Event-Driven Models on the SpiNNaker
                                  Neuromimetic System  . . . . . . . . . . 553--582
          Stamatis Kavadias and   
          Manolis Katevenis and   
         Michail Zampetakis and   
      Dimitrios S. Nikolopoulos   Cache-Integrated Network Interfaces:
                                  Flexible On-Chip Communication and
                                  Synchronization for Large-Scale CMPs . . 583--604
                   Yong Cao and   
         Debprakash Patnaik and   
                 Sean Ponce and   
           Jeremy Archuleta and   
             Patrick Butler and   
                         others   Parallel Mining of Neuronal Spike
                                  Streams on Graphics Processing Units . . 605--632
            Vinod Tipparaju and   
               Edoardo Apra and   
                 Weikuan Yu and   
                  Xinyu Que and   
              Jeffrey S. Vetter   Runtime Techniques to Enable a
                                  Highly-Scalable Global Address Space
                                  Model for Petascale Computing  . . . . . 633--655

International Journal of Parallel Programming
Volume 41, Number 1, February, 2013

             Mounira Bachir and   
       Sid-Ahmed-Ali Touati and   
            Frederic Brault and   
                David Gregg and   
                   Albert Cohen   Minimal Unroll Factor for Code
                                  Generation of Software Pipelining  . . . 1--58
               Shixun Zhang and   
          Shinichi Yamagiwa and   
           Masahiko Okumura and   
                   Seiji Yunoki   Kernel Polynomial Method on GPU  . . . . 59--88
             Daniel Nicácio and   
        Alexandro Baldassin and   
                   Guido Araújo   Transaction Scheduling Using Dynamic
                                  Conflict Avoidance . . . . . . . . . . . 89--110
          Khaled Hamidouche and   
  Fernando Machado Mendonca and   
                Joel Falcou and   
Alba Cristina Magalhaes Alves de Melo and   
                Daniel Etiemble   Parallel Smith--Waterman Comparison on
                                  Multicore and Manycore Computing
                                  Platforms with BSP++ . . . . . . . . . . 111--136
              Junchang Wang and   
                  Kai Zhang and   
                 Xinan Tang and   
                        Bei Hua   B-Queue: Efficient and Practical Queuing
                                  for Fast Core-to-Core Communication  . . 137--159

International Journal of Parallel Programming
Volume 41, Number 2, April, 2013

            John McAllister and   
                Luigi Carro and   
               Skevos Evripidou   Guest Editorial: Special Issue on 2011
                                  International Conference on Embedded
                                  Computer Systems: Architectures,
                                  Modeling and Simulation (SAMOS XI) . . . 161--162
             David A. Penry and   
               Kurtis D. Cahill   ADL-Based Specification of
                                  Implementation Styles for Functional
                                  Simulators . . . . . . . . . . . . . . . 163--211
                Oscar Almer and   
                  Igor Böhm and   
      Tobias Edler von Koch and   
               Björn Franke and   
               Stephen Kyle and   
              Volker Seeker and   
       Christopher Thompson and   
                   Nigel Topham   A Parallel Dynamic Binary Translator for
                                  Efficient Multi-Core Simulation  . . . . 212--235
                 Tiago Dias and   
            Sebastián López and   
                  Nuno Roma and   
                   Leonel Sousa   Scalable Unified Transform Architecture
                                  for Advanced Video Coding Embedded
                                  Systems  . . . . . . . . . . . . . . . . 236--260
          Kenneth C. Rovers and   
                      Jan Kuper   UniTi: Unified Composition and Time for
                                  Multi-domain Model-based Design  . . . . 261--304
    Karthik T. Sundararajan and   
           Timothy M. Jones and   
                Nigel P. Topham   The Smart Cache: an Energy-Efficient
                                  Cache Architecture Through Dynamic
                                  Adaptation . . . . . . . . . . . . . . . 305--330
          Stefan Langemeyer and   
               Peter Pirsch and   
                   Holger Blume   Using SDRAM Memories for
                                  High-Performance Accesses to
                                  Two-Dimensional Matrices Without
                                  Transpose  . . . . . . . . . . . . . . . 331--354

International Journal of Parallel Programming
Volume 41, Number 3, June, 2013

             Calin Cascaval and   
             Pedro Trancoso and   
                Viktor Prasanna   Guest Editorial: Computing Frontiers . . 355--356
         Alexander Heinecke and   
                   Dirk Pflüger   Emerging Architectures Enable to Boost
                                  Massively Parallel Data Mining Using
                                  Adaptive Sparse Grids  . . . . . . . . . 357--399
               Chunyang Gou and   
           Georgi N. Gaydadjiev   Addressing GPU On-Chip Shared Memory
                                  Bank Conflicts Using Elastic Pipeline    400--429
         Gianfranco Bilardi and   
        Kattamuri Ekanadham and   
                Pratap Pattnaik   Efficient Stack Distance Computation for
                                  a Class of Priority Replacement Policies 430--468
                  Nawab Ali and   
      Sriram Krishnamoorthy and   
     Mahantesh Halappanavar and   
                     Jeff Daily   Multi-Fault Tolerance for Cartesian Data
                                  Distributions  . . . . . . . . . . . . . 469--493

International Journal of Parallel Programming
Volume 41, Number 4, August, 2013

             Emanuel Vianna and   
          Giovanni Comarela and   
             Tatiana Pontes and   
            Jussara Almeida and   
           Virgílio Almeida and   
            Kevin Wilkinson and   
                Harumi Kuno and   
                 Umeshwar Dayal   Analytical Performance Models for
                                  MapReduce Workloads  . . . . . . . . . . 495--525
                   Yunho Oh and   
                 Doohwan Oh and   
                      Won W. Ro   GPU-Friendly Parallel Genome Matching
                                  with Tiled Access and Reduced State
                                  Transition Table . . . . . . . . . . . . 526--551
            Claudio Schepke and   
           Nicolas Maillard and   
            Joerg Schneider and   
              Hans-Ulrich Heiss   Online Mesh Refinement for Parallel
                                  Atmospheric Models . . . . . . . . . . . 552--569
          Christopher Oßner and   
                   Klemens Böhm   Graphs for Mining-Based Defect
                                  Localization in Multithreaded Programs   570--593

International Journal of Parallel Programming
Volume 41, Number 5, October, 2013

                    Bugra Gedik   Auto-tuning Similarity Search Algorithms
                                  on Multi-core Architectures  . . . . . . 595--620
            Nasser Giacaman and   
                  Oliver Sinnen   Parallel Task for Parallelising
                                  Object-Oriented Desktop Applications . . 621--681
                   Zheng Gu and   
              Matthew Small and   
                   Xin Yuan and   
          Aniruddha Marathe and   
             David K. Lowenthal   Protocol Customization for Improving MPI
                                  Performance on RDMA-Enabled Clusters . . 682--703
               Eunjung Park and   
               John Cavazos and   
         Louis-Noël Pouchet and   
             Cédric Bastoul and   
               Albert Cohen and   
                  P. Sadayappan   Predictive Modeling in a Polyhedral
                                  Optimization Space . . . . . . . . . . . 704--750

International Journal of Parallel Programming
Volume 41, Number 6, December, 2013

             Rudi Eigenmann and   
                    Sam Midkiff   Compiler Infrastructure  . . . . . . . . 751--752
                Hansang Bae and   
              Dheya Mustafa and   
                Jae-Woo Lee and   
                  Aurangzeb and   
                    Hao Lin and   
                Chirag Dave and   
           Rudolf Eigenmann and   
              Samuel P. Midkiff   The Cetus Source-to-Source Compiler
                                  Infrastructure: Overview and Evaluation  753--767
                    Yi Yang and   
                   Huiyang Zhou   The Implementation of a High Performance
                                  GPGPU Compiler . . . . . . . . . . . . . 768--781
          Gabriel Rodríguez and   
            María J. Martín and   
          Patricia González and   
               Juan Touriño and   
                   Ramón Doallo   Compiler-Assisted Checkpointing of
                                  Parallel Codes: The Cetus and LLVM
                                  Experience . . . . . . . . . . . . . . . 782--805
    Amin Shafiee Sarvestani and   
               Erik Hansson and   
              Christoph Kessler   Extensible Recognition of Algorithmic
                                  Patterns in DSP Programs for Automatic
                                  Parallelization  . . . . . . . . . . . . 806--824
            Barbara Chapman and   
          Deepak Eachempati and   
                Oscar Hernandez   Experiences Developing the OpenUH
                                  Compiler and Runtime Infrastructure  . . 825--854
                Xipeng Shen and   
                  Yixun Liu and   
              Eddy Z. Zhang and   
           Poornima Bhamidipati   An Infrastructure for Tackling
                                  Input-Sensitivity of GPU Program
                                  Optimizations  . . . . . . . . . . . . . 855--869

International Journal of Parallel Programming
Volume 42, Number 1, February, 2014

                  Alba Melo and   
           Jean-Luc Gaudiot and   
                Luiz DeRose and   
             Kunle Olukotun and   
                  Albert Zomaya   Guest Editorial  . . . . . . . . . . . . 1--3
        Ana Avilés-González and   
               Juan Piernas and   
           Pilar González-Férez   Scalable Metadata Management Through
                                  OSD+ Devices . . . . . . . . . . . . . . 4--29
                Enqiang Sun and   
                    David Kaeli   Aggressive Value Prediction on a GPU . . 30--48
                 Mouad Bahi and   
            Christine Eisenbeis   Impact of Reverse Computing on
                                  Information Locality in Register
                                  Allocation for High Performance
                                  Computing  . . . . . . . . . . . . . . . 49--76
            Joerg Schneider and   
                  Barry Linnert   List-based Data Structures for Efficient
                                  Management of Advance Reservations . . . 77--93
              Claudia Rosas and   
                Anna Sikora and   
                Josep Jorba and   
              Andreu Moreno and   
                  Eduardo César   Improving Performance on Data-Intensive
                                  Applications Using a Load Balancing
                                  Methodology Based on Divisible Load
                                  Theory . . . . . . . . . . . . . . . . . 94--118
                 Sasa Tomić and   
             Adrián Cristal and   
                Osman Unsal and   
                   Mateo Valero   Using Dynamic Runtime Testing for Rapid
                                  Development of Architectural Simulators  119--139
                Edson Borin and   
               Guido Araujo and   
   Mauricio Breternitz, Jr. and   
                     Youfeng Wu   Microcode Compression Using
                                  Structured-Constrained Clustering  . . . 140--164
           Sarala Arunagiri and   
                Yipkei Kwok and   
         Patricia J. Teller and   
        Ricardo A. Portillo and   
           Seetharami R. Seelam   FAIRIO: a Throughput-oriented Algorithm
                                  for Differentiated I/O Performance . . . 165--197
            M. M. Waliullah and   
                  Per Stenstrom   Removal of Conflicts in Hardware
                                  Transactional Memory Systems . . . . . . 198--218
                     Nam Ma and   
               Yinglong Xia and   
             Viktor K. Prasanna   Data Parallel Implementation of Belief
                                  Propagation in Factor Graphs on
                                  Multi-core Platforms . . . . . . . . . . 219--237

International Journal of Parallel Programming
Volume 42, Number 2, April, 2014

           Eduarda Monteiro and   
             Bruno Vizzotto and   
              Cláudio Diniz and   
             Marilena Maule and   
                 Bruno Zatt and   
                   Sergio Bampi   Parallelization of Full Search Motion
                                  Estimation Algorithm for Parallel and
                                  Distributed Platforms  . . . . . . . . . 239--264
           Gabriel P. Silva and   
             Juliana Correa and   
           Cristiana Bentes and   
              Sergio Guedes and   
                Mariela Gabioux   The Experience in Designing and
                                  Evaluating the High Performance Cluster
                                  Netuno . . . . . . . . . . . . . . . . . 265--286
             Mitja Bezensek and   
                    Borut Robic   A Survey of Parallel and Distributed
                                  Algorithms for the Steiner Tree Problem  287--319
        Johann Steinbrecher and   
       Cesar J. Philippidis and   
                   Weijia Shang   A Case Study of Implementing Supernode
                                  Transformations  . . . . . . . . . . . . 320--342
             John K. Holmen and   
                David L. Foster   Accelerating Single Iteration
                                  Performance of CUDA-Based 3D
                                  Reaction-Diffusion Simulations . . . . . 343--363
             John K. Holmen and   
                David L. Foster   Erratum to: Accelerating Single
                                  Iteration Performance of CUDA-Based
                                  3D Reaction-Diffusion Simulations  . . . 364--364
Luís Fabrício Wanderley Góes and   
   Christiane Pousa Ribeiro and   
              Márcio Castro and   
       Jean-François Méhaut and   
                Murray Cole and   
                 Marcelo Cintra   Automatic Skeleton-Driven Memory
                                  Affinity for Transactional Worklist
                                  Applications . . . . . . . . . . . . . . 365--382
                      Anonymous   Editor's Note  . . . . . . . . . . . . . 383--383
               Changmin Lee and   
                 Won Woo Ro and   
               Jean-Luc Gaudiot   Boosting CUDA Applications with CPU-GPU
                                  Hybrid Computing . . . . . . . . . . . . 384--404

International Journal of Parallel Programming
Volume 42, Number 3, June, 2014

            Jesus Carretero and   
               Laurence T. Yang   Parallel and Distributed Processing with
                                  Applications: Preface  . . . . . . . . . 405--407
               Jesús Cámara and   
              Javier Cuenca and   
            Domingo Giménez and   
          Luis Pedro García and   
               Antonio M. Vidal   Empirical Installation of Linear Algebra
                                  Shared-Memory Subroutines for
                                  Auto-Tuning  . . . . . . . . . . . . . . 408--434
              Abdullah Kayi and   
             Olivier Serres and   
               Tarek El-Ghazawi   Bandwidth Adaptive Cache Coherence
                                  Optimizations for Chip Multiprocessors   435--455
                  Yousun Ko and   
              Minyoung Jung and   
                 Yo-Sub Han and   
              Bernd Burgstaller   A Speculative Parallel DFA Membership
                                  Test for Multicore, SIMD and Cloud
                                  Computing Environments . . . . . . . . . 456--489
             Thomas Baumann and   
                  Michael Resch   Parallel Parameter Identification in
                                  Industrial Biotechnology . . . . . . . . 490--504
               Cheng Hua Li and   
           Laurence T. Yang and   
                        Man Lin   Parallel Training of an Improved Neural
                                  Network for Text Categorization  . . . . 505--523

International Journal of Parallel Programming
Volume 42, Number 4, August, 2014

               Gaetan Hains and   
               Youry Khmelevsky   Guest Editorial for High-level Parallel
                                  Programming and Applications . . . . . . 525--528
        Alexandra Jimborean and   
            Philippe Clauss and   
    Jean-François Dollinger and   
           Vincent Loechner and   
   Juan Manuel Martinez Caamaño   Dynamic and Speculative Polyhedral
                                  Parallelization Using Compiler-Generated
                                  Skeletons  . . . . . . . . . . . . . . . 529--545
                Kento Emoto and   
             Kiminori Matsuzaki   An Automatic Fusion Mechanism for
                                  Variable-Length List Skeletons in SkeTo  546--563
          Christopher Brown and   
            Marco Danelutto and   
              Kevin Hammond and   
           Peter Kilpatrick and   
              Archibald Elliott   Cost-Directed Refactoring for Parallel
                                  Erlang Programs  . . . . . . . . . . . . 564--582
           Mathias Bourgoin and   
         Emmanuel Chailloux and   
               Jean-Luc Lamotte   Efficient Abstractions for GPGPU
                                  Programming  . . . . . . . . . . . . . . 583--600
             Michel Steuwer and   
               Malte Friese and   
           Sebastian Albers and   
                Sergei Gorlatch   Introducing and Implementing the
                                  Allpairs Skeleton for Programming
                                  Multi-GPU Systems  . . . . . . . . . . . 601--618
              A. N. Yzelman and   
            R. H. Bisseling and   
                   D. Roose and   
                  K. Meerbergen   MulticoreBSP for C: a High-Performance
                                  Library for Shared-Memory Parallel
                                  Programming  . . . . . . . . . . . . . . 619--642
                Nuno Gaspar and   
             Ludovic Henrio and   
                 Eric Madelaine   Bringing Coq into the World of GCM
                                  Distributed Applications . . . . . . . . 643--662
             Stefano Chessa and   
          Susanna Pelagatti and   
               Nicoletta Triolo   Engineering Energy Efficient Visual
                                  Sensor Network Applications Using
                                  Skeletons  . . . . . . . . . . . . . . . 663--680

International Journal of Parallel Programming
Volume 42, Number 5, October, 2014

               Pavel Krömer and   
                 Jan Platos and   
                  Václav Snásel   Nature-Inspired Meta-Heuristics on
                                  Modern GPUs: State of the Art and Brief
                                  Survey of Selected Algorithms  . . . . . 681--709
              Ciprian Dobre and   
                    Fatos Xhafa   Parallel Programming Paradigms and
                                  Frameworks in Big Data Era . . . . . . . 710--738
           Fahimeh Ramezani and   
                     Jie Lu and   
        Farookh Khadeer Hussain   Task-Based System Load Balancing in
                                  Cloud Computing Using Particle Swarm
                                  Optimization . . . . . . . . . . . . . . 739--754
                  Ugo Fiore and   
         Francesco Palmieri and   
        Aniello Castiglione and   
              Alfredo De Santis   A Cluster-Based Data-Centric Model for
                                  Network-Aware Task Scheduling in
                                  Distributed Systems  . . . . . . . . . . 755--775
              Ibtehal Nafea and   
            Muhammad Younas and   
              Robert Holton and   
                     Irfan Awan   A Priority-Based Admission Control
                                  Scheme for Commercial Web Servers  . . . 776--797
             Tomoya Enokido and   
         Ailixier Aikebaier and   
                Makoto Takizawa   Energy-Efficient Redundant Execution of
                                  Processes in a Fault-Tolerant Cluster of
                                  Servers  . . . . . . . . . . . . . . . . 798--819
              Zia ur Rehman and   
       Omar Khadeer Hussain and   
        Farookh Khadeer Hussain   Parallel Cloud Service Selection and
                                  Ranking Based on QoS History . . . . . . 820--852
                   Fei Song and   
              Daochao Huang and   
               Huachun Zhou and   
               Hongke Zhang and   
                      Ilsun You   An Optimization-Based Scheme for
                                  Efficient Virtual Machine Placement  . . 853--872

International Journal of Parallel Programming
Volume 42, Number 6, December, 2014

                   Alex Nicolau   Acknowledgment to Reviewers  . . . . . . 873--874
              Shin-Kai Chen and   
              Cheng-Yu Hung and   
            Ching-Chih Chen and   
                   Chih-Wei Liu   Parallelizing Complex Streaming
                                  Applications on Distributed Scratchpad
                                  Memory Multicore Architecture  . . . . . 875--899
              Young-Joo Kim and   
                 Sejun Song and   
                   Yong-Kee Jun   VORD: a Versatile On-the-fly Race
                                  Detection Tool in OpenMP Programs  . . . 900--930
              S. Sankaraiah and   
              Lam Hai Shuan and   
                 C. Eswaran and   
               Junaidi Abdullah   Performance Optimization of Video Coding
                                  Process on Multi-Core Platform Using GOP
                                  Level Parallelism  . . . . . . . . . . . 931--947
         Carlos H. González and   
            Basilio B. Fraguela   An Algorithm Template for Domain-Based
                                  Parallel Irregular Algorithms  . . . . . 948--967
           Steffen Ernsting and   
                 Herbert Kuchen   A Scalable Farm Skeleton for Hybrid
                                  Parallel and Distributed Programming . . 968--987
              Bert Gijsbers and   
                 Clemens Grelck   An Efficient Scalable Runtime System for
                                  Macro Data Flow Processing Using S-Net   988--1011
               M. Aldinucci and   
                   S. Campa and   
               M. Danelutto and   
              P. Kilpatrick and   
                    M. Torquati   Design patterns percolating to parallel
                                  programming framework implementation . . 1012--1031
           Michal Czapiński and   
             Chris Thompson and   
                  Stuart Barnes   Reducing Communication Overhead in
                                  Multi-GPU Hybrid Solver for 2D
                                  Laplace's Equation . . . . . . . . . . . 1032--1047

International Journal of Parallel Programming
Volume 43, Number 1, February, 2015

            John McAllister and   
           David Guevorkian and   
            Hartwig Jeschke and   
                     Mihai Sima   Guest Editorial: Special Issue on
                                  Embedded Computer Systems:
                                  Architectures, Modeling and Simulation   1--2
             Teemu Nyländen and   
            Jani Boutellier and   
              Karri Nikunen and   
            Jari Hannuksela and   
                    Olli Silvén   Low-Power Reconfigurable Miniature
                                  Sensor Nodes for Condition Monitoring    3--23
                Amine Anane and   
         El Mostapha Aboulhamid   A Transaction-Based Environment for
                                  System Modeling and Parallel Simulation  24--58
         Georgios Keramidas and   
         Chrysovalantis Datsios   Revisiting Cache Resizing  . . . . . . . 59--85
            Daniel Baudisch and   
                Klaus Schneider   Evaluation of Speculation in
                                  Out-of-Order Execution of Synchronous
                                  Dataflow Networks  . . . . . . . . . . . 86--129
       Ricardo A. Velásquez and   
             Pierre Michaud and   
                   André Seznec   BADCO: Behavioral Application-Dependent
                                  Superscalar Core Models  . . . . . . . . 130--157

International Journal of Parallel Programming
Volume 43, Number 2, April, 2015

             Markus Metzger and   
                Xinmin Tian and   
               Walfred Tedeschi   User-Guided Dynamic Data Race Detection  159--179
              Jianxun Zhang and   
                  Zhimin Gu and   
                  Yan Huang and   
              Ninghan Zheng and   
                     Xiaohan Hu   Helper Thread Prefetching Control
                                  Framework on Chip Multi-processor  . . . 180--202
               I. Z. Reguly and   
                    M. B. Giles   Finite Element Algorithms and Data
                                  Structures on Graphical Processing Units 203--239
         Matthew Williamson and   
                   K. Subramani   A Parallel Implementation for the
                                  Negative Cost Girth Problem  . . . . . . 240--259
                Zhendong Wu and   
                     Kai Lu and   
              Xiaoping Wang and   
                        Xu Zhou   Collaborative Technique for Concurrency
                                  Bug Detection  . . . . . . . . . . . . . 260--285
              Kshitij Mehta and   
                  Edgar Gabriel   Multi-Threaded Parallel I/O for OpenMP
                                  Applications . . . . . . . . . . . . . . 286--309

International Journal of Parallel Programming
Volume 43, Number 3, June, 2015

            Ching-Hsien Hsu and   
                Xiaoming Li and   
                    Xuanhua Shi   Network and Parallel Computing . . . . . 311--315
                Quanqing Xu and   
                 Liang Zhao and   
             Mingzhong Xiao and   
                   Anna Liu and   
                      Yafei Dai   YuruBackup: a Space-Efficient and Highly
                                  Scalable Incremental Backup System in
                                  the Cloud  . . . . . . . . . . . . . . . 316--338
                  Hui Huang and   
                  Ligang He and   
              Xueguang Chen and   
                 Minghui Yu and   
                     Zhiwu Wang   Automatic Composition of Heterogeneous
                                  Models Based on Semantic Web Services    339--358
               Xiaowen Feng and   
                    Hai Jin and   
                  Ran Zheng and   
                    Lei Zhu and   
                      Weiqi Dai   Accelerating Smith-Waterman Alignment
                                  of Species-Based Protein Sequences on
                                  GPU  . . . . . . . . . . . . . . . . . . 359--380
                  Edwin Sha and   
                    Li Wang and   
             Qingfeng Zhuge and   
                  Jun Zhang and   
                       Jing Liu   Power Efficiency for Hardware/Software
                                  Partitioning with Time and Area
                                  Constraints on MPSoC . . . . . . . . . . 381--402
                    Hai Jin and   
                Hanfeng Qin and   
                    Song Wu and   
                    Xuerong Guo   CCAP: a Cache Contention-Aware Virtual
                                  Machine Placement Approach for HPC Cloud 403--420
             Bernhard Egger and   
            Erik Gustafsson and   
               Changyeon Jo and   
                  Jeongseok Son   Efficiently Restoring Virtual Machines   421--439
                 Feng Liang and   
                Yunzhen Liu and   
                    Hai Liu and   
                 Shilong Ma and   
                 Bettina Schnor   A Parallel Job Execution Time Estimation
                                  Approach Based on User Submission
                                  Patterns within Computational Grids  . . 440--454
             Xianming Zhong and   
           Chengcheng Xiang and   
                    Miao Yu and   
                Zhengwei Qi and   
                   Haibing Guan   A Virtualization Based Monitoring System
                                  for Mini-intrusive Live Forensics  . . . 455--471
                    Zhao Li and   
                   Yao Shen and   
                    Bin Yao and   
                      Minyi Guo   OFScheduler: a Dynamic Network Optimizer
                                  for MapReduce in Heterogeneous Cluster   472--488
               Kenn Slagter and   
            Ching-Hsien Hsu and   
                Yeh-Ching Chung   An Adaptive and Memory Efficient
                                  Sampling Mechanism for Partitioning in
                                  MapReduce  . . . . . . . . . . . . . . . 489--507
                Songbin Liu and   
             Xiaomeng Huang and   
                 Haohuan Fu and   
              Guangwen Yang and   
                    Zhenya Song   Data Reduction Analysis for Climate Data
                                  Sets . . . . . . . . . . . . . . . . . . 508--527
                    Hai Jin and   
              Honglei Jiang and   
              Shadi Ibrahim and   
                   Xiaofei Liao   Inaccuracy in Private BitTorrent
                                  Measurements . . . . . . . . . . . . . . 528--547

International Journal of Parallel Programming
Volume 43, Number 4, August, 2015

              Dheya Mustafa and   
               Rudolf Eigenmann   PETRA: Performance Evaluation Tool for
                                  Modern Parallelizing Compilers . . . . . 549--571
             Steven Feldman and   
             Pierre LaBorde and   
                  Damian Dechev   A Wait-Free Multi-Word Compare-and-Swap
                                  Operation  . . . . . . . . . . . . . . . 572--596
               Tae-Hyuk Ahn and   
               Adrian Sandu and   
            Layne T. Watson and   
        Clifford A. Shaffer and   
                   Yang Cao and   
             William T. Baumann   A Framework to Analyze the Performance
                                  of Load Balancing Schemes for Ensembles
                                  of Stochastic Simulations  . . . . . . . 597--630
             Ryma Mahfoudhi and   
              Zaher Mahjoub and   
                    Wahid Nasri   Parallel Communication-Avoiding
                                  Algorithm for Triangular Matrix
                                  Inversion on Homogeneous and
                                  Heterogeneous Platforms  . . . . . . . . 631--655
                  Ali Jannesari   Detection of High-Level Synchronization
                                  Anomalies in Parallel Programs . . . . . 656--678

International Journal of Parallel Programming
Volume 43, Number 5, October, 2015

               Daniel Langr and   
               Pavel Tvrdík and   
               Ivan Simecek and   
                  Tomás Dytrych   Downsampling Algorithms for Large Sparse
                                  Matrices . . . . . . . . . . . . . . . . 679--702
 Alejandro Hidalgo-Paniagua and   
   Miguel A. Vega-Rodríguez and   
               Nieves Pavón and   
                 Joaquín Ferruz   A Comparative Study of Parallel RANSAC
                                  Implementations in 3D Space  . . . . . . 703--720
                 Deli Zhang and   
              Brendan Lynch and   
                  Damian Dechev   Queue-Based and Adaptive Lock Algorithms
                                  for Scalable Resource Allocation on
                                  Shared-Memory Multiprocessors  . . . . . 721--751
         Pekka Jääskeläinen and   
  Carlos Sánchez de La Lama and   
             Erik Schnetter and   
             Kalle Raiskila and   
               Jarmo Takala and   
                    Heikki Berg   pocl: a Performance-Portable OpenCL
                                  Implementation . . . . . . . . . . . . . 752--785
      María Botón-Fernández and   
   Manuel Rodríguez-Pascual and   
   Miguel A. Vega-Rodríguez and   
 Francisco Prieto-Castrillo and   
             Rafael Mayo-García   A Comparative Analysis of Adaptive
                                  Solutions for Grid Environments  . . . . 786--811
               Jakub Nalepa and   
                Miroslaw Blocho   Co-operation in the Parallel Memetic
                                  Algorithm  . . . . . . . . . . . . . . . 812--839
             Slobodan Jelić and   
                 Sören Laue and   
          Domagoj Matijević and   
               Patrick Wijerama   A Fast Parallel Implementation of a PTAS
                                  for Fractional Packing and Covering
                                  Linear Programs  . . . . . . . . . . . . 840--875
              Jose L. Jodra and   
            Ibai Gurrutxaga and   
                Javier Muguerza   Efficient 3D Transpositions in
                                  Graphics Processing Units  . . . . . . . 876--891
              Christopher Brown   High-Level Heterogeneous and
                                  Hierarchical Parallel Systems (HLPGPU
                                  2014)  . . . . . . . . . . . . . . . . . 892--893
        Ashkan Tousimojarad and   
             Wim Vanderbauwhede   Steal Locally, Share Globally  . . . . . 894--917
       Hector Ortega-Arranz and   
                Yuri Torres and   
  Arturo Gonzalez-Escribano and   
                Diego R. Llanos   Comprehensive Evaluation of a New
                                  GPU-based Approach to the Shortest Path
                                  Problem  . . . . . . . . . . . . . . . . 918--938
       Hector Ortega-Arranz and   
                Yuri Torres and   
  Arturo Gonzalez-Escribano and   
                Diego R. Llanos   TuCCompi: a Multi-layer Model for
                                  Distributed Heterogeneous Computing with
                                  Tuning Capabilities  . . . . . . . . . . 939--960

International Journal of Parallel Programming
Volume 43, Number 6, December, 2015

               Guido Araujo and   
               Jean-Luc Gaudiot   Guest Editorial: SBAC-PAD 2013 . . . . . 961--964
                  Yun R. Qu and   
                Shijie Zhou and   
             Viktor K. Prasanna   A Decomposition-Based Approach for
                                  Scalable Many-Field Packet
                                  Classification on Multi-core Processors  965--987
             Karlo G. Lenzi and   
        Felipe A. P. Figueiredo   Fully Optimized Code Block Segmentation
                                  Algorithm for LTE-Advanced . . . . . . . 988--1003
           Martin Schreiber and   
            Christoph Riesinger   Invasive Compute Balancing for
                                  Applications with Shared and Hybrid
                                  Parallelization  . . . . . . . . . . . . 1004--1027
                  Zifan Liu and   
                 Nahid Emad and   
               Soufian Ben Amor   PageRank Computation Using a Multiple
                                  Implicitly Restarted Arnoldi Method for
                                  Modeling Epidemic Spread . . . . . . . . 1028--1053
                 Guohong Li and   
              Olivier Temam and   
                     Zhenyu Liu   Cluster Cache Monitor: Leveraging the
                                  Proximity Data in CMP  . . . . . . . . . 1054--1077
                J. Lobeiras and   
                    M. Amor and   
                      R. Doallo   BPLG: a Tuned Butterfly Processing
                                  Library for GPU Architectures  . . . . . 1078--1102
         Paul-Antoine Arras and   
                    Didier Fuin   List Scheduling in Embedded Systems
                                  Under Memory Constraints . . . . . . . . 1103--1128
            Bharat Sukhwani and   
            Mathew Thoennes and   
                       Hong Min   A Hardware/Software Approach for
                                  Database Query Acceleration with FPGAs   1129--1159
           Gregorio Bernabé and   
                  Javier Cuenca   An Autotuning Engine for the 3D Fast
                                  Wavelet Transform on Clusters with
                                  Hybrid CPU + GPU Platforms . . . . . . . 1160--1191
                    Gong Su and   
                 Stephen Heisig   The Scalability of Disjoint Data
                                  Structures on a New Hardware
                                  Transactional Memory System  . . . . . . 1192--1217
    George Michelogiannakis and   
                   Xiaoye S. Li   Extending Summation Precision for
                                  Network Reduction Operations . . . . . . 1218--1243

International Journal of Parallel Programming
Volume 44, Number 1, February, 2016

            Ching-Hsien Hsu and   
             Valentina Salapura   Network and Parallel Computing . . . . . 1--4
            Chengcheng Yang and   
                Peiquan Jin and   
                      Lihua Yue   Efficient Buffer Management for Tree
                                  Indexes on Solid State Drives  . . . . . 5--25
               Ralph Duncan and   
               Peder Jungck and   
                   Kenneth Ross   Using Packet Processing Object Modules
                                  Interchangeably as Stand-Alone Programs
                                  or ``Multi-app'' Components  . . . . . . 26--45
            Mei-Ling Chiang and   
                  Bo-Wen Yu and   
                 Chi-Shian Shia   Operating System Enhancement for
                                  Supporting Massively Multiplayer Online
                                  Games in a Server Cluster  . . . . . . . 46--67
               Xiaofei Liao and   
                Rentong Guo and   
                     Danping Yu   A Phase Behavior Aware Dynamic Cache
                                  Partitioning Scheme for CMPs . . . . . . 68--86
               Byungjoo Kim and   
               Jung Eun Lee and   
                   Young J. Kim   GPU Accelerated Finding of Channels and
                                  Tunnels for a Protein Molecule . . . . . 87--108
                  Yulong Yu and   
                   Xubin He and   
                     He Guo and   
                     Yuxin Wang   A Credit-Based Load-Balance-Aware CTA
                                  Scheduling Optimization Scheme in GPGPU  109--129
                      Xi Li and   
         Anthony Ventresque and   
                    John Murphy   SOC: Satisfaction-Oriented Virtual
                                  Machine Consolidation in Enterprise Data
                                  Centers  . . . . . . . . . . . . . . . . 130--150
                 Yihua Ding and   
              James Z. Wang and   
              Pradip K. Srimani   A Linear Time Self-stabilizing Algorithm
                                  for Minimal Weakly Connected Dominating
                                  Sets . . . . . . . . . . . . . . . . . . 151--162
                   Jian Cao and   
                   Qiang Li and   
                   Yuede Ji and   
                       Yukun He   Detection of Forwarding-Based Malicious
                                  URLs in Online Social Networks . . . . . 163--180
                 Lizhi Peng and   
                    Bo Yang and   
                    Yuehui Chen   Effectiveness of Statistical Features
                                  for Early Stage Internet Traffic
                                  Identification . . . . . . . . . . . . . 181--197
                Zhaoxin Fan and   
              Shuoying Chen and   
                         Li Zha   A Text Clustering Approach of Chinese
                                  News Based on Neural Network Language
                                  Model  . . . . . . . . . . . . . . . . . 198--206

International Journal of Parallel Programming
Volume 44, Number 2, April, 2016

                      Anonymous   Editor's Note: Special Section on
                                  Data-Flow for Multicore  . . . . . . . . 207--207
             Sebastian Weis and   
               Arne Garbade and   
           Bernhard Fechner and   
              Avi Mendelson and   
             Roberto Giorgi and   
                   Theo Ungerer   Architectural Support for Fault
                                  Tolerance in a Teradevice Dataflow
                                  System . . . . . . . . . . . . . . . . . 208--232
             Dragos Sbîrlea and   
                Jun Shirako and   
                Ryan Newton and   
                   Vivek Sarkar   SCnC: Efficient Unification of Streaming
                                  with Dynamic Task Parallelism  . . . . . 233--256
          Andreas Diavastos and   
             Pedro Trancoso and   
                Mikel Luján and   
                     Ian Watson   Integrating Transactions into the
                                  Data-Driven Multi-threading Model Using
                                  the TFlux Platform . . . . . . . . . . . 257--277
              Daniel Orozco and   
               Elkin Garcia and   
               Robert Pavel and   
              Jaime Arteaga and   
                      Guang Gao   The Design and Implementation of
                                  TIDeFlow: A Dataflow-Inspired Execution
                                  Model for Parallel Loops and Task
                                  Pipelining . . . . . . . . . . . . . . . 278--307
                      Anonymous   Editor's Note: Special Section on
                                  Concurrent Systems: Status and
                                  Perspectives . . . . . . . . . . . . . . 308--308
               Nakul Jindal and   
             Victor Lotrich and   
               Erik Deumens and   
             Beverly A. Sanders   Exploiting GPUs with the Super
                                  Instruction Architecture . . . . . . . . 309--324
            W. Morven Gentleman   Concurrency Paradigms: Competitive,
                                  Coordinated, and Collaborative: Which
                                  Control Mechanisms are Appropriate?  . . 325--336
             Emre Kültürsay and   
             Kemal Ebcioglu and   
               Gürhan Küçük and   
             Mahmut T. Kandemir   Memory Partitioning in the Limit . . . . 337--380

International Journal of Parallel Programming
Volume 44, Number 3, June, 2016

                      Anonymous   Editor's Note: High-Level Parallel
                                  Programming and Applications (HLPP)  . . 381--382
                 Clemens Grelck   Guest Editorial for High-Level Parallel
                                  Programming and Applications . . . . . . 383--385
              Miguel Areias and   
                  Ricardo Rocha   A Lock-Free Hash Trie Design for
                                  Concurrent Tabled Logic Programs . . . . 386--406
           Alvaro Estebanez and   
            Diego R. Llanos and   
      Arturo Gonzalez-Escribano   New Data Structures to Handle
                                  Speculative Parallelization at Runtime   407--426
                    Ye Wang and   
                     Zhiyuan Li   GridFOR: a Domain Specific Language for
                                  Parallel Grid-Based Applications . . . . 427--448
           Antoine Tran Tan and   
                Joel Falcou and   
            Daniel Etiemble and   
                 Hartmut Kaiser   Automatic Task-Based Code Generation for
                                  High Performance Domain Specific
                                  Embedded Language  . . . . . . . . . . . 449--465
         Kiminori Matsuzaki and   
                 Reina Miyazaki   Parallel Tree Accumulations on MapReduce 466--485
              Tarek Menouer and   
             Mohamed Rezgui and   
            Bertrand Le Cun and   
             Jean-Charles Régin   Mixing Static and Dynamic Partitioning
                                  to Parallelize a Constraint Programming
                                  Solver . . . . . . . . . . . . . . . . . 486--505
             Usman Dastgeer and   
              Christoph Kessler   Smart Containers and Skeleton
                                  Programming for GPU-Based Systems  . . . 506--530
            Marco Aldinucci and   
                Sonia Campa and   
            Marco Danelutto and   
           Peter Kilpatrick and   
               Massimo Torquati   Pool Evolution: a Parallel Pattern for
                                  Evolutionary and Symbolic Computing  . . 531--551
       Tristan Aubrey-Jones and   
                  Bernd Fischer   Synthesizing MPI Implementations from
                                  Functional Data-Parallel Programs  . . . 552--573
                Jean Fortin and   
                  Frédéric Gava   BSP-Why: a Tool for Deductive
                                  Verification of BSP Algorithms with
                                  Subgroup Synchronisation . . . . . . . . 574--597
                Konrad Siek and   
         Pawel T. Wojciechowski   Atomic RMI: a Distributed Transactional
                                  Memory Framework . . . . . . . . . . . . 598--619
             José M. Andión and   
              Manuel Arenaz and   
             François Bodin and   
          Gabriel Rodríguez and   
                   Juan Touriño   Locality-Aware Automatic Parallelization
                                  for GPGPU with OpenHMPP Directives . . . 620--643
              Ali Jannesari and   
                     Felix Wolf   Automatic Generation of Unit Tests for
                                  Correlated Variables in Parallel
                                  Programs . . . . . . . . . . . . . . . . 644--662
Carlos Alberto Martínez-Angeles and   
                Haicheng Wu and   
                 Inês Dutra and   
         Vítor Santos Costa and   
          Jorge Buenabad-Chávez   Relational Learning with GPUs:
                                  Accelerating Rule Coverage . . . . . . . 663--685
             Shigeyuki Sato and   
             Kiminori Matsuzaki   A Generic Implementation of Tree
                                  Skeletons  . . . . . . . . . . . . . . . 686--707

International Journal of Parallel Programming
Volume 44, Number 4, August, 2016

            Juan Chabkinian and   
        Thomas J. E. Schwarz SJ   Fast LH$*$ . . . . . . . . . . . . . . . 709--734
             Marco Lattuada and   
           Christian Pilato and   
              Fabrizio Ferrandi   Performance Estimation of Task Graphs
                                  Based on Path Profiling  . . . . . . . . 735--771
             Srimanth Gadde and   
             William Acosta and   
          Jordan Ringenberg and   
               Robert Green and   
             Vijay Devabhaktuni   Achieving Optimal Inter-Node
                                  Communication in Graph Partitioning
                                  Using Random Selection and Breadth-First
                                  Search . . . . . . . . . . . . . . . . . 772--800
        Ayaz ul Hassan Khan and   
          Mayez Al-Mouhamed and   
              Allam Fatayer and   
           Nazeeruddin Mohammad   Optimizing the Matrix Multiplication
                                  Using Strassen and Winograd Algorithms
                                  with Limited Recursions on Many-Core . . 801--830
        Ayaz ul Hassan Khan and   
          Mayez Al-Mouhamed and   
              Allam Fatayer and   
           Nazeeruddin Mohammad   Erratum to: Optimizing the Matrix
                                  Multiplication Using Strassen and
                                  Winograd Algorithms with Limited
                                  Recursions on Many--Core . . . . . . . . 831--831
                     Ren Li and   
                   Haibo Hu and   
                    Heng Li and   
                 Yunsong Wu and   
                    Jianxi Yang   MapReduce Parallel Programming Model: A
                                  State-of-the-Art Survey  . . . . . . . . 832--866
                 Etem Deniz and   
                      Alper Sen   Using Machine Learning Techniques to
                                  Detect Parallel Patterns of
                                  Multi-threaded Applications  . . . . . . 867--900
          Giuliano Laccetti and   
              Marco Lapegna and   
                   Valeria Mele   A Loosely Coordinated Model for
                                  Heap-Based Priority Queues in Multicore
                                  Environments . . . . . . . . . . . . . . 901--921

International Journal of Parallel Programming
Volume 44, Number 5, October, 2016

                      Anonymous   Editor's Note: Special Issue on
                                  Computing Frontiers  . . . . . . . . . . 923--923
             Andreea Anghel and   
    Laura Mihaela Vasilescu and   
           Giovanni Mariani and   
              Rik Jongerius and   
                  Gero Dittmann   An Instrumentation Approach for
                                  Hardware-Agnostic Software
                                  Characterization . . . . . . . . . . . . 924--948
              Musfiq Rahman and   
              Bruce R. Childers   Asteroid: Scalable Online Memory
                                  Diagnostics for Multi-core, Multi-socket
                                  Servers  . . . . . . . . . . . . . . . . 949--974
           Giovanni Mariani and   
             Andreea Anghel and   
              Rik Jongerius and   
                  Gero Dittmann   Scaling Properties of Parallel
                                  Applications to Exascale . . . . . . . . 975--1002
             Leandro Fiorin and   
                Erik Vermij and   
           Jan van Lunteren and   
              Rik Jongerius and   
           Christoph Hagleitner   Exploring the Design Space of an
                                  Energy-Efficient Accelerator for the
                                  SKA1-Low Central Signal Processor  . . . 1003--1027
        Archimedes Pavlidis and   
            Dimitris Gizopoulos   Hierarchical Synthesis of Quantum and
                                  Reversible Architectures . . . . . . . . 1028--1053
                    Rui Han and   
              Jianfeng Zhan and   
      Jose Luis Vazquez-Poletti   SARP: Synopsis--Based Approximate
                                  Request Processing for Low Latency and
                                  Small Correctness Loss in Cloud Online
                                  Services . . . . . . . . . . . . . . . . 1054--1077
       Vassilis Vassiliadis and   
        Charalampos Chalios and   
     Konstantinos Parasyris and   
   Christos D. Antonopoulos and   
               Spyros Lalis and   
            Nikolaos Bellas and   
        Hans Vandierendonck and   
      Dimitrios S. Nikolopoulos   Exploiting Significance of Computations
                                  for Energy-Constrained Approximate
                                  Computing  . . . . . . . . . . . . . . . 1078--1098

International Journal of Parallel Programming
Volume 44, Number 6, December, 2016

                  Chao Wang and   
               Nadia Nedjah and   
          Luiza M. Mourelle and   
                      Aili Wang   Preface to the Special Issue on
                                  Sequential Code Parallelization  . . . . 1099--1101
               Nadia Nedjah and   
   Luiza de Macedo Mourelle and   
                      Chao Wang   A Parallel Yet Pipelined Architecture
                                  for Efficient Implementation of the
                                  Advanced Encryption Standard Algorithm
                                  on Reconfigurable Hardware . . . . . . . 1102--1117
                 Huang Wang and   
              Xianglan Chen and   
                   Huaping Chen   A Cross-ISA Kernelized High-Performance
                                  Parallel Emulator  . . . . . . . . . . . 1118--1141
                Ansar Javed and   
               Bibrak Qamar and   
              Mohsan Jameel and   
                Aamir Shafi and   
                Bryan Carpenter   Towards Scalable Java HPC with Hybrid
                                  and Native Communication Devices in MPJ
                                  Express  . . . . . . . . . . . . . . . . 1142--1172
               Nadia Nedjah and   
      Rogério de M. Calazan and   
   Luiza de Macedo Mourelle and   
                      Chao Wang   Parallel Implementations of the
                                  Cooperative Particle Swarm Optimization
                                  on Many-core and Multi-core
                                  Architectures  . . . . . . . . . . . . . 1173--1199
      Alessandro Pellegrini and   
          Sebastiano Peluso and   
          Francesco Quaglia and   
                 Roberto Vitali   Transparent Speculative Parallelization
                                  of Discrete Event Simulation
                                  Applications Using Global Variables  . . 1200--1247
             Xiaomeng Huang and   
                  Yufang Ni and   
                 Dexun Chen and   
                Songbin Liu and   
                 Haohuan Fu and   
                  Guangwen Yang   Czip: a Fast Lossless Compression
                                  Algorithm for Climate Data . . . . . . . 1248--1267
               Rachid Habel and   
Frédérique Silber-Chaussumier and   
           François Irigoin and   
           Elisabeth Brunet and   
                François Trahay   Combining Data and Computation
                                  Distribution Directives for Hybrid
                                  Parallel Programming: a Transformation
                                  System . . . . . . . . . . . . . . . . . 1268--1295
               Martin Frieb and   
                  Ralf Jahr and   
              Haluk Ozaktas and   
               Andreas Hugl and   
                Hans Regler and   
                   Theo Ungerer   A Parallelization Approach for Hard
                                  Real-Time Systems and Its Application on
                                  Two Industrial Programs  . . . . . . . . 1296--1336
            Alcides Fonseca and   
               Bruno Cabral and   
                João Rafael and   
                    Ivo Correia   Automatic Parallelization: Executing
                                  Sequential Programs on a Task-Based
                                  Parallel Runtime . . . . . . . . . . . . 1337--1358
          Abubakar Siddique and   
            Mohammad Ansari and   
                    Mikel Luján   Purge--Rehab: Eager Software
                                  Transactional Memory with High
                                  Performance Under Contention . . . . . . 1359--1383

International Journal of Parallel Programming
Volume 45, Number 1, February, 2017

   Vijayalakshmi Srinivasan and   
                  Yunquan Zhang   Special Issue on Network and Parallel
                                  Computing  . . . . . . . . . . . . . . . 1--3
               Jinbao Zhang and   
               Xiaofei Liao and   
                    Hai Jin and   
                   Dong Liu and   
                     Li Lin and   
                       Kao Zhao   An Optimal Page-Level Power Management
                                  Strategy in PCM--DRAM Hybrid Memory  . . 4--16
           Vesna Smiljković and   
                Osman Ünsal and   
             Adrián Cristal and   
                   Mateo Valero   Determinism at Standard-Library Level in
                                  TM-Based Applications  . . . . . . . . . 17--29
               Chencheng Ye and   
                Jacob Brock and   
                  Chen Ding and   
                        Hai Jin   Rochester Elastic Cache Utility (RECU):
                                  Unequal Cache Sharing is Good Economics  30--44
                    Song Wu and   
               Yongchang Li and   
                Xinhou Wang and   
                    Hai Jin and   
                    Hanhua Chen   Vshadow: Promoting Physical Servers into
                                  Virtualization World . . . . . . . . . . 45--66
                  Yaojie Lu and   
            Sotirios G. Ziavras   Instruction Fusion for Multiscalar and
                                  Many-Core Processors . . . . . . . . . . 67--78
                    Jing Li and   
                    Lei Liu and   
                    Yuan Wu and   
              Xiaobing Feng and   
                   Chengyong Wu   Two-Level Task Scheduling for Irregular
                                  Applications on GPU Platform . . . . . . 79--93
             Preeti Malakar and   
           Venkatram Vishwanath   Hierarchical Read--Write Optimizations
                                  for Scientific Applications with
                                  Multi-variable Structured Datasets . . . 94--108
              Maksudul Alam and   
                     Maleq Khan   Parallel Algorithms for Generating
                                  Random Networks with Given Degree
                                  Sequences  . . . . . . . . . . . . . . . 109--127
                   Yu Zhang and   
                    Huifang Cao   DMR: a Deterministic MapReduce for
                                  Multicore Systems  . . . . . . . . . . . 128--141
                 Sheng Wang and   
             Weizhong Qiang and   
                    Hai Jin and   
                   Jinfeng Yuan   CovertInspector: Identification of
                                  Shared Memory Covert Timing Channel in
                                  Multi-tenanted Cloud . . . . . . . . . . 142--156
              Jiansheng Yao and   
               Chunguang Ma and   
                    Peng Wu and   
                    Gang Du and   
                        Qi Yuan   An Opportunistic Network Coding Routing
                                  for Opportunistic Networks . . . . . . . 157--171
                    Yong Su and   
                  Zhan Wang and   
                 Zhiguo Fan and   
                  Zheng Cao and   
                 Xiaoli Liu and   
                    En Shao and   
                  Xuejun An and   
                    Ninghui Sun   HyperFatTree: a Large-Scale Tree-Based
                                  Network with Low-Radix Switches  . . . . 172--184
                Xingjing Lu and   
                  Long Chen and   
                     Zhiyuan Li   Performance Evaluation and Enhancement
                                  of Process-Based Parallel Loop Execution 185--198

International Journal of Parallel Programming
Volume 45, Number 2, April, 2017

            Marco Danelutto and   
          Susanna Pelagatti and   
               Massimo Torquati   Guest Editorial: High-Level Parallel
                                  Programming and Applications . . . . . . 199--202
                 Mehdi Goli and   
         Horacio González-Vélez   Autonomic Coordination of Skeleton-Based
                                  Applications Over CPU/GPU Multi-Core
                                  Architectures  . . . . . . . . . . . . . 203--224
           Alvaro Estebanez and   
            Diego R. Llanos and   
      Arturo Gonzalez-Escribano   Using the Xeon Phi Platform to Run
                                  Speculatively-Parallelized Codes . . . . 225--241
           Mathias Bourgoin and   
         Emmanuel Chailloux and   
               Jean-Luc Lamotte   High Level Data Structures for GPGPU
                                  Programming in a Statically Typed
                                  Language . . . . . . . . . . . . . . . . 242--261
           Rafael Sotomayor and   
        Luis Miguel Sanchez and   
         Javier Garcia Blas and   
           Javier Fernandez and   
               J. Daniel Garcia   Automatic CPU/GPU Generation of
                                  Multi-versioned OpenCL Kernels for C++
                                  Scientific Applications  . . . . . . . . 262--282
           Steffen Ernsting and   
                 Herbert Kuchen   Data Parallel Algorithmic Skeletons with
                                  Accelerator Support  . . . . . . . . . . 283--299
         Frédéric Loulergue and   
            Wadoud Bousdira and   
                  Julien Tesson   Calculating Parallel Programs in Coq
                                  Using List Homomorphisms . . . . . . . . 300--319
                Le-Duc Tung and   
                   Zhenjiang Hu   Towards Systematic Parallelization of
                                  Graph Transformations Over Pregel  . . . 320--339
               V. Allombert and   
                    F. Gava and   
                      J. Tesson   Multi-ML: Programming Multi-BSP
                                  Algorithms in ML . . . . . . . . . . . . 340--361
             Kiminori Matsuzaki   Functional Models of Hadoop MapReduce
                                  with Application to Scan . . . . . . . . 362--381
         Tiziano De Matteis and   
              Gabriele Mencagli   Parallel Patterns for Window-Based
                                  Stateful Operators on Data Streams: an
                                  Algorithmic Skeleton Approach  . . . . . 382--401
              J. Darlington and   
                A. J. Field and   
                       L. Hakim   Tackling Complexity in High Performance
                                  Computing Applications . . . . . . . . . 402--420

International Journal of Parallel Programming
Volume 45, Number 3, June, 2017

             Pierre Laborde and   
             Steven Feldman and   
                  Damian Dechev   A Wait-Free Hash Map . . . . . . . . . . 421--448
               Nuno Fachada and   
             Vitor V. Lopes and   
             Rui C. Martins and   
              Agostinho C. Rosa   Parallelization Strategies for Spatial
                                  Agent-Based Models . . . . . . . . . . . 449--481
           Milos Cvetanović and   
       Zaharije Radivojević and   
             Veljko Milutinović   Restart Optimization for Transactional
                                  Memory with Lazy Conflict Detection  . . 482--507
                Jiaquan Gao and   
                   Zejie Li and   
              Ronghua Liang and   
                      Guixia He   Adaptive Optimization
                                  $l_1$-Minimization Solvers on GPU  . . . 508--529
              Victor Garcia and   
             Alejandro Rico and   
          Carlos Villavieja and   
             Paul Carpenter and   
              Nacho Navarro and   
                   Alex Ramirez   Adaptive Runtime-Assisted Block
                                  Prefetching on Chip-Multiprocessors  . . 530--550
               Ayaz H. Khan and   
          Mayez Al-Mouhamed and   
         Muhammed Al-Mulhem and   
                  Adel F. Ahmed   RT-CUDA: a Software Tool for CUDA Code
                                  Restructuring  . . . . . . . . . . . . . 551--594
                 Yiming Han and   
        Anthony T. Chronopoulos   Scalable Loop Self-scheduling Schemes
                                  for Large-Scale Clusters and Cloud
                                  Systems  . . . . . . . . . . . . . . . . 595--611
               Asim YarKhan and   
               Jakub Kurzak and   
             Piotr Luszczek and   
                  Jack Dongarra   Porting the PLASMA Numerical Library to
                                  the OpenMP Standard  . . . . . . . . . . 612--633
          Krupa Sivakumaran and   
                 Arul Siromoney   Priority Based Yield of Shared Cache to
                                  Provide Cache QoS in Multicore Systems   634--656
                  Shuai Che and   
       Bradford M. Beckmann and   
            Steven K. Reinhardt   Programming GPGPU Graph Applications
                                  with Linear Algebra Building Blocks  . . 657--679
             Xiao-qing Wang and   
              Xian-long Jin and   
                 Da-zhi Kou and   
                   Jia-hui Chen   A Parallel Approach for the Generation
                                  of Unstructured Meshes with Billions of
                                  Elements on Distributed-Memory
                                  Supercomputers . . . . . . . . . . . . . 680--710
          Mohammed Sourouri and   
             Scott B. Baden and   
                       Xing Cai   Panda: a Compiler Framework for
                                  Concurrent CPU $+$ GPU Execution of $3$D
                                  Stencil Computations on GPU-accelerated
                                  Supercomputers . . . . . . . . . . . . . 711--729

International Journal of Parallel Programming
Volume 45, Number 4, August, 2017

                 Maozhen Li and   
                      Zhuo Tang   Guest Editorial: The Parallel Storage,
                                  Processing and Analysis for Big Data . . 731--733
                Qicong Wang and   
                Jinhao Zhao and   
                Dingxi Gong and   
                  Yehu Shen and   
                 Maozhen Li and   
                      Yunqi Lei   Parallelizing Convolutional Neural
                                  Networks for Action Event Recognition in
                                  Surveillance Videos  . . . . . . . . . . 734--759
                   Yang Liu and   
                 Lixiong Xu and   
                     Maozhen Li   The Parallelization of Back Propagation
                                  Neural Network in MapReduce and Spark    760--779
            Kien Tuong Phan and   
        Tomas Henrique Maul and   
                  Tuong Thuy Vu   An Empirical Study on Improving the
                                  Speed and Generalization of Neural
                                  Networks Using a Parallel Circuit
                                  Approach . . . . . . . . . . . . . . . . 780--796
            Hsiang-Huang Wu and   
                 Chien-Min Wang   Generalization of Large-Scale Data
                                  Processing in One MapReduce Job for
                                  Coarse-Grained Parallelism . . . . . . . 797--826
                   Yan Wang and   
                   Kenli Li and   
                       Keqin Li   Partition Scheduling on Heterogeneous
                                  Multicore Processors for
                                  Multi-dimensional Loops Applications . . 827--852
                  Zhuoer Gu and   
                  Ligang He and   
                Cheng Chang and   
                Jianhua Sun and   
                   Hao Chen and   
                  Chenlin Huang   Developing an Efficient Pattern
                                  Discovery Method for CPU Utilizations of
                                  Computers  . . . . . . . . . . . . . . . 853--878
                    Wei Liu and   
                    Lu Wang and   
                   Yuyue Du and   
                     Maozhen Li   Deadlock Property Analysis of Concurrent
                                  Programs Based on Petri Net Structure    879--898
               Aijia Ouyang and   
                  Xuyu Peng and   
                   Jing Liu and   
                   Ahmed Sallam   Hardware/Software Partitioning for
                                  Heterogeneous MPSoC Considering
                                  Communication Overhead . . . . . . . . . 899--922
                    Yang Ou and   
                  Nong Xiao and   
                   Fang Liu and   
              Zhiguang Chen and   
                   Wei Chen and   
                      Lizhou Wu   Gemini: a Novel Hardware and Software
                                  Implementation of High-performance PCIe
                                  SSD  . . . . . . . . . . . . . . . . . . 923--945
               Mingzhu Deng and   
                   Wei Chen and   
                  Nong Xiao and   
                Songping Yu and   
                      Yupeng Hu   GLE-Dedup: a Globally-Locally Even
                                  Deduplication by Request-Aware Placement
                                  for Better Read Performance  . . . . . . 946--964
                   Jiayi Du and   
                   Renfa Li and   
                 Zheng Xiao and   
                  Zhao Tong and   
                       Li Zhang   Optimization of Data Allocation on CMP
                                  Embedded System with Data Migration  . . 965--981
                   Yuyue Du and   
                    Lu Wang and   
                         Man Qi   Constructing Service Clusters Based on
                                  Service Space  . . . . . . . . . . . . . 982--1000
                  Yanan Sun and   
                   Yuyue Du and   
                     Maozhen Li   A Repair of Workflow Models Based on
                                  Mirroring Matrices . . . . . . . . . . . 1001--1020

International Journal of Parallel Programming
Volume 45, Number 5, October, 2017

          Giuliano Laccetti and   
                 Ian Foster and   
              Marco Lapegna and   
               Paul Messina and   
          Raffaele Montella and   
                 Almerico Murli   Guest Editorial for Hybrid Parallelism
                                  in New HPC Systems . . . . . . . . . . . 1021--1025
                    Ami Marowka   Energy-Aware Modeling of Scaled
                                  Heterogeneous Systems  . . . . . . . . . 1026--1045
            Moritz Kreutzer and   
                Jonas Thies and   
      Melven Röhrig-Zöllner and   
             Andreas Pieper and   
             Faisal Shahzad and   
              Martin Galgon and   
            Achim Basermann and   
              Holger Fehske and   
                Georg Hager and   
                Gerhard Wellein   GHOST: Building Blocks for High
                                  Performance Sparse Linear Algebra on
                                  Heterogeneous Systems  . . . . . . . . . 1046--1072
               Beata Bylina and   
                 Joanna Potiopa   Explicit Fourth-Order Runge--Kutta
                                  Method on Intel Xeon Phi Coprocessor . . 1073--1090
                  Pawel Czarnul   Benchmarking Performance of a Hybrid
                                  Intel Xeon/Xeon Phi System for Parallel
                                  Computation of Similarity Measures
                                  Between Large Vectors  . . . . . . . . . 1091--1107
            Andrzej Glowacz and   
                 Marcin Pietroń   Implementation of Digital Watermarking
                                  Algorithms in Parallel Hardware
                                  Accelerators . . . . . . . . . . . . . . 1108--1127
                 Jieun Choi and   
             Theodora Adufu and   
                    Yoonhee Kim   Data-Locality Aware Scientific Workflow
                                  Scheduling Methods in HPC Cloud
                                  Environments . . . . . . . . . . . . . . 1128--1141
          Raffaele Montella and   
              Giulio Giunta and   
          Giuliano Laccetti and   
              Marco Lapegna and   
             Carlo Palmieri and   
            Carmine Ferraro and   
        Valentina Pelliccia and   
              Cheol-Ho Hong and   
                Ivor Spence and   
      Dimitrios S. Nikolopoulos   On the Virtualization of CUDA Based GPU
                                  Remoting on ARM and x86 Machines in the
                                  GVirtuS Framework  . . . . . . . . . . . 1142--1163
               G. B. Barone and   
                  V. Boccia and   
               D. Bottalico and   
                R. Campagna and   
            L. Carracciuolo and   
                G. Laccetti and   
                     M. Lapegna   An Approach to Forecast Queue Time in
                                  Adaptive Scheduling: How to Mediate
                                  System Efficiency and Users Satisfaction 1164--1193
                 P. Natesan and   
            R. R. Rajalaxmi and   
                G. Gowrison and   
              P. Balasubramanie   Hadoop Based Parallel Binary Bat
                                  Algorithm for Network Intrusion
                                  Detection  . . . . . . . . . . . . . . . 1194--1213
           Rossella Arcucci and   
              Luisa D'Amore and   
         Luisa Carracciuolo and   
            Giuseppe Scotti and   
              Giuliano Laccetti   A Decomposition of the Tikhonov
                                  Regularization Functional Oriented to
                                  Exploit Hybrid Multilevel Parallelism    1214--1235
          Johannes Langguth and   
                  Qiang Lan and   
                 Namit Gaur and   
                       Xing Cai   Accelerating Detailed Tissue-Scale 3D
                                  Cardiac Simulations Using Heterogeneous
                                  CPU--Xeon Phi Computing  . . . . . . . . 1236--1258

International Journal of Parallel Programming
Volume 45, Number 6, December, 2017

               Zhiyuan Shao and   
                    Jian He and   
                 Huiming Lv and   
                        Hai Jin   FOG: a Fast Out-of-Core Graph Processing
                                  Framework  . . . . . . . . . . . . . . . 1259--1272
                    Hai Jin and   
        Aaqif Afzaal Abbasi and   
                        Song Wu   Pathfinder: Application-Aware
                                  Distributed Path Computation in Clouds   1273--1284
              Yuanzhen Geng and   
                Xuanhua Shi and   
                  Cheng Pei and   
                    Hai Jin and   
                   Wenbin Jiang   LCS: an Efficient Data Eviction Strategy
                                  for Spark  . . . . . . . . . . . . . . . 1285--1297
              Chonghua Wang and   
                  Zhiyu Hao and   
                    Lei Cui and   
              Xiangyu Zhang and   
                   Xiaochun Yun   Introspection-Based Memory Pruning for
                                  Live VM Migration  . . . . . . . . . . . 1298--1309
               Fengfeng Pan and   
               Yinliang Yue and   
                      Jin Xiong   dCompaction: Delayed Compaction for the
                                  LSM-Tree . . . . . . . . . . . . . . . . 1310--1325
           Sudakshina Dutta and   
            Dipankar Sarkar and   
                   Arvind Rawat   Synchronization Validation for
                                  Cross-Thread Dependences in Parallel
                                  Programs . . . . . . . . . . . . . . . . 1326--1365
                   Xing Fan and   
            Mostafa Mehrabi and   
              Oliver Sinnen and   
                Nasser Giacaman   Supporting Enhanced Exception Handling
                                  with OpenMP in Object-Oriented
                                  Languages  . . . . . . . . . . . . . . . 1366--1389
             Youcef Barigou and   
                  Edgar Gabriel   Maximizing Communication--Computation
                                  Overlap Through Automatic
                                  Parallelization and Run-time Tuning of
                                  Non-blocking Collective Operations . . . 1390--1416
        Guillermo Payá-Vayá and   
             Andreas Gerstlauer   Guest Editorial: Special Issue on the
                                  2015 International Conference on
                                  Embedded Computer Systems ---
                                  Architectures, Modeling and Simulation
                                  (SAMOS XV) . . . . . . . . . . . . . . . 1417--1419
                    Pei Liu and   
               Ahmed Hemani and   
                 Kolin Paul and   
             Christian Weis and   
              Matthias Jung and   
                   Norbert Wehn   3D-Stacked Many-Core Architecture for
                                  Biological Sequence Analysis Problems    1420--1460
             Yosi Ben Asher and   
                Irina Lipov and   
      Vladislav Tartakovsky and   
                       Dror Tiv   Generating ASIPs with Reduced Number of
                                  Connections to the Register-File . . . . 1461--1487
              Xinnian Zheng and   
               Lizy K. John and   
             Andreas Gerstlauer   LACross: Learning-Based Analytical
                                  Cross-Platform Performance and Power
                                  Prediction . . . . . . . . . . . . . . . 1488--1514
                  Biao Wang and   
          Diego F. de Souza and   
      Mauricio Alvarez-Mesa and   
              Chi Ching Chi and   
               Ben Juurlink and   
            Aleksandar Ilic and   
                  Nuno Roma and   
                   Leonel Sousa   GPU Parallelization of HEVC In-Loop
                                  Filters  . . . . . . . . . . . . . . . . 1515--1535
               Nabil Hallou and   
                Erven Rohou and   
                Philippe Clauss   Runtime Vectorization Transformations of
                                  Binary Code  . . . . . . . . . . . . . . 1536--1565
             Christian Weis and   
               Abdul Mutaal and   
                  Omar Naji and   
              Matthias Jung and   
            Andreas Hansson and   
                   Norbert Wehn   DRAMSpec: a High-Level DRAM Timing,
                                  Power and Area Exploration Tool  . . . . 1566--1591
       Miguel Angel Aguilar and   
        Juan Fernando Eusse and   
                Projjol Ray and   
             Rainer Leupers and   
               Gerd Ascheid and   
               Weihua Sheng and   
                Prashant Sharma   Towards Parallelism Extraction for
                                  Heterogeneous Multicore Android Devices  1592--1624
               Nuno Fachada and   
             Vitor V. Lopes and   
             Rui C. Martins and   
              Agostinho C. Rosa   Erratum to: Parallelization Strategies
                                  for Spatial Agent-Based Models . . . . . 1625--1626

International Journal of Parallel Programming
Volume 46, Number 1, February, 2018

            Sergei Gorlatch and   
                 Herbert Kuchen   Guest Editorial: High-Level Parallel
                                  Programming with Algorithmic Skeletons   1--3
                 Jan Stypka and   
             Wojciech Turek and   
          Aleksander Byrski and   
   Marek Kisiel-Dorohinicki and   
            Adam D. Barwell and   
          Christopher Brown and   
              Kevin Hammond and   
                Vladimir Janjic   The Missing Link! A New Skeleton for
                                  Evolutionary Multi-agent Systems in
                                  Erlang . . . . . . . . . . . . . . . . . 4--22
              Michael Haidl and   
                Sergei Gorlatch   High-Level Programming for Many-Cores
                                  Using C++14 and the STL  . . . . . . . . 23--41
               Fabian Wrede and   
               Steffen Ernsting   Simultaneous CPU--GPU Execution of Data
                                  Parallel Algorithmic Skeletons . . . . . 42--61
           August Ernstsson and   
                      Lu Li and   
              Christoph Kessler   SkePU 2: Flexible and Type-Safe Skeleton
                                  Programming for Heterogeneous Parallel
                                  Systems  . . . . . . . . . . . . . . . . 62--80
              Antonio Brogi and   
            Marco Danelutto and   
           Daniele De Sensi and   
              Ahmad Ibrahim and   
             Jacopo Soldani and   
               Massimo Torquati   Analysing Multiple QoS Attributes in
                                  Parallel Design Patterns-Based
                                  Applications . . . . . . . . . . . . . . 81--100
                  Ari Rasch and   
                Sergei Gorlatch   Multi-dimensional Homomorphisms and
                                  Their Implementation in OpenCL . . . . . 101--119
                 Mehdi Goli and   
         Horacio González-Vélez   Formalised Composition and Interaction
                                  for Heterogeneous Structured Parallelism 120--151
           Venkatesh Kannan and   
                 G. W. Hamilton   Functional Program Transformation for
                                  Parallelisation Using Skeletons  . . . . 152--172

International Journal of Parallel Programming
Volume 46, Number 2, April, 2018

               Jixiang Yang and   
                      Qingbi He   Scheduling Parallel Computations by Work
                                  Stealing: a Survey . . . . . . . . . . . 173--197
               Samer Arandi and   
             George Matheou and   
            Costas Kyriacou and   
           Paraskevas Evripidou   Data-Driven Thread Execution on
                                  Heterogeneous Processors . . . . . . . . 198--224
          Saurabh Hukerikar and   
            Keita Teranishi and   
             Pedro C. Diniz and   
                Robert F. Lucas   RedThreads: an Interface for
                                  Application-Level Fault
                                  Detection/Correction Through Adaptive
                                  Redundant Multithreading . . . . . . . . 225--251
                Jorge Silva and   
                 Ana Aguiar and   
                 Fernando Silva   Parallel Asynchronous Strategies for the
                                  Execution of Feature Selection
                                  Algorithms . . . . . . . . . . . . . . . 252--283
            Jawad Haj-Yihia and   
                 Yosi Ben-Asher   Software Static Energy Modeling for
                                  Modern Processors  . . . . . . . . . . . 284--312
          Sai Charan Koduru and   
                 Keval Vora and   
                    Rajiv Gupta   Software Speculation on Caching DSMs . . 313--332
             Antonino Tumeo and   
            Hubertus Franke and   
           Gianluca Palermo and   
                       John Feo   Guest Editorial: Special Issue on
                                  Computing Frontiers  . . . . . . . . . . 333--335
             Naila Farooqui and   
               Indrajit Roy and   
                  Yuan Chen and   
              Vanish Talwar and   
           Rajkishore Barik and   
                Brian Lewis and   
          Tatiana Shpeisman and   
                 Karsten Schwan   Accelerating Data Analytics on
                                  Integrated GPU Platforms via Runtime
                                  Specialization . . . . . . . . . . . . . 336--375
                    Ke Wang and   
           Elaheh Sadredini and   
                  Kevin Skadron   Hierarchical Pattern Mining with the
                                  Automata Processor . . . . . . . . . . . 376--411
               William Horn and   
                Manoj Kumar and   
                Joefon Jann and   
               José Moreira and   
            Pratap Pattnaik and   
           Mauricio Serrano and   
             Gabriel Tanase and   
                         Hao Yu   Graph Programming Interface (GPI): a
                                  Linear Algebra Programming Model for
                                  Large Scale Graph Computations . . . . . 412--440
               David Jaeger and   
           Hendrik Graupner and   
              Chris Pelchen and   
                 Feng Cheng and   
               Christoph Meinel   Fast Automated Processing and Evaluation
                                  of Identity Leaks  . . . . . . . . . . . 441--470
              Farhana Aleen and   
     Vyacheslav P. Zakharin and   
         Rakesh Krishnaiyer and   
               Garima Gupta and   
             David Kreitzer and   
             Chang-Sun Lin, Jr.   Automated Compiler Optimization of
                                  Multiple Vector Loads/Stores . . . . . . 471--503

International Journal of Parallel Programming
Volume 46, Number 3, June, 2018

            Salvatore Cuomo and   
            Marco Aldinucci and   
               Massimo Torquati   Guest Editorial for Programming Models
                                  and Algorithms for Data Analysis in HPC
                                  Systems  . . . . . . . . . . . . . . . . 505--507
                Awais Ahmad and   
                 Anand Paul and   
                  Sadia Din and   
          M. Mazhar Rathore and   
              Gyu Sang Choi and   
                  Gwanggil Jeon   Multilevel Data Processing Using
                                  Parallel Algorithms for Analyzing Big
                                  Data in High-Performance Computing . . . 508--527
        Pasquale De Michele and   
         Francesco Maiorano and   
           Livia Marcellino and   
            Francesco Piccialli   A GPU Implementation of OLPCA Method in
                                  Hybrid Environment . . . . . . . . . . . 528--542
            Puneet Jai Kaur and   
             Sakshi Kaushal and   
        Arun Kumar Sangaiah and   
            Francesco Piccialli   A Framework for Assessing Reusability
                                  Using Package Cohesion Measure in Aspect
                                  Oriented Systems . . . . . . . . . . . . 543--564
                   Gang Mei and   
            Salvatore Cuomo and   
                  Hong Tian and   
               Nengxiong Xu and   
                    Linjun Peng   MeshCleaner: a Generic and
                                  Straightforward Algorithm for Cleaning
                                  Finite Element Meshes  . . . . . . . . . 565--583
          Bastien Plazolles and   
              Didier El Baz and   
                Martin Spel and   
             Vincent Rivola and   
                  Pascal Gegout   SIMD Monte-Carlo Numerical Simulations
                                  Accelerated on GPU and Xeon Phi  . . . . 584--606
                Emilia Popa and   
               Mauro Iacono and   
                     Florin Pop   Adapting MCP and HLFET Algorithms to
                                  Multiple Simultaneous Scheduling . . . . 607--629
          M. Mazhar Rathore and   
                  Hojae Son and   
                Awais Ahmad and   
                 Anand Paul and   
                  Gwanggil Jeon   Real-Time Big Data Stream Processing
                                  Using GPU with Spark Over Hadoop
                                  Ecosystem  . . . . . . . . . . . . . . . 630--646

International Journal of Parallel Programming
Volume 46, Number 4, August, 2018

                      Anonymous   Editor's Note: Special Issue on Network
                                  and Parallel Computing for New
                                  Architectures and Applications . . . . . 647--647
                  Yuntao Lu and   
                  Chao Wang and   
                   Lei Gong and   
                    Xuehai Zhou   SparseNN: a Performance-Efficient
                                  Accelerator for Large-Scale Sparse
                                  Neural Networks  . . . . . . . . . . . . 648--659
                Sijiang Fan and   
                 Jiawei Fei and   
                        Li Shen   Accelerating Deep Learning with a
                                  Parallel Mechanism Using CPU + MIC . . . 660--673
               Chengfan Jia and   
                 Junnan Liu and   
                     Xu Jin and   
                    Han Lin and   
                    Hong An and   
                Wenting Han and   
                   Zheng Wu and   
                   Mengxian Chi   Improving the Performance of Distributed
                                  TensorFlow with RDMA . . . . . . . . . . 674--685
                 Xiangyu Ju and   
                  Quan Chen and   
              Zhenning Wang and   
                  Minyi Guo and   
                   Guang R. Gao   DCF: a Dataflow-Based Collaborative
                                  Filtering Training Algorithm . . . . . . 686--698
                Zhiwen Chen and   
                     Xin He and   
                Jianhua Sun and   
                       Hao Chen   Have Your Cake and Eat it (Too): a
                                  Concurrent Hash Table with Hardware
                                  Transactions . . . . . . . . . . . . . . 699--709
              Donghyun Gouk and   
                  Jie Zhang and   
                 Myoungsoo Jung   Enabling Realistic Logical Device
                                  Interface and Driver for NVM Express
                                  Enabled Full System Simulations  . . . . 710--721
                 Wenjie Liu and   
                   Sheng Ma and   
                 Libo Huang and   
                   Zhiying Wang   The Design of NoC-Side Memory Access
                                  Scheduling for Energy-Efficient GPGPUs   722--735
                   Yang Shi and   
                 Yanmin Zhu and   
                  Linpeng Huang   Partial-PreSET: Enhancing Lifetime of
                                  PCM-Based Main Memory with Fine-Grained
                                  SET Operations . . . . . . . . . . . . . 736--748
                   Jian Gao and   
                Hongmei Wei and   
                    Kang Yu and   
                      Peng Qing   A Scalable Runtime Fault Localization
                                  Framework for High-Performance Computing
                                  Systems  . . . . . . . . . . . . . . . . 749--761
                    Han Lin and   
                 Zhichao Su and   
              Xiandong Meng and   
                     Xu Jin and   
                 Zhong Wang and   
                Wenting Han and   
                    Hong An and   
               Mengxian Chi and   
                       Zheng Wu   Combining Hadoop with MPI to Solve
                                  Metagenomics Problems that are both
                                  Data- and Compute-intensive  . . . . . . 762--775
                    Fan Sun and   
                  Chao Wang and   
                   Lei Gong and   
                Yiwei Zhang and   
              Chongchong Xu and   
                  Yuntao Lu and   
                      Xi Li and   
                    Xuehai Zhou   UniCNN: a Pipelined Accelerator Towards
                                  Uniformed Computing for CNNs . . . . . . 776--787
                  Weiqi Dai and   
                   Yukun Du and   
                    Hai Jin and   
             Weizhong Qiang and   
                 Deqing Zou and   
                Shouhuai Xu and   
                    Zhongze Liu   RollSec: Automatically Secure Software
                                  States Against General Rollback  . . . . 788--805

International Journal of Parallel Programming
Volume 46, Number 5, October, 2018

        Francesco Piccialli and   
            Salvatore Cuomo and   
                  Gwanggil Jeon   Parallel Approaches for Data Mining in
                                  the Internet of Things Realm . . . . . . 807--811
              Santosh Kumar and   
         Sanjay Kumar Singh and   
             Ali Imam Abidi and   
           Deepanwita Datta and   
            Arun Kumar Sangaiah   Group Sparse Representation Approach for
                                  Recognition of Cattle on Muzzle Point
                                  Images . . . . . . . . . . . . . . . . . 812--837
               Xiaomin Yang and   
                     Wei Wu and   
                  Binyu Yan and   
               Huiqian Wang and   
                   Kai Zhou and   
                        Kai Liu   Infrared Image Super-Resolution with
                                  Parallel Random Forest . . . . . . . . . 838--858
                  Jun-fang Song   Vehicle Detection Using Spatial
                                  Relationship GMM for Complex Urban
                                  Surveillance in Daytime and Nighttime    859--872
              Jun-fang Song and   
              Wei-xing Wang and   
                      Feng Chen   Target Detection Based on 3D
                                  Multi-Component Model and Inverse
                                  Projection Transformation  . . . . . . . 873--885
            Muhammad Farhan and   
              Sohail Jabbar and   
             Muhammad Aslam and   
                Awais Ahmad and   
      Muhammad Munwar Iqbal and   
                 Murad Khan and   
    Martinez-Enriquez Ana Maria   A Real-Time Data Mining Approach for
                                  Interaction Analytics Assessment: IoT
                                  Based Student Interaction Framework  . . 886--903
           Vanitha Mohanraj and   
               R. Sakthivel and   
                 Anand Paul and   
                   Seungmin Rho   High Performance GCM Architecture for
                                  the Security of High Speed Network . . . 904--922
            Salvatore Cuomo and   
        Pasquale De Michele and   
           Emanuel Di Nardo and   
               Livia Marcellino   Parallel Implementation of a Machine
                                  Learning Algorithm on GPU  . . . . . . . 923--942
                     Wei Lu and   
               Xiaomin Yang and   
                     Xu Gou and   
                 Lihua Jian and   
                     Wei Wu and   
                  Gwanggil Jeon   Parallel Heat Kernel Volume Based Local
                                  Binary Pattern on Multi-Orientation
                                  Planes for Face Representation . . . . . 943--962
                Zengyu Ding and   
                   Gang Mei and   
            Salvatore Cuomo and   
               Nengxiong Xu and   
                      Hong Tian   Performance Evaluation of
                                  GPU-Accelerated Spatial Interpolation
                                  Using Radial Basis Functions for
                                  Building Explicit Surfaces . . . . . . . 963--991
                  Atif Khan and   
               Naomie Salim and   
              Haleem Farman and   
                 Murad Khan and   
                  Bilal Jan and   
                Awais Ahmad and   
                Imran Ahmed and   
                     Anand Paul   Abstractive Text Summarization based on
                                  Improved Semantic Graph Approach . . . . 992--1016

International Journal of Parallel Programming
Volume 46, Number 6, December, 2018

     Dhirendra Pratap Singh and   
                Ishan Joshi and   
            Jaytrilok Choudhary   Survey of GPU Based Sorting Algorithms   1017--1034
             Rafael Palomar and   
            Juan Gómez-Luna and   
           Faouzi A. Cheikh and   
     Joaquín Olivares-Bueno and   
                    Ole J. Elle   High-Performance Computation of Bézier
                                  Surfaces on Parallel and Heterogeneous
                                  Platforms  . . . . . . . . . . . . . . . 1035--1062
            Marcin Gorawski and   
                   Michal Lorek   Efficient Processing of Large Data
                                  Structures on GPUs: Enumeration Scheme
                                  Based Optimisation . . . . . . . . . . . 1063--1093
          Mina Hosseini Rad and   
             Ahmad Patooghy and   
                   Mahdi Fazeli   An Efficient Programming Skeleton for
                                  Clusters of Multi-Core Processors  . . . 1094--1109
            Lucia G. Menezo and   
            Valentin Puente and   
                 Pablo Abad and   
            Jose-Angel Gregorio   Mosaic: a Scalable Coherence Protocol    1110--1138
                 David Wehr and   
               Rafael Radkowski   Parallel kd-Tree Construction on the
                                  GPU with an Adaptive Split and Sort
                                  Strategy . . . . . . . . . . . . . . . . 1139--1156
                  Mengda He and   
           Viktor Vafeiadis and   
              Shengchao Qin and   
               João F. Ferreira   GPS+: Reasoning About Fences and
                                  Relaxed Atomics  . . . . . . . . . . . . 1157--1183
                      Anonymous   Editor's Note: Special Issue on Embedded
                                  Computer Systems: Architectures,
                                  Modeling and Simulation  . . . . . . . . 1184--1184
     Catalin Bogdan Ciobanu and   
          Georgi Gaydadjiev and   
           Christian Pilato and   
               Donatella Sciuto   The Case for Polymorphic Registers in
                                  Dataflow Computing . . . . . . . . . . . 1185--1219
            Christos Kyrkou and   
    Theocharis Theocharides and   
   Christos-Savvas Bouganis and   
              Marios Polycarpou   Boosting the Hardware-Efficiency of
                                  Cascade Support Vector Machines for
                                  Embedded Classification Applications . . 1220--1246
       Christopher Thompson and   
                Miles Gould and   
                   Nigel Topham   High Speed Cycle-Approximate Simulation
                                  of Embedded Cache-Incoherent and
                                  Coherent Chip-Multiprocessors  . . . . . 1247--1282
              Timo Viitanen and   
              Janne Helkala and   
             Heikki Kultala and   
         Pekka Jääskeläinen and   
               Jarmo Takala and   
            Tommi Zetterman and   
                    Heikki Berg   Variable Length Instruction Compression
                                  on Transport Triggered Architectures . . 1283--1303
   Dimitra Papagiannopoulou and   
            Andrea Marongiu and   
              Tali Moreshet and   
                Luca Benini and   
            Maurice Herlihy and   
                  R. Iris Bahar   Hardware Transactional Memory
                                  Exploration in Coherence-Free Many-Core
                                  Architectures  . . . . . . . . . . . . . 1304--1328

International Journal of Parallel Programming
Volume 47, Number 1, February, 2019

              Christopher Brown   Guest Editorial Special Issue:
                                  High-Level Programming for Heterogeneous
                                  Parallel Systems . . . . . . . . . . . . 1--2
              Javier Fresno and   
               Daniel Barba and   
  Arturo Gonzalez-Escribano and   
                Diego R. Llanos   HitFlow: a Dataflow Programming Model
                                  for Hybrid Distributed- and
                                  Shared-Memory Systems  . . . . . . . . . 3--23
      Georgios C. Chasparis and   
               Michael Rossbory   Efficient Dynamic Pinning of
                                  Parallelized Applications by Distributed
                                  Reinforcement Learning . . . . . . . . . 24--38
        Matthew B. Ashcraft and   
            Alexander Lemon and   
             David A. Penry and   
                    Quinn Snell   Compiler Optimization of Accelerator
                                  Data Transfers . . . . . . . . . . . . . 39--58
                Moria Abadi and   
       Sharon Keidar-Barner and   
               Dmitry Pidan and   
                Tatyana Veksler   Verifying Parallel Code After
                                  Refactoring Using Equivalence Checking   59--73
            Marco Danelutto and   
         Tiziano De Matteis and   
           Daniele De Sensi and   
          Gabriele Mencagli and   
           Massimo Torquati and   
            Marco Aldinucci and   
               Peter Kilpatrick   The RePhrase Extended Pattern Set for
                                  Data Intensive Parallel Computing  . . . 74--93
      Ana Moreton-Fernandez and   
  Arturo Gonzalez-Escribano and   
                Diego R. Llanos   Multi-device Controllers: a Library to
                                  Simplify Parallel Heterogeneous
                                  Programming  . . . . . . . . . . . . . . 94--113
         Wim Vanderbauwhede and   
            Syed Waqar Nabi and   
                 Cristian Urlea   Type-Driven Automated Program
                                  Transformations and Cost Modelling for
                                  Optimising Streaming Programs on FPGAs   114--136
              Hamidreza Mohebbi   Parallel SIMD CPU and GPU
                                  Implementations of Berlekamp--Massey
                                  Algorithm and Its Error Correction
                                  Application  . . . . . . . . . . . . . . 137--160

International Journal of Parallel Programming
Volume 47, Number 2, April, 2019

           J. Daniel García and   
      Arturo Gonzalez-Escribano   Guest Editorial: High-Level Parallel
                                  Programming and the Road to High
                                  Performance  . . . . . . . . . . . . . . 161--163
             Clemens Grelck and   
             Heinrich Wiesinger   Persistent Asynchronous Adaptive
                                  Specialization for Generic Array
                                  Programming  . . . . . . . . . . . . . . 164--183
                Arvid Jakobsson   Automatic Cost Analysis for Imperative
                                  BSP Programs . . . . . . . . . . . . . . 184--212
            Angeles Navarro and   
          Francisco Corbera and   
           Andres Rodriguez and   
            Antonio Vilches and   
                  Rafael Asenjo   Heterogeneous parallel_for Template for
                                  CPU--GPU Chips . . . . . . . . . . . . . 213--233
               Fabian Wrede and   
              Breno Menezes and   
                 Herbert Kuchen   Fish School Search with Algorithmic
                                  Skeletons  . . . . . . . . . . . . . . . 234--252
            Dalvan Griebler and   
         Renato B. Hoffmann and   
            Marco Danelutto and   
              Luiz G. Fernandes   High-Level and Productive Stream
                                  Parallelism for Dedup, Ferret, and Bzip2 253--271
       Javier López-Fandiño and   
              Dora B. Heras and   
         Francisco Argüello and   
               Mauro Dalla Mura   GPU Framework for Change Detection in
                                  Multitemporal Hyperspectral Images . . . 272--292
   Miguel A. Vega-Rodríguez and   
         José M. Granado-Criado   Parallel Programming in Bioinformatics:
                                  Some Interesting Approaches  . . . . . . 293--295
                 Enzo Rucci and   
      Carlos Garcia Sanchez and   
     Guillermo Botella Juan and   
          Armando De Giusti and   
             Marcelo Naiouf and   
           Manuel Prieto-Matias   SWIMM 2.0: Enhanced Smith--Waterman on
                                  Intel's Multicore and Manycore
                                  Architectures Based on AVX-512 Vector
                                  Extensions . . . . . . . . . . . . . . . 296--316
              Ferran Badosa and   
           Antonio Espinosa and   
              Cesar Acevedo and   
               Gonzalo Vera and   
                     Ana Ripoll   A History-Based Resource Manager for
                                  Genome Analysis Workflows Applications
                                  on Clusters with Heterogeneous Nodes . . 317--342

International Journal of Parallel Programming
Volume 47, Number 3, June, 2019

                 Feng Zhang and   
                Jidong Zhai and   
                  Marc Snir and   
                    Hai Jin and   
          Hironori Kasahara and   
                   Mateo Valero   Guest Editorial: Special Issue on
                                  Network and Parallel Computing for
                                  Emerging Architectures and Applications  343--344
                   Dong Han and   
             Shengyuan Zhou and   
                   Tian Zhi and   
                  Yibo Wang and   
                     Shaoli Liu   Float-Fix: an Efficient and
                                  Hardware-Friendly Data Type for Deep
                                  Neural Network . . . . . . . . . . . . . 345--359
                    Yong Yu and   
                   Tian Zhi and   
                  Xuda Zhou and   
                 Shaoli Liu and   
                 Yunji Chen and   
                   Shuyao Cheng   BSHIFT: a Low Cost Deep Neural Networks
                                  Accelerator  . . . . . . . . . . . . . . 360--372
                 Lianke Qin and   
                 Yifan Gong and   
                Tianqi Tang and   
                Yutian Wang and   
                  Jiangming Jin   Training Deep Nets with Progressive
                                  Batch Normalization on Multi-GPUs  . . . 373--387
                 Huihui Zou and   
             Shanjiang Tang and   
                      Ce Yu and   
                     Hao Fu and   
                   Yusen Li and   
                    Wenjie Tang   ASW: Accelerating Smith--Waterman
                                  Algorithm on Coupled CPU--GPU
                                  Architecture . . . . . . . . . . . . . . 388--402
                Junhong Liu and   
                     Xin He and   
                Weifeng Liu and   
                  Guangming Tan   Register-Aware Optimizations for
                                  Parallel Sparse Matrix--Matrix
                                  Multiplication . . . . . . . . . . . . . 403--417
               Donglin Chen and   
               Jianbin Fang and   
               Shizhao Chen and   
                 Chuanfu Xu and   
                     Zheng Wang   Optimizing Sparse Matrix--Vector
                                  Multiplications on an ARMv8-based
                                  Many-Core Architecture . . . . . . . . . 418--432
                   Kang Jin and   
                   Cunlu Li and   
                 Dezun Dong and   
                    Binzhang Fu   HARE: History-Aware Adaptive Routing
                                  Algorithm for Endpoint Congestion in
                                  Networks-on-Chip . . . . . . . . . . . . 433--450
                  Cheng Pan and   
                   Lan Zhou and   
                Yingwei Luo and   
               Xiaolin Wang and   
                   Zhenlin Wang   Lightweight and Accurate Memory
                                  Allocation in Key--Value Cache . . . . . 451--466
                 Mingfan Li and   
                     Ke Wen and   
                    Han Lin and   
                     Xu Jin and   
                   Zheng Wu and   
                    Hong An and   
                   Mengxian Chi   Improving the Performance of Distributed
                                  MXNet with RDMA  . . . . . . . . . . . . 467--480
                  Heyang Xu and   
                   Yang Liu and   
                    Wei Wei and   
                       Ying Xue   Migration Cost and Energy-Aware Virtual
                                  Machine Consolidation Under Cloud
                                  Environments Considering Remaining
                                  Runtime  . . . . . . . . . . . . . . . . 481--501
                    Bo Wang and   
                   Jie Tang and   
                  Rui Zhang and   
                   Wei Ding and   
                        Deyu Qi   A Dependency-Aware Storage Schema
                                  Selection Mechanism for In-Memory Big
                                  Data Computing Frameworks  . . . . . . . 502--519
                  Peng Zhao and   
                    Lei Liu and   
                    Wei Cao and   
                  Xiao Dong and   
                Jiansong Li and   
                  Xiaobing Feng   ElasticActor: an Actor System with
                                  Automatic Granularity Adjustment . . . . 520--534

International Journal of Parallel Programming
Volume 47, Number 4, August, 2019

          Nahid Farhady Ghalaty   Editorial: Special Issue on Side-Channel
                                  and Fault Analysis of High-Performance
                                  Computing Platforms  . . . . . . . . . . 535--537
              Ahmad Moghimi and   
             Jan Wichelmann and   
          Thomas Eisenbarth and   
                     Berk Sunar   MemJam: a False Dependency Attack
                                  Against Constant-Time Crypto
                                  Implementations  . . . . . . . . . . . . 538--570
                Hongyu Fang and   
       Sai Santosh Dayapule and   
                    Fan Yao and   
         Miloš Doroslovački and   
             Guru Venkataramani   PrODACT: Prefetch-Obfuscator to Defend
                                  Against Cache Timing Channels  . . . . . 571--594
                    Fan Yao and   
         Miloš Doroslovački and   
             Guru Venkataramani   Covert Timing Channels Exploiting Cache
                                  Coherence Hardware: Characterization and
                                  Defense  . . . . . . . . . . . . . . . . 595--620
   Alejandro Cabrera Aldaya and   
          Billy Bob Brumley and   
Alejandro J. Cabrera Sarmiento and   
        Santiago Sánchez-Solano   Memory Tampering Attack on Binary GCD
                                  Based Inversion Algorithms . . . . . . . 621--640
            Qiang-Sheng Hua and   
                Xuanhua Shi and   
               Yinglong Xia and   
                    Howie Huang   Guest Editorial: Special Issue on
                                  Algorithms and Systems on Big Graph
                                  Processing . . . . . . . . . . . . . . . 641--643
               Huanzhou Zhu and   
                  Ligang He and   
                Songling Fu and   
                     Rui Li and   
                    Xie Han and   
                Zhangjie Fu and   
                Yongjian Hu and   
                  Chang-Tsun Li   WolfPath: Accelerating Iterative
                                  Traversing-Based Graph Processing
                                  Algorithms on GPU  . . . . . . . . . . . 644--667
               Zhiyuan Shao and   
                Zhenjie Mei and   
              Xiaofeng Ding and   
                        Hai Jin   BlockGraphChi: Enabling Block Update in
                                  Out-of-Core Graph Processing . . . . . . 668--685
                    Deng Li and   
                Zhujun Chen and   
                      Jiaqi Liu   Analysis for Behavioral Economics in
                                  Social Networks: An Altruism-Based
                                  Dynamic Cooperation Model  . . . . . . . 686--708
                    Wei Liu and   
                    Lu Wang and   
                   Xin Feng and   
                     Man Qi and   
                   Chun Yan and   
                     Maozhen Li   Soundness Analytics of Composed Logical
                                  Workflow Nets  . . . . . . . . . . . . . 709--724
              Jianliang Gao and   
               Jianxin Wang and   
                Jianbiao He and   
                    Fengxia Yan   Against Signed Graph Deanonymization
                                  Attacks on Social Networks . . . . . . . 725--739
                Haipeng Yao and   
                  Qiyi Wang and   
                 Luyao Wang and   
              Peiying Zhang and   
                 Maozhen Li and   
                     Yunjie Liu   An Intrusion Detection Framework Based
                                  on Hybrid Multi-Level Data Mining  . . . 740--758
              Xingwang Wang and   
                Xiaohui Wei and   
                  Shang Gao and   
               Yuanyuan Liu and   
                    Zongpeng Li   A Novel Auction-Based Query Pricing
                                  Schema . . . . . . . . . . . . . . . . . 759--780

International Journal of Parallel Programming
Volume 47, Number 5--6, December, 2019

          David Niedzielski and   
              Kleanthis Psarris   An Analytical Evaluation of Data
                                  Dependence Analysis Techniques . . . . . 781--804
                   Misun Yu and   
              Joon-Sang Lee and   
                   Doo-Hwan Bae   AdaptiveLock: Efficient Hybrid Data Race
                                  Detection Based on Real-World Locking
                                  Patterns . . . . . . . . . . . . . . . . 805--837
          Andrea Crivellini and   
             Matteo Franciolini   OpenMP Parallelization Strategies for a
                                  Discontinuous Galerkin Solver  . . . . . 838--873
          Andreas Simbürger and   
                      Sven Apel   PolyJIT: Polyhedral Optimization Just in
                                  Time . . . . . . . . . . . . . . . . . . 874--906
    Mohammad Amin Irandoost and   
            Amir Masoud Rahmani   MapReduce Data Skewness Handling: a
                                  Systematic Literature Review . . . . . . 907--950
       Fabien Reumont-Locke and   
             Naser Ezzati-Jivan   Efficient Methods for Trace Analysis
                                  Parallelization  . . . . . . . . . . . . 951--972
                Pierre Zins and   
                Michel Dagenais   Tracing and Profiling Machine Learning
                                  Dataflow Applications on GPU . . . . . . 973--1013
              Ismail Akturk and   
                   Ozcan Ozturk   Adaptive Thread Scheduling in Chip
                                  Multiprocessors  . . . . . . . . . . . . 1014--1044
                      Anonymous   Editor's Note: Special Issue on
                                  High-Level Languages and Frameworks for
                                  High-Performance Computing . . . . . . . 1045--1045
             Hélène Coullon and   
                   Julien Bigot   Extensibility and Composability of a
                                  Multi-Stencil Domain Specific Framework  1046--1085
              Brad Peterson and   
              Alan Humphrey and   
                 Dan Sunderland   Automatic Halo Management for the Uintah
                                  GPU--Heterogeneous Asynchronous
                                  Many-Task Runtime  . . . . . . . . . . . 1086--1116
      José L. Quiroz-Fabián and   
          Graciela Román-Alonso   VPPE: a Novel Visual Parallel
                                  Programming Environment  . . . . . . . . 1117--1151

International Journal of Parallel Programming
Volume 48, Number 1, February, 2020

                Re'em Harel and   
               Idan Mosseri and   
                Harel Levin and   
                Lee-or Alon and   
           Matan Rusanovsky and   
                       Gal Oren   Source-to-Source Parallelization
                                  Compilers for Scientific Shared-Memory
                                  Multi-core and Accelerated
                                  Multiprocessing: Analysis, Pitfalls,
                                  Enhancement and Potential  . . . . . . . 1--31
                    Zhen Yu and   
                     Yu Zuo and   
                      Yong Zhao   Convoider: a Concurrency Bug Avoider
                                  Based on Transparent Software
                                  Transactional Memory . . . . . . . . . . 32--60
                 Wensi Yang and   
               Qingfeng Yao and   
                 Kejiang Ye and   
                 Cheng-Zhong Xu   Empirical Mode Decomposition and
                                  Temporal Convolutional Networks for
                                  Remaining Useful Life Estimation . . . . 61--79
               Donglin Chen and   
               Jianbin Fang and   
                 Chuanfu Xu and   
               Shizhao Chen and   
                     Zheng Wang   Characterizing Scalability of Sparse
                                  Matrix--Vector Multiplications on
                                  Phytium FT-2000+ . . . . . . . . . . . . 80--97
                  Shuo Chen and   
                   Zhan Shi and   
                   Dan Feng and   
                  Shang Liu and   
                  Fang Wang and   
                   Lei Yang and   
                       Ruili Yu   CSMqGraph: Coarse-Grained and
                                  Multi-external-storage Multi-queue I/O
                                  Management for Graph Computing . . . . . 98--118
                Ziyue Jiang and   
                 Yifan Gong and   
                Jidong Zhai and   
               Yu-Ping Wang and   
                    Wei Liu and   
                     Hao Wu and   
                  Jiangming Jin   Message Passing Optimization in Robot
                                  Operating System . . . . . . . . . . . . 119--136
                  Zelin Liu and   
                   Jian Cao and   
                 Yudong Tan and   
                Quanwu Xiao and   
                  Mukesh Prasad   Planning Above the API Clouds Before
                                  Flying Above the Clouds: a Real-Time
                                  Personalized Air Travel Planning
                                  Approach . . . . . . . . . . . . . . . . 137--156

International Journal of Parallel Programming
Volume 48, Number 2, April, 2020

              Gwanggil Jeon and   
                Awais Ahmad and   
            Salvatore Cuomo and   
                 Burak Kantarci   Guest Editorial: Special Issue on
                                  Emerging Technology for Software Define
                                  Network Enabled Internet of Things . . . 157--161
               Farhan Ullah and   
               Junfeng Wang and   
            Muhammad Farhan and   
              Sohail Jabbar and   
     Muhammad Kashif Naseer and   
                  Muhammad Asif   LSA Based Smart Assessment Methodology
                                  for SDN Infrastructure in IoT
                                  Environment  . . . . . . . . . . . . . . 162--177
                 Murad Khan and   
                Javed Iqbal and   
             Muhammad Talha and   
            Muhammad Arshad and   
             Muhammad Diyan and   
                      Kijun Han   Big Data Processing using Internet of
                                  Software Defined Things in Smart Cities  178--191
                  S. Ramesh and   
                 C. Yaashuwanth   QoS and QoE Enhanced Resource Allocation
                                  for Wireless Video Sensor Networks Using
                                  Hybrid Optimization Algorithm  . . . . . 192--212
             Mudassar Ahmad and   
                Usman Ahmad and   
              Md Asri Ngadi and   
        Muhammad Asif Habib and   
             Shehzad Khalid and   
                   Rehan Ashraf   Loss Based Congestion Control Module for
                                  Health Centers Deployed by Using
                                  Advanced IoT Based SDN Communication
                                  Networks . . . . . . . . . . . . . . . . 213--243
           Fakhri Alam Khan and   
                Awais Ahmad and   
                 Muhammad Imran   Energy Optimization of PR--LEACH Routing
                                  Scheme Using Distance Awareness in
                                  Internet of Things Networks  . . . . . . 244--263
                    Tao Han and   
              Miaowang Zeng and   
               Lijuan Zhang and   
            Arun Kumar Sangaiah   A Channel-Aware Duty Cycle Optimization
                                  for Node-to-Node Communications in the
                                  Internet of Medical Things . . . . . . . 264--279
           Salah A. Alabady and   
            Fadi Al-Turjman and   
                      Sadia Din   A Novel Security Model for Cooperative
                                  Virtual Networks in the IoT Era  . . . . 280--295
               E. Anna Devi and   
         J. Martin Leo Manickam   Identifying Partitions in Wireless
                                  Sensor Network . . . . . . . . . . . . . 296--309
            Hsiu-Sen Chiang and   
        Arun Kumar Sangaiah and   
                Mu-Yen Chen and   
                     Jia-Yu Liu   A Novel Artificial Bee Colony
                                  Optimization Algorithm with SVM for
                                  Bio-inspired Software-Defined Networking 310--328
               M. BalaAnand and   
             N. Karthikeyan and   
                     S. Karthik   Designing a Framework for Communal
                                  Software: Based on the Assessment Using
                                  Relation Modelling . . . . . . . . . . . 329--343
               Idrees Ahmed and   
                  Abid Khan and   
                Adeel Anjum and   
              Mansoor Ahmed and   
            Muhammad Asif Habib   A Secure Provenance Scheme for Detecting
                                  Consecutive Colluding Users in
                                  Distributed Networks . . . . . . . . . . 344--366
             Ghulam Shabbir and   
                Adeel Akram and   
      Muhammad Munwar Iqbal and   
              Sohail Jabbar and   
               Mai Alfawair and   
                Junaid Chaudhry   Network Performance Enhancement of
                                  Multi-sink Enabled Low Power Lossy
                                  Networks in SDN Based Internet of Things 367--398

International Journal of Parallel Programming
Volume 48, Number 3, June, 2020

         A. N. Gnana Jeevan and   
            M. A. Maluk Mohamed   DyTO: Dynamic Task Offloading Strategy
                                  for Mobile Cloud Computing Using
                                  Surrogate Object Model . . . . . . . . . 399--415
           A. K. Gnanasekar and   
                   V. Nagarajan   Efficient MAI Cancellation Scheme in
                                  MC-DS-CDMA Using SIC . . . . . . . . . . 416--430
            R. Saravana Ram and   
         A. Gopi Saminathan and   
                S. Arun Prakash   An Area Efficient and Low Power
                                  Consumption of Run Time Digital System
                                  Based on Dynamic Partial Reconfiguration 431--446
   Sathees Lingam Paulswamy and   
              Hariharan Kaluvan   Quadrant Based Neighbor to Sink and
                                  Neighbor to Source Routing Protocol and
                                  Alternate Node Deployment Strategies for
                                  WSN  . . . . . . . . . . . . . . . . . . 447--469
        M. A. Manazir Ahsan and   
                  Ihsan Ali and   
 Mohd Yamani Idna Bin Idris and   
             Muhammad Imran and   
                Muhammad Shoaib   Countering Statistical Attacks in
                                  Cloud-Based Searchable Encryption  . . . 470--495
             E. Laxmi Lydia and   
           P. Krishna Kumar and   
                 K. Shankar and   
       S. K. Lakshmanaprabu and   
          R. M. Vidhyavathi and   
                Andino Maseleno   Charismatic Document Clustering Through
                                  Novel $K$-Means Non-negative Matrix
                                  Factorization (KNMF) Algorithm Using Key
                                  Phrase Extraction  . . . . . . . . . . . 496--514
              R. Ramya Devi and   
       V. Vijaya Chamundeeswari   Triple DES: Privacy Preserving in Big
                                  Data Healthcare  . . . . . . . . . . . . 515--533
                Zengyu Ding and   
                   Gang Mei and   
            Salvatore Cuomo and   
                  Yixuan Li and   
                   Nengxiong Xu   Comparison of Estimating Missing Values
                                  in IoT Time Series Data Using Different
                                  Interpolation Algorithms . . . . . . . . 534--548
               P. Durgadevi and   
                  S. Srinivasan   Resource Allocation in Cloud Computing
                                  Using SFLA and Cuckoo Search
                                  Hybridization  . . . . . . . . . . . . . 549--565
                 Bowei Shan and   
                      Yong Fang   GPU Accelerated Parallel Algorithm of
                                  Sliding-Window Belief Propagation for
                                  LDPC Codes . . . . . . . . . . . . . . . 566--579
        M. A. Manazir Ahsan and   
                  Ihsan Ali and   
 Mohd Yamani Idna Bin Idris and   
             Muhammad Imran and   
                Muhammad Shoaib   Correction to: Countering Statistical
                                  Attacks in Cloud-Based Searchable
                                  Encryption . . . . . . . . . . . . . . . 580--580

International Journal of Parallel Programming
Volume 48, Number 4, August, 2020

              Christoph Kessler   Guest Editor's Note: High-Level Parallel
                                  Programming 2019 . . . . . . . . . . . . 581--582
          Christopher Brown and   
            Vladimir Janjic and   
                      J. McCall   Programming Heterogeneous Parallel
                                  Machines Using Refactoring and
                                  Monte-Carlo Tree Search  . . . . . . . . 583--602
          Christopher Brown and   
            Vladimir Janjic and   
              Kenneth MacKenzie   Refactoring GrPPI: Generic Refactoring
                                  for Generic Parallelism in C++ . . . . . 603--625
                    F. Gava and   
                     Y. Marquer   Axiomatization and Imperative
                                  Characterization of Multi-BSP
                                  Algorithms: A Q&A on a Partial Solution   626--651
             Clemens Grelck and   
                    Cédric Blom   Resource-Aware Data Parallel Array
                                  Processing . . . . . . . . . . . . . . . 652--674
                  M. Köster and   
                    J. Groß and   
                      A. Krüger   Massively Parallel Rule-Based
                                  Interpreter Execution on GPUs Using
                                  Thread Compaction  . . . . . . . . . . . 675--691
               Luca Rinaldi and   
           Massimo Torquati and   
                Marco Danelutto   Improving the Performance of Actors on
                                  Multi-cores with Parallel Patterns . . . 692--712
               Fabian Wrede and   
                 Herbert Kuchen   Towards High-Performance Code Generation
                                  for Multi-GPU Clusters Based on a
                                  Domain-Specific Language for Algorithmic
                                  Skeletons  . . . . . . . . . . . . . . . 713--728
                      Anonymous   Editor's Note  . . . . . . . . . . . . . 729--729
                   Kang Jin and   
                 Dezun Dong and   
                    Binzhang Fu   DancerFly: An Order-Aware
                                  Network-on-Chip Router On-the-Fly
                                  Mitigating Multi-path Packet Reordering  730--749
                Junmin Xiao and   
              Guizhao Zhang and   
                  Guangming Tan   Fast Data-Obtaining Algorithm for Data
                                  Assimilation with Large Data Set . . . . 750--770

International Journal of Parallel Programming
Volume 48, Number 5, October, 2020

     Ayman A. Ataher Mahmud and   
                   Satakshi and   
                    W. Jeberson   Aircraft Landing Scheduling Using
                                  Embedded Flower Pollination Algorithm    771--785
                 P. Gowtham and   
          V. P. Arunachalam and   
                     S. Karthik   An Efficient Monitoring of Real Time
                                  Traffic Clearance for an Emergency
                                  Service Vehicle Using IOT  . . . . . . . 786--812
             S. Chidambaram and   
                     A. Sumathi   Optimal Feature Selection for the
                                  Classification of Hyperspectral Imagery
                                  Using Adaptive Spectral--Spatial
                                  Clustering . . . . . . . . . . . . . . . 813--832
            M. S. Arunkumar and   
                  P. Suresh and   
                   C. Gunavathi   High Utility Infrequent Itemset Mining
                                  Using a Customized Ant Colony Algorithm  833--849
            Puneet Jai Kaur and   
                 Sakshi Kaushal   A Fuzzy Approach for Estimating Quality
                                  of Aspect Oriented Systems . . . . . . . 850--869
             Iftikhar Ahmad and   
            Rafidah Md Noor and   
                Muhammad Shoaib   A Cooperative Heterogeneous Vehicular
                                  Clustering Mechanism for Road Traffic
                                  Management . . . . . . . . . . . . . . . 870--889
                  Han Zhang and   
                Yurong Qian and   
                   Chenwei Tian   A ViBe Based Moving Targets Edge
                                  Detection Algorithm and Its Parallel
                                  Implementation . . . . . . . . . . . . . 890--908
               Seokhoon Ryu and   
              Young-Sup Lee and   
                  Seonghyun Kim   Active Control of Engine Sound Quality
                                  in a Passenger Car Using a Virtual Error
                                  Microphone . . . . . . . . . . . . . . . 909--927
                   Wei Wang and   
             Huansheng Song and   
                        Hua Cui   Landslide Multi-attitude Data
                                  Measurement of Bedding Rock Slope Model  928--939

International Journal of Parallel Programming
Volume 48, Number 6, December, 2020

                    Zeyu He and   
                Qiuli Huang and   
                  Chuliang Weng   Handling Data Skew for Aggregation in
                                  Spark SQL Using Task Stealing  . . . . . 941--956
               Kim Grüttner and   
        Philipp A. Hartmann and   
            Wolfgang Rosenstiel   A Timed-Value Stream Based ESL Timing
                                  and Power Estimation and Simulation
                                  Framework for Heterogeneous MPSoCs . . . 957--1007
                 Yuanzhe Li and   
               Loren Schwiebert   Memory-Optimized Wavefront Parallelism
                                  on GPUs  . . . . . . . . . . . . . . . . 1008--1031
                Jihyun Park and   
              Byoungju Choi and   
                 Seungyeun Jang   Dynamic Analysis Method for Concurrency
                                  Bugs in Multi-process/Multi-thread
                                  Environments . . . . . . . . . . . . . . 1032--1060

International Journal of Parallel Programming
Volume 49, Number 1, February, 2021

                    Tim Süß and   
                 Lars Nagel and   
               Thomas Soddemann   Pure Functions in C: A Small Keyword for
                                  Automatic Parallelization  . . . . . . . 1--24
                    Bo Wang and   
                   Jie Tang and   
                        Deyu Qi   A Task-Aware Fine-Grained Storage
                                  Selection Mechanism for In-Memory Big
                                  Data Computing Frameworks  . . . . . . . 25--50
               Evan Coleman and   
             Erik J. Jensen and   
                Masha Sosonkina   Fault Recovery Methods for Asynchronous
                                  Linear Solvers . . . . . . . . . . . . . 51--80
         Jean-Charles Papin and   
         Christophe Denoual and   
                 Raymond Namyst   SPAWN: An Iterative, Potentials-Based,
                                  Dynamic Scheduling and Partitioning Tool 81--103
           Raphael Beamonte and   
         Naser Ezzati-Jivan and   
             Michel R. Dagenais   Automated Generation of Model-Based
                                  Constraints for Common Multi-core and
                                  Real-Time Applications Using Execution
                                  Tracing  . . . . . . . . . . . . . . . . 104--134

International Journal of Parallel Programming
Volume 49, Number 2, April, 2021

                      Anonymous   Editor's Note: Special Issue on
                                  High-level Programming for Heterogeneous
                                  Parallel Systems (2019)  . . . . . . . . 135--135
               Adam Seewald and   
         Ulrik Pagh Schultz and   
            Henrik Skov Midtiby   Coarse-Grained Computation-Oriented
                                  Energy Modeling for Heterogeneous
                                  Parallel Embedded Systems  . . . . . . . 136--157
                  V. Pothos and   
                E. Vassalos and   
                   N. Fragoulis   Deep Learning Inference with Dynamic
                                  Graphs on Heterogeneous Platforms  . . . 158--176
            Marco Danelutto and   
          Gabriele Mencagli and   
               Peter Kilpatrick   Algorithmic Skeletons and Parallel
                                  Design Patterns in Mainstream Parallel
                                  Programming  . . . . . . . . . . . . . . 177--198
                      Anonymous   Editor's Note: Special Issue on
                                  International Embedded Systems Symposium
                                  (2019) . . . . . . . . . . . . . . . . . 199--199
              Zhongqi Cheng and   
                Tim Schmidt and   
                   Rainer Dömer   Scaled Static Analysis and IP Reuse for
                                  Out-of-Order Parallel SystemC Simulation 200--215
             Tomoaki Kawada and   
               Shinya Honda and   
                 Hiroaki Takada   TZmCFI: RTOS-Aware Control-Flow
                                  Integrity Using TrustZone for Armv8-M    216--236
            Paulo C. Santos and   
         João P. C. de Lima and   
                    Luigi Carro   Enabling Near-Data Accelerators Adoption
                                  by Through Investigation of Datapath
                                  Solutions  . . . . . . . . . . . . . . . 237--252
 Menbere Kina Tekleyohannes and   
          Vladimir Rybalkin and   
                 Andreas Dengel   iDocChip: A Configurable Hardware
                                  Architecture for Historical Document
                                  Image Processing . . . . . . . . . . . . 253--284

International Journal of Parallel Programming
Volume 49, Number 3, June, 2021

          Amartya Mukherjee and   
         Prateeti Mukherjee and   
                   Nilanjan Dey   iGridEdgeDrone: Hybrid Mobility Aware
                                  Intelligent Load Forecasting by Edge
                                  Enabled Internet of Drone Things for
                                  Smart Grid Networks  . . . . . . . . . . 285--325
            Furat Al-Obaidy and   
              Arghavan Asad and   
             Farah A. Mohammadi   A Power-Aware Hybrid Cache for
                                  Chip-Multi Processors Based on Neural
                                  Network Prediction Technique . . . . . . 326--346
                Maria Fazio and   
             Alina Buzachis and   
                Massimo Villari   A Map-Reduce Approach for the Dijkstra
                                  Algorithm in SDN Over Osmotic Computing
                                  Systems  . . . . . . . . . . . . . . . . 347--375
            Guillaume Iooss and   
           Christophe Alias and   
              Sanjay Rajopadhye   Monoparametric Tiling of Polyhedral
                                  Programs . . . . . . . . . . . . . . . . 376--409
                    Isil Öz and   
                   Sanem Arslan   Predicting the Soft Error Vulnerability
                                  of Parallel Applications Using Machine
                                  Learning . . . . . . . . . . . . . . . . 410--439
       Iraklis M. Spiliotis and   
      Charalampos Sitaridis and   
             Michael P. Bekakos   Parallel Computation of Discrete
                                  Orthogonal Moment on Block Represented
                                  Images Using OpenMP  . . . . . . . . . . 440--462
                  Biao Xing and   
                DanDan Wang and   
                      Cuihua He   Accelerating DES and AES Algorithms for
                                  a Heterogeneous Many-core Processor  . . 463--486

International Journal of Parallel Programming
Volume 49, Number 4, August, 2021

                Jörg Mische and   
               Martin Frieb and   
                   Theo Ungerer   PIMP My Many-Core: Pipeline-Integrated
                                  Message Passing  . . . . . . . . . . . . 487--505
               Sven Rheindt and   
            Sebastian Maier and   
            Andreas Herkersdorf   DySHARQ: Dynamic Software-Defined
                                  Hardware-Managed Queues for Tile-Based
                                  Architectures  . . . . . . . . . . . . . 506--540
                Sven Gesper and   
           Moritz Weißbrich and   
            Guillermo Payá-Vayá   Evaluation of Different Processor
                                  Architecture Organizations for On-Site
                                  Electronics in Harsh Environments  . . . 541--569
            Akshay Srivatsa and   
            Mostafa Mansour and   
            Andreas Herkersdorf   DynaCo: Dynamic Coherence Management
                                  for Tiled Manycore Architectures . . . . 570--599
               Rafael Stahl and   
          Alexander Hoffman and   
               Ulf Schlichtmann   DeeperThings: Fully Distributed CNN
                                  Inference on Resource-Constrained Edge
                                  Devices  . . . . . . . . . . . . . . . . 600--624

International Journal of Parallel Programming
Volume 49, Number 5, October, 2021

              Guangming Tan and   
                   Guang R. Gao   Guest Editorial: Special issue on
                                  Network and Parallel Computing for
                                  Emerging Architectures and Applications  625--627
                Jiansong Li and   
                    Wei Cao and   
                  Xiaobing Feng   Compiler-assisted Operator Template
                                  Library for DNN Accelerators . . . . . . 628--645
                Tianba Chen and   
                     Wei Li and   
                     Yunchun Li   M-DRL: Deep Reinforcement Learning
                                  Based Coflow Traffic Scheduler with MLFQ
                                  Threshold Adaption . . . . . . . . . . . 646--657
                Zhanyuan Di and   
                    En Shao and   
                  Guangming Tan   High-performance Migration Tool for Live
                                  Container in a Workflow  . . . . . . . . 658--670
                 Ziyu Zhang and   
                  Zitan Liu and   
                        Hong An   RDMA-Based Apache Storm for
                                  High-Performance Stream Data Processing  671--684
                   Yang Bai and   
               Dinghuang Hu and   
                   Xiangke Liao   CCRP: Converging Credit-Based and
                                  Reactive Protocols in Datacenters  . . . 685--699
                   Hui Dong and   
                 Jianxi Fan and   
                    Jingya Zhou   Fault-Tolerant and Unicast Performances
                                  of the Data Center Network HSDC  . . . . 700--714
                Mengshan Yu and   
               Guisheng Fan and   
                     Liang Chen   Location-based and Time-aware Service
                                  Recommendation in Mobile Edge Computing  715--731
                  Haonan Ji and   
                   Shibo Lu and   
                   Brian Vinter   Segmented Merge: A New Primitive for
                                  Parallel Sparse Matrix Computations  . . 732--744
                    Xiao Hu and   
                    Zhonghai Lu   A Configurable Hardware Architecture for
                                  Runtime Application of Network Calculus  745--760

International Journal of Parallel Programming
Volume 49, Number 6, December, 2021

               Troels Henriksen   Bounds Checking on GPU . . . . . . . . . 761--775
   Breno A. de Melo Menezes and   
              Nina Herrmann and   
  Fernando Buarque de Lima Neto   High-Level Parallel Ant Colony
                                  Optimization with Algorithmic Skeletons  776--801
             Frédéric Dabrowski   On Single-Valuedness in Textually
                                  Aligned SPMD Programs  . . . . . . . . . 802--819
         Millán A. Martínez and   
        Basilio B. Fraguela and   
              José C. Cabaleiro   A Parallel Skeleton for
                                  Divide-and-conquer Unbalanced and Deep
                                  Problems . . . . . . . . . . . . . . . . 820--845
           August Ernstsson and   
             Johan Ahlqvist and   
              Christoph Kessler   SkePU 3: Portable High-Level
                                  Programming of Heterogeneous Systems and
                                  HPC Clusters . . . . . . . . . . . . . . 846--866
            Pascal Jungblut and   
                 Karl Fürlinger   Portable Node-Level Parallelism for the
                                  PGAS Model . . . . . . . . . . . . . . . 867--885
            Vladimir Janjic and   
          Christopher Brown and   
                Adam D. Barwell   Restoration of Legacy Parallelism:
                                  Transforming Pthreads into Farm and
                                  Pipeline Patterns  . . . . . . . . . . . 886--910
             Anshu S. Anand and   
             Karthik Sayani and   
             R. K. Shyamasundar   Fortress Abstractions in X10 Framework   911--933

International Journal of Parallel Programming
Volume 50, Number 1, February, 2022

               Neeraj Gupta and   
             Mahdi Khosravy and   
          Rubén González Crespo   Lightweight Artificial Intelligence
                                  Technology for Health Diagnosis of
                                  Agriculture Vehicles: Parallel Evolving
                                  Artificial Neural Networks by Genetic
                                  Algorithm  . . . . . . . . . . . . . . . 1--26
                    Fei Yin and   
                       Feng Shi   A Comparative Survey of Big Data
                                  Computing and HPC: From a Parallel
                                  Programming Model to a Cluster
                                  Architecture . . . . . . . . . . . . . . 27--64
                  Jichi Guo and   
                    Qing Yi and   
              Kleanthis Psarris   Enhancing the Effectiveness of Inlining
                                  in Automatic Parallelization . . . . . . 65--88
               Talha Naqash and   
        Sajjad Hussain Shah and   
        Muhammad Najam Ul Islam   Statistical Analysis Based Intrusion
                                  Detection System for Ultra-High-Speed
                                  Software Defined Network . . . . . . . . 89--114
             Tongsheng Geng and   
              Marcos Amaris and   
               Jean-Luc Gaudiot   A Profile-Based AI-Assisted Dynamic
                                  Scheduling Approach for Heterogeneous
                                  Architectures  . . . . . . . . . . . . . 115--151
   Rajesh Pandian Muniasamy and   
               Rupesh Nasre and   
            N. S. Narayanaswamy   Accelerating Computation of Steiner
                                  Trees on GPUs  . . . . . . . . . . . . . 152--185

International Journal of Parallel Programming
Volume 50, Number 2, April, 2022

           Marc Reichenbach and   
              Matthias Jung and   
                 Alex Orailoglu   Guest Editorial: Special Issue on 2020
                                  IEEE International Conference on
                                  Embedded Computer Systems:
                                  Architectures, Modeling and Simulation
                                  (SAMOS 2020) . . . . . . . . . . . . . . 187--188
                  Sohan Lal and   
Bogaraju Sharatchandra Varma and   
                   Ben Juurlink   A Quantitative Study of Locality in GPU
                                  Caches for Memory-Divergent Workloads    189--216
              Lukas Steiner and   
              Matthias Jung and   
                   Norbert Wehn   DRAMSys4.0: an Open-Source Simulation
                                  Framework for In-depth DRAM Analyses . . 217--242
                  Mark Sagi and   
         Nguyen Anh Vu Doan and   
            Andreas Herkersdorf   Fine-Grained Power Modeling of Multicore
                                  Processors Using FFNNs . . . . . . . . . 243--266
                  Minyu Cui and   
        Angeliki Kritikakou and   
               Emmanuel Casseau   Energy-Efficient Partial-Duplication
                                  Task Mapping Under Multiple DVFS Schemes 267--294
            Niko Zurstraßen and   
               Lukas Jünger and   
                 Rainer Leupers   AMAIX In-Depth: a Generic Analytical
                                  Model for Deep Learning Accelerators . . 295--318

International Journal of Parallel Programming
Volume 50, Number 3--4, August, 2022

           August Ernstsson and   
       Nicolas Vandenbergen and   
              Christoph Kessler   A Deterministic Portable Parallel
                                  Pseudo-Random Number Generator for
                                  Pattern-Based Programming of
                                  Heterogeneous Parallel Systems . . . . . 319--340
               Peter Thoman and   
           Florian Tischler and   
               Thomas Fahringer   The Celerity High-level API: C++20 for
                                  Accelerator Clusters . . . . . . . . . . 341--359
          Sébastien Rivault and   
              Mostafa Bamha and   
                  Sophie Robert   A Scalable Similarity Join Algorithm
                                  Based on MapReduce and LSH . . . . . . . 360--380
             Hemalatha Eedi and   
               Sahith Karra and   
                   Rahul Utkoor   An Improved/Optimized Practical
                                  Non-Blocking PageRank Algorithm for
                                  Massive Graphs*  . . . . . . . . . . . . 381--404
        Vasilios Kelefouras and   
              Karim Djemame and   
                 Nikolaos Voros   A Methodology for Efficient Tile Size
                                  Selection for Affine Loop Kernels  . . . 405--432

International Journal of Parallel Programming
Volume 50, Number 5--6, December, 2022

              Nina Herrmann and   
   Breno A. de Melo Menezes and   
                 Herbert Kuchen   Stencil Calculations with Algorithmic
                                  Skeletons for Heterogeneous Computing
                                  Environments . . . . . . . . . . . . . . 433--453
                Júnior Löff and   
         Renato B. Hoffmann and   
             Ricardo Pieper and   
            Dalvan Griebler and   
              Luiz G. Fernandes   DSParLib: a C++ Template Library for
                                  Distributed Stream Parallelism . . . . . 454--485
Breno Augusto de Melo Menezes and   
             Herbert Kuchen and   
  Fernando Buarque de Lima Neto   Parallelization of Swarm Intelligence
                                  Algorithms: Literature Review  . . . . . 486--514
                Jash Khatri and   
              Arihant Samar and   
              Bikash Behera and   
                   Rupesh Nasre   Scaling the Maximum Flow Computation on
                                  GPUs . . . . . . . . . . . . . . . . . . 515--561
                  S. Ramesh and   
                 C. Yaashuwanth   Retraction Note: QoS and QoE Enhanced
                                  Resource Allocation for Wireless Video
                                  Sensor Networks Using Hybrid
                                  Optimization Algorithm . . . . . . . . . 562--562

International Journal of Parallel Programming
Volume 51, Number 1, February, 2023

               Nicolò Tonci and   
           Massimo Torquati and   
          Gabriele Mencagli and   
                Marco Danelutto   Distributed-Memory FastFlow Building
                                  Blocks . . . . . . . . . . . . . . . . . 1--21
               Rui S. Silva and   
                 João L. Sobral   Efficient High-Level Programming in
                                  Plain Java . . . . . . . . . . . . . . . 22--42
           Stephen Timcheck and   
                  Jeremy Buhler   Interruptible Nodes: Reducing Queueing
                                  Costs in Irregular Streaming Dataflow
                                  Applications on Wide-SIMD Architectures  43--60
           August Ernstsson and   
            Dalvan Griebler and   
              Christoph Kessler   Assessing Application Efficiency and
                                  Performance Portability in Single-Source
                                  Programming for Heterogeneous Parallel
                                  Systems  . . . . . . . . . . . . . . . . 61--82
         Ruairidh MacGregor and   
            Blair Archibald and   
                   Phil Trinder   Generic Exact Combinatorial Search at
                                  HPC Scale  . . . . . . . . . . . . . . . 83--106
               M. BalaAnand and   
             N. Karthikeyan and   
                     S. Karthik   Retraction Note: Designing a Framework
                                  for Communal Software: Based on the
                                  Assessment Using Relation Modelling  . . 107--107

International Journal of Parallel Programming
Volume 51, Number 2--3, June, 2023

                Haoran Wang and   
             Thibaut Tachon and   
                   Chong Li and   
              Sophie Robert and   
                Sébastien Limet   SMSG: Profiling-Free Parallelism
                                  Modeling for Distributed Training of DNN 109--127
             Grace Nansamba and   
           Amani Altarawneh and   
               Anthony Skjellum   A Fault-Model-Relevant Classification of
                                  Consensus Mechanisms for MPI and HPC . . 128--149
               Fabian Knorr and   
               Peter Thoman and   
               Thomas Fahringer   Declarative Data Flow in a Graph-Based
                                  Distributed Memory Runtime System  . . . 150--171
              Nina Herrmann and   
                 Herbert Kuchen   Distributed Calculations with
                                  Algorithmic Skeletons for Heterogeneous
                                  Computing Environments . . . . . . . . . 172--185
             Loïc Sylvestre and   
         Emmanuel Chailloux and   
                  Jocelyn Sérot   Accelerating OCaml Programs on FPGA  . . 186--207

International Journal of Parallel Programming
Volume 51, Number 4--5, October, 2023

             Matthew Norman and   
              Isaac Lyngaas and   
         Abhishek Bagusetty and   
                   Mark Berrill   Portable C++ Code that can Look and Feel
                                  Like Fortran Code with Yet Another
                                  Kernel Launcher (YAKL) . . . . . . . . . 209--230
             Daniel Presser and   
                 Frank Siqueira   Partitioning-Aware Performance Modeling
                                  of Distributed Graph Processing Tasks    231--255
             Vsevolod Bohaienko   Calculation of Distributed-Order
                                  Fractional Derivative on Tensor
                                  Cores-Enabled GPU  . . . . . . . . . . . 256--270
         Virginia Niculescu and   
             Frédéric Loulergue   Guest Editor's Note: High--Level
                                  Parallel Programming 2021  . . . . . . . 271--273

International Journal of Parallel Programming
Volume 51, Number 6, December, 2023

      Polychronis Velentzas and   
    Michael Vassilakopoulos and   
             Antonio Corral and   
          Christos Antonopoulos   GPU-Based Algorithms for Processing the
                                  k Nearest--Neighbor Query on Spatial
                                  Data Using Partitioning and Concurrent
                                  Kernel Execution . . . . . . . . . . . . 275--308
              Yacine Hakimi and   
            Riyadh Baghdadi and   
                 Yacine Challal   A Hybrid Machine Learning Model for Code
                                  Optimization . . . . . . . . . . . . . . 309--331

International Journal of Parallel Programming
Volume 52, Number 1--2, April, 2024

             Alex Orailoglu and   
           Marc Reichenbach and   
                  Matthias Jung   Special Issue on SAMOS 2022  . . . . . . 1--2
             Viktor Razilov and   
              Robert Wittig and   
                 Emil Matúš and   
               Gerhard Fettweis   Access Interval Prediction by Partial
                                  Matching for Tightly Coupled Memory
                                  Systems  . . . . . . . . . . . . . . . . 3--19
           Milad Kokhazadeh and   
         Georgios Keramidas and   
        Vasilios Kelefouras and   
              Iakovos Stamoulis   A Practical Approach for Employing
                                  Tensor Train Decomposition in Edge
                                  Devices  . . . . . . . . . . . . . . . . 20--39
          Christian Heidorn and   
             Muhammad Sabih and   
         Nicolai Meyerhöfer and   
       Christian Schinabeck and   
               Jürgen Teich and   
                   Frank Hannig   Hardware-Aware Evolutionary Explainable
                                  Filter Pruning for Convolutional Neural
                                  Networks . . . . . . . . . . . . . . . . 40--58
               Luise Müller and   
              Philipp Wanko and   
          Christian Haubelt and   
                 Torsten Schaub   Investigating Methods for ASPmT-Based
                                  Design Space Exploration in Evolutionary
                                  Product Design . . . . . . . . . . . . . 59--92
       Alessandro Ottaviano and   
               Robert Balas and   
           Giovanni Bambini and   
        Antonio Del Vecchio and   
               Maicol Ciani and   
               Davide Rossi and   
                Luca Benini and   
               Andrea Bartolini   ControlPULP: a RISC-V On-Chip Parallel
                                  Power Controller for Many-Core HPC
                                  Processors with FPGA-Based
                                  Hardware-In-The-Loop Power and Thermal
                                  Emulation  . . . . . . . . . . . . . . . 93--123

International Journal of Parallel Programming
Volume 52, Number 3, June, 2024

               Yingpeng Wen and   
                 Zhilin Qiu and   
               Dongyu Zhang and   
                  Dan Huang and   
                  Nong Xiao and   
                      Liang Lin   Accelerating Massively Distributed Deep
                                  Learning Through Efficient
                                  Pseudo-Synchronous Update Method . . . . 125--146
                 Alif Ahmed and   
     Farzana Ahmed Siddique and   
                  Kevin Skadron   GraphTango: a Hybrid Representation
                                  Format for Efficient Streaming Graph
                                  Updates and Analysis . . . . . . . . . . 147--170
               Fabian Knorr and   
            Philip Salzmann and   
               Peter Thoman and   
               Thomas Fahringer   Automatic Discovery of Collective
                                  Communication Patterns in Parallelized
                                  Task Graphs  . . . . . . . . . . . . . . 171--186
               Pedro Moreno and   
              Miguel Areias and   
              Ricardo Rocha and   
             Vítor Santos Costa   Yet Another Lock-Free Atom Table Design
                                  for Scalable Symbol Management in Prolog 187--206
               Nicolò Tonci and   
          Sébastien Rivault and   
              Mostafa Bamha and   
              Sophie Robert and   
            Sébastien Limet and   
               Massimo Torquati   LSH SimilarityJoin Pattern in
                                  FastFlow . . . . . . . . . . . . . . . . 207--230

International Journal of Parallel Programming
Volume 52, Number 4, August, 2024

                   Bing Wei and   
                Qiang Huang and   
                   Hui Chen and   
              Chenhao Zhang and   
                     Limin Xiao   Erasure-Coded Hybrid Writes Based on
                                  Data Delta . . . . . . . . . . . . . . . 231--252
               Björn Birath and   
           August Ernstsson and   
            John Tinnerholm and   
              Christoph Kessler   High-Level Programming of
                                  FPGA-Accelerated Systems with Parallel
                                  Patterns . . . . . . . . . . . . . . . . 253--273
              Nina Herrmann and   
           Justus Dieckmann and   
                 Herbert Kuchen   Optimizing Three-Dimensional
                                  Stencil-Operations on Heterogeneous
                                  Computing Environments . . . . . . . . . 274--297
    Achilleas Tzenetopoulos and   
       Dimosthenis Masouros and   
             Sotirios Xydis and   
              Dimitrios Soudris   Orchestration Extensions for
                                  Interference- and Heterogeneity-Aware
                                  Placement for Data-Analytics . . . . . . 298--323

International Journal of Parallel Programming
Volume 52, Number 5--6, December, 2024

              Bhanu Dwivedi and   
    Bachu Dushmanta Kumar Patro   RMOWOA: a Revamped Multi-Objective Whale
                                  Optimization Algorithm for Maximizing
                                  the Lifetime of a Network in Wireless
                                  Sensor Networks  . . . . . . . . . . . . 325--366
                  Mustafa Sanli   Design and Performance Evaluation of a
                                  Novel High-Speed Hardware Architecture
                                  for Keccak Crypto Coprocessor  . . . . . 367--379
                Songwen Pei and   
                    Wei Qin and   
                  Jianan Li and   
                 Junhao Tan and   
                   Jie Tang and   
               Jean-Luc Gaudiot   Intelligent Page Migration on
                                  Heterogeneous Memory by Using
                                  Transformer  . . . . . . . . . . . . . . 380--399
       Kevin Jude Concessao and   
Unnikrishnan Cheramangalath and   
                  Ricky Dev and   
                   Rupesh Nasre   Meerkat: a Framework for Dynamic Graph
                                  Algorithms on GPUs . . . . . . . . . . . 400--453

International Journal of Parallel Programming
Volume 53, Number 1, February, 2025

              Assia Brighen and   
               Asma Chouikh and   
              Hamida Ikhlef and   
             Hachem Slimani and   
        Abdelmounaam Rezgui and   
            Hamamache Kheddouci   Giraph-Based Distributed Algorithms for
                                  Coloring Large-Scale Graphs  . . . . . . ??
                Re'em Harel and   
                 Tal Kadosh and   
          Niranjan Hasabnis and   
            Timothy Mattson and   
               Yuval Pinter and   
                       Gal Oren   PragFormer: Data-Driven Parallel Source
                                  Code Classification with Transformers    ??
                Jianwu Long and   
                     Luping Liu   K*-Means: an Efficient Clustering
                                  Algorithm with Adaptive Decision
                                  Boundaries . . . . . . . . . . . . . . . ??
          Naw Safrin Sattar and   
          Khaled Z. Ibrahim and   
                Aydin Buluc and   
             Shaikh Arifuzzaman   DyG-DPCD: a Distributed Parallel
                                  Community Detection Algorithm for
                                  Large-Scale Dynamic Graphs . . . . . . . ??
           Stefan Branković and   
           Lazar Smiljković and   
          Predrag Obradović and   
            Miloš Radonjiić and   
                    Marko Mišić   Fast Parallel CPU--GPU Approximate
                                  Spectral Clustering for Transcriptomics
                                  Data . . . . . . . . . . . . . . . . . . ??

International Journal of Parallel Programming
Volume 53, Number 2, April, 2025

         M. Mohamed Asan Basiri   High Throughput Instruction-Data Level
                                  Parallelism Based Arithmetic Hardware
                                  Accelerator  . . . . . . . . . . . . . . ??
          Valentin Beauvais and   
               Nicolò Tonci and   
              Sophie Robert and   
                Sébastien Limet   Parallelizing RNA-Seq Analysis with
                                  BioSkel: a FastFlow Based
                                  Prototype  . . . . . . . . . . . . . . . ??
               Yaseen Zaidi and   
                  Simon Winberg   Automatic Heterogeneous Runtime Using
                                  Signal Processing Domain-Specific and
                                  Parallel Patterns  . . . . . . . . . . . ??
         Parinaz Barakhshan and   
               Rudolf Eigenmann   Advancing Interactive Parallelization:
                                  iCetus . . . . . . . . . . . . . . . . . ??
   Marco Edoardo Santimaria and   
Alberto Riccardo Martinelli and   
          Iacopo Colonnelli and   
          Barbara Cantalupo and   
           Massimo Torquati and   
                Marco Aldinucci   CAPIO-CL: The CAPIO Coordination
                                  Language . . . . . . . . . . . . . . . . ??
          Christopher Brown and   
                Adam D. Barwell   pi-par: a Dependently-Typed Parallel
                                  Language with Algorithmic Skeletons  . . ??
         Simone Frassinelli and   
              Gabriele Mencagli   Larger-Than-Memory Stateful Stream
                                  Processing with WindFlow . . . . . . . . ??
            Paolo Palazzari and   
             Marco Faltelli and   
              Francesco Iannone   FIPLib: an Image Processing Library for
                                  FPGAs Using High-Level Synthesis . . . . ??
         Ricardo Leonarczyk and   
          Gabriele Mencagli and   
                Dalvan Griebler   Self-Adaptive Micro-Batching for
                                  Low-Latency GPU-Accelerated Stream
                                  Processing . . . . . . . . . . . . . . . ??
         Michail Boulasikis and   
             Flavius Gruian and   
            Robert-Zoltán Szász   Using Machine Learning Hardware to Solve
                                  Linear Partial Differential Equations
                                  with Finite Difference Methods . . . . . ??
               William Ruys and   
                 Hochan Lee and   
                  Bozhi You and   
              Shreya Talati and   
              Jaeyoung Park and   
         James Almgren-Bell and   
                 Yineng Yan and   
           Milinda Fernando and   
                Mattan Erez and   
             Milos Gligoric and   
           Martin Burtscher and   
    Christopher J. Rossbach and   
             Keshav Pingali and   
                   George Biros   Performance Characterization of Python
                                  Runtimes for Multi-device Task Parallel
                                  Programming  . . . . . . . . . . . . . . ??