- Fabio Luporini, Ana Lucia Varbanescu, +4 authors Paul H. J. Kelly
- TACO
- 2014

We study and systematically evaluate a class of composable code transformations that improve arithmetic intensity in local assembly operations, which represent a significant fraction of the execution time in finite element methods. Their performance optimization is indeed a challenging issue. Even though affine loop nests are generally present, the short…

- Florian Rathgeber, David A. Ham, +6 authors Paul H. J. Kelly
- ACM Trans. Math. Softw.
- 2016

Firedrake is a new tool for automating the numerical solution of partial differential equations. Firedrake adopts the domain-specific language for the finite element method of the FEniCS project, but with a pure Python runtime-only implementation centered on the composition of several existing and new abstractions for particular aspects of scientific…

- Alberto Bandettini, Fabio Luporini, Giovanni Viglietta
- ArXiv
- 2011

Gathering mobile robots is a widely studied problem in robotics research. This survey first introduces the related work, summarizing models and results. Then, the focus shifts to the open problem of gathering fat robots. In this context, “fat” means that the robot is not represented by a point in a two-dimensional space, but has an extent. Moreover, it can…

- Fabio Luporini, David A. Ham, Paul H. J. Kelly
- ACM Trans. Math. Softw.
- 2017

We present an algorithm for the optimization of a class of finite-element integration loop nests. This algorithm, which exploits fundamental mathematical properties of finite-element operators, is proven to achieve a locally optimal operation count. In specified circumstances the optimum achieved is global. Extensive numerical experiments demonstrate…

The advent of multi-/many-core architectures demands efficient run-time support to sustain the scalability of parallel applications. Synchronization mechanisms should be optimized to account for different scenarios, such as the interaction between threads executed on different cores as well as intra-core synchronization, i.e. involving threads executed…

- Michelle Mills Strout, Fabio Luporini, +5 authors Paul H. J. Kelly
- 2014 IEEE 28th International Parallel and…
- 2014

Many scientific applications are organized in a data parallel way: as sequences of parallel and/or reduction loops. This exposes parallelism well, but does not convert data reuse between loops into data locality. This paper focuses on this issue in parallel loops whose loop-to-loop dependence structure is data-dependent due to indirect references such as…

- Michael Lange, Navjot Kukreja, +6 authors Gerard Gorman
- 2016 6th Workshop on Python for High-Performance…
- 2016

Domain-specific languages (DSLs) have been used in a variety of fields to express complex scientific problems in a concise manner and provide automated performance optimization for a range of computational architectures. As such, DSLs provide a powerful mechanism to speed up scientific Python computation that goes beyond traditional vectorization and…

- Gheorghe-Teodor Bercea, Andrew T. T. McRae, +5 authors Paul H. J. Kelly
- ArXiv
- 2016

- Navjot Kukreja, Mathias Louboutin, Felippe Vieira, Fabio Luporini, Michael Lange, Gerard Gorman
- 2016 Sixth International Workshop on Domain…
- 2016

Domain-specific languages have successfully been used in a variety of fields to cleanly express scientific problems as well as to simplify implementation and performance optimization on different computer architectures. Although a large number of stencil languages are available, finite-difference domain-specific languages have proved challenging to design…

- Gheorghe-Teodor Bercea, Andrew T. T. McRae, +5 authors Paul H. J. Kelly
- 2016

We present a generic algorithm for numbering and then efficiently iterating over the data values attached to an extruded mesh. An extruded mesh is formed by replicating an existing mesh, assumed to be unstructured, to form layers of prismatic cells. Applications of extruded meshes include, but are not limited to, the representation of 3D high aspect ratio…