EVEREST: A design environment for extreme-scale big data analytics on heterogeneous platforms

  • Authors: C. Michael Pilato, Stanislav Bohm, Fabien Brocheton, Jerónimo Castrillón, Riccardo Cevasco, Vojtech Cima, Radim Cmar, Dionysios Diamantopoulos, Fabrizio Ferrandi, Jan Martinovic, Gianluca Palermo, Michele Paolino, Antonio Parodi, Lorenzo Pittaluga, Daniel Raho, Francesco Regazzoni, Katerina Slaninová, Christoph Hagleitner
  • Published 1 February 2021
  • Computer Science
  • 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)
High-Performance Big Data Analytics (HPDA) applications are characterized by huge volumes of distributed and heterogeneous data that require efficient computation for knowledge extraction and decision making. Designers are moving towards a tight integration of computing systems combining HPC, Cloud, and IoT solutions with artificial intelligence (AI). Matching the application and data requirements with the characteristics of the underlying hardware is a key element to improve the predictions… 

DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines

The overall DAPHNE system architecture, its key components, and the design of a vectorized execution engine for computational storage, HW accelerators, as well as local and distributed operations are described.

A Survey on Domain-Specific Memory Architectures

The major components, the common challenges, and the state-of-the-art design methodologies for building domain-specific memory architectures are described, providing a classification based on their main topics.

The Future of FPGA Acceleration in Datacenters and the Cloud

Current architectures are explored, the scalability and abstractions supported by operating systems, middleware, and virtualization are discussed, and the viability of these architectures for popular applications is reviewed, with a particular focus on deep learning and scientific computing.

Dynamically-Tunable Dataflow Architectures Based on Markov Queuing Models

This microarchitecture features online prediction based on queuing models to estimate the response time of the system and select the proper variant to meet the target throughput, enabling the creation of dynamically-tunable systems.

From Domain-Specific Languages to Memory-Optimized Accelerators for Fluid Dynamics

This paper proposes an automated tool flow from a domain-specific language (DSL) to generate accelerators for computational fluid dynamics on FPGA that simplifies the exploration of parameters and constraints such as on-chip memory usage and a decoupled optimization of memory and logic resources.

Anomaly detection to improve security of big data analytics

Hierarchical Temporal Memory (HTM) is an anomaly detection technique generic enough to achieve satisfactory performance on a wide range of applications, and is thus suitable to ease the burden of selecting an anomaly detection method.

High-Level Synthesis of Security Properties via Software-Level Abstractions

This work uses the case of dynamic information flow tracking to show how classic software-level abstractions can be used efficiently to hide implementation details from designers in high-level synthesis.

Compiler Infrastructure for Specializing Domain-Specific Memory Templates

This work proposes a multi-level compilation flow that specializes a domain-specific memory template to match data, application, and technology requirements to simplify the design of specialized hardware accelerators.

Automatic Creation of High-Bandwidth Memory Architectures from Domain-Specific Languages: The Case of Computational Fluid Dynamics

An automated tool flow from a domain-specific language (DSL) for tensor expressions generates massively parallel accelerators on HBM-equipped FPGAs, combining an MLIR-based compiler with an in-house hardware generation flow to produce systems with parallel accelerators and a specialized memory architecture that moves data efficiently.

Bulletin of Electrical Engineering and Informatics

A COVID-19 detection method is presented for the initial identification of COVID-19 hazard factors, and it is shown that FAST traits can be identified efficiently.



Trends in big data analytics

High level synthesis of RDF queries for graph analytics

A novel accelerator design that employs an adaptive and Distributed Controller architecture, and a Memory Interface Controller that supports concurrent and atomic memory operations on a multi-ported/multi-banked shared memory is proposed.

Energy-efficient Runtime Resource Management for Adaptable Multi-application Mapping

A runtime manager for firm real-time applications that generates such mapping segments based on partial solutions and aims at minimizing the overall energy consumption without deadline violations is presented.

HelmGemm: Managing GPUs and FPGAs for Transprecision GEMM Workloads in Containerized Environments

  • D. Diamantopoulos, C. Hagleitner
  • Computer Science
  • 2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
  • 2019
HelmGemm, a system-level component to support energy-efficient computing on CPU-GPU-FPGA heterogeneous architectures for container services, is proposed; it improves average energy efficiency by up to 2.3× in inter-scale containerized configurations across three representative GEMM-based cloud applications in the field of machine learning.

Efficient synthesis of graph methods: A dynamically scheduled architecture

This paper presents a novel architecture to improve the synthesis of graph methods with a Dynamic Task Scheduler (DTS), which reduces load imbalance and maximizes resource utilization, and a Hierarchical Memory Interface controller (HMI), which supports concurrent memory operations on multi-ported/multi-banked shared memories.

MLIR: A Compiler Infrastructure for the End of Moore's Law

Evaluation of MLIR as a generalized infrastructure that reduces the cost of building compilers, describing diverse use cases to show research and educational opportunities for future programming languages, compilers, execution environments, and computer architecture.

An FPGA Platform for Hyperscalers

An infrastructure is described that integrates 64 FPGAs (Kintex UltraScale XCKU060) from Xilinx in a 19" × 2U chassis and provides a bisection bandwidth of 640 Gb/s, turning the FPGA into a disaggregated standalone computing resource that can be deployed at large scale in emerging hyperscale data centers.

A runtime adaptive controller for supporting hardware components with variable latency

This paper presents an innovative lightweight controller architecture able to automatically adjust its behavior at run-time and examines the capabilities of the proposed architectural model to adapt its behavior during the execution, compared to classical ones, such as the finite state machine.

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

TVM is a compiler that exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends and automates optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of code optimizations.

Agile SoC development with open ESP

Conceived as a heterogeneous integration platform and tested through years of teaching at Columbia University, ESP supports the open-source hardware community by providing a flexible platform for agile SoC development.