It took some time, a lot of research and development to make this very important step!
Trying to employ latest technologies, we have implemented, compiled and successfully run the accelerated Conjugate Gradient solver on Alveo U280 (shell 2019.2) using Vitis, both officially released just couple days ago.
In this design we are evaluating some of techniques like:
Integrated HBM memory
Fully streamlined kernel
We have managed to fit 3 instances of the kernel into the device, consuming about 70% of available resources.
Each kernel instance works with Iteration Interval of 2 clock cycles at 300 MHz, that gives almost 600 GFLOPs for the entire solution!
We are pleased to announce that our paper “Investigating the Dirac operator evaluation on FPGAs”, where we describe our research on running accelerated computations on hardware has been published in Supercomputing Frontiers and Innovations vol. 6 no. 2 2019.
Feel free to check it out any time under this [link].
You can find there a description of our kernel performance, evaluated on Xilinx Alveo U250 platforms and developed with SDAccel software package.
Soon we will present a detailed study of various algorithm architectures in order to achieve highest performance and profit from embedded HBM in Alveo U280.
February was a month of very intensive work to prepare our straw detector and Data Acquisition System for tests with proton beam from COSY accelerator at Juelich Forschungszentrum in Germany.
Together with 5 other groups we had granted one week of beamtime to evaluate the detectors, electronics and software.
It was the first time we evaluated operation of the entire, small scale detector system for PANDA experiment. Three detector subsystems: Forward Tracker, Electromagnetic Calorimeter and Time-of-Flight, each with their own readout system, were synchronized with SODANet system and generated data was processed by a set of 3 Compute Node modules for burst building and preliminary preprocessing.
It was also the possibility to test the data preprocessing system based on Xilinx ZCU102 platform. The board receives data streams from the digitizing boards and recovers track candidates, rejecting empty events.
You are welcome to join the seminar by dr. Piotr Korcyl on Tuesday 15th January 2019 at 12:15 in room D-2-02 at the Faculty of Physics, Astronomy and Applied Computer Science of Jagiellonian University.
The talk will cover:
implementation of Conjugate Gradient algorithm on Xilinx Zynq MPSoC
design methodologies to accelerate computations on FPGA platforms using High Level Synthesis
memory management and data transport infrastructure
prospects to design an FPGA High Performance Computing platform.
During the International Conference on Lattice Field Theory in East Lansing, MI, USA we presented a poster describing our hardware based accelerator for the Dirac matrix inverter. For the first time FPGA devices were shown to be useful in the HPC context discussed at this conference. Several groups expressed their interest in collaboration including the groups from: Michigan State University, Massachusetts Institute of Technology, Brookhaven National Laboratory and China Normal University.
Presented results together with an overview of further development will soon be published. In the meantime you can check out the poster in the results section.