We have just received two Xilinx Alveo U280 cards as a donation from the Xilinx University Program.
Many thanks to Xilinx for this generous donation!
These two cards, together with the Alveo U50 we bought earlier, will be the foundation for the FPGA-accelerated computing infrastructure we are designing within the Hardware Acceleration Lab.
These two state-of-the-art accelerator cards will allow us to experiment with our Lattice QCD kernels in a multi-node environment, where networking and efficient kernel-to-kernel communication over the PCIe bus and QSFP+ will be key elements.
It took some time and a lot of research and development to reach this very important milestone!
Aiming to use the latest technologies, we have implemented, compiled, and successfully run the accelerated Conjugate Gradient solver on the Alveo U280 (shell 2019.2) using Vitis, both officially released just a couple of days ago.
In this design we are evaluating techniques such as:
Integrated HBM memory
Fully streamlined kernel
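For readers less familiar with the Conjugate Gradient method mentioned above, here is a minimal textbook sketch in plain Python. It is not our FPGA kernel: the dense 2x2 matrix, the helper names (`matvec`, `dot`, `conjugate_gradient`), and the tolerance are illustrative assumptions only. In the accelerated solver, the matrix-vector product is the expensive step that gets offloaded to the device.

```python
# Textbook Conjugate Gradient for a small symmetric positive-definite system.
# Illustrative CPU sketch only; all names and values here are hypothetical.

def matvec(A, x):
    """Dense matrix-vector product; stands in for the operator application."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    x = [0.0] * len(b)      # initial guess x = 0
    r = list(b)             # residual r = b - A x, which equals b for x = 0
    p = list(r)             # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:  # stop once the residual norm is small enough
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)                              # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr                                   # direction update factor
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]   # small SPD test matrix (assumed for illustration)
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
print(x)
```

Each iteration needs exactly one operator application plus a few vector updates and reductions, which is why the operator kernel dominates the run time and is the natural candidate for acceleration.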
We have managed to fit three instances of the kernel into the device, consuming about 70% of the available resources.
Each kernel instance runs with an initiation interval of 2 clock cycles at 300 MHz, which gives almost 600 GFLOP/s for the entire solution!
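The arithmetic behind this figure can be sketched as follows. The number of instances, the clock frequency, and the initiation interval come from the post; the count of 1320 floating-point operations per lattice site is an assumption (it is the figure conventionally quoted for the Wilson Dirac operator and is not stated here), as is the idea that each pipeline iteration produces one lattice site. The function name `peak_gflops` is hypothetical.

```python
# Back-of-the-envelope throughput estimate for a pipelined kernel.

def peak_gflops(instances, clock_hz, initiation_interval, flops_per_result):
    """Peak rate in GFLOP/s for `instances` identical pipelines."""
    # A pipeline with initiation interval II accepts a new input every II cycles.
    results_per_second = clock_hz / initiation_interval
    return instances * results_per_second * flops_per_result / 1e9

# ASSUMPTION: 1320 flops per lattice site (conventional Wilson Dirac count),
# with one lattice site produced per pipeline iteration.
FLOPS_PER_SITE = 1320

estimate = peak_gflops(
    instances=3,             # kernel instances on the device (from the post)
    clock_hz=300e6,          # 300 MHz kernel clock (from the post)
    initiation_interval=2,   # clock cycles between iterations (from the post)
    flops_per_result=FLOPS_PER_SITE,
)
print(f"{estimate:.0f} GFLOP/s")
```

Under these assumptions the estimate lands just below 600 GFLOP/s, consistent with the figure quoted above. It is a peak number that ignores memory stalls and host transfers.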
We are pleased to announce that our paper “Investigating the Dirac operator evaluation on FPGAs”, in which we describe our research on hardware-accelerated computations, has been published in Supercomputing Frontiers and Innovations, vol. 6, no. 2 (2019).
Feel free to check it out at this [link].
There you can find a description of the performance of our kernel, which was developed with the SDAccel software package and evaluated on Xilinx Alveo U250 platforms.
Soon we will present a detailed study of various algorithm architectures aimed at achieving the highest performance and taking advantage of the embedded HBM in the Alveo U280.