It took some time and a lot of research and development to reach this important step!
Aiming to use the latest technologies, we have implemented, compiled, and successfully run the accelerated Conjugate Gradient solver on the Alveo U280 (shell 2019.2) using Vitis, both officially released just a couple of days ago.
In this design we are evaluating techniques such as:
- Integrated HBM memory
- Fully streamlined kernel
We have managed to fit 3 instances of the kernel on the device, consuming about 70% of the available resources.
Each kernel instance runs with an initiation interval (II) of 2 clock cycles at 300 MHz, which gives almost 600 GFLOPS for the entire solution!
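
To make the arithmetic behind that figure easier to follow, here is a minimal sketch that works backwards from the quoted numbers. The post does not say how many floating-point operations each kernel performs per pipeline iteration, so the sketch only derives what the stated figures imply, treating 600 GFLOPS as an upper bound since the result is "almost" that. The function and constant names are hypothetical, chosen for illustration.

```python
# Figures quoted in the post.
KERNEL_INSTANCES = 3
CLOCK_HZ = 300e6          # 300 MHz
CYCLES_PER_ITERATION = 2  # a new pipeline iteration starts every 2 clock cycles
QUOTED_FLOPS = 600e9      # "almost 600 GFLOPS", used here as an upper bound


def iterations_per_second(instances, clock_hz, cycles_per_iteration):
    """Pipeline iterations started per second across all kernel instances."""
    return instances * clock_hz / cycles_per_iteration


def implied_flops_per_iteration(total_flops, instances, clock_hz, cycles_per_iteration):
    """Floating-point operations each iteration must perform to reach total_flops.

    This is an inferred value, not a number stated in the post.
    """
    return total_flops / iterations_per_second(instances, clock_hz, cycles_per_iteration)


rate = iterations_per_second(KERNEL_INSTANCES, CLOCK_HZ, CYCLES_PER_ITERATION)
bound = implied_flops_per_iteration(
    QUOTED_FLOPS, KERNEL_INSTANCES, CLOCK_HZ, CYCLES_PER_ITERATION
)

print(f"Iterations per second (all instances): {rate:.3e}")
print(f"Implied FLOPs per iteration (upper bound): {bound:.0f}")
```

Under these assumptions the three instances together start 450 million iterations per second, so each iteration would need to perform somewhat fewer than about 1,333 floating-point operations to land just under 600 GFLOPS.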