We are happy to announce that our research on hardware acceleration of the Conjugate Gradient algorithm has been written up as a publication.
Feel free to check it out any time via this [link].
It has also been submitted to a major computer science journal. We are looking forward to receiving the reviewers' opinions in the near future.
During the International Conference on Lattice Field Theory in East Lansing, MI, USA, we presented a poster describing our hardware-based accelerator for the Dirac matrix inverter. For the first time, FPGA devices were shown to be useful in the HPC context discussed at this conference. Several groups expressed interest in collaboration, including groups from Michigan State University, Massachusetts Institute of Technology, Brookhaven National Laboratory and China Normal University.
The presented results, together with an overview of further development, will be published soon. In the meantime, you can check out the poster in the results section.
Our innovative solution for accelerating the Conjugate Gradient algorithm in Lattice Quantum Chromodynamics has been accepted for a poster presentation at the 36th Annual International Symposium on Lattice Field Theory in East Lansing, USA.
We have developed an accelerator capable of performing double-precision computations with a peak performance of about 750 GFLOPS, implemented entirely in programmable logic. It is a unique project of its kind and provides a starting point for the development of a distributed and scalable High-Performance Computing platform.
Next Tuesday (22 May, 16:00, B-2-50):
- Karol Farbaniec will report on the Digilent Design Contest
- Maciej Bendec will report on his research on neural network implementations on FPGAs
- We’ll discuss the creation of the IEEE Student Branch FPGA and Networking
- We’ll discuss the schedule and details of the III Symposium
We have evaluated our first implementations of the Conjugate Gradient algorithm, an iterative solver for systems of linear equations used in Monte Carlo simulations, on the Zynq MPSoC device.
Monte Carlo-based algorithms are commonly used in theoretical physics. Such simulations are run on supercomputers; members of our team use machines located at Forschungszentrum Jülich in Germany and ICM in Poland, both ranked on the TOP500 list (positions 22 and 223, respectively).
Those computing facilities mainly employ Intel Xeon CPUs. How about accelerating those computations with FPGAs, or even replacing the CPUs with new platforms featuring FPGAs and MPSoCs? The enormous amount of resources in the currently available device series, support from integrated ARM processors, and the soon-to-be-available 3D ICs with integrated High Bandwidth Memory make the case even more convincing.
Our first results are promising and show the great potential of this technology. The first shot? About a 150x acceleration factor over a single ARM core.
We will present a more detailed study soon.
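For readers unfamiliar with the Conjugate Gradient method mentioned above, here is a minimal textbook sketch in plain Python. It is not our FPGA implementation: it assumes a small, dense, real, symmetric positive-definite matrix, whereas the lattice computations work with a large sparse operator. The names `conjugate_gradient`, `dot` and `matvec` are chosen here purely for illustration.

```python
def dot(u, v):
    """Inner product of two vectors given as lists of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A (illustrative only)."""
    x = [0.0] * len(b)      # initial guess x0 = 0
    r = list(b)             # residual r0 = b - A x0 = b
    p = list(r)             # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:          # stop once the residual norm is small
            break
        Ap = matvec(A, p)                # the dominant cost of each iteration
        alpha = rs_old / dot(p, Ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]  # next conjugate direction
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # Small test system whose exact solution is x = [1/11, 7/11].
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```

Each iteration is dominated by one matrix-vector product and a few vector updates, which is the kind of regular arithmetic that tends to map well onto parallel hardware.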
Today we ran our first full implementation of an LSTM neural network on the Xilinx Zynq MPSoC ZCU102 platform!
There are 8 hardware accelerators that help the ARM processor perform successive stages of the network evaluation.
We have achieved over 20x acceleration compared to a pure-software implementation.
The project has been developed entirely in the SDSoC environment.