Deep Learning | Hande Alemdar

Scalable High-Performance Architecture for Convolutional Ternary Neural Networks on FPGA

To demonstrate the efficiency of our proposal, we implement high-complexity convolutional neural networks on the Xilinx Virtex-7 VC709 FPGA board. While reaching better accuracy than comparable designs, we can target either high throughput or low power. We measure a throughput of up to 27,000 fps at ≈7 W, or up to 8.36 TMAC/s at ≈13 W.
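
To put the two reported operating points on a common footing, the throughput figures can be divided by the corresponding power draw. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted above. The helper name `per_watt` is hypothetical, and because both power figures are approximate (≈7 W and ≈13 W), the results are rough estimates rather than measured values.

```python
def per_watt(rate: float, power_w: float) -> float:
    """Divide a throughput figure by a power draw in watts (hypothetical helper)."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return rate / power_w


# Figures quoted in the abstract; the power values are approximate.
fps_per_watt = per_watt(27_000, 7.0)      # frames per second per watt
tmacs_per_watt = per_watt(8.36, 13.0)     # TMAC/s per watt

print(f"{fps_per_watt:.0f} fps/W")
print(f"{tmacs_per_watt:.2f} TMAC/s/W")
```

Since one watt is one joule per second, these ratios can also be read as roughly 3,900 frames per joule and about 0.64 TMAC per joule.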

Ternary Neural Networks for Resource-Efficient AI Applications

We evaluate ternary neural networks (TNNs) on several benchmark datasets and demonstrate up to 3.1× better energy efficiency than the state of the art while also improving accuracy.