Angie Wang (Berkeley)

Mar 23, 2-3PM, 531 Cory.

Title and Abstract

An Agile Approach to FFTs and Hardware DSP Generation
FFTs with a broad range of performance requirements are needed for many modern-day applications, ranging from medical imaging and machine learning to communication and radio astronomy. DSP algorithms such as the FFT are translated into application-specific hardware via primitives composed in various ways for different architectural realizations. Despite sharing underlying algorithms and hardware constructs, designs are often difficult to reuse, leading to redeveloping and reverifying conceptually similar instances. Hardware generators are attractive solutions for effectively balancing fine-grained control of implementation details with simple, retargetable hardware descriptions.

A memory-based, runtime-reconfigurable 2^n 3^m 5^k 7^l FFT generator has been created using the Chisel hardware construction language for software-defined radio research. The generator uses a conflict-free, in-place, multi-bank SRAM design, and exploits the duality of decimation-in-frequency and decimation-in-time FFTs to support continuous data flow with 2N memory. A 0.37-mm^2 LTE/Wi-Fi compatible FFT instance with performance and area comparable to the state of the art has been integrated as an accelerator within a complete RISC-V processing system. The measured 16-nm FinFET chip, designed and taped out within 1 month of PDK delivery, runs up to 940 MHz and consumes 0.46 to 22.6 mW of power when running benchmarks with Wi-Fi and LTE symbol lengths and data rates.
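As a rough illustration of what a "2^n 3^m 5^k 7^l" size means, the sketch below checks whether a transform length factors entirely over the radices 2, 3, 5, and 7. The function name smooth_exponents is hypothetical and is not part of the Chisel generator; 864, 800, and 675 are the FFT lengths quoted later in this abstract, while 64 and 22 are illustrative values chosen here.

```python
def smooth_exponents(n, radices=(2, 3, 5, 7)):
    """Return {radix: exponent} if n factors entirely over radices, else None.

    Hypothetical helper for illustration only; it is not the generator's API.
    """
    if n < 1:
        return None
    exponents = {}
    for r in radices:
        e = 0
        # Strip out every factor of r and count how many there were.
        while n % r == 0:
            n //= r
            e += 1
        exponents[r] = e
    # Anything left over means n has a prime factor outside the radix set.
    return exponents if n == 1 else None


if __name__ == "__main__":
    for size in (64, 864, 800, 675, 22):
        print(size, smooth_exponents(size))
```

A size such as 22 = 2 x 11 returns None, since 11 is not one of the supported radices.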

To support a broader range of spectral analysis applications, a low-power signal acquisition front end capable of sensing frequency-sparse signals with a resolution of 185 kHz in real time has been prototyped. Nyquist-rate wideband spectrum sensing is typically power-hungry, requiring a fast ADC with a high-bandwidth digital interface to subsequent post-processing/inference blocks that must handle significant amounts of raw, real-time data. For frequency-sparse signals, the analog input can instead be subsampled, and frequency-domain aliasing can be leveraged to digitally reconstruct the frequency data via the Fast Fourier Aliasing-based Sparse Transform (FFAST). The output, containing only non-zero frequency content, is compressed, reducing IO bandwidth requirements and power. The spectral analysis SoC, taped out in a 16-nm FinFET process, integrates a front end with three sets of 25x, 27x, and 32x subsampling SAR ADCs generated using the Berkeley Analog Generator (BAG) framework. The ADC samples, compressed by 35, are fed into generated 864-, 800-, and 675-point FFTs (resulting in an effective 21,600-point FFT), and a digital reconstruction back end, consisting of a signal location estimator and a peeling decoder, recovers spectra with sparsities up to ~3 and input SNRs down to 10 dB. The spectra, compressed to 5%, are recovered in 0.02 ms. A single-issue, in-order RISC-V Rocket processor interacts with the spectrum analyzer for post-processing and calibration. The ADC consumes 53 mW with a 4-GHz input clock. At 400 MHz and 0.8-V supply, the Rocket core and FFAST DSP consume 184 mW.
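The numbers in this paragraph fit together through simple index arithmetic, sketched below. This is not the FFAST algorithm itself (no location estimator or peeling decoder), only a check of how the subsampling factors relate to the FFT lengths. It assumes that the 4-GHz input clock corresponds to the full-rate sampling frequency, which the abstract does not state explicitly; the name aliased_bins is hypothetical.

```python
from functools import reduce
from math import gcd

N_FULL = 21_600          # effective FFT length quoted in the abstract
FACTORS = (25, 27, 32)   # subsampling factors quoted in the abstract
FS_HZ = 4e9              # ASSUMPTION: full-rate sampling equals the 4-GHz clock

# Subsampling an N-point record by d leaves N/d samples per branch.
fft_lengths = [N_FULL // d for d in FACTORS]


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def aliased_bins(k):
    """Bins where full-rate bin k lands in each subsampled FFT.

    Subsampling by d folds bin k of the N-point spectrum onto bin
    k mod (N/d) of the shorter FFT (hypothetical helper for illustration).
    """
    return [k % m for m in fft_lengths]


if __name__ == "__main__":
    print(fft_lengths)                 # [864, 800, 675]
    print(reduce(lcm, fft_lengths))    # 21600
    print(FS_HZ / N_FULL)              # bin spacing in Hz under the assumption
    print(aliased_bins(12345))         # [249, 345, 195]
```

Because the least common multiple of the three lengths equals 21,600, each full-rate bin maps to a distinct triple of aliased bins, and under the stated sampling-rate assumption the bin spacing comes out near the 185-kHz resolution quoted above.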


Angie Wang received the B.S. degree in electrical engineering from the California Institute of Technology in 2012. She is currently a Ph.D. candidate at the University of California, Berkeley. She is advised by Professor Borivoje Nikolic and is affiliated with the Berkeley Wireless Research Center (BWRC) and ADEPT labs. She previously interned at NASA's Jet Propulsion Laboratory, NTT Communication Science Laboratories in Japan, Apple, Nokia, and SiFive. Her current research focuses on the design of ASIC and FPGA hardware generators and design automation tools to ease the implementation of VLSI signal processing systems, with applications in sensor interfaces, software-defined radio, and beyond. She was previously supported by the National Science Foundation's Graduate Research Fellowship.