Prof. Philip Brisk (University of California, Riverside)
The last decade has seen significant advances in creating bedside monitoring algorithms for a host of medical conditions; however, surprisingly few of these algorithms have seen deployment in wearable devices. The obvious difficulty is the availability of computational resources on a device that is small enough to be convenient and unobtrusive. The computational resource gap between conventional systems and wearable devices can be partly bridged by optimizing the algorithms (admissible pruning, early abandoning, indexing, etc.), but increasingly sophisticated monitoring algorithms have produced an arms race that is outpacing the performance and energy capabilities of the hardware community. Within this context, application- and domain-specialization are ultimately necessary in order to achieve the highest possible efficiency for wearable computing platforms.
Medical monitoring is a specialized form of time series data mining. Most time series data mining algorithms require similarity comparisons as a subroutine, and there is increasing evidence that the Dynamic Time Warping (DTW) measure outperforms the competition in most domains, including medical monitoring. In addition to medical monitoring, DTW has been used in diverse domains such as robotics, medicine, biometrics, music/speech processing, climatology, aviation, gesture recognition, user interfaces, industrial processing, cryptanalysis, mining of historical manuscripts, geology, astronomy, space exploration, wildlife monitoring, and many others. Despite its ubiquity, DTW remains too computationally intensive for use in real-time applications because its core is a dynamic programming algorithm that has a quadratic time complexity; however, recent algorithmic optimizations have enabled DTW to achieve near-constant amortized time when processing time series databases containing trillions of elements.
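To make the quadratic dynamic program concrete, the following is a minimal Python sketch of textbook DTW, not the software-optimized implementation discussed in the talk. The function name dtw_distance, the squared-difference local cost, and the best_so_far parameter are assumptions chosen for illustration. The optional best_so_far threshold shows one simple form of early abandoning: because local costs are non-negative and every warping path crosses every row, the computation can stop as soon as an entire row exceeds the threshold.

```python
import math


def dtw_distance(a, b, best_so_far=math.inf):
    """Textbook DTW between two numeric sequences (illustrative sketch).

    Uses a squared-difference local cost and O(len(a) * len(b)) time.
    Returns math.inf if the match is abandoned early because it cannot
    beat best_so_far.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j] holds the cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing

    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        # Early abandoning: every path passes through this row, so the
        # final distance can never be smaller than the row minimum.
        if min(curr[1:]) > best_so_far:
            return math.inf
        prev = curr

    return prev[m]
```

For example, dtw_distance([1, 2, 3], [1, 2, 2, 3]) evaluates to 0 because warping absorbs the repeated sample, whereas a lock-step comparison could not even align sequences of different lengths. The two nested loops are the source of the quadratic cost that the optimizations and the hardware specialization aim to reduce.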
As software optimization appears unlikely to yield further improvements, attention must be turned to hardware specialization. This talk will present the design, implementation, and evaluation of an application-specific processor whose instruction set has been customized to accelerate a software-optimized implementation of DTW. Compared to a 32-bit embedded processor, our design yields a 4.87x improvement in performance and a 78% reduction in energy consumption when prototyped on a Xilinx EK-V6-ML605-G Virtex 6 FPGA.