Computer Science Faculty Research

Survey on Deep Neural Networks in Speech and Vision Systems

M. Alam, Old Dominion University
Manar D. Samad, Tennessee State University
Lasitha Vidyaratne, Old Dominion University
Alexander Glandon, Old Dominion University
Khan M. Iftekharuddin, Old Dominion University

Document Type

Article

Publication Date

12-5-2020

Abstract

This survey presents a review of state-of-the-art deep neural network architectures, algorithms, and systems in speech and vision applications. Recent advances in deep artificial neural network algorithms and architectures have spurred rapid innovation and development of intelligent speech and vision systems. With the availability of vast amounts of sensor data and cloud computing for processing and training deep neural networks, and with increased sophistication in mobile and embedded technology, next-generation intelligent systems are poised to revolutionize personal and commercial computing. This survey begins by providing the background and evolution of some of the most successful deep learning models for intelligent speech and vision systems to date. An overview of large-scale industrial research and development efforts is provided to emphasize future trends and prospects of intelligent speech and vision systems. Robust and efficient intelligent systems demand low latency and high fidelity on resource-constrained hardware platforms such as mobile devices, robots, and automobiles. Therefore, this survey also provides a summary of key challenges and recent successes in running deep neural networks on hardware-restricted platforms, i.e., within limited memory, battery life, and processing capabilities. Finally, emerging applications of speech and vision across disciplines such as affective computing, intelligent transportation, and precision medicine are discussed. To our knowledge, this paper provides one of the most comprehensive surveys on the latest developments in intelligent speech and vision applications from the perspectives of both software and hardware systems. Many of these emerging technologies using deep neural networks show tremendous promise to revolutionize research and development for future speech and vision systems.

Recommended Citation

M. Alam, M.D. Samad, L. Vidyaratne, A. Glandon, K.M. Iftekharuddin, "Survey on Deep Neural Networks in Speech and Vision Systems", Neurocomputing, Volume 417, 2020, Pages 302-321, ISSN 0925-2312, https://doi.org/10.1016/j.neucom.2020.07.053.

Included in

Artificial Intelligence and Robotics Commons, Other Computer Engineering Commons

Digital Scholarship @ Tennessee State University

TSU Library
