1. Roberts EJ, Chavez T, Hexemer A, Zwart PH. DLSIA: Deep Learning for Scientific Image Analysis. J Appl Crystallogr 2024; 57:392-402. PMID: 38596727; PMCID: PMC11001410; DOI: 10.1107/s1600576724001390.
Abstract
DLSIA (Deep Learning for Scientific Image Analysis) is a Python-based machine learning library that provides scientists and researchers across diverse scientific domains with a range of customizable convolutional neural network (CNN) architectures for a wide variety of image analysis tasks in downstream data processing. DLSIA features easy-to-use architectures such as autoencoders, tunable U-Nets and parameter-lean mixed-scale dense networks (MSDNets). Additionally, this article introduces sparse mixed-scale networks (SMSNets), generated using random graphs, sparse connections and dilated convolutions connecting different length scales. For verification, several DLSIA-instantiated networks and training scripts are employed in multiple applications, including inpainting of X-ray scattering data using U-Nets and MSDNets, segmentation of 3D fibers in X-ray tomographic reconstructions of concrete using an ensemble of SMSNets, and use of autoencoder latent spaces for data compression and clustering. As experimental data continue to grow in scale and complexity, DLSIA offers accessible CNN construction and abstracts away CNN complexities, allowing scientists to tailor their machine learning approaches, accelerate discovery, foster interdisciplinary collaboration and advance research in scientific image analysis.
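The mixed-scale dense idea described in the abstract can be sketched in plain PyTorch. The class and parameter names below are illustrative and are not DLSIA's actual API: each layer applies a dilated 3×3 convolution to the concatenation of the input and all previous feature maps, with dilation rates cycling so that different length scales are mixed without pooling.

```python
import torch
import torch.nn as nn

class MiniMSDNet(nn.Module):
    """Minimal mixed-scale dense network sketch: every layer sees the
    concatenation of the input and all previous feature maps, and the
    dilation rate cycles to mix length scales."""

    def __init__(self, in_channels=1, out_channels=2, depth=8, max_dilation=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(depth):
            dilation = i % max_dilation + 1  # cycle dilations 1..max_dilation
            # padding == dilation keeps the spatial size constant for a 3x3 kernel
            self.layers.append(
                nn.Conv2d(in_channels + i, 1, kernel_size=3,
                          padding=dilation, dilation=dilation))
        # a 1x1 convolution maps the dense feature stack to the output channels
        self.final = nn.Conv2d(in_channels + depth, out_channels, kernel_size=1)

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(torch.relu(layer(torch.cat(features, dim=1))))
        return self.final(torch.cat(features, dim=1))
```

The dense connectivity is what keeps the parameter count lean: each layer adds only one feature channel, yet late layers still see early, fine-scale features directly.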
Affiliation(s)
- Eric J. Roberts
- Center for Advanced Mathematics for Energy Research Applications, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Tanny Chavez
- Advanced Light Source, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Alexander Hexemer
- Center for Advanced Mathematics for Energy Research Applications, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Advanced Light Source, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Petrus H. Zwart
- Center for Advanced Mathematics for Energy Research Applications, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Berkeley Synchrotron Infrared Structural Biology Program, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
2. Kaiser MAA, Datta G, Wang Z, Jacob AP, Beerel PA, Jaiswal AR. Neuromorphic-P2M: processing-in-pixel-in-memory paradigm for neuromorphic image sensors. Front Neuroinform 2023; 17:1144301. PMID: 37214316; PMCID: PMC10192623; DOI: 10.3389/fninf.2023.1144301.
Abstract
Edge devices equipped with computer vision must deal with vast amounts of sensory data using limited computing resources. Hence, researchers have been exploring energy-efficient solutions such as near-sensor, in-sensor, and in-pixel processing, which bring the computation closer to the sensor. In particular, in-pixel processing embeds computation capabilities inside the pixel array and achieves high energy efficiency by generating low-level features instead of a raw data stream from CMOS image sensors. Many different in-pixel processing techniques have been demonstrated on conventional frame-based CMOS imagers; however, the processing-in-pixel approach for neuromorphic vision sensors has not been explored so far. In this work, for the first time, we propose an asynchronous, non-von Neumann, analog processing-in-pixel paradigm that performs convolution operations by integrating in-situ multi-bit, multi-channel convolution inside the pixel array, using analog multiply-and-accumulate (MAC) operations that consume significantly less energy than their digital alternatives. To make this approach viable, we incorporate circuit non-idealities, leakage, and process variations into a novel hardware-algorithm co-design framework that leverages extensive HSpice simulations of our proposed circuit in the GF22nm FD-SOI technology node. We verify our framework on state-of-the-art neuromorphic vision sensor datasets and show that, on the IBM DVS128-Gesture dataset, our solution consumes ~2× less back-end processor energy than the state of the art while maintaining almost the same front-end (sensor) energy and a high test accuracy of 88.36%.
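The hardware-algorithm co-design idea above, folding circuit non-idealities into the model, can be illustrated with a small behavioral sketch. The gain-mismatch and leakage parameters below are illustrative placeholders, not values measured from the GF22nm node or the paper's HSpice simulations.

```python
import numpy as np

def analog_mac(inputs, weights, gain_sigma=0.05, leakage=0.0, rng=None):
    """Behavioral model of an in-pixel analog multiply-and-accumulate:
    each product is scaled by a random per-device gain (process mismatch),
    and a constant leakage charge is subtracted per element before the
    analog accumulation.  Parameter values are illustrative only."""
    if rng is None:
        rng = np.random.default_rng(0)
    gains = rng.normal(1.0, gain_sigma, size=np.shape(weights))
    return float(np.sum(inputs * weights * gains) - leakage * np.size(inputs))
```

Injecting such a model into training (rather than assuming ideal MACs) is the essence of co-design: the network learns weights that remain accurate under the circuit's actual behavior.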
Affiliation(s)
- Md Abdullah-Al Kaiser
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States
- Information Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Gourav Datta
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States
- Zixu Wang
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States
- Ajey P. Jacob
- Information Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Peter A. Beerel
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States
- Information Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Akhilesh R. Jaiswal
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States
- Information Sciences Institute, University of Southern California, Los Angeles, CA, United States
3. Shi W, Huang Z, Huang H, Hu C, Chen M, Yang S, Chen H. LOEN: Lensless opto-electronic neural network empowered machine vision. Light Sci Appl 2022; 11:121. PMID: 35508469; PMCID: PMC9068799; DOI: 10.1038/s41377-022-00809-5.
Abstract
Machine vision faces bottlenecks in computing power consumption and in the sheer volume of data. Although opto-electronic hybrid neural networks can help, they usually have complex structures and depend heavily on a coherent light source, so they are unsuitable for natural-lighting applications. In this paper, we propose a novel lensless opto-electronic neural network architecture for machine vision. The architecture optimizes a passive optical mask through task-oriented neural network design, performs the optical convolution operation with the lensless architecture, and reduces both device size and the amount of computation required. We demonstrate handwritten digit classification with a multiple-kernel mask, achieving accuracies of up to 97.21%. Furthermore, we optimize a large-kernel mask to perform optical encryption for privacy-protecting face recognition, obtaining the same recognition accuracy as methods without encryption. Compared with the random MLS pattern, the recognition accuracy is improved by more than 6%.
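A key physical constraint in this kind of optical front end is that a passive amplitude mask can only attenuate light, so its effective convolution kernel cannot contain negative weights. The toy PyTorch stand-in below illustrates that constraint; the kernel size and the clamping scheme are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LenslessFrontEnd(nn.Module):
    """Toy stand-in for a passive-mask optical front end: one convolution
    whose kernel is clamped to [0, 1], modeling a transmittance mask that
    can attenuate light but never realize negative weights."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.mask = nn.Parameter(torch.rand(1, 1, kernel_size, kernel_size))

    def forward(self, x):
        # clamping keeps the learned mask physically realizable as a transmittance
        return F.conv2d(x, self.mask.clamp(0.0, 1.0),
                        padding=self.mask.shape[-1] // 2)
```

Training such a constrained front end jointly with an electronic back end is the task-oriented design the abstract describes: the optimizer shapes the mask for the downstream task while respecting what the optics can physically do.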
Affiliation(s)
- Wanxin Shi
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
- Zheng Huang
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
- Honghao Huang
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
- Chengyang Hu
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
- Minghua Chen
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
- Sigang Yang
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
- Hongwei Chen
- Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
4. Li Y, Chen R, Sensale-Rodriguez B, Gao W, Yu C. Real-time multi-task diffractive deep neural networks via hardware-software co-design. Sci Rep 2021; 11:11013. PMID: 34040045; PMCID: PMC8155121; DOI: 10.1038/s41598-021-90221-7.
Abstract
Deep neural networks (DNNs) have substantial computational requirements, which greatly limit their performance in resource-constrained environments. Recently, there have been increasing efforts on optical neural networks and optical-computing-based DNN hardware, which bring significant advantages for deep learning systems in terms of power efficiency, parallelism and computational speed. Among them, free-space diffractive deep neural networks (D2NNs), based on light diffraction, feature millions of neurons in each layer interconnected with neurons in neighboring layers. However, because reconfigurability is challenging to implement, deploying different DNN algorithms requires rebuilding and duplicating the physical diffractive systems, which significantly degrades hardware efficiency in practical application scenarios. Thus, this work proposes a novel hardware-software co-design method that enables first-of-its-kind real-time multi-task learning in D2NNs, automatically recognizing which task is being deployed in real time. Our experimental results demonstrate significant improvements in versatility and hardware efficiency, and also demonstrate and quantify the robustness of the proposed multi-task D2NN architecture under wide noise ranges for all system components. In addition, we propose a domain-specific regularization algorithm for training the proposed multi-task architecture, which can be used to flexibly adjust the desired performance of each task.
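The multi-task routing idea, one shared physical system that first recognizes which task an input belongs to and then produces that task's output, can be sketched with an electronic stand-in. The layer sizes and names below are illustrative, and an ordinary MLP backbone substitutes here for the shared diffractive layers.

```python
import torch
import torch.nn as nn

class MultiTaskSketch(nn.Module):
    """Sketch of real-time multi-task inference: a shared backbone feeds a
    task-recognition head plus one output head per task, so a single system
    detects the active task and routes each input accordingly."""

    def __init__(self, in_dim=64, hidden=32, n_tasks=2, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.task_head = nn.Linear(hidden, n_tasks)  # which task is running?
        self.class_heads = nn.ModuleList(
            nn.Linear(hidden, n_classes) for _ in range(n_tasks))

    def forward(self, x):
        h = self.backbone(x)
        task = self.task_head(h).argmax(dim=1)  # predicted task per sample
        # route each sample through the output head of its predicted task
        out = torch.stack([self.class_heads[int(t)](hi)
                           for hi, t in zip(h, task)])
        return task, out
```

In the optical setting the "routing" is realized physically rather than by indexing, but the principle is the same: shared computation plus task recognition avoids rebuilding a separate diffractive system per task.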
Affiliation(s)
- Yingjie Li
- Electrical and Computer Engineering Department, University of Utah, 50 S Central Campus Road, Salt Lake City, UT, 84112, USA
- Ruiyang Chen
- Electrical and Computer Engineering Department, University of Utah, 50 S Central Campus Road, Salt Lake City, UT, 84112, USA
- Berardi Sensale-Rodriguez
- Electrical and Computer Engineering Department, University of Utah, 50 S Central Campus Road, Salt Lake City, UT, 84112, USA
- Weilu Gao
- Electrical and Computer Engineering Department, University of Utah, 50 S Central Campus Road, Salt Lake City, UT, 84112, USA
- Cunxi Yu
- Electrical and Computer Engineering Department, University of Utah, 50 S Central Campus Road, Salt Lake City, UT, 84112, USA
5. Dynamic Temperature Management of Near-Sensor Processing for Energy-Efficient High-Fidelity Imaging. Sensors 2021; 21:926. PMID: 33573185; PMCID: PMC7866500; DOI: 10.3390/s21030926.
Abstract
Vision processing on traditional architectures is inefficient due to energy-expensive off-chip data movement. Many researchers advocate pushing processing close to the sensor to substantially reduce data movement. However, continuous near-sensor processing raises sensor temperature, impairing imaging/vision fidelity. We characterize the thermal implications of using 3D stacked image sensors with near-sensor vision processing units. Our characterization reveals that near-sensor processing reduces system power but degrades image quality. For reasonable image fidelity, the sensor temperature needs to stay below a threshold, situationally determined by application needs. Fortunately, our characterization also identifies opportunities—unique to the needs of near-sensor processing—to regulate temperature based on dynamic visual task requirements and rapidly increase capture quality on demand. Based on our characterization, we propose and investigate two thermal management strategies—stop-capture-go and seasonal migration—for imaging-aware thermal management. For our evaluated tasks, our policies save up to 53% of system power with negligible performance impact and sustained image fidelity.
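The stop-capture-go policy mentioned above can be sketched as a simple hysteresis controller. The threshold and cooling margin below are application-dependent placeholders, not values from the paper's characterization.

```python
def stop_capture_go(temps, threshold, resume_margin=2.0):
    """Toy stop-capture-go policy: capture while the sensor temperature is
    below the fidelity threshold, pause once the threshold is reached
    (letting the sensor cool), and resume only after the temperature drops
    past a hysteresis margin so the policy does not oscillate."""
    capturing = True
    decisions = []
    for t in temps:
        if capturing and t >= threshold:
            capturing = False              # too hot for acceptable fidelity
        elif not capturing and t <= threshold - resume_margin:
            capturing = True               # cooled enough to resume
        decisions.append(capturing)
    return decisions
```

The hysteresis margin is the key design choice: without it, a sensor hovering at the threshold would toggle capture on and off every sample.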
6. Hadidi R, Cao J, Woodward M, Ryoo MS, Kim H. Distributed Perception by Collaborative Robots. IEEE Robot Autom Lett 2018. DOI: 10.1109/lra.2018.2856261.