Gui D, Chen Y, Kuang W, Shang M, Zhang Y, Huang ZL. PCIe-based FPGA-GPU heterogeneous computation for real-time multi-emitter fitting in super-resolution localization microscopy. Biomedical Optics Express 2022; 13:3401-3415. [PMID: 35781968 PMCID: PMC9208611 DOI: 10.1364/boe.459198]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Revised: 05/07/2022] [Accepted: 05/09/2022] [Indexed: 06/15/2023]
Abstract
Real-time multi-emitter fitting is a key technology for advancing super-resolution localization microscopy (SRLM), especially when dynamic imaging quality control and/or optimization of experimental conditions is required. However, as activation density increases, the demand for computing resources grows rapidly because of the complexity of the fitting algorithms. This makes real-time multi-emitter fitting difficult for emitter densities above 0.6 molecules/µm² in a large field of view (FOV), even with acceleration by the popular Graphics Processing Unit (GPU) computation. Here we adopt the task parallelism strategy from computer science to construct a Peripheral Component Interconnect Express (PCIe) based all-in-one heterogeneous computing platform (AIO-HCP), in which two major parallel computing devices, a Field Programmable Gate Array (FPGA) and a GPU, exchange data directly and execute simultaneously. Using simulated and experimental data, we verify that AIO-HCP can achieve a data throughput of up to ~1.561 GB/s between the FPGA and the GPU. On this platform, we develop a multi-emitter fitting method, called AIO-STORM, that runs under big data stream parallel scheduling. We show that AIO-STORM can process raw images in real time with a 100 µm × 100 µm FOV, 10 ms exposure time, and 5.5 molecules/µm² structure density, without sacrificing image quality. This study overcomes the data throughput limitation of heterogeneous devices, demonstrates the power of the PCIe-based heterogeneous computation platform, and offers opportunities for multi-scale stitching of super-resolution images.