26
Rajasekaran S, Finnoff JT. An innovative technique for recording picture-in-picture ultrasound videos. J Ultrasound Med 2013; 32:1493-1497. [PMID: 23887962; DOI: 10.7863/ultra.32.8.1493]
Abstract
Many ultrasound educational products and ultrasound researchers present diagnostic and interventional ultrasound information using picture-in-picture videos, which simultaneously show the ultrasound image and transducer and patient positions. Traditional techniques for creating picture-in-picture videos are expensive, nonportable, or time-consuming. This article describes an inexpensive, simple, and portable way of creating picture-in-picture ultrasound videos. This technique uses a laptop computer with a video capture device to acquire the ultrasound feed. Simultaneously, a webcam captures a live video feed of the transducer and patient position and live audio. Both sources are streamed onto the computer screen and recorded by screen capture software. This technique makes the process of recording picture-in-picture ultrasound videos more accessible for ultrasound educators and researchers for use in their presentations or publications.
27
Yuan J, Xu G, Yu Y, Zhou Y, Carson PL, Wang X, Liu X. Real-time photoacoustic and ultrasound dual-modality imaging system facilitated with graphics processing unit and code parallel optimization. J Biomed Opt 2013; 18:086001. [PMID: 23907277; PMCID: PMC3733419; DOI: 10.1117/1.jbo.18.8.086001]
Abstract
Photoacoustic tomography (PAT) offers structural and functional imaging of living biological tissue with highly sensitive optical absorption contrast and excellent spatial resolution comparable to medical ultrasound (US) imaging. We report the development of a fully integrated PAT and US dual-modality imaging system, which performs signal scanning, image reconstruction, and display for both photoacoustic (PA) and US imaging in a truly real-time manner. The back-projection (BP) algorithm for PA image reconstruction is optimized to reduce the computational cost and facilitate parallel computation on a state-of-the-art graphics processing unit (GPU) card. For the first time, PAT and US imaging of the same object can be conducted simultaneously and continuously at a real-time frame rate, presently limited by the laser repetition rate of 10 Hz. Noninvasive PAT and US imaging of human peripheral joints in vivo was achieved, demonstrating the satisfactory image quality realized with this system. A second experiment, simultaneous PAT and US imaging of contrast agent flowing through an artificial vessel, verified the performance of the system for imaging fast biological events. The GPU-based image reconstruction software for this dual-modality system is open source and available for download from http://sourceforge.net/projects/patrealtime.
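The back-projection step the paper parallelizes on the GPU is, at its core, a delay-and-sum over channels. A minimal CPU sketch with NumPy (geometry, pulse shape, and grid size are illustrative assumptions, not taken from the paper):

```python
import numpy as np

C = 1500.0          # assumed speed of sound in tissue [m/s]
FS = 20e6           # assumed channel sampling rate [Hz]

def backproject(signals, sensor_xy, grid_xy):
    """Delay-and-sum BP: for every pixel, sum each channel's sample taken
    at the acoustic time-of-flight from that pixel to that sensor."""
    n_samples = signals.shape[1]
    image = np.zeros(len(grid_xy))
    for s, (sx, sy) in enumerate(sensor_xy):
        dist = np.hypot(grid_xy[:, 0] - sx, grid_xy[:, 1] - sy)
        idx = np.clip((dist / C * FS).astype(int), 0, n_samples - 1)
        image += signals[s, idx]
    return image

# Synthetic check: one point absorber at the origin, sensors on a 2 cm ring.
angles = np.linspace(0, 2 * np.pi, 64, endpoint=False)
sensors = 0.02 * np.column_stack([np.cos(angles), np.sin(angles)])
t = np.arange(1024) / FS
signals = np.stack([np.exp(-((t - np.hypot(*s) / C) * FS / 2) ** 2)
                    for s in sensors])              # Gaussian arrival pulses

xs = np.linspace(-0.01, 0.01, 41)
grid = np.array([(x, y) for x in xs for y in xs])
img = backproject(signals, sensors, grid)
print(grid[np.argmax(img)])  # brightest pixel lies near the absorber (origin)
```

The inner gather (`signals[s, idx]`) is independent per pixel, which is why this kernel maps so naturally onto one GPU thread per pixel.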
28
Qi M, Cao TT, Tan TS. Computing 2D constrained Delaunay triangulation using the GPU. IEEE Trans Vis Comput Graph 2013; 19:736-748. [PMID: 23492377; DOI: 10.1109/tvcg.2012.307]
Abstract
We propose the first graphics processing unit (GPU) solution to compute the 2D constrained Delaunay triangulation (CDT) of a planar straight line graph (PSLG) consisting of points and edges. There are many existing CPU algorithms to solve the CDT problem in computational geometry, yet there has been no prior approach to solve this problem efficiently using the parallel computing power of the GPU. For the special case of the CDT problem where the PSLG consists of just points, which is simply the normal Delaunay triangulation (DT) problem, a hybrid approach using the GPU together with the CPU to partially speed up the computation has already been presented in the literature. Our work, on the other hand, accelerates the entire computation on the GPU. Our implementation using the CUDA programming model on NVIDIA GPUs is numerically robust, and runs up to an order of magnitude faster than the best sequential implementations on the CPU. This result is reflected in our experiment with both randomly generated PSLGs and real-world GIS data having millions of points and edges.
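The numerical core of any Delaunay construction, whether on CPU or GPU, is the in-circle predicate, evaluated as a 3x3 determinant of lifted coordinates. A pure-Python sketch of this primitive (the paper's GPU code would run such a test per thread, typically with an exact-arithmetic fallback for robustness):

```python
def in_circle(a, b, c, d):
    """Return a value > 0 if point d lies inside the circumcircle of
    triangle (a, b, c), assuming a, b, c are in counter-clockwise order;
    < 0 if outside, 0 if on the circle."""
    adx, ady = a[0] - d[0], a[1] - d[1]
    bdx, bdy = b[0] - d[0], b[1] - d[1]
    cdx, cdy = c[0] - d[0], c[1] - d[1]
    return ((adx * adx + ady * ady) * (bdx * cdy - cdx * bdy)
          - (bdx * bdx + bdy * bdy) * (adx * cdy - cdx * ady)
          + (cdx * cdx + cdy * cdy) * (adx * bdy - bdx * ady))

print(in_circle((0, 0), (1, 0), (0, 1), (0.25, 0.25)) > 0)  # inside -> True
print(in_circle((0, 0), (1, 0), (0, 1), (5.0, 5.0)) > 0)    # outside -> False
```

A triangulation is Delaunay exactly when no triangle's circumcircle contains a fourth input point, so this predicate drives both construction and the edge-flipping used to enforce constraints.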
29
Matsukura H, Yoneda T, Ishida H. Smelling screen: development and evaluation of an olfactory display system for presenting a virtual odor source. IEEE Trans Vis Comput Graph 2013; 19:606-615. [PMID: 23428445; DOI: 10.1109/tvcg.2013.40]
Abstract
We propose a new olfactory display system that can generate an odor distribution on a two-dimensional display screen. The proposed system has four fans on the four corners of the screen. The airflows that are generated by these fans collide multiple times to create an airflow that is directed towards the user from a certain position on the screen. By introducing odor vapor into the airflows, the odor distribution is as if an odor source had been placed onto the screen. The generated odor distribution leads the user to perceive the odor as emanating from a specific region of the screen. The position of this virtual odor source can be shifted to an arbitrary position on the screen by adjusting the balance of the airflows from the four fans. Most users do not immediately notice the odor presentation mechanism of the proposed olfactory display system because the airflow and perceived odor come from the display screen rather than the fans. The airflow velocity can even be set below the threshold for airflow sensation, such that the odor alone is perceived by the user. We present experimental results that show the airflow field and odor distribution that are generated by the proposed system. We also report sensory test results to show how the generated odor distribution is perceived by the user and the issues that must be considered in odor presentation.
30
Laha B, Bowman DA, Schiffbauer JD. Validation of the MR simulation approach for evaluating the effects of immersion on visual analysis of volume data. IEEE Trans Vis Comput Graph 2013; 19:529-538. [PMID: 23428436; DOI: 10.1109/tvcg.2013.43]
Abstract
In our research agenda to study the effects of immersion (level of fidelity) on various tasks in virtual reality (VR) systems, we have found that the most generalizable findings come not from direct comparisons of different technologies, but from controlled simulations of those technologies. We call this the mixed reality (MR) simulation approach. However, the validity of MR simulation, especially when different simulator platforms are used, can be questioned. In this paper, we report the results of an experiment examining the effects of field of regard (FOR) and head tracking on the analysis of volume visualized micro-CT datasets, and compare them with those from a previous study. The original study used a CAVE-like display as the MR simulator platform, while the present study used a high-end head-mounted display (HMD). Out of the 24 combinations of system characteristics and tasks tested on the two platforms, we found that the results produced by the two different MR simulators were similar in 20 cases. However, only one of the significant effects found in the original experiment for quantitative tasks was reproduced in the present study. Our observations provide evidence both for and against the validity of MR simulation, and give insight into the differences caused by different MR simulator platforms. The present experiment also examined new conditions not present in the original study, and produced new significant results, which confirm and extend previous existing knowledge on the effects of FOR and head tracking. We provide design guidelines for choosing display systems that can improve the effectiveness of volume visualization applications.
31
Berkelman P, Miyasaka M, Bozlee S. Co-located haptic and 3D graphic interface for medical simulations. Stud Health Technol Inform 2013; 184:48-50. [PMID: 23400128]
Abstract
We describe a system which provides high-fidelity haptic feedback in the same physical location as a 3D graphical display, in order to enable realistic physical interaction with virtual anatomical tissue during modelled procedures such as needle driving, palpation, and other interventions performed using handheld instruments. The haptic feedback is produced by the interaction between an array of coils located behind a thin flat LCD screen, and permanent magnets embedded in the instrument held by the user. The coil and magnet configuration permits arbitrary forces and torques to be generated on the instrument in real time according to the dynamics of the simulated tissue by activating the coils in combination. A rigid-body motion tracker provides position and orientation feedback of the handheld instrument to the computer simulation, and the 3D display is produced using LCD shutter glasses and a head-tracking system for the user.
32
Salud LH, Kwan C, Pugh CM. Simplifying touch data from tri-axial sensors using a new data visualization tool. Stud Health Technol Inform 2013; 184:370-376. [PMID: 23400186; PMCID: PMC3693446]
Abstract
Quantification and evaluation of palpation is a growing field of research in medicine and engineering. A newly developed tri-axial touch sensor has been designed to capture a multi-dimensional profile of touch-loaded forces. We have developed a data visualization tool as a first step in simplifying interpretation of touch for assessing hands-on clinical performance.
33
Wang D, Qiao H, Song X, Fan Y, Li D. Fluorescence molecular tomography using a two-step three-dimensional shape-based reconstruction with graphics processing unit acceleration. Appl Opt 2012; 51:8731-8744. [PMID: 23262613; DOI: 10.1364/ao.51.008731]
Abstract
In fluorescence molecular tomography, the accurate and stable reconstruction of fluorescence-labeled targets remains a challenge for wide application of this imaging modality. Here we propose a two-step three-dimensional shape-based reconstruction method using graphics processing unit (GPU) acceleration. In this method, the fluorophore distribution is assumed as the sum of ellipsoids with piecewise-constant fluorescence intensities. The inverse problem is formulated as a constrained nonlinear least-squares problem with respect to shape parameters, leading to much less ill-posedness as the number of unknowns is greatly reduced. Considering that various shape parameters contribute differently to the boundary measurements, we use a two-step optimization algorithm to handle them in a distinctive way and also stabilize the reconstruction. Additionally, the GPU acceleration is employed for finite-element-method-based calculation of the objective function value and the Jacobian matrix, which reduces the total optimization time from around 10 min to less than 1 min. The numerical simulations show that our method can accurately reconstruct multiple targets of various shapes while the conventional voxel-based reconstruction cannot separate the nearby targets. Moreover, the two-step optimization can tolerate different initial values in the existence of noises, even when the number of targets is not known a priori. A physical phantom experiment further demonstrates the method's potential in practical applications.
34
Nishitsuji T, Shimobaba T, Kakue T, Masuda N, Ito T. Fast calculation of computer-generated hologram using the circular symmetry of zone plates. Opt Express 2012; 20:27496-27502. [PMID: 23262699; DOI: 10.1364/oe.20.027496]
Abstract
Computer-generated holograms (CGHs) can be generated from three-dimensional objects composed of point light sources by superimposing zone plates. A zone plate is a grating that can focus an incident wave and has a circularly symmetric shape. In this study, we propose a fast CGH generation algorithm that exploits the circular symmetry of zone plates together with computer graphics techniques. We evaluated the proposed method by numerical simulation.
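The circular-symmetry idea can be illustrated directly: a zone plate's value depends only on a pixel's distance from the source point's projection, and on an integer pixel grid the squared radius is itself an integer, so a 1D table indexed by r² reproduces the full 2D pattern. A NumPy sketch (wavelength, depth, and pixel pitch are illustrative assumptions):

```python
import numpy as np

N = 256                 # hologram is N x N pixels
wavelength = 532e-9     # [m], assumed
z = 0.1                 # source depth behind the hologram plane [m], assumed
pitch = 8e-6            # pixel pitch [m], assumed

y, x = np.mgrid[-N // 2:N // 2, -N // 2:N // 2]
r2 = x * x + y * y                        # integer squared radius per pixel

def phase(r2_int):
    """Fresnel zone-plate fringe for squared radius r2 (in pixels^2)."""
    return np.cos(np.pi * r2_int * pitch ** 2 / (wavelength * z))

direct = phase(r2)                        # naive evaluation at every pixel
table = phase(np.arange(r2.max() + 1))    # 1D table over all possible r^2
lookup = table[r2]                        # one gather per pixel

print(np.array_equal(direct, lookup))     # identical patterns
```

Each distinct radius is evaluated once instead of once per pixel, and when a hologram sums zone plates from many object points the same table is reused for every point at the same depth, which is where the bulk of the saving comes from.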
35
Dinkelbach HÜ, Vitay J, Beuth F, Hamker FH. Comparison of GPU- and CPU-implementations of mean-firing rate neural networks on parallel hardware. Network 2012; 23:212-236. [PMID: 23140422; DOI: 10.3109/0954898x.2012.739292]
Abstract
Modern parallel hardware such as multi-core processors (CPUs) and graphics processing units (GPUs) have a high computational power which can be greatly beneficial to the simulation of large-scale neural networks. Over the past years, a number of efforts have focused on developing parallel algorithms and simulators best suited for the simulation of spiking neural models. In this article, we investigate the advantages and drawbacks of the CPU and GPU parallelization of mean-firing rate neurons, widely used in systems-level computational neuroscience. By comparing OpenMP, CUDA and OpenCL implementations against a serial CPU implementation, we show that GPUs are better suited than CPUs for the simulation of very large networks, but that smaller networks benefit more from an OpenMP implementation. As this performance strongly depends on data organization, we analyze the impact of various factors such as data structure, memory alignment and floating-point precision. We then discuss the suitability of the different hardware depending on the network's size and connectivity, as random or sparse connectivities in mean-firing rate networks tend to break parallel performance on GPUs owing to the violation of memory coalescence.
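The workload being benchmarked is essentially one dense matrix-vector product plus a pointwise nonlinearity per step, which is exactly what maps well to BLAS, OpenMP, or a GPU. A minimal mean-firing rate sketch with NumPy (network size, weights, and time constants are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500
W = rng.normal(0.0, 0.5 / np.sqrt(N), (N, N))   # weak random recurrent weights
I = rng.uniform(0.0, 0.5, N)                     # constant external input
r = np.zeros(N)                                  # firing rates
tau, dt = 10.0, 1.0

for _ in range(2000):
    drive = W @ r + I                            # recurrent + external drive
    r += dt / tau * (-r + np.maximum(drive, 0))  # leaky rate dynamics, ReLU

print(float(r.min()) >= 0.0)                     # rates remain non-negative
```

With sub-critical random weights the network settles to a fixed point; the `W @ r` line dominates the cost, so the CPU/GPU comparison in the paper is largely a comparison of how each platform executes this product for different data layouts.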
36
Chessa M, Bianchi V, Zampetti M, Sabatini SP, Solari F. Real-time simulation of large-scale neural architectures for visual features computation based on GPU. Network 2012; 23:272-291. [PMID: 23116085; DOI: 10.3109/0954898x.2012.737500]
Abstract
The intrinsic parallelism of visual neural architectures based on distributed hierarchical layers is well suited to be implemented on the multi-core architectures of modern graphics cards. The design strategies that allow us to optimally take advantage of such parallelism, in order to efficiently map on GPU the hierarchy of layers and the canonical neural computations, are proposed. Specifically, the advantages of a cortical map-like representation of the data are exploited. Moreover, a GPU implementation of a novel neural architecture for the computation of binocular disparity from stereo image pairs, based on populations of binocular energy neurons, is presented. The implemented neural model achieves good performances in terms of reliability of the disparity estimates and a near real-time execution speed, thus demonstrating the effectiveness of the devised design strategies. The proposed approach is valid in general, since the neural building blocks we implemented are a common basis for the modeling of visual neural functionalities.
37
Slażyński L, Bohte S. Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks. Network 2012; 23:183-211. [PMID: 23098420; DOI: 10.3109/0954898x.2012.733842]
Abstract
The arrival of graphics processing unit (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously available only at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms that fully exploit the particulars of GPUs. Computation in a neural network is inherently parallel and thus a natural match for GPU architectures: given inputs, the internal state of each neuron can be updated in parallel. We show that for filter-based spiking neurons, like the Spike Response Model, the additive nature of membrane potential dynamics enables additional update parallelism. This also reduces the accumulation of numerical errors when using single-precision computation, the native precision of GPUs. We further show that optimizing simulation algorithms and data structures for the GPU's architecture has a large pay-off: for example, matching iterative neural updating to the memory architecture of the GPU speeds up this simulation step by a factor of three to five. With such optimizations, we can simulate in better than real time plausible spiking neural networks of up to 50,000 neurons, processing over 35 million spiking events per second.
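The additive structure the paper exploits can be shown in a few lines: with exponential response kernels, each neuron's potential decays by a constant factor per step and accumulates a weighted sum of incoming spikes, so every neuron updates independently of the others. A hypothetical tiny network (weights and time constants are illustrative, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
W = rng.uniform(0.0, 0.2, (n, n))   # synaptic weights (illustrative)
tau, dt = 20.0, 1.0
decay = np.exp(-dt / tau)           # per-step kernel decay factor

v = np.zeros(n)                     # membrane potentials
spikes = np.zeros(n)
spikes[0] = 1.0                     # a single input spike at t = 0

trace = []
for step in range(50):
    v = decay * v + W @ spikes      # additive, embarrassingly parallel update
    spikes[:] = 0.0                 # no further input spikes
    trace.append(v[1])

# after the first step, neuron 1's potential is W[1, 0] times a pure decay
print(abs(trace[10] - W[1, 0] * decay ** 10) < 1e-12)
```

Because the update is a scale-and-add with no data dependence between neurons, it vectorizes on a CPU and parallelizes one-thread-per-neuron on a GPU, which is the property the paper builds on.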
38
Abstract
Modern graphics cards contain hundreds of cores that can be programmed for intensive calculations. They are beginning to be used for spiking neural network simulations. The goal is to make parallel simulation of spiking neural networks available to a large audience, without the requirements of a cluster. We review the ongoing efforts towards this goal, and we outline the main difficulties.
39
Loane J, O'Mullane B, Bortz B, Knapp RB. Looking for similarities in movement between and within homes using cluster analysis. Health Informatics J 2012; 18:202-211. [PMID: 23011815; DOI: 10.1177/1460458212445501]
Abstract
In this article we examine data from eight purpose-built aware homes over a six-month period, looking at presence in rooms to try to determine patterns among the older residents. We look for homes that have similar movement patterns using cluster analysis. We also examine how movement over days clusters within individual homes. Our analysis shows that different homes have distinct movement patterns but within individual homes residents have strong movement routines.
40
Wang L, Hofer B, Guggenheim JA, Povazay B. Graphics processing unit-based dispersion encoded full-range frequency-domain optical coherence tomography. J Biomed Opt 2012; 17:077007. [PMID: 22894520; DOI: 10.1117/1.jbo.17.7.077007]
Abstract
Dispersion encoded full-range (DEFR) frequency-domain optical coherence tomography (FD-OCT) and its enhanced version, fast DEFR, utilize dispersion mismatch between sample and reference arm to eliminate the ambiguity in OCT signals caused by non-complex-valued spectral measurements, thereby numerically doubling the usable information content. By iteratively suppressing asymmetrically dispersed complex conjugate artifacts of OCT-signal pulses, the complex-valued signal can be recovered without additional measurements, thus doubling the spatial signal range to cover the full positive and negative sampling range. Previously, the computational complexity and low processing speed limited application of DEFR to smaller amounts of data and did not allow for interactive operation at high resolution. We report a graphics processing unit (GPU)-based implementation of fast DEFR, which improves reconstruction speed by a factor of more than 90 with respect to CPU-based processing and thereby overcomes these limitations. Implemented on a commercial low-cost GPU, a display line rate of ∼21,000 depth scans/s for 2048 samples/depth scan using 10 iterations of the fast DEFR algorithm has been achieved, sufficient for real-time visualization in situ.
41
Nagaoka T, Watanabe S. Multi-GPU accelerated three-dimensional FDTD method for electromagnetic simulation. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2011:401-404. [PMID: 22254333; DOI: 10.1109/iembs.2011.6090128]
Abstract
Numerical simulation with a numerical human model using the finite-difference time domain (FDTD) method has recently been performed in a number of fields in biomedical engineering. To improve the method's calculation speed and realize large-scale computing with the numerical human model, we adapt three-dimensional FDTD code to a multi-GPU environment using the Compute Unified Device Architecture (CUDA). In this study, we used NVIDIA Tesla C2070 cards as GPGPU boards. The performance of the multi-GPU setup is evaluated in comparison with that of a single GPU and of a vector supercomputer. The calculation speed with four GPUs was approximately 3.5 times faster than with a single GPU, and slightly (approximately 1.3 times) slower than with the supercomputer. The calculation speed of the three-dimensional FDTD method on GPUs improves significantly as the number of GPUs increases.
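The FDTD method distributed across GPUs in the paper is, per cell, a fixed leapfrog stencil. A minimal 1D Yee-scheme sketch in normalized units (grid size, step count, and source are illustrative assumptions; the 3D scheme updates six field components with the same structure):

```python
import numpy as np

nz, steps = 400, 150
E = np.zeros(nz)        # electric field at integer grid points
H = np.zeros(nz - 1)    # magnetic field at half grid points
S = 0.5                 # Courant number (stable for S <= 1 in 1D)

for t in range(steps):
    H += S * (E[1:] - E[:-1])               # update H from the curl of E
    E[1:-1] += S * (H[1:] - H[:-1])         # update E from the curl of H
    E[50] += np.exp(-((t - 30) / 10) ** 2)  # soft Gaussian source

print(np.isfinite(E).all())                 # scheme is stable at S = 0.5
```

Each cell reads only its immediate neighbors, so a multi-GPU split only needs a thin halo of boundary cells exchanged between devices every step, which is what makes the method scale across GPUs.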
42
Chen W, Ward K, Li Q, Kecman V, Najarian K, Menke N. Agent-based modeling of the blood coagulation system: implementation using a GPU-based high-speed framework. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2011:145-148. [PMID: 22254271; DOI: 10.1109/iembs.2011.6089915]
Abstract
The coagulation and fibrinolytic systems are complex, interconnected biological systems with major physiological roles. The complex, nonlinear, multi-point relationships between the molecular and cellular constituents of the two systems render a comprehensive and simultaneous study of the system at the microscopic and macroscopic levels a significant challenge. We have created an Agent Based Modeling and Simulation (ABMS) approach for simulating these complex interactions. As the number of agents increases, the time complexity and cost of the resulting simulations present a significant challenge. As such, in this paper we also present a high-speed framework for the coagulation simulation utilizing the computing power of graphics processing units (GPUs). For comparison, we also implemented the simulations in NetLogo, Repast, and directly in C. As our experiments demonstrate, at the million-agent scale the GPU implementation is over 10 times faster than the C version, over 100 times faster than the Repast version, and over 300 times faster than the NetLogo simulation.
43
Wang W, Huang HH, Kay M, Cavazos J. GPGPU accelerated cardiac arrhythmia simulations. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2011:724-727. [PMID: 22254412; DOI: 10.1109/iembs.2011.6090164]
Abstract
Computational modeling of cardiac electrophysiology is a powerful tool for studying arrhythmia mechanisms. In particular, cardiac models are useful for gaining insights into experimental studies, and in the foreseeable future they will be used by clinicians to improve therapy for patients suffering from complex arrhythmias. Such models are highly intricate, both in their geometric structure and in the equations that represent myocyte electrophysiology. For these models to be useful in a clinical setting, cost-effective solutions for solving the models in real time must be developed. In this work, we hypothesized that low-cost GPGPU-based hardware systems can be used to accelerate arrhythmia simulations. We ported a two-dimensional monodomain cardiac model and executed it on various GPGPU platforms. Electrical activity was simulated during point stimulation and rotor activity. Our GPGPU implementations provided significant speedups over the CPU implementation: 18X for point stimulation and 12X for rotor activity. We found that the number of threads that could be launched concurrently was a critical factor in optimizing the GPGPU implementations.
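A 2D monodomain simulation of the kind ported here pairs a diffusion stencil with a pointwise ionic model at every pixel, both ideal GPU workloads. A hedged sketch using FitzHugh-Nagumo kinetics under explicit Euler (the paper's ionic model and all constants below are not specified in the abstract; these are illustrative assumptions, with periodic boundaries for brevity):

```python
import numpy as np

n, steps, dt = 64, 200, 0.1
D = 0.1                                  # diffusion (tissue coupling) term
a, eps, beta = 0.1, 0.01, 0.5            # FitzHugh-Nagumo parameters

v = np.zeros((n, n))                     # transmembrane voltage
w = np.zeros((n, n))                     # recovery variable
v[30:34, 30:34] = 1.0                    # point stimulus

def laplacian(u):
    """5-point stencil; np.roll gives periodic boundaries for simplicity."""
    return (np.roll(u, 1, 0) + np.roll(u, -1, 0)
          + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)

for _ in range(steps):
    dv = D * laplacian(v) + v * (1 - v) * (v - a) - w
    dw = eps * (beta * v - w)
    v += dt * dv
    w += dt * dw

# the depolarized region should have spread beyond the 4x4 stimulus
print(np.isfinite(v).all(), int((v > 0.5).sum()))
```

Every cell's update depends only on its four neighbors and its own state, so the GPU port is one thread per cell, with concurrency limited mainly by how many such threads can be in flight, matching the paper's observation.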
44
Broxvall M, Emilsson K, Thunberg P. Fast GPU based adaptive filtering of 4D echocardiography. IEEE Trans Med Imaging 2012; 31:1165-1172. [PMID: 22167599; DOI: 10.1109/tmi.2011.2179308]
Abstract
Time-resolved three-dimensional (3D) echocardiography generates four-dimensional (3D+time) data sets that bring new possibilities in clinical practice. Image quality of four-dimensional (4D) echocardiography is, however, regarded as poorer than that of conventional echocardiography, where time-resolved 2D imaging is used. Advanced image processing filtering methods can be used to achieve image improvements, but at the cost of heavy data processing. The recent development of graphics processing units (GPUs) enables highly parallel general-purpose computations that considerably reduce the computational time of advanced image filtering methods. In this study, multidimensional adaptive filtering of 4D echocardiography was performed using GPUs. Filtering was done using multiple kernels implemented in OpenCL (open computing language) working on multiple subsets of the data. Our results show a substantial speed increase of up to 74 times, resulting in a total filtering time of less than 30 s on a common desktop. This implies that advanced adaptive image processing can be accomplished in conjunction with a clinical examination. Since the presented GPU method scales linearly with the number of processing elements, we expect it to continue scaling with the expected future increases in the number of processing elements. This should be contrasted with the increases in data set sizes expected in the near future, following further improvements in ultrasound probes and measuring devices. It is concluded that GPUs facilitate the use of demanding adaptive image filtering techniques that in turn enhance 4D echocardiographic data sets. The presented general methodology of implementing parallelism using GPUs is also applicable to other medical modalities that generate multidimensional data.
45
Watanabe Y. Real-time processing of Fourier domain optical coherence tomography with fixed-pattern noise removal by partial median subtraction using a graphics processing unit. J Biomed Opt 2012; 17:050503. [PMID: 22612118; DOI: 10.1117/1.jbo.17.5.050503]
Abstract
The author presents a graphics processing unit (GPU) implementation of real-time Fourier domain optical coherence tomography (FD-OCT) with fixed-pattern noise removal by subtraction of means and medians. In general, the fixed-pattern noise can be removed by subtracting the averaged spectrum computed from many spectra of an actual measurement. However, a mean spectrum results in artifacts, appearing as residual lateral lines, caused by a small number of highly reflective points on the sample surface. These artifacts can be eliminated from OCT images by using medians instead of means. However, median calculations, which are based on a sorting algorithm, can require a large amount of computation time. In the developed GPU program, highly reflective surface regions were detected by calculating the standard deviation of the Fourier-transformed data in the lateral direction. Medians were then subtracted at the detected regions and means at the other regions, such as the background. When the median calculation covered fewer than 256 of the 512 depth positions in an OCT image with 1024 A-lines, the GPU processing rate was faster than the line rate of the line scan camera (46.9 kHz). Therefore, processed OCT images can be displayed in real time using partial medians.
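The partial-median idea can be demonstrated on synthetic data: at each depth, subtract the median across A-lines where a bright reflector makes the mean unreliable, and the cheaper mean elsewhere. A NumPy sketch (image sizes, noise levels, and the 3x-median threshold are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n_alines, n_depth = 256, 128
fixed = rng.uniform(0.5, 1.0, n_depth)             # fixed-pattern, per depth
img = (np.tile(fixed, (n_alines, 1))
       + 0.01 * rng.normal(size=(n_alines, n_depth)))
img[:8, 40] += 5.0                                  # bright surface, few A-lines

std = img.std(axis=0)                               # lateral std per depth
bright = std > 3 * np.median(std)                   # depths with reflectors
ref = np.where(bright, np.median(img, axis=0), img.mean(axis=0))
clean = img - ref                                   # fixed pattern removed

# mean subtraction alone leaves a residual lateral line at depth 40:
residual_mean = np.abs((img - img.mean(axis=0))[8:, 40]).mean()
residual_med = np.abs(clean[8:, 40]).mean()
print(residual_med < residual_mean)
```

Because the outliers pull the mean but barely move the median, the median reference suppresses the residual line, while restricting the costly sort-based median to the few flagged depths keeps the per-frame cost low, which is the paper's real-time argument.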
46
Daga M, Feng WC. Multi-dimensional characterization of electrostatic surface potential computation on graphics processors. BMC Bioinformatics 2012; 13(Suppl 5):S4. [PMID: 22537008; PMCID: PMC3358664; DOI: 10.1186/1471-2105-13-s5-s4]
Abstract
BACKGROUND: Calculating the electrostatic surface potential (ESP) of a biomolecule is critical to understanding biomolecular function. Because of its quadratic computational complexity (as a function of the number of atoms in a molecule), there have been continual efforts to reduce its cost either by improving the algorithm or the underlying hardware on which the calculations are performed. RESULTS: We present the combined effect of (i) a multi-scale approximation algorithm, known as hierarchical charge partitioning (HCP), applied to the calculation of the ESP and (ii) its mapping onto a graphics processing unit (GPU). To date, most molecular modeling algorithms perform an artificial partitioning of biomolecules into a grid/lattice on the GPU. In contrast, HCP takes advantage of the natural partitioning in biomolecules, which in turn better facilitates its mapping onto the GPU. Specifically, we characterize the effect of known GPU optimization techniques such as the use of shared memory. In addition, we demonstrate how the cost of divergent branching on a GPU can be amortized across algorithms like HCP to deliver a massive performance boost. CONCLUSIONS: We accelerated the calculation of the ESP by 25-fold through parallelization on the GPU alone. Combining the GPU with HCP resulted in a speedup of up to 1,860-fold for our largest molecular structure. The baseline for these speedups is an implementation that is hand-tuned, SSE-optimized, and parallelized across 16 CPU cores. The use of the GPU does not degrade the accuracy of our results.
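The quadratic-cost kernel being accelerated is a direct Coulomb sum: the potential at each surface vertex is a sum over all atomic partial charges. A NumPy sketch in units where 1/(4πε₀) = 1 (coordinates and charges are made-up illustrations, and HCP's hierarchical approximation is not shown):

```python
import numpy as np

def surface_potential(atoms, charges, verts):
    """phi[v] = sum_a q_a / |r_v - r_a|, an O(n_verts * n_atoms) direct sum."""
    diff = verts[:, None, :] - atoms[None, :, :]     # (V, A, 3) displacement
    dist = np.linalg.norm(diff, axis=2)              # (V, A) distances
    return (charges / dist).sum(axis=1)              # (V,) potentials

atoms = np.array([[0.0, 0.0, 0.0]])
charges = np.array([2.0])
verts = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
print(surface_potential(atoms, charges, verts))      # [2.0, 1.0]
```

Each vertex's sum is independent, giving the natural one-thread-per-vertex GPU mapping; HCP then cuts the per-vertex work by replacing distant groups of atoms with approximate aggregate charges.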
|
47
|
Nakai H. [Three-dimensional computer graphics. 1. Hardware topics]. Nihon Hoshasen Gijutsu Gakkai Zasshi 2012; 68:1414-1418. [PMID: 23089846] [DOI: 10.6009/jjrt.2012_jsrt_68.10.1414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
|
48
|
Seibert A, Denisov S, Ponomarev AV, Hänggi P. Mapping the Arnold web with a graphic processing unit. Chaos 2011; 21:043123. [PMID: 22225360] [DOI: 10.1063/1.3658622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2]
Abstract
Arnold diffusion is a dynamical phenomenon that may occur in the phase space of a non-integrable Hamiltonian system whenever the number of degrees of freedom is M ≥ 3. The diffusion is mediated by a web-like structure of resonance channels that penetrates the phase space and allows the system to explore the whole energy shell. Arnold diffusion is a slow process, so mapping the web is a very time-consuming task. We demonstrate that exploring the Arnold web on a graphics processing unit (GPU) supercomputer can yield speedups of two orders of magnitude compared with standard CPU-based simulations.
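The GPU gain here comes from the fact that each initial condition evolves independently, so the phase space can be charted one trajectory per thread with no cross-trajectory communication. A vectorized leapfrog integrator over an ensemble of pendulum trajectories illustrates the pattern; NumPy stands in for the GPU, and the Hamiltonian H = p²/2 + 1 − cos q and step sizes are illustrative assumptions, not the system studied in the paper:

```python
import numpy as np

def leapfrog_ensemble(q, p, dt, steps):
    """Symplectic (leapfrog) integration of many independent trajectories
    of H = p^2/2 + 1 - cos q at once; on a GPU each array slot would be
    one thread."""
    p = p - 0.5 * dt * np.sin(q)        # initial half kick
    for _ in range(steps):
        q = q + dt * p                  # drift
        p = p - dt * np.sin(q)          # full kick
    p = p + 0.5 * dt * np.sin(q)        # trim the final kick to a half step
    return q, p

def energy(q, p):
    """Per-trajectory energy, conserved up to O(dt^2) by the integrator."""
    return 0.5 * p**2 + 1.0 - np.cos(q)
```

Long integration times are exactly the regime of the slow Arnold diffusion, which is why the per-trajectory parallelism pays off so heavily.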
|
49
|
Yang X, Deka S, Righetti R. A hybrid CPU-GPGPU approach for real-time elastography. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control 2011; 58:2631-2645. [PMID: 23443699] [DOI: 10.1109/tuffc.2011.2126] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7]
Abstract
Ultrasound elastography is becoming a widely available clinical imaging tool. In recent years, several real-time elastography algorithms have been proposed; however, most of these algorithms achieve real-time frame rates through compromises in elastographic image quality. Cross-correlation-based elastographic techniques are known to provide high-quality elastographic estimates, but they are computationally intense and usually not suitable for real-time clinical applications. Recently, the use of massively parallel general purpose graphics processing units (GPGPUs) for accelerating computationally intense operations in biomedical applications has received great interest. In this study, we investigate the use of the GPGPU to speed up generation of cross-correlation-based elastograms and achieve real-time frame rates while preserving elastographic image quality. We propose and statistically analyze performance of a new hybrid model of computation suitable for elastography applications in which sequential code is executed on the CPU and parallel code is executed on the GPGPU. Our results indicate that the proposed hybrid approach yields optimal results and adequately addresses the trade-off between speed and quality.
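The compute-heavy kernel a GPGPU would own in this hybrid model is the per-window normalized cross-correlation between pre- and post-compression RF lines, while the sequential bookkeeping stays on the CPU. A minimal 1-D sketch of that kernel, with NumPy standing in for the GPGPU and the window and search sizes chosen purely for illustration:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-length windows."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def axial_displacements(pre, post, win=64, search=16):
    """For each window of the pre-compression RF line, find the lag into
    the post-compression line with the highest NCC. Windows are mutually
    independent -> one GPGPU work-group per window; the outer loop and
    result assembly are the sequential CPU part of the hybrid model."""
    shifts = []
    for start in range(search, len(pre) - win - search, win):
        ref = pre[start:start + win]
        best_lag, best = 0, -2.0
        for lag in range(-search, search + 1):
            c = ncc(ref, post[start + lag:start + lag + win])
            if c > best:
                best, best_lag = c, lag
        shifts.append(best_lag)
    return shifts
```

Sub-sample interpolation of the correlation peak, which real elastography pipelines add, is omitted here to keep the parallel structure visible.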
|
50
|
Waudby CA, Christodoulou J. GPU accelerated Monte Carlo simulation of pulsed-field gradient NMR experiments. Journal of Magnetic Resonance 2011; 211:67-73. [PMID: 21570329] [DOI: 10.1016/j.jmr.2011.04.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/08/2011] [Revised: 04/07/2011] [Accepted: 04/11/2011]
Abstract
The simulation of diffusion by Monte Carlo methods is often essential to describing NMR measurements of diffusion in porous media. However, simulation timescales must often span hundreds of milliseconds, with large numbers of trajectories required to ensure statistical convergence. Here we demonstrate that parallelising the code to run on graphics processing units (GPUs) accelerates these calculations by over three orders of magnitude, opening new frontiers in experimental design and analysis. As such cards are commonly installed in desktop computers, we expect this to prove useful in many cases where simple analytical descriptions are not available or appropriate, e.g. in complex geometries, where short-gradient-pulse approximations do not hold, or for the analysis of diffusion-weighted MRI in complex tissues such as the lungs and brain.
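For free diffusion the Monte Carlo estimate can be checked against the closed-form narrow-pulse Stejskal-Tanner attenuation E = exp(−q²DΔ), with q = γgδ. A minimal 1-D sketch of the walker loop, with NumPy vectorization standing in for the per-walker GPU threads and all parameter values chosen for illustration:

```python
import numpy as np

def pgse_attenuation(D, delta, Delta, gamma, g,
                     n_walkers=100_000, n_steps=200, seed=1):
    """Narrow-pulse PGSE echo attenuation by Monte Carlo: each walker
    diffuses freely for the diffusion time Delta and acquires phase
    q * (net displacement), q = gamma * g * delta. Walkers are fully
    independent -> one GPU thread each in a parallel implementation."""
    rng = np.random.default_rng(seed)
    dt = Delta / n_steps
    net = np.zeros(n_walkers)                 # x(Delta) - x(0) per walker
    for _ in range(n_steps):
        net += rng.normal(0.0, np.sqrt(2.0 * D * dt), size=n_walkers)
    q = gamma * g * delta
    return float(np.mean(np.cos(q * net)))    # E(q) = <cos(phase)>
```

Restricted geometries are handled by adding a boundary-reflection step inside the loop; that is precisely where the analytical forms break down and the GPU speedup becomes most valuable.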
|