1. Ren X, Li J, Hua Z, Jiang X. Consistent image processing based on co-saliency. CAAI Transactions on Intelligence Technology 2021. [DOI: 10.1049/cit2.12020]
Affiliation(s)
- Xiangnan Ren
  - School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
  - Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Shandong Technology and Business University, Yantai, China
- Jinjiang Li
  - School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
  - Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Shandong Technology and Business University, Yantai, China
- Zhen Hua
  - Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Shandong Technology and Business University, Yantai, China
  - School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai, China
- Xinbo Jiang
  - School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
  - Shandong Provincial Key Laboratory of Software Engineering, Shandong University, Jinan, China
2. Wang M, Lang C, Liang L, Feng S, Wang T, Gao Y. End-to-End Text-to-Image Synthesis with Spatial Constraints. ACM Transactions on Intelligent Systems and Technology 2020. [DOI: 10.1145/3391709]
Abstract
Although the performance of automatically generating high-resolution, realistic images from text descriptions has been significantly boosted, many challenging issues in image synthesis remain under-investigated, including shape variations, viewpoint changes, pose changes, and the relations among multiple objects. In this article, we propose a novel end-to-end approach for text-to-image synthesis with spatial constraints that mines object spatial location and shape information. Instead of learning a hierarchical mapping from text to image, our algorithm directly generates fine-grained multi-object images under the guidance of generated semantic layouts. By fusing text semantics and spatial information into a synthesis module and jointly fine-tuning it with the generated multi-scale semantic layouts, the proposed networks show impressive performance in text-to-image synthesis for complex scenes. We evaluate our method on both the single-object CUB dataset and the multi-object MS-COCO dataset. Comprehensive experimental results demonstrate that our method consistently outperforms state-of-the-art approaches across different evaluation metrics.
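As a rough illustration of the layout-guided fusion this abstract describes, the sketch below conditions a generator block on both a sentence embedding and a coarse semantic layout map. The block name (`LayoutFusionBlock`), the channel sizes, and the broadcast-and-concatenate scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LayoutFusionBlock(nn.Module):
    """Hypothetical fusion step: broadcast a text embedding over the
    spatial grid and concatenate it with a semantic layout map before
    convolving, so generation is conditioned on both signals."""
    def __init__(self, text_dim, layout_classes, out_channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(text_dim + layout_classes, out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, text_emb, layout):
        # text_emb: (B, text_dim); layout: (B, classes, H, W) one-hot map
        b, _, h, w = layout.shape
        text_map = text_emb[:, :, None, None].expand(-1, -1, h, w)
        return self.conv(torch.cat([text_map, layout], dim=1))

# Example: fuse a 256-d sentence embedding with an 80-class layout at 64x64.
fusion = LayoutFusionBlock(text_dim=256, layout_classes=80, out_channels=128)
text = torch.randn(2, 256)
layout = torch.zeros(2, 80, 64, 64)
features = fusion(text, layout)  # (2, 128, 64, 64), fed to later upsampling stages
```

In a full pipeline, a block like this would sit at each scale of the generator so that the multi-scale layouts mentioned in the abstract can steer synthesis throughout upsampling.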
Affiliation(s)
- Min Wang
  - Beijing Jiaotong University, Beijing, China
- Tao Wang
  - Beijing Jiaotong University, Beijing, China
- Yutong Gao
  - Beijing Jiaotong University, Beijing, China
3. Liu H, Wang H, Wu Y, Xing L. Superpixel Region Merging Based on Deep Network for Medical Image Segmentation. ACM Transactions on Intelligent Systems and Technology 2020. [DOI: 10.1145/3386090]
Abstract
Automatic and accurate semantic segmentation of pathological structures in medical images is challenging because of noise, the deformable shapes of pathology, and low contrast between soft tissues. Classical superpixel-based classification algorithms suffer from edge leakage due to the complexity and heterogeneity inherent in medical images. We therefore propose a deep U-Net with superpixel region merging incorporated for edge enhancement, to facilitate and optimize segmentation. Our approach combines three innovations: (1) unlike purely deep learning-based image segmentation, the segmentation evolves from superpixel region merging driven by U-Net training, so merging decisions draw on rich semantic information in addition to gray-level similarity; (2) a bilateral filtering module is adopted at the beginning of the network to suppress external noise and enhance soft-tissue contrast at the edges of pathology; and (3) a normalization layer is inserted after the convolutional layer at each feature scale to prevent overfitting and increase sensitivity to model parameters. The model was validated on lung CT, brain MR, and coronary CT datasets. Experiments with different superpixel methods and cross-validation show the effectiveness of this architecture. The hyperparameter settings were explored empirically to achieve a good trade-off between performance and efficiency; a four-layer network achieves the best results in precision, recall, F-measure, and running speed. Our method outperformed state-of-the-art networks, including FCN-16s, SegNet, PSPNet, DeepLabv3, and the traditional U-Net, both quantitatively and qualitatively. Source code for the complete method is available at https://github.com/Leahnawho/Superpixel-network.
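A minimal sketch of the core merging idea: assuming a trained network already supplies a per-pixel probability map, adjacent SLIC superpixels can be merged when their mean predicted probabilities agree. The function name, the threshold, and the greedy union-find scheme are assumptions for illustration, not the released implementation (see the linked GitHub repository for the authors' code).

```python
import numpy as np
from skimage.segmentation import slic

def merge_superpixels(prob_map, image, n_segments=200, thresh=0.1):
    """Illustrative region merging: start from SLIC superpixels and
    greedily merge adjacent regions whose mean network probabilities
    (e.g., a U-Net foreground map) differ by less than `thresh`."""
    labels = slic(image, n_segments=n_segments, compactness=10,
                  channel_axis=None)  # grayscale medical slice
    ids = np.unique(labels)
    # Mean predicted probability per superpixel.
    means = {i: prob_map[labels == i].mean() for i in ids}
    # Union-find over adjacent, semantically similar regions.
    parent = {i: i for i in ids}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Horizontally and vertically adjacent pixel pairs define adjacency.
    pairs = set(zip(labels[:, :-1].ravel(), labels[:, 1:].ravel())) | \
            set(zip(labels[:-1, :].ravel(), labels[1:, :].ravel()))
    for a, b in pairs:
        if a != b and abs(means[a] - means[b]) < thresh:
            parent[find(a)] = find(b)
    return np.vectorize(find)(labels)  # merged label map
```

The real method learns the merging criterion end to end during U-Net training; the fixed threshold here only conveys the superpixel-to-region structure of the pipeline.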
Affiliation(s)
- Hui Liu
  - Shandong University of Finance and Economics and Stanford University, Jinan, Shandong Province, China
- Haiou Wang
  - Shandong University of Finance and Economics, Jinan, Shandong Province, China
- Yan Wu
  - Stanford University, CA, USA