1
Ju Y, Lam KM, Xie W, Zhou H, Dong J, Shi B. Deep Learning Methods for Calibrated Photometric Stereo and Beyond. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:7154-7172. [PMID: 38607717 DOI: 10.1109/tpami.2024.3388150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2024]
Abstract
Photometric stereo recovers the surface normals of an object from multiple images with varying shading cues, i.e., modeling the relationship between surface orientation and intensity at each pixel. Photometric stereo prevails in superior per-pixel resolution and fine reconstruction details. However, it is a complicated problem because of the non-linear relationship caused by non-Lambertian surface reflectance. Recently, various deep learning methods have shown a powerful ability in the context of photometric stereo against non-Lambertian surfaces. This paper provides a comprehensive review of existing deep learning-based calibrated photometric stereo methods utilizing orthographic cameras and directional light sources. We first analyze these methods from different perspectives, including input processing, supervision, and network architecture. We summarize the performance of deep learning photometric stereo models on the most widely-used benchmark data set. This demonstrates the advanced performance of deep learning-based photometric stereo methods. Finally, we give suggestions and propose future research trends based on the limitations of existing models.
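For orientation, the per-pixel relationship between surface orientation and intensity mentioned in the abstract can be illustrated with the classical Lambertian baseline, which the reviewed deep learning methods aim to go beyond for non-Lambertian surfaces. The sketch below is not taken from the paper. It assumes an ideal Lambertian pixel, no shadows, and known unit light directions; the function names `det3`, `solve3`, and `lambertian_normal` are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions are (nearly) coplanar")
    x = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        x.append(det3(m) / d)
    return x


def lambertian_normal(lights, intensities):
    """Least-squares normal and albedo for one pixel.

    Assumed model (not from the paper): I_j = albedo * dot(l_j, n).
    Solves the normal equations (L^T L) g = L^T I with g = albedo * n.
    """
    ata = [[sum(l[i] * l[j] for l in lights) for j in range(3)]
           for i in range(3)]
    atb = [sum(l[i] * v for l, v in zip(lights, intensities))
           for i in range(3)]
    g = solve3(ata, atb)
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is completely dark")
    return [c / albedo for c in g], albedo


# Synthetic pixel: true normal (0, 0.6, 0.8), albedo 0.5, four unit lights.
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8), (-0.6, 0.0, 0.8)]
true_normal = (0.0, 0.6, 0.8)
observed = [0.5 * sum(a * b for a, b in zip(l, true_normal)) for l in lights]
normal, albedo = lambertian_normal(lights, observed)  # close to the true values
```

With specular highlights or cast shadows the linear model above no longer holds, which is the difficulty the abstract attributes to non-Lambertian reflectance.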
2
Schmitt C, Antic B, Neculai A, Lee JH, Geiger A. Towards Scalable Multi-View Reconstruction of Geometry and Materials. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:15850-15869. [PMID: 37708017 DOI: 10.1109/tpami.2023.3314348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/16/2023]
Abstract
In this paper, we propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes that exceed object-scale and hence cannot be captured with stationary light stages. The input are high-resolution RGB-D images captured by a mobile, hand-held capture system with point lights for active illumination. Compared to previous works that jointly estimate geometry and materials from a hand-held scanner, we formulate this problem using a single objective function that can be minimized using off-the-shelf gradient-based solvers. To facilitate scalability to large numbers of observation views and optimization variables, we introduce a distributed optimization algorithm that reconstructs 2.5D keyframe-based representations of the scene. A novel multi-view consistency regularizer effectively synchronizes neighboring keyframes such that the local optimization results allow for seamless integration into a globally consistent 3D model. We provide a study on the importance of each component in our formulation and show that our method compares favorably to baselines. We further demonstrate that our method accurately reconstructs various objects and materials and allows for expansion to spatially larger scenes. We believe that this work represents a significant step towards making geometry and material estimation from hand-held scanners scalable.
3
Enomoto K, Waechter M, Okura F, Kutulakos KN, Matsushita Y. Discrete Search Photometric Stereo for Fast and Accurate Shape Estimation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:4355-4367. [PMID: 35976840 DOI: 10.1109/tpami.2022.3198729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
We consider the problem of estimating surface normals of a scene with spatially varying, general bidirectional reflectance distribution functions (BRDFs) observed by a static camera under varying distant illuminations. Unlike previous approaches that rely on continuous optimization of surface normals, we cast the problem as a discrete search problem over a set of finely discretized surface normals. In this setting, we show that the expensive processes can be precomputed in a scene-independent manner, resulting in accelerated inference. We discuss two variants of our discrete search photometric stereo (DSPS), one working with continuous linear combinations of BRDF bases and the other working with discrete BRDFs sampled from a BRDF space. Experiments show that DSPS has comparable accuracy to state-of-the-art exemplar-based photometric stereo methods while achieving 10-100x acceleration.
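The idea of replacing continuous optimization with a search over finely discretized normals can be sketched under a much simpler reflectance assumption than the paper's. The example below is not the DSPS algorithm: it assumes Lambertian shading instead of general BRDFs, uses a regular polar/azimuth grid, and all function names are hypothetical. It does show why part of the work can be precomputed, since the shading table depends only on the candidate normals and the lights, not on the observed pixel.

```python
import math


def hemisphere_candidates(steps=60):
    """Unit normals facing the camera (z > 0) on a regular polar/azimuth grid."""
    candidates = [(0.0, 0.0, 1.0)]
    for i in range(1, steps):
        theta = 0.5 * math.pi * i / steps  # polar angle from the viewing axis
        for j in range(4 * steps):
            phi = 2.0 * math.pi * j / (4 * steps)
            candidates.append((math.sin(theta) * math.cos(phi),
                               math.sin(theta) * math.sin(phi),
                               math.cos(theta)))
    return candidates


def precompute_shading(candidates, lights):
    """Lambertian shading max(0, dot(l, n)) for every candidate normal.

    This table is independent of the observed pixel values.
    """
    return [[max(0.0, sum(a * b for a, b in zip(l, n))) for l in lights]
            for n in candidates]


def discrete_search_normal(intensities, candidates, shading_table):
    """Return the candidate whose scaled shading best fits the observations."""
    best_normal, best_residual = None, float("inf")
    for n, s in zip(candidates, shading_table):
        ss = sum(v * v for v in s)
        if ss == 0.0:
            continue  # candidate is unlit under every light
        albedo = sum(a * b for a, b in zip(s, intensities)) / ss
        residual = sum((i - albedo * v) ** 2 for i, v in zip(intensities, s))
        if residual < best_residual:
            best_normal, best_residual = n, residual
    return best_normal, best_residual


# Synthetic pixel with true normal (0, 0.6, 0.8) and albedo 0.5.
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8), (-0.6, 0.0, 0.8)]
true_normal = (0.0, 0.6, 0.8)
observed = [0.5 * sum(a * b for a, b in zip(l, true_normal)) for l in lights]
candidates = hemisphere_candidates()
table = precompute_shading(candidates, lights)  # reusable for every pixel
estimate, residual = discrete_search_normal(observed, candidates, table)
```

The accuracy of such a search is bounded by the grid spacing (1.5 degrees here), so finer grids trade memory and search time for angular precision.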
4
A CNN Based Approach for the Point-Light Photometric Stereo Problem. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01689-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
5
Ju Y, Shi B, Jian M, Qi L, Dong J, Lam KM. NormAttention-PSN: A High-frequency Region Enhanced Photometric Stereo Network with Normalized Attention. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01684-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
6
Example-Based Multispectral Photometric Stereo for Multi-Colored Surfaces. J Imaging 2022; 8:jimaging8040107. [PMID: 35448234 PMCID: PMC9024654 DOI: 10.3390/jimaging8040107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 04/05/2022] [Accepted: 04/06/2022] [Indexed: 02/04/2023]
Abstract
A photometric stereo needs three images taken under three different light directions lit one by one, while a color photometric stereo needs only one image taken under three different lights lit at the same time with different light directions and different colors. As a result, a color photometric stereo can obtain the surface normal of a dynamically moving object from a single image. However, the conventional color photometric stereo cannot estimate a multicolored object due to the colored illumination. This paper uses an example-based photometric stereo to solve the problem of the color photometric stereo. The example-based photometric stereo searches the surface normal from the database of the images of known shapes. Color photometric stereos suffer from mathematical difficulty, and they add many assumptions and constraints; however, the example-based photometric stereo is free from such mathematical problems. The process of our method is pixelwise; thus, the estimated surface normal is not oversmoothed, unlike existing methods that use smoothness constraints. To demonstrate the effectiveness of this study, a measurement device that can realize the multispectral photometric stereo method with sixteen colors is employed instead of the classic color photometric stereo method with three colors.
7
Santo H, Samejima M, Sugano Y, Shi B, Matsushita Y. Deep Photometric Stereo Networks for Determining Surface Normal and Reflectances. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:114-128. [PMID: 32750795 DOI: 10.1109/tpami.2020.3005219] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
This article presents a photometric stereo method based on deep learning. One of the major difficulties in photometric stereo is designing an appropriate reflectance model that is both capable of representing real-world reflectances and computationally tractable for deriving surface normal. Unlike previous photometric stereo methods that rely on a simplified parametric image formation model, such as the Lambert's model, the proposed method aims at establishing a flexible mapping between complex reflectance observations and surface normal using a deep neural network. In addition, the proposed method predicts the reflectance, which allows us to understand surface materials and to render the scene under arbitrary lighting conditions. As a result, we propose a deep photometric stereo network (DPSN) that takes reflectance observations under varying light directions and infers the surface normal and reflectance in a per-pixel manner. To make the DPSN applicable to real-world scenes, a dataset of measured BRDFs (MERL BRDF dataset) has been used for training the network. Evaluation using simulation and real-world scenes shows the effectiveness of the proposed approach in estimating both surface normal and reflectances.
8
Chen G, Han K, Shi B, Matsushita Y, Wong KYK. Deep Photometric Stereo for Non-Lambertian Surfaces. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:129-142. [PMID: 32750798 DOI: 10.1109/tpami.2020.3005397] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
This paper addresses the problem of photometric stereo, in both calibrated and uncalibrated scenarios, for non-Lambertian surfaces based on deep learning. We first introduce a fully convolutional deep network for calibrated photometric stereo, which we call PS-FCN. Unlike traditional approaches that adopt simplified reflectance models to make the problem tractable, our method directly learns the mapping from reflectance observations to surface normal, and is able to handle surfaces with general and unknown isotropic reflectance. At test time, PS-FCN takes an arbitrary number of images and their associated light directions as input and predicts a surface normal map of the scene in a fast feed-forward pass. To deal with the uncalibrated scenario where light directions are unknown, we introduce a new convolutional network, named LCNet, to estimate light directions from input images. The estimated light directions and the input images are then fed to PS-FCN to determine the surface normals. Our method does not require a pre-defined set of light directions and can handle multiple images in an order-agnostic manner. Thorough evaluation of our approach on both synthetic and real datasets shows that it outperforms state-of-the-art methods in both calibrated and uncalibrated scenarios.
9
Chen L, Zheng Y, Shi B, Subpa-Asa A, Sato I. A Microfacet-Based Model for Photometric Stereo with General Isotropic Reflectance. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:48-61. [PMID: 31295106 DOI: 10.1109/tpami.2019.2927909] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
This paper presents a precise, stable, and invertible reflectance model for photometric stereo. This microfacet-based model is applicable to all types of isotropic surface reflectance, covering cases from diffusion to specular reflections. We introduce a single variable to physically quantify the surface smoothness, and by monotonically sliding this variable between 0 and 1, our model enables a versatile representation that can smoothly transform between an ellipsoid of revolution and the equation for Lambertian reflectance. In the inverse domain, this model offers a compact and physically interpretable formulation, for which we introduce a fast and lightweight solver that allows accurate estimations for both surface smoothness and surface shape. Finally, extensive experiments on the appearances of synthesized and real objects evidence that this model is state-of-the-art in our off-the-shelf solution.
10
Yu C, Lee SW. Deep Photometric Stereo Network with Multi-Scale Feature Aggregation. SENSORS 2020; 20:s20216261. [PMID: 33153006 PMCID: PMC7675179 DOI: 10.3390/s20216261] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/08/2020] [Revised: 10/30/2020] [Accepted: 11/02/2020] [Indexed: 11/21/2022]
Abstract
We present photometric stereo algorithms robust to non-Lambertian reflection, which are based on a convolutional neural network in which surface normals of objects with complex geometry and surface reflectance are estimated from a given set of an arbitrary number of images. These images are taken from the same viewpoint under different directional illumination conditions. The proposed method focuses on surface normal estimation, where multi-scale feature aggregation is proposed to obtain a more accurate surface normal, and max pooling is adopted to obtain an intermediate order-agnostic representation in the photometric stereo scenario. The proposed multi-scale feature aggregation scheme using feature concatenation is easily incorporated into existing photometric stereo network architectures. Our experiments were performed with the DiLiGenT photometric stereo benchmark dataset consisting of ten real objects, and they demonstrated that the accuracies of our calibrated and uncalibrated photometric stereo approaches were improved over those of baseline methods. In particular, our experiments also demonstrated that our uncalibrated photometric stereo outperformed the state-of-the-art method. Our work is the first to consider the multi-scale feature aggregation in photometric stereo, and we showed that our proposed multi-scale fusion scheme estimated the surface normal accurately and was beneficial to improving performance.
Affiliation(s)
- Chanki Yu
  - Department of Media Technology, Graduate School of Media, Sogang University, Seoul 04107, Korea
- Sang Wook Lee
  - Department of Media Technology, Graduate School of Media, Sogang University, Seoul 04107, Korea
  - Department of Art & Technology, School of Media, Arts and Science, Sogang University, Seoul 04107, Korea
  - Correspondence: Tel.: +82-2-705-8902
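The order-agnostic representation obtained by max pooling, as described in the abstract of entry 10, can be illustrated with a minimal sketch. It is not the authors' network: the per-image feature vectors are hypothetical stand-ins for the output of a shared feature extractor, and `max_pool_features` is an illustrative name.

```python
def max_pool_features(per_image_features):
    """Element-wise maximum over equal-length feature vectors, one per image.

    The result has a fixed length regardless of how many images are given
    and does not depend on the order of the images.
    """
    if not per_image_features:
        raise ValueError("need at least one feature vector")
    length = len(per_image_features[0])
    if any(len(f) != length for f in per_image_features):
        raise ValueError("all feature vectors must have the same length")
    return [max(values) for values in zip(*per_image_features)]


# Hypothetical features for three input images (feature dimension 4).
features = [
    [0.1, 0.9, 0.3, 0.0],
    [0.7, 0.2, 0.4, 0.5],
    [0.2, 0.8, 0.6, 0.1],
]
fused = max_pool_features(features)
fused_shuffled = max_pool_features(features[::-1])  # same result in any order
```

Because the maximum is taken per feature channel, adding or reordering input images changes neither the size of the fused vector nor the way later layers consume it.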
11
Wang X, Jian Z, Ren M. Non-Lambertian Photometric Stereo Network based on Inverse Reflectance Model with Collocated Light. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:6032-6042. [PMID: 32310771 DOI: 10.1109/tip.2020.2987176] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/11/2023]
Abstract
Current non-Lambertian photometric stereo methods generally require a large number of images to ensure accurate surface normal estimation. To achieve accurate surface normal recovery under a sparse set of lights, this paper proposes a non-Lambertian photometric stereo network based on a derived inverse reflectance model with collocated light. The model is deduced using monotonicity of isotropic reflectance and the univariate property of collocated light to decouple the surface normal from the reflectance function. Thus, the surface normal can be estimated by three steps, i.e., model fitting, shadow rejection, and normal estimation. We leverage a supervised deep learning technique to enhance the shadow rejection ability and the flexibility of the inverse reflectance model. Shadows are handled through max-pooling. Information from a neighborhood image patch is utilized to improve the flexibility to various reflectances. Experiments using both synthetic and real images demonstrate that the proposed method achieves state-of-the-art accuracy in surface normal estimation.
12
Wendt G, Faul F. Factors Influencing the Detection of Spatially-Varying Surface Gloss. Iperception 2019; 10:2041669519866843. [PMID: 31523415 PMCID: PMC6732868 DOI: 10.1177/2041669519866843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/28/2019] [Accepted: 07/10/2019] [Indexed: 11/15/2022]
Abstract
In this study, we investigate the ability of human observers to detect spatial inhomogeneities in the glossiness of a surface and how the performance in this task depends on several context factors. We used computer-generated stimuli showing a single object in three-dimensional space whose surface was split into two spatial areas with different microscale smoothness. The context factors were the kind of illumination, the object's shape, the availability of motion information, the degree of edge blurring, the spatial proportions between the two areas of different smoothness, and the general smoothness level. Detection thresholds were determined using a two-alternative forced choice (2AFC) task implemented in a double random staircase procedure, where the subjects had to indicate for each stimulus whether or not the surface appears to have a spatially uniform material. We found evidence that two different cues are used for this task: luminance differences and differences in highlight properties between areas of different microscale smoothness. While the visual system seems to be highly sensitive in detecting gloss differences based on luminance contrast information, detection thresholds were considerably higher when the judgment was mainly based on differences in highlight features, such as their size, intensity, and sharpness.
Affiliation(s)
- Gunnar Wendt
  - Christian-Albrechts-Universität zu Kiel, Institut für Psychologie, Kiel, Germany
- Franz Faul
  - Christian-Albrechts-Universität zu Kiel, Institut für Psychologie, Kiel, Germany
13
Zhang C, Cheng J, Tian Q. Multi-View Image Classification With Visual, Semantic And View Consistency. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:617-627. [PMID: 31425078 DOI: 10.1109/tip.2019.2934576] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Multi-view visual classification methods have been widely applied to use discriminative information of different views. This strategy has been proven very effective by many researchers. On the one hand, images are often treated independently without fully considering their visual and semantic correlations. On the other hand, view consistency is often ignored. To solve these problems, in this paper, we propose a novel multi-view image classification method with visual, semantic and view consistency (VSVC). For each image, we linearly combine multi-view information for image classification. The combination parameters are determined by considering both the classification loss and the visual, semantic and view consistency. Visual consistency is imposed by ensuring that visually similar images of the same view are predicted to have similar values. For semantic consistency, we impose the locality constraint that nearby images should be predicted to have the same class by multiview combination. View consistency is also used to ensure that similar images have consistent multi-view combination parameters. An alternative optimization strategy is used to learn the combination parameters. To evaluate the effectiveness of VSVC, we perform image classification experiments on several public datasets. The experimental results on these datasets show the effectiveness of the proposed VSVC method.
14
|
|