This section presents experimental results on both synthetic and real-world images under the various settings of the unit norm constraint. We examine the ORIGINAL problem and the INSIDE, BOX, and OPEN relaxations in terms of accuracy and computation time. We also evaluate the effectiveness of the piecewise solution method described in Section 3.5. In addition, we compare these strategies with the original numerical shape-from-shading algorithm proposed by Ikeuchi and Horn [13] (labeled “ITERATIVE” hereafter), a polynomial shape-from-shading method proposed by Ecker and Jepson [7] (labeled “P-SFS”), and a local shape prediction method proposed by Xiong et al. [43] (labeled “XIONG”).
For the ITERATIVE method [13], following the original procedure, we repeat the Newton step on the following problem a few times (set to 5 in this evaluation based on our empirical tests), starting from the initial guess n=(0,0,1)⊤:
$$ {\begin{aligned} & \underset{{\mathbf{N}}}{\text{minimize}} & & \frac{1}{2} ||\mathbf{N} \mathbf{D}||^{2}_{F} \,+\, \lambda_{1} ||\mathbf{l}^{\top} \mathbf{N} \,-\, \mathbf{m}^{\top}||^{2}_{2} + \lambda_{2} ||\mathbf{N} \mathbf{F} - \mathbf{G}||^{2}_{F} \\ & \text{subject to} & & 0 \leq n_{iz}~, \end{aligned}} $$
(10)
and normalize the current estimate of the surface normal so that \(\|\mathbf{n}_{i}\|_{2} = 1\). That is, the method optimizes iteratively without the unit norm constraint and enforces the unit norm by normalization during the iterations. For the P-SFS method, Ecker and Jepson [7] propose an iterative procedure with exact line search, which is inherently non-convex, together with its convex SDP relaxation. Since their method does not require boundary conditions, we align our setting with theirs and compare against their SDP relaxation method. We use Gurobi Optimizer (Footnote 2) as the solver for SDP problems. The XIONG method [43] assumes a quadratic representation of local shape and infers the local shape of each small image patch separately. We use their publicly available implementation (Footnote 3) with its default parameters in our experiments. In all our experiments, the feasibility tolerance for the constraints of LCQP, QCQP, and SDP is set to \(1 \times 10^{-6}\), and the tolerance for the stopping criterion is also set to \(1 \times 10^{-6}\).
4.1 Synthetic scenes
In this section, we present experiments on a synthetic dataset [15]. The dataset consists of ten objects with smooth shapes and provides complete 3D shape data. Other datasets exist for evaluating shape-from-shading and photometric stereo methods [9, 33], but [15] is designed for synthetic evaluation and suits our purpose. We show results for five of the objects, labeled “blob01” to “blob05,” rendered under a directional light source l=(0,0,1)⊤. Figure 2 summarizes the results of the various settings: (a) ORIGINAL setting with the Lagrangian relaxation, (b) INSIDE relaxation, (c) BOX relaxation, (d) OPEN relaxation, (e) PIECEWISE solution method described in Section 3.5, and (f) ITERATIVE method of Eq. (10). For each scene, the top row shows the estimated surface normals, and the bottom row depicts the angular error map and the corresponding mean angular error (MAE). For the ORIGINAL method with Lagrangian relaxation, we carefully picked the weight parameters (λ1,λ2,λ3)=(512,2048,32) by numerical simulation based on the ground truth. The results show that, aside from the ORIGINAL method, the INSIDE relaxation tends to yield more favorable results than the BOX and OPEN relaxations. This trend is inherited by the PIECEWISE method, which applies the INSIDE relaxation in a sequential manner. The ITERATIVE method also shows higher accuracy than the BOX and OPEN relaxations. The ORIGINAL setting shows the highest accuracy in two scenes, but its weight (hyper) parameters of the Lagrangian relaxation were carefully chosen to produce these results.
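As an illustration of the error measure used above, the following is a minimal sketch of how an angular error map and its MAE could be computed. The array layout (one normal per column, shape 3×p), the function names, and the use of degrees are assumptions made for this example rather than details taken from the evaluation code.

```python
import numpy as np

def angular_error_map(n_est, n_gt):
    """Per-pixel angle in degrees between estimated and ground-truth normals.

    Both inputs are assumed to have shape (3, p), one normal per column.
    """
    # Normalize columns so the dot product equals the cosine of the angle.
    n_est = n_est / np.linalg.norm(n_est, axis=0, keepdims=True)
    n_gt = n_gt / np.linalg.norm(n_gt, axis=0, keepdims=True)
    # Clip to guard against round-off pushing the cosine outside [-1, 1].
    cos = np.clip(np.sum(n_est * n_gt, axis=0), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def mean_angular_error(n_est, n_gt):
    """Mean of the per-pixel angular errors (MAE)."""
    return float(np.mean(angular_error_map(n_est, n_gt)))

# Two pixels: one exact estimate, one tilted by 45 degrees.
n_gt = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
n_est = np.array([[0.0, 1.0], [0.0, 0.0], [1.0, 1.0]])
mae = mean_angular_error(n_est, n_gt)
```

Because the inputs are renormalized inside the function, the sketch also accepts estimates from relaxations whose normals are not exactly of unit length.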
Discussion on Lagrangian relaxation for ORIGINAL. The Lagrangian relaxation of the ORIGINAL setting has two obvious issues. One is the non-convexity of the problem, which implies that the solution may depend on the initial guess. The other is that the hyper parameters λ1, λ2, and λ3 of Eq. (8) must be chosen properly to obtain accurate estimates; however, the optimal hyper parameters are generally unknown and scene-dependent.
Figure 3 plots the MAEs obtained by changing the initial guess of the surface normal for the blob01 and blob02 scenes using the Lagrangian relaxation of the ORIGINAL setting. In the figures, the x- and y-axes correspond to the azimuth θ and polar ϕ angles of the initial guess of the surface normal. The MAE varies drastically with small changes in the initial guess, and the variation depends on the scene.
To see the effect of the choice of hyper parameters, we varied the hyper parameters λ1, λ2, and λ3 of Eq. (8) and observed the resulting MAEs. One of the results, for the blob03 scene, is shown in Fig. 4, in which the hyper parameters are set to λ1=λ2=λ3∈{1,10,100,1000,10000}. The MAE varies significantly with the choice of the parameters, which illustrates the difficulty of applying the Lagrangian relaxation of the ORIGINAL problem.
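For concreteness, an initial guess parameterized by azimuth and polar angles can be turned into a unit normal as sketched below. The convention (polar angle measured from the z-axis, angles in radians) and the function name `initial_guess` are assumptions for illustration; the text does not specify them.

```python
import numpy as np

def initial_guess(theta, phi):
    """Unit normal from azimuth theta and polar angle phi, both in radians.

    Assumes phi is measured from the +z axis, so phi = 0 gives (0, 0, 1).
    """
    return np.array([
        np.sin(phi) * np.cos(theta),
        np.sin(phi) * np.sin(theta),
        np.cos(phi),
    ])

# phi = 0 reproduces the frontal guess (0, 0, 1) regardless of theta.
n0 = initial_guess(0.3, 0.0)
n1 = initial_guess(np.pi / 4, np.pi / 6)
```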
Comparison to existing methods. We compared our method with P-SFS and XIONG. While our method requires the boundary conditions to work properly, in order to compare with the P-SFS and XIONG methods that do not require them, we eliminate the boundary condition from the INSIDE relaxation. As a result, there remains a rotation ambiguity in the solution. Therefore, we applied rotation alignment of the estimated normal map for the purpose of comparison. We determine the rotation matrix \(\mathbf {R}\ {\in }\ \mathbb {R}^{3 \times 3}\) by solving the following problem:
$$ \begin{aligned} & \underset{{\mathbf{R}}}{\text{minimize}} & & ||\mathbf{N}^{*} - \mathbf{R} \hat{\mathbf{N}}||^{2}_{F} \\ & \text{subject to} & & \mathbf{R} \mathbf{R}^{\top} = \mathbf{I}, \end{aligned} $$
(11)
where N∗ and \(\hat {\mathbf {N}}\) are the ground truth and estimated normal maps, respectively. This problem is known as the orthogonal Procrustes problem [12], and a solution method is given in [32]. XIONG directly estimates depth rather than surface normals; therefore, to compare it with the other methods in the space of surface normals, we compute the normal map from the estimated depth map. Figure 5 shows a representative result. From left to right, it shows the ground truth normal map, (a) the result of the INSIDE relaxation with boundary conditions, (b) the INSIDE relaxation without boundary conditions, (c) the P-SFS method, and (d) the XIONG method. “(b) - aligned” and “(c) - aligned” are the rotation-aligned results of (b) and (c). While the result of (a) is convincing, (b) and (c) are rather far from the ground truth because the surface normals are not anchored by boundary conditions and thus retain the rotation ambiguity. Compared with (d), (a) also achieves a better estimate.
Speed and accuracy. Figure 6 summarizes the computation times and accuracies of the various methods applied to the blob01–blob05 datasets. The x- and y-axes represent the log-scale processing time and the MAE, respectively. The mean MAEs are plotted as circles, and the minimum and maximum time and accuracy are indicated by the associated bars. The PIECEWISE method significantly reduces the computation time compared to the INSIDE relaxation while retaining the accuracy. The OPEN and BOX relaxations are faster; however, they suffer from inaccuracy due to the loose relaxation. The ORIGINAL method with Lagrangian relaxation shows a good trade-off because we carefully selected a good set of hyper parameters; as discussed earlier, its MAE may vary significantly depending on this selection. The ITERATIVE method is the most efficient among them, while its MAEs were consistently larger than those of the PIECEWISE, INSIDE, and ORIGINAL methods.
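The rotation alignment of Eq. (11) has a well-known closed-form solution via the singular value decomposition. The sketch below is one way to compute it with NumPy; the function name `align_rotation` and the 3×p array layout are assumptions for this example, and it is not claimed to be the implementation used for the reported results.

```python
import numpy as np

def align_rotation(n_gt, n_est):
    """Solve min_R ||n_gt - R n_est||_F^2 subject to R R^T = I.

    n_gt and n_est are assumed to have shape (3, p), one normal per column.
    """
    # 3x3 cross-covariance between ground-truth and estimated normals.
    M = n_gt @ n_est.T
    U, _, Vt = np.linalg.svd(M)
    # The orthogonal matrix maximizing trace(R M^T) is U V^T.
    return U @ Vt

# Synthetic check: recover a known orthogonal transform.
rng = np.random.default_rng(0)
R_true, _ = np.linalg.qr(rng.standard_normal((3, 3)))
n_est = rng.standard_normal((3, 50))
n_gt = R_true @ n_est
R = align_rotation(n_gt, n_est)
n_aligned = R @ n_est
```

Note that the constraint \(\mathbf{R}\mathbf{R}^{\top} = \mathbf{I}\) as written also admits reflections, so this sketch applies no determinant correction; restricting the solution to proper rotations would require an additional sign fix on the last singular vector.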
4.2 Real-world data
Real-world data contains observations that deviate from the assumed image formation model. There are two major factors: non-uniform diffuse albedos and non-Lambertian surface reflectances. Due to these unmodelled errors, the brightness constraint of Eq. (2) and the boundary constraint of Eq. (4) can conflict, leaving no feasible solution. For the real-world data experiment, we therefore relax these hard constraints into soft ones as follows:
INSIDE relaxation:
$$ {\begin{aligned} & \underset{{\mathbf{N}}}{\text{minimize}} & & \frac{1}{2} ||\mathbf{N} \mathbf{D}||^{2}_{F} \,+\,\lambda_{1} ||\mathbf{N}\mathbf{F}\,-\,\mathbf{G}||^{2}_{F} + \lambda_{2} ||\mathbf{l}^{\top} \mathbf{N} - \mathbf{m}^{\top}||^{2}_{2} \\ & \text{subject to} & & ||\mathbf{n}_{i}||^{2}_{2} \leq 1, \quad 0 \leq n_{iz},\quad \forall i \in \{1 \ldots p\}. \end{aligned}} $$
BOX relaxation:
$$ {\begin{aligned} & \underset{{\mathbf{N}}}{\text{minimize}} & & \frac{1}{2} ||\mathbf{N} \mathbf{D}||^{2}_{F} \,+\,\lambda_{1} ||\mathbf{N}\mathbf{F}\,-\,\mathbf{G}||^{2}_{F} + \lambda_{2} ||\mathbf{l}^{\top} \mathbf{N} - \mathbf{m}^{\top}||^{2}_{2} \\ & \text{subject to} & & -\!1 \!\leq\! n_{ix}, \!\!\!\!\quad n_{iy} \!\leq\! 1, \!\!\!\!\quad 0 \!\leq\! n_{iz} \!\leq\! 1, \!\!\!\!\quad \forall i \in \{1, \ldots, p\}. \end{aligned}} $$
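To make the two soft-constraint formulations above concrete, the following sketch evaluates their shared objective and checks membership in the INSIDE and BOX feasible sets. The array shapes (N of size 3×p, D with p rows, F with p rows, G with 3 rows, l of length 3, m of length p) and all function names are assumptions chosen for illustration; the default tolerance mirrors the feasibility tolerance stated earlier.

```python
import numpy as np

def soft_objective(N, D, F, G, l, m, lam1, lam2):
    """Smoothness term plus weighted boundary and brightness penalties."""
    smooth = 0.5 * np.linalg.norm(N @ D, "fro") ** 2
    boundary = lam1 * np.linalg.norm(N @ F - G, "fro") ** 2
    brightness = lam2 * np.linalg.norm(l @ N - m) ** 2
    return smooth + boundary + brightness

def in_inside_set(N, tol=1e-6):
    """INSIDE: every normal lies in the unit ball with n_z >= 0."""
    return bool(np.all(np.sum(N ** 2, axis=0) <= 1 + tol)
                and np.all(N[2] >= -tol))

def in_box_set(N, tol=1e-6):
    """BOX: -1 <= n_x, n_y <= 1 and 0 <= n_z <= 1 for every normal."""
    ok_xy = np.all(np.abs(N[:2]) <= 1 + tol)
    ok_z = np.all((N[2] >= -tol) & (N[2] <= 1 + tol))
    return bool(ok_xy and ok_z)

# A point that BOX accepts but INSIDE rejects (squared norm is 1.63).
N_loose = np.array([[0.9], [0.9], [0.1]])

# A tiny two-pixel problem in which the frontal normals satisfy every term.
N = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
D = np.array([[1.0], [-1.0]])          # difference between the two pixels
F = np.array([[1.0], [0.0]])           # selects the first pixel as boundary
G = np.array([[0.0], [0.0], [1.0]])    # its prescribed boundary normal
l = np.array([0.0, 0.0, 1.0])
m = np.array([1.0, 1.0])
value = soft_objective(N, D, F, G, l, m, lam1=100.0, lam2=100.0)
```

The `N_loose` example shows why the BOX relaxation is looser: the box contains points well outside the unit ball, so a BOX solution can drift further from a valid unit normal than an INSIDE solution can.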
The results are summarized in Fig. 7. In the figure, the “cat” data is from the DiLiGenT dataset [33], in which the ground truth is measured by a laser sensor. We picked “cat” from DiLiGenT because it is the most Lambertian-like object. For the other data, we obtained the ground truth by conventional least-squares photometric stereo [39] using 16 light sources. In addition to “cat,” we selected three objects with diffuse surfaces, “wall-paper,” “coin,” and “logo,” for four objects in total. From left to right, the figure shows the estimated surface normals and angular error maps of the (a) ORIGINAL with Lagrangian relaxation, (b) INSIDE, (c) BOX, (d) OPEN, (e) PIECEWISE, and (f) ITERATIVE methods. Although the surface details are smoothed out due to the smoothness constraint, the overall structures are better recovered by (b), which properly accounts for the unit norm constraint with a tight relaxation, than by (a) and (f). The PIECEWISE method in (e) yields lower accuracy in this case but still produces results closer to the ground truth than (a) and (f).
Discussion on Lagrangian relaxation for the real-world data. We examine Lagrangian relaxations of the INSIDE, BOX, and OPEN methods using the real-world data to assess their capability of handling unmodelled errors. The formulations are all convex problems; therefore, the solution does not depend on the initial guess. Here, we discuss the effect of the choice of hyper parameters λ1 and λ2.
We alter the hyper parameters λ1 and λ2 and observe the resulting mean angular errors (MAEs). The results using the “cat,” “wall-paper,” “coin,” and “logo” scenes are summarized in Fig. 8, in which the hyper parameters are set to λ1=λ2=λ∈{1,100,10000}.
While this result shows that the choice of hyper parameters has little effect on the overall MAEs, it still affects the surface normal estimates locally. For example, errors near the ear and forefoot of “cat” decrease with large hyper parameters in Fig. 9. Because the areas around the ear and forefoot are not smooth, the surface normals there are estimated more accurately by emphasizing the brightness and occluding boundary constraints over the smoothness constraint.