PaperHub
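Rating: 5.7/10
Poster · 3 reviewers
Lowest: 4 · Highest: 7 · Standard deviation: 1.2
Individual ratings: 7, 4, 6
Confidence: 4.0
Correctness: 3.0 · Contribution: 2.3 · Presentation: 2.7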
NeurIPS 2024

SplitNeRF: Split Sum Approximation Neural Field for Joint Geometry, Illumination, and Material Estimation

Submitted: 2024-04-24 · Updated: 2024-11-06

Abstract

We present a novel approach for digitizing real-world objects by estimating their geometry, material properties, and environmental lighting from a set of posed images with fixed lighting. Our method incorporates into Neural Radiance Field (NeRF) pipelines the split sum approximation used with image-based lighting for real-time physically based rendering. We propose modeling the scene's lighting with a single scene-specific MLP representing pre-integrated image-based lighting at arbitrary resolutions. We accurately model pre-integrated lighting by exploiting a novel regularizer based on efficient Monte Carlo sampling. Additionally, we propose a new method of supervising self-occlusion predictions by exploiting a similar regularizer based on Monte Carlo sampling. Experimental results demonstrate the efficiency and effectiveness of our approach in estimating scene geometry, material properties, and lighting. Our method attains state-of-the-art relighting quality after only ${\sim}1$ hour of training on a single NVIDIA A100 GPU.
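
For readers unfamiliar with the term, the split sum approximation mentioned in the abstract is commonly written as below. This is the standard form from real-time image-based lighting, not an equation reproduced from the paper. The notation is assumed here: $L_i$ is incoming radiance, $f$ the BRDF, $p$ the importance-sampling density, $\mathbf{l}_k$ the sampled light directions, $\mathbf{v}$ the view direction, and $\theta_k$ the angle between $\mathbf{l}_k$ and the surface normal.

```latex
\frac{1}{N}\sum_{k=1}^{N}
  \frac{L_i(\mathbf{l}_k)\, f(\mathbf{l}_k, \mathbf{v})\, \cos\theta_k}{p(\mathbf{l}_k, \mathbf{v})}
\;\approx\;
\left( \frac{1}{N}\sum_{k=1}^{N} L_i(\mathbf{l}_k) \right)
\left( \frac{1}{N}\sum_{k=1}^{N}
  \frac{f(\mathbf{l}_k, \mathbf{v})\, \cos\theta_k}{p(\mathbf{l}_k, \mathbf{v})} \right)
```

The first factor depends only on the lighting and on the width of the sampling lobe (that is, on roughness); it corresponds to the pre-integrated lighting that the abstract proposes to represent with an MLP. The second factor depends only on the material and is typically precomputed.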
Keywords
inverse rendering, neural rendering, physically-based

Reviews and Discussion

Review
Rating: 7

This paper tackles the problem of inverse rendering, which reconstructs geometry, material, and environmental lighting from a set of posed images with fixed lighting. It makes two contributions: 1) representing pre-integrated illumination as a single MLP, and 2) approximating self-occlusion on pre-integrated lighting and using it to supervise an occlusion MLP that disentangles shadows from materials.

The proposed method achieves state-of-the-art results while being very fast to train (less than one hour).

Strengths

  1. The idea of representing pre-integrated illumination as an MLP is interesting. The regularization is particularly novel and clever. I really like how the purely specular MLP output (roughness = 0) is used to regularize the training of the illumination MLP (Eq. 5).

  2. The proposed occlusion factor estimation method is also very interesting and novel. Instead of ambient occlusion, the paper uses a simple trick to factor out occlusion as an independent scalar on the light integral. This scalar can be computed with MC integration and is distilled into a neural network for efficient inference.

  3. The quantitative results show that the improvement over SOTA is significant, especially on relatively diffuse scenes (NeRFactor dataset).
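
The two ideas praised in points 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the paper's Eq. 5 or its occlusion loss, neither of which is reproduced on this page. It assumes that (a) the regularizer ties the illumination MLP's output at roughness r to a Monte Carlo average of its own roughness-0 output over a GGX lobe, and (b) the occlusion scalar is the ratio of occluded to unoccluded pre-integrated light. The names `pre_mlp`, `radiance_fn`, and `visibility_fn` are hypothetical stand-ins for the learned networks and a visibility query.

```python
import numpy as np

def sample_ggx_lobe(direction, roughness, n_samples, rng):
    """Sample light directions from a GGX lobe around `direction`.

    Assumes normal = view = `direction`, as in standard environment
    prefiltering. Returns directions (N, 3) and cosine weights (N,).
    """
    a = roughness ** 2
    u1, u2 = rng.random(n_samples), rng.random(n_samples)
    phi = 2.0 * np.pi * u1
    cos_t = np.sqrt((1.0 - u2) / (1.0 + (a * a - 1.0) * u2))
    sin_t = np.sqrt(np.clip(1.0 - cos_t ** 2, 0.0, 1.0))
    # Half vectors in the local tangent frame.
    h_local = np.stack([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t], axis=-1)
    # Orthonormal frame (t, b, n) around the lobe axis.
    n = direction / np.linalg.norm(direction)
    up = np.array([0.0, 0.0, 1.0]) if abs(n[2]) < 0.999 else np.array([1.0, 0.0, 0.0])
    t = np.cross(up, n)
    t /= np.linalg.norm(t)
    b = np.cross(n, t)
    h = h_local[:, :1] * t + h_local[:, 1:2] * b + h_local[:, 2:3] * n
    # Reflect the view direction (= n) about each half vector.
    l = 2.0 * (h @ n)[:, None] * h - n
    w = np.clip(l @ n, 0.0, None)
    return l, w

def preintegrated_radiance(radiance_fn, direction, roughness, n_samples=256, seed=0):
    """Cosine-weighted Monte Carlo average of radiance over the lobe."""
    rng = np.random.default_rng(seed)
    l, w = sample_ggx_lobe(np.asarray(direction, float), roughness, n_samples, rng)
    return (radiance_fn(l) * w[:, None]).sum(axis=0) / max(w.sum(), 1e-8)

def regularizer_loss(pre_mlp, direction, roughness):
    """Assumed form: MLP(d, r) should match the MC average of MLP(., 0)."""
    target = preintegrated_radiance(lambda l: pre_mlp(l, 0.0), direction, roughness)
    pred = pre_mlp(np.asarray(direction, float)[None], roughness)[0]
    return float(np.mean((pred - target) ** 2))

def occlusion_factor(radiance_fn, visibility_fn, direction, roughness,
                     n_samples=256, seed=0):
    """Assumed form: occluded / unoccluded pre-integrated light (1 = fully visible)."""
    rng = np.random.default_rng(seed)
    l, w = sample_ggx_lobe(np.asarray(direction, float), roughness, n_samples, rng)
    lum = radiance_fn(l).mean(axis=-1) * w  # scalar brightness proxy per sample
    return float((lum * visibility_fn(l)).sum() / max(lum.sum(), 1e-8))
```

As a sanity check on the sketch, a constant environment such as `pre_mlp = lambda d, r: np.full((len(d), 3), 0.5)` is already consistent across roughness levels, so `regularizer_loss` is numerically zero for it. In a training loop, the scalar returned by `occlusion_factor` would serve as the target for the occlusion network mentioned in point 2.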

Weaknesses

  1. More datasets could be tested, e.g. the TensoIR dataset.

Questions

  1. The paper mentions that the relighting results were obtained via Blender's PBR shader. Does the shader use global illumination? If so, it would be unfair to some of the baselines (TensoIR, NMF), as they can only render direct illumination. A fairer comparison would be to run another experiment in which no method uses global illumination.

Limitations

The limitations and potential negative societal impact are discussed properly.

Author Response

The shader uses global (indirect) illumination. We decided to evaluate using Blender's PBR shader, as done in previous works, since our aim is to generate relightable meshes for use in existing rendering pipelines. Evaluating the rendering quality from one such pipeline is the most direct way of benchmarking that goal. This is especially important in the context of radiance fields, since the volumetric formulation, supervised mostly through rendering, allows artifacts in 3D to be hidden in renders. For example, material properties could be distributed along rays near the surface of objects, leading to good-quality renders but poor-quality 3D representations. Both TensoIR and NMF take indirect illumination into account, although they restrict its calculation to two ray bounces.

Review
Rating: 4

This paper introduces a method for digitizing real-world objects by estimating their geometry, material properties, and environmental lighting from posed images under fixed lighting. Integrating image-based lighting techniques into Neural Radiance Field (NeRF) pipelines, the method uses a scene-specific MLP for pre-integrated lighting at arbitrary resolutions. A Monte Carlo sampling-based regularizer ensures accurate lighting representation and self-occlusion predictions.

Strengths

The paper proposes a method to approximate the effect of self-occlusion in order to improve material estimation.

Weaknesses

  • The proposed method, particularly in terms of material and lighting representation, is trivial.
  • The decomposed materials in the qualitative results are incorrect. The albedo clearly has specular highlights and shadows baked in, and the metalness and roughness do not match the ground-truth materials.
  • The occlusion introduced in the paper to improve material estimation is not robust enough to validate its contribution.

Questions

In the discussion of occlusion loss, why is only the albedo prediction addressed? The paper claims that the occlusion is designed to improve the overall material estimation.

Limitations

The proposed method decomposes lighting and material using surface reflection. However, real-world objects often have multiple material layers, such as the car showcased in the qualitative results. The clear coat reflects stronger specular light, while the lower paint layer includes both specular and diffuse reflections. The proposed model does not account for these complexities.

Author Response

We only address the albedo prediction since albedo is the only material property that is shared across most commonly used BRDF models. The synthetic datasets we rely on come from 3D models that were hand-designed using a variety of complex BRDF models, with properties that can't be directly translated into our model's "metalness" and "roughness". Because of this, there is no ground-truth "metalness" or "roughness" to evaluate against. Qualitatively, the effects of the occlusion loss are easier to visualize on albedo because it is colored, which makes it more intuitive to evaluate.

Review
Rating: 6

The paper introduces SplitNeRF, a method that integrates the split sum approximation into Neural Radiance Field (NeRF) pipelines. This approach optimizes object geometry, material properties, and environmental lighting. The method employs a Multi-Layer Perceptron (MLP) to model scene-specific pre-integrated image-based lighting at arbitrary resolutions. The model is regularized using efficient Monte Carlo sampling to accurately capture pre-integrated lighting and self-occlusion effects. Experimental results indicate that SplitNeRF achieves state-of-the-art relighting quality with only about an hour of training on a single NVIDIA A100 GPU.

Strengths

  1. The method achieves high-quality relighting results after only about an hour of training on a single NVIDIA A100 GPU, which is highly efficient compared to existing methods.
  2. By using Monte Carlo sampling for regularization, the method accurately models pre-integrated lighting and self-occlusion, leading to high-fidelity reconstruction of scene geometry and material properties. The integration of split sum approximation into NeRF pipelines is novel and effective, allowing the method to disentangle environmental lighting from material properties.
  3. The method demonstrates competitive performance on both synthetic and real datasets, showing its state-of-the-art performance in material decomposition and relighting.

Weaknesses

I do not have any major concerns about this paper: the technical claims are sound and the paper is overall well presented. Several minor points:

  1. Adding material editing visualizations in the paper could further strengthen the results and inspire downstream applications.
  2. The estimated normals appear to contain significant errors in some scenes, e.g. lego and coffee. Can you provide more insight and analysis into this?

Questions

See weaknesses 1, 2.

Limitations

Limitations discussed in the conclusions section.

Author Response

Estimating geometry together with materials and illumination is a very complex and unconstrained problem. Because of this, the optimization can sometimes get stuck in local minima. We believe that is happening for those scenes. For example, it is possible to model reflections via small variations in geometry rather than higher frequency changes in the illumination. We hypothesize that is happening in the 'coffee' scene.

Comment

Thanks for the response. However, I do not think the authors addressed the questions in my initial review, and I agree with reviewer h9Hn that there could be flaws in the occlusion handling. I'm leaning towards acceptance while acknowledging that the paper could use some further improvements.

Final Decision

This paper received three mixed reviews -- one 4 (borderline reject), one 6 (weak accept), and one 7 (accept).

There was general appreciation for the proposed idea of representing the scene's illumination map using an MLP (and for the associated regularization and optimization techniques used for reconstruction), as well as for the quality of the results, including improvements over current state-of-the-art methods.

Some concerns were raised about the effectiveness of the proposed method in separating the albedo components correctly, which the authors attempted to address in the rebuttal.

Opinion remained split post-rebuttal. However, the paper tackles the important problem of reconstructing a scene's geometry, material properties, and environmental lighting, and it demonstrates reasonable results with novel insights and mathematical techniques. On balance, an accept decision was reached.