DGH: Dynamic Gaussian Hair
Abstract
Reviews and Discussion
This paper proposes a framework to jointly simulate and render 3D hair models. For the simulation part, this work proposes a coarse-to-fine strategy to separately learn the primary and secondary hair dynamics in a data-driven manner, and for the rendering part, this work extends strand-based 3DGS by introducing additional networks to model time-varying appearance under the dynamic scenarios. Experiments show promising results for both the hair simulation and rendering compared to the baselines.
Strengths and Weaknesses
Strengths
[Improvements over the previous version]
I reviewed an earlier version of this work submitted to a different venue. Compared to the previous version, the current manuscript incorporated several modifications I’m glad to see and want to highlight here:
- The demonstration of some complex hairstyles such as curly hair and ponytails, which helps show the generalizability of the work to hairstyles that are closer to real life.
- A more detailed analysis and discussion of the run-time performance. Though the reported performance is still suboptimal (which is understandable), I’m glad to see the authors tone down their previous claim of building a “fast and lightweight framework” and instead focus on the joint learning of hair dynamics and appearance.
- More technical details that help address many of my previous concerns. Although the code for this work does not appear to be released, I think the paper contains enough information to reproduce it.
Weaknesses
See my questions listed below.
Questions
[Dynamic hair model vs. hair simulator]
I’m glad to see the authors have included a discussion of prior work on hair simulation. However, I still have some concerns which I will list below:
- Regardless of performance, one claim the authors made on those simulation methods is the need for iterative tuning of material parameters to achieve stable and realistic hair simulation. I agree with it, but I don’t think the proposed dynamic hair model can handle hairstyles with different material properties (e.g., stiffness) either, especially given the absence of any explicit conditioning on those material properties. From this perspective, I would argue that existing hair simulation models are still more generalizable given their incorporation of physics laws.
- Second, the paper critiques prior real-time simulation approaches for being limited to "simple or quasi-static deformations." However, I question whether the proposed framework is itself capable of capturing realistic and energetic deformations. From Figure 2 in the supplemental, I initially thought the included head motions might come from mocap data, but it looks like they are only hand-crafted 3D rotations. While the supplemental video includes some “increased velocity” examples, the motions still appear quite basic and not representative of realistic, high-energy movements. As such, the current evaluation does not convincingly demonstrate the robustness of the proposed dynamic hair model under realistic motion conditions. Including experiments driven by mocap data would provide a much stronger test of the model’s capabilities.
- Finally, I would argue that existing hair simulators (both physics-based and neural-based) should be capable of handling simulations under these relatively subtle head movements, so I still question the necessity of the proposed dynamic hair model.
While I still hold these concerns regarding the dynamic hair model, considering the improvements made during this round of submission, I would lean toward acceptance of this work but still encourage the authors to consider and address these physics-related concerns listed above.
Limitations
yes
Final Justification
This paper proposes a neural simulation and rendering framework for hair dynamics on synthetic data. Though there is still room for improvement, considering the contribution and progress made since the last round I reviewed this paper, I acknowledge the authors' efforts and lean toward acceptance for this submission.
Formatting Issues
n/a
Thank you for the valuable feedback! We also appreciate your positive view of the improvements in our learning-based framework. We will do our best to address your concerns as detailed below.
- Material property sensitivity:
We agree that traditional physics-based simulation models benefit from explicitly defined physical parameters (e.g., stiffness), which allow direct control across material types. In contrast, our approach formulates dynamic hair modeling as a data-driven task within the 3DGS representation. We illustrate the difference with two simplified formulations:
- Physics-based formulation:
This is a simplified spring-damper system formulation:
$$m\ddot{x} = -k(x - \bar{x}) - c\dot{x} + f_{\text{ext}},$$
where the left side indicates “how much the hair wants to move” and the right side states “what is making it move.” Here $m$ is the mass of the hair segment, $x$ its current position, $\bar{x}$ the rest position, $k$ the stiffness, $c$ the damping coefficient, and $f_{\text{ext}}$ represents external forces such as gravity. The movement of each hair segment is controlled by these parameters.
- Data-driven formulation:
Here we show a simplified high-level formulation of the data-driven model. We define a neural network $f_{\theta}$ that models hair dynamics, where $\theta$ denotes the learnable parameters. Given the current hair position $x_t$, a motion flow vector/velocity $v_t$, and a style feature $z$ (e.g., a hair volume latent code), the network predicts a motion update:
$$\Delta x_t = f_{\theta}(x_t, v_t, z).$$
The next position is obtained via residual addition: $x_{t+1} = x_t + \Delta x_t$.
Therefore, rather than manually tuning physical parameters, our model learns motion priors from a large, diverse dataset (see Supple. A). While material properties (e.g., stiffness, damping) are not explicitly modeled, style features such as the hair latent code serve as implicit priors that guide the learned dynamics in our data-driven framework. This enables robust generalization across unseen motion patterns without per-hairstyle parameter tuning.
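To make the contrast concrete, here is a minimal, illustrative sketch of the two updates in PyTorch; the network name (HairDynamicsNet), layer sizes, and style-code dimension are placeholder assumptions rather than the paper's actual architecture:

```python
import torch
import torch.nn as nn

def spring_damper_step(x, x_rest, v, dt, m=1.0, k=50.0, c=0.5):
    """One explicit-Euler step of m*x'' = -k(x - x_rest) - c*x' + f_ext (gravity)."""
    f_ext = torch.tensor([0.0, -9.8, 0.0]).expand_as(x)   # external force per point
    a = (-k * (x - x_rest) - c * v + f_ext) / m           # motion driven by material params
    v_next = v + dt * a
    return x + dt * v_next, v_next

class HairDynamicsNet(nn.Module):
    """f_theta(x_t, v_t, z) -> delta_x_t; next position is x_t + delta_x_t."""
    def __init__(self, style_dim=32, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 3 + style_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x_t, v_t, z):
        z = z.expand(x_t.shape[0], -1)                      # share the style code across points
        delta = self.mlp(torch.cat([x_t, v_t, z], dim=-1))  # predicted motion update
        return x_t + delta                                  # residual addition

# Usage: 1000 hair points, one hair-volume latent code
x_t, v_t, z = torch.randn(1000, 3), torch.randn(1000, 3), torch.randn(1, 32)
x_next = HairDynamicsNet()(x_t, v_t, z)
```

The key difference: the spring-damper step exposes material knobs (k, c) that must be tuned per hairstyle, whereas the learned update absorbs that behavior into the network weights and the style feature.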
- Realistic motion coverage and evaluation:
We agree that realistic and high-energy motion is essential for evaluating dynamic hair behavior. While the video highlights select cases (e.g., increased velocity), we are adding more diverse high-energy examples.
Hence, to further evaluate our model, we ran evaluations on the public motion-capture dataset Mixamo. We selected the “Nervously Look Around” sequence, which contains relatively high-energy head motions. For testing, we extracted the head rotation and upper-body mesh and used them as inputs.
We applied this motion to 3 hairstyles: Blowout, Ponytail, and Curly, to assess generalization to unseen, high-energy inputs.
Visually, our model generates smooth and coherent dynamics by sharing volumetric priors across neighboring strands, resulting in unified, directional hair motion even under energetic inputs. Due to NeurIPS policy, we cannot include visual results via a link at this stage. Instead, we report the L2 error metrics below, computed against physics-based XPBD ground truth over a 100-frame subsequence, and will include qualitative comparisons in the final revision.
| Hairstyle | Ours (L2 Error ↓) | Rigid (L2 Error ↓) |
|---|---|---|
| Blowout | 0.1307 | 0.2976 |
| Ponytail | 0.0965 | 0.2547 |
| Curly | 0.1158 | 0.2011 |
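For reference, a minimal sketch of how such an L2 error can be computed, assuming per-frame, per-point correspondences between our prediction and the XPBD ground truth; the exact normalization behind the numbers above may differ:

```python
import numpy as np

def mean_l2_error(pred_seq, gt_seq):
    """pred_seq, gt_seq: (F, N, 3) hair point trajectories over F frames.
    Returns the per-point L2 distance averaged over all points and frames."""
    return np.linalg.norm(pred_seq - gt_seq, axis=-1).mean()

# Usage on a toy 100-frame subsequence with 1000 strand points
pred = np.random.rand(100, 1000, 3)
gt = np.random.rand(100, 1000, 3)
print(mean_l2_error(pred, gt))
```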
- The necessity of the proposed dynamic hair model:
The motivation for designing a learning-based dynamic hair model (Stage I) is its integration with the 3DGS representation: it enables dynamic Gaussian hair as a separate layer that merges seamlessly with multi-layer Gaussian head avatars (Main L16–18). Unlike prior works [1,2] that focus on static hair or lack realistic dynamics [3], our volumetric formulation allows plausible hair deformation prediction from a dense point cloud or a pre-trained 3DGS, without requiring explicit body meshes.
- DGH pipeline (Ours):
Pre-trained head GS avatar
→ Re-animation
→ DGH predicts dynamic Gaussian hair deformation (from half-body GS / point cloud)
→ Merge with head GS
→ Final avatar with dynamic hair
- Physics-based pipeline:
Pre-trained head GS avatar
→ Re-animation
→ Convert to explicit mesh
→ Run physics-based hair simulation
→ Render with 3D engine and shaders to obtain hair appearance
→ Convert rendered hair to 3DGS representation
→ Merge
→ Final avatar
Unlike physics-based simulators, which rely on explicit meshes and require additional rendering and conversion steps, our DGH framework only needs head rotation and a dense point cloud/pre-trained GS avatar. By converting static hair and upper-body into volume, we enable mesh-free deformation prediction, making our model more compatible with Gaussian-based avatars and easier to integrate into learning-based re-animation pipelines without additional rigging or simulation overhead.
To date, neural-based simulators are mostly quasi-static [4] and do not model dynamics (e.g., damping effects, acceleration) or upper-body (shoulder) intersections realistically [5]. We also view our learning-based model as a complementary alternative to traditional simulators: rather than relying on explicit material parameters, it learns hair motion priors from data. We believe that, with more hairstyle data, this framework holds potential for a more generalizable and comprehensive solution to dynamic hair modeling. We also want to highlight that DGH enables photorealistic rendering of dynamic hair (using 3DGS) at lower latency than traditional pipelines, which require a 3D engine running sophisticated hair shaders on top of physics-based simulation. We have new results showing our dynamic hair integrated with real human Gaussian avatars, which we will include in the revision.
We thank you again for the thoughtful suggestions, especially Q1, which provided valuable insights and inspired us to explore future extensions. As suggested, we are interested in exploring lightweight physics-inspired constraints to further enhance realism in future work.
[1] Luo, Haimin, et al. "Gaussianhair: Hair modeling and rendering with light-aware gaussians." arXiv preprint arXiv:2402.10483 (2024).
[2] Zakharov, Egor, et al. "Human hair reconstruction with strand-aligned 3d gaussians." ECCV 2024.
[3] Qian, Shenhan, et al. "Gaussianavatars: Photorealistic head avatars with rigged 3d gaussians." CVPR 2024.
[4] Stuyck, Tuur, et al. "Quaffure: Real-Time Quasi-Static Neural Hair Simulation." CVPR 2025.
[5] Wang, Ziyan, et al. "NeuWigs: A Neural Dynamic Model for Volumetric Hair Capture and Animation." CVPR 2023.
Thanks for your rebuttal!
Q1: I appreciate the authors' response and clarification. However, to elaborate on my initial point: physics-based models offer explicit control over material properties, enabling the same model to simulate a wide range of hairstyles by adjusting parameters like stiffness or damping. In contrast, the proposed method trains separate hair deformation models for each hairstyle (L24–25 in supplemental), which suggests that generalization is limited to different poses rather than different hairstyles. Moreover, I remain unconvinced that material properties can be implicitly encoded in the proposed feature grids, given that they are only learned from geometric SDF volumes. Hair physical properties are largely orthogonal to its geometry; for example, hairstyles with similar shapes can be either soft or stiff. While this is a relatively minor concern, I encourage the authors to provide more discussion and clarification in the revised version.
Q3: I like the motivation and given examples. I do think they should be incorporated in the introduction, rather than some short sentences in the abstract. Including examples with real human Gaussian avatars would also strengthen the practical value of this work.
Thank you for getting back and for the thoughtful feedback and suggestions!
For Q1: We appreciate the concern regarding explicit material control in physics-based models and will include further discussion and ablation results (comparing models with and without explicit material conditioning) in the revision.
As noted in our Limitations (L158–160), our current approach focuses on learning hairstyle-specific dynamics and generalizing to arbitrary motions. This choice is driven by the limited diversity of available hairstyle data (Supple. Fig. 1). In hairstyle-specific training, the model implicitly learns material properties through data-driven patterns; for example, curly hair often exhibits spring-like recoil and dense oscillations, which the model learns to associate with the underlying geometry. Since data-driven learning captures statistical regularities, seeing enough training motion data per hairstyle allows the model to capture both the shape and the effective material behavior.
That said, we agree that explicit material conditioning could enhance generalization across hairstyles (especially if we can access a more diverse 3D hairstyle dataset), and integrating such conditioning is a promising direction. At this stage, we aim to demonstrate the effectiveness of our dynamic 3DGS representation in modeling realistic hair dynamics for head avatar applications. We will clarify this point and outline it as future work in the revision.
For Q3: Thank you for your positive feedback on our motivation. As suggested, we will incorporate it with more detail in the introduction and add example figures with real head GS avatars.
Thanks for the response! I have no further questions.
This paper presents a method, DGH, for simulation-free hair reconstruction, rendering, and animation. Stretched 3D Gaussian splats are concatenated to approximate hair strands to model the geometry and appearance of the hair. To capture the deformation and dynamics of the hair, a 3D CNN is trained on a synthetic dataset with different grooms to predict coarse deformation of splats conditioned on the head pose and body mesh, followed by an MLP to refine the dynamics based on previous states. For better appearance fidelity, another MLP is learned on multi-view data to predict the appearance properties of 3D Gaussians. The authors conduct thorough experiments in terms of deformation reconstruction and appearance modeling, showing clear superiority of DGH.
Strengths and Weaknesses
Strengths
- Making 3DGS-based hair representation animatable is a challenging and meaningful task. DGH is a concise and effective method that fulfills this goal.
- It is a reasonable experiment design to evaluate deformation and appearance separately. The authors animate 3DGS and GaussianHairCut with the same learned deformation model, showing the effectiveness of the appearance modeling components.
- Figure 5 is a great plot to show the quality of learned hair dynamics.
- The process of data generation and SDF computation is reported in detail in the supplementary document, making this method more reproducible.
- The comparison of FPS and visual quality with different numbers of strands is very informative, showing the value of this method in real-world applications.
Weaknesses
- The naming of the "coarse stage", "fine stage", and "the second stage" (lines 70 and 141) is a bit confusing. A reader may confuse "the fine stage" with "the second stage" without realizing that they are different things.
- Curvature-based Gaussian blending does not make much sense to me. In reality, the visibility of a segment does not change by its local curvature. Also, the visual difference isn't obvious in Figure 7 and the video, although PSNR is improved with it.
Questions
- How robust is this method to novel head poses and movements, such as a sudden downward movement of the entire head and body?
Limitations
yes
Final Justification
My major confusion on naming and the curvature-aware blending components is clarified by the authors. Therefore, I am glad to still vote for acceptance.
Formatting Issues
N/A
Thank you for the thoughtful feedback! We address your concerns in detail below and will make our best effort to incorporate the suggestions:
- Stage naming ambiguity:
Thank you for pointing out the naming ambiguity. This will be clarified. As shown in Fig. 2, our framework consists of two main stages:
- Stage I: Coarse-to-fine dynamic hair modeling
- Stage II: Appearance optimization
Within Stage I, we adopt a coarse-to-fine design:
- The coarse stage learns hair deformation from static hair.
- The fine stage refines secondary dynamics (e.g., inertia, damping).
Therefore, the terms "fine stage" and "second stage" refer to different modules.
To improve clarity, we provide the following table:
| Module | Function |
|---|---|
| Stage I (coarse stage) | Learns hair deformation given head motions |
| Stage I (fine stage) | Adds secondary motion refinement |
| Stage II | Optimizes dynamic Gaussian hair appearance |
As this caused confusion, we will rename Stages I and II as the Hair Dynamics Model and the Appearance Model in the revision.
- Curvature-based Gaussian blending:
Thank you for the insightful question. As stated in our hypothesis (L218–220), each hair strand is composed of discrete segments, which can lead to appearance discontinuities, particularly in high-curvature regions where the tangent direction changes abruptly.
While curvature does not directly impact visibility (which is related to camera viewpoint and depth ordering), it significantly affects the continuity of shading and appearance across segments.
Unlike continuous surfaces, Gaussian hair treats each segment as an independent shading unit, each with its own tangent vector $t_i$. In high-curvature regions, if the angle between $t_i$ and $t_{i+1}$ becomes large, it leads to discontinuities in shading intensity. Ideally, this could be mitigated by increasing segment density, but since we use a fixed number of segments per strand for hair tracking purposes, an alternative is to apply curvature-aware blending to smooth the visual transitions.
Here we give a simple hair diffuse shading formulation: $I_i = \sqrt{1 - (t_i \cdot l)^2}$, where $I_i$ is the shading intensity of the $i$-th hair segment, $t_i$ is its tangent direction, and $l$ is the light direction. The shading discontinuity between adjacent Gaussians is $\Delta I_i = |I_{i+1} - I_i|$, and this value increases with curvature. Our curvature-based blending design reduces this effect by weighting contributions from adjacent segments according to their angular deviation, ensuring a smoother appearance without increasing segment density. Without blending, high-curvature areas may exhibit visual artifacts such as shading discontinuities, as shown in Fig. 7 (with zoom-in) and the video demo (02:57–03:09; quality may be reduced due to video compression).
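To illustrate the argument, here is a minimal PyTorch sketch, assuming a Kajiya-Kay-style diffuse term and a simple neighbor-blending weight as a stand-in for the paper's curvature-based blending; the function names and weighting scheme are illustrative, not the exact implementation:

```python
import torch
import torch.nn.functional as F

def diffuse_intensity(tangents, light):
    """Kajiya-Kay-style diffuse term: I_i = sqrt(1 - (t_i . l)^2)."""
    t = F.normalize(tangents, dim=-1)
    return torch.sqrt(torch.clamp(1.0 - (t * light).sum(-1) ** 2, min=0.0))

def curvature_blend(tangents, light):
    """Blend each segment's shading toward its neighbor's, more strongly where
    the angle between adjacent tangents (a curvature proxy) is large."""
    I = diffuse_intensity(tangents, light)                 # (S,) per-segment intensity
    t = F.normalize(tangents, dim=-1)
    cos_angle = (t[:-1] * t[1:]).sum(-1).clamp(-1.0, 1.0)  # adjacent-tangent alignment
    w = 0.5 * (1.0 - cos_angle)                            # 0 when aligned, 1 when opposed
    I_blend = I.clone()
    I_blend[:-1] = (1.0 - 0.5 * w) * I[:-1] + 0.5 * w * I[1:]
    return I_blend

# Usage: one strand of 16 segments under a fixed light direction
tangents = torch.randn(16, 3)
light = F.normalize(torch.randn(3), dim=-1)
print(curvature_blend(tangents, light))
```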
We will include additional visual comparisons in the revision to further illustrate this contribution.
- Robustness to unseen motion patterns (e.g., sudden downward movements):
Thanks to our coarse-to-fine dynamic hair modeling framework (Fig. 2 Framework Overview) and training on large-scale motion sequences (Supple. A), our model is robust to unseen head motions, including sudden downward movements. As shown in the demo video (02:34–02:48), under a sudden head drop and turn, the coarse stage captures the primary deformation, while the fine stage models secondary motion (e.g., damping effects) and preserves realistic hair dynamics. Due to NeurIPS policy, we cannot include additional PDFs for visual results at this stage. As suggested, we will add more qualitative results, including re-animation results with dynamic hair merged into real human avatars, in the final version to further illustrate this.
I appreciate the responses from the authors regarding naming confusion, curvature-based blending, and robustness to unseen motion. I am looking forward to the mentioned additional visual results in the next version.
The paper introduces a novel data-driven framework for photorealistic dynamic hair modeling and rendering, rather than physics-based simulation. The author creates a synthetic dynamic hair dataset and utilizes it for training. To model dynamic hair, a coarse-to-fine strategy is first proposed. The coarse stage yields (head) pose-driven deformation rather than rigid transformation based on point cloud and volume deformation, and the fine stage utilizes cross-attention on temporal feature sequence to enable temporally consistent hair animation. Then, the author leverages 3DGS for photorealistic modeling and rendering, together with a splats adjustment network to align the hair appearance with dynamic motion.
Strengths and Weaknesses
Strengths
- The proposed pipeline is a good contribution to dynamic hair modeling, offering a scalable, data-driven alternative to physics-based methods.
- The overall idea of hair dynamic modeling and photorealistic appearance optimization is straightforward and technically sound.
- The framework is claimed to be able to integrate with 3D gaussian avatar, which increases its application potential.
Weaknesses
- Some presentations and descriptions are unclear and unsatisfactory:
(a) From line 161 to line 163, and in equation (1), no clear formulas are given for the point loss and the SDF loss terms.
(b) Line 205, what is the definition of the tangent vector $t$? Is it given in equation (7)? If so, the order of explanation is a bit confusing.
(c) In section 3.2, Hypothesis and Curvature-based Gaussian blending parts would be better placed before Line 214.
- What is the training strategy? Is the model trained end-to-end? I cannot find a detailed elaboration.
- Simulated training data may not fully capture real-world hair complexity (e.g., lighting variations, fine-scale interactions). Results are validated only on synthetic sequences; real-world performance is untested.
Questions
See Weaknesses
Limitations
Yes.
Final Justification
I'd like to maintain my initial rating for this paper. Most of my concerns have been adequately addressed and the presented experimental results are convincing. Both the dataset and the proposed method can serve as solid platforms for followers to improve upon.
Formatting Issues
None.
Thank you for the thoughtful feedback! We address your concerns below and will do our best to incorporate the suggestions (equation clarity, section ordering, etc.):
- Equation clarity:
(a) From L161–163, and in Eq. (1), no clear formulas for the point loss and the SDF loss.
Thank you for noting the missing explicit definitions of $\mathcal{L}_{\text{point}}$ and $\mathcal{L}_{\text{sdf}}$ in Eq. (1). While briefly described in L162–163, we agree that providing formal formulas will improve clarity.
- Point MSE loss:
$$\mathcal{L}_{\text{point}} = \frac{1}{N}\sum_{i=1}^{N} \left\| \hat{p}_i - p_i \right\|_2^2,$$
where $\hat{p}_i$ is the predicted 3D hair point and $p_i$ is the corresponding ground-truth hair point.
- SDF penalty loss:
$$\mathcal{L}_{\text{sdf}} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{ReLU}\left(-d(\hat{p}_i)\right),$$
where $d(\hat{p}_i)$ is the signed distance of the predicted point to the body surface, and we penalize points inside the mesh by applying ReLU.
We will include these definitions in the final version.
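As a minimal sketch of these two terms, assuming the SDF is positive outside the body and both losses are averaged over the N predicted hair points (the loss weighting and exact normalization in the paper may differ):

```python
import torch

def point_mse_loss(pred_pts, gt_pts):
    """L_point: mean squared error between predicted and ground-truth hair points."""
    return ((pred_pts - gt_pts) ** 2).sum(-1).mean()

def sdf_penalty_loss(pred_pts, sdf_fn):
    """L_sdf: penalize predicted points that fall inside the body (negative SDF)."""
    d = sdf_fn(pred_pts)              # (N,) signed distance to the body surface
    return torch.relu(-d).mean()      # zero outside the body, linear penalty inside

# Usage with a toy SDF (unit sphere centered at the origin)
pred = torch.randn(1024, 3, requires_grad=True)
gt = torch.randn(1024, 3)
sphere_sdf = lambda p: p.norm(dim=-1) - 1.0
loss = point_mse_loss(pred, gt) + 0.1 * sdf_penalty_loss(pred, sphere_sdf)
loss.backward()
```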
(b) Tangent vector definition.
The hair tangent vector $t$ represents the local hair direction, and a visual illustration is given in Fig. 3. Yes, the formal definition is provided in Eq. (7). Thank you for the suggestion; we will clarify the explanation order in the revision.
(c) Section structure.
Thank you for the suggestion. We will reorder the sections as suggested to improve the logical flow.
- Training strategy:
Our learning-based framework follows a two-stage training strategy, as described in the Introduction (L57–75). The model is not trained end-to-end (this is left for future work). We first train a hair deformation model that includes a secondary dynamics module, and then train a separate appearance model. Training times for each model are provided in the Supple. (L101–103). During inference, we first predict hair dynamics, then generate appearance from the deformed hair.
- Real-world generalization and hair complexity:
We understand the concerns regarding real data testing. Our model is robust to real-world hair complexity, supported by a well-prepared synthetic dataset and a learning-based architecture. As the first data-driven framework for dynamic hair modeling with 3DGS representations, our method also offers a novel solution to the rigid hair problem in Gaussian head avatar re-animation.
Due to NeurIPS policy, we cannot include PDFs for visual results this year. However, we have conducted new tests combining additional motion with pre-trained real human Gaussian avatars, demonstrating that our model serves as a novel and effective supplement for multi-layer Gaussian avatar re-animation.
To capture lighting effects, our appearance model estimates spherical harmonics (SH) coefficients as a compact color representation. The training dataset includes artist-designed hair shaders under natural lighting conditions to simulate real-world scenarios.
While our current setup is based on synthetic data, we acknowledge its limitations and consider future directions such as dataset expansion and hair relighting to further improve realism and generalization.
As noted in the Introduction (L51–52) and Limitations (Supple. L155–157), capturing strand-level dynamic hair in-the-wild is challenging due to hardware limitations and the lack of robust hair tracking. To address this, we created a large-scale, lab-controlled synthetic dataset driven by various head movements (Main L243–249, Supple. A). This dataset includes diverse head poses and motions, and photorealistic rendering, which we believe generalize well to real scenarios, as demonstrated in video (03:11).
As mentioned in Main L87, we also plan to release our synthetic dataset to support and accelerate data-driven dynamic hair modeling.
I'd like to thank the authors for their elaborate response. My questions and concerns have been addressed accordingly. As stated in both the review and the authors' response, while this work demonstrates its potential as a viable representation for high-fidelity dynamic hair, there is still room to improve and refine it to make it practical. And thus I will retain my original rating.
This paper presents Dynamic Gaussian Hair (DGH), a two-stage learning-based framework for dynamic hair modeling and rendering. The first stage introduces a coarse-to-fine hair deformation network that predicts temporally coherent hair motion driven by head poses. It starts from a canonical hair model and uses an SDF-based collision loss to avoid hair-body interpenetration in the coarse stage, followed by a flow-based fine stage that captures high-frequency temporal dynamics using cross-frame attention.
The second stage focuses on dynamic appearance modeling. Each hair strand is represented by cylindrical Gaussian primitives, and a lightweight MLP predicts per-frame appearance parameters (color, scale, opacity) conditioned on geometric cues such as hair tangents and view direction. A curvature-aware blending scheme is introduced to improve appearance continuity in highly curved regions.
Compared to simulation-based methods like XPBD, the proposed deformation model significantly improves efficiency. In terms of appearance, the method extends prior static approaches to dynamic scenes, achieving higher rendering quality under motion and occlusion.
Strengths and Weaknesses
Strengths
- Both the deformation and appearance modules are tailored to the dynamic setting: the deformation model incorporates temporal refinement via cross-frame flow, while the appearance model dynamically adjusts Gaussian parameters based on hair tangents.
- Extensive experiments support the paper's claims, including ablation studies on each component (e.g., SDF loss, attention, tangent features), as well as comparisons with baselines like 3DGS and Gaussian Haircut.
- The system shows clear advantages over simulation-based methods (e.g., XPBD) in terms of runtime efficiency and over prior static appearance models in terms of visual fidelity under motion.
Weaknesses
- Despite using SDF constraints in the coarse stage to avoid hair-body interpenetration, high-speed motion in the demo still causes noticeable penetration artifacts. It is unclear why similar constraints were not applied to the fine stage, where temporal dynamics are modeled.
- The temporal refinement module only uses two previous frames (t-1 and t-2), which limits the model to learning first-order motion (velocity). Higher-order dynamics (e.g., acceleration, oscillation) may not be fully captured, and the paper does not discuss the impact of temporal window size.
Questions
Please see the weakness above.
Limitations
A key limitation of the proposed method is its reliance on ground-truth dynamic hair point clouds for supervised training. Such annotations are only feasible to obtain in synthetic settings using simulation, making the method difficult to apply directly to real captured data. Additionally, while the coarse stage uses SDF-based constraints to reduce hair-body interpenetration, the fine stage lacks similar regularization. As a result, high-speed head movements can still cause noticeable penetration artifacts in dynamic sequences. Finally, the temporal refinement module only leverages two previous frames, which restricts the model to learning first-order dynamics. This may limit its ability to capture higher-order temporal effects.
Final Justification
This paper presents Dynamic Gaussian Hair (DGH), a two-stage learning-based framework for dynamic hair modeling and rendering. Though I still have concerns about the difficulty of applying the method to real captured data, considering the clear advantages of the method over simulation-based methods in terms of runtime efficiency and over prior static appearance models in terms of visual fidelity under motion, I lean toward acceptance for this submission and keep my initial rating.
Formatting Issues
N/A
Thank you for your valuable feedback! We carefully considered your comments and provide detailed responses below:
- Generalization to real data:
Our method relies on synthetic data for ground truth (dynamic hair) and supervised training, as discussed in the Introduction (L51–52) and Limitations (Supple. L155–157), since capturing strand-level dynamic hair in the wild remains challenging due to hardware limitations and the absence of robust hair tracking systems. Hence, we opted to create photorealistic dynamic hair data at scale, using state-of-the-art offline toolkits from the graphics community (i.e., physics-based simulation, head motion, and a 3D engine with hair shaders for rendering), as detailed in the paper (Main L243–249, Supple. A).
This dataset includes a wide range of head poses and dynamics, enabling our model to generalize to realistic motions, as demonstrated in the video (03:11). While we initially explored unsupervised training using in-the-wild videos and coarse hair meshes, the results lacked strand-level precision and visual sharpness required for dynamic Gaussian rendering.
Beyond hair prediction, our other goal is to align with the 3DGS representation. Our learning-based Stage I model supports dynamic Gaussian hair as a separate layer, designed to integrate seamlessly with multi-layer Gaussian head avatars (Main L16–18). This (point-based) mesh-free design enables re-animation of arbitrary hairstyles (long hair, curls, ponytails) with plausible dynamics while avoiding the complexities of physics-based simulation pipelines.
Our DGH framework is compatible with 3DGS avatars and can be extended to improve head avatar realism. The pipeline is as follows:
Real multi-view images → Strand-based static hair reconstruction [1] → DGH → Merge with animation-ready 3DGS avatar
Due to NeurIPS policy, we cannot include visual results at this stage, but we will present additional results with dynamic hair merged into human avatars in the final version. As mentioned in Main L87, we also plan to release our synthetic dataset to support and accelerate data-driven dynamic hair modeling.
[1] Sklyarova, Vanessa, et al. "Neural haircut: Prior-guided strand-based hair reconstruction." ICCV 2023.
- SDF regularization in fine stage:
Thank you for the thoughtful observation. While the coarse stage includes SDF constraints to discourage hair-body interpenetration, we do not apply the same constraints in the fine stage. This is because the fine stage operates in a residual manner, predicting subtle flow-based refinements on top of the coarse deformation. These updates are typically small and designed to model secondary dynamics (e.g., acceleration, damping), rather than introducing large displacements that could lead to new penetrations.
Based on these observations, we omitted SDF supervision in the fine stage, especially since SDF evaluations are computationally expensive during training. That said, we agree that incorporating lightweight geometric constraints in the fine stage could further improve robustness under fast motion, and we plan to explore this direction in future work.
- The impact of temporal window size:
Let us clarify what we refer to as "temporal modeling". As described in L65–69 and Fig. 2 (Framework Overview), our Stage I consists of two modules:
- Coarse stage: Uses only the current frame’s pose to predict the hair deformation relative to the rigid hair. Since no temporal history is involved, this stage models first-order, pose-conditioned deformation.
- Fine stage: Refines the deformation by predicting per-frame flow vectors from the previously deformed hair. By leveraging information from frames $t-1$ and $t-2$, this stage captures second-order dynamics (acceleration, damping) while also ensuring temporal coherence (L166–173).
Specifically, the fine (temporal) stage learns the flow vector from the deformed hair at $t-1$ to $t$, conditioned on the deformation and flow from earlier steps ($t-2$ to $t-1$). This enables the model to implicitly capture second-order dynamics, such as acceleration, by observing how motion evolves over time.
Acceleration is computed via finite differences of velocity, requiring three positions:
$$a_t \approx \frac{v_t - v_{t-1}}{\Delta t} = \frac{(x_t - x_{t-1}) - (x_{t-1} - x_{t-2})}{\Delta t^2}.$$
Our fine model predicts $v_t$ based on:
- $x_{t-1}$ and $x_{t-2}$: the previously deformed hair positions
- $v_{t-1}$: the previous flow vector
The current model already observes the change in flow via $(x_{t-1}, x_{t-2}, v_{t-1})$, so it implicitly captures second-order effects (acceleration) using two positions and one velocity vector. With enough training data, it can learn acceleration-like patterns from local motion trends. We agree that expanding the temporal window (e.g., to include additional past frames) could help capture longer-term trends; however, we find the two-frame design effective for modeling higher-order dynamics.
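To make the finite-difference argument concrete, here is a minimal sketch; variable names follow the notation above, and the frame interval dt is an assumed value rather than a detail from the paper:

```python
import torch

def finite_difference_acceleration(x_t, x_tm1, x_tm2, dt=1.0 / 30.0):
    """a_t ~= (v_t - v_{t-1}) / dt = ((x_t - x_{t-1}) - (x_{t-1} - x_{t-2})) / dt^2."""
    v_t = (x_t - x_tm1) / dt      # flow from t-1 to t
    v_tm1 = (x_tm1 - x_tm2) / dt  # flow from t-2 to t-1
    return (v_t - v_tm1) / dt

# Usage: once the fine stage predicts the flow from t-1 to t, the third position
# x_t is available, so the acceleration signal above can be recovered.
x_tm2, x_tm1 = torch.randn(1000, 3), torch.randn(1000, 3)
flow = torch.randn(1000, 3)       # stand-in for the predicted flow vector
x_t = x_tm1 + flow
print(finite_difference_acceleration(x_t, x_tm1, x_tm2).shape)  # torch.Size([1000, 3])
```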
This capability is demonstrated in our video (02:35–02:48), where we can see the damping effects. We will clarify this temporal modeling strategy and include the window size discussion in the revised version.
Generating photorealistic dynamic hair is a challenging and ongoing problem in digital human synthesis. This paper proposes an effective framework to learn hair dynamics and appearance. Extensive experiments have been conducted to verify the effectiveness of each component and demonstrate the superiority of the proposed method over other SOTA approaches. Four reviewers handled this paper. Among them, three reviewers voted “Borderline accept” and the other suggested “Accept”. Furthermore, most of the reviewers stated clearly in the Final Justification that they leaned toward accepting this paper after discussing with the authors. The AC agrees with the reviewers that this paper deserves to be published at NeurIPS due to the above-mentioned merits. The authors are recommended to carefully revise their paper according to the reviewers’ suggestions.