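PaperHub
Score: 8.7/10
Spotlight · 4 reviewers
Ratings: min 4, max 6, standard deviation 0.8
Individual ratings: 4, 6, 6, 5
Confidence: 3.8
Novelty: 2.8
Quality: 3.5
Clarity: 3.3
Significance: 3.0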
NeurIPS 2025

Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation

Submitted: 2025-04-19 · Updated: 2025-10-29
TL;DR

We propose Taccel, a high-performance GPU-based physical simulator for robots with vision-based tactile sensor.

Abstract

Keywords
tactile sensing, simulation, robotics

Reviews and Discussion

Review
Rating: 4

This paper presents a simulation framework for vision-based tactile sensors based on Incremental Potential Contact (IPC) and Affine Body Dynamics (ABD) in which IPC guarantees inversion- and intersection-free contact solutions and ABD is for efficient and precise simulation of stiff materials. This implementation is based on NVIDIA Warp and can achieve high FPS through parallelization. It supports a wide range of robotic tasks and environment setups and its performance is validated by many robotic tasks using tactile signals.

Strengths and Weaknesses

Strengths: The work presented in this paper is very complete, including detailed technical and implementation details, sufficient experiments across many scenarios, and benchmarks against prior work. The proposed tactile simulation framework is more stable and can handle more complex physical interactions with a smaller sim2real gap. It also achieves higher computational efficiency, enabling more simulation environments to run in parallel.

Weaknesses: This work is valuable but reads more like an incremental project. The combination of IPC and ABD has been implemented in other simulators for the same purpose, and tactile RGB image generation has also been explored previously. It is clear that this work outperforms the previous works, but the technical novelty is marginal.

Questions

The presentation is clear and I don't have any questions.

Limitations

yes

Formatting Issues

No

Author Response

We thank the reviewers and the chairs for their time and effort in providing highly valuable feedback. We appreciate that the reviewers found our formulation sound (ePNg, TkE2), our experiments thorough (ePNg, VcE1, TkE2, rfK1), the performance significantly improved (ePNg, VcE1, rfK1), and the paper well-written (VcE1, TkE2). We address your concerns and answer your questions below.

Technical Novelty and Contributions

Thank you for your feedback. While we build upon foundational techniques like IPC and ABD, Taccel's novelty and primary contribution lie in the synergistic integration and engineering advancements that enable improved scalability for massive parallelization and the flexibility to integrate various robots and sensors. We also introduce a few new features to better support robotic manipulation:

  • A solver pipeline redesigned for multi‑environment execution, enabling numerous environments to run efficiently in parallel and making large‑scale robot learning feasible.
  • Complete functions for VBTS simulation, including tools for tactile sensor fabrication, tactile signal generation in various formats, etc.
  • Improved infrastructure that is compatible with NVIDIA warp.sim, allowing Taccel to leverage additional functions from Warp, including the OpenGL-based and ray-tracing-based renderers and disentangled simulation and rendering.

In essence, while the components may be known, Taccel represents a significant engineering achievement that pushes the boundary from few-instance, slow simulation to large-scale, high-performance experimentation.

Review
Rating: 6

This paper introduces Taccel, a high-performance simulation platform tailored for vision-based tactile sensors (VBTSs) in robotics. VBTSs offer rich tactile feedback through gel pad deformations, but simulating their behavior remains challenging due to the complex mechanics of soft gel pads and the high computational cost. The proposed method addresses these challenges by integrating Incremental Potential Contact (IPC) and Affine Body Dynamics (ABD), achieving high-fidelity, collision-free simulations at >18× wall-clock speed.

Powered by efficient implementation with NVIDIA Warp, Taccel supports large-scale parallelization. Its modular design also enables flexible sensor-robot configurations via user-friendly Python APIs, and it generates high-resolution RGB and depth-based tactile signals. The paper tests Taccel on diverse tasks, including object classification, grasping, and articulated manipulation, to demonstrate its strong capability for sim-to-real transfer, which outperforms prior simulators like TacMan in both precision and scalability.

Strengths and Weaknesses

Strengths

  • The paper proposes a combined IPC-ABD simulation of vision-based tactile sensors (VBTS), which is novel, original, and significant. It improves simulation accuracy compared to rigid-body simulators like Tacto and includes the robot in the loop, unlike FEM-based simulators like DiffTactile. The authors provide detailed mathematical derivations and justifications for when to use which simulation method.
  • The paper provides a thorough description of the simulation pipeline, covering all important aspects of VBTS, i.e. RGB images, normal maps, marker flow, and 3D point cloud information.
  • The experiment section provides comprehensive examples to demonstrate Taccel's capability of sim-to-real transfer and applications in robotic manipulation tasks through three main tasks: training object classification models, generating synthetic data with scalable parallelization, and manipulating articulated objects. They demonstrate the significance of VBTS and efficient learning with simulations like Taccel.
  • The paper and supplementary material are overall well-written and easy to follow. The video provides a good dissection and demonstrates the simulation scenes effectively.

Weaknesses

  • The paper focuses on sensor and robot modeling, but doesn't elaborate on object modeling. It mentions that Taccel can handle both rigid and soft objects, but can it handle objects of different materials? Would the simulation speed be affected? More testing examples and analysis would be helpful.
  • Clarification question: in the "object classification" experiment, since the tactile data only provide local information, how does the model learn to differentiate different objects vs. different parts of the same object? (Fig. 5 (a) shows nine mechanical parts, but the caption says ten. Is any of them considered as two objects?) Also, those parts seem to be fabricated from the same material and thus differ only in geometry. How would Taccel handle objects of different materials, e.g. a wooden plate vs. a metal plate?
  • The paper mentions "the simulation time scales linearly with the number of mesh vertices". Does this mean that simulating sensor contact with objects of more complicated geometry would take longer? How does it generalize if we want to simulate the gel pad at a higher resolution, emulating a higher-resolution embedded camera? How does it generalize to daily objects? For example, Taccel runs 18x faster than wall-clock time on the "bolt and nut" task; how does this change for more complex but still common object manipulation tasks?

Questions

  • As discussed in the "Strengths and Weaknesses" section, can Taccel handle different object materials or objects with heterogeneous materials? If so, what is an efficient way to do that? If not, what are the challenges?
  • Part of the main sim-to-real challenges for VBTS come from (1) sensor degradation over time and (2) variance among different sensor configurations and among individual sensors of the same configuration. Would that be mitigated by using Taccel simulation? If so, how would Taccel efficiently simulate different sensor configurations and/or sensor degradation over time to reduce the sim-to-real gap?
  • Open-ended question for discussion: one weakness of VBTS is force estimation, as real sensors only use embedded cameras to capture the gel deformation. However, now given the simulation environment and the known full state of the sensor and objects, is it possible to derive force/torque measurement from the simulation, which then helps to train a NN to learn the mapping of tactile images to force estimation? (Context: one of current methods for force estimation using VBTS is to collect a bunch of data pressing objects on the sensor with different forces, then train a NN. With Taccel, would it be easier to gather ground truth data with more variation, thus making the task easier?)

Limitations

yes

Final Justification

Thanks for the detailed rebuttal and thoughtful insights provided in the open discussion. The rebuttal addresses most of my concerns and I would like to raise my score to "Strong Accept".

Formatting Issues

N/A

Author Response

Object Modeling, Material Compatibility, and Their Overhead (W1, Q1)

Taccel supports a range of object materials by implementing widely used elastic constitutive models such as Neo-Hookean and StVK. It also accommodates heterogeneous materials by allowing different parameters (e.g., Young's modulus) to be assigned to each tetrahedral element of an object's mesh. In theory, simulation speed is unaffected, since the computation follows the same procedure and introduces no extra overhead. In practice, different material properties can lead to different physical outcomes (e.g., how an object deforms when grasped), which may cause minor variations in the total number of solver iterations needed for the simulation to converge.
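As a rough illustration of per-element material assignment, the sketch below converts a Young's modulus and Poisson's ratio for each tetrahedron into the Lamé parameters in which Neo-Hookean and StVK energies are commonly written. The function name, data layout, and numerical values are hypothetical and do not reflect Taccel's actual API.

```python
def lame_parameters(youngs_modulus, poisson_ratio):
    """Convert (E, nu) to the Lame parameters (mu, lambda) of linear elasticity."""
    if not -1.0 < poisson_ratio < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    mu = youngs_modulus / (2.0 * (1.0 + poisson_ratio))
    lam = youngs_modulus * poisson_ratio / (
        (1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)
    )
    return mu, lam


# Hypothetical heterogeneous object: one (E in Pa, nu) pair per tetrahedron.
element_materials = [
    (1.0e5, 0.45),  # soft, rubber-like region
    (1.0e5, 0.45),
    (2.0e9, 0.35),  # stiff, plastic-like region
]

# One (mu, lambda) pair per element; the energy evaluation itself is unchanged,
# which is why heterogeneity adds no extra per-element work.
element_lame = [lame_parameters(E, nu) for E, nu in element_materials]
```

Because every element still evaluates the same energy with its own pair of constants, the per-iteration cost is the same as in the homogeneous case; only the number of solver iterations may differ.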

Simulation Speed (W3)

Thank you for raising this vital point. FEM-based simulators like Taccel require more time to simulate tactile sensor meshes or object meshes with more vertices, as validated in Sec. 5.4.

Our high-resolution benchmarks were intended to stress-test Taccel's performance and stability. For many practical applications, a more efficient approach is to simulate only low-resolution marker flows [1,2], which requires far fewer vertices. When high-resolution tactile images are needed in the loop, relying solely on dense physical simulation is not an efficient approach. A promising solution is to directly fuse simulated deformation with object surface [1], which allows for high-resolution tactile image rendering with low-resolution physics simulation. We plan to further explore this in future work.

Classification Experiment (W2)

In our object classification experiment, the objects are similar in shape (around 50-100mm in length) and differ mainly in surface geometry (diameter, pitch, thread depth). The sensor's contact area (40×40mm) is large enough to capture the fine geometric features that distinguish the 10 objects. We further randomize the object's pose during data collection, ensuring the model is exposed to different parts of each object (e.g., a bolt's head vs. its threads) so it can identify the same object from its different regions.

We use 10 individual objects for this experiment. We apologize for the missing object in Figure 5(a) and will correct it in the revision.

Classifying Objects of Different Materials (W2)

This is an excellent point. In principle, Taccel can simulate contact with different materials and surface textures (e.g., wood grain vs. smooth metal), as long as fine-grained meshes are available.

While this is beyond the main focus of this paper, incorporating such materials into a VBTS simulator could be expensive in practice, since some materials require fine-grained meshes (e.g., meshes that include the cracks of a wooden plate). One intuitive solution is to adjust the tactile signal using texture normal maps, which are easily accessible online. We believe this is a promising direction for future study.

Sim-to-Real with Sensor Degradation and Variances (Q2)

Sensor configuration variances (production variance, elastomer elasticity, assembly error, etc.) could be mitigated in Taccel by adding perturbations to the sensor model as an augmentation during simulation.

Handling sensor degradation (e.g., dimming LEDs) is more challenging and is a common issue for all VBTS simulators. Currently, this should be addressed by recalibrating the simulation model with real-world samples.
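The perturbation-based augmentation mentioned above could look roughly like the following sketch, which samples randomized sensor variants around a nominal model. The parameter names, nominal values, and noise ranges are hypothetical, and the multiplicative noise model is only one possible choice (an assembly error, for instance, might be better modeled as an additive offset).

```python
import random


def perturb_sensor(nominal, rel_noise, rng):
    """Return a copy of `nominal` with each parameter scaled by a random factor.

    Each factor is drawn uniformly from [1 - rel_noise[k], 1 + rel_noise[k]].
    """
    return {
        key: value * (1.0 + rng.uniform(-rel_noise[key], rel_noise[key]))
        for key, value in nominal.items()
    }


# Hypothetical nominal sensor model and relative perturbation ranges.
nominal = {"youngs_modulus_pa": 1.0e5, "gel_thickness_mm": 4.0}
rel_noise = {"youngs_modulus_pa": 0.2, "gel_thickness_mm": 0.05}

rng = random.Random(0)  # fixed seed so the sampled variants are reproducible
variants = [perturb_sensor(nominal, rel_noise, rng) for _ in range(8)]
```

Each variant would then be assigned to a different parallel environment so that a learned policy or perception model sees the spread of sensor behavior rather than a single idealized sensor.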

Open Discussion on Learning Force Estimation (Q3)

Thank you for this thought-provoking question. VBTS simulators like Taccel do serve as a practical tool for learning force estimation. We provide APIs to obtain the forces within each FEM element, enabling the automatic generation of large, diverse datasets that pair simulated gel deformations with exact force/torque labels. Data collection protocols and applications have been explored in simpler settings [3-5]. We believe several aspects are important for learning an estimation model:

  1. Systematic Data Generation: The training data must cover diverse object geometries, contact types (indentation, shear), material properties, and dynamic trajectories.
  2. Physics-Inductive Models: Using architectures like Graph Neural Networks (GNNs) over the FEM mesh or Physics-Informed Neural Networks (PINNs) can improve model accuracy and interpretability [6, 7].

Currently, a few challenges remain:

  1. Speed and accuracy trade-offs: FEM-based simulators like Taccel still have considerable room for improvement in speed. Faster simulation means better scalability and more agile iteration on synthetic data.
  2. Real-world fine-tuning: Precise calibration or fine-tuning with real-world data (e.g., [3]) is still necessary for better performance. Despite the accuracy of existing simulators, a sim-to-real gap inevitably exists. To this end, noise and sensor degradation, as mentioned in Q2, should be systematically modeled and remedied. Such studies are abundant in vision (e.g., [8]) but still lacking for VBTS simulators.

[1] Chen, Weihang, et al. "General-purpose sim2real protocol for learning contact-rich manipulation with marker-based visuotactile sensors." IEEE Transactions on Robotics 40 (2024): 1509-1526.

[2] Si, Zilin, et al. "Difftactile: A physics-based differentiable tactile simulator for contact-rich robotic manipulation." arXiv preprint arXiv:2403.08716 (2024).

[3] Helmut, Erik, et al. "Learning force distribution estimation for the gelsight mini optical tactile sensor based on finite element analysis." arXiv preprint arXiv:2411.03315 (2024).

[4] Higuera, Carolina, et al. "Sparsh: Self-supervised touch representations for vision-based tactile sensing." Conference on Robot Learning. PMLR, 2025.

[5] Yuan, Wenzhen. Tactile measurement with a gelsight sensor. Diss. Massachusetts Institute of Technology, 2014.

[6] Shi, Haochen, et al. "Robocraft: Learning to see, simulate, and shape elasto-plastic objects in 3d with graph networks." The International Journal of Robotics Research 43.4 (2024): 533-549.

[7] Karniadakis, George Em, et al. "Physics-informed machine learning." Nature Reviews Physics 3.6 (2021): 422-440.

[8] Zhang, Xiaoshuai, et al. "Close the optical sensing domain gap by physics-grounded active stereo sensor simulation." IEEE transactions on robotics 39.3 (2023): 2429-2447.

Comment

I would like to thank the authors for their detailed rebuttal and response, including the thoughtful insights in the open discussion. It addresses most of my concerns and I would like to raise my score to "Strong Accept".

Comment

We would like to express our sincere appreciation for your positive and encouraging feedback.

Review
Rating: 6

This paper introduces Taccel, a parallel simulation platform to simulate vision-based tactile sensors (VBTS) for robotics. Taccel combines Incremental Potential Contact (IPC) as a contact model, Affine Body Dynamics (ABD) for stiff materials, and Finite Element Method (FEM) for soft-body simulation of hyper-elastic soft gels used in VBTSs. Taccel enables accurate and high-speed simulation. The experiments demonstrate that Taccel can be used for applications such as generating synthetic data to train a model for object classification, simulating robotic grasping using multi-fingered dexterous hands with integrated VBTSs, and articulated object manipulation using tactile perception feedback.

Strengths and Weaknesses

[S1] The paper is generally well-written, and the proposed Taccel simulator and accompanying experiments are comprehensive. The authors motivate the use of IPC+ABD based on the limitations of prior work that simulates VBTSs with alternative simulation techniques. The performance gains of parallel simulation enabled by Taccel over single-environment simulators are impressive (16x).

[W1] To the best of my knowledge, while the Taccel simulator is available on GitHub, most of the actual (IPC) simulation code is closed source (only compiled binaries are provided). Given that the authors appear committed to developing Taccel with the community (L187-189), I would encourage the authors to consider fully open sourcing Taccel. My main concern here is that IPC is notoriously slow, and understanding the optimizations made for IPC that achieve the performance reported in Taccel would greatly benefit the community. Furthermore, it seems that Taccel uses SAPIEN-IPC (an IPC implementation based on NVIDIA Warp that is fully open sourced)?

[W2] It is difficult to directly compare the quality of Taccel’s simulation of VBTSs to existing VBTS simulators such as SAPIEN-IPC, Taxim, or DiffTactile, given the experimental results in Section 5 or 6. I think an experiment that compares directly pixel-wise tactile marker MSE between VBTS simulators would be helpful. While there is a quantitative result reporting SSIM between the simulated and real-world RGB tactile image, this robustness may be due to the depth -> RGB neural network, rather than quality improvements in tactile sensor simulation.


I am willing to raise my score if [W1] and [W2] are resolved.

Questions

L61-63: I believe these numbers are for total effective FPS, instead of per-environment FPS? If this is the case, then I don’t think it’s accurate to describe 18x “real-time” performance in the abstract (L10), because the single-environment FPS is still sub 1 FPS (Figure 4).

L309: What are execution-recovery switch counts? This metric is not explained in the subsection.

Figure 2(a): It looks like only two rigid planes are interacting with the bolt (while the Franka gripper is only used for visualization). Does the task fail if the finger collision shapes are smaller (similar to what is used in PyBullet and Sapien)? What collision simplifications are made by Taccel, such as for the articulated rigid body tasks in Figure 7?

SAPIEN-IPC is also implemented using NVIDIA Warp. Is Taccel based on SAPIEN-IPC (which appears fully open sourced)? What are the major differences between Taccel’s implementation of IPC and SAPIEN-IPC that enable faster simulation (Figure 2 (d))? L251 mentions that SAPIEN-IPC’s use of FP32 leads to convergence issues, while Taccel uses FP64 (Figure 4). I would expect use of FP64 to lead to slower simulation (and larger memory requirements). Does FP64 enable fewer solver iterations or larger timesteps?


[minor]

  • References should not be in the abstract?
  • L37-38: “contactsd” spelling?
  • L60: “warp” is lowercase?

Limitations

Yes.

Final Justification

The authors prepared an extensive rebuttal that resolved most of my questions and concerns. While [W1] is not fully addressed (full code release is still in preparation), I will still increase my score (-> 6) in a show of good faith. Taccel is an impressive work in tactile parallel simulation, advancing the state-of-the-art on IPC simulation, with comprehensive evaluations in simulation and validation on real-world tactile sensors.

Formatting Issues

n/a

Author Response

Code Release of Taccel (W1)

We agree that fully open-sourcing the code is beneficial for the community, and do plan to open-source Taccel in the near future. Before that, we are still working on optimizing and testing the core implementations to improve their robustness and accuracy. We are also working on adapting the infrastructure to warp.sim and NVIDIA Newton framework for broader compatibility and a richer ecosystem.

Comparisons on Tactile Markers (W2)

Experiment: We conduct sim-real twisting experiments to test elastomer deformation, and compare with TacFlex [2], a recently published (T-RO) FEM-based simulator. In the real world, a GelSight Mini is controlled by a robot arm, pressed perpendicularly onto a fixed sphere (radius $r = 10$ mm) to achieve a contact depth $d \in [6, 12]$ mm, then twisted about the z-axis by an angle $\theta$. Marker flows are collected as $\theta$ reaches 2.5, 5, 7.5, and 10 degrees. The scene is simulated in Taccel and in the public codebase of TacFlex, both using a 2k-node elastomer.

Metric: While pixel-wise comparisons (e.g., SSIM, MSE) are suitable for comparing tactile image quality (Fig. 3(a)), they are less suitable for evaluating tactile images with markers, as they are overly sensitive to minor positional shifts of the high-contrast marker pixels. A small, physically insignificant displacement can lead to a disproportionately large error, masking the true geometric accuracy of the simulation. Therefore, we adopt a more direct and conventional evaluation by comparing the marker flow, which directly measures the accuracy of the simulated deformation. This methodology is the established standard in the vast majority of prior works [1-3].
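The sensitivity argument can be made concrete with a toy example: moving a single bright marker by one pixel changes two pixels by the full intensity range, so the pixel-wise MSE jumps even though the marker displacement is only one pixel. The image size and marker positions below are illustrative and are not taken from the experiments.

```python
def mse(a, b):
    """Mean squared error between two equally sized 2D images (lists of rows)."""
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def marker_image(size, row, col):
    """Black image with a single bright marker pixel at (row, col)."""
    return [[1.0 if (r, c) == (row, col) else 0.0 for c in range(size)]
            for r in range(size)]


reference = marker_image(8, 3, 3)
shifted = marker_image(8, 3, 4)    # same marker, displaced by one pixel
identical = marker_image(8, 3, 3)

mse_shifted = mse(reference, shifted)      # two pixels differ by 1.0 each
mse_identical = mse(reference, identical)  # no pixel differs
```

A flow-based metric would report this case as a one-pixel displacement error, which is proportional to the physical discrepancy, whereas the MSE is the same no matter how far the marker moved once the two blobs no longer overlap.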

Results: We report the pair-wise error of marker flow magnitudes (length difference) and directions (angular difference) below. Taccel demonstrates lower errors in marker flow simulation, which we attribute to the superior accuracy of its IPC solver in handling friction and elastomer deformation.

| Simulator | Flow Magnitude Error / mm ↓ | Flow Direction Error / deg ↓ |
| --- | --- | --- |
| Taccel | 0.0540 (±0.0019) | 0.2877 (±0.0349) |
| TacFlex | 0.0968 (±0.0198) | 0.3981 (±0.1565) |
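A minimal sketch of how such pair-wise flow errors might be computed is given below. It assumes that simulated and real markers are already matched one-to-one and that the reported numbers are means over markers; the rebuttal does not spell out these details, and the function name and sample displacements are hypothetical.

```python
import math


def flow_errors(sim_flow, real_flow):
    """Mean magnitude error and mean angular error (deg) between paired flows.

    Each flow is a list of (dx, dy) marker displacements in the same order.
    """
    mag_errs, ang_errs = [], []
    for (sx, sy), (rx, ry) in zip(sim_flow, real_flow):
        # Length difference between the two displacement vectors.
        mag_errs.append(abs(math.hypot(sx, sy) - math.hypot(rx, ry)))
        # Unsigned angle between the two vectors, in [0, 180] degrees.
        cross = sx * ry - sy * rx
        dot = sx * rx + sy * ry
        ang_errs.append(abs(math.degrees(math.atan2(cross, dot))))
    n = len(mag_errs)
    return sum(mag_errs) / n, sum(ang_errs) / n


# Hypothetical displacements in mm for three matched markers.
sim = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
real = [(1.1, 0.0), (0.0, 2.0), (1.0, 1.0)]
mag_err, ang_err = flow_errors(sim, real)
```

Using `atan2` of the cross and dot products avoids the numerical problems of `acos` for nearly parallel vectors, which is the common case when the simulation is accurate.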

Simulation FPS Computation (Q1)

You are correct. The reported FPS represents the total effective FPS across all parallel environments (i.e., total simulation steps per second). We will revise the abstract and main text to state this with greater clarity. Thank you for pointing this out.
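The distinction between the two quantities is simple arithmetic, sketched below with made-up numbers (they are not measurements from the paper): a high aggregate FPS can coexist with a per-environment rate below one step per second.

```python
def total_fps(steps_per_env, num_envs, wall_time_s):
    """Total simulation steps per second summed over all parallel environments."""
    return steps_per_env * num_envs / wall_time_s


def per_env_fps(aggregate_fps, num_envs):
    """Average simulation steps per second advanced by a single environment."""
    return aggregate_fps / num_envs


# Hypothetical run: 4096 environments, 100 steps each, 500 s of wall-clock time.
aggregate = total_fps(steps_per_env=100, num_envs=4096, wall_time_s=500.0)
single = per_env_fps(aggregate, num_envs=4096)
```

Here the aggregate rate is 819.2 steps per second while each individual environment advances only 0.2 steps per second, which is the situation the reviewer describes.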

The Execution-Recovery Switch Count (Q2)

We apologize for omitting the explanation from the main text; it was in the appendix (Sec. C.5) and will be moved to the main text to make it self-contained.

The "switch count" is a key performance metric for the Tac-Man algorithm [4], which manipulates articulated objects using tactile feedback. The algorithm alternates between an execution stage (moving the part along a guessed direction) and a recovery stage (readjusting the grasp based on tactile signals to ensure stable contact). A "switch" between the two stages occurs when the tactile deformation exceeds a threshold (execution → recovery) or when the large deformation has been compensated (recovery → execution). The "switch count" is the number of such switches during the manipulation.

The simulation of Tac-Man requires precisely solving the friction and gel pad deformation during a long sequence. The close match in the "switch counts" between our simulation and the real world indicates that Taccel accurately replicates the complex, long-horizon contact dynamics of the task.
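The switch-count definition above can be sketched as a two-state machine over a recorded deformation trace. The thresholds, the use of a separate (lower) exit threshold, and the sample trace are assumptions made for illustration; they are not the values or the exact logic used by Tac-Man.

```python
def count_switches(deformation, enter_recovery, exit_recovery):
    """Count execution<->recovery transitions along a deformation trace.

    The controller enters recovery when deformation exceeds `enter_recovery`
    and returns to execution once it drops below `exit_recovery`.
    """
    if exit_recovery > enter_recovery:
        raise ValueError("exit threshold must not exceed enter threshold")
    in_recovery = False
    switches = 0
    for value in deformation:
        if not in_recovery and value > enter_recovery:
            in_recovery = True
            switches += 1
        elif in_recovery and value < exit_recovery:
            in_recovery = False
            switches += 1
    return switches


# Hypothetical per-step deformation magnitudes (arbitrary units).
trace = [0.1, 0.4, 0.9, 0.6, 0.2, 0.3, 1.1, 0.1]
n_switches = count_switches(trace, enter_recovery=0.8, exit_recovery=0.25)
```

For this trace the controller enters recovery twice and leaves it twice, giving a switch count of 4. Comparing this count between simulated and real runs is sensitive to accumulated friction and deformation errors, which is why it serves as a long-horizon fidelity check.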

Collision Settings (Q3)

In Figure 2(a), the gray “rigid planes” are actually the soft gel pads of the two VBTS sensors attached on the fingers (the sensor shells are not visualized for clarity). The gripper grasps the bolt with the pads and screws it into the nut, during which all the elements, including the robot links, gel pads, and objects, are involved in the physical simulation (collision detection, contact solving, time stepping, etc).

Based on ABD, Taccel supports efficient simulation using either the collision meshes (simplified shapes, convex decomposition, or convex hull) or the high-resolution, non-convex visual mesh on GPU. While we directly use these detailed meshes in our experiments (as rendered), we recommend using simplified collision meshes for greater efficiency in practice.

Relation with SAPIEN-IPC (W1, Q4)

Taccel is an independent development and is not based on SAPIEN-IPC [1]. Although both use IPC and NVIDIA Warp, we attribute Taccel's faster speed to two key factors:

Implementation Optimizations: We developed a more compact data management and optimization solving scheme, which allows more parallel environments on a single GPU and thus faster speeds. Environment isolation in collision detection and step-size filtering allow each environment to evolve independently with its own optimization step size. Environments that meet their individual stopping criteria (e.g., Newton relative error) are frozen to free resources. A failure detection stage further isolates unstable environments, enhancing overall robustness.
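The idea of freezing converged environments can be illustrated with a small CPU-side sketch in which each environment keeps iterating only until its own residual drops below the tolerance. The function, the toy "halve the residual" update, and all values are hypothetical; the real solver runs on the GPU and its update is a Newton step, not this stand-in.

```python
def solve_batch(residuals, tolerance, max_iters, step):
    """Iterate every environment until its own residual is below `tolerance`.

    `step` maps a residual to the residual after one more solver iteration.
    Converged environments leave the active set and are never touched again.
    Returns the final residuals and the iteration count of each environment.
    """
    residuals = list(residuals)
    iters = [0] * len(residuals)
    active = [i for i, r in enumerate(residuals) if r >= tolerance]
    for _ in range(max_iters):
        if not active:
            break
        for i in active:
            residuals[i] = step(residuals[i])
            iters[i] += 1
        # Freeze environments that met their stopping criterion.
        active = [i for i in active if residuals[i] >= tolerance]
    return residuals, iters


# Toy stand-in for a Newton iteration: each call halves the residual.
final, used = solve_batch([1.0, 0.1, 0.005], tolerance=0.01, max_iters=50,
                          step=lambda r: 0.5 * r)
```

In this toy run the three environments use 7, 4, and 0 iterations respectively, so work is spent only where it is still needed instead of running every environment for as long as the slowest one.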

FP64 Precision: You are correct that FP64 arithmetic is slower per operation than FP32 (as used in SAPIEN-IPC). However, IPC benefits from FP64’s higher precision, enabling more accurate search directions and line search filtering, which significantly accelerates convergence by shortening Newton iterations. This reduction in iterations more than compensates for the higher cost per iteration, resulting in a faster and more stable simulation overall. See the next question for a quantitative inspection.
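The precision gap itself is easy to demonstrate. The sketch below only shows generic floating-point rounding and does not reproduce the behavior of Taccel or SAPIEN-IPC: a small correction added to a unit-scale quantity survives in FP64 but is rounded away in FP32, which is one way small search-direction or line-search updates can be lost. The helper `to_fp32` and the chosen values are illustrative.

```python
import struct


def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# A small correction applied to a unit-scale quantity (values are illustrative).
base, delta = 1.0, 1e-8

# FP64 has about 16 significant digits, so the increment is preserved.
fp64_result = base + delta
# FP32 has about 7 significant digits, so the sum rounds back to 1.0.
fp32_result = to_fp32(to_fp32(base) + to_fp32(delta))

fp64_keeps_step = fp64_result != base   # True
fp32_keeps_step = fp32_result != base   # False
```

When updates of this relative size matter, an FP32 solver may stall above the requested tolerance while an FP64 solver continues to make progress.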

Computing Data Type of FP64 (Q4)

FP64 does slow down each operation, but its higher precision enables far fewer solver iterations at larger timesteps. To demonstrate this, we run the peg-insertion benchmark (Sec. 5.4) in Taccel and SAPIEN-IPC with 1 / 16 / 64 environments, with the Newton iteration limit set to 50 and the optimization residue tolerance set to 0.01 m/s. During each iteration, the IPC solver performs conjugate gradient descent until (i) the 50-step limit is exceeded, or (ii) the optimization residue falls below 0.01 m/s. The table below reports the maximum number of Newton iterations required for each step, the optimization residue, the total time, and the total FPS. Although each iteration takes longer with FP64, convergence is much faster, which yields better overall simulation speed.

| Simulator | # Envs | # Newton Iterations (max 50) ↓ | Optim. Residue / (m/s) ↓ | Time / s ↓ | Total FPS ↑ |
| --- | --- | --- | --- | --- | --- |
| Taccel (FP64) | 1 | 1.585 (±1.128) | 7.643e-5 (±6.550e-5) | 69.78 | 2.866 |
| SAPIEN-IPC (FP32) | 1 | 24.085 (±10.354) | 0.001 (±0.006) | 82.411 | 2.427 |
| Taccel (FP64) | 16 | 2.83 (±2.565) | 1.866e-4 (±2.089e-4) | 141.919 | 22.548 |
| SAPIEN-IPC (FP32) | 16 | 50.0 (±0.0) | 0.113 (±0.440) | 291.462 | 10.979 |
| Taccel (FP64) | 64 | 3.445 (±3.261) | 2.043e-4 (±2.200e-4) | 196.712 | 65.069 |
| SAPIEN-IPC (FP32) | 64 | 50.0 (±0.0) | 0.196 (±0.442) | 668.950 | 19.134 |

Note that we have increased the Newton iteration limit from 20 (used in Sec. 5.4) to 50 to inspect the convergence behavior under FP32. The results here are therefore not directly comparable to those in Sec. 5.4.

Editorial Issues (Q5)

Thanks for pointing them out! We will carefully revise the typos and editorial issues.

[1] Chen, Weihang, et al. "General-purpose sim2real protocol for learning contact-rich manipulation with marker-based visuotactile sensors." IEEE Transactions on Robotics 40 (2024): 1509-1526.

[2] Zhang, Chaofan, et al. "TacFlex: Multi-Mode Tactile Imprints Simulation for Visuotactile Sensors with Coating Patterns." IEEE Transactions on Robotics (2025).

[3] Du, Wenxin, et al. "Tacipc: Intersection-and inversion-free FEM-based elastomer simulation for optical tactile sensors." IEEE Robotics and Automation Letters 9.3 (2024): 2559-2566.

[4] Zhao, Zihang, et al. "Tac-man: Tactile-informed prior-free manipulation of articulated objects." IEEE Transactions on Robotics (2024).

Comment

I appreciate the authors' efforts in preparing an extensive rebuttal. Taccel is an impressive work in tactile parallel simulation, advancing the state-of-the-art on IPC simulation, with comprehensive evaluations in simulation and validation on real-world tactile sensors. I look forward to the full open source release of the simulator for camera ready. My questions and concerns have been mostly answered. While [W1] is not fully addressed (full code release is still in preparation), I will still increase my score (-> 6) in a show of good faith.

Comment

We extend our sincere gratitude to the reviewer for their positive and encouraging feedback on our work. We reaffirm our commitment to the full code release.

Review
Rating: 5

This paper presents Taccel, a high-performance simulation platform for vision-based tactile sensors (VBTSs) integrated with robotic systems. The key technical contribution is the combination of Incremental Potential Contact (IPC) and Affine Body Dynamics (ABD) to achieve both physical accuracy and computational efficiency. The platform achieves an 18-fold speedup over real-time simulation across thousands of parallel environments on a single GPU. The authors validate their system through three applications: object classification with sim-to-real transfer, large-scale robotic grasping dataset generation, and articulated object manipulation. The paper demonstrates that Taccel can generate realistic tactile signals while maintaining stable, penetration-free physics simulation at scale.

Strengths and Weaknesses

Strengths:

  • The unified IPC-ABD formulation seems mathematically sound and addresses key limitations of existing simulators. The use of IPC guarantees intersection-free trajectories while ABD efficiently handles rigid components, avoiding the computational overhead of treating everything as soft bodies.
  • The ability to simulate 4096+ parallel environments with tactile sensors is a significant achievement. The performance benchmarks showing 915 FPS for low-resolution tactile simulation represent a major improvement over existing solutions like SAPIEN-IPC.
  • The paper provides thorough validation across multiple dimensions - physics accuracy (bolt-nut assembly, soft block pressing), tactile signal fidelity (SSIM of 0.93), and real-world task performance (Tac-Man manipulation with 1.1% error compared to real execution).
  • The sim-to-real transfer results (70.94% accuracy on real mechanical parts after training only on synthetic data) demonstrate the practical value of the simulator for developing tactile perception systems.
  • The provision of Python APIs and support for standard formats (URDF) makes the tool accessible to the robotics community.

Weaknesses:

  • Limited tactile sensor models: While the paper focuses on GelSight-type sensors, it's unclear how easily the framework extends to other VBTS designs (e.g., different gel materials, alternative optical configurations). The RGB signal generation relies on a DNN trained on only 200 real images, which may limit generalization.
  • Despite the impressive speedup, the system still requires high-end GPUs (H100) for optimal performance. The scalability analysis doesn't adequately address performance on more accessible hardware or the memory-computation tradeoffs.
  • Incomplete comparisons: The paper lacks detailed comparisons with some recent simulators. Most notably, there's no comparison with TacEx (Nguyen et al., 2024, TacEx: GelSight Tactile Simulation in Isaac Sim -- Combining Soft-Body and Visuotactile Simulators), which also combines soft-body and visuotactile simulation in Isaac Sim. This is a significant omission given the similar goals.
  • The paper doesn't thoroughly discuss when and why the simulation might fail to match real-world behavior. The sim-to-real gap in object classification (86.5% -> 70.94%) suggests room for improvement.
  • The validation focuses primarily on static/quasi-static scenarios. Dynamic manipulation tasks with rapid contact changes or multi-contact scenarios are underexplored.

Questions

  • How does Taccel compare to the recently presented TacEx system (Nguyen et al., CoRL 2024 Workshop), which also combines soft-body simulation with visuotactile rendering in Isaac Sim? Could you provide a detailed comparison in terms of accuracy, performance, and features?
  • How difficult would it be to extend Taccel to support other VBTS designs beyond GelSight-type sensors? What modifications would be required for sensors with different optical properties or gel geometries?
  • The current validation focuses on relatively slow, controlled movements. How does the simulation accuracy degrade under dynamic conditions with rapid contact formation/breaking or high-velocity impacts?
  • What is the relationship between memory usage and the number of parallel environments? Is there a practical upper limit beyond GPU memory constraints?
  • How sensitive are the results to the choice of material parameters (Young's modulus, Poisson's ratio)? Do you provide guidance for parameter identification from real sensor data?
  • Are there plans to support multi-GPU execution for even larger-scale simulations? What would be the main challenges in implementing this?
  • For online robot learning applications, what is the minimum hardware configuration needed to achieve real-time performance with a reasonable number of environments (e.g., 100-1000)?

Limitations

The authors acknowledge computational demands as a limitation, identifying PCG-based linear system solving as a bottleneck. However, several other limitations deserve mention:

  • The fidelity of tactile signal generation depends on the quality of the trained DNN model, which may not generalize to all surface textures or contact conditions

  • The Neo-Hookean material model may not accurately capture all aspects of real gel pad behavior, particularly under large deformations

  • The current implementation appears limited to single-GPU execution, which may constrain very large-scale simulations

Final Justification

I appreciate the authors' efforts in preparing an extensive rebuttal. Taccel is an impressive work in tactile parallel simulation, advancing the state-of-the-art on IPC simulation, with comprehensive evaluations in simulation and validation on real-world tactile sensors. The rebuttal largely confirms my assessment and I will up my score.

Formatting Issues

I have no formatting concerns.

Author Response

We thank the reviewers and the chairs for their time and effort in providing highly valuable feedback. We appreciate that the reviewers found our formulation sound (ePNg, TkE2), our experiments thorough (ePNg, VcE1, TkE2, rfK1), the performance significantly improved (ePNg, VcE1, rfK1), and the paper well-written (VcE1, TkE2). We address your concerns and answer your questions below.

Sensor Compatibility (W1, Q2)

Taccel is highly extensible and supports a wide range of VBTS with various:

  • gel materials with adjustable simulation parameters like Young’s Modulus
  • gel geometry, including cuboid, curve-shaped (DIGIT), dome-shaped (DIGIT360, PPTac)
  • mechanism and optical setup: our data-driven tactile image generation handles different mechanisms (e.g., 9DTact) and optical setups via the learnable tactile image model

We have validated the integration and physical simulation of four sensors in Taccel: our custom GelSight and 9DTact sensors, as well as commercial sensors such as GelSight Mini, DIGIT, and DIGIT360.

We further validate the data-driven tactile image generation for a 9DTact sensor (a different mechanism from GelSight sensors). The calibration and validation protocols follow Sec. 5.2, and the simulated tactile images achieve SSIM = 0.97, confirming the high fidelity.

Generalization of the DNN (W1, L1)

This is a very interesting point. The DNN model learns a local, pixel-to-pixel (rather than patch-based) mapping from the pixel coordinate and surface normal to the color change (Sec. 4.2). This formulation generalizes across object geometries (Fig. 3a), analogous to the widely used GelSight depth model [1]. Consequently, the actual training data comprises $n_\text{img} \times h \times w$ data points (32M for our 400 × 400 images), which is practically sufficient for a small MLP (<100K parameters).
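
As a rough check of the figures above, the sketch below counts the per-pixel training samples and the parameters of a small MLP. The 200-image count is taken from the reviewer's summary; the 5-input, 3-output layout and the hidden widths are assumptions chosen for illustration, not the architecture used in the paper.

```python
# Back-of-the-envelope check: per-pixel samples vs. MLP size.
# Layer widths below are hypothetical, not taken from the paper.

def num_pixel_samples(n_img: int, h: int, w: int) -> int:
    """Each pixel of each calibration image is one training sample."""
    return n_img * h * w

def mlp_param_count(layer_sizes: list[int]) -> int:
    """Weights plus biases of a fully connected network."""
    return sum(i * o + o for i, o in zip(layer_sizes, layer_sizes[1:]))

samples = num_pixel_samples(n_img=200, h=400, w=400)
# Assumed inputs: pixel (u, v) + surface normal (nx, ny, nz); outputs: RGB change.
params = mlp_param_count([5, 128, 128, 128, 3])

print(f"samples = {samples:,}")
print(f"params  = {params:,}")
print(f"samples per parameter ~ {samples / params:.0f}")
```

Under these assumptions the network stays well below the 100K-parameter bound while seeing several hundred samples per parameter.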

We agree that the model is limited in generalizing to all textures or extreme contacts (e.g., it does not produce the shadows cast by the gel under large deformation) and will clarify this in the revision.

Comparisons with other Simulators (W2, Q1)

We agree a comparison to recent simulators like TacEx is needed.

  • Physics Accuracy: TacEx simulates soft bodies with GIPC, asynchronously sending their nodal positions back to PhysX for rigid body simulation. This lacks a global energy minimization guarantee, which may introduce potential coupling errors and simulation instabilities at soft-rigid contact. Taccel integrates IPC and ABD in a unified augmented Lagrangian formulation, with soft and rigid bodies strongly coupled by contact forces, providing theoretically better accuracy in contact.
  • Sensor Signal Accuracy: TacEx uses Taxim that converts object geometry to tactile signals via look-up tables. Taccel uses the deformed elastomer to compute marker flow (via camera projection) and contact depth map (via ray-tracing), generating tactile images with a learned model. This allows Taccel to capture finer and richer elastomer deformation modes, particularly shear deformations, yielding more accurate tactile signals.
  • Efficiency: GIPC uses a quasi-Newton method to accelerate Newton iterations, which is a potential improvement for Taccel. Currently, Taccel focuses on reducing overhead for massively parallel environments on the GPU.
  • Features: TacEx is a useful plug-in for Isaac Lab, while Taccel is a comprehensive simulation platform for robots, objects, and sensors. It also offers additional tools and APIs for sensor integration and simulation.
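
To make the "marker flow via camera projection" step concrete, here is a minimal sketch assuming an ideal pinhole camera without distortion. The intrinsics, marker positions, and function names are made up for illustration and are not Taccel's API.

```python
# Sketch: marker flow as the image-space displacement of projected markers.
# Intrinsics (fx, fy, cx, cy) and marker positions are made-up values.

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point (x, y, z) to pixels."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)

def marker_flow(rest_pts, deformed_pts, fx, fy, cx, cy):
    """Per-marker 2D flow: projected deformed position minus projected rest position."""
    flow = []
    for p0, p1 in zip(rest_pts, deformed_pts):
        u0, v0 = project(p0, fx, fy, cx, cy)
        u1, v1 = project(p1, fx, fy, cx, cy)
        flow.append((u1 - u0, v1 - v0))
    return flow

rest = [(0.000, 0.000, 0.020), (0.002, 0.000, 0.020)]      # metres, camera frame
deformed = [(0.001, 0.000, 0.020), (0.002, 0.001, 0.020)]  # sheared elastomer surface
flow = marker_flow(rest, deformed, fx=400.0, fy=400.0, cx=200.0, cy=200.0)
print(flow)
```

Because the flow is computed from the deformed elastomer nodes rather than from the object geometry, lateral (shear) motion of the gel surface shows up directly as in-plane pixel displacement.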

The code of TacEx had not been released at the time of access, making a more detailed comparison impossible. We instead opt for a quantitative comparison with TacFlex [3], another recently published (T-RO) FEM-based simulator.

Experiment: We conduct sim-real twisting experiments to test elastomer deformation. In the real world, a GelSight Mini is controlled by a robot arm, pressed perpendicularly onto a fixed sphere (radius $r$ = 10mm) to achieve a contact depth $d \in [6, 12]$mm, then twisted about the z-axis by an angle $\theta$. Marker flows are collected as $\theta$ reaches 2.5, 5, 7.5, and 10 degrees. The scene is simulated in Taccel and TacFlex, both using a 2k-node elastomer. Despite TacFlex's multi-engine support, only the Flex solver is currently available in its codebase, so we use it for our comparison.

Results: We report the pair-wise error of marker flow magnitudes (length difference) and directions (angular difference). Taccel demonstrates lower errors in marker flow simulation, which we attribute to its superior accuracy of the IPC solver in handling friction and elastomer deformation.

| Simulator | Flow Magnitude Error / mm ↓ | Flow Direction Error / deg ↓ |
| --- | --- | --- |
| Taccel | 0.0540 (±0.0019) | 0.2877 (±0.0349) |
| TacFlex | 0.0968 (±0.0198) | 0.3981 (±0.1565) |
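
The two metrics reported above (length difference and angular difference between paired flow vectors) can be sketched as follows. How the authors aggregate over markers and frames is not stated, so the plain mean used here is an assumption, as are the function name and the toy inputs.

```python
import math

def flow_errors(sim_flow, real_flow):
    """Mean magnitude error (flow units) and mean angular error (degrees)."""
    mag_errs, ang_errs = [], []
    for (sx, sy), (rx, ry) in zip(sim_flow, real_flow):
        s_len, r_len = math.hypot(sx, sy), math.hypot(rx, ry)
        mag_errs.append(abs(s_len - r_len))
        if s_len > 0 and r_len > 0:  # direction is undefined for zero-length flow
            cos = (sx * rx + sy * ry) / (s_len * r_len)
            cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
            ang_errs.append(math.degrees(math.acos(cos)))

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(mag_errs), mean(ang_errs)

# Toy data: the first pair differs only in length, the second only in direction.
sim = [(1.0, 0.0), (0.0, 2.0)]
real = [(2.0, 0.0), (2.0, 0.0)]
mag, ang = flow_errors(sim, real)
print(f"magnitude error = {mag:.3f}, direction error = {ang:.1f} deg")
```

Separating the two errors is useful here because a twist about the z-axis mainly rotates the flow field, so a solver can match flow lengths while still misjudging directions, or vice versa.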

Hardware Requirement and Usage (W2, Q4, Q7)

We additionally benchmark the peg-insertion task on various GPUs (timestep $\Delta t$ = 1/50s). Taccel leverages FP64 for improved precision and faster Newton convergence (see the response to reviewer VcE1's Q4), making HPC GPUs (FP64:FP32 FLOPS ratio = 1:2) ideal. More accessible cards like the RTX 3090, despite their lower FP64:FP32 ratio (1:64), also deliver scalable performance, achieving real-time performance (50 FPS) with around 100 environments.

| GPU (FP64:FP32) | H100 (1:2) | RTX 4090 (1:64) | RTX 3090 (1:64) | RTX 3090 (1:64) | RTX 3090 (1:64) |
| --- | --- | --- | --- | --- | --- |
| # Envs | 256 | 256 | 256 | 128 | 64 |
| Total FPS | 185.52 | 129.12 | 121.38 | 74.81 | 43.40 |

The GPU memory (VRAM) scales linearly with the number of environments, enabling efficient scaling until memory saturation. For the peg insertion task on an H100 GPU:

| # Envs | 1 | 16 | 256 | 1024 | 4096 |
| --- | --- | --- | --- | --- | --- |
| VRAM / GiB | 4.3 | 4.3 | 7.2 | 16.7 | 54.0 |
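
The linear-scaling claim can be checked against the reported numbers with an ordinary least-squares fit. The sketch below uses only the five H100 measurements above; the helper names are hypothetical, and any extrapolation to other GPUs or tasks is an assumption rather than a result from the rebuttal.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def max_envs(budget_gib, slope, intercept):
    """Largest environment count whose predicted VRAM fits the budget."""
    return max(0, int((budget_gib - intercept) / slope))

# Reported peg-insertion measurements on an H100.
envs = [1, 16, 256, 1024, 4096]
vram_gib = [4.3, 4.3, 7.2, 16.7, 54.0]

slope, intercept = linear_fit(envs, vram_gib)
print(f"~{slope * 1024:.1f} MiB per environment, ~{intercept:.1f} GiB fixed overhead")
print(f"predicted capacity at 80 GiB: {max_envs(80.0, slope, intercept)} environments")
```

The fit separates a fixed overhead of a few GiB from a per-environment cost on the order of ten MiB, which matches the nearly flat usage between 1 and 16 environments.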

Currently, the scalability bottleneck remains GPU memory, which multi-GPU support (Q6) could relieve. We aim to further enhance Taccel's scalability in the future.

Accuracy Validations on More Dynamics Tasks (W3, Q3)

Taccel's foundation in IPC and ABD allows it to robustly handle highly dynamic events involving high-speed contact. We have validated it through qualitative tasks like throw-and-catch, where the results are visually realistic at the cost of more Newton iterations. However, quantitative sim-real comparisons are much harder due to the stochasticity of dynamic events. For a qualitative reference, the original IPC paper's demos [2] illustrate its capability to simulate rapid, high-impact contacts with high physical fidelity.

Potential Failure in Sim-Real Mismatch (W2)

We do observe sim-real mismatch that requires further improvement, despite the moderate sim-real gap revealed by our experiments. Primary sources include:

  • Errors in camera calibration (distortion, noise, refraction through the gel [3]), which can be observed in Fig. 5(b)
  • Degradation of the LEDs and the sensor through use (light dimming, elastomer wear, and environmental light leakage)
  • Use of imprecise physical parameters (e.g., Young’s modulus) and suboptimal solver parameters (e.g., optimization tolerance) in simulation, despite its robustness
  • Slight sim-real geometrical mismatch of the sensor and the objects due to discretization error

We will add a discussion on this for completeness.

Effects and Calibration of Material Parameters (Q5)

Practically, the sensor elastomer's Poisson’s ratio $\nu$ is typically in [0.3, 0.45], within which the response variation is subtle. The Young’s modulus $E$ typically falls in [0.01, 10] MPa, and the response remains almost constant within the same order of magnitude. The simulation is not overly sensitive to these parameters.
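
For readers tuning these values, the standard isotropic-elasticity conversion from Young's modulus and Poisson's ratio to the Lamé parameters used by Neo-Hookean models is sketched below. Whether Taccel takes $(E, \nu)$ or Lamé parameters as input is not stated here, so the function is a generic helper rather than part of its API.

```python
def lame_parameters(young_modulus_pa: float, poisson_ratio: float):
    """Convert (E, nu) to the Lame parameters (mu, lambda), both in Pa."""
    if young_modulus_pa <= 0:
        raise ValueError("Young's modulus must be positive")
    if not -1.0 < poisson_ratio < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    mu = young_modulus_pa / (2.0 * (1.0 + poisson_ratio))
    lam = (young_modulus_pa * poisson_ratio
           / ((1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)))
    return mu, lam

# Illustrative values inside the typical ranges quoted above (0.1 MPa, nu = 0.4).
mu, lam = lame_parameters(young_modulus_pa=1.0e5, poisson_ratio=0.4)
print(f"mu = {mu:.1f} Pa, lambda = {lam:.1f} Pa")

# lambda grows without bound as nu approaches 0.5 (incompressibility).
for nu in (0.30, 0.40, 0.45):
    print(nu, lame_parameters(1.0e5, nu))
```

The conversion shows why values of $\nu$ close to 0.5 deserve care: the first Lamé parameter grows rapidly near incompressibility, while the shear modulus changes only mildly across the quoted range.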

Standard techniques exist for precise calibration, such as sliding experiments for friction coefficients [4] or tensile tests for elastic moduli [5,6,7]. For simulations without calibration, our practice is to start within the typical range and carefully adjust the parameters by inspecting the simulated scenes. We will provide guidance and references for these methods in the revision.

Multi-GPU Support (Q6, L3)

We plan to support multi-GPU execution in future work. Our parallel-environment design allows isolated environments to be solved in a distributed manner. We currently foresee no major technical barriers to this.

Additional Limitations (L2, L3)

We acknowledge the limitations and will incorporate them in the revision.

[1] Wang, Shaoxiong, et al. "Gelsight wedge: Measuring high-resolution 3d contact geometry with a compact robot finger." 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021.

[2] Li, Minchen, et al. "Incremental potential contact: intersection-and inversion-free, large-deformation dynamics." ACM Trans. Graph. 39.4 (2020): 49.

[3] Zhang, Chaofan, et al. "TacFlex: Multi-Mode Tactile Imprints Simulation for Visuotactile Sensors with Coating Patterns." IEEE Transactions on Robotics (2025).

[4] Yuan, Wenzhen. Tactile measurement with a gelsight sensor. Diss. Massachusetts Institute of Technology, 2014.

[5] Gould, Phillip L., and Yuan Feng. Introduction to linear elasticity. Vol. 2. New York: Springer, 1994.

[6] Javanmardi, Yousef, et al. "Quantifying cell-generated forces: Poisson’s ratio matters." Communications physics 4.1 (2021): 237.

[7] Li, Mingxuan, et al. "EasyCalib: Simple and low-cost in-situ calibration for force reconstruction with vision-based tactile sensors." IEEE Robotics and Automation Letters (2024).

Comment

I appreciate the authors' efforts in preparing an extensive rebuttal. Taccel is an impressive work in tactile parallel simulation, advancing the state-of-the-art on IPC simulation, with comprehensive evaluations in simulation and validation on real-world tactile sensors. The rebuttal largely confirms my assessment and I will up my score.

Comment

We are very grateful for your positive and encouraging feedback.

Final Decision

The paper presents Taccel, a high-performance GPU-based simulation platform for vision-based tactile sensors. By integrating IPC and ABD, the system achieves stable, scalable simulation of contact-rich interactions, supporting thousands of environments in parallel. The authors validate the framework across multiple tasks, including object classification, grasping, and articulated object manipulation, with promising sim-to-real results.

The reviews are overall positive (two Strong Accepts, one Accept, one Borderline Accept). The rebuttal addresses key concerns, including sensor generality, simulation under dynamic contacts, and comparisons with recent work such as TacEx. While some components build on existing techniques, the engineering effort and system-level contribution are solid. One reviewer raises concerns about novelty, but others emphasize the practical value and thorough validation.

On balance, this is a well-executed systems paper that will be of interest to the tactile robotics and simulation communities. The AC recommends acceptance.