PaperHub
Overall: 7.6/10 · Poster · 3 reviewers
Ratings: 5, 4, 5 (min 4, max 5, std 0.5) · Confidence: 4.0
Novelty: 2.7 · Quality: 3.0 · Clarity: 3.0 · Significance: 3.0
NeurIPS 2025

Bézier Splatting for Fast and Differentiable Vector Graphics Rendering

OpenReview · PDF
Submitted: 2025-05-11 · Updated: 2025-10-29
TL;DR

We propose Bézier Splatting, a new differentiable vector graphics representation that achieves an order-of-magnitude computational speedup compared to the state-of-the-art methods.

Abstract

Keywords

differentiable vector graphics, Gaussian splatting

Reviews and Discussion

Official Review
Rating: 5

This work presents Bézier Splatting, a new differentiable vector graphics representation that optimizes Bézier curves through Gaussian splatting-based rasterization. The method provides direct positional gradients at object boundaries and is over 150× faster than DiffVG. Overall, given a raster image, the authors efficiently vectorize the input into a VG representation that closely resembles the input while preserving its details.

Strengths and Weaknesses

Strengths:

  1. Differentiable vector graphics rendering makes the representation usable for machine learning.
  2. Using a 2D Gaussian representation, it provides direct positional gradients and is 150× faster than the previous state of the art (DiffVG).
  3. Better vectorization results, both qualitatively and quantitatively.
  4. The vectorization results support flexible editing operations.

Weaknesses:

  1. Lacking an ablation study in Section 4.
  2. Lacking experiments on binary images and a comparison with traditional vectorization algorithms.

Questions

I wonder whether the number of curves and other parameters are changeable:

  1. In Line 150, is the use of 10 control points for each curve fixed or flexible?
  2. In Table 1, is it necessary to use so many curves? How is the best number of curves determined?
  3. Is it possible to make the result more compact, using fewer curves?

Limitations

Yes.

Final Justification

The authors' answer to Weakness 2 seems solid. I have decided to keep my score.

Formatting Issues

None

Author Response

We sincerely thank reviewer qWw9 for the valuable comments and suggestions, which have helped us improve the clarity and quality of the paper. We also thank the reviewer for pointing out that our work can benefit downstream machine learning applications, which is indeed one of the key motivations behind this paper. Our responses to the weaknesses and questions are as follows.

Weakness 1: Lacking an ablation study in Section 4.

Thank you for pointing this out. Due to the page limit, we provide the ablation studies in the supplemental material. Specifically, they include:

  • Tab. 3 studies the effectiveness of adaptive pruning and densification, demonstrating how they improve reconstruction quality and avoid local minima.
  • Figs. 8-16 show the comparison across different numbers of Bézier curves, visually illustrating how increasing the number of curves progressively recovers more image details.
  • Tab. 4 conducts a systematic analysis of method efficiency, including the impact of varying the number of Gaussian samples per curve in closed regions.
  • Tab. 5 shows an ablation study of the multi-opacity scheme for open curves, comparing the use of a single opacity vs. three opacities for each Bézier segment of an open curve.
  • Fig. 7 visualizes the results of our layer-wise vectorization strategy, demonstrating its adaptability with improvement techniques developed for DiffVG [1].

Weakness 2: Lacking experiments on binary images and a comparison with traditional vectorization algorithms.

Differentiable VG rendering methods, starting from DiffVG, generally struggle with complex binary images. We conjecture this is because the pixel gradients in the binary space are inherently sparse. Differentiable methods often rely on anti-aliasing (DiffVG-based) or Gaussian splatting (ours) to compute stable positional gradients in the color image domain, but these approaches are not well suited to the extreme case of binary images. In addition, the assumptions about opacity and overlaps between curves made by differentiable methods are not applicable to binary images. Since most recent works do not evaluate on binary images and focus primarily on performance in the color image domain, we also did not include binary image experiments in our comparisons. To better handle binary image representations, we could leverage a clustering-based method to generate guidance that helps determine the positions and shapes of curves. In addition, we have verified that our method can be combined with LIVSS [2], which uses Segment Anything [3] masks to initialize Bézier curves and then optimizes their shapes using the masks as supervision signals. This further demonstrates that our method can operate on binary images. Our method combined with LIVSS [2] will also be released together with our current codebase.

Regarding the comparison with traditional vectorization algorithms, we conducted a comparison using the Image Trace feature of Adobe Illustrator [4] on several images from the DIV2K dataset. Both methods were evaluated with the same number of parameters (~20K). As shown in the table below, our method achieves higher fidelity, with an average PSNR of 25.734 compared to 24.904 for Image Trace.

| Image ID | SSIM↑ (Adobe) | SSIM↑ (Ours) | PSNR↑ (Adobe) | PSNR↑ (Ours) | LPIPS↓ (Adobe) | LPIPS↓ (Ours) |
|---|---|---|---|---|---|---|
| 0004 | 0.835 | 0.849 | 26.43 | 28.62 | 0.380 | 0.387 |
| 0008 | 0.645 | 0.637 | 23.26 | 24.33 | 0.489 | 0.489 |
| 0012 | 0.658 | 0.658 | 22.29 | 22.87 | 0.523 | 0.518 |
| 0016 | 0.627 | 0.628 | 23.87 | 24.54 | 0.534 | 0.543 |
| 0020 | 0.814 | 0.822 | 26.67 | 28.31 | 0.505 | 0.503 |

Question 1: In Line 150, is the use of 10 control points for each curve fixed or flexible?

The design is flexible and supports arbitrary Bézier degrees and any number of connected segments. The number of control points is determined by the Bézier curve degree and the number of connected segments. In our implementation, each open curve is composed of three connected cubic Bézier segments, each defined by four control points. Due to continuity between segments, adjacent segments share one control point, resulting in a total of 10 control points per open curve (i.e., 3×3+1). For closed curves, our framework only requires that the number of segments be even, which facilitates efficient and consistent interior sampling during rasterization.
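
For illustration, here is a minimal Python sketch (our own, not the authors' code) of evaluating such a composite open curve; the function name and the (3K+1, 2) control-point layout are assumptions made for this example:

```python
import numpy as np

def eval_open_curve(ctrl, t):
    # Hypothetical sketch: evaluate an open curve built from K connected
    # cubic Bezier segments that share endpoints, giving 3*K + 1 control
    # points in total (K = 3 yields the paper's 10 control points).
    # ctrl: (3*K + 1, 2) array of control points; t: scalar in [0, 1].
    K = (len(ctrl) - 1) // 3
    s = min(int(t * K), K - 1)   # index of the segment containing t
    u = t * K - s                # local parameter within that segment
    p0, p1, p2, p3 = ctrl[3 * s : 3 * s + 4]
    return ((1 - u) ** 3 * p0 + 3 * (1 - u) ** 2 * u * p1
            + 3 * (1 - u) * u ** 2 * p2 + u ** 3 * p3)
```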

Question 2: In Table 1, is it necessary to use so many curves? How is the best number of curves determined?

Thank you for the question. In general, high-resolution images (e.g., DIV2K_HR at 2K resolution) require a relatively large number of vector primitives (typically 1K–2K Bézier curves) to preserve fine-grained visual details. The best number of curves depends on the user's goal: fewer curves yield more abstract or stylistic representations, while more curves are needed for faithful reconstruction of complex textures. Moreover, the optimal number of curves can vary significantly depending on the image content. For example, highly textured images naturally require more curves than flat or homogeneous ones to achieve similar PSNR or perceptual quality. In practice, determining the best number of curves could also benefit from external guidance mechanisms. For instance, LIVSS [2] uses SAM [3] to provide semantically meaningful curve initialization with fewer but more informative paths. Our method is also compatible with this mechanism.

Question 3: Is it possible to make the result more compact, using fewer curves?

We have shown more compact results (32 to 512 curves) in Figure 7 of the supplement. To achieve more compact results, we can combine our method with the layer-wise training strategy, as shown in Section 4.3 and supplement Table 3. To further ensure better semantic consistency when using fewer curves, we could combine our method with LIVSS [2], which uses SAM [3] to provide semantically meaningful curve initialization, at a higher computational efficiency than the original LIVSS+DiffVG.

References

[1] Tzu-Mao Li, Michal Lukáč, Michaël Gharbi, and Jonathan Ragan-Kelley. Differentiable vector graphics rasterization for editing and learning. ACM Trans. Graph. (Proc. SIGGRAPH Asia), 39(6):193:1–193:15, 2020.

[2] Wang, Z., Huang, J., Sun, Z., Gong, Y., Cohen-Or, D., & Lu, M. (2025). Layered image vectorization via semantic simplification. In Proceedings of the Computer Vision and Pattern Recognition Conference (pp. 7728-7738).

[3] Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., ... & Girshick, R. (2023). Segment anything. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 4015-4026).

[4] Adobe Illustrator.

Comment

I thank the authors for the rebuttal. I believe this is good work with adequate experiments.

Comment

We sincerely thank the reviewer for the positive feedback and for recognizing the value of our work and experiments. We appreciate your time and effort in reviewing our paper.

Official Review
Rating: 4

This paper proposes a curve splatting method with 2D Gaussian rendering. Both open and closed curve representations are converted to sets of Gaussians, and those Gaussians are rendered via differentiable rasterization, enabling back-propagation from the image loss to the Bézier curve parameters. Three pruning conditions discard redundant or inefficient curves and relocate them to regions that need more curves.

Strengths and Weaknesses

The paper's strengths are as follows:

  • The curve-to-Gaussians conversion achieves fast rendering speed compared to the state-of-the-art method.

  • The proposed splatting method achieves state-of-the-art results for open curves.

There are the following weak points:

  • It would be great to add some figures about open and closed curves with notations.

  • A 2D Gaussian splat has a round shape, which poses a problem in representing rectangular shapes. As a result, the Gaussian splat representation requires dense sampling, since it is a discrete representation.

  • The sampling resolution for Gaussians is not mentioned in this paper. This is crucial, as the proposed method aims to reconstruct vector graphics at any image scale. The paper should discuss and examine how dense the Gaussian sampling should be with respect to the image resolution.

  • Lines 170 and 175 seem duplicated.

  • Which probability density function has been integrated into the cumulative distribution function mentioned at Line 170?

  • 2D grid sampling for closed curves suffers from non-uniform sampling near the starting and ending points. This increases the rendering time of the Gaussians without any quality gain.

  • The proposed curve pruning method should be explained in more detail. The current explanation looks vague; the overlap threshold at Line 266 and the AABB are not mentioned in the “pruning and densification” paragraph.

  • Although the rendering time becomes faster, there is no significant improvement in quality.

Questions

I have the following questions and suggestions:

  • Does 2D grid Gaussian sampling for closed curves suffer from over-sampling near the starting and ending points? How does it affect the rendering performance, and how could we mitigate this problem?

  • Can the current method handle images of various scales without changing sampling parameters? If not, the relation between sampling resolution and image size should be discussed.

  • What is the reconstruction policy for curve splatting about depth?

  • Please clarify the pruning and densification process.

  • For the offline method, the gain in quality is more important than the speed gain. Is there any quality improvement against the state-of-the-art method?

Limitations

Yes.

Final Justification

After the rebuttal, the authors resolved my concerns with solid answers. I will change my stance to accept this paper.

Formatting Issues

No.

Author Response

We sincerely thank reviewer ZXL5 for the detailed comments and suggestions.

Weakness 1: It would be great to add some figures about open and closed curves with notations.

Figures 1, 3, and 4 of the paper include both open and closed curves with corresponding open/closed notations. We will add additional notations to Figures 5 and 6 in the revised version for clarity. Furthermore, Figures 7–16 in the supplementary material provide more examples with notations.

Weakness 2: A 2D Gaussian splat is round in shape and requires dense sampling.

We agree that Bézier splatting, similar to 2D Gaussian splatting, requires dense sampling to accurately capture complex geometry. In our implementation, each individual Bézier curve is sampled with 64 points. However, despite this relatively dense sampling, our method achieves highly efficient rasterization, as shown in Table 1: 4.5 ms for open curves and 14.1 ms for closed curves, significantly faster than DiffVG (141.0 ms and 85.4 ms, respectively). Furthermore, our approach yields much faster backpropagation times (4.7 ms and 24.6 ms) compared to DiffVG (701.3 ms and 225.8 ms). These results demonstrate that our method strikes an effective balance between geometric expressiveness and computational efficiency.

Weakness 3 and Question 2: The paper should discuss Gaussian sampling density vs. image resolution. Can the method work across any scale?

The density of Gaussian sampling is not directly related to image resolution. It depends on the geometric complexity of the current curves and the size of the region they represent. In general, more complex and larger curves require denser sampling. To ensure that our method can adapt to the vast majority of datasets, we adopted relatively dense sampling parameters (64). Varying the per-segment sampling resolution (32, 64, 128) improves PSNR by less than 0.1. Regarding image resolution, the number of curves has a greater impact on reconstruction quality, as images with more complex textures generally require more Bézier curves to represent.

Our method can handle images of arbitrary resolution without any modification. One of the key advantages of vector graphics is their resolution-free representation, meaning they can be rendered at any target resolution without quality loss. In addition, because our method is fully differentiable, we render the vector graphics at the same resolution as the target image to compute the loss and backpropagate accurate gradients, so our method inherently supports arbitrary resolutions. We conducted experiments on datasets covering a variety of resolutions, including high-resolution photographs (e.g., DIV2K_HR, Kodak) and low-resolution animation images (e.g., DanbooRegion), using the same sampling number (64).

Weakness 4: Lines 170 and 175 seem duplicated.

Thank you for pointing this out. We will correct it in the camera-ready version and carefully review the entire paper to revise any remaining typos.

Weakness 5: Which probability density function is used for the CDF?

We use a normal distribution with mean=0 and standard deviation=0.85 for the CDF-based strategy to sample interpolated Bézier curve positions. This results in smaller Gaussian scales near the boundaries, which helps prevent unwanted influence beyond the true boundaries.
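
A minimal sketch of such a CDF-based spacing, assuming the abscissae span [-1, 1] (the exact range, names, and implementation details are our assumptions, not taken from the paper):

```python
import numpy as np
from scipy.stats import norm

def cdf_spaced_params(n_samples=64, std=0.85):
    # Hypothetical sketch: push evenly spaced abscissae through the CDF
    # of N(0, std). The density is lower toward the ends of the assumed
    # [-1, 1] range, so consecutive t values cluster near 0 and 1;
    # Gaussian scales derived from neighbor distances therefore shrink
    # near the curve boundaries, as described above.
    x = np.linspace(-1.0, 1.0, n_samples)
    t = norm.cdf(x, loc=0.0, scale=std)
    return (t - t[0]) / (t[-1] - t[0])  # renormalize to exactly [0, 1]
```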

Weakness 6: Non-uniform sampling near the starting and ending points.

Although 2D grid sampling may introduce some non-uniformity near the start and end of closed curves, we adopt this strategy to ensure that all curves have the same number of sampling points and a consistent structure. This design enables efficient matrix operations for computing Gaussian scales, rotations, and region-based sampling, all of which benefit greatly from GPU parallelism. In contrast, adaptive sampling strategies, while potentially reducing redundancy near curve endpoints, lead to varying numbers of samples per curve and make it harder to utilize GPU acceleration effectively.
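
A minimal sketch of one way such a fixed-size interior grid can be built; the opposite-point pairing is our assumption for illustration, not necessarily the paper's exact scheme (the default of 40 interpolated lines follows the rebuttal's recommendation below):

```python
import numpy as np

def grid_sample_interior(boundary, n_lines=40):
    # Hypothetical sketch of consistent 2D grid sampling inside a closed
    # curve. `boundary` is an (S, 2) array of points sampled along the
    # closed curve; an even S lets us pair each point in the first half
    # with its opposite point and interpolate n_lines samples between
    # each pair, so every curve yields the same (S/2, n_lines, 2) grid
    # and a whole batch of curves can share one matrix operation.
    S = boundary.shape[0]
    assert S % 2 == 0, "closed curves use an even number of segments"
    a = boundary[: S // 2]            # first half of the boundary
    b = boundary[S // 2 :][::-1]      # opposite half, reversed to align
    t = np.linspace(0.0, 1.0, n_lines)[None, :, None]
    return a[:, None, :] * (1.0 - t) + b[:, None, :] * t
```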

Weakness 7 and Question 4: The pruning and densification process should be explained in more detail.

Thank you for the helpful suggestion. We have outlined the pruning and densification process in three steps in the paper. We will add an algorithm description (as shown below) to the supplement. The relevant hyperparameters (e.g., the overlap threshold and AABB pruning) are currently listed in the “Implementation Details” section, following the standard practice in deep learning papers of placing all hyperparameters there. We will update the “Pruning and Densification” paragraph to explicitly mention these hyperparameters for better clarity.


Algorithm of Pruning and Densification for Closed Curves

Definitions:

  • opacity(b): opacity of curve b.
  • area(b): area of closed curve b.
  • colordiff(b_i, b_j): color difference between curves b_i and b_j.
  • IoU(b_i): intersection-over-union between b_i's bounding box and those of all other curves.
  • ConnectedComponents(E): extracts connected error regions by quantizing the error map.

Input:

  • Closed Bézier curve set B = {b_1, b_2, ..., b_N}
  • Error map E, optimization iteration t

Output:

  • Updated curve set B'

Initialize B' ← B

// Pruning
for each curve b_i, i = 1 to N do
  if opacity(b_i) < τ_opacity(t) then
    remove b_i from B'  // low-opacity pruning

  if area(b_i) < τ_area then
    remove b_i from B'  // small-area pruning

  if IoU(b_i, {b_j ∈ B | colordiff(b_i, b_j) < 0.03}) > 0.9 then
    remove b_i from B'  // redundant-overlap pruning
end for

// Densification
R ← ConnectedComponents(E)
R_sorted ← SortByArea(R)

for each region r_j ∈ R_sorted do
  if the curve budget allows then
    insert a new closed Bézier curve initialized from r_j into B'
end for

Return: B'


For open curves, we add a splitting rule: if the middle opacity is more than 0.5 lower than both ends, the curve is split to better fit the region.

Weakness 8 and Question 5: Although rendering is faster, quality improvement is not significant.

For differentiable rendering methods, efficiency is critically important, as it directly affects the scalability of deep learning-based applications such as feedforward image-to-SVG and text-to-SVG generation. These neural networks typically require fast forward and backward passes, but the existing differentiable VG rasterizer (DiffVG) is very slow, often taking over 700 ms per iteration (compared to a network inference time of <100 ms per iteration), which makes training on large datasets impractical. For optimization-based methods like LIVSS that initialize compact vectors from Segment Anything, replacing their rasterization backend (DiffVG) with ours reduces the time to reach a semantically meaningful result from ~45 s to 2–3 s, which greatly improves the user experience.

A similar case is 3D Gaussian Splatting (3DGS), which was proposed for offline tasks like novel view synthesis. However, the differentiable rendering of 3DGS has enabled a wide range of downstream applications, including feedforward 3D reconstruction (e.g., LSM, AnySplat), text-to-3D generation (e.g., DreamFusion), and the transfer of 3D knowledge into 2D foundation models (e.g., Improving 2D Feature Representations by 3D-Aware Fine-Tuning). Likewise, we hope our fast and differentiable vector graphics representation can open up similar opportunities in the vector graphics domain.

Rather than only focusing on higher PSNR, we want readers to appreciate the high efficiency of our approach. There are several ways to further improve the PSNR of our method. For example, increasing the number of interpolated lines within closed curves can improve PSNR (see the table below), and using the layer-wise strategy can also further improve PSNR (Table 3). Using 80 lines improves PSNR to 23.93, nearly 1 point higher than DiffVG (22.95), at the cost of doubling the rendering time. However, for the best trade-off between efficiency and quality, we recommend using 40 lines as the default.

| Interp. lines | SSIM↑ | PSNR↑ | LPIPS↓ | FPS↑ |
|---|---|---|---|---|
| 20 | 0.636 | 23.15 | 0.504 | 103.45 |
| 40 | 0.639 | 23.45 | 0.507 | 68.30 |
| 80 | 0.654 | 23.93 | 0.495 | 37.90 |

Question 1: Does 2D grid Gaussian sampling for closed curves suffer from over-sampling near the starting and ending points?

No, the 2D grid sampling strategy does not negatively affect rendering performance. Although the sampling points are denser near the starting and ending positions of closed curves, this does not lead to over-sampling artifacts. This is because the scale of each 2D Gaussian is computed based on the distance to its neighboring samples, which naturally balances the density during rendering. As a result, denser sampling in certain areas does not increase visual artifacts. Extensive qualitative results are provided in the supplementary material, with no noticeable artifacts near the start or end points of closed curves.

Furthermore, as shown in Section 4.4, our representation can be fully converted into standard XML-based SVG files and rendered using traditional vector graphics engines, which also confirms the absence of sampling-related visual issues.

Question 3: What is the reconstruction policy for curve splatting about depth?

As mentioned in Line 156, the depth is determined by the spatial area of the curve: a larger area indicates a larger depth. For open curves, it is computed as length × width; for closed curves, we use the area of the rotated AABB.
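
A minimal sketch of this area-to-depth policy (hypothetical helper, illustrative only):

```python
import numpy as np

def curve_depths(areas):
    # Hypothetical sketch: larger area -> larger depth (rendered further
    # back), so small curves sit in front. Per the answer above, `areas`
    # would be length * width for open curves and the rotated-AABB area
    # for closed curves.
    areas = np.asarray(areas)
    order = np.argsort(areas)             # small areas first (front)
    depth = np.empty_like(order)
    depth[order] = np.arange(len(areas))  # rank = depth layer
    return depth
```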

Comment

I thank the authors for the rebuttal and add some comments on it below.

Weakness 1: It would be great to add some figures about open and closed curves with notations.

Comments: I meant the definitions of both curve types. I thought an open curve is a curved line and a closed curve is a warped shape filled with color.

Weakness 2: A 2D Gaussian splat is round in shape and requires dense sampling.

Comments: Thank you for the answer. It would be great to introduce the pros and cons of sampling resolution.

Weakness 3 and Question 2: The paper should discuss Gaussian sampling density vs. image resolution. Can the method work across any scale?

Comments: Though vector graphics are resolution-free, the proposed method needs a discretization of curves and a fixed initialization for the curve length and the number of curves. This means that if the image resolution increases, discretization error emerges, and the optimization falls into a different result. Is there no artifact when we zoom into the curves 10 times?

Weakness 6: Non-uniform sampling near the starting and ending points.

Comments: This means some pixels are occupied by many Gaussians, which raises race conditions in overlapping regions. How can this be addressed while keeping parallelism?

Weakness 7 and Question 4: The pruning and densification process should be explained in more detail.

Comments: Thank you.

Question 3: What is the reconstruction policy for curve splatting about depth?

Comments: This area-wise depth ordering causes discontinuities when curves have similar areas, making the optimization process unstable. If the proposed method does not account for the difference when the depth order is swapped, the optimization does not converge to the global minimum.

Comment

Thank you for the insightful comments. We hope our explanation addresses your concerns.

Weakness 1, Comments: The definition of both curves.

Thank you for the suggestion; we will add a figure illustrating the concept of open and closed curves in the final version.

Weakness 2, Comments: It would be great to introduce the pros and cons of sampling resolution.

We will add our discussion about sampling resolution to the final version to better demonstrate that our method can strike an effective balance between geometric expressiveness and computational efficiency.

Weakness 3 and Question 2, Comments: If the image resolution increases, discretization error will emerge. Is there no artifact when we zoom into the curves 10 times?

  • For optimization: Our optimization relies on discrete image representations and Gaussian sampling for loss computation, while maintaining accurate gradient estimation by leveraging continuous Gaussian functions. This continuous nature is similar to DiffVG, which computes gradients from discrete pixels near object boundaries using a continuous kernel function. Thanks to the continuous formulation, curve optimization remains robust across image resolutions. We applied our method to datasets of varying resolutions and observed no noticeable artifacts.

  • For rendering: Our method naturally supports an adaptive sampling strategy, as all Gaussian parameters are derived from the underlying Bézier curve parameters. Assuming the curves are optimized at 2K resolution, we can directly render 4K, 8K, or even higher-resolution images by simply increasing the sampling density (doubling the resolution doubles the sampling rate) to preserve visual quality and avoid artifacts. To evaluate this, we rendered images at 2× and 4× higher resolutions; no noticeable artifacts were observed. We computed PSNR against the corresponding upscaled ground-truth images. The table below uses 512 curves and reports results on the first four images from the DIV2K evaluation split. A sampling rate of 64 is sufficient for 4K rendering, and our method can adaptively increase the sampling density for resolutions beyond 8K to prevent artifacts.

Per-Image PSNR

| Image ID | 2K (original) | 4K (2× sampling) | 8K (4× sampling) | 4K (original sampling) | 8K (original sampling) |
|---|---|---|---|---|---|
| 0004 | 26.8976 | 27.2761 | 26.6594 | 27.1553 | 25.9800 |
| 0008 | 21.2063 | 21.4340 | 21.1214 | 21.0614 | 19.8318 |
| 0012 | 19.8816 | 20.0900 | 19.7760 | 19.9079 | 18.9389 |
| 0016 | 22.3456 | 22.5408 | 22.3140 | 22.4327 | 21.7088 |

In addition, as shown in Section 4.4, our representation can be converted into standard XML-based SVG files and rendered by existing vector graphics engines at any resolution.

Weakness 6, Comments: Race conditions arise in overlapping regions; how can they be addressed while keeping parallelism?

In practice, race conditions in overlapping regions are not an issue in Gaussian splatting-based methods (e.g., 3D Gaussian Splatting) due to the carefully designed rendering pipeline. While it is true that some pixels may be influenced by many Gaussians, the rendering process handles this through two structured and parallelizable stages (see the sketch after the list):

  • Tile-based Gaussian projection: For each pixel, all potentially contributing Gaussians are identified and sorted by depth. This step is fully parallelized across image tiles and involves only read operations, thereby avoiding any race conditions.

  • Alpha Blending: For each pixel, the sorted Gaussians are blended in a strict front-to-back order. Although multiple Gaussians may contribute to the same pixel, the blending is performed sequentially within each pixel to preserve correctness, while parallelism is applied across different pixels.
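
A minimal per-pixel sketch of the second stage (our simplification in numpy-style Python; the real implementation is a tile-based CUDA kernel, and the function name is hypothetical):

```python
import numpy as np

def blend_pixel(colors, alphas):
    # Hypothetical sketch of stage 2 for a single pixel: `colors` (M, 3)
    # and `alphas` (M,) are the contributing Gaussians already sorted
    # front to back by stage 1. Blending is sequential within a pixel,
    # so there is no write contention; the rasterizer parallelizes
    # across pixels and image tiles instead.
    out = np.zeros(3)
    T = 1.0                       # accumulated transmittance
    for c, a in zip(colors, alphas):
        out += T * a * c
        T *= 1.0 - a
        if T < 1e-4:              # early stop once nearly opaque
            break
    return out
```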

Question 3, Comment: Area-wise depth ordering causes discontinuities between similar curves, making optimization unstable.

In image vectorization, users often expect layer-wise rendering, and our area-based depth strategy naturally aligns with this expectation. Moreover, we find that prioritizing smaller objects in front of larger ones also improves final performance.

In our method, depth values can be easily fixed at any stage of the optimization. We tried fixing them during the final 1,000 steps and found that this does not improve performance; in fact, it slightly reduces PSNR. Moreover, due to the small learning rate in the late optimization stages, depth-order swapping rarely occurs. Therefore, although our method supports fixing depth values to prevent swapping, we did not include this strategy in our pipeline, as it offers no practical benefit.

Official Review
Rating: 5

This work presents Bézier Splatting, a novel differentiable vector graphics (VG) representation that leverages Gaussian splatting for efficient Bézier curve optimization. The proposed method achieves faster forward and backward computation in rasterization compared to the baselines, while also delivering high rendering fidelity. Overall, the idea of this paper is interesting.

Strengths and Weaknesses

Strengths

  1. The authors propose a novel differentiable vector graphic representation, Bézier Splatting, which achieves 30× faster forward and 150× faster backward computation while producing high-quality rendering results.
  2. The authors introduce an adaptive pruning and densification strategy to improve the optimization process of Bézier curves by escaping the local minima of the spatial distributions of curves.
  3. Extensive experimental results demonstrate the effectiveness of the proposed method.

Weaknesses

  1. The reviewer wants more explanation about the statement "It helps the optimization process escape the local minima of current spatial distributions of curves". Is there any theoretical analysis that can support this argument?
  2. In the experiments, how are the hyperparameters $\lambda_1$ and $\lambda_2$ in Eq. (13) determined? Is it necessary to change the hyperparameters to fit different datasets?

Questions

Please see the weaknesses.

Limitations

Yes

Final Justification

I appreciate the authors’ rebuttal, which addresses most of my concerns. I have raised the score from 4 to 5.

Formatting Issues

NA

Author Response

We thank reviewer BJtc for the valuable comments and suggestions, which have helped us improve the clarity and quality of the paper. We have carefully addressed the identified weaknesses, and our responses are summarized as follows:

Weakness 1: The reviewer wants more explanation about the statement "It helps the optimization process escape the local minima of current spatial distributions of curves". Is there any theoretical analysis that can support this argument?

In this work, we provide an intuitive analysis of the optimization landscape: both 3D Gaussian Splatting (3DGS) and Bézier splatting suffer from local minima because suboptimal primitive (Gaussian/curve) initialization leaves some regions insufficiently covered. In 3DGS [1], this issue is alleviated by pruning large-volume and low-opacity Gaussians and splitting those with high gradients to densify underrepresented regions.

In image vectorization, smooth or homogeneous regions only need a few curves to be well reconstructed, whereas complex textures typically require many more. However, random initialization can again leave some regions underrepresented and cause the optimization to fall into local minima. Inspired by 3DGS [1], we propose a pruning and densification method for image vectorization that uses the error map as guidance to reallocate curves: removing redundant or low-impact ones and relocating them to regions where additional curves are needed.

Since our method is based on Gaussian alpha blending, the contribution of each Gaussian to the final color is defined by Equation (9):

$$C_n = \sum_{i \in M} c_i \alpha_i \prod_{j=1}^{i-1} (1 - \alpha_j)$$

where $c_i$ and $\alpha_i$ represent the color and opacity of the $i$-th Gaussian, respectively, and $M$ is the set of Gaussians along the compositing order. This formulation directly implies that Gaussians with very low opacity, or those heavily occluded by earlier Gaussians, have almost no impact on the final rendering.
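
A minimal numpy sketch (ours, illustrative only; the function name is hypothetical) of the contribution weights implied by Equation (9):

```python
import numpy as np

def contribution_weights(alphas):
    # Hypothetical sketch: per-Gaussian weights from Eq. (9),
    # w_i = alpha_i * prod_{j<i} (1 - alpha_j), for Gaussians sorted
    # along the compositing order. Near-zero weights flag Gaussians
    # (and hence curves) that barely affect the rendering and are
    # natural pruning candidates.
    alphas = np.asarray(alphas, dtype=float)
    T = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    return alphas * T
```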

Guided by this observation, during optimization we prune curves with extremely low opacity, very small area, or large overlap with nearby curves of similar color, as they are redundant and contribute little to the result. At the same time, we introduce new curves into regions with large reconstruction errors. This reallocation keeps the total number of curves fixed, improves overall reconstruction quality, and helps the optimization escape local minima caused by imperfect initial positions. Table 3 in the supplemental material demonstrates the effectiveness of our pruning and densification method, which improves the PSNR from 21.10 to 22.11.

Weakness 2: In the experiments, how are the hyperparameters $\lambda_1$ and $\lambda_2$ in Eq. (13) determined? Is it necessary to change the hyperparameters to fit different datasets?

The choice of the hyperparameters $\lambda_1$ and $\lambda_2$ in Equation (13) depends primarily on user preference and the specific characteristics of the target vector graphics. If users expect little or no self-intersection, or prefer smoother vector graphics, increasing the weight of the crossing loss $\lambda_2$ can be beneficial. However, this often comes at the cost of a slight reduction in PSNR. Setting the weights too high may lead to optimization failure, which aligns with observations reported in LIVE [2].

In our experiments, we consistently adopt the same set of hyperparameters ($\lambda_1 = 1.0$, $\lambda_2 = 0.01$). Empirically, this set performs well across different datasets, so it is generally not necessary to adjust $\lambda_1$ and $\lambda_2$ for different datasets.

References

[1] Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4), Article 139.

[2] Ma, X., Zhou, Y., Xu, X., Sun, B., Filev, V., Orlov, N., ... & Shi, H. (2022). Towards layer-wise image vectorization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 16314-16323).

Final Decision

The work presents a method for representing a bitmap image with a set of 2D Bézier curves. It is significantly faster than the state of the art and produces better-quality results. Overall, all three reviewers were happy with the submission.

(Formatting note: Section 3.1, "Overall" -> "Overview")