PaperHub
Rating: 6.3/10 · Poster · 3 reviewers
Scores: 3, 4, 3 (min 3, max 4, std 0.5)
ICML 2025

AtlasD: Automatic Local Symmetry Discovery

OpenReview · PDF
Submitted: 2025-01-24 · Updated: 2025-07-24
TL;DR

A symmetry discovery method capable of learning local transformations.

Abstract

Keywords

Local symmetry discovery, symmetry discovery, equivariance, gauge equivariant neural network, Lie theory

Reviews and Discussion

Official Review
Rating: 3

The paper proposes a novel pipeline to discover symmetries in a dataset and then employ them to enforce the correct inductive bias in a machine learning model. In particular, the proposed method can discover not only global symmetries but also local symmetries by restricting attention to local patches of the input data via the charts in a user-defined atlas.

Questions for Authors

  • If a neural network is trained on a real dataset, it is typically never numerically equivariant unless the data contains sufficient data augmentation. At that point, however, it is assumed the user already knows the underlying symmetry. I feel this aspect deserves more discussion in the main paper.

  • Sec. 4.2.1: sampling $\eta \sim N(0, I)$ will only enforce stability to (relatively) small transformations close to the identity, won't it? Doesn't it make more sense to sample uniformly from the group?

  • Sec 4.2.2: if I understand correctly, the method uses a simple gradient descent strategy in a rather low-dimensional space (the parameter space of the $C_l$ matrices) to find all minima of a probably very non-convex function. Why do you expect this solution to work well for finding the discrete symmetries?

  • Sec 4.2.2: even if $C_i C_j^{-1}$ is not in the identity component, they might still be redundant, e.g. if $C_i$ is a power of $C_j$. Also, how is the minimization performed? Is the objective convex, such that a simple gradient descent can be expected to converge? I am not sure it is possible to claim that the filtration process produces a list of unique representatives of the cosets. Note also that this is probably not a problem in these experiments, since only order-2 discrete groups have been considered (such as the flip group and the parity group). You should experiment with other discrete symmetries, e.g. $C_4$ or $D_4$ rotations.

  • 5.2: what happens if you seed with more than one Lie algebra generator? Do you still find a single relevant generator?

  • 5.3: assigning a chart to the region of each digit seems a bit like cheating, since the user is implicitly pointing to the regions (and restricting attention to only those regions) where they know there are exact local symmetries. Did you consider other choices of charts?

Claims and Evidence

See comments below

Methods and Evaluation Criteria

See Experimental Designs or Analysis.

Theoretical Claims

I didn't check the theoretical proofs in detail.

Experimental Designs or Analysis

The evaluation criteria seem suitable for evaluating the proposed idea, although I think there is room for improvement in some experiments.

  • Sec. 5.2: what happens if you seed with more than one Lie algebra generator? Do you still find a single relevant generator?

  • Sec 5.4: isn't the choice of isotropic filters making the model equivariant to $GL(2)$? If so, this seems to imply you ignored the discovered $GL^+(2)$ symmetry and instead implemented a model equivariant to all possible transformations considered in the initial search space. Then I am not sure this experiment supports the benefit of the symmetry discovery method. That being said, I agree on the difficulty of implementing a $GL^+(2)$-equivariant model, but I think this suggests it is not the best task to evaluate the proposed method.

  • Why not consider the benchmark of fluid simulation from (Wang et al., 2022), which already compares many learnable equivariance methods in the literature? This could provide a simple and effective way to compare with most of the previous literature. This dataset seems to also feature local rotational symmetries (similar to the PDE dataset in this manuscript) and broken global symmetries due to boundary conditions.

  • All experiments consider at most order-2 discrete groups (such as the flip group and the parity group). I suspect this significantly limits the complexity of discovering discrete symmetries (see Questions for Authors); it would be interesting to experiment with tasks featuring larger discrete symmetries, e.g. $C_4$ or $D_4$ rotations.

Wang et al., 2022, Approximately equivariant networks for imperfectly symmetric dynamics

Supplementary Material

I quickly reviewed all the supplementary materials but I might have missed some details.

Relation to Broader Scientific Literature

I think this is a relevant novel work.

The paper claims previous works on symmetry discovery mostly focused on global symmetries. However, there are a few previous works which considered learning local symmetries by leveraging the idea of symmetry breaking at different scales in CNNs with layer-wise learnable equivariance. See for example:

Romero & Lohit, 2022, Learning Partial Equivariances from Data

Veefkind & Cesa, 2024, A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs

Essential References Not Discussed

I think the manuscript is missing a few relevant citations from the literature on learnable equivariance and approximate equivariance. Below are a few examples:

Finzi, M., Benton, G., and Wilson, A. G. Residual pathway priors for soft equivariance constraints.

Wang, R., Walters, R., and Yu, R. Approximately equivariant networks for imperfectly symmetric dynamics

Wang, D., Zhu, X., Park, J. Y., Platt, R., and Walters, R. A general theory of correct, incorrect, and extrinsic equivariance

van der Ouderaa, T., Romero, D. W., and van der Wilk, M. Relaxing equivariance constraints with non-stationary continuous filters.

van der Ouderaa, T. F., Immer, A., and van der Wilk, M. Learning layer-wise equivariances automatically using gradients

Petrache, M. and Trivedi, S. Approximation-generalization trade-offs under (approximate) group equivariance

Veefkind, L., and Cesa, G. A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs

van der Ouderaa et al., Noether's Razor: Learning Conserved Quantities

Other Strengths and Weaknesses

Strengths.

The atlas approach to local symmetries is novel and the idea proposed is interesting. The experimental analysis shows the potential benefits of the proposed method in different settings.

Weaknesses.

The paper misses the comparison with some previous works and the experimental analysis has a few issues. I am happy to raise my score if the authors address these concerns.

Other Comments or Suggestions

  • In Sec. 4.2.1, it is not very clear to the reader what kind of functions the $\Phi_c$ should be at this point of the manuscript. It might be worth adding some concrete examples earlier in the manuscript.

  • Eq. 3: why not just train the basis while enforcing its elements to be orthogonal to each other? One would only need to enforce the matrix $B$ (containing the vectorized $B_i$ in its columns) to be an orthogonal matrix, which you can easily do via SVD, no?

  • Sec. 4.2.2: page 5, second column, "This implies we only need to consider transformations whose determinant has absolute value 1." Why is that the case?

  • Can you make Fig 4 clearer? The colorbars are hardly readable. Also, the numbering of the generators in the caption is ambiguous; maybe add the numbers in the figure too. Also, what is the pink and green heatmap exactly? It seems that Fig 4 includes different types of matrices (continuous generators, some metric, and the discrete generators) but puts them all together with no explanation in the image.

  • What are the $\Psi_c$ functions used for the symmetry discovery algorithm in Sec. 5.1? Do you just train a normal neural network first, discover the symmetry, and then train a new network equivariant to the new symmetries? I think more details about this could be included in the main paper rather than just the appendix.

  • 5.4: why so few charts? Also, the width of the charts seems a very important parameter here for managing the scale of the local symmetries. How is it chosen?

Author Response

The manuscript is missing a few relevant citations

We appreciate the suggestion for additional citations. The mentioned works on approximate equivariance modify equivariant architectures to handle situations where no perfect global symmetry is present. These works are indeed relevant and we will be sure to include them in our revised manuscript.

The fluid simulation benchmark also appears relevant. We have started work towards an experiment, but are currently addressing challenges related to the chaotic and non-local nature of the dataset.

Eq. 3 Why not train the basis using SVD?

SVD achieves orthogonality but is not suitable for encouraging sparsity. Our standard basis regularization explicitly promotes sparsity by penalizing shared non-zero terms among the generators.
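To make the trade-off concrete, here is a small sketch (our illustration, not the paper's implementation; `nearest_orthogonal` and `shared_support_penalty` are hypothetical names): SVD projects a basis onto the nearest orthonormal one, but the result need not be sparse, whereas a penalty on shared non-zero entries directly rewards sparsity.

```python
import numpy as np

def nearest_orthogonal(B):
    # Project the columns of B onto the nearest orthonormal set via SVD
    # (the polar-decomposition trick suggested by the reviewer).
    U, _, Vt = np.linalg.svd(B, full_matrices=False)
    return U @ Vt

def shared_support_penalty(generators):
    # Hypothetical sparsity regularizer: penalize entries that are
    # simultaneously non-zero across different generators.
    total = 0.0
    for i in range(len(generators)):
        for j in range(i + 1, len(generators)):
            total += np.sum(np.abs(generators[i]) * np.abs(generators[j]))
    return total

# SVD gives orthonormal columns...
B = np.array([[3., 1.], [1., 2.], [0., 1.]])
Q = nearest_orthogonal(B)
print(np.allclose(Q.T @ Q, np.eye(2)))  # True

# ...but orthogonality alone does not yield sparse, interpretable generators:
rotation = np.array([[0., -1.], [1., 0.]])  # support disjoint from scaling
scaling = np.eye(2)
print(shared_support_penalty([rotation, scaling]))  # 0.0: no shared non-zeros
```

The penalty is zero exactly when the generators have disjoint supports, which is what makes each generator easy to read off as a single physical action.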

“This implies we only need to consider transformations whose determinant has absolute value 1.” Why?

When the component group of a matrix Lie group $G$ is finite, we may find a finite subgroup $H$ that contains at least one element from each connected component [1]. As $H$ is a finite subgroup of $GL(n)$, for each $h \in H$, $h^{|H|} = I_n$, so $(\det h)^{|H|} = 1$ and $|\det h| = 1$. Hence, each coset has at least one representative whose determinant has absolute value 1.
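The group-theoretic argument above is easy to check numerically; this snippet (our illustration, not from the paper) verifies it on two finite-order elements of $GL(2)$.

```python
import numpy as np

# If h has finite order m (h^m = I), then (det h)^m = det(h^m) = 1,
# so |det h| = 1. Check on two elements of the D4 subgroup of GL(2):
rot90 = np.array([[0., -1.], [1., 0.]])  # order 4, det = +1
flip = np.array([[1., 0.], [0., -1.]])   # order 2, det = -1

for h, order in [(rot90, 4), (flip, 2)]:
    assert np.allclose(np.linalg.matrix_power(h, order), np.eye(2))
    assert np.isclose(abs(np.linalg.det(h)), 1.0)
print("ok")
```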

Can you make Fig 4 clearer?

Currently, the subfigures are in row-major order. The pink and green heatmap is the computed invariant metric tensor, using the methodology by [2]. We will make the numbering clearer and split this figure into multiple subfigures.

What are the $\Phi_c$ functions in Sec. 4.2.1/5.1?

The description you provided is correct. We will be sure to include more details and examples in the main paper.

5.4 Why so few charts?

The chart size was primarily constrained by the fact that when small charts were used, the atmospheric river class would take up the entirety of the chart. So we instead chose to use a smaller number of larger charts.

If a network is trained on a real dataset, it is typically not equivariant

This is a problem with global symmetry discovery when datasets are canonicalized (e.g. rotation of images removed). It is uncommon to perform such canonicalization at the local scale. Thus, our predictor networks will not face this issue and remain equivariant.

4.2.1 Sampling $\eta \sim N(0,I)$ will only enforce stability to small transformations

Although there are many distributions to try, it is difficult to sample uniformly from the group, as the hypothesis space $\mathrm{GL}(n)$ is not compact. To sample more extreme transformations, one can increase the standard deviation. However, due to the unboundedness of the search space, the distribution must necessarily be biased towards the origin.
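A minimal sketch of this sampling scheme (our illustration; `mat_exp` and `sample_group_element` are hypothetical names, not the paper's API): transformations are drawn by exponentiating a Gaussian combination of Lie algebra generators, with the standard deviation controlling how far samples stray from the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

def mat_exp(A, terms=40):
    # Matrix exponential via truncated power series (adequate for small matrices).
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def sample_group_element(generators, std=1.0):
    # g = exp(sum_i eta_i * B_i) with eta ~ N(0, std^2 I): small std stays
    # near the identity, larger std reaches more extreme transformations,
    # but no Gaussian is uniform on a non-compact group such as GL(n).
    eta = rng.normal(0.0, std, size=len(generators))
    return mat_exp(sum(e * B for e, B in zip(eta, generators)))

rot_gen = np.array([[0., -1.], [1., 0.]])  # so(2) generator
g = sample_group_element([rot_gen], std=0.1)
print(np.isclose(np.linalg.det(g), 1.0))  # True: exp of so(2) is a rotation
```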

Why does the discrete discovery algorithm work?

The search space is relatively low-dimensional and is further reduced in dimensionality by our assumption that the component group is finite. Moreover, by setting $K$ to be significantly larger than the expected number of cosets, at least one representative likely converges toward each ground truth coset. This is confirmed experimentally in 5.1 and 5.2.

To verify that our discrete discovery pipeline is able to discover complex groups, we ask it to find the global symmetries of the function $f(x,y) = |x| + |y|$. The ground truth symmetry group is $D_4$ and our algorithm can recover all 8 elements. https://i.ibb.co/CKy1wvLc/output.png
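The ground truth in this check is easy to reproduce: the sketch below (ours, not the authors' code) enumerates the 8 elements of $D_4$ and confirms each leaves $f(x,y) = |x| + |y|$ invariant.

```python
import numpy as np

f = lambda v: abs(v[0]) + abs(v[1])

# Enumerate D4: rotations by multiples of 90 degrees, each optionally
# composed with a reflection.
rot90 = np.array([[0., -1.], [1., 0.]])
flip = np.array([[1., 0.], [0., -1.]])
d4 = [np.linalg.matrix_power(rot90, k) @ r
      for k in range(4) for r in (np.eye(2), flip)]

v = np.array([0.7, -1.3])
print(len(d4), all(np.isclose(f(g @ v), f(v)) for g in d4))  # 8 True
```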

4.2.2 Even if $C_i C_j^{-1}$ is not in the identity component, they might be redundant

Note that we claim to discover all the cosets, not just the generators of the component group. In practice, once one has all the elements of the component group, it’s easy for a human to identify the generators.

4.2.2 How is the minimization performed? Is the function convex?

The function is not globally convex (especially since $\exp$ is periodic in certain directions). Nevertheless, we have found gradient descent to be successful empirically and have yet to run into any issues. In case gradient descent fails, one can turn to higher-order methods like L-BFGS.
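The multi-restart strategy can be illustrated on a toy non-convex objective (our sketch, not the paper's loss): plain gradient descent from enough random initializations recovers every minimum, mirroring the choice of a large $K$.

```python
import numpy as np

# Toy non-convex, periodic loss with minima at t = k * pi / 2:
loss = lambda t: np.sin(2 * t) ** 2
grad = lambda t: 2 * np.sin(4 * t)  # d/dt sin^2(2t)

def descend(t, lr=0.05, steps=500):
    # Plain gradient descent from a single initialization.
    for _ in range(steps):
        t = t - lr * grad(t)
    return t

# Many restarts (analogous to K >> number of cosets); cos(2t) labels the
# two distinct minimum classes, t = 0 (mod pi) and t = pi/2 (mod pi).
rng = np.random.default_rng(0)
found = sorted({round(float(np.cos(2 * descend(t))), 2)
                for t in rng.uniform(0, np.pi, 20)})
print(found)  # [-1.0, 1.0]: both minimum classes are recovered
```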

5.2 What happens if you seed with more than 1 generator?

When we use multiple generators, the algorithm produces one rotational generator and one corresponding to a weak scaling. The rotational generator is still recognizable, though admittedly less accurate than when the algorithm is seeded with one generator.

5.3 Assigning a chart to the region of each digit seems unfair

We acknowledge the MNIST experiment has an idealized setting. Its purpose is to demonstrate the viability of the full pipeline and highlight the difference between local and global transformations. In more realistic scenarios (like our PDE or climate examples), one does not know a priori which charts exhibit local symmetries. Despite having less knowledge, our method still succeeds.

References

[1] A note on free subgroups in linear groups, Wang 1981.

[2] Generative Adversarial Symmetry Discovery, Yang et al. 2023.

Reviewer Comment

Thanks for the detailed answer. I still have a main concern:

This is a problem with global symmetry discovery when datasets are canonicalized (e.g. rotation of images removed). It is uncommon to perform such canonicalization at the local scale. Thus, our predictor networks will not face this issue and remain equivariant.

I disagree with this statement, since this problem is relevant even in symmetric datasets without any canonicalization. For example, it is very common to use rotation augmentation on typically rotation-symmetric datasets (e.g. histopathological images): even if the underlying data distribution is symmetric, a finite and small dataset rarely presents all rotated versions of each pattern; using explicit data augmentation is often fundamental for non-equivariant models to generalize properly (it is even used for SE(2)-equivariant models to mitigate the discretization artifacts of the pixel grid). With this in mind, I find the proposed argument about the model learning the symmetry by just training it on the dataset (whether of local or global patches is irrelevant for this argument) a bit weak: if that were true, there would be no need for either data augmentation or equivariance, which contradicts most of the previous literature. Note I am not claiming it is never possible to get insights about the symmetries by inspecting a model trained on a sufficiently large dataset; instead, I am highlighting a reasonable limitation of this approach which seems to be mostly ignored in the current manuscript.

Also, regarding Question 5.3 about charts for the MNIST dataset, I would at least mention this point explicitly in the manuscript.

Author Comment

Yes, you are correct that we assume the symmetry is fully present in the dataset for each chart. We will be sure to explicitly mention this assumption in our manuscript. Note that this is a limitation of symmetry discovery works in general, as LieGG similarly relies on a pretrained predictor and LieGAN requires the dataset distribution to be not fully canonicalized across any orbit (see their Assumption 3 in Appendix A). Fundamentally, if symmetry is not present in a dataset, either due to canonicalization or insufficient data samples, it does make symmetry discovery more challenging. We consider it a direction for future work to investigate how data distribution (along group action orbits) can affect the performance of symmetry discovery.

We will mention the idealistic setup of Section 5.3 in the manuscript as well.

Official Review
Rating: 4

This paper introduces AtlasD, a framework for discovering local symmetries, specifically atlas equivariance, within datasets. Atlas equivariance is a kind of gauge equivariance, where a global symmetry group $G$ acts differently in each local coordinate system. To identify such a $G$, AtlasD assumes a predefined atlas, trains local predictor networks (which learn $G$-equivariance implicitly from data) for each coordinate chart, and extracts the Lie algebra of $G$ that is commonly present across the local predictors. The proposed method also extends to discrete symmetries by identifying cosets of the identity component. Furthermore, the authors establish a theoretical connection between atlas equivariance and gauge symmetry, ultimately constructing a pipeline to extract local symmetries from data and integrate them into gauge-equivariant CNNs. The approach is validated across four different datasets.

Questions for Authors

Please see the Methods and Evaluation Criteria section for major questions and concerns. Some additional questions are:

  • The proposed method identifies the global symmetry group by aggregating multiple local predictor models. I am curious whether this approach enhances the robustness of symmetry discovery or makes it more susceptible to noise. From an ensemble perspective, it seems to improve robustness of discovered symmetries; however, if the local predictors are not well optimized, the discovered symmetry group might be unreliable. I would like to hear the authors' thoughts on this aspect.

  • The authors mention that standard basis regularization provides more interpretable results, albeit at the cost of a higher rate of duplicate generators, possibly in comparison to cosine similarity regularization. What does "interpretable results" mean in this context? Does it simply imply that the discovered symmetries are more consistent across different runs? Additionally, could you conduct an ablation study to compare the effectiveness of standard basis regularization and cosine regularization?

  • Algorithm 1, though it provides a compact summary of the proposed framework, is too vague. Providing more details on practical computational methods would enhance its comprehensibility.

Claims and Evidence

The proposed method is based on the strong assumption that a suitable atlas is known. Building on this assumption, the authors' claims are empirically supported as follows:

  • AtlasD detects local symmetries, defined as atlas equivariance (Definition 4.2), where global methods fail. This is empirically supported by the PDE experiment (Figure 15) and the MNIST-on-sphere experiment (Figure 8).

  • AtlasD identifies discrete symmetries (more specifically, a disconnected Lie group), including both positive- and negative-determinant cases. This is demonstrated in the top quark tagging experiment (Section 5.1), where AtlasD finds $O^+(1,3)$ vs. $SO^+(1,3)$ from LieGAN.

  • The extracted local symmetry provides a useful inductive bias that is compatible with gauge-equivariant neural networks: this claim is supported by several downstream tasks.

Methods and Evaluation Criteria

  • A major concern with this method is the assumption that a suitable atlas for the dataset is known. It is unclear whether this assumption is reasonable, and the authors should provide more empirical evidence on how different atlas choices impact the discovered symmetry. While this is partially addressed in the PDE example (19 charts vs. 3 charts), additional ablation studies would strengthen the validation, including:
  1. The effects of overly sparse or excessively dense (overlapping) charts.

  2. Performance under missing or incorrect (noisy) charts.

  3. Cases where local coordinates are embedded in higher-dimensional Euclidean spaces.

  4. Atlases constructed using data-driven approaches, such as chart auto-encoders [1,2].

[1] Schonsheck, S., Chen, J., & Lai, R. (2019). Chart auto-encoders for manifold structured data. arXiv preprint arXiv:1912.10094.

[2] Floryan, D., & Graham, M. D. (2022). Data-driven discovery of intrinsic dynamics. Nature Machine Intelligence, 4(12), 1113-1120.

  • Another concern is the computational complexity of the proposed method. Several crucial hyperparameters, such as the number of basis functions $k$ and the number of cosets for disconnected symmetries $K$, are determined heuristically by progressively reducing their values from high to low. This approach appears inefficient and may benefit from a more systematic or scalable selection strategy.

Theoretical Claims

This paper primarily explores applications of symmetry discovery but also presents an interesting theoretical connection between atlas equivariance and gauge-equivariant CNNs. I briefly reviewed the theorem and its proof and did not find any significant flaws.

Experimental Designs or Analysis

As mentioned in the Methods and Evaluation Criteria section, I believe the authors should provide a more rigorous ablation study regarding the selection of atlases. Please refer to this section for details. Apart from that, I am satisfied with the experimental procedure and benchmarks.

Supplementary Material

The supplementary material includes proofs, implementation details (such as regularizations used), experimental details, and a time complexity analysis of the proposed algorithm. I briefly reviewed the proofs and experimental details, which seem to adequately support the main manuscript.

Relation to Broader Scientific Literature

This paper is relevant to gauge theory in theoretical and high-energy physics. While the authors establish a connection between atlas equivariance and gauge-equivariant CNNs and present experimental results using a quark dataset, its focus remains within the realm of machine learning. If the authors illustrate the practical applications of gauge symmetry and local symmetry through examples from physics or other broader scientific fields in the introduction, the practical usefulness of this paper will be further emphasized.

Essential References Not Discussed

This paper provides a structured overview of relevant works on equivariant neural networks and automatic symmetry discovery. However, I believe the authors would benefit from discussing the representation learning literature, for example [1,2], which can be seen as the automatic construction of charts and embeddings for manifold data.

[1] Schonsheck, S., Chen, J., & Lai, R. (2019). Chart auto-encoders for manifold structured data. arXiv preprint arXiv:1912.10094.

[2] Floryan, D., & Graham, M. D. (2022). Data-driven discovery of intrinsic dynamics. Nature Machine Intelligence, 4(12), 1113-1120.

Other Strengths and Weaknesses

Overall, the motivation is clear, the proposed method is well-founded, and the evaluation is reasonable. However, to make the work publication-worthy, the authors should provide additional theoretical or empirical validation to better clarify its sensitivity to the predefined local charts.

Other Comments or Suggestions

To highlight the importance of local symmetry detection, it would be beneficial to include Figure 15 from the Supplementary Material in the main manuscript.

Author Response

A major concern with this method is the assumption that a suitable atlas for the dataset is known.

While this assumption may initially appear overly ideal, in practice it is achievable. The primary requirement for an atlas is that the charts are large enough for the function to truly be atlas local, but small enough that each region is approximately flat. In all of our experiments, we only needed minor additional tuning once this relatively weak condition was met.

We showed in section 5.2 that AtlasD is successful under two diverse atlases, implying one has large freedom in the exact atlas they choose. To provide more evidence, we have created an additional atlas for the PDE experiment with heavy overlap, missing regions, and a sheared/noisy chart. https://i.ibb.co/xtgDr7ZF/Figure-1.png

We discover a single generator $\begin{pmatrix} -0.368 & -1.035 \\ 1.101 & 0.386 \end{pmatrix}$ and both cosets. The noisy chart does worsen the discovered generator, but it remains recognizable as a rotation.

Atlases constructed using data-driven approaches

We believe the referenced works slightly differ from our setup. In particular, these works model an unknown data manifold embedded in higher-dimensional Euclidean spaces. On the other hand, we deal with feature fields on a simple, explicitly known manifold, such as a sphere or 2D region.

Computational complexity w.r.t. hyperparameters $k$ and $K$.

$k$ is simple to tune in practice since we are dealing with low-dimensional spaces and only need several reruns during training. Moreover, only $k$ is chosen in the high-to-low process. $K$, the number of cosets, is fixed once.

It would be beneficial to include Figure 15 from the Supplementary Material in the main manuscript.

We agree that Appendix D.4 is important and will be sure to include it and Figure 15 in the main manuscript.

The proposed method identifies the global symmetry group by aggregating multiple local predictor models. Does this approach enhance the robustness of symmetry discovery against noise?

To be specific, we identify the local symmetry group of $\Phi$ by finding the common global symmetries of all $\Phi_c$. We argue that performing symmetry discovery across multiple predictors effectively multiplies the amount of data elements we have available, making our method resilient to noise.

If the local predictors are not well optimized, the discovered symmetry group might be unreliable.

Yes, this is true. However, the predictors generally have easy tasks in our setup, since they only need to predict locally. This gives us an advantage over global predictors, as it is inherently more difficult to predict on a global scale. Also, compared to existing works that use a GAN discriminator, our predictor-based approach does not suffer from the training instability of GANs.

The authors mention that standard basis regularization provides more interpretable results. What does "interpretable results" mean in this context? Additionally, could you conduct an ablation study to compare the effectiveness of standard basis regularization and cosine regularization?

We provide an ablation in Appendix D.2 to compare standard basis regularization against cosine similarity.

When cosine similarity is used, we are still able to find an orthogonal basis, but each generator has many non-zero elements (Figure 12). This makes it difficult to understand what physical action each generator corresponds to. In contrast, in the basis discovered using standard basis regularization (Figure 4), each generator has only a few non-zero elements, which allows it to be easily classified as a boost or rotation. Hence, “interpretability” in this context means sparsity of the generators.

Algorithm 1, though providing a compact summary, is too vague.

We will be sure to include subroutines for each step in Algorithm 1. If there are any other specific points the reviewer feels should be included in the overview itself, we would be happy to add them.

Reviewer Comment

I appreciate the authors' detailed response. I am now convinced that the requirement for prior knowledge of atlases on 2D/3D manifolds is not overly expensive. I also appreciate the additional experiment with sheared/noisy charts. I think this paper makes a solid contribution to the field of symmetry discovery, and I would like to raise my score accordingly.

Official Review
Rating: 3

The paper introduces the concept of atlas equivariance, which formalizes the notion of local symmetry in contrast to traditional global symmetry approaches. The proposed method discovers local symmetries by learning a Lie group basis for each chart (a local region of the input manifold). This is achieved by training local predictor networks and optimizing the equivariance loss with respect to learnable group generators. These generators are defined in the Lie algebra of the target symmetry group, which acts on each local chart.

Questions for Authors

None

Claims and Evidence

The reasoning for the necessity of local symmetry in general is not convincing. The paper states that “local symmetries are more generalized,” but it does not strongly justify why local symmetry is essential for either the machine learning or natural science communities. Although it also mentions that “it generalizes symmetry discovery to arbitrary manifolds and allows for downstream use in gauge equivariant networks,” this is also feasible with prior methods when data is defined as a manifold. Moreover, the performance gains reported in the experiments from utilizing discovered local symmetry are only marginal.

Methods and Evaluation Criteria

The suggested approach partially makes sense. However, I do not understand why parameterizing only the Lie algebra and minimizing the equivariance loss with a pretrained predictor is sufficient for learning the symmetry group. This approach may lead to unstable symmetry discovery depending on the initialization of the Lie algebra and the accuracy of the pretrained predictor, especially for complex data like PDE solutions. This is why prior methods choose to provide some group generators and train their coefficients [1,2] or use well-designed loss functions, such as Jacobian-based losses [3] or cosine similarity of output features [4]. The paper should discuss these potential issues or provide an empirical study to address them.

[1] Learning Invariances in Neural Networks, Benton et al. 2020.

[2] Generative Adversarial Symmetry Discovery, Yang et al. 2023.

[3] LieGG: Studying Learned Lie Group Generators, Moskalev et al. 2023.

[4] Learning Infinitesimal Generators of Continuous Symmetries from Data, Ko et al. 2024.

Theoretical Claims

I did not check the proof of Theorem 4.3, but the statement at least makes sense.

Experimental Designs or Analysis

  1. The experiment excluding a certain region in the PDE setting is sound and well-designed.
  2. However, the experiments focus only on locally varying symmetry. It is also important to demonstrate that the method consistently discovers the same symmetry for every chart when only global symmetry is present, which prior methods can easily achieve.
  3. I also wonder about the memory and time complexity required for learning the group generators. I guess the learning time is proportional to the number of charts available.
  4. Additionally, the paper should provide guidelines on how many neighborhoods are needed to form a chart to obtain reasonable local symmetry for different types of data. The choice of the number of neighborhoods may significantly impact the discovered symmetry.

Supplementary Material

None

Relation to Broader Scientific Literature

The main concern is the contribution of the paper. Local symmetry can also be found using baselines like LieGG and LieGAN by defining charts and treating the data as a manifold. The main novelty appears to be the new loss function for discovering discrete symmetry, which was limited in prior works. However, this contribution is independent of finding local symmetry. I do not see how the proposed loss functions or parameterization are specifically designed for discovering local symmetry.

Essential References Not Discussed

None

Other Strengths and Weaknesses

None

Other Comments or Suggestions

None

Author Response

The reasoning for the necessity of local symmetry is not convincing.

The motivation for local symmetry is that arbitrary manifolds (such as a Möbius strip) do not have global symmetries. In such cases, there is nothing for global discovery methods to learn. However, all manifolds do have local symmetries, making them broadly applicable. Such local symmetries are interesting because they can be used as an inductive bias in gauge equivariant neural networks to improve performance in computer vision, climate segmentation, and other real world tasks [1]. On the other hand, the global symmetries that prior works discover are incompatible with gauge equivariant networks.

The experiments focus only on locally varying symmetry. It is also important to demonstrate that the method consistently discovers the same symmetry for every chart when only global symmetry is present, which prior methods can easily achieve.

We clarify that the atlas equivariance group describes the global symmetries of the local predictors, where each local predictor is the task function restricted to a particular chart. Crucially, it is the common symmetry group for these predictors, rather than varying with each chart. This means that in our experiments, we have in fact been discovering the symmetry group that is the same for all charts.

I also wonder about the memory and time complexity required for learning the group generators.

We include an analysis of space and runtime in Appendix E. In summary, the time and space complexity scale linearly with the number of charts.

Additionally, the paper should provide guidelines on how many neighborhoods are needed

The primary requirement for an atlas is that the charts are large enough for the function to truly be atlas local, but small enough that each region is approximately flat. The exact number of charts is a hyperparameter that may depend on domain-specific factors (e.g. geometry, boundary, data coverage). However, Sec 5.2 demonstrates that our algorithm works under a diverse set of atlases, indicating its robustness to the chart size and count in practice.

I do not see how the proposed loss functions or parameterization are specifically designed for discovering local symmetry.

Compared to those in LieGG or LieGAN, the loss function and parameterization used by AtlasD are specifically tailored to local symmetry discovery. LieGG requires one row in the polarization matrix for every single output pixel in the dataset (Appendix D.4). However, local symmetry datasets are detailed feature fields, making this method impractical due to the memory required. On the other hand, an approach based on LieGAN requires adversarial training, which is often unstable. Moreover, a discriminator-based setup cannot be used in the discrete discovery algorithm because losses for different cosets are no longer comparable. Hence, the predictor loss method used by AtlasD is most applicable to discovering the full local symmetry group.

The reviewer is correct that the discrete discovery is somewhat independent of local symmetry, but we argue that this only boosts our contribution as our method can then be applied to broader domains.

References

[1] Gauge equivariant convolutional networks and the icosahedral cnn, Taco Cohen et al. 2019

Reviewer Comment

Thanks for the detailed response. Some points are addressed, but the questions raised in Methods and Evaluation Criteria remain unresolved.

Author Comment

Thank you for replying and pointing out the remaining concerns. We explained the design rationale for the proposed loss function and symmetry parameterization in the last section of our rebuttal. Below, we provide additional clarification on your questions and a comparison with the related works you mentioned.

AtlasD depends on the accuracy of the predictors

Yes, this is true. However, we note that this is a common setup in symmetry discovery: LieGG, [4], and [5] also depend on the accuracy of pretrained predictors. In addition, in the context of local symmetry discovery, the predictors face relatively simple tasks, since they only need to predict locally. This gives us an advantage over global predictors, as prediction on a global scale is inherently more difficult. Finally, compared to existing works that use a GAN discriminator, our predictor-based approach does not suffer from the training instability issues of GANs.

Comparison to prior works

[1,2]: When fixing the generators and training the coefficients, these methods learn which subset of an input group the system is equivariant to. This differs from our setup, where we seek to learn the maximal symmetry group from a much broader hypothesis space. Only our method is applicable when we do not know the symmetry group beforehand.

[3]: Generalizing LieGG to discover equivariances on maps between feature fields requires a significant amount of memory, which makes it infeasible in practice. This is discussed briefly in Appendix D.4 of our paper.

[4]: Recall that we define our equivariance loss as $\mathcal{L}(\Phi_c(g \cdot x), g \cdot \Phi_c(x))$, where $\mathcal{L}$ is an error function appropriate to the context. If we take $\mathcal{L}$ to be cosine similarity, then the only major difference between our method and [4] is that we apply the loss to the output of the predictors, whereas in [4] it is applied to the output of the feature extractor. Since training the feature extractor is of comparable difficulty to training the predictors, we expect similar stability in discovery.
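The equivariance loss above can be sketched as follows. This is an illustrative toy, not the authors' implementation: `phi` stands in for a chart predictor $\Phi_c$, the group action is ordinary matrix multiplication, and the function names are hypothetical. For an exactly equivariant predictor the residual $\mathcal{L}(\Phi_c(g \cdot x), g \cdot \Phi_c(x))$ vanishes.

```python
import numpy as np

def equivariance_loss(phi, act_in, act_out, x, g, loss_fn):
    """Residual L(phi(g . x), g . phi(x)) for one chart predictor phi.

    act_in / act_out are the group actions on the input and output
    feature fields; loss_fn is the context-appropriate error function."""
    return loss_fn(phi(act_in(g, x)), act_out(g, phi(x)))

# Toy instance: a linear map that commutes with planar rotations.
def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

phi = lambda x: 2.0 * x        # scaling commutes with every rotation
act = lambda g, x: g @ x       # group acts by matrix multiplication
mse = lambda a, b: float(np.mean((a - b) ** 2))

x = np.array([1.0, 0.5])
g = rot(np.pi / 2)
loss = equivariance_loss(phi, act, act, x, g, mse)  # ~0: phi is equivariant
```

A symmetry-discovery loop would instead parameterize `g` (e.g., as the exponential of a learned Lie algebra element) and minimize this residual over the dataset.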

References:

[1] Learning Invariances in Neural Networks, Benton et al. 2020.

[2] Generative Adversarial Symmetry Discovery, Yang et al. 2023.

[3] LieGG: Studying Learned Lie Group Generators, Moskalev et al. 2023.

[4] Learning Infinitesimal Generators of Continuous Symmetries from Data, Ko et al. 2024.

[5] Deep learning symmetries and their Lie groups, algebras, and subalgebras from first principles, Forestano et al. 2023

Final Decision

The paper presents AtlasD, a framework for discovering local symmetries by formalizing the natural concept of atlas equivariance. Unlike existing methods that focus on global symmetries, AtlasD trains local predictor networks on coordinate charts and extracts a Lie group that captures equivariance across these charts. The method accommodates both continuous and discrete symmetries and demonstrates effectiveness across a range of relevant tasks.

Prior to the rebuttal, the paper was assessed as borderline. Although reviewers recognized the conceptual novelty of introducing atlas equivariance for local symmetry discovery, they also identified several significant concerns—chief among them the flexibility in atlas selection, the limited scope of empirical evaluation, and the unclear comparative advantages over existing methods. The discussion phase was constructive, with the authors presenting additional experiments demonstrating robustness to atlas choice, clarifying aspects of the methodological framework, and providing more meaningful comparisons to prior work. Reviewers YBVF and 9veb subsequently revised their scores upward, with the latter indicating that major concerns had been addressed to a satisfactory degree. Nevertheless, some reservations remained, particularly regarding the stability of the approach and the need for further comparison to existing methods. Overall, however, the consensus shifted in a more favorable direction. In light of the clarified contributions, the paper makes a reasonable case for acceptance, primarily on the strength of its conceptual novelty, provided there is room.

The authors are encouraged to incorporate the valuable feedback provided by the knowledgeable reviewers.