PaperHub
Overall rating: 6.0 / 10 — Rejected (4 reviewers)
Ratings: 6, 6, 6, 6 (min 6, max 6, std 0.0)
Confidence: 2.8
ICLR 2024

SMPE: A Framework for Multi-Dimensional Permutation Equivariance

Submitted: 2023-09-22 · Updated: 2024-02-11
TL;DR

A novel framework that enables cross-dimensional interactions among objects while maintaining multi-dimensional permutation equivariance.

Abstract

Keywords
Permutation equivariance, multi-dimensional equivariant network, feature reuse, cross-dimensional information.

Reviews and Discussion

Official Review
Rating: 6

This paper proposes a multi-dimensional permutation equivariance framework named SMPE to pave the way for the design of multi-dimensional permutation equivariant networks.

Strengths

  • This work provides the first exact algebra-based definition of multi-dimensional permutation equivariance.

  • The experiments are extensive, including contextual auction design, pseudo inverse computation, and typical wireless communication tasks.

  • The research can benefit many real applications including point cloud analysis, graph analysis, pseudo inverse computation, typical wireless communication, etc.

  • The proposed technique seems sound, and its effectiveness is verified by experiments.

Weaknesses

  • The running time of all the methods could be compared to show the efficiency of the proposed method.

  • The hyperparameter settings of the model should be listed in the text.

  • Two related works [1,2] about graph permutation invariance are missing in the discussion.

  • The source codes and the datasets could be provided to facilitate the reproducibility of this work.

Refs:

[1] [arXiv 2019] PiNet: A Permutation Invariant Graph Neural Network for Graph Classification

[2] [TKDE 2022] Graph Substructure Assembling Network with Soft Sequence and Context Attention

Questions

Please respond to the weaknesses listed in the previous text box.

Details of Ethics Concerns

I have not identified any ethical concerns about this work.

Comment

In [1], a one-dimensional invariant network was constructed based on the proposed variable node attention pooling mechanism, which effectively addressed graph classification problems. Due to its permutation invariance, it is applicable to graphs of different sizes.

The work in [2] introduces the Substructure Assembling Network (SAN), a graph neural network model that effectively learns interpretable and discriminative graph representations for classification. It overcomes challenges by hierarchically assembling substructures using the Substructure Assembling Unit (SAU) and enhancing performance with Soft Sequence with Context Attention (SSCA).

The goal of these two works is to model one-dimensional invariant mappings, which is relevant to our work but not the primary focus of our investigation. We have incorporated these two works into the 'Related Work' section and provided additional discussions.

[1] Meltzer P, Mallea M D G, Bentley P J. PiNet: A permutation invariant graph neural network for graph classification. arXiv preprint arXiv:1905.03046, 2019.

[2] Yang Y, Guan Z, Zhao W, et al. Graph substructure assembling network with soft sequence and context attention. IEEE Transactions on Knowledge and Data Engineering, 2022, 35(5): 4894-4907.

Comment

The hyperparameters used in our experiments are listed in Appendix E.4 of the revised paper. For example

  • Pooling functions: Without loss of generality, all the pooling operations are set to be the mean function.

  • Training strategy: We perform batch iterations $8\times 10^{4}$ times with a batch size of 512, and reduce the learning rate by half after $4\times 10^{4}$ iterations. This strategy is the same for all three tasks. For the transmission design tasks of the MIMO and cell-free MIMO systems, we randomly assign SNR values to each training and testing sample.

  • Activation & normalization: To accelerate convergence and enhance the stability of the training process, we applied the ReLU activation and LN to $\mathtt{ETN2D}$ and $\mathtt{ETN3D}$.

  • Monte-Carlo simulation: Each experimental result is the average of three valid experiments.

Comment

Many thanks for your careful reviews and constructive comments.

The running times of all the methods have been compared in Appendix F.2. Specifically, the following table provides a comparison of the runtime for each network across different tasks. The runtime refers to the average time (in ms) taken for a batch during GPU inference, with a batch size of 1000. In the first row, '1', '2', and '3' denote the experiments in Sections 4.1, 4.2, and 4.3, respectively, and $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$ represent the scenario configurations for each task during inference. It can be observed that the running time of $\mathtt{SMPEN\!-\!DS}$ is comparable to that of $\mathtt{ETN}$, and the running time of $\mathtt{SMPEN2D\!-\!TF}$ is similar to that of $\mathtt{CITransNet2D}$.

| | 1 $\mathcal{A}$ | 1 $\mathcal{B}$ | 1 $\mathcal{C}$ | 2 $\mathcal{A}$ | 2 $\mathcal{B}$ | 2 $\mathcal{C}$ | 3 $\mathcal{A}$ | 3 $\mathcal{B}$ | 3 $\mathcal{C}$ |
|---|---|---|---|---|---|---|---|---|---|
| $\mathtt{SetTrans1D}$ | 61.7 | 67.5 | 72.1 | 16.4 | 18.6 | 24.3 | – | – | – |
| $\mathtt{CITransNet2D}$ | 87.9 | 106.0 | 119.1 | 48.2 | 53.8 | 64.4 | 34.8 | 34.9 | 63.2 |
| $\mathtt{ETN2D}$ | 61.9 | 69.2 | 75.8 | 18.7 | 21.9 | 28.4 | 13.8 | 13.8 | 27.1 |
| $\mathtt{SMPEN2D\!-\!DS}$ | 65.2 | 75.6 | 80.4 | 21.3 | 24.6 | 31.4 | 12.8 | 12.9 | 24.7 |
| $\mathtt{SMPEN2D\!-\!TF}$ | 87.8 | 109.1 | 129.0 | 51.2 | 59.9 | 69.1 | 35.0 | 35.1 | 63.4 |
| $\mathtt{ETN3D}$ | – | – | – | – | – | – | 25.6 | 25.7 | 46.9 |
| $\mathtt{SMPEN3D\!-\!DS}$ | – | – | – | – | – | – | 23.1 | 23.2 | 42.5 |
| $\mathtt{SMPEN3D\!-\!TF}$ | – | – | – | – | – | – | 55.4 | 55.5 | 99.3 |
Comment

The source code and associated datasets will be provided soon, and we are currently making efforts to organize the code and add comments to improve reproducibility.

Official Review
Rating: 6

The authors propose a method for obtaining neural networks with higher-order permutation equivariances. This concerns tensors where multiple dimensions are equivariant. The main approach uses a composition of first order permutation equivariant functions. Features are pooled from all subsets of dimensions to facilitate exchange of information between the different dimensions.

Strengths

  • Straightforward and easy to understand method, well-written
  • I believe the method to be novel
  • While there is existing work handling this subject, I believe that this is a useful practical contribution to the literature.
  • Interesting applications used in experiments, solid results

Weaknesses

  • I believe that this topic has been studied before as part of more general treatments of permutation equivariance, such as in [1] under the name of higher-order permutations. As such, the techniques used themselves are fairly standard and novelty therefore limited, even if their exact combination is new.
  • The experiments could be made stronger by comparing on existing settings, so that it's possible to compare against existing numbers in papers. Some of the performance benefits of the proposed method could be explained simply by insufficient tuning of the baseline models.
  • It would be good to see further ablation to test some of the claims made in the paper. For example, it would be good to check the performance impact of the parallel computation vs sequential computation.

[1] https://arxiv.org/pdf/2004.03990.pdf

Questions

It seems like using all possible subsets can be limiting in terms of scalability. Can you think of more efficient alternatives that don't sacrifice too much in model capacity?

Comment

First, we would like to illustrate the primary sources of complexity in the framework. The computational complexities and parameter quantities are as follows:

| Layer Type | Complexity | Parameters |
|---|---|---|
| $\mathtt{SMPEL\!-\!DS}$ | $\mathcal{O}\left((\prod_{n=1}^{N}K_n)\cdot(2^N+N+1)\cdot D^2\right)$ | $\mathcal{O}\left((2^N-1+N)D^2\right)$ |
| $\mathtt{SMPEL\!-\!TF}$ | $\mathcal{O}\left((\prod_{n=1}^{N}K_n)\cdot[(2^N+N-1)D+\sum_{n=1}^{N}K_n]\cdot D\right)$ | $\mathcal{O}\left((2^N-1+N)D^2\right)$ |
| $\mathtt{ETL}$ | $\mathcal{O}\left((\prod_{n=1}^{N}K_n)\cdot 2^N\cdot D^2\right)$ | $\mathcal{O}(2^N D^2)$ |

The total number of input features is $\prod_{n=1}^{N}K_n$. This implies that, for a fixed number of input dimensions $N$, the computational complexity orders of $\mathtt{SMPEL\!-\!DS}$ and $\mathtt{SMPEL\!-\!TF}$ are lower than the second order of $\prod_{n=1}^{N}K_n$, and the parameter quantity is independent of $\prod_{n=1}^{N}K_n$. Therefore, the main factor limiting the scalability of the framework is the exponential growth of the network size with the dimension $N$. The factor $2^N$ in the table arises from the algorithm's requirement to obtain pooled information for every non-empty subset of $\mathbb{N}$ (i.e., the power set of $\mathbb{N}$ excluding $\emptyset$) and incorporate it into the computation.

The exploration of efficient alternatives that do not significantly compromise model capacity is an interesting topic, and we are currently conducting further research on this problem. From a practical implementation perspective, the following method can achieve a trade-off between performance and complexity: train the network multiple times, with each training run randomly selecting $H$ subsets from the $2^N-1$ subsets to include in the computation, and then retain the best-performing network. This approach allows for the identification of the most important $H$ combinations, effectively reducing the scale from $2^N-1$ to $H$.
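For concreteness, a minimal sketch of this subset-sampling idea (illustrative only; the function names and structure are our own, not the paper's code):

```python
# Minimal sketch: sample H of the 2^N - 1 non-empty subsets of equivariant
# dimensions for one training run; the best-performing run would be kept.
import random
from itertools import chain, combinations

def nonempty_subsets(n_dims):
    dims = range(n_dims)
    return list(chain.from_iterable(combinations(dims, r)
                                    for r in range(1, n_dims + 1)))

def sample_pooling_subsets(n_dims, h, seed=None):
    """Draw H of the 2^N - 1 candidate pooling subsets for one training run."""
    rng = random.Random(seed)
    candidates = nonempty_subsets(n_dims)        # 2^N - 1 subsets in total
    return rng.sample(candidates, k=min(h, len(candidates)))

# Example: N = 4 equivariant dimensions, keep H = 5 pooling branches.
print(sample_pooling_subsets(4, 5, seed=0))
```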

Comment

The ablation experiments have been included in Appendix F.1 of the revised paper. Specifically, to validate the superiority of $g_{\rm serial}$ (represented by $\mathtt{SerialN2D\!-\!TF}$) over $g_{\rm para}$ (represented by $\mathtt{ParaN2D\!-\!TF}$), as well as the superiority of $g_{\rm SMPE}$ over $g_{\rm serial}$, we compared the performance of these three methods on the task in Section 4.2. As observed from the table below, in most scenarios $g_{\rm SMPE}$ outperforms $g_{\rm serial}$, which in turn outperforms $g_{\rm para}$.

| Test/Train Setting | $\mathcal{A}/\boldsymbol{\mathcal{A}}$ | $\mathcal{A}/\mathcal{B}$ | $\mathcal{A}/\mathcal{C}$ | $\mathcal{B}/\mathcal{A}$ | $\mathcal{B}/\boldsymbol{\mathcal{B}}$ | $\mathcal{B}/\mathcal{C}$ | $\mathcal{C}/\mathcal{A}$ | $\mathcal{C}/\mathcal{B}$ | $\mathcal{C}/\boldsymbol{\mathcal{C}}$ |
|---|---|---|---|---|---|---|---|---|---|
| $\mathtt{ParaN2D\!-\!TF}$ | 20.49 | 20.39 | 20.33 | 21.72 | 21.87 | 21.62 | 21.99 | 21.61 | 21.97 |
| $\mathtt{SerialN2D\!-\!TF}$ | 20.75 | 21.48 | 19.08 | 21.91 | 22.94 | 20.31 | 19.26 | 20.18 | 24.33 |
| $\mathtt{SMPEN2D\!-\!TF}$ | 22.21 | 22.70 | 20.69 | 23.78 | 24.71 | 22.30 | 21.80 | 23.17 | 24.44 |
Comment

The hyperparameters used in our experiments are listed in Appendix E.4 of the revised paper. The hyperparameters of the baseline models have also been fine-tuned, and we report the best results for comparison. For example:

  • Pooling functions: Without loss of generality, all the pooling operations are set to be the mean function.

  • Activation & normalization: To accelerate convergence and enhance the stability of the training process, we applied the ReLU activation and LN to $\mathtt{ETN2D}$ and $\mathtt{ETN3D}$.

  • Parameter initialization: With the exception of networks $\mathtt{SMPEN2D\!-\!TF}$ and $\mathtt{SMPEN3D\!-\!TF}$ in the task of Section 4.3, which were initialized using the default method, we employed Kaiming initialization for the weight parameters of the linear layers in the networks used for the other tasks.

  • Monte-Carlo simulation: Each experimental result is the average of three valid experiments.

Comment

In this work, we propose a plug-and-play paradigm for the construction of arbitrary high-dimensional equivariant networks with well-designed one-dimensional equivariant networks. Specifically, we first provide a definition of multi-dimensional equivariance based on algebraic definitions. Based on this, we further propose two methods to combine one-dimensional equivariant functions, which have been mathematically proven to satisfy the equivariant property in multiple dimensions. Moreover, we employed feature reuse and introduced global information at each layer to enhance performance while maintaining the multi-dimensional equivariance. Additionally, it has been proven that the constructed equivariant layer can degenerate into the multi-dimensional equivariant linear layer characterized by a complex mathematical expression. The substitutability of one-dimensional equivariant functions in the proposed framework implies its enhanced generality, effectively transforming the design problem of multi-dimensional equivariant networks into the selection of one-dimensional equivariant networks.

Comment

Thank you for your thorough reviews and valuable feedback.

The high-order permutation equivariance considers the high-order features of a single type of object. Assuming there are $K$ objects, their $N$-th-order features among each other are denoted as $\mathsf{X}\in\mathbb{R}^{K\times K\times\cdots\times K\times D_X}$, where $K$ is repeated $N$ times, so it can also be written as $\mathsf{X}\in\mathbb{R}^{K^N\times D_X}$ [1]. The function $f$ exhibits $N$-th-order permutation equivariance if it satisfies

$$f(\sigma\circ\mathsf{X})=\sigma\circ f(\mathsf{X}),\quad (\sigma\circ\mathsf{X})_{[k_1,\dots,k_N,:]}=\mathsf{X}_{[\sigma^{-1}\circ k_1,\dots,\sigma^{-1}\circ k_N,:]},\quad \forall\,\sigma\in\mathbb{S}_K.$$

It can be observed that the same permutation $\sigma$ is employed across all dimensions. High-order permutation equivariance is an extension of one-dimensional permutation equivariance [1], [2]. The relevant literature on high-order equivariance also includes references [3], [4], [5], and so on.

The multi-dimensional equivariance considers features among multiple types of objects, i.e., $\mathsf{X}\in\mathbb{R}^{K_1\times K_2\times\cdots\times K_N\times D_X}$. A mapping $f$ is said to be $N$-dimensional equivariant if

$$f(\pi^{\mathbb N}\circ\mathsf{X})=\pi^{\mathbb N}\circ f(\mathsf{X}),\quad (\pi^{\mathbb N}\circ\mathsf{X})_{[k_1,k_2,\dots,k_N,:]}=\mathsf{X}_{[\pi^{-1}_{K_1}(k_1),\pi^{-1}_{K_2}(k_2),\dots,\pi^{-1}_{K_N}(k_N),:]},\quad \forall\,\pi^{\mathbb N}\in\mathbb{S}^{\mathbb N}.$$

Multi-dimensional permutation equivariance thus considers different permutations $\pi_{K_1},\dots,\pi_{K_N}$ across the multiple dimensions, which is clearly distinct from high-order permutation equivariance.

[1] Maron H, Ben-Hamu H, Shamir N, et al. Invariant and equivariant graph networks. arXiv preprint arXiv:1812.09902, 2018.

[2] Thiede E H, Hy T S, Kondor R. The general theory of permutation equivariant neural networks and higher order graph variational encoders. arXiv preprint arXiv:2004.03990, 2020.

[3] Keriven N, Peyré G. Universal invariant and equivariant graph neural networks. Advances in Neural Information Processing Systems, 2019, 32.

[4] Pan H, Kondor R. Permutation equivariant layers for higher order interactions. International Conference on Artificial Intelligence and Statistics. PMLR, 2022: 5987-6001.

[5] Kim J, Oh S, Hong S. Transformers generalize DeepSets and can be extended to graphs & hypergraphs. Advances in Neural Information Processing Systems, 2021, 34: 28016-28028.
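As a toy illustration of the distinction (not from the paper; the mean-subtraction map is just a convenient example of a multi-dimensional equivariant function), the following check verifies numerically that such a map commutes with independent permutations applied to each dimension:

```python
# Toy numerical check of N-dimensional permutation equivariance (illustration
# only): subtracting the mean over all equivariant dimensions commutes with
# independent permutations pi_{K_1}, pi_{K_2}, pi_{K_3} applied per dimension.
import numpy as np

def f(X):
    # X: (K1, K2, K3, D); pool over the three equivariant dimensions.
    return X - X.mean(axis=(0, 1, 2), keepdims=True)

rng = np.random.default_rng(0)
K1, K2, K3, D = 4, 5, 6, 3
X = rng.standard_normal((K1, K2, K3, D))

p1, p2, p3 = rng.permutation(K1), rng.permutation(K2), rng.permutation(K3)
permute = lambda T: T[p1][:, p2][:, :, p3]      # one independent permutation per axis

print(np.allclose(f(permute(X)), permute(f(X))))  # True
```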
Official Review
Rating: 6

This paper introduces SMPE, a multi-dimensional permutation equivariant framework. The framework combines 1D equivariant functions applied to individual dimensions to preserve higher-dimensional equivariance. By incorporating feature reuse and cross-dimensional global information, SMPE facilitates interactions among objects in all dimensions, enhancing expressivity. Additionally, the framework can be extended to achieve multi-dimensional invariance using pooling operations. Experimental results show that networks constructed using the SMPE framework outperform other approaches and can operate on sets of varying sizes.

Strengths

  • It is valuable to preserve higher-dimensional equivariance.
  • The theoretical analyses are provided.
  • The experimental results show that the method has significant improvements over the baselines.

Weaknesses

  1. One concern is about the evaluation. 1) The tasks adopted in this paper consider dimensions of no more than 3D. This may not reflect the effectiveness of the proposed method, since the paper claims a contribution in high-dimensional permutation equivariance. 2) The tasks seem simple in that they only consider a linear mapping. It would be better to evaluate the method on real-world datasets, where the mapping from inputs to outputs could be complex, and to show that the method can be an effective module in neural networks for complex tasks.

  2. Another concern is about the novelty. The idea seems straightforward to combine 1D equivariant functions applied to individual dimensions to preserve higher-dimensional equivariance.

Questions

  1. Can this method act as a high-dimensional permutation equivariant module in other neural networks for more complex tasks?

  2. Most graph neural networks are known as permutation equivariant, and most pooling methods are permutation invariant. What about the comparisons of this method with these works?

  3. What is the distance metric for calculating MAE&MSE in the experiments?

  4. What about the tasks with more dimensions, e.g., more than 3D?

Comment

In order to enhance the practicality and complexity of the task, we incorporated real-world datasets and conducted experiments on the transmission design task discussed in Section 4.3. We replaced the original results in Table 3 with the new results and moved the original Table 3 to the appendix as supplementary results.

The new dataset is provided by QuaDRiGa [1]. QuaDRiGa, short for QUAsi Deterministic RadIo channel GenerAtor, is used for generating realistic radio channel impulse responses for system-level simulations of mobile radio networks. These simulations are used to determine the performance of new digital-radio technologies in order to provide an objective indicator for the standardization process in bodies such as the 3rd Generation Partnership Project (3GPP). Furthermore, the channel generator is continuously updated with the evolution of wireless communication standards [2]. For more details, please refer to the official website of QuaDRiGa: https://quadriga-channel-model.de.

The dataset configuration we used is as follows:

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Scenario | 3GPP_38.901_UMa_NLOS | Center Frequency | 3.5 GHz |
| Transmit Antenna | 3gpp-3d | Receive Antenna | omni |
| AP Height | 25 m | User Height | 1.5 m |
| Transmitter Antenna Downtilt | 10° | Subcarrier Spacing | 30 kHz |
| AP/User Distribution | Uniformly distributed within a 500 m radius | | |

where the scenario '3GPP_38.901_UMa_NLOS' refers to a specific configuration defined by 3GPP in technical specification 38.901 [3]. This scenario is designed to model wireless communication in urban macro environments where non-line-of-sight (NLOS) conditions are prevalent. In this scenario, QuaDRiGa simulates a multi-path channel model that includes the effects of buildings, structures, and other obstacles typically found in urban macro environments. The model considers the propagation of radio waves under NLOS conditions, accounting for reflections, diffractions, and scattering from surrounding objects.

The dataset from the generator closely approximates real-world scenarios, ensuring the practical value of experimental results.

[1] Jaeckel S, Raschkowski L, Börner K, et al. QuaDRiGa: A 3-D multi-cell channel model with time evolution for enabling virtual field trials. IEEE Transactions on Antennas and Propagation, 2014, 62(6): 3242-3256.

[2] Wang C X, Bian J, Sun J, et al. A survey of 5G channel measurements and models. IEEE Communications Surveys & Tutorials, 2018, 20(4): 3142-3168.

[3] 3GPP TR 38.901 v16.1.0, "Study on channel model for frequencies from 0.5 to 100 GHz," Tech. Rep., 2019.

Comment

The computation methods of these two distance metrics are as follows:

$${\rm MAE} = \left(\|\Re({\bf X}_1)-\Re({\bf X}_2)\|_1+\|\Im({\bf X}_1)-\Im({\bf X}_2)\|_1\right)/(KN),$$

$${\rm MSE} = \left(\|\Re({\bf X}_1)-\Re({\bf X}_2)\|^2_F+\|\Im({\bf X}_1)-\Im({\bf X}_2)\|^2_F\right)/(KN),$$

where ${\bf X}_1\in\mathbb{C}^{N\times K}$ and ${\bf X}_2\in\mathbb{C}^{N\times K}$ are the two matrices whose distance is to be measured. It is worth noting that the upper and lower parts of Table 1 in the paper correspond to two separate experiments conducted for training networks with respect to these two metrics. These details are provided in Appendix E.1.
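A minimal sketch of these two metrics (assumed array shapes and names; not the authors' evaluation code):

```python
# Minimal sketch of the metrics above. X1, X2 are complex matrices of shape (N, K).
import numpy as np

def mae(X1, X2):
    d = X1 - X2
    return (np.abs(d.real).sum() + np.abs(d.imag).sum()) / X1.size   # X1.size == K * N

def mse(X1, X2):
    d = X1 - X2
    return ((d.real ** 2).sum() + (d.imag ** 2).sum()) / X1.size
```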
Comment

About graph neural networks: Graph neural networks are designed to solve graph-related problems, which often involve the modeling of nodes and edges [1]. Due to the unordered nature of nodes, graph neural networks possess permutation equivariance. It is worth noting that most graph neural networks can be viewed as one-dimensional equivariant networks, as evidenced by references [2] and [3]. As our work primarily focuses on the construction of multi-dimensional equivariant networks, our approach is more general than graph neural networks that target one-dimensional equivariant network design.

About pooling methods: The pooling function is utilized to aggregate multiple features, and this characteristic can be employed to construct permutation-invariant networks. In Sections 3.2 and 3.3 of the paper, pooling functions are respectively introduced to acquire global information and provide invariance. Most pooling functions can be applied within the proposed framework, as the multi-dimensional equivariance maintained by our framework does not depend on the selection of pooling function. Therefore, designing pooling functions is beyond the scope of this paper.

[1] Wu Z, Pan S, Chen F, et al. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020, 32(1): 4-24.

[2] Vignac C, Loukas A, Frossard P. Building powerful and equivariant graph neural networks with structural message-passing. Advances in Neural Information Processing Systems, 2020, 33: 14143-14155.

[3] Velickovic P, Cucurull G, Casanova A, et al. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017.

Comment

There are various ways to apply one-dimensional equivariant functions to multiple dimensions, but not all of them can maintain the equivariant property in multiple dimensions. For example, one can flatten the equivariant dimensions of the multi-dimensional data and apply one-dimensional equivariant functions. Alternatively, one can fix one equivariant dimension and flatten the remaining dimensions along with the feature dimension to create a new feature dimension. However, these approaches do not satisfy the multi-dimensional equivariance.

In our work, we propose a plug-and-play paradigm for the construction of arbitrary high-dimensional equivariant networks from well-designed one-dimensional equivariant networks. Unlike the examples mentioned above, the proposed combination methods have been mathematically proven to satisfy the equivariant property in multiple dimensions. Moreover, we employ feature reuse and introduce global information at each layer to enhance performance while maintaining the multi-dimensional equivariance. Additionally, it has been proven that the constructed equivariant layers can degenerate into the multi-dimensional equivariant linear layers described in [1], which are characterized by a complex mathematical expression. In summary, while our proposed method has a straightforward expression, its foundation and underlying analysis are not straightforward. The concise expression reduces the cost of further exploitation and analysis, reflecting the principle we advocate.

[1] Hartford J, Graham D, Leyton-Brown K, et al. Deep models of interactions across sets. International Conference on Machine Learning. PMLR, 2018: 1909-1918.

Comment

The involved mappings are relatively complex non-linear mappings. A linear mapping between two vector spaces needs to satisfy additivity, which means $f(X+Y)=f(X)+f(Y)$. Clearly, the pseudo-inverse and the optimal solution for downlink transmission design do not satisfy this property. Specifically, for the transmission design tasks in Sections 4.2 and 4.3, finding the optimal results requires solving Problem (26) and Problem (28), respectively. These two problems are typical NP-hard problems and are difficult to solve [1]. Typical solving algorithms usually involve iterative operations on high-dimensional matrices and require multiple initializations to select the best results [2], which is very time-consuming and cannot guarantee optimality.

[1] Zhao X, Lu S, Shi Q, et al. Rethinking WMMSE: Can its complexity scale linearly with the number of BS antennas? IEEE Transactions on Signal Processing, 2023, 71: 433-446.

[2] Christensen S S, Agarwal R, De Carvalho E, et al. Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design. IEEE Transactions on Wireless Communications, 2008, 7(12): 4792-4799.

Comment

We sincerely thank you for your careful reviews and constructive comments.

The proposed framework can be easily extended to higher dimensions, and we illustrate the framework of network construction by using a task of 4D mapping as an example.

Consider the precoding design for a cell-free multi-user multiple-input multiple-output (MU-MIMO) system. A CPU controls $M$ distributed APs to support $K$ UEs, where each AP is equipped with $N_T$ antennas and each UE is equipped with $N_R$ antennas. The channel between the $m$-th AP and the $k$-th UE is denoted as ${\bf H}_{k,m}\in\mathbb{C}^{N_T\times N_R}$, and the corresponding precoding matrix is ${\bf W}_{k,m}\in\mathbb{C}^{N_T\times N_R}$. By stacking the channel and precoding matrices of all APs and UEs, we have $\mathsf{H}\in\mathbb{C}^{M\times K\times N_T\times N_R}$ and $\mathsf{W}\in\mathbb{C}^{M\times K\times N_T\times N_R}$. Similar to the task in Section 4.3, we consider the optimization problem of maximizing the sum rate under transmit power constraints. It can be proved that the mapping from $\mathsf{H}$ to a certain optimal $\mathsf{W}^{\star}=f(\mathsf{H})$ is 4D permutation equivariant.

For this task, we concatenate the real and imaginary parts of $\mathsf{H}$, so the shapes of the input and the target output are $\widetilde{\mathsf{H}}\in\mathbb{R}^{M\times K\times N_T\times N_R\times 2}$ and $\widetilde{\mathsf{W}}\in\mathbb{R}^{M\times K\times N_T\times N_R\times 2}$, respectively. The architecture of the layer ${\rm SMPEL}:\mathbb{R}^{M\times K\times N_T\times N_R\times D}\to\mathbb{R}^{M\times K\times N_T\times N_R\times D}$ can then be represented as follows:

$${\rm SMPEL}(\mathsf{X}) = {\rm FC}_1(\sigma({\rm LN}(\mathsf{M}))),$$

$$\mathsf{M} = {\rm FC}_2([\mathsf{O}_1, \mathsf{O}_2, \mathsf{O}_3, \mathsf{O}_4, \bar{\mathsf{X}}_{\mathbb{P}_1}, \dots, \bar{\mathsf{X}}_{\mathbb{P}_{15}}]),$$

$$\mathsf{O}_0 = \mathsf{X},\quad \mathsf{O}_n = {\rm PE1D}_n(\mathsf{O}_{n-1}),\quad n\in\{1, 2, 3, 4\},$$

where ${\rm FC}_1:\mathbb{R}^{D}\to\mathbb{R}^{D}$ and ${\rm FC}_2:\mathbb{R}^{(4+15)D}\to\mathbb{R}^{D}$ represent linear layers applied to the last dimension, and ${\rm PE1D}_1:\mathbb{R}^{M\times D}\to\mathbb{R}^{M\times D}$, ${\rm PE1D}_2:\mathbb{R}^{K\times D}\to\mathbb{R}^{K\times D}$, ${\rm PE1D}_3:\mathbb{R}^{N_T\times D}\to\mathbb{R}^{N_T\times D}$, and ${\rm PE1D}_4:\mathbb{R}^{N_R\times D}\to\mathbb{R}^{N_R\times D}$ denote the 1D equivariant layers operating on the 1st, 2nd, 3rd, and 4th dimensions, respectively.

Then, the four-dimensional equivariant neural network used to model $f$ can be constructed as

$$\widetilde{\mathsf{W}} = \sigma_{\rm out}({\rm FC}_{\rm out}({\rm SMPEN}({\rm FC}_{\rm in}(\widetilde{\mathsf{H}})))),$$

where ${\rm FC}_{\rm in}:\mathbb{R}^{2}\to\mathbb{R}^{D}$ and ${\rm FC}_{\rm out}:\mathbb{R}^{D}\to\mathbb{R}^{2}$ are used to change the feature dimensions, and $\sigma_{\rm out}(\cdot)$ is designed to ensure that $\widetilde{\mathsf{W}}$ satisfies the power constraints. At this point, the construction of the four-dimensional permutation-equivariant network from $\widetilde{\mathsf{H}}$ to $\widetilde{\mathsf{W}}$ is complete. Following this approach, the framework can be easily extended to higher dimensions.
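For illustration, a hedged PyTorch sketch of such a 4D SMPEL layer is given below. It is not the authors' implementation: a DeepSets-style linear map stands in for each ${\rm PE1D}_n$, and mean pooling over every non-empty subset of the four equivariant dimensions produces the fifteen $\bar{\mathsf{X}}_{\mathbb{P}}$ tensors.

```python
# Hedged sketch of a 4D SMPEL layer following the recipe above (not the authors'
# code). Tensor layout: (batch, M, K, N_T, N_R, D); equivariant axes are 1..4.
from itertools import chain, combinations

import torch
import torch.nn as nn


class PE1D(nn.Module):
    """1D permutation-equivariant map along axis `dim`: x -> A x + B mean_dim(x)."""
    def __init__(self, d, dim):
        super().__init__()
        self.dim = dim
        self.local = nn.Linear(d, d)
        self.pooled = nn.Linear(d, d)

    def forward(self, x):
        return self.local(x) + self.pooled(x.mean(dim=self.dim, keepdim=True))


class SMPEL4D(nn.Module):
    def __init__(self, d):
        super().__init__()
        dims = (1, 2, 3, 4)                        # the four equivariant axes
        self.pe = nn.ModuleList(PE1D(d, dim) for dim in dims)
        # All 2^4 - 1 = 15 non-empty subsets of the equivariant axes.
        self.subsets = list(chain.from_iterable(
            combinations(dims, r) for r in range(1, 5)))
        self.fc2 = nn.Linear((4 + 15) * d, d)
        self.ln = nn.LayerNorm(d)
        self.fc1 = nn.Linear(d, d)

    def forward(self, x):
        feats, o = [], x
        for pe in self.pe:                         # serial composition O_1, ..., O_4
            o = pe(o)
            feats.append(o)
        for s in self.subsets:                     # global information per subset
            feats.append(x.mean(dim=s, keepdim=True).expand_as(x))
        m = self.fc2(torch.cat(feats, dim=-1))     # M = FC_2([...])
        return self.fc1(torch.relu(self.ln(m)))    # FC_1(sigma(LN(M)))


# Shape check only: batch=2, M=3 APs, K=4 UEs, N_T=2, N_R=2, D=8 features.
layer = SMPEL4D(8)
print(layer(torch.randn(2, 3, 4, 2, 2, 8)).shape)  # torch.Size([2, 3, 4, 2, 2, 8])
```

Any well-designed 1D equivariant module could replace the PE1D stand-in without affecting the 4D equivariance of the composition.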

Official Review
Rating: 6

This paper proposes a novel serial multi-dimensional permutation equivariance framework called SMPE by serially composing multiple one-dimensional equivariant layers and incorporating dense connections for feature reuse to enable multi-dimensional interactions.

Strengths

  1. The proposed SMPEL seems to be reasonable.

  2. The experimental results demonstrate the effectiveness of the proposed SMPE.

Weaknesses

  1. The pooling operation for multi-dimensional permutation invariance is not clear. Which type of pooling do you use in the proposed method? How does the pooling layer help with the multi-dimensional permutation invariance?

  2. Some experimental setups are not clear. For instance, in Section 4.2, the authors do not explicitly mention what evaluation metric is used in Table 2. It's unclear which type of pooling is used in the proposed method.

Questions

  1. How does $\bar{\mathsf{X}}_{\mathbb{P}}$, together with the weight $w^{\rm GI}_{\mathbb{P}}$, preserve the global information? How do you learn the weights $w^{\rm GI}_{\mathbb{P}}$ and $w^{\rm PE}_{\mathbb{P}}$? Can you elaborate on this?

  2. The pooling operation for multi-dimensional permutation invariance is not clear. Which type of pooling do you use in the proposed method? How does the pooling layer help with the multi-dimensional permutation invariance?

  3. In the experiments, the authors provide various variants of SMPEN. For instance, in Table 3, SMPEN2D-TF sometimes outperforms SMPEN3D-TF, while SMPEN3D-TF achieves the best performance in different training settings. Given a task, how should one determine which type of method to choose to achieve the best performance?

Comment

This is an interesting topic that is currently being considered. Although we are unable to provide rigorous mathematical conclusions at this moment, we can analyze the following points from an experimental perspective and offer practical recommendations.

1. The impact of the number of objects $K_1,\dots,K_N$ on performance:
We consider the performance of a trained network from two perspectives: the testing performance in the training configuration and in other configurations. When tested in the training configuration, the number of objects has a negligible impact on the network performance. However, when tested in other configurations, if the number of a certain type of object in the training configuration is small, a change in this number will have a significant impact on the network performance. For example, in Table 3, SMPEN3D-TF trained under configuration $\mathcal{B}$ shows poor performance when tested under configuration $\mathcal{A}$. Conversely, if the number of such objects is large in the training configuration, minor variations in the number have a negligible impact on the performance. For instance, in Table 2, SMPEN2D-TF trained under configuration $\mathcal{B}$ exhibits excellent performance when tested under configuration $\mathcal{C}$. Intuitively, when the number difference is small, the modeled equivariant mappings are similar, while the distinctions in the modeled mappings become more significant as the number difference grows.

2. The impact of the relationship between the number of object types $N$ and the network dimensionality on performance:
When the dimension of the equivariant network is equal to the number of object types in the input, the input matches the network. With adequate training, the network can achieve good performance under the training configuration. When the dimension of the equivariant network is smaller than the number of object types in the input, multiple types of objects are reshaped into a single dimension for input. Due to equivariance, the network loses its ability to distinguish the merged types, resulting in a decrease in performance. For example, in Table 3, SMPEN3D-TF trained under configuration $\mathcal{A}$ performs better than SMPEN2D-TF when tested under configuration $\mathcal{A}$. However, merging multiple types of objects into a single type increases the number of objects of the new type. According to the analysis in the first point, this operation may lead to better performance of the network when tested under other configurations. For instance, in Table 3, SMPEN2D-TF trained under configuration $\mathcal{B}$ performs better than SMPEN3D-TF when tested under configuration $\mathcal{A}$.

In conclusion, we provide the following recommendations for network selection.

  • If the number of each type of object is relatively large, choose an equivariant network with the same dimension as the number of types. For example, for the task in Table 2, we choose SMPEN2D-TF instead of SetTrans1D.

  • If the number of each type of object is relatively small, to achieve better testing performance under the training configuration, it is still advisable to choose an equivariant network with the same dimension as $N$. To achieve better testing performance under other configurations, selecting a network with a lower equivariant dimension can be considered. For instance, for the task in Table 3, SMPEN3D-TF performs better under the training configuration, while SMPEN2D-TF performs better under other configurations.

Finally, many thanks for your careful reviews and constructive comments.

Comment

Thanks for the detailed clarification and I would like to raise my score to 6.

Comment

The weighted sum of the processed features and the global information is performed through equation (5) in the paper. The processing of the linear layer ${\rm FC}_2$ can be rewritten as

$${\rm FC}_2([\mathsf{O}_1, \dots, \mathsf{O}_N, \bar{\mathsf{X}}_{\mathbb{P}_1}, \dots, \bar{\mathsf{X}}_{\mathbb{P}_{2^N-1}}])=\sum_{n=1}^{N}\mathsf{O}_n\times_{(N+1)}{\bf W}^{\rm PE}_n + \sum_{i=1}^{2^N-1}\bar{\mathsf{X}}_{\mathbb{P}_i}\times_{(N+1)}{\bf W}^{\rm GI}_i +_{(N+1)} {\bf b},$$

where ${\bf W}^{\rm PE}_n\in\mathbb{R}^{D\times D}$ and ${\bf W}^{\rm GI}_i\in\mathbb{R}^{D\times D}$ are the matrix weights corresponding to $w^{\rm PE}_n$ and $w^{\rm GI}_{\mathbb{P}_i}$, and ${\bf b}\in\mathbb{R}^{D}$ is the bias of this layer. The operation '$\times_{(N+1)}$' denotes multiplication along the $(N+1)$-th dimension of the tensor (i.e., the last dimension) [1]. Similarly, the bias is added along the last dimension, which does not affect the equivariance. By training the linear layer ${\rm FC}_2$, the weights $w^{\rm PE}_n$ and $w^{\rm GI}_{\mathbb{P}_i}$ are learned.

[1] Kolda T G, Bader B W. Tensor decompositions and applications. SIAM Review, 2009, 51(3): 455-500.
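A toy numerical check of this equivalence (illustrative only; the branch count and sizes are arbitrary) shows that applying a single linear layer to the concatenated branches gives the same result as summing per-branch matrix products formed from the blocks of its weight:

```python
# One linear layer on the concatenation == sum of per-branch matrix products,
# with the weight blocks playing the roles of W^PE_n and W^GI_i.
import torch

D, n_branches = 4, 6                     # e.g. N + (2^N - 1) branches
fc2 = torch.nn.Linear(n_branches * D, D)
branches = [torch.randn(2, 3, D) for _ in range(n_branches)]  # arbitrary leading dims

concat_out = fc2(torch.cat(branches, dim=-1))

blocks = fc2.weight.split(D, dim=1)      # one (D, D) block per branch
sum_out = sum(b @ w.T for b, w in zip(branches, blocks)) + fc2.bias
print(torch.allclose(concat_out, sum_out, atol=1e-6))  # True
```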

Comment

$\bar{\mathsf{X}}_{\mathbb{P}}$ is obtained from pooling functions, which means it is aggregated from the information of each object. Its invariance to the objects implies that it contains the global information of all object features. In each layer of the network, $\bar{\mathsf{X}}_{\mathbb{P}}$ is connected to the output after being weighted by the learnable parameters $w^{\rm GI}$. This means that, after passing through this layer, the global information of the input is still preserved in the output of this layer. Similarly, by stacking layers, the global information of the input is preserved throughout the network.

Comment

Due to space limitations in the main text, we have added Appendix E titled "Experimental Setup" in the revised version, which includes the specific mathematical expressions for the evaluation metric of each experiment and the selection of the pooling operations used. For example, the evaluation metric $\bar{R}_{\rm M}$ in Table 2 is defined as the average of $R_{\rm M}$ across the seven SNR values $\{0, 5, 10, 15, 20, 25, 30\}$, where $R_{\rm M}$ is given by

$$R_{\rm M}=\sum_{k=1}^{K}\log_2\left(1+\frac{|{\bf h}_k^H{\bf w}_k|^2}{\sigma^2+\sum_{i\neq k}|{\bf h}^H_k{\bf w}_i|^2}\right),$$

where the definitions of the symbols can be found in Section D.2.
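A small sketch of how $R_{\rm M}$ can be computed for one channel realization (assumed matrix conventions, not the authors' code: the rows of `H_herm` are ${\bf h}_k^H$ and the columns of `W` are ${\bf w}_k$):

```python
# Sum rate for one realization, following the formula above.
import numpy as np

def sum_rate(H_herm, W, sigma2):
    G = np.abs(H_herm @ W) ** 2          # G[k, i] = |h_k^H w_i|^2
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    return np.log2(1.0 + signal / (sigma2 + interference)).sum()

rng = np.random.default_rng(0)
K, N = 4, 8
H_herm = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))
W = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
print(sum_rate(H_herm, W, sigma2=1.0))
```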

Comment

Reference [2] demonstrates that a general invariant function can be decomposed as $f({\bf x})=\rho\left({\rm sum}(\phi(x_1),\dots,\phi(x_K))\right)$, where $\phi$ and $\rho$ are suitable transformations. More generally, reference [1] reformulates this as $f({\bf x})=\rho\left({\rm pool}(\phi(x_1),\dots,\phi(x_K))\right)$. Furthermore, it was noted in [1] that if the transformation $\phi$ applied to ${\bf x}$ is a stack of permutation equivariant layers, the model remains permutation invariant, and it is easy to prove that $f({\bf x})={\rm pool}({\rm PEL}({\bf x}))$ is a permutation invariant function, where ${\rm PEL}(\cdot)$ is an equivariant layer. Based on this, it can be shown that a multi-dimensional equivariant function can be transformed into a multi-dimensional invariant function by pooling the output across multiple dimensions. Thus, we transform SMPEN into SMPIN in Section 3.3, expressed as ${\rm SMPIN}(\mathsf{X}) = {\rm Pool}_{\mathbb N}\left({\rm SMPEN}(\mathsf{X})\right)$, where ${\rm Pool}_{\mathbb N}$ denotes the pooling function operating on the dimensions in ${\mathbb N}$.

[1] Lee J, Lee Y, Kim J, et al. Set transformer: A framework for attention-based permutation-invariant neural networks. International Conference on Machine Learning. PMLR, 2019: 3744-3753.

[2] Zaheer M, Kottur S, Ravanbakhsh S, et al. Deep sets. Advances in Neural Information Processing Systems, 2017, 30.
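A minimal sketch of this construction (assumed tensor layout; the identity stands in for a trained SMPEN):

```python
# SMPIN(X) = Pool_N(SMPEN(X)): mean-pool the equivariant output over all
# equivariant dimensions to obtain a multi-dimensionally invariant output.
import torch

def smpin(smpen, x, equiv_dims=(1, 2)):
    """x: (batch, K1, K2, D); smpen preserves this shape and is equivariant."""
    return smpen(x).mean(dim=equiv_dims)        # (batch, D), invariant output

out = smpin(lambda t: t, torch.randn(5, 3, 4, 8))
print(out.shape)  # torch.Size([5, 8])
```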

Comment

We sincerely appreciate your thorough reviews and insightful comments.

In this work, pooling functions are used in two places: first, in Section 3.2, a pooling function is utilized to extract global information from the input of each layer; second, in Section 3.3, pooling functions are introduced across multiple dimensions to transform multi-dimensional equivariant functions into multi-dimensional invariant functions. Without loss of generality, both are set to be the mean function in the experiments, since the design of the pooling function is not the focus of this work.

We have supplemented this information in Section 4 and Appendix E.1.

Comment

We sincerely appreciate all the constructive comments. The main changes we have made are as follows:

  1. We have added Appendix E (page 17-18) titled "Experimental Setup", which includes the specific mathematical expressions for the evaluation metric, the selection of pooling functions, other network hyperparameters, etc.

  2. To enhance the practicality and complexity of the task, we incorporated real-world datasets and conducted experiments on the transmission design task discussed in Section 4.3 (page 9). The introduction to the datasets can be found in Appendix E.3 (page 17).

  3. We have added ablation experiments about the networks constructed in a sequential or parallel manner in Appendix F.1 (page 18).

  4. The running time comparison for the networks across different tasks has been added to Appendix F.2 (page 18). This result demonstrates that our proposed methods have runtimes comparable to the other methods.

  5. The discussion on higher-order permutation equivariant networks has been added to Section 2.1, specifically in the "Linear layers" part (page 3).

  6. The discussion on permutation equivariant graph neural networks for solving graph classification tasks has been added to Section 2.1, specifically in the "Applications" part (page 4).

AC Meta-Review

This paper proposes a serial multi-dimensional permutation equivariance framework for designing multi-dimensional equivariant mappings. While the reviewers found the results of the paper solid and useful, they also had some concerns about the novelty of the techniques and the scope of the numerical validation. Given that all the scores are borderline and that the claims/changes made by the authors during the rebuttal need to be evaluated more thoroughly, I regrettably cannot recommend acceptance of the paper at this point.

Why Not a Higher Score

There are concerns on the novelty of the techniques and the scope of the numerical validation.

Why Not a Lower Score

N/A

Final Decision

Reject