Learning Cortico-Muscular Dependence through Orthonormal Decomposition of Density Ratios
Abstract
Reviews and Discussion
The paper presents a novel approach called Functional Maximal Correlation Algorithm with Trace cost (FMCA-T) for estimating cortico-muscular dependence by leveraging orthonormal decomposition of density ratios. This method is designed to model the relationship between EEG (electroencephalography) and EMG (electromyography) signals, addressing the challenges of interpretability, scalability, and local temporal dependence in cortico-muscular connectivity. The key contributions include introducing a matrix trace cost optimization for improved stability and efficiency, demonstrating robustness against nonstationary noise and delays, and effectively capturing movement and subject information from EEG features for enhanced classification accuracy. The proposed method outperforms existing baselines, particularly in cross-subject scenarios, and provides insights into channel-level and temporal dependencies, reinforcing its potential applications in brain-computer interface development and neuromuscular disorder diagnostics.
Strengths
- Innovative Method: Introduces the Functional Maximal Correlation Algorithm with Trace cost (FMCA-T), providing a novel approach for estimating cortico-muscular dependence.
- Improved Stability and Efficiency: Utilizes matrix trace cost optimization, which is more stable and computationally efficient compared to traditional log-determinant cost methods.
- Enhanced Classification Accuracy: Effectively captures movement and subject information from EEG features, significantly improving classification accuracy, especially in cross-subject scenarios.
- Validation on Multiple Datasets: Validated using both simulated and real EEG-EMG datasets, confirming the method’s effectiveness and robustness.
- Open Data and Reproducibility: Offers open access to datasets and detailed implementation code, facilitating reproducibility and further research in the field.
Weaknesses
The provided baselines are relatively few; future work could expand on this.
Questions
- While the paper discusses the improved stability and efficiency of the FMCA-T method, can the authors provide more detail about the computational resources required for training and inference?
Limitations
1. Limited Dataset Size: The cross-subject classification performance drops, potentially due to the limited dataset size of only 25 participants. Larger and more diverse datasets may be needed to validate the method’s robustness comprehensively.
2. Generalization to Other Modalities: While the method shows promise for EEG and EMG signals, its generalizability to other types of biosignals or broader neural data modalities has not been extensively discussed.
We thank the reviewer for the comments. Please find below our replies to the concerns/questions.
1. Limited baselines.
We have added EEG-Conformer (available on GitHub) and Deep4 (from the braindecode repository on GitHub) in the attached pdf. All baselines were implemented following the official codes. Results indicate that FMCA-T outperforms the additional baselines on almost all tasks. Deep4 performs slightly better on cross-subject 11-movement classification task. We will update Table 1 in the revised manuscript.
2. Computational resources for training and inference.
Experiments on the simulated sinusoidal dataset and the additional simulated EEG and EMG dataset were conducted on an NVIDIA GeForce RTX 3090. Experiments on the experimental EEG and EMG dataset were conducted on an NVIDIA RTX A5000. All training and inference can be conducted on a single GPU.
3. Limited dataset size.
We agree with the reviewer that a larger, more diverse dataset is needed for better validation. Although no such dataset is available at present, we hope our method will encourage the collection of large-scale datasets, which we plan to explore further.
4. Generalization to other modalities.
We thank the reviewer for proposing the possible extension to broader neural data modalities. This is an interesting question for further investigation. We expect that our method can still be applied to learning the dependence between other modalities, e.g., between EMG and kinematics or between EEG and visual data. We will continue studying such scenarios in future work.
5. Ethics review.
The public EEG and EMG dataset was reviewed and approved by the Institutional Review Board at Korea University (1040548-KU-IRB-17-181-A-2).
Thank you for your thorough and detailed responses to my concerns. I appreciate the effort you have put into addressing the points raised. Based on your clarifications and the additional information provided, I am satisfied that my concerns have been addressed.
We thank the reviewer for the quick response and we are glad to see that our responses addressed the concerns raised by the reviewer. We will make sure to include all the new analysis in the revised manuscript.
The paper presents a new method to model the relationship between cortical and muscular oscillations using EEG and EMG recordings. Traditional methods like Cortico-Muscular Coherence (CMC) have limitations, so the authors propose using statistical dependence estimators to learn eigenvalues, eigenfunctions, and projection spaces. This approach improves interpretability, scalability, and local temporal dependence. Experimental results show that the method accurately classifies movements and subjects, highlighting specific EEG channel activations during movements, and demonstrates robustness against noise and delays, suggesting its potential for diagnosing neuromuscular disorders and developing brain-computer interfaces.
Strengths
- The paper combines statistical dependence estimators with neural network optimization techniques. This fusion of methodologies enhances the ability to capture high-level and contextual connectivity between cortical and muscular oscillations.
- The paper provides a detailed description of the proposed methodology, including the mathematical foundations, algorithmic implementation, and practical considerations. The inclusion of eigenvalues, eigenfunctions, and projection spaces adds depth to the analysis.
- The authors conduct comprehensive experiments to validate their method. The results demonstrate the method's robustness against nonstationary noise and random delays, confirming its reliability and practical applicability.
Weaknesses
- Mathematical and Algorithmic Complexity: The proposed method involves complex mathematical formulations and advanced statistical techniques that may be challenging for a broader audience to grasp. Simplifying some of the mathematical derivations or providing more intuitive explanations and visualizations could make the paper more accessible.
- Interpretation of Results: While the method highlights specific EEG channel activations during movements, the physiological and neuroscientific significance of these results could be further elaborated. Providing more detailed discussions on how these findings align with or differ from existing neuroscience research would enhance the interpretability and relevance of the results.
- Scalability: The scalability of the proposed method to larger datasets or longer signal durations is not thoroughly addressed. Discussing the computational complexity and providing benchmarks on how the method performs with varying data sizes would be valuable.
Questions
- The method identifies specific EEG channel activations during movements. Could you provide more detailed explanations or references to how these findings align with existing neuroscientific knowledge? What are the physiological implications of the identified activations, and how do they contribute to our understanding of cortico-muscular connectivity?
- The authors mention potential applications in diagnosing neuromuscular disorders and developing brain-computer interfaces. Can you provide concrete examples or case studies where your method has been or could be applied? What specific benefits or improvements does your method offer over existing approaches in these applications?
Limitations
Please see the weaknesses and questions above.
We thank the reviewer for the instructive comments. Please find below our replies to the concerns/questions.
1. Mathematical and algorithmic complexity. We thank the reviewer for the suggestion. We will add more explanations of our methodology to the supplementary.
2. Interpretation of results and physiological implications of the identified activations. In terms of interpretation, we found that the frontal central areas (FC) are most activated in the spatial-level dependence maps for most subjects. Since we used eigenfunctions that decompose the density ratio for classification, this implies that these FC areas contribute significantly to movement classification. It is reasonable for FC1 to show the most activation in the dependence map. A related previous study also found movement-related cortical potential changes on FC1-FC2 and C2 electrodes [1].
We have additionally added a frequency analysis using event-related desynchronization (ERD) in the beta band. In the general response, we addressed the differences and consistency between frequency topologies and our dependence ratio map.
3. Scalability. We appreciate the reviewer's suggestion. Given the limited time frame and the limited availability of paired EEG-EMG datasets, it is difficult to show scalability on new real-world datasets. Instead, we divided the dataset into smaller sub-datasets and ran experiments with increasing dataset sizes to demonstrate how our framework scales up. Fig. 4(a)-4(c) in the attached letter show that: (a) the convergence speed of training errors is almost the same, with smaller datasets converging slightly faster; (b) downstream classification accuracies steadily increase when using more data; (c) dependence decreases when using more data.
The decrease in dependence scores is the trend we expected: using more data implies greater uncertainty in the data.
The computational complexity was not particularly heavy, and we did not see it as an obstacle. We trained the model on an A5000 GPU and did not encounter significant difficulties.
4. Potential applications. Corticomuscular coupling is promising for quantitatively measuring movement disorders. This application has been validated in stroke populations [2, 3] and in Parkinson’s disease [4]. Therefore, the proposed dependence measure can be used as an effective and clinically relevant neural marker to evaluate movement performance of stroke patients and to indicate Parkinson’s disease pathology. Moreover, as demonstrated in the manuscript, EEG eigenfunctions show excellent performance in movement classification. Therefore, the proposed method also has great potential for improving brain-computer interfaces. We will add this discussion in the revised manuscript.
[1] Spring J N, Place N, Borrani F, et al. Movement-related cortical potential amplitude reduction after cycling exercise relates to the extent of neuromuscular fatigue[J]. Frontiers in human neuroscience, 2016, 10: 257.
[2] Chen X, Xie P, Zhang Y, et al. Abnormal functional corticomuscular coupling after stroke[J]. NeuroImage: Clinical, 2018, 19: 147-159.
[3] Liu J, Wang J, Tan G, et al. Correlation evaluation of functional corticomuscular coupling with abnormal muscle synergy after stroke[J]. IEEE Transactions on Biomedical Engineering, 2021, 68(11): 3261-3272.
[4] Zokaei N, Quinn A J, Hu M T, et al. Reduced cortico-muscular beta coupling in Parkinson’s disease predicts motor impairment[J]. Brain communications, 2021, 3(3): fcab179.
Thank you for your effort and response. I will keep my rating.
We appreciate the reviewer's response. The reviewer's concerns on the previous version of the manuscript mainly pertained to the importance of this work and its consistency with physiological evidence. We feel we have addressed these comments specifically, as we briefly summarize below. If the reviewer still has concerns on these aspects, we would be truly grateful if they could kindly specify any remaining concerns, allowing us the opportunity to address them thoroughly.
Below follows a summary of how we addressed the concerns.
In terms of importance, our study measures the dependence and extracts features between two biosignals (EEG from the brain and EMG from muscles) without using any labels, yet the extracted features still capture contextual information such as participant movement. This can be highly beneficial for large, unlabeled datasets in BCI.
In terms of disorder analysis, conditions such as Parkinson's and stroke can affect cortical activation, as patients pathologically experience difficulties in planning and executing movements. The coherence between EEG and EMG signals has been widely used as an identifier for these diseases, as they can induce alterations or disruptions in neuronal pathways, leading to abnormal coherence and dependence between the signals.
In terms of neuroscientific evidence, frontal central area (FC) activation is expected. In neurophysiology, the motor cortex (central areas C3, C4) is linked to motor control. The pre-motor cortex and frontal cortex (FC) are linked to the cognitive process and motor planning. Thus FC activation matches the expectations. Additionally, we included simulated data in a controlled setting, simulating C3/C4 activation during right/left hand movements. Our dependence maps match the ground truth.
The authors apply a novel but already existing (https://www.sciencedirect.com/science/article/pii/S0047259X2300074X, https://arxiv.org/pdf/2212.04631) approach based on the orthonormal decomposition of density to decipher the relationship between cortical brain activity and electromyographic signals during basic hand movements. The work is based on a publicly available dataset and the code is available. The unknown decomposition is modeled by a pair of neural networks concurrently processing EEG and EMG data in order to arrive at an internal representation for each modality. The internal representations are then aligned to minimize the rank of the joint covariance matrix. To guide the learning process, the authors propose a somewhat novel loss function equal to the negative trace of the canonical correlation matrix calculated using latent representations. The authors test their approach on the downstream tasks of classifying movement types in both within-subject and across-subject designs. They also apply the obtained representations to distinguish between participants based on their EEG data. The authors provide some interpretation of the obtained solution in the form of channel and temporal maps indicating the electrodes and time points that contribute most to the decoding.
Strengths
- The authors applied the novel but existing methodology of orthonormal density decomposition to an EEG+EMG dataset for the first time
- The authors introduced a novel loss function and showed that it provides better performance in the downstream task of EEG-based classification of movement types
- The authors used a multi-subject dataset
- The authors attempted to provide interpretation of the obtained decision rule
- The authors present detailed results of their experiments in the appendix
Weaknesses
- Several inaccuracies and lack of details in the mathematical expressions: (1.1) line 100, last expression: an additional p(z) is needed in the integral; (1.2) equation 3: do the eigenvalues need to be normalized? Does the sum exclude the first normalized eigenvalue?
- I would argue against the suggested novelty of the proposed loss function as it seems like the loss function that is closely associated with the Canonical correlation analysis (CCA) (equation 4). Generally speaking, the proposed approach boils down to the CCA in the latent variable space with latents computed by means of a CNN.
- The authors claim that they “..design a specialized network topology to generate features for individual channels and time intervals, ensuring that the internal layers of this network quantify channel-level and temporal-level features, similar to [22–24]” - however unlike for instance the EEGnet, the authors use non-linearities in the temporal network (prior to the spatial) which in my view prevents the straightforward interpretation of the obtained representations at least using simple correlational measures. See also Q.1 and 2.
- The authors did not validate their approach to interpreting the decision rule and obtaining spatial and temporal maps with simulated data. This needs to be done, and the simulated data should contain not only the neuronal sources coupled to the simulated EMG but also sources unrelated to the signal of interest (EMG). The authors then need to demonstrate that their method infers the proper spatial patterns corresponding to the task-relevant simulated neuronal sources. Ideally, the obtained maps should be ranked based on their importance for the overall decoding accuracy. If this is not possible within the review cycle, the authors should significantly reduce the proportion of the manuscript dedicated to the physiological plausibility of their solution and instead describe limitations related to the potentially non-physiological origin of the extracted features.
- It is disappointing that when interpreting the decision rule the authors did not provide information regarding the EEG frequency domain their network got tuned to during the training.
Questions
- Having significant experience in the domain of recording and analyzing electrophysiological data, I found the obtained maps very suspicious. While EEG electrode FC1 can indeed be implicated and be coherent with EMG, I would expect other electrodes such as C3 and C5 to make some significant contribution to the EEG-derived latents that are maximally aligned with EMG. Instead, in addition to FC1 we see the involvement of peripheral electrodes and the frontal electrodes. These electrodes often lose proper contact with the skin and become sensitive to physical movements due to capacitive effects, when slight body displacements during the actual movement cause significant fluctuations in the electrode-skin capacitance and modulate the signals registered by EEG. The analysis of the frequency response of the temporal layer (see W.5) may help to resolve this potential issue.
- In the dataset used by the authors, the reference channel was located on the midline between the FC1 and FC2 sensors. Such an arrangement often results in low variance of the signals located close to the reference. The spatial patterns that the authors demonstrate in Figures 5 and 11 show peaks around FC1 and FC2. The ground electrode was located at the edge of the EEG cap between F1 and F2, and the temporal maps show them as the next best electrodes after FC1. Could it be that, within certain normalization steps, the authors explicitly or implicitly divided the data by the channel variance or multiplied the data by a poorly conditioned and not properly regularized inverse covariance, so that the role of these electrodes got artificially inflated?
- The authors show spatial maps for several other selected subjects and clusters to illustrate across-subject reproducibility. What about the spatial maps corresponding to the other latent channels/clusters for SUB3?
- Why did the authors not follow the EEGNet architecture and instead decide to use non-linearities between the temporal and spatial processing blocks? Avoiding nonlinearities in the front end would improve interpretability and would help to make the presentation more convincing.
- How do the authors avoid the trivial training result, i.e. that the two networks will simply learn to generate similar EMG and EEG embeddings regardless of the input data?
Limitations
The authors partly addressed limitations but several items, see Weaknesses section, are left out.
We thank the reviewer for the detailed and insightful feedback.
1.1 Novelty over CCA. We acknowledge the reviewer's observation that the final cost is the trace of a normalized canonical correlation matrix between two multivariate neural networks.
However, we emphasize the link between cost optimization and the joint density ratio $p(x, y) / (p(x)p(y))$, which the reviewer might have misunderstood.
Vanilla CCA uses linear models to maximize a correlation coefficient. After KICA was introduced (i.e., Kernelized CCA), the problem became minimizing matrix costs using universal approximators, with costs including log determinant, Frobenius norm, and matrix trace. Our trace cost is closer to HSIC. We have thoroughly compared our method with KICA and HSIC as baselines, both of which use kernels while we use neural nets.
Only recently, studies proved that optimizing matrix costs is equivalent to factorizing joint density ratios, including FMCA and Gaussian universal features. An informal proof: if the outputs of the two networks form two orthonormal sets of functions, optimizing the norm of their cross-correlation matrix decomposes the joint density ratio and finds its eigenfunctions.
This paper further shows that all such costs, including log determinant, Frobenius norm, and trace, are fundamentally equivalent, differing only in the convex functions applied to eigenvalues. We consider this a major novelty.
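The equivalence claimed above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes the costs are evaluated on M = C Cᵀ, where C is a normalized cross-correlation matrix whose singular values lie below 1, and the exact cost definitions in the paper may differ. The function names `matrix_costs` and `eigenvalue_costs` are hypothetical.

```python
import numpy as np

def matrix_costs(C):
    """Trace, squared Frobenius, and negative log-det costs computed from the matrix M = C C^T."""
    M = C @ C.T
    k = M.shape[0]
    trace_cost = np.trace(M)
    frob_cost = np.sum(M ** 2)
    _, logdet = np.linalg.slogdet(np.eye(k) - M)
    return trace_cost, frob_cost, -logdet

def eigenvalue_costs(C):
    """The same three costs written as sums of convex functions of the eigenvalues of M."""
    lam = np.linalg.svd(C, compute_uv=False) ** 2  # eigenvalues of C C^T
    return np.sum(lam), np.sum(lam ** 2), -np.sum(np.log1p(-lam))

rng = np.random.default_rng(0)
C = 0.3 * rng.standard_normal((4, 4))
# Rescale if needed so that every singular value stays below 1 (assumed setting).
C /= max(1.0, np.linalg.norm(C, 2) / 0.9)

costs_from_matrix = matrix_costs(C)
costs_from_eigenvalues = eigenvalue_costs(C)
```

In this sketch the trace applies x, the Frobenius norm applies x², and the log-determinant applies -log(1 - x) to each eigenvalue; all three are convex on [0, 1).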
1.2 More discussions about novelties. To clarify the question on $z$: it is the variable that makes $X$ and $Y$ conditionally independent, such as movement type, not the network's latent features that approximate eigenfunctions. Lemma 1 explains why the eigenfunctions can be used for downstream tasks. Another novelty is analyzing dependence within the network for temporal and spatial activation.
1.3 Normalization. Eigenvalues of the density ratio are well-regularized and do not need normalization: (1) all eigenvalues are non-negative and bounded by 1; (2) the largest eigenvalue equals 1 regardless of dependence; (3) the variables are independent iff there exists only one non-zero eigenvalue. This is due to the implicit normalization that divides by the marginals $p(x)$ and $p(y)$, which corresponds to the normalization terms in the cost.
The normalization in the cost also enforces orthonormality constraints on the features, so the outputs cannot collapse to constants (the trivial solution).
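The effect of the normalization can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes uncentered autocorrelation matrices, a small ridge term `eps`, and a cost of the form -trace(R_f⁻¹ P_fg R_g⁻¹ P_fgᵀ). The name `normalized_trace_cost` is hypothetical.

```python
import numpy as np

def normalized_trace_cost(f, g, eps=1e-8):
    """Negative trace of R_f^{-1} P_fg R_g^{-1} P_fg^T for features f, g of shape (N, K)."""
    n = f.shape[0]
    R_f = f.T @ f / n + eps * np.eye(f.shape[1])  # autocorrelation of f (ridge for stability)
    R_g = g.T @ g / n + eps * np.eye(g.shape[1])  # autocorrelation of g
    P_fg = f.T @ g / n                            # cross-correlation between f and g
    A = np.linalg.solve(R_f, P_fg)                # R_f^{-1} P_fg
    B = np.linalg.solve(R_g, P_fg.T)              # R_g^{-1} P_fg^T
    return -np.trace(A @ B)

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = x + 0.5 * rng.standard_normal(2000)
f = np.column_stack([np.ones_like(x), x])  # toy "network outputs" for X
g = np.column_stack([np.ones_like(y), y])  # toy "network outputs" for Y

cost_dependent = normalized_trace_cost(f, g)
cost_rescaled = normalized_trace_cost(5.0 * f, 0.1 * g)  # rescaling features changes nothing
cost_constant = normalized_trace_cost(np.ones((2000, 2)), np.ones((2000, 2)))
```

In this toy setting the cost for dependent features is -(1 + ρ²), with ρ the Pearson correlation of x and y, while constant outputs only reach the trivial value of about -1. Collapsing to constants therefore cannot lower the cost, and rescaling the features leaves it unchanged.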
2.1 Simulated dataset. We have added experiments with a simulated dataset, shown in Fig. 3 in the general response.
We simulated EEG and EMG signals for left/right motor and sensory activations in 20 subjects using EEGSourceSim. Motor sources were used to simulate the corresponding EMG signals. We trained the networks on paired EEG-EMG samples from 16 subjects' data as training samples, and visualized the spatial-level dependence maps for the remaining 4 subjects as test samples. Then we plotted ground truth brain activations calculated from motor ROI and forward matrices.
The spatial-level dependence is highly similar to ground truth activations, indicating the learned ratio captured real brain activations.
2.2 Dependence shows major activations around FC areas. We have added a frequency analysis using event-related desynchronization (ERD) in the beta band (Fig. 2) and complete spatial dependence maps of our method (Fig. 3), both for Subject 3.
The frequency topology shows that left-central areas (e.g., C3) are commonly activated across sessions and movements while frontal-central (FC) and central-parietal areas have unique patterns for each movement. In comparison, our dependence maps show the most activated areas around FC areas.
We argue that the difference in activation occurs because we are quantifying nonlinear dependence, not correlations. It has been shown experimentally that the eigenfunctions of this dependence yield high accuracies in downstream classification tasks, which means that the activation maps should show the areas that make each movement most distinguishable. Thus the frontal central and central parietal areas are activated in dependence maps. In frequency topologies, C3 is activated for all movements, so it cannot serve as an identifier for individual movements. This causes the differences between our maps and frequency maps. More details can be found in the general response.
The public dataset used has been validated before publication. No activation or deactivation was found on peripheral or frontal electrodes in the frequency analysis-based topographies.
2.3 Electrode arrangement and normalization steps. The electrode arrangement is reliable and has been used in commercial equipment. In EEG analysis, if the target electrode is near the reference, a weighted average filter will be employed to enhance the signal. Our study instead had no clear target before analysis. Therefore, we applied an average reference across electrodes, which is commonly used in EEG analysis. During preprocessing, all EEG channels were normalized to the first sample's maximum amplitude. Trials exceeding an absolute amplitude of 5 were discarded. No channel-wise normalization was used that could affect the variance.
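The preprocessing steps described above can be sketched as follows. This is one possible reading, not the authors' pipeline: it assumes trials are stored as an array of shape (n_trials, n_channels, n_times) and that "the first sample" refers to the first trial, whose maximum absolute amplitude sets a single global scale. The function name `preprocess_trials` is hypothetical.

```python
import numpy as np

def preprocess_trials(trials, max_abs=5.0):
    """Average-reference, globally rescale, and reject high-amplitude trials.

    trials: array of shape (n_trials, n_channels, n_times).
    Returns the kept trials and a boolean mask over the original trials.
    """
    # Common average reference: subtract the mean over channels at each time point.
    car = trials - trials.mean(axis=1, keepdims=True)
    # Single global scale taken from the first trial (assumed interpretation).
    scale = np.abs(car[0]).max()
    scaled = car / scale
    # Discard trials whose absolute amplitude exceeds the threshold.
    keep = np.abs(scaled).max(axis=(1, 2)) <= max_abs
    return scaled[keep], keep
```

Because one scale is shared by all channels, this step does not equalize per-channel variance, which is the property the response relies on.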
3.1 Choice of nonlinearity differs from EEGNet. In the EEGNet paper, the authors found that nonlinear activations did not improve performance, which is why they chose linear functions. The choice is therefore a matter of preference.
Although linearity may help prevent overfitting in cases of long signal durations, our experiments showed no significant overfitting. Using ReLU or Sigmoid may provide a more stable approximation when computing cross-layer ratios, as the ratios should be non-negative and bounded.
Thanks for the effort and explanation - I did not say you did linear CCA. The real data result is still meaningless. I will keep my rating as is.
We would like to thank the reviewer for the engagement in this dialogue. We hope that we have sufficiently addressed the concerns regarding our algorithm. We understand that the remaining concerns are related to the interpretation of dependence maps. We want to emphasize that since our results are based on nonlinear dependence analysis between modalities, EEG and EMG, we should expect that the localization differs from using frequency analysis for EEG alone or correlation analysis. Indeed, our results incorporate two sources of information by estimating the joint density ratio between the two: if there is coupling, we should observe a better localization. We have made efforts, including frequency analysis, to ensure that this difference is not caused by electrode movements or normalization. Thus we are confident that the observed activation of premotor areas is caused by the EEG-EMG dependence.
We thank the reviewer for the thoughtful analysis and look forward to pursuing this line of thought to further validate our hypothesis.
Thanks for the efforts. The paper is clearer now; however, it was not the lack of clarity that affected my score. The fact that the simulated data analysis gives the expected result is reassuring; however, such simulations are a must before a method is applied to real data. In your future work, try to reproduce your result with EMG+EEG data RECORDED with an ear reference. Also, please remember: in non-invasive EEG with at least 19 channels, it is NEVER a single electrode that is important. Also, keep in mind the difference between the filter weights and the patterns (source topographies) (Haufe et al., 2014). It is also NEVER from the brain if a single electrode remains persistently active over a long time interval. Even the alpha rhythmic activity when the eyes are closed comes in bursts! I will keep my score intact, but thanks for the great fight and I hope your work is accepted!
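The distinction between filter weights and patterns can be made concrete with a small sketch. It uses the relation from Haufe et al. (2014) for a single linear filter, pattern = Cov(X) w / Var(wᵀX), on a toy two-channel example. The data and the name `filter_to_pattern` are illustrative assumptions, and the relation applies to linear filters only, not directly to the nonlinear networks discussed here.

```python
import numpy as np

def filter_to_pattern(X, w):
    """Activation pattern of a linear spatial filter w for data X of shape (n_samples, n_channels)."""
    X = X - X.mean(axis=0)
    s_hat = X @ w                          # extracted component
    cov_X = X.T @ X / (X.shape[0] - 1)     # channel covariance
    return cov_X @ w / s_hat.var(ddof=1)   # Cov(X) w / Var(s_hat)

rng = np.random.default_rng(0)
n = 20000
s = rng.standard_normal(n)        # source of interest
d = 2.0 * rng.standard_normal(n)  # distractor unrelated to the source
X = np.column_stack([s + d, d])   # channel 0 sees source + distractor, channel 1 only the distractor
w = np.array([1.0, -1.0])         # filter that cancels the distractor and recovers s
pattern = filter_to_pattern(X, w)
```

The filter puts a large weight on channel 1 even though that channel contains no source activity, whereas the pattern is close to [1, 0]. A map built from weights can thus highlight electrodes that only help cancel noise.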
This paper introduces a novel approach to analyzing cortico-muscular connectivity using statistical dependence measures based on density ratio decomposition. The authors apply a method called Functional Maximal Correlation Algorithm with Trace cost (FMCA-T) to paired EEG and EMG recordings. The key idea is to decompose the density ratio between EEG and EMG signals into eigenvalues and eigenfunctions, which can capture important contextual information that affects the EEG-EMG dependency such as type of movement or subject without having them labeled. They also use the learned eigenfunctions as feature projectors and train a classifier on top for movement type classification tasks.
The authors test their approach on simulated data (SinWav) and a real EEG-EMG dataset with 25 subjects performing 11 different upper limb movements. They compare FMCA-T against several baseline methods for dependence estimation and classification. They find that the learned eigenfunctions capture factors such as movement type and subject identity. Further, FMCA-T outperforms the baselines, for example by 10% for cross-subject classification of arm-reaching, hand-grasping and wrist-twisting.
Strengths
- very sophisticated method with clear motivation
- original idea cleanly mathematically derived (as far as I can tell)
- produces good results
Weaknesses
- classification baselines could be stronger, e.g. by also using EEG Conformer and Deep4
- text is very dense at times; I definitely found some parts hard to read, but I am not sure how much easier it can be made. Possibly you could explain some concepts used in 2.2 in more detail in the supplementary
Questions
"In scenarios where X and Y are statistically independent, all eigenvalues are zero" > doesn't this make the density ratio 0 then? shouldn't the density ratio be 1 if they are independent?
Limitations
Authors could discuss a bit more under what conditions the assumption of conditional independence may be problematic and when it is fine for EEG/EMG. In terms of what one may expect to see in analyses as performed here under different conditions. Of course, evaluation on further datasets would also be helpful for this.
We thank the reviewer for the comments.
1. Classification baselines could be stronger. We have now added EEG-Conformer (available on GitHub) and Deep4 (from the braindecode repository on GitHub) as new baselines in the general response letter. The results show that FMCA-T surpasses the added baselines in almost all tasks, with Deep4 slightly ahead in cross-subject 11-movement classification.
2. Independence criterion with the eigenvalues. The reviewer correctly pointed out our mistake. A more precise argument would be: two variables are independent (i.e., $p(x, y) = p(x)p(y)$) if and only if the spectrum has a single positive eigenvalue, equal to $1$.
Define the linear operator induced by the density ratio, which maps functions of $x$ to functions of $y$. It can be confirmed that applying it to the constant function $1$ yields $1$, making the constant function an eigenfunction of the operator with eigenvalue $1$. Two variables are independent if and only if the only positive eigenvalue is this trivial one. All eigenvalues of the operator are non-negative and bounded by $1$. More than one positive eigenvalue implies dependence. We will correct the error.
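For discrete variables the corrected statement can be checked directly. The sketch below is illustrative and not taken from the paper: it computes the singular values of the matrix P(x, y) / sqrt(p(x) p(y)) for a joint probability table with positive marginals, which is an assumed discrete analogue of the density ratio spectrum. The function name `density_ratio_spectrum` is hypothetical.

```python
import numpy as np

def density_ratio_spectrum(P):
    """Singular values of P(x, y) / sqrt(p(x) p(y)) for a joint probability table P."""
    px = P.sum(axis=1)
    py = P.sum(axis=0)
    Q = P / np.sqrt(np.outer(px, py))
    return np.linalg.svd(Q, compute_uv=False)

p = np.array([0.2, 0.3, 0.5])

# Independent case: the joint table is the outer product of the marginals.
P_independent = np.outer(p, p)
spectrum_independent = density_ratio_spectrum(P_independent)

# Dependent case: mix the independent table with a perfectly coupled (diagonal) one.
P_dependent = 0.5 * np.outer(p, p) + 0.5 * np.diag(p)
spectrum_dependent = density_ratio_spectrum(P_dependent)
```

Under independence the spectrum is (1, 0, 0): the density ratio is identically 1 and only the trivial eigenvalue survives. In the dependent example the spectrum is (1, 0.5, 0.5), so additional positive values appear next to the trivial one.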
3. Applicability and limitations of conditional independence assumption.
a. We demonstrated the system's ability to capture discrete and easily distinguishable factors like movement types, participants, and sessions. But it struggles with finer-grained factors, such as sub-movements that include arm-reaching in six directions, hand-grasping three objects, and wrist-twisting with two motions.
One possible explanation is that non-invasive techniques like EEG face more challenges to extract finer movement information compared to invasive techniques. Consequently, the ratio becomes trivial, and finer information may not be easily extracted from the dependent components between EEG and EMG.
b. Future work is needed for scenarios with a continuous conditioning variable $z$. The expression we proposed in Lemma 1 applies when $z$ is discrete. It may not account for continuous variables such as the force of muscle contractions and kinematics (joint angle).
4. Density of texts. We appreciate the reviewer's feedback and will improve the formatting.
Thanks for your answers and thanks for the additional baselines. Keep in mind that with density I was not referring so much to the formatting as to the text itself. As said, I am also not sure how much this can be improved to be easier to read, but if you find some way, that would be great.
We thank the reviewer for their further replies. We will definitely try to reduce the density of the text in the revised manuscript, particularly in Section 2.2, as the reviewer mentioned. We have considered the following possible solutions: (1) moving reference materials (such as the log determinant) to the supplementary, which will give us more space for discussing the new cost and density ratio factorization; (2) adding more explanations of the concepts, for example, the definition of the density ratio and why it is positive definite such that its factorization exists, and making a side-by-side comparison with the properties of coherence analysis; (3) we have included pseudocode in the supplementary, which we hope will provide more clarification, and we can explain the motivation behind each step in Section 2.2 from an implementation perspective; (4) we will check the other sections and add more explanations if needed.
We appreciate the clarification and insights. We understand the reviewer's concern, as Sections 2.2 and 2.3 are our main proposals. We are confident that we can reduce the density of the text in the revised version. We once again thank the reviewer for the engagement in the discussion.
We appreciate the reviewers' feedback. The following responses address their shared concerns.
We have attached a letter containing the additional results, including the requested classification baselines, a frequency analysis of brain activations, full maps for Subject 3, maps for simulated EEG-EMG data, and model scalability with dataset size.
1. Additional classification baselines. We have added EEG-Conformer and Deep4 results as baselines. FMCA-T surpasses these supervised methods in most tasks; the exception is cross-subject 11-movement classification, where Deep4 has a slight advantage. This is shown in Table 1 of the attached letter.
2. Additional frequency analysis. To address the reviewer's concern regarding frequency topologies, we have performed a frequency analysis of brain activations using event-related desynchronization (ERD) in the beta band.
Fig. 1 shows results for Subject 3 across all movement types and sessions. We observed that the left central area is commonly activated across all movement types, frontal-central (FC) and central-parietal areas have unique patterns for each movement, and no apparent activation or deactivation was found in peripheral or frontal electrodes.
This is consistent with conventional EEG analysis and validates the dataset.
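As a rough illustration of the ERD measure mentioned above, the sketch below computes the classical percentage change in beta-band power between a baseline window and a task window. It is a minimal example on synthetic data, not the pipeline used for the attached letter; the function names, the 13-30 Hz band limits, and the sampling rate are assumptions.

```python
import numpy as np

def band_power(x, fs, band):
    """Mean periodogram power of a 1-D signal inside [band[0], band[1]] Hz."""
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / x.size
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

def erd_percent(baseline, task, fs, band=(13.0, 30.0)):
    """ERD% = 100 * (A - R) / R, with R the baseline and A the task band power.

    Negative values indicate desynchronization (a power decrease).
    The beta band limits are an assumed convention.
    """
    r = band_power(baseline, fs, band)
    a = band_power(task, fs, band)
    return 100.0 * (a - r) / r

# Synthetic single-channel example (hypothetical data): a 20 Hz rhythm
# whose amplitude halves during the task window.
rng = np.random.default_rng(0)
fs = 250.0                                   # assumed sampling rate in Hz
t = np.arange(0.0, 2.0, 1.0 / fs)
baseline = 2.0 * np.sin(2 * np.pi * 20 * t) + 0.1 * rng.standard_normal(t.size)
task = 1.0 * np.sin(2 * np.pi * 20 * t) + 0.1 * rng.standard_normal(t.size)

erd = erd_percent(baseline, task, fs)
print(f"beta-band ERD: {erd:.1f}%")
```

Halving the amplitude quarters the power, so the value printed here is close to -75%. Applying the same computation per electrode would give a topography of the kind described for Fig. 1.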
3. Dependence map interpretation. We have added dependence maps for all nine clusters from Subject 3 in Fig. 2, as requested by the reviewer. FC areas are consistently activated in most maps, matching our main paper.
To explain why the activated areas (FC) may differ from frequency topologies (left central), note that we are measuring nonlinear dependence, not correlations. Eigenfunctions for this dependence have been proven to yield high classification accuracies in downstream classification tasks, so we expect the activated areas in the dependence map to contribute significantly to classifying and identifying movements and participants, thus leading to FC activation.
Frequency topologies potentially support this argument. Since C3 is activated across all movement types, it may not be an effective identifier for distinguishing individual movements. Instead, FC has unique activation patterns for each movement.
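The distinction between nonlinear dependence and correlation drawn in this point can be seen in a toy example. The sketch below is not FMCA-T: it uses a hand-picked nonlinear feature instead of learned eigenfunctions, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pair of variables: y depends on x only through x**2.
x = rng.uniform(-1.0, 1.0, 20000)
y = x ** 2 + 0.01 * rng.standard_normal(x.size)

# Pearson correlation of the raw variables is near zero because the
# relationship is symmetric around x = 0.
linear_corr = np.corrcoef(x, y)[0, 1]

# After a nonlinear feature map of x (here chosen by hand as x**2),
# the correlation is close to one, revealing the dependence.
nonlinear_corr = np.corrcoef(x ** 2, y)[0, 1]

print(f"corr(x, y)    = {linear_corr:.3f}")
print(f"corr(x**2, y) = {nonlinear_corr:.3f}")
```

A map built from linear correlation would miss this relationship entirely, while one built from nonlinear features would highlight it, which is one way a dependence map and a frequency topography can emphasize different areas.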
4. Additional results on simulated EEG-EMG dataset. We have added results from a simulated EEG-EMG dataset to validate the spatial-level dependence map. We simulated EEG and EMG signals for left/right motor and sensory activations in 20 subjects using EEGSourceSim. Motor sources were used to simulate the corresponding EMG signals. As shown in Fig. 3 in the letter, the spatial-level density ratio learned from the dataset shows similar activation patterns to the ground truth motor activations computed from motor ROI and forward matrices.
Perhaps the best paper of my lot and the only one for which I gave some consideration to recommending an oral. After some consideration, though, I have favoured recommending a poster, because heavy mathematical content is usually better communicated over a poster than in an oral. The range of ratings was only 2 points, suggesting good agreement between reviewers. The rounded average score was 6 (accept). Based on the reviews, a major strength is the good practical results (R-MC3m and R-fPCn) attributable to the novel cost function (R-MC3m), as well as the method's practical applicability (R-J9kE). Two of the reviewers timidly mentioned mathematical complexity/sophistication (R-J9kE and R-fPCn), and it is certainly a maths-heavy draft, to the point that even to a person with a mathematical inclination like me, it feels abusive at times. I truly believe clarity could have been improved by obscuring the message less with excessive maths. Because the reviews gave more praise than requests for further clarification, the rebuttal was relatively succinct. It was pointed out to the authors that the IRB approval cannot be hidden (the authors did disclose it in the rebuttal, though), but otherwise no further ethics issues were highlighted.