PaperHub
Rating: 6.4/10 · Poster · 4 reviewers
Reviewer ratings: 4, 4, 4, 4 (min 4, max 4, std 0.0)
Confidence: 3.3
Novelty 3.0 · Quality 3.0 · Clarity 2.8 · Significance 2.8
NeurIPS 2025

Quantum Visual Fields with Neural Amplitude Encoding

Submitted: 2025-05-12 · Updated: 2025-10-29

Keywords

quantum machine learning, neural fields

Reviews and Discussion

Review
Rating: 4

This work proposes quantum visual fields, an end-to-end trainable quantum circuit framework for learning implicit representations in computer graphics and visual-data-driven machine learning tasks. It claims that the proposed quantum visual fields consistently outperform multiple baseline methods across various tasks.

Strengths and Weaknesses

Strengths

  • The paper is well organized and well written, and it clearly presents the core ideas.
  • Extensive numerical experiments are provided.

Weaknesses

  • To demonstrate the practicality of the proposed method, experiments under noise are necessary.
  • It would be better to provide a basic introduction to visual fields for readers who are not familiar with the field.
  • The font size of some figure labels is too small, making them hard to read.
  • It would be better to give a rigorous analysis to support the claims.

Questions

  1. Is there any theoretical analysis, such as a theorem, that explains why the proposed method achieves better performance than the classical approach? That is, can the gap between the proposed model and classical methods be analyzed rigorously?
  2. How can the initial Gibbs state be prepared in practice? I mean beyond numerical simulation: as you may know, the dimension (Pi in Algorithm 1 of the appendix) grows exponentially with the system size, so if the process is only simulated classically, it may face stability issues.
  3. Line 221 mentions that an m-dimensional latent representation is needed. How should the specific m be chosen?

Limitations

yes

Final Justification

Based on the rebuttal, the authors have largely addressed my concerns.

Formatting Concerns

There are no major formatting issues in this paper.

Author Response

We sincerely thank the reviewers for their thorough evaluation, constructive feedback, and insightful comments, which will substantially enhance the clarity and quality of our work. We are particularly encouraged by the positive reception of our novel contributions, as highlighted by Reviewer DtQp’s acknowledgment of our "novel framework with theoretical rigor that advances applications." and Reviewer WDGf’s appreciation for our "innovative approach with strong empirical results that addresses scalability."

We would like to respectfully draw the reviewing committee's attention to the fact that Reviewer 8wGH’s comments appear to pertain to a different manuscript. We have therefore asked for further clarification. Below, we address the concerns raised by Reviewer 4LWU in detail:

Q1: to demonstrate the practicality of the proposed method, the related experiments on noise cases are necessary.

We acknowledge this point and agree that evaluating quantum machine learning methods under noise from imperfect hardware is necessary for practical tasks such as visual field representation. As an assessment of QVF’s performance under noise is also requested by Reviewer DtQp in Q4, we kindly ask the reviewer to refer to that reply, where we provide a comprehensive examination.

Q2: 1) it would be better to provide a basic introduction to visual fields for readers who are not familiar with the field; 2) the font size of some figure labels is too small, making them hard to read.

This is extremely helpful for improving our work and its readability for the general audience. In response to the raised concerns, we will expand the introduction and include a detailed explanation of general visual fields and carefully check to ensure all font sizes meet the standard in the revision.

Q3: is there any theoretical analysis, such as a theorem, that explains why the proposed method can achieve better performance than the classical approach? i.e., a rigorous analysis of the gap between the proposed model and classical methods.

We believe this question partially overlaps with Q3 from Reviewer DtQp, which asks about the motivation for using quantum methods (prior works such as QIREN, and our QVF) over classical methods. We therefore kindly direct the reviewer to that response and the references listed there.

Here we briefly summarize it in terms of algorithmic performance:

1) Spectral learning bias: high-fidelity visual field representation greatly benefits from unbiased spectral learning. Classical methods favor learning low-frequency signals, i.e., their spectral learning is biased. On the theoretical level, spectral learning and incorporating Fourier features have been shown to be effective in relieving this problem.

2) Computational separation between our quantum method and classical methods: QVF leverages quantum machine learning, which inherently performs unbiased spectral learning. Additionally, due to unique properties of quantum processing such as superposition, the expressiveness of such spectral learning grows exponentially w.r.t. quantum resources, which makes it inefficient and nearly impossible to simulate classically on a large scale, establishing a provable exponential separation between the proposed and classical methods.

In summary, theoretically our method can enhance visual field representation by leveraging inherent unbiased spectral learning properties with only logarithmic overhead. This is also verified empirically: QVF performs unbiased spectral learning significantly better; see Fig. 5 for an intuitive visualization.

Q4: how to prepare the initial Gibbs state practically? I mean instead of numerical simulation, as you may know, the dimension (Pi in appendix algorithm 1) will grow exponentially with the system size, if it only classically simulates such process, it may face the stability issues.

Stability in classical simulations: in our classical simulation, we used tricks such as log-sum-exp when preparing Gibbs states to ensure numerical stability. This will be clarified in the revision.
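
The log-sum-exp trick mentioned above can be sketched as follows. This is a minimal illustration, not the authors' Algorithm 1: it assumes the Gibbs distribution takes the standard form p_i = exp(-beta * E_i) / Z and that the encoded amplitudes are the square roots of these probabilities. The names `gibbs_amplitudes`, `energies`, and `beta` are hypothetical.

```python
import numpy as np

def gibbs_amplitudes(energies, beta=1.0):
    """Amplitudes sqrt(p_i) with p_i = exp(-beta * E_i) / Z, computed in log space.

    Assumption: standard Gibbs form; the paper's exact construction may differ.
    """
    logits = -beta * np.asarray(energies, dtype=np.float64)
    # log-sum-exp: shift by the maximum so the largest exponent is exp(0) = 1
    shift = np.max(logits)
    log_z = shift + np.log(np.sum(np.exp(logits - shift)))
    log_p = logits - log_z
    # sqrt(p) = exp(0.5 * log p), still evaluated from the stable log-probabilities
    return np.exp(0.5 * log_p)

# Energies this large make a naive exp(-E) underflow to 0, giving 0/0 = nan.
energies = np.array([1000.0, 1001.0, 1002.0, 1003.0])
amps = gibbs_amplitudes(energies)
```

Because only energy differences enter after the shift, the result is a valid unit-norm amplitude vector even when every individual Boltzmann factor would underflow in double precision.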

Preparation of Gibbs states on a quantum computer: prior works have discussed ways to prepare Gibbs states on a quantum computer, for example via thermalization protocols; see the works that we cite in L166-L169.

Our future plans for reducing the exponentially growing dimension: We are also exploring more scalable ways to prepare Gibbs states. For example, we plan to leverage tensor train decomposition with matrix product operator representations [1,2], which can potentially reduce the computational scaling to polynomial complexity while maintaining sufficient accuracy. We are also exploring the possibility of preparing Gibbs states with approximate amplitude encoding [3] as an alternative approach. The detailed development and validation will be pursued in future research, as it aligns with broader efforts to develop scalable Gibbs state preparation protocols.

[1] Green et al. Quantum encoding of structured data with matrix product states. 2025.

[2] Melnikov et al. Quantum state preparation using tensor networks. 2023.

[3] Ben-dov et al. Approximate encoding of quantum states using shallow circuits. 2024.

Q5: in line 221, it mentioned that we need m-dimension latent representation, how to choose the specific m?

The dimension m is determined by the intrinsic dimensionality of the target output, for example m = 3 for RGB images or m = 1 for scalar-valued signed distance fields (SDFs). This ensures alignment between the number of measured qubits (Eq. 6) and the required output dimensions while maintaining an injective mapping. The constraint n ≥ m (where n is the total qubit count) guarantees physically meaningful representations. This helps avoid unnecessary computational overhead due to either redundant measurement or classical post-processing. This will be further clarified in the revision.

Comment

Thanks for the response, I have no further questions.

Review
Rating: 4

This paper proposes Quantum Visual Fields, a quantum-inspired approach for learning visual representations using parameterized quantum circuits. The method avoids classical post-processing by combining Hamiltonian-driven dynamics with unitary evolution and projective measurements. The authors show its effectiveness on visual tasks involving 2D and 3D data, outperforming several existing quantum circuit baselines.

Strengths and Weaknesses

Strengths:

The paper proposes an innovative quantum approach (QVF) that avoids classical post-processing, which is conceptually novel and well-motivated.

The method effectively addresses scalability issues faced by existing quantum implicit neural representations, a key limitation in the current literature.

Empirical results are strong, showing superior performance over baselines across multiple visual representation tasks.

Weaknesses:

The overall presentation is somewhat compact, and some parts are not clearly explained. For example, the description of Equation (1) is somewhat confusing: some symbols used in the formula are not explained, and conversely, some symbols appearing in the subsequent explanation do not appear in the equation itself.

The number of parameters is a critical factor in implicit neural representation tasks, as increased model capacity often leads to better performance. However, the paper does not provide a comparison of parameter counts between the proposed method and the baselines, which makes it difficult to assess the efficiency and fairness of the evaluation.

Questions

I am curious why the authors categorize their method as “quantum-inspired” rather than “quantum machine learning,” given that the approach still relies on quantum simulators for experiments. If it is indeed a quantum-inspired method, would it be appropriate to include comparisons with classical algorithms such as SIREN to more comprehensively evaluate its advantages and limitations?

As far as I understand, data re-uploading circuits can be expressed as Fourier series expansions. Why is it that the authors use only amplitude encoding yet still achieve a Fourier series representation? Does this imply that quantum circuit-based machine learning models are inherently Fourier-based? Furthermore, what is the difference between the expressivity of the proposed model and the Fourier expressivity of data re-uploading circuits?

Limitations

See the Questions and weaknesses.

Final Justification

The authors' response addressed my concerns regarding quality and clarity, so I raised the corresponding scores.

Formatting Concerns

N/A

Author Response

We sincerely thank the reviewers for their thorough evaluation, constructive feedback, and insightful comments, which will substantially enhance the clarity and quality of our work. We are particularly encouraged by the positive reception of our novel contributions, as highlighted by Reviewer DtQp’s "novel framework with theoretical rigor that advances applications." and Reviewer 4LWU’s "well-organized, end-to-end quantum framework for learning INRs for visual data."

We would like to respectfully draw the reviewing committee's attention to the fact that Reviewer 8wGH’s comments appear to pertain to a different manuscript. We have therefore asked for further clarification. Below, we address the concerns raised by Reviewer WDGf in detail:

(Q1) The overall presentation is somewhat compact, and some parts are not clearly explained. For example, the description of Equation (1) is somewhat confusing: some symbols used in the formula are not explained, and conversely, some symbols appearing in the subsequent explanation do not appear in the equation itself.

We will address the issues mentioned and improve the presentation quality, with proper formatting and clearer captions, in the revised version so that it strictly follows the standard. For the mentioned instance of Eq. 1, we give a clearer explanation here:

γ(·) refers to the positional encoding function; L is the maximum frequency level (also known as the bandwidth) in γ(·); Θ is the field query coordinate.
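
To make the roles of γ(·), L, and Θ concrete, here is a minimal sketch of a positional encoding. Eq. 1 of the paper is not reproduced on this page, so the sketch assumes the common NeRF-style form with frequencies 2^k · π for k = 0, ..., L-1; the function name `positional_encoding` and the sin-then-cos output ordering are illustrative choices, not the authors' definition.

```python
import numpy as np

def positional_encoding(theta, num_levels):
    """Map coordinates theta (shape (..., d)) to sin/cos features (shape (..., 2*d*L)).

    Assumption: NeRF-style frequencies 2^k * pi, k = 0..L-1, with L = num_levels.
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=np.float64))
    freqs = (2.0 ** np.arange(num_levels)) * np.pi      # shape (L,)
    angles = theta[..., None] * freqs                    # shape (..., d, L)
    # For each coordinate: L sine features followed by L cosine features
    features = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return features.reshape(*theta.shape[:-1], -1)

# A 2D query coordinate encoded with L = 4 frequency levels gives 2 * 2 * 4 = 16 features.
encoded = positional_encoding([0.5, 0.25], num_levels=4)
```

Raising L widens the band of frequencies available to the downstream model, which is why it is also described as the bandwidth.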

(Q2) The number of parameters is a critical factor in implicit neural representation tasks, as increased model capacity often leads to better performance. However, the paper does not provide a comparison of parameter counts between the proposed method and the baselines, which makes it difficult to assess the efficiency and fairness of the evaluation.

We agree with the reviewer regarding the parameter efficiency and provide the following clarifications here:

Quantum baseline (QIREN): We provide the parameter comparison with QIREN in the reply to question Q2 from Reviewer DtQp.

Classical baseline: The only parametrization difference between QVF and the classical baselines comes from integrating the quantum circuit; all other configurations remain the same, as mentioned in Remark 3 of the paper. Across our representation learning experiments, we adopt a five-qubit quantum circuit, which adds 170 parameters. The parametrizations of QVF and the classical baselines therefore differ by only 170 parameters, which we consider negligible compared to the classical baseline's parametrization on the order of 1e5. We believe this configuration makes for a fair comparison of performance and efficiency between QVF and the classical baselines. We also scaled up the latent space dimension of the classical method and found that it needs approx. 1.12e5 parameters to reach performance comparable to our QVF with 0.52e5 parameters. This shows the effectiveness of our QVF design. The details will be clarified and incorporated in the revision.

(Q3) I am curious why the authors categorize their method as “quantum-inspired” rather than “quantum machine learning,” given that the approach still relies on quantum simulators for experiments. If it is indeed a quantum-inspired method, would it be appropriate to include comparisons with classical algorithms such as SIREN to more comprehensively evaluate its advantages and limitations?

Quantum vs. Quantum-inspired: We appreciate the reviewer’s thoughtful question regarding classification of QVF. Indeed, our work QVF, and previous work such as QIREN, 3D-QAE, are quantum machine learning methods and, therefore, our work should be better positioned as “quantum” or “quantum machine learning” instead of “quantum-inspired”. We will adjust the manuscript accordingly.

Comparison with SIREN: QVF is a quantum machine learning method, and QIREN, our closest related work and main competitor, already outperforms SIREN in its own paper and shows promising performance. Nevertheless, we also compared with SIREN for completeness. SIREN follows the dense multi-layer neural network structure but replaces traditional ReLU activations with periodic sinusoidal activation functions and a corresponding initialization strategy. We reported the comparison with SIREN, denoted by MLP + sin in Tab. 3, and showed that QVF outperforms SIREN as well. The comparison is strictly controlled: the parametrization is consistent with the other classical baselines, with an overhead of only 170 parameters, which we believe makes the comparison convincing. We will clarify this and expand the discussion of classical comparisons in the revised manuscript.
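
For readers unfamiliar with SIREN, the following is a minimal NumPy sketch of a single sinusoidal layer with the initialization scheme described in the SIREN paper (uniform weights scaled by the fan-in, with frequency factor ω0 = 30). It is an illustration only and is not the MLP + sin baseline configuration from Tab. 3; the zero bias initialization and the helper names `init_siren_layer` and `siren_layer` are simplifying assumptions.

```python
import numpy as np

def init_siren_layer(rng, fan_in, fan_out, omega_0=30.0, is_first=False):
    """Draw SIREN-style weights: U(-1/n, 1/n) for the first layer,
    U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0) for later layers (n = fan_in)."""
    bound = 1.0 / fan_in if is_first else np.sqrt(6.0 / fan_in) / omega_0
    weight = rng.uniform(-bound, bound, size=(fan_in, fan_out))
    bias = np.zeros(fan_out)  # assumption: zero bias, chosen here for simplicity
    return weight, bias

def siren_layer(x, weight, bias, omega_0=30.0):
    """Dense layer followed by a periodic activation: sin(omega_0 * (x W + b))."""
    return np.sin(omega_0 * (x @ weight + bias))

rng = np.random.default_rng(0)
coords = rng.uniform(-1.0, 1.0, size=(5, 2))           # five 2D query coordinates
w1, b1 = init_siren_layer(rng, 2, 16, is_first=True)
w2, b2 = init_siren_layer(rng, 16, 16)
hidden = siren_layer(siren_layer(coords, w1, b1), w2, b2)
```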

(Q4) As far as I understand, data re-uploading circuits can be expressed as Fourier series expansions. Why is it that the authors use only amplitude encoding yet still achieve a Fourier series representation? Does this imply that quantum circuit-based machine learning models are inherently Fourier-based? Furthermore, what is the difference between the expressivity of the proposed model and the Fourier expressivity of data re-uploading circuits?

Fourier-based learning nature of quantum circuit: Quantum circuit-based machine learning models are indeed inherently Fourier-based; see L181-194. Specifically, the Fourier interpretability of quantum circuits arises from the unitary evolution of quantum states and the spectral decomposition of observables. The accessible frequency spectrum depends on the encoding of classical data.

Expressivity difference between our method and data-reuploading circuits: The key distinction lies in the adaptability of the spectrum. In data-reuploading circuits, the pattern of frequency spectrum remains fixed due to handcrafted gate sequences (e.g. repeated Pauli rotations) while in QVF, our learnable energy-based encoding dynamically adjusts the frequency spectrum that automatically captures data-dependent frequency priors, allowing the quantum model to emphasize relevant spectral components while avoiding spectral rigidity of re-uploading circuits; see analysis in L181-L194. Such theoretical considerations are also reflected in the experiments where our QVF outperforms QIREN with parametrization on a similar level.
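
The fixed-spectrum behaviour of a handcrafted rotation encoding can be checked numerically on a toy circuit. The sketch below is neither QVF nor QIREN: it is a hypothetical single-qubit model in which the input x enters through one RX(x) gate, sandwiched between two trainable RY rotations, with a Pauli-Z measurement. Changing the trainable angles changes the Fourier coefficients, but the set of available frequencies stays at {-1, 0, 1}.

```python
import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)

def rx(angle):
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(angle):
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def model(x, w_in, w_out):
    """Expectation value <Z> after RY(w_in), the encoding gate RX(x), and RY(w_out)."""
    state = np.array([1.0, 0.0], dtype=complex)
    state = ry(w_out) @ rx(x) @ ry(w_in) @ state
    return np.real(np.conj(state) @ Z @ state)

def fourier_coefficients(w_in, w_out, num_samples=16):
    """Discrete Fourier coefficients of x -> model(x) over one period [0, 2*pi)."""
    xs = np.linspace(0.0, 2 * np.pi, num_samples, endpoint=False)
    values = np.array([model(x, w_in, w_out) for x in xs])
    return np.fft.fft(values) / num_samples

coeffs_a = fourier_coefficients(0.4, 1.1)
coeffs_b = fourier_coefficients(1.3, 0.2)
# Entries 2..14 correspond to |frequency| >= 2 and vanish for both parameter settings.
```

A single encoding gate yields only the lowest frequencies; repeating the encoding, as in data re-uploading, enlarges the frequency set in a pattern fixed by the gate sequence rather than learned from data.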

Comment

Thank you for the authors' reply. I have improved the scores for quality and clarity.

Review
Rating: 4

This paper proposes a quantum-inspired multi-dimensional multi-modal fusion model to enhance EEG-based emotion recognition. The approach mimics quantum probability principles to capture higher-order correlations across spatial, temporal, and frequency dimensions from multi-modal EEG signals. Experimental results on benchmark datasets demonstrate superior performance compared to conventional deep learning fusion strategies.

Strengths and Weaknesses

+Introduces a quantum-inspired framework to model complex dependencies in EEG signals.

+Achieves better classification performance over state-of-the-art fusion techniques on standard datasets.

+Effectively combines multiple EEG modalities, enhancing robustness and generalizability.

-The quantum-inspired mechanisms lack biological or physiological interpretability for EEG practitioners.

-Higher-dimensional fusion may increase training time and require more computational resources.

-The method is tested on specific datasets, and its performance across different tasks or populations remains uncertain.

Questions

How does the quantum-inspired fusion mechanism improve over traditional tensor-based or attention-based fusion in terms of capturing non-linear relationships in EEG data, and can this approach generalize to other bio-signals like fNIRS or EMG?

Limitations

YES.

Formatting Concerns

None

Author Response

We sincerely appreciate the reviewer’s time and thoughtful comments. Upon careful consideration, we note that the summary and questions raised appear to pertain to a different work addressing EEG-based emotion recognition, whereas our paper introduces Quantum-inspired Visual Field Representation (QVF), a novel quantum machine learning framework designed to enhance visual field representations. Its key contributions include: (1) superior performance compared to our closest competitor QIREN while maintaining model compactness; (2) support for learning visual field collections and larger problem scales such as 3D geometries, for the first time in the domain; and (3) a rigorously designed ansatz that improves trainability via constrained real Hilbert space traversal. We would be delighted to provide further clarification on any aspect of QVF during the author-reviewer discussion phase and to improve the manuscript’s quality.

Comment

Dear Reviewer 8wGH,

We would like to briefly follow up regarding our submission, since the author-reviewer discussion phase nears its end. We will do our best to address your feedback in the remaining time.

Best regards

the Authors

Review
Rating: 4

This paper presents QVFs, a novel quantum-inspired framework for visual representation learning. The proposed approach employs an amplitude encoding scheme to effectively map classical data into quantum space. In contrast to conventional methods that rely on computationally intensive post-processing (typically through MLP layers), the proposed framework directly generates outputs through quantum circuit sampling. Comprehensive experiments are conducted across diverse 2D and 3D visual representation tasks, including image reconstruction and 3D geometry representation. The empirical results consistently demonstrate the superior performance and effectiveness of the proposed method.

Strengths and Weaknesses

Strengths:

  1. Novel Framework: This paper proposes a quantum-based learning framework that integrates (1) an innovative amplitude encoding method to bridge classical data and quantum space, and (2) parameterized quantum circuits (PQCs) with a lightweight architecture and superior representation capability compared to previous approaches.

  2. Advancing Applications: Alongside prior works (e.g., 3D-QAE and QIREN), the proposed QVFs framework is one of three studies to systematically explore quantum machine learning (QML) for visual implicit neural representations (INRs). The proposed work expands the application of QML in visual computing, offering new insights and opportunities for researchers in this field.

  3. Empirical Validation: Extensive experiments are conducted across diverse visual representation tasks, demonstrating the effectiveness and robustness of the proposed method.

  4. Theoretical Rigor: Detailed mathematical derivations and analyses are provided to ensure clarity and theoretical soundness.

Weaknesses:

The manuscript suffers from several significant issues that need to be addressed.

  1. The presentation quality is subpar, with poorly formatted figures and tables that are not properly centered and use font sizes too small to be legible. Many of the captions fail to adequately explain the content; for example, Fig. 5(a) displays two rows of figures without any explanation in either the caption or main text.

  2. The paper lacks essential ablation studies to validate its core claims about efficiency. While the authors criticize previous methods for relying on computationally heavy post-processing like MLPs in QIREN, they provide no quantitative comparison of parameters or computational requirements between their approach and existing methods. This omission substantially weakens the paper's central argument.

  3. The rationale for applying quantum methods to visual representation tasks remains unconvincing. Given the current capabilities of classical approaches using CNNs or MLPs with modern GPU hardware, which already perform well on standard datasets like CIFAR10, the paper fails to demonstrate what unique advantages quantum-based frameworks offer in this domain. The authors should provide either theoretical analysis or empirical evidence showing clear benefits of their quantum approach over classical methods under comparable conditions.

Questions

  1. As indicated in the paper, the proposed algorithm's performance may be compromised by noise from hardware. Taking Weakness 3 into consideration, this raises concerns regarding the practical viability of extending quantum-based frameworks to visual representation tasks. The authors should provide a more rigorous evaluation methodology to properly assess the framework's feasibility under real-world conditions.

  2. The current study lacks sufficient ablation studies examining computational efficiency. To substantiate the key claim of replacing heavy post-processing (e.g., MLP layers) with the proposed quantum circuit module, the authors need to provide analyses of computational complexity.

Limitations

yes

Final Justification

All my questions have been adequately addressed. I acknowledge the academic rigor of this paper, though my main concern remains with its presentation quality. As the authors have stated they will focus on improving the presentation, and given that only presentational modifications seem likely possible, I will maintain my current score.

Formatting Concerns

None of the figures and tables are formatted as centered.

Author Response

We sincerely thank the reviewers for their thorough evaluation, constructive feedback, and insightful comments, which will substantially enhance the clarity and quality of our work. We are also particularly encouraged by the positive reception of our contributions, as highlighted by Reviewer WDGf’s "innovative approach with strong empirical results that addresses scalability," and Reviewer 4LWU’s "well-organized, end-to-end quantum framework for learning INRs for visual data."

We would like to respectfully draw the reviewing committee's attention to the fact that Reviewer 8wGH’s comments appear to pertain to a different manuscript. We have therefore asked for further clarification. Below, we address the concerns raised by Reviewer DtQp in detail:

(Q1) The manuscript suffers from several significant issues that need to be addressed. The presentation quality is subpar, with poorly formatted figures and tables that are not properly centered and use font sizes too small to be legible. Many of the captions fail to adequately explain the content; for example, Fig. 5(a) displays two rows of figures without any explanation in either the caption or main text.

We will address the issues mentioned and improve the presentation quality, with proper formatting and clearer captions, in the revised version so that it strictly follows the standard.

For the instance of Fig. 5(a), the correct caption should be: “Qualitative learning process visualization for (top) our QVF and (bottom) classical model.”

(Q2) The paper lacks essential ablation studies to validate its core claims about efficiency. While the authors criticize previous methods for relying on computationally heavy post-processing like MLPs in QIREN, they provide no quantitative comparison of parameters or computational requirements between their approach and existing methods. This omission substantially weakens the paper's central argument.

We agree that such quantitative comparisons are essential and thank the reviewer for pointing this out. We provide further details on the mentioned architectural differences:

Parameter comparison: In our experiments, QVF has a total of 0.52e5 parameters vs. 0.74e5 in QIREN [1]. This difference mainly comes from the design of QVF, in which the post-processing MLP is not needed, as the reviewer noted.

Computational requirements:

(a) Execution time: we compare the time required by QVF and QIREN for 1) gradient calculation and trainable parameter update, and 2) inference, with the reported numbers averaged over 5 iterations. As QIREN only supports single-image representation learning, we use single CIFAR-10 images for this benchmark. QIREN takes 1.42s to train for a single epoch and 1.38s for inference, vs. 0.82s and 0.68s for our QVF, an almost 2x reduction in both cases.

(b) Quantum circuit optimization: Another factor we considered when designing QVF is the optimization of the quantum circuit. In QIREN, the post-processing with MLPs requires constructing an additional computational graph, which needs to be combined with the parameter shift rule for optimizing the quantum circuit. QVF does not have this issue, as it does not need further classical post-processing. This also contributes to the observed training efficiency, which can be more significant on future real hardware. These additional results will be included in the revision.

[1] Zhao et al. Quantum implicit neural representations. ICML, 2024.

(Q3) the rationale for applying quantum methods to visual representation tasks remains unconvincing. Given the current capabilities of classical approaches using CNNs or MLPs with modern GPU hardware - which already perform well on standard datasets like CIFAR10 - the paper fails to demonstrate what unique advantages quantum-based frameworks offer in this domain. The authors should provide either theoretical analysis or empirical evidence showing clear benefits of their quantum approach over classical methods under comparable conditions.

We included a brief theoretical justification of using quantum methods for visual field representation in the paper; see L28-34. We expand upon it and provide further theoretical motivations here:

Visual representation tasks require accurate capture of high-frequency details, while classical methods like CNNs or MLPs tend to prioritize learning low-frequency details. In the literature, learning with Fourier features has proven to be a promising approach for such tasks, given its ability to model fine-grained signal structures effectively [1,2,3,4]. It has recently been shown that quantum circuits exhibit a strong connection to Fourier processing, offering an exponential separation in computational efficiency compared to classical methods [5]. These reasons theoretically motivate our QVF, along with the previous work QIREN, and the use of quantum computers for exploration.

On the empirical side, the work of QIREN already shows promise compared to several classical methods. We, therefore, first show that QVF outperforms QIREN across the tasks supported and demonstrated in their work. We then show that QVF is capable of a broader set of tasks not supported by QIREN, and we leverage a consistent comparison strategy to demonstrate that QVF outperforms the commonly used classical methods in Tab. 3. Lastly, energy consumption presents another compelling motivation for executing quantum circuits on future quantum machines in general. Studies have shown that future quantum systems can operate with energy requirements several orders of magnitude lower than their classical counterparts for certain problems [6]. Given the escalating energy demands of modern AI systems, quantum approaches can provide a viable pathway toward sustainable AI development.

While the exploration of quantum methods for general AI remains in its nascent stages, with existing works such as QIREN and our work QVF, we believe visual representation tasks serve as a principled and empirically validated starting point with support from both theoretical insights and experimental results.

[1] Rahaman et al. On the spectral bias of neural networks. ICML, 2019.

[2] Mildenhall et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Communications of the ACM, 2021.

[3] Tancik et al. Fourier features let networks learn high frequency functions in low dimensional domains. NeurIPS, 2020.

[4] Benbarka et al. Seeing Implicit Neural Representations as Fourier Series. WACV, 2022.

[5] Schuld et al. The effect of data encoding on the expressive power of variational quantum machine learning models. Physical Review A, 2021.

[6] Meier et al. Energy-consumption advantage of quantum computation, PRX Energy, 2025.

(Q4) As indicated in the paper, the proposed algorithm's performance may be compromised by noise from hardware. Taking the weaknesses 3 into consideration, this raises concerns regarding the practical viability of extending quantum-based frameworks to visual representation tasks. The authors should provide a more rigorous evaluation methodology to properly assess the framework's feasibility under real-world conditions.

Under real-world conditions, the performance of our QVF, like that of any quantum algorithm executable on near-term hardware, can indeed be affected by noise, a consequence of the current developmental stage of quantum devices. Beyond our initial analysis of measurement noise provided in the paper (see L339-348), we further investigate the effects of quantum gate infidelity during real hardware execution, another dominant error channel in current devices. Gate operation infidelity arises from intrinsic control imperfections in quantum hardware, resulting in stochastic deviations of the implemented gate operations from their ideal targets. These imperfections constrain gate fidelity to finite precision, which can be effectively modeled as zero-mean perturbations to the gate parameters within a bounded range. To simulate the impact of such noise, we introduce zero-mean Gaussian perturbations with varying standard deviations: higher values correspond to the noise levels typical of current near-term quantum devices, while lower values reflect anticipated improvements in future hardware. We report the results of this analysis using the PSNR metric, alongside the corresponding noise-free baselines originally reported in Tab. 2, to quantify the degradation in performance under different noise regimes:

| Method | No noise | σ = 0.01 | σ = 0.05 | σ = 0.1 |
| --- | --- | --- | --- | --- |
| Ours (Gaussian) + ReLU | 30.06 ± 0.11 | 30.1 ± 0.13 | 28.42 ± 0.15 | 25.78 ± 0.23 |
| Ours (Identity) + ReLU | 30.02 ± 0.23 | 29.6 ± 0.15 | 27.98 ± 0.18 | 25.96 ± 0.18 |
| Ours (Gaussian) + Sin | 32.59 ± 0.21 | 32.4 ± 0.22 | 30.66 ± 0.16 | 27.94 ± 0.21 |
| Ours (Identity) + Sin | 32.67 ± 0.34 | 32.8 ± 0.26 | 30.34 ± 0.22 | 28.12 ± 0.23 |

Note that these results are averaged over 50 trials. It can be observed that under mild noise, the performance of QVF is barely influenced. With increasing perturbations, the representation fidelity of QVF decreases as expected. We emphasize that our QVF by design is inherently resistant to phase flip errors (see L211-213 for our quantum ansatz design), which introduce imaginary components, due to our engineered restrictions of information processing on real Hilbert subspace. These results will be later incorporated into the revised paper.
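
The evaluation protocol described in this reply (zero-mean Gaussian perturbations of gate parameters, PSNR averaged over repeated trials) can be sketched as follows. This is a hedged illustration of the procedure only: `toy_model` is a hypothetical stand-in for a parameterized circuit, the PSNR is measured against the noise-free output of that same toy model rather than a ground-truth image, and none of the numbers it produces correspond to the table above.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for signals with values in [0, max_value]."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(max_value ** 2 / mse)

def perturb_parameters(rng, params, sigma):
    """Model gate infidelity as zero-mean Gaussian noise on the rotation angles."""
    return params + rng.normal(0.0, sigma, size=params.shape)

def toy_model(params, xs):
    """Hypothetical stand-in for a parameterized circuit; outputs lie in [0, 1]."""
    return 0.5 + 0.5 * np.cos(np.outer(xs, params)).mean(axis=1)

def mean_psnr_under_noise(params, xs, sigma, trials=50, seed=0):
    """Average PSNR between the noise-free output and outputs with perturbed angles."""
    rng = np.random.default_rng(seed)
    clean = toy_model(params, xs)
    scores = [
        psnr(clean, toy_model(perturb_parameters(rng, params, sigma), xs))
        for _ in range(trials)
    ]
    return float(np.mean(scores)), float(np.std(scores))

params = np.linspace(0.5, 4.0, 8)        # eight illustrative rotation angles
xs = np.linspace(0.0, 1.0, 64)           # query coordinates
results = {sigma: mean_psnr_under_noise(params, xs, sigma) for sigma in (0.01, 0.05, 0.1)}
```

Averaging over many trials, as the authors do with 50, separates the systematic degradation caused by a given σ from the trial-to-trial variance reported as the ± term.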

(Q5) the authors need to provide analyses towards computational complexity.

See reply to Q2. All additional analyses will be later incorporated into the revision.

Comment

Thank you for your rebuttal. All my questions have been adequately addressed. I acknowledge the academic rigor of this paper, though my main concern remains with its presentation quality. As the authors have stated they will focus on improving the presentation, and given that only presentational modifications seem likely possible, I will maintain my current score.

Final Decision

The submission introduces QVF (Quantum Visual Fields), a fully quantum implicit representation framework inspired by the unitary evolution of gate-based quantum systems. QVF distinguishes itself by eliminating classical post-processing and embedding learning directly into Hamiltonian-driven unitary dynamics. The model leverages quantum conservation laws and projective measurements for feature extraction, aiming to inherit desirable quantum properties such as spectral interpretability, numerical stability, and reversibility. The method is benchmarked on 2D image and 3D geometry representation tasks using up to six qubits, and is shown to outperform prior quantum baselines in terms of fidelity and learning speed.

All four reviewers rated the paper as borderline accept, with varying confidence levels and limited participation during the rebuttal phase. The authors were responsive during rebuttal, and the reviewers who engaged acknowledged that their concerns (primarily regarding presentation quality and clarity) were sufficiently addressed. Given the absence of detailed rebuttal engagement and the fact that all reviewers marginally leaned toward acceptance, the overall recommendation remains borderline.