Characterizing control between interacting subsystems with deep Jacobian estimation
A data-driven nonlinear control-theoretic framework to characterize subsystem interactions, leveraging a deep-learning method to learn dynamical system Jacobians.
Abstract
Reviews and Discussion
The authors introduce a modeling approach for predicting next states in an observed dynamical system. They specifically utilize the linearized dynamics and learn the Jacobian, which both yields an interpretable (though local) model of the observed dynamics that later enables control of the system, and makes it possible to predict next states for a given initial condition and inferred set of time-varying Jacobians. They design a novel learning algorithm to train these Jacobians and demonstrate these two specific claims with simulations.
Strengths and Weaknesses
Strengths
-
The focus on predicting not just the dynamics, but also the Jacobian, is novel and quite interesting. This is particularly relevant since the product J(t)x is underconstrained by the observed dynamics alone, but the Jacobian is needed for defining optimal control dimensions.
-
The authors demonstrate both of their claims with simulations and RNN training, which is almost sufficient (see below).
Weaknesses
-
The writing is quite dense, which leads to many questions in the mind of (this) reader. In several cases, I felt that the results were too good to be true (which can also become a strength once authors clearly set the boundaries of their method).
-
I believe a different RNN architecture may be more appropriate to test these claims (see below).
Questions
-
It is still not clear to me how the Jacobian at a given time point can be approximated without repetition of the same trials/time windows/states. One would ideally need to sample around a particular state to correctly estimate the Jacobian? There is some coarse-graining here that I am not seeing / that is not being discussed explicitly. Can you comment on this?
-
The model of the RNN should be shown in the main text. In my opinion, this work would significantly benefit from using more complex RNN architectures. Specifically, the RNN architecture used here is known to have neural activities confined to low-dimensional hyperplanes, hence the Jacobian estimation is actually much simpler and maybe effectively low-dimensional. You can check this by computing the PC dimensionality in learned RNNs, which would be important to note here. This may be OK, since most behavioral tasks animals perform in the lab are low-dimensional anyway, but this point should be discussed and raised in the limitations. In my opinion, using LSTMs trained on slightly more complex tasks (and validated to have high PC dimensionality) may be a good control for this.
-
The discussion on Gramian is too quick. Could you please expand on this? The reader cannot really appreciate what is exactly being computed, how it relates to the experiments etc.
Here are some simple extra suggestions:
-
Line 150: waiting -> weighting?
-
Title was a bit misleading for me. This seems to be a neuroscience-relevant work, though only looking at the title, I would have likely not read this work. (and it would have been a mistake on my part, since this is quite interesting!)
-
The transition to Eq. (1) is weird. At first, I thought that was part of the figure.
-
Section 6 should likely come way earlier, before the results section.
Limitations
See above.
Final Justification
I think this work is a nice addition to NeurIPS, as agreed by all reviewers.
Formatting Issues
NA.
We thank the reviewer for their constructive feedback and questions about the paper, and address them here.
Dense writing and boundaries of the method
We rewrote section 3 to clarify key intuitions and moved technical details to the appendix for improved narrative flow. If accepted, we will use the extra page to space out text and figures.
To better define the method’s boundaries and clarify our claims, we substantially revised the limitations section (see also reviewer 1):
A future challenge for JacobianODEs is partially observed dynamics. Recent work has identified that it is possible to learn latent embeddings that approximately recover the true state from partial observation [108-112]. Jacobian-based dynamics learning has been performed in a latent space [61, 113], yet it is unclear whether this translates to accurate Jacobian estimation, given the challenges related to automatic differentiation and Jacobian estimation presented here.
We also note that reachability estimates depend sensitively on several factors: the alignment of cross-subsystem interactions and within-subsystem dynamics, the eigenvectors and eigenvalues of within-subsystem Jacobians, and the way activity propagates within each subsystem (see appendix C.1). While our method reliably captures broad trends in reachability over time, fine-grained, time point-specific comparisons should be interpreted with caution.
Finally, although JacobianODEs scale well to moderately high-dimensional systems, their performance in systems that are orders of magnitude larger than those considered here (e.g., recordings of thousands of neurons via calcium imaging) remains to be tested. In many practical settings, this may not be a limitation: neural representations during tasks often exhibit intrinsic dimensionalities comparable to those studied here, enabling JacobianODEs to operate in a reduced dimensionality. Notably, most models converged in under 45 minutes on a single GPU, suggesting favorable scaling. Future work should explore the viability of the method in higher dimensions, and assess whether dimensionality reduction strategies (such as latent state models or low-rank Jacobian approximations) can further improve scalability and accuracy.
The model of the RNN should be shown in the main text
We agree that this was an oversight. We now introduce the RNN architecture directly in section 5, as suggested.
Suggested analysis of more complex RNN architectures, due to potentially low-dimensional Jacobian estimation
TL;DR: The Jacobians were not low-rank. While both RNN activity and Jacobians lie on low-dimensional manifolds relative to the 128D state space, they match dimensionality observed in neural data, as the reviewer noted. The 64D Lorenz 96 system had much higher intrinsic dimensionality, supporting our method’s effectiveness on complex systems.
The reviewer astutely notes that RNN activity is often confined to low-dimensional manifolds, potentially simplifying Jacobian estimation. We address this point through the following analyses.
Jacobian rank: We first note that both the RNN hidden weight matrix and nearly all Jacobians had full rank (128; ~0.1% had rank 127). To assess the effective rank, we computed the participation ratio (PR), a singular value-based measure of intrinsic dimensionality. The RNN weight matrix had a PR of 123, and the Jacobians had a mean PR of ~106 (std 0.7).
Intrinsic-data dimensionality: As the reviewer notes, the data itself may be low-dimensional. Since the Jacobian is a function of the trajectory data, the Jacobian manifold's intrinsic dimension is at most that of the trajectory manifold. To estimate the intrinsic dimensions, we pooled time points from a large sample of trajectories. For RNN state vectors, we computed PR directly; for Jacobians, we flattened them into $128^2$-dimensional vectors before computing PR. We found the RNN trajectory PR to be ~13.5, and the Jacobian PR to be ~6. Given the 128D extrinsic space, this confirms both RNN states and Jacobians lie on lower-dimensional manifolds.
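For reference, the participation ratio can be computed from the singular values of the pooled, centered data; below is a minimal NumPy sketch (variable names are illustrative, not from our codebase):

```python
import numpy as np

def participation_ratio(X):
    """PR = (sum_i lam_i)^2 / sum_i lam_i^2, where lam_i are the
    eigenvalues of the feature covariance of X (time points x features).
    PR ranges from 1 (one dominant direction) to the number of features."""
    Xc = X - X.mean(axis=0, keepdims=True)   # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values of centered data
    lam = s ** 2                             # proportional to covariance eigenvalues
    return lam.sum() ** 2 / (lam ** 2).sum()

# RNN states of shape (T, 128); Jacobians flattened to (T, 128 * 128):
# pr_states = participation_ratio(states)
# pr_jacs = participation_ratio(jacobians.reshape(len(jacobians), -1))
# For a single matrix (e.g., the recurrent weights), the same formula
# can be applied to its squared singular values.
```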
Notably, prior work estimated the dimensionality of prefrontal cortex delay activity at 24 using ~4000 neurons, falling to ~6 when subsampled to 100 neurons (Rigotti et al., The importance of mixed selectivity in complex cognitive tasks. Nature, 2013). These values were interpreted as high-dimensional. Thus, the dimensionalities we observe are consistent with biological systems performing complex tasks.
Furthermore, in the 64-dimensional Lorenz 96 system, the trajectory PR was ~47.5, and the Jacobians had a PR of ~50, which we believe to be high-dimensional enough to demonstrate the effectiveness of our method on complex systems.
We note that in larger systems, approximating the Jacobian as low rank may be more efficient. Rather than computing each of the $n^2$ Jacobian entries directly, we can constrain the Jacobian to be low-rank by designing the network output as a low-rank combination of vectors (e.g., it learns to output a low-rank singular value decomposition).
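As an illustrative sketch of such a parameterization (a hypothetical PyTorch module, not the implementation in the paper): the network emits factors $U$, $s$, $V$ of rank $r$, so only $2nr + r$ numbers are produced per state instead of $n^2$.

```python
import torch
import torch.nn as nn

class LowRankJacobianHead(nn.Module):
    """Output J(x) = U(x) diag(s(x)) V(x)^T with rank r << n."""
    def __init__(self, n, r, hidden=256):
        super().__init__()
        self.n, self.r = n, r
        self.backbone = nn.Sequential(nn.Linear(n, hidden), nn.Tanh())
        self.U = nn.Linear(hidden, n * r)  # left factors
        self.V = nn.Linear(hidden, n * r)  # right factors
        self.s = nn.Linear(hidden, r)      # unconstrained "singular values"

    def forward(self, x):                  # x: (batch, n)
        h = self.backbone(x)
        U = self.U(h).reshape(-1, self.n, self.r)
        V = self.V(h).reshape(-1, self.n, self.r)
        return torch.einsum('bir,br,bjr->bij', U, self.s(h), V)
```

A true SVD parameterization would additionally orthonormalize the columns of U and V (e.g., via QR or Householder products); the sketch omits this for brevity.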
We agree that applying our method to more complex architectures (e.g., LSTMs) and tasks is an important direction for future work. For now, we believe our results sufficiently demonstrate the method’s ability to capture rich, high-dimensional dynamics.
As suggested, we added a discussion of dimensionality to the limitations section, as well as to the appendix. We thank the reviewer for prompting this clarification of an important aspect of the method.
How the Jacobian at a given time point can be approximated without repetition
We appreciate this question and recognize it was not fully addressed. The loop closure loss provides the primary mechanism as it constrains the Jacobian’s form beyond fitting observed trajectories. We have rewritten the loop closure section as follows (and see reviewer 2 discussion):
The Jacobian captures how perturbations to the system will propagate along any direction in state space. Estimating it purely from dynamics constrains only the direction of the flow, leaving the full solution underdetermined. To address this, we again exploit the fact that each row of the Jacobian is a conservative vector field. Specifically, we note that for any piecewise smooth loop $\gamma$ in state space, the true Jacobian satisfies $\oint_\gamma J(\mathbf{x})\,d\mathbf{x} = \mathbf{0}$. Thus, by integrating along loops that contain directions orthogonal to the system's dynamics (and penalizing the deviation from zero), we encourage the estimated Jacobians to capture information about other directions in state space (see appendix A.3 for full technical details). To ensure broad coverage of tangent space directions, we form loops from concatenations of line integrals between randomly selected data points. This strategy samples diverse directions from the tangent space while remaining easy to compute. This self-supervised loss term builds on the loss introduced by Iyer et al. [63]. It constrains the learned Jacobian to satisfy both the dynamics and conservativity. This improves Jacobian estimation accuracy significantly (see appendix C.4 for ablation studies).
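For concreteness, here is a minimal sketch of how such a penalty can be computed (illustrative only: `jac_net` is assumed to map a batch of states to estimated Jacobians, and we use a simple midpoint quadrature rather than our exact integration scheme):

```python
import torch

def loop_closure_loss(jac_net, vertices, n_quad=5):
    """Penalize the line integral of the estimated Jacobian around a
    closed piecewise-linear loop through `vertices` (k x n). For the
    true Jacobian each row is a gradient, so the integral vanishes."""
    k, n = vertices.shape
    ends = torch.roll(vertices, shifts=-1, dims=0)  # close the loop
    t = (torch.arange(n_quad, dtype=vertices.dtype) + 0.5) / n_quad
    total = vertices.new_zeros(n)
    for a, b in zip(vertices, ends):
        xs = a + t[:, None] * (b - a)               # quadrature points on segment
        J = jac_net(xs)                             # (n_quad, n, n)
        # midpoint rule: sum_q J(x_q)(b - a) / n_quad ~ integral over segment
        total = total + torch.einsum('qij,j->i', J, b - a) / n_quad
    return (total ** 2).sum()                       # ~0 for a conservative field
```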
Observation noise also helps the model learn responses in directions beyond the primary flow. We added small observation noise to each batch during training on the task-trained RNNs. As noted in the “Jacobian reconstruction quality” section: “Noise encourages the model to explore how perturbations around the observed trajectory evolve, which is crucial for learning accurate Jacobians in high-dimensional systems.”
We hope this addresses the reviewer’s question and are happy to clarify further.
Gramian discussion is too quick
We added the Gramian equation into the main paper, and agree it is a key quantity of interest. We also expanded the discussion of this quantity as follows:
Since Jacobians are time-dependent, we obtain a linear time-varying representation of control dynamics along the trajectory. This enables computation of time-varying reachability ease, capturing how readily each area drives the other toward novel states [56-58]. Reachability is quantified via the reachability Gramian, a matrix defining a local metric in the tangent space. For the above control system capturing the influence of area B on area A, the time-varying reachability Gramian on the interval $[t_0, t_1]$ is defined as $$W_r(t_0, t_1) = \int_{t_0}^{t_1} \Phi(t_1, \tau)\, B(\tau) B(\tau)^\top \Phi(t_1, \tau)^\top \, d\tau,$$ where $\Phi(t_1, \tau)$ (computed from $J^{AA}$) denotes the state-transition matrix of the intrinsic dynamics of subsystem A without any input (i.e., $\dot{\mathbf{x}}^A = J^{AA}(t)\,\mathbf{x}^A$), and $B(\tau) = J^{AB}(\tau)$ [56]. The Gramian is symmetric and positive semidefinite for every $t_1 \geq t_0$ [56]. Each eigenvalue of the reachability Gramian quantifies how easily the target subsystem can be driven along its corresponding eigenvector. Thus, the trace of the reachability Gramian reflects the average ease of reaching new states. The minimum eigenvalue indicates the lowest ease of control (i.e., the highest cost), and its corresponding eigenvector denotes the most difficult direction to control. For further details see appendix B.1.
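For readers who prefer pseudocode, a discretized sketch of this computation (our notation; simple matrix-exponential propagation of the state-transition matrix, not the exact integrator used in the paper):

```python
import numpy as np
from scipy.linalg import expm

def reachability_gramian(J_AA, J_AB, dt):
    """Approximate W(t0, t1) = int Phi(t1, tau) B B^T Phi(t1, tau)^T dtau
    with B(tau) = J_AB(tau), from Jacobian blocks along a trajectory.

    J_AA: (T, nA, nA) within-area Jacobians (subsystem A)
    J_AB: (T, nA, nB) cross-area Jacobians (input from subsystem B)"""
    T, nA, _ = J_AA.shape
    W = np.zeros((nA, nA))
    Phi = np.eye(nA)                    # Phi(t1, t1) = I
    for t in reversed(range(T)):        # walk backward from t1 to t0
        Phi = Phi @ expm(J_AA[t] * dt)  # extend Phi(t1, tau) one step earlier
        term = Phi @ J_AB[t]
        W += term @ term.T * dt         # accumulate the integrand
    return W  # symmetric PSD; trace = average ease, min eigenvalue = hardest direction
```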
Typo on line 150
This has been fixed.
Misleading title
Thank you for the comment. We’ve considered adding more neuroscientific context, and would appreciate your thoughts on this alternative: Deep Jacobian estimation for characterizing control between interacting subsystems with applications to neuroscience.
Transition to equation (1)
Equation 1 has been moved to better separate it from Figure 1.
Relocating the related work section
We moved the related work section (section 6) to follow immediately after the introduction.
I thank the authors for their detailed responses. I recommend acceptance of this work, which seems to be reviewer consensus to begin with. Good luck!
The authors provide a novel method of local deep Jacobian estimation for non-linear dynamical systems, involving optimising a neural network jointly on trajectory prediction and adherence to conservativity. The authors used this method to model the interarea controllability of a two-area task-trained RNN, showing that their method outperforms other Jacobian estimation methods in predicting and exerting control on local dynamics.
Strengths and Weaknesses
Strengths
- The authors provide a well-motivated Jacobian estimation technique, providing extensive results on synthetic data.
- The authors are careful to provide details throughout the paper about the technical and intuitive details of their method e.g. appendix C.1, C.2 provide welcome information and results to reinforce the method motivation, and ablation studies justify each component of the model
Weaknesses
This work presents an interesting method and theoretically strong results. The primary weaknesses of this paper are around the communication and format of information:
- Section 2.1 seems out of place. While the authors do emphasise the importance of modelling interareal/intersystem communication in the introduction, this does not play a role in the training of JacobianODE, or understanding of the method as a whole. Introducing these concepts in any technical detail may be better placed near section 5, where interarea systems’ data are first introduced.
- Overall, there is a poor separation of information in the main text and in the appendix. Section 3.1 provides much information that is not strictly necessary for understanding the training procedure in Figure 3, or at least requires an understanding of the appendix material to take advantage of this level of explanation. This could be moved to the appendix to create a clearer narrative in the main text. This would mirror other methods that were key to appreciating the results of this paper, e.g. ILQR, being relegated to the appendix.
- Despite being an innovation for this work, section D.8.2 seems like insufficient exposition on the formulation and importance of the loop closure loss. Table 3 shows that this is critical for performance, but no investigation or context is provided on how one could curate optimal loops for training.
Less importantly:
- Images of trajectories provide little information in their current form. Dimensionality reduction, e.g. by tSNE, could provide a better qualitative visualisation of the quality of predictions by each model.
- A final minor weakness is that spacing is limited throughout the paper, table captions are difficult to discern, and information on figures is often too small.
Questions
- Why was this method of loop curation chosen? Do you expect other design choices to significantly change performance, beyond a simple binary ablation?
- Figure 6B suggests that the model is systematically delayed in predicting changes to area reachability in opposite directions. Is there an intuitive explanation for why this is? Did you consider any investigations into this offset?
- Did the authors consider training on other laboratory tasks where control theory has previously been applied, e.g. motor control? Comparison of insights provided by JacobianODE compared to previous analyses would provide powerful justification for the method
Limitations
There are no relevant negative societal impacts associated with this work.
The authors sufficiently discuss limitations to their work in the final section of the paper. In particular, they highlight the need for an extension to partially observable dynamical systems. This would be a necessary step towards applying this method to real neural dynamics
Final Justification
I am continuing to recommend this paper for acceptance, as per my original review and subsequent response.
Formatting Issues
None
We thank the reviewer for their helpful comments and feedback, and are glad that they found the method to be well-motivated, interesting, and theoretically strong. We now address the questions and comments posed by the reviewer.
Section 2.1 is out of place
We have moved section 2.1 to the analysis of the task-trained RNN (section 5), which we agree is a more conceptually aligned location.
Main text/appendix split & loop construction explanation
To clarify the narrative flow, we moved “Constructing the time derivative” and technical loop closure details to the appendix. In their place, we provide a more intuitive discussion of loop closure in the main text (see next point).
Formulation of the loop closure loss and methods of constructing loops
TL;DR: The loop closure loss encourages the Jacobian to learn about directions in the tangent space that are orthogonal to the flow of the dynamics. We construct loops from concatenated lines between random points on the data manifold to encourage uniform sampling of these directions.
We agree that the loop closure loss deserves more explanation. We have revised the main text as follows:
The Jacobian captures how perturbations to the system will propagate along any direction in state space. Estimating it purely from dynamics constrains only the direction of the flow, leaving the full solution underdetermined. To address this, we again exploit the fact that each row of the Jacobian is a conservative vector field. Specifically, we note that for any piecewise smooth loop $\gamma$ in state space, the true Jacobian satisfies $\oint_\gamma J(\mathbf{x})\,d\mathbf{x} = \mathbf{0}$. Thus, by integrating along loops that contain directions orthogonal to the system's dynamics (and penalizing the deviation from zero), we encourage the estimated Jacobians to capture information about other directions in state space (see appendix A.3 for full technical details). To ensure broad coverage of tangent space directions, we form loops from concatenations of line integrals between randomly selected data points. This strategy samples diverse directions from the tangent space while remaining easy to compute. This self-supervised loss term builds on the loss introduced by Iyer et al. [63]. It constrains the learned Jacobian to satisfy both the dynamics and conservativity. This improves Jacobian estimation accuracy significantly (see appendix C.4 for ablation studies).
As mentioned, we have also moved the technical details to the appendix.
Other choices (e.g., forward and backward loops along observed trajectories or arbitrary circles) were considered, but our method offers a strong tradeoff between tractability and uniformity in sampling. We have added a discussion of these other potential choices for loops to the appendix as well.
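For illustration, the random-point strategy amounts to something like the following (a hypothetical helper, pairing with a loop closure penalty as described above):

```python
import torch

def sample_random_loop(data, k=4):
    """Pick k states from the observed data and connect them with straight
    segments into a closed loop. Because the vertices come from distant
    parts of the manifold, the segments tend to cover tangent directions
    orthogonal to the flow, unlike loops that retrace an observed
    trajectory forward and backward."""
    idx = torch.randint(0, data.shape[0], (k,))
    return data[idx]  # (k, n) loop vertices
```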
We appreciate the reviewer’s suggestion, which helped clarify a core component of our method.
Images of trajectories from sample systems
We agree that the images of the trajectories in their current forms may not be optimally informative. As suggested, we will run tSNE, as well as UMAP, on the data and add the most visually salient / illustrative representations to Figures 4, 5, and S2. We have also added time-series plots of the top Principal Components in the appendix to highlight the temporal unfolding of each system.
Limited spacing, table captions, and small figure information
Regarding spacing, as mentioned, we have moved many details to the appendix, which should create more space. If accepted, we will further space out the text and figures to fill out the extra allotted page.
We have updated the following table captions and figure information to make it easier to read:
- More detail has been added to the caption for Table 1, Table S1, and Table S6
- Table S2 and S3 have been made larger
- The text in Figure 6 has been made larger
Delayed reachability curve in Figure 6B
TL;DR: The reachability Gramian is highly sensitive, and takes into account the alignment between cross- and within-area dynamics, as well as how within-area dynamics propagate over time. Small estimation errors can thus cause temporal offsets. We conjecture that small estimation errors in the within-area eigenvalues are the source of the observed delay, and now include a discussion of this phenomenon in our limitations.
As the reviewer astutely notes, in Figure 6B, cognitive-to-visual reachability peaks slightly late, while visual-to-cognitive reachability hits a minimum slightly early. This likely stems from the sensitivity of the reachability Gramian, which arises from a complex interplay of components. These include the alignment of the cross-area interaction terms with the eigenvectors of the within-area dynamics, the corresponding eigenvalues of each of the eigenvectors, and how activity propagates within each area over successive time points. This phenomenon can be observed in the results in appendix C.1. Mathematically, consider the definition of the reachability Gramian currently in appendix B.1, in which the state-transition term $\Phi(t_1, \tau)$ captures how within-subsystem dynamics propagate between times $\tau$ and $t_1$, and the multiplicative term $B(\tau)B(\tau)^\top = J^{AB}(\tau)\,J^{AB}(\tau)^\top$ captures how the directions of interaction between the subsystems align with the within-subsystem dynamics. The Jacobian estimation for the task-trained RNN, while robust considering that the Jacobian consisted of a $128 \times 128$ matrix at each time point, was not perfect (we achieved an $R^2$ of 0.68 with 5% observation noise, as reported). Thus, for such a sensitive measure, even small Jacobian estimation errors can lead to timing deviations.
We can therefore conjecture that the source of this delay might have to do with small estimation errors in the eigenvalues of the estimated within-area dynamics matrices over time. Broadly speaking, when considering control over short time windows, (moderately) unstable eigenvalues in the target subsystem yield higher reachability, as activity can more easily be driven outward along their corresponding eigenvectors. As mentioned, the exact reachability values will depend sensitively on many factors, but the eigenvalues should provide a clue. For instance, the average predicted maximum eigenvalue in the visual area peaks later than the ground truth, which may explain the cognitive-to-visual delay. In the cognitive area, a transient dip in the average predicted maximum eigenvalue could cause the early minimum in visual-to-cognitive reachability.
Crucially, the delays in the reachability curves do not obscure the core result of the figure. On average, the cognitive-to-visual reachability tended to peak later in the delay than the visual-to-cognitive reachability. We furthermore note that this bias does not extend to the actual Jacobian estimation, for which the average residual is very close to zero, indicating that the JacobianODEs constitute an unbiased estimator.
This analysis highlights a limitation of our approach that we did not previously address, which relates to the interpretability of the time-resolved reachability curves. As demonstrated, reachability is a highly sensitive measure. While our method reliably captures broad trends in reachability over time, fine-grained, time point-specific comparisons should be interpreted with caution. We have added this to the discussion of limitations in the paper, and we thank the reviewer for encouraging this clarification.
Training on other laboratory tasks (e.g. motor control)
We agree that applying our method to laboratory tasks involving motor control, where control-theoretic frameworks are well established, would be highly compelling. This is a promising direction for future work, and we appreciate the suggestion.
I thank the authors for their conceptual clarifications, and I appreciate their effort in revising the relevant parts of their submission. I believe the core message of the submission will flow smoother now, and I will continue to recommend the acceptance of this work.
Authors are modeling dynamical systems by learning from observational data of trajectories. Instead of learning the vector field of the dynamical system directly, they reformulate the problem to learn the system's Jacobian matrix.
The authors recognize an important fact: each row of the Jacobian corresponds to the gradient of a scalar function, and therefore represents a conservative vector field. This allows them to impose structure in the learning process — specifically, that line integrals over these fields are path-independent.
Although the full vector field f(x) does not need to be conservative, its Jacobian rows must be, since they are gradients by definition.
By applying a generalized form of the fundamental theorem of calculus, they express the function f(x(t)) — which is not modeled directly — as a path integral over the Jacobian.
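In symbols (my paraphrase of the construction), for a straight-line path from a reference state $x_0$ to $x$:

$$f(x) \;=\; f(x_0) \;+\; \int_0^1 J\big(x_0 + s\,(x - x_0)\big)\,(x - x_0)\,\mathrm{d}s,$$

so a learned Jacobian, together with an anchor value $f(x_0)$, determines the full vector field.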
This approach enables them to reuse standard ODE solvers to generate trajectories of the full nonlinear system — not just a linearized approximation.
In the loss function, they combine a trajectory reconstruction loss (to ensure consistency with observed dynamics) with a loop closure loss that enforces path independence, reflecting the conservative structure of Jacobian rows.
Strengths and Weaknesses
Strengths:
- elegant theoretical framework that is quite applicable for range of dynamical systems
- casting the problem from learning vector field of dynamics to jacobian learning
- efficient way of learning, without explicit need to model directly vector field
No major weaknesses.
Questions
For the case of chaotic dynamical systems, I would like to see comparison how this approach compares to echo state networks:
see: Doan, Nguyen Anh Khoa, Wolfgang Polifke, and Luca Magri. "Physics-informed echo state networks for chaotic systems forecasting." Computational Science–ICCS 2019: 19th International Conference, Faro, Portugal, June 12–14, 2019, Proceedings, Part IV 19. Springer International Publishing, 2019.
Pathak, Jaideep, et al. "Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach." Physical review letters 120.2 (2018): 024102.
Please state limitations more explicitly.
Limitations
Please state limitations more explicitly.
Final Justification
The authors have performed additional experiments with echo state networks, and as expected the echo state networks worked quite well at prediction but not at estimating the Jacobian. They have also explained the limitations of their paper. I recommend acceptance of the paper.
Formatting Issues
No
We thank the reviewer for their thoughtful feedback, and appreciate that the reviewer found our theoretical framework to be both elegant and broadly applicable. We address here the questions posed by the reviewer.
Comparison to echo state networks
TL;DR: Echo State Networks (ESNs) achieve comparable short-term prediction on chaotic systems, but fail to recover accurate Jacobians. This stems from structural limitations in mapping ESN dynamics back to the source system.
The reviewer aptly suggested a comparison to echo state networks (ESNs, or reservoir networks) in the case of chaotic dynamical systems. ESNs have been shown to be able to forecast the Lorenz 63 system for a particular choice of hyperparameters (Pathak, J. et al. Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data. Chaos, 2017). We note that Pathak et al. (2017) implements the same method as Pathak et al. (2018), referenced by the reviewer. We implemented the ESN architecture and hyperparameters from Pathak et al. (2017) for the Lorenz 63 system, which involved a reservoir of 300 nodes. Our implementation of the Lorenz system was simulated with a sampling interval of ~0.015 s, very similar to the 0.02 s used in the paper. Thus, for consistency of comparison with JacobianODEs, we fit ESN models to data sampled every 0.015 s.
We trained the ESNs on the same volume of data (with 1% observation noise) as the other models, and swept the regularization parameter to ensure a good fit. To ensure stable internal dynamics, we used a 150-step warm-up before generating predictions. The resulting ESNs predicted the Lorenz system well, achieving 10-step test MSE of 0.03—comparable to JacobianODEs (0.07) and NeuralODEs (0.04).
However, obtaining the Jacobian of the original Lorenz system from the ESN presents challenges. While the Jacobian of the internal reservoir dynamics (a 300×300 matrix) can be uniquely computed, and was used in prior work to estimate Lyapunov exponents (Pathak et al. 2017), recovering the Jacobian of the original system requires inverting the non-square readout matrix (Banerjee, A. et al. Using machine learning to assess short term causal dependence and infer network links. Chaos, 2019). As shown in Banerjee et al. (2019), this inversion yields a non-unique solution, making such Jacobians ill-suited for system identification; in that work they are used "only for causality estimation purposes".
We estimated these Jacobians nonetheless, using a long (7200-step) warm-up and true state input for stability. The resulting Jacobians had a Frobenius norm error (averaged over five random seeds) of , compared to for JacobianODEs and for NeuralODEs. Thus, while ESNs can match predictive accuracy, they are significantly less accurate in capturing the system’s Jacobian, due to architectural limitations.
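For clarity, here is a schematic of the recovery step described above (our notation, following our understanding of the Banerjee et al. (2019) construction; N is the reservoir size, n the source dimension, and the Jacobian here is that of the one-step map):

```python
import numpy as np

def esn_source_jacobian(A, W_in, W_out, r_next):
    """Map the ESN's internal Jacobian back to the source system.

    Feedback-mode reservoir: r_{t+1} = tanh(A r_t + W_in W_out r_t),
    readout x_hat = W_out r. The internal (N x N) Jacobian is exact,
    but projecting it to the (n x n) source state needs the
    pseudoinverse of the non-square readout W_out (n x N), which is
    where the non-uniqueness enters.

    A: (N, N) reservoir weights; W_in: (N, n); W_out: (n, N);
    r_next: (N,) reservoir state at time t+1."""
    d = 1.0 - r_next ** 2                         # tanh'(.) evaluated at t+1
    J_res = d[:, None] * (A + W_in @ W_out)       # internal reservoir Jacobian
    return W_out @ J_res @ np.linalg.pinv(W_out)  # (n, n), non-unique
```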
We have added a discussion of this comparison into the appendix of the paper, and we thank the reviewer for suggesting this valuable benchmark.
Stating limitations more explicitly
We have extended the limitations section to address limitations surrounding sensitivity of reachability and scalability, in addition to partial observability. The revised section is as follows:
A future challenge for JacobianODEs is partially observed dynamics. Recent work has identified that it is possible to learn latent embeddings that approximately recover the true state from partial observation [108-112]. Jacobian-based dynamics learning has been performed in a latent space [61, 113], yet it is unclear whether this translates to accurate Jacobian estimation, given the challenges related to automatic differentiation and Jacobian estimation presented here.
We also note that reachability estimates depend sensitively on several factors: the alignment of cross-subsystem interactions and within-subsystem dynamics, the eigenvectors and eigenvalues of within-subsystem Jacobians, and the way activity propagates within each subsystem (see appendix C.1). While our method reliably captures broad trends in reachability over time, fine-grained, time point-specific comparisons should be interpreted with caution.
Finally, although JacobianODEs scale well to moderately high-dimensional systems, their performance in systems that are orders of magnitude larger than those considered here (e.g., recordings of thousands of neurons via calcium imaging) remains to be tested. In many practical settings, this may not be a limitation: neural representations during tasks often exhibit intrinsic dimensionalities comparable to those studied here, enabling JacobianODEs to operate in a reduced dimensionality. Notably, most models converged in under 45 minutes on a single GPU, suggesting favorable scaling. Future work should explore the viability of the method in higher dimensions, and assess whether dimensionality reduction strategies (such as latent state models or low-rank Jacobian approximations) can further improve scalability and accuracy.
I am happy with the authors response and extra experiments that they have done. I am staying with my ratings and decision to accept the paper.
In this paper, the authors propose JacobianODE, a method for directly estimating Jacobians from time series data using a path integral loss and a loop closure constraint. The framework enables nonlinear control-theoretic analysis of subsystem interactions, demonstrated on synthetic dynamical systems (Van der Pol, Lorenz, Lorenz-96) and a task-trained RNN. Results show JacobianODE outperforms Neural ODEs and weighted regression on Jacobian estimation error, Lyapunov spectrum recovery, and control tasks.
Reviews were overall positive. R-ya7v acknowledged the elegant theoretical formulation and applicability across nonlinear dynamical systems, and noted that the shift from vector field to Jacobian learning is well-motivated. R-DEPk highlighted the strong empirical evaluation and clear ablations, though suggested that section structure and exposition could be improved, and requested clearer discussion of the loop closure loss and more informative visualizations. Reviewers also asked for comparisons to echo state networks (ESNs) and clearer articulation of limitations. The rebuttal sufficiently addressed most of these points. The authors added ESN comparisons, showing that ESNs perform well on short-term prediction but fail to recover accurate Jacobians, underscoring the main advantage of their approach. They also expanded the limitations section, noting the sensitivity of reachability analyses and challenges for scalability and partial observability. Clarifications were made around loop construction, delayed reachability estimates, and figure presentation. Reviewers were satisfied with the rebuttal, maintaining support and recommending acceptance after revisions.
Overall, the paper introduces a clear and well-supported methodological advance for data-driven Jacobian estimation and its application to control-theoretic analysis. I recommend acceptance, with the expectation that the authors integrate the added ESN comparisons, expanded limitations, and clarified presentation into the final version.