Understanding Continuous-depth Networks through the Lens of Homogeneous Ricci Flows
Abstract
Reviews and Discussion
This paper investigates continuous-depth networks using techniques from Ricci flow. The submission provides a sufficient amount of experiments. The homogeneous Ricci flow provides an explanation of continuous-depth networks, bridging neural networks and Ricci flow for the first time. Furthermore, it is shown that the Ricci soliton and the Ricci curvature tensor can be learned by continuous-depth networks.
Strengths
This submission provides good illustrations of the evolving process and a new perspective for intuitively understanding the interpretability of neural networks, which makes the paper novel.
Weaknesses
Despite the claimed contributions, it is not clear what the main technical/theoretical results of this paper are. This is mainly a problem of presentation and organization: no theoretical guarantee seems to be stated anywhere in the main article.
It could be an interesting direction to bring Ricci flow and discrete-depth networks onto common ground, and I believe this is not well understood by the machine learning community. However, the layout of Section 3 is far from being able to attract the general audience of this venue.
Questions
What is the main theoretical claim that the experiments are meant to verify?
This work presents a novel geometric perspective on continuous-depth neural networks, using established tools from differential geometry, such as the homogeneous Ricci flow, to show how neural networks shape the underlying Ricci curvature of the representation space. The authors verify their theoretical contributions by visualising the evolution of the Ricci curvature and showing how it indeed leads to a separation of the data.
Strengths
- Studying neural networks from a more geometric perspective is a promising avenue that could lead to novel insights into their inner workings. Leveraging powerful tools from differential geometry, such as Ricci flows and Ricci solitons, is a helpful contribution that could foster more such work.
- The visualisations of the underlying tensors are very interesting and suggest that the theory indeed captures the essence of representation learning.
Weaknesses
The paper is really hard to follow, and the authors don't do a great job of explaining the (admittedly complex) concepts needed in this work. What made it even tougher for me to follow, however, is that it is very difficult to understand when and where exactly the structure of the neural network is actually used in the theory. The authors start with a rather abstract introduction to Ricci flow, Ricci solitons, etc., which is great to have, but then the connection to continuous-depth networks happens very suddenly and in an unclear manner. What exactly does the structure of the neural network determine in the equations? Or, put differently, which of the quantities previously kept abstract (e.g. the manifold, the metric tensor at time $t$, the Ricci curvature, etc.) are now determined by assuming a neural network structure? I believe it is the diffeomorphisms?
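To make my confusion concrete, here is a minimal sketch of the construction I assume is intended; the notation ($f_\theta$ for the network's vector field, $\phi_t$ for its flow map) is mine, not the paper's:

$$\dot{x}(t) = f_\theta\bigl(x(t), t\bigr), \qquad \phi_t \colon x(0) \mapsto x(t),$$
$$g_t = \phi_t^{*} g_0, \qquad (g_t)_x(u, v) = (g_0)_{\phi_t(x)}\bigl(D\phi_t(x)\,u,\; D\phi_t(x)\,v\bigr).$$

If this reading is correct, the network determines only the diffeomorphisms $\phi_t$, and everything else (the metric, the Ricci curvature) is induced from them; it would help a lot if the paper stated this explicitly.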
Also, why are we working with the homogeneous Ricci flow instead of the standard Ricci flow? The authors also use a lot of concepts without introducing them; what does $\mathrm{ad}(K)$-invariant mean, for instance? Such things might be obvious to researchers working closely in this field, but even for people interested in theoretical ML research, this paper is really tough to read. There are lots of statements like "There is no doubt that after the discretisation, Eq. 11 will degenerate into Eq. 4" that are not obvious, to me at least.
I think the ideas in this work are interesting, but in its current shape the work really does not explain them well, making it very difficult for me to assess it positively. I'm happy to reconsider my score if the authors can clarify and potentially incorporate my feedback.
Questions
- On page 6, where you define the Ricci curvature as a Lie derivative (equation 9): in the third line of the derivation, where did the diffeomorphism and its pullback go? How are they defined in the case of a neural network? I guess the diffeomorphism is simply the forward pass from one time point to another? (See also the sketch after these questions.)
- For the right-hand side of equation (10), where did the limit go? Similarly, in equation (12), the left-hand side seems to depend on a variable that does not appear on the right-hand side, nor in the product.
- Where in the theory do you explicitly use the fact that the output space is Euclidean? What would change if another structure were imposed?
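For reference, here is the definition I have in mind for the first question, with $\phi_t$ the flow of a vector field $X$ (my notation, assuming the standard Lie-derivative definition):

$$\mathcal{L}_X g = \left.\frac{d}{dt}\right|_{t=0} \phi_t^{*} g = \lim_{t \to 0} \frac{\phi_t^{*} g - g}{t}.$$

In a derivation starting from this definition, I would expect the pullback $\phi_t^{*}$ to persist through the computation rather than disappear, which is why the third line of equation (9) confuses me.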
This paper analyzes the behavior of continuous-time neural networks. The key is to analyze the evolution of the pullback metric, which allows one to compute the intermediate metrics for visualization.
Strengths
- N/A
Weaknesses
- The naming "Ricci Flow" is incorrect/misleading. Really, this is an intrinsic geometric flow, of which the Ricci flow is a special case: an intrinsic geometric flow evolves the metric according to some differential equation (in this case, one parameterized by the neural network), whereas the Ricci flow is a prescribed partial differential equation that does not depend on the neural network (see the equations after this list).
- The method can be simplified considerably in presentation (effectively removing most of the unnecessary manifold/homogeneous-space constructions). In particular, the real question is how the Jacobian $J_t$ of the flow map evolves in time (the other machinery is there to make sure it doesn't degenerate), since the pullback metric is just the inverse of $J_t J_t^\top$ (from which one can compute the evolution of the metric), which is already well known. This is actually cleaner than the current method, which doesn't utilize the fact that $\dot{J}_t = \frac{\partial f}{\partial x} J_t$ is known for ODEs and instead has to approximate with a step size; see the sketch after this list.
- The method has an intrinsic limitation, since computing the Jacobian scales poorly with the input/output dimension.
- Experimentally, the results are only shown for extremely simple 2D toy data through visualization. Beyond showing that "the method can extract something", this section does not convey much else.
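To spell out the distinction in the first point: the Ricci flow is the fixed PDE

$$\partial_t g_t = -2\,\operatorname{Ric}(g_t),$$

whereas the paper studies a flow $\partial_t g_t = F_\theta(g_t, t)$ whose right-hand side is determined by the learned network, i.e. a general intrinsic geometric flow.

For the second point, here is a minimal sketch of the closed-form computation I have in mind, assuming a neural ODE $\dot{x} = f(x, t)$; the toy vector field and all function names are mine, for illustration only:

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, x):
    # Toy vector field standing in for the learned ODE xdot = f(x, t).
    return np.array([np.tanh(x[1]), -x[0]])

def df_dx(t, x):
    # Jacobian of f at (t, x); for an actual network this comes from autodiff.
    return np.array([[0.0, 1.0 - np.tanh(x[1]) ** 2],
                     [-1.0, 0.0]])

def augmented(t, z, d=2):
    # Jointly evolve the state x and the flow-map Jacobian J via the
    # variational equation: xdot = f(x, t), Jdot = (df/dx) J, J(0) = I.
    x, J = z[:d], z[d:].reshape(d, d)
    return np.concatenate([f(t, x), (df_dx(t, x) @ J).ravel()])

x0 = np.array([1.0, 0.5])
z0 = np.concatenate([x0, np.eye(2).ravel()])
sol = solve_ivp(augmented, (0.0, 1.0), z0, dense_output=True, rtol=1e-8)

J = sol.sol(0.7)[2:].reshape(2, 2)      # Jacobian of the flow map at t = 0.7
g_pullback = J.T @ J                    # pullback of the Euclidean metric
g_pushforward = np.linalg.inv(J @ J.T)  # metric induced on the evolved points
print(g_pullback, g_pushforward, sep="\n")
```

The metric at any time then comes out exactly (up to solver tolerance) from the Jacobian ODE, with no separate step-size approximation for the metric itself.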
Questions
N/A
This paper explores the behavior of continuous-depth neural networks, a topic of potential interest to the research community. However, after thorough evaluation, all reviewers consistently found that the paper falls short of the required standards in soundness, clarity, and overall contribution. Additionally, the authors did not participate in the rebuttal phase, missing the opportunity to address and potentially resolve the reviewers' concerns. Given these significant shortcomings, I recommend rejecting this submission.
Why not a higher score
All reviewers suggest that this paper is a clear reject. I agree with the concerns they raised, and the authors did not use the rebuttal phase to provide any further clarifications.
Why not a lower score
N/A
Reject