LCOT: Linear Circular Optimal Transport
The paper proposes a new metric, called LCOT, for probability measures supported on the unit circle that is computationally efficient, has an explicit linear embedding, and is rooted in the Circular OT metric.
摘要
评审与讨论
This paper proposes a new computationally efficient metric in order to compare circular probability measures by extending the Linear Optimal Transport framework to the circle. The authors leverage a closed-form for the Monge map between the uniform distribution and another arbitrary probablity measure to provide a closed-form for the Linear Circular Optimal Transport metric with reference the uniform measure. Then, they study theoretically the metric and apply it to different experiments with a comparison with the Circular Optimal Transport distance.
优点
Overall, the paper is nicely written and very clear. It proposes a new metric and show its benefit compared to the previous Circular Optimal Transport distances.
- Very well written and clear
- New efficient metric on the circle with a theoretical analysis
- Experiments which show the behavior of LCOT in comparison with COT: runtimes, comparison of the distances through a MDS and barycenter computation which cannot be directly done with COT.
缺点
- The paper is pretty incremental as it feels like a specification of the Linear OT framework in the particular case of the circle, which relies heavily on previous works such as [1, 2] for the Linear OT framework and [3] for the closed-form on the circle.
- The experiments are only on toy data.
[1] Sarrazin, Clément, and Bernhard Schmitzer. "Linearized optimal transport on manifolds." arXiv preprint arXiv:2303.13901 (2023).
[2] Wang, Wei, Dejan Slepčev, Saurav Basu, John A. Ozolek, and Gustavo K. Rohde. "A linear optimal transportation framework for quantifying and visualizing variations in sets of images." International journal of computer vision 101 (2013): 254-269.
[3] Bonet, Clément, and Paul Berg, and Nicolas Courty, and François Septier, and Lucas Drumetz, and Minh-Tan Pham. "Spherical Sliced-Wasserstein". International Conference on Learning Representations (2023).
问题
In my opinion, the paper would really benefit from an experiment on real data. For now, the experiments are informatives, but do not make me specially see in which scenario I would benefit to apply the linear framework. Thus, finding an application where the computations of multiple pairwise OT distances on the circles is required, seems to be very important in my opinion.
Some references seems to be missing. Notably, in [1, 2], derivations of the Monge map on the circle where made.
[1] Beraha, Mario, and Matteo Pegoraro. "Wasserstein Principal Component Analysis for Circular Measures." arXiv preprint arXiv:2304.02402 (2023).
[2] Quellmalz, Michael, Robert Beinert, and Gabriele Steidl. "Sliced optimal transport on the sphere." arXiv preprint arXiv:2304.09092 (2023).
We thank the Reviewer for their time, consideration, and invaluable feedback.
Weaknesses:
- The paper is pretty incremental as it feels like a specification of the Linear OT framework in the particular case of the circle, which relies heavily on previous works such as [1, 2] for the Linear OT framework and [3] for the closed-form on the circle.
Comment: Regarding the choice of focusing on the circle within the framework of Linear Optimal Transport (OT), we aimed to highlight the advantages of delving into a specific manifold. While previous works such as [1] operate in abstract manifolds (M), our decision to fixate on the circle enabled the derivation of new, concrete closed-form formulas, as evidenced in our paper. These expressions, however, hold significance within the defined manifold and may not extend beyond its boundaries.
The utilization of the closed-form expression of alpha from [3], alongside insights from the works of Rabin et al. and Delon et al., indeed forms a foundation for certain aspects in section 2 of our paper. However, it is important to highlight that while some components build upon prior works, the framework as a whole is novel. We believe that the proposed comptuationally efficient framework could be of interest to the machine learning research community.
We also acknowledge the influence and inspiration drawn from [2] in the development of some of the concepts and tools introduced in our work. However, it's crucial to note that while [2] predominantly deals with Euclidean domains, our focus on non-Euclidean domains presents its own set of challenges and complexities.
[1] Sarrazin, Clément, and Bernhard Schmitzer. "Linearized optimal transport on manifolds." arXiv preprint arXiv:2303.13901 (2023).
[2] Wang, Wei, Dejan Slepčev, Saurav Basu, John A. Ozolek, and Gustavo K. Rohde. "A linear optimal transportation framework for quantifying and visualizing variations in sets of images." International journal of computer vision 101 (2013): 254-269.
[3] Bonet, Clément, and Paul Berg, and Nicolas Courty, and François Septier, and Lucas Drumetz, and Minh-Tan Pham. "Spherical Sliced-Wasserstein". International Conference on Learning Representations (2023).
- The experiments are only on toy data.
Comment: See Questions below.
Questions:
- In Appendix A.5.2 we added an application on image retrieval that requires multiple OT comparisons for each query.
In a similar fashion, we also added an experiment on color interpolation between images in Appendix A.5.1 to show some non simulated data application. (However, this one does not require multiple pairwise OT distances.)
- We thank the reviewer for mentioning the works [1,2]. Unfortunately, we were unaware of these recent papers at the time of writing our paper. We have included these references in our manuscript.
[1] Beraha, Mario, and Matteo Pegoraro. "Wasserstein Principal Component Analysis for Circular Measures." arXiv preprint arXiv:2304.02402 (2023).
[2] Quellmalz, Michael, Robert Beinert, and Gabriele Steidl. "Sliced optimal transport on the sphere." arXiv preprint arXiv:2304.09092 (2023).
Standard OT metrics are well studied on data sample points having Euclidean structure, e.g. Wasserstein and Gromov-Wasserstein distances. Recently, the OT community addresses gained ample interest in OT metrics on non-Eculean spaces. This paper investigates OT between measures supported on the unit circle, aka circular probability measures. The paper starts with a background on circular OT by defining probability measures with 1-periodicity. This gives the CDF function with respect to a reference point. Proposition 2.3 is quite interesting since it expresses the COT between circular probability as an optimization problem on the real line by searching cutting points . Using classical optimal transport on the real line, we can construct the Monge map between data points on the circular probability measures.
This work proposes LCOT a new discrepancy between circular probability measures, allowing to embedding the origin measures by computing their Monge map with respect to a fixed reference measure (uniform or Lebesgue measure). If the cost function is the -power of the absolute value, then LCOT is a proper distance and coincides with the norm on the unit circle endowed with the reference measure.
优点
- Paper easy to follow and well written.
- Proposing a novel linearization of the circular OT metric (LCOT) by embedding the origin measures via a Monge displacement with respect to reference measures.
- LCOT outperforms classical COT in terms of computational complexity.
- Extensive numerical experiments comparing COT and LCOT (with different reference measures) on simulated datasets.
- The proofs sound good to me.
缺点
- There is a backbone result in the paper which is the uniqueness of (see Remark 2.4). There is an empirical showing of this uniqueness in Figure 8, Appendix A.2. However, I am still not convinced, since in Equation (8) one can get different cutting points, and using Equation (9), this may give different . Probably, the result is trivial but I think it deserves to be proved.
- As it can be seen in Figure 4, the LCOT with non-uniform reference measure is slower than the LCOT-Uniform. Is there any constant depending on the choice of the reference measure that can be added to the complexity of LCOT?
- The experiments are conducted on simulated data, I think the application of LCOT in real data in one of the detailed domains (see Paragraph 2 in the introduction) would add more importance to this approach. I am not aware of the existence of such benchmark data.
问题
Related work
Missing related work: "The Statistics of Circular Optimal Transport" by Hundrieser et al. 2022.
Minor Typos
- Page 6: Definition 3.2
(LCOT distance):I think should be (LCOT discrepancy) - Page 7: Caption of Figure 4: "Left: Wall-clock time for and "
- Page 9: first line of conclusion "distance" --> "discrepancy
- Page 10: References: wasserstein --> Wasserstein
伦理问题详情
NA
We thank the Reviewer for their time, consideration, and invaluable feedback.
Weaknesses:
-
We included the proof of the uniqueness of in the Appendix. The argument is based on the paper [Delon et. al. "Fast transport optimization for Monge costs on the circle" SIAM Journal on Applied Mathematics (2010)]. The result holds when considering absolutely continuous measures and , and cost with an strictly convex function. Essentially, the idea is to prove that the integral in Equation (8), view as a function of , is continuous and strictly convex. Therefore, a unique global minimum is attained. Figure 8 in Appendix A.2 was meant to show this.
-
To the best of our knowledge, there is no such constant related to the reference measure that can affect the complexity of LCOT, except for the sample size. The reason the uniform case is faster is that, in this case, the optimal has a closed form, allowing us to apply it directly to the computation of the embedding. For the non-uniform case, is unknown, and we must use a COT solver (e.g., the binary search algorithm in [Delon et. al. "Fast transport optimization for Monge costs on the circle" SIAM Journal on Applied Mathematics (2010)]). The computational time for this is included in Figure 4. In other words, if we do not use the closed form of for the uniform case, the computation cost would be the same as for the non-uniform case. For other reference measures there might be closed-form solutions for the minimizer that we are not aware of.
-
We are not aware of the existence of such benchmarks either. However, we added an experiment on color interpolation between images in the Appendix (A.5.1) to show some non simulated data application. Also, in Appendix A.5.2, we added an application on image retrieval that requires multiple OT comparisons for each query.
Questions:
Related work
Missing related work: The mentioned paper (["The Statistics of Circular Optimal Transport" by Hundrieser et al. 2022.]) was already cited in the manuscript and studied during our work. Please, check previous to last reference on page 10, and first line page 2.
Minor Typos:
We thank the reviewer for the thorough read. We corrected all the typos. As for Definition 3.2, it is in general a discrepancy (as the reviewer pointed out). Nevertheless, when for we proved that it is indeed a distance.
This work studied the optimal transport on the unit circle. Since in the optimal transport on the circle, in Eqs. (7) and (9) is not obtained in the closed-form, the optimal transport on the circle does not have a closed-form solution. This work proposed its approximated variant, LCOT, such that LCOT has the closed-formued solution. Then, this work showed that in the special case, LCOT became equivalent to the original optimal transport on the circle (Remark 3.4) and evaluated the effectiveness of LCOT in the synthetic data.
优点
- In Fig. 4, this work demonstrated that the LCOT can be computed much faster than the original optimal transport on the circle, which is consistent with the theoretical analysis.
- The relationship between the original optimal transport on the circle and the LCOT is discussed in Remark 3.4.
缺点
- The intuitive interpretation of LCOT, including Eq. (17) and equations in Def. 3.2, is not clear to the reviewer.
- The authors claimed that "invertible nature of the LCOT embedding allows us to directly calculate the barycenter" in Sec. 4, but the algorithm is missing. It is important for future research to show detailed algorithms in this paper.
- In Fig. 4, the unit (e.g., seconds, mili seconds) is missing.
问题
- Could the authors explain the definition of LCOT, including Eq. (17) and equations in Def. 3.2, more intuitively?
- Remark 3.4. claimed that the COT and LCOT are equivalent. Can it be proved that the barycenter under COT and LCOT are also equivalent?
We thank the Reviewer for their time, consideration, and invaluable feedback.
Weaknesses:
-
See Question 1 below.
-
We explicitate the mentioned phrase in the Appendix A.4 "LCOT Barycenter". We thank the reviewer for this comment since now our definition is more concise.
Precisely, the LCOT barycenter is defined as follows:
Given target measures , as is a distance, their LCOT barycenter is defined by the measure such that
In the embedding space, it can be shown that the minimizer of is given by the circular mean
For each , the last value is the average of the angles , which is then normalized to fall within the range . By using the closed formula for the inverse of the LCOT Embedding provided by Proposition 3.5, we can go back to the measure space obtaining the LCOT barycenter between as #.
In our experiments, we use the last expression.
- We included the units (seconds) in the caption of Figure 4.
Questions:
- Consider two circular measures and and the uniform circular measure as the reference . Calculating COT from to can be intuitively thought as finding an optimal cut on the circle for , cutting the circle into the line and then solving a one-dimensional OT problem in this flat space, which has a closed-form solution. This process allows us to find a Monge map (and the corresponding optimal displacement) between and . Now, given that this mapping is invertible (i.e., for each push forward we also have a pull back), we can first push to the reference, , and then pull back to . This process will result in a transport map between and , which is unique, but is not necessarily the optimal transport map between and . Below, we describe this process more formally.
The LCOT-Embedding encodes the optimal circular displacement given in Equation (15). That is, a reference measure is fixed and for each target measure , the LCOT-Embedding is defined as , where is the optimal location to place .
For simplicity in the notation, let us denote by .
Since we have an optimal displacement for each , we have that is a function. To measure the disimilarity between to target measures and in the embedding space, we use a definition that allows us to use the standard -norms when . The particularity here is that these distances are for functions supported on the circle which makes the formulas more convoluted. The take away from Def 3.2 is that the LCOT distance between and is essentially a -norm ().
- In general, COT and LCOT are not equivalent distances. In Remark 3.4 we pointed out that the COT cost between and can be recovered by using the LCOT distance when is the reference. For other pairs of measures, LCOT is still a 'transport based' distance, just not the same as COT. As a consequence, COT and LCOT barycenter are not equivalent for arbitrary sets of measures.
This paper explores optimal transport (OT) metrics for circular probability measures, introducing Linear Circular Optimal Transport (LCOT) as a novel discrepancy measure. LCOT efficiently compares circular probability measures by extending the Linear Optimal Transport framework to the unit circle. The authors derive a closed-form solution for the Monge map between the uniform distribution and arbitrary probability measures, enabling a computationally efficient computation of LCOT. The metric, based on a power cost function, acts as a proper distance, coinciding with the norm on the unit circle under the reference measure. Theoretical analysis and experimental comparisons with Circular Optimal Transport distance support the efficacy of LCOT.
Overall, the closed-form variant of COT is a nice contribution for OT community. In the paper, it only compares the Linear version of COT and COT. It would be interesting to see the difference between the LCOT and Linear OT.
为何不给更高分
All reviewers put weak acceptance to this paper and this paper is a borderline paper. Moreover, the approach is interesting, but it is a minor topic. So, a poster presentation is good for this work.
为何不给更低分
N/A
Accept (poster)